PPoPP 2025
Homepage:
Paper list:
Acceptance rate: 20.1% (38 / 189)
LLM Training
ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
University of Oregon & Pacific Northwest National Laboratory & William and Mary
Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism
ICT, CAS
WeiPipe: Weight Pipeline Parallelism for Communication-Effective Long-Context Large Model Training
THU & NUS & CETHIK & Lynxi Technology
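Mario (above) targets activation checkpointing, which frees intermediate activations during the forward pass and recomputes them in backward. A minimal PyTorch sketch of that baseline technique, for orientation only; this is not Mario's near zero-cost scheme:

```python
# A minimal sketch of plain activation checkpointing in PyTorch.
# Illustrative background only -- NOT Mario's implementation.
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return x + self.ff(x)

blocks = torch.nn.ModuleList(Block(256) for _ in range(4))
x = torch.randn(8, 256, requires_grad=True)

h = x
for blk in blocks:
    # Free this block's intermediate activations now; recompute them
    # on demand during backward, trading compute for memory.
    h = checkpoint(blk, h, use_reentrant=False)
h.sum().backward()
```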
LLM Inference
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
IST Austria (ISTA) & Universidade da Coruña & ETH Zurich
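MARLIN accelerates weight-only mixed-precision inference: low-bit integer weights multiplied against higher-precision activations, with dequantization fused into the tensor-core GEMM. A deliberately naive reference sketch of that computation pattern; function names here are ours for illustration, not MARLIN's API:

```python
# Naive weight-only quantized matmul: the computation MARLIN fuses into a
# single tensor-core kernel. Names are ours, not MARLIN's API.
import torch

def quantize_per_channel(w, bits=4):
    # Symmetric per-output-channel quantization of the weight matrix.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    q = torch.clamp((w / scale).round(), -qmax - 1, qmax).to(torch.int8)
    return q, scale

def wq_matmul(x, q, scale):
    # Reference path: dequantize the whole matrix, then multiply. A fused
    # kernel instead dequantizes tile-by-tile in registers inside the GEMM.
    return x @ (q.to(x.dtype) * scale).T

w = torch.randn(1024, 1024)        # fp32 stand-in; deployments use fp16/bf16
q, s = quantize_per_channel(w)
x = torch.randn(4, 1024)           # full-precision activations
y = wq_matmul(x, q, s)             # mixed-precision result, shape (4, 1024)
```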
MoE Training
Harnessing Inter-GPU Shared Memory for Seamless MoE Communication-Computation Fusion
WHU & NVIDIA & UMacau
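The paper above fuses MoE communication with computation via inter-GPU shared memory. As orientation, a single-process toy of the dispatch/compute/combine pattern whose all-to-all exchange it targets; no multi-GPU communication is shown, and this is a stand-in, not the paper's method:

```python
# Single-process toy of MoE dispatch/compute/combine. In distributed MoE the
# masked gather/scatter below is an all-to-all across GPUs -- the step the
# paper overlaps with expert computation. Not the paper's method.
import torch

num_experts, dim, tokens = 4, 64, 32
experts = torch.nn.ModuleList(torch.nn.Linear(dim, dim) for _ in range(num_experts))
gate = torch.nn.Linear(dim, num_experts)

x = torch.randn(tokens, dim)
scores = gate(x).softmax(dim=-1)
top1 = scores.argmax(dim=-1)                  # top-1 routing decision per token

y = torch.zeros_like(x)
for e in range(num_experts):
    mask = top1 == e
    if mask.any():
        y[mask] = experts[e](x[mask])         # "dispatch" + expert FFN
y = y * scores.gather(1, top1.unsqueeze(1))   # "combine": weight by gate prob
```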
GNN Training
Adaptive Parallel Training for Graph Neural Networks
CUHK
GNN Inference
Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference
ZJU & Alibaba
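Helios serves dynamic graph sampling for online GNN inference. For readers new to the primitive, a toy uniform neighbor sampler over a CSR graph; this sketch is ours, for orientation only, and Helios's contribution is making it fast, dynamic, and distributed:

```python
# Toy uniform neighbor sampling over a CSR graph -- the primitive Helios
# serves at scale (this sketch is ours, for orientation only).
import torch

indptr = torch.tensor([0, 2, 5, 6, 6])     # 4-node example graph in CSR form
indices = torch.tensor([1, 2, 0, 2, 3, 1])

def sample_neighbors(node, fanout=2):
    nbrs = indices[indptr[node]:indptr[node + 1]]
    if nbrs.numel() <= fanout:
        return nbrs                          # keep all neighbors
    return nbrs[torch.randperm(nbrs.numel())[:fanout]]  # uniform subset

print(sample_neighbors(1))                   # e.g. tensor([0, 3])
```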
DNN Inference
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
HKUST
Sparse Computation
Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores
Computer Network Information Center, CAS & RUC & Hangzhou Dianzi University
FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores
BUPT
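Both Acc-SpMM and FlashSparse target SpMM, i.e. sparse x dense matrix multiplication on tensor cores. A minimal PyTorch sketch of the operation itself using the stock sparse-CSR path, not the papers' kernels:

```python
# Minimal SpMM (sparse x dense) using PyTorch's stock CSR path -- the
# operation both papers map onto tensor cores, not their kernels.
import torch

a = torch.randn(512, 512) * (torch.rand(512, 512) < 0.05)   # ~5% nonzeros
a_csr = a.to_sparse_csr()
b = torch.randn(512, 128)

c = a_csr @ b                                # SpMM -> dense (512, 128)
assert torch.allclose(c, a @ b, atol=1e-4)   # matches the dense product
```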