PPoPP 2025
Homepage:
Paper list:
Acceptance rate: 20.1% (38 / 189)
LLM Training
ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
University of Oregon & Pacific Northwest National Laboratory & William and Mary
Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism
ICT, CAS
WeiPipe: Weight Pipeline Parallelism for Communication-Effective Long-Context Large Model Training
THU & NUS & CETHIK & Lynxi Technology
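Mario (above) targets activation checkpointing, which frees intermediate activations during the forward pass and recomputes them in backward. A minimal PyTorch sketch of that baseline technique, for orientation only; this is not Mario's near zero-cost scheme:

```python
# A minimal sketch of plain activation checkpointing in PyTorch.
# Illustrative background only -- NOT Mario's implementation.
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return x + self.ff(x)

blocks = torch.nn.ModuleList(Block(256) for _ in range(4))
x = torch.randn(8, 256, requires_grad=True)

h = x
for blk in blocks:
    # Free this block's intermediate activations now; recompute them
    # on demand during backward, trading compute for memory.
    h = checkpoint(blk, h, use_reentrant=False)
h.sum().backward()
```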
LLM Inference
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
IST Austria (ISTA) & Universidade da Coruña & ETH Zurich
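MARLIN accelerates weight-only mixed-precision inference: low-bit integer weights multiplied against higher-precision activations, with dequantization fused into the tensor-core GEMM. A deliberately naive reference sketch of that computation pattern; function names here are ours for illustration, not MARLIN's API:

```python
# Naive weight-only quantized matmul: the computation MARLIN fuses into a
# single tensor-core kernel. Names are ours, not MARLIN's API.
import torch

def quantize_per_channel(w, bits=4):
    # Symmetric per-output-channel quantization of the weight matrix.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    q = torch.clamp((w / scale).round(), -qmax - 1, qmax).to(torch.int8)
    return q, scale

def wq_matmul(x, q, scale):
    # Reference path: dequantize the whole matrix, then multiply. A fused
    # kernel instead dequantizes tile-by-tile in registers inside the GEMM.
    return x @ (q.to(x.dtype) * scale).T

w = torch.randn(1024, 1024)        # fp32 stand-in; deployments use fp16/bf16
q, s = quantize_per_channel(w)
x = torch.randn(4, 1024)           # full-precision activations
y = wq_matmul(x, q, s)             # mixed-precision result, shape (4, 1024)
```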
MoE Training
Harnessing Inter-GPU Shared Memory for Seamless MoE Communication-Computation Fusion
WHU & NVIDIA & UMacau
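The paper above fuses MoE communication with computation via inter-GPU shared memory. As orientation, a single-process toy of the dispatch/compute/combine pattern whose all-to-all exchange it targets; no multi-GPU communication is shown, and this is a stand-in, not the paper's method:

```python
# Single-process toy of MoE dispatch/compute/combine. In distributed MoE the
# masked gather/scatter below is an all-to-all across GPUs -- the step the
# paper overlaps with expert computation. Not the paper's method.
import torch

num_experts, dim, tokens = 4, 64, 32
experts = torch.nn.ModuleList(torch.nn.Linear(dim, dim) for _ in range(num_experts))
gate = torch.nn.Linear(dim, num_experts)

x = torch.randn(tokens, dim)
scores = gate(x).softmax(dim=-1)
top1 = scores.argmax(dim=-1)                  # top-1 routing decision per token

y = torch.zeros_like(x)
for e in range(num_experts):
    mask = top1 == e
    if mask.any():
        y[mask] = experts[e](x[mask])         # "dispatch" + expert FFN
y = y * scores.gather(1, top1.unsqueeze(1))   # "combine": weight by gate prob
```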
GNN Training
Adaptive Parallel Training for Graph Neural Networks
CUHK
GNN Inference
Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference
ZJU & Alibaba
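Helios serves dynamic graph sampling for online GNN inference. For readers new to the primitive, a toy uniform neighbor sampler over a CSR graph; this sketch is ours, for orientation only, and Helios's contribution is making it fast, dynamic, and distributed:

```python
# Toy uniform neighbor sampling over a CSR graph -- the primitive Helios
# serves at scale (this sketch is ours, for orientation only).
import torch

indptr = torch.tensor([0, 2, 5, 6, 6])     # 4-node example graph in CSR form
indices = torch.tensor([1, 2, 0, 2, 3, 1])

def sample_neighbors(node, fanout=2):
    nbrs = indices[indptr[node]:indptr[node + 1]]
    if nbrs.numel() <= fanout:
        return nbrs                          # keep all neighbors
    return nbrs[torch.randperm(nbrs.numel())[:fanout]]  # uniform subset

print(sample_neighbors(1))                   # e.g. tensor([0, 3])
```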
DNN Inference
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
HKUST
Sparse Computation
Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores
Computer Network Information Center, CAS & RUC & Hangzhou Dianzi University
FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores
BUPT
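Both Acc-SpMM and FlashSparse target SpMM, i.e. sparse x dense matrix multiplication on tensor cores. A minimal PyTorch sketch of the operation itself using the stock sparse-CSR path, not the papers' kernels:

```python
# Minimal SpMM (sparse x dense) using PyTorch's stock CSR path -- the
# operation both papers map onto tensor cores, not their kernels.
import torch

a = torch.randn(512, 512) * (torch.rand(512, 512) < 0.05)   # ~5% nonzeros
a_csr = a.to_sparse_csr()
b = torch.randn(512, 128)

c = a_csr @ b                                # SpMM -> dense (512, 128)
assert torch.allclose(c, a @ b, atol=1e-4)   # matches the dense product
```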