Deep Learning Recommendation Model (DLRM)
DLRM Training
Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024) [Paper] [Slides] [Code]
HKUST
Herald: an adaptive location-aware input allocator that decides on which worker each embedding should be trained, plus an optimal communication plan generator that decides which embeddings need to be synchronized.
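A minimal sketch of the location-aware allocation idea (the function names and data structures here are illustrative assumptions, not Herald's actual API): route each training sample to the worker that locally caches the most of the embedding IDs the sample touches, then derive which embeddings still need cross-worker synchronization.

```python
def allocate_inputs(samples, worker_cache):
    """Assign each sample (a set of embedding IDs) to the worker
    whose local cache covers the most of its embeddings.
    worker_cache: worker id -> set of locally cached embedding IDs."""
    plan = {w: [] for w in worker_cache}
    for sample in samples:
        best = max(worker_cache, key=lambda w: len(sample & worker_cache[w]))
        plan[best].append(sample)
    return plan

def sync_set(plan, worker_cache):
    """Embeddings a worker updates but does not cache locally
    must be synchronized across workers."""
    need_sync = set()
    for w, assigned in plan.items():
        for sample in assigned:
            need_sync |= sample - worker_cache[w]
    return need_sync
```

This captures the two decisions the summary mentions (where to train, what to synchronize); the real system additionally exploits the fact that inputs are known ahead of time to pipeline these decisions.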
DLRM Inference
DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation (arXiv 2212.00939) [Personal Notes] [Paper]
Meta AI & WashU & UPenn & Cornell & Intel
Disaggregated system design: decouples CPU compute from memory resources so they can scale independently, and partitions embedding tables across memory nodes.
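A hedged sketch of row-wise embedding-table partitioning across disaggregated memory nodes (the node count and hashing scheme are assumptions for illustration): each embedding row has a single home node, and a batch of lookups is grouped so every memory node is contacted once.

```python
NUM_MEMORY_NODES = 4  # assumed cluster size for illustration

def home_node(table_id: int, row_id: int) -> int:
    """Map an embedding row to the memory node that stores it."""
    return hash((table_id, row_id)) % NUM_MEMORY_NODES

def group_lookups(lookups):
    """Batch a list of (table_id, row_id) lookups per memory node,
    so each node receives one fetch request per batch."""
    per_node = {}
    for table_id, row_id in lookups:
        node = home_node(table_id, row_id)
        per_node.setdefault(node, []).append((table_id, row_id))
    return per_node
```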
Pruning
GPU Cache
UGache: A Unified GPU Cache for Embedding-based Deep Learning (SOSP 2023) [Personal Notes] [Paper]
SJTU
A unified multi-GPU cache system.
Used for both GNN training and deep learning recommendation (DLR) inference.
EVStore: Storage and Caching Capabilities for Scaling Embedding Tables in Deep Recommendation Systems (ASPLOS 2023) [Personal Notes] [Paper] [Code]
UChicago & Beijing University of Technology & Bandung Institute of Technology, Indonesia & Seagate Technology & Emory
A caching layer optimized for embedding access patterns.
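An illustrative embedding caching layer (a minimal LRU sketch under assumed interfaces, not EVStore's actual multi-tier design): hot embedding rows are served from an in-memory cache keyed by `(table_id, row_id)`, and misses fall through to backing storage.

```python
from collections import OrderedDict

class EmbeddingCache:
    """LRU cache in front of an embedding store."""

    def __init__(self, capacity, fetch_from_storage):
        self.capacity = capacity
        self.fetch = fetch_from_storage  # called on a cache miss
        self.cache = OrderedDict()       # (table_id, row_id) -> vector
        self.hits = self.misses = 0

    def get(self, table_id, row_id):
        key = (table_id, row_id)
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        vec = self.fetch(key)
        self.cache[key] = vec
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used row
        return vec
```

Because embedding access is highly skewed toward a small set of hot rows, even a small cache like this absorbs most lookups; EVStore layers further tiers and embedding-specific policies below it.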
Model Update
Acronyms
DLRM: Deep Learning Recommendation Model
Last updated