# Miscellaneous

- [arXiv](/reading-notes/miscellaneous/arxiv.md): A free distribution service and an open-access archive.
  - [2024](/reading-notes/miscellaneous/arxiv/2024.md)
    - [Efficiently programming large language models using SGLang](/reading-notes/miscellaneous/arxiv/2024/sglang.md): LLM Inference
  - [2023](/reading-notes/miscellaneous/arxiv/2023.md)
    - [HexGen: Generative inference of foundation model over heterogeneous decentralized environment](/reading-notes/miscellaneous/arxiv/2023/hexgen.md)
    - [High-throughput generative inference of large language models with a single GPU](/reading-notes/miscellaneous/arxiv/2023/flexgen.md): An offloading framework for high-throughput LLM inference.
  - [2022](/reading-notes/miscellaneous/arxiv/2022.md)
    - [DisaggRec: Architecting disaggregated systems for large-scale personalized recommendation](/reading-notes/miscellaneous/arxiv/2022/disaggrec.md): #deep\_learning\_recommender\_system #memory\_disaggregation #total\_cost\_of\_ownership #RDMA
    - [A case for disaggregation of ML data processing](/reading-notes/miscellaneous/arxiv/2022/tf-data.md)
    - [Singularity: Planet-scale, preemptive and elastic scheduling of AI workloads](/reading-notes/miscellaneous/arxiv/2022/singularity.md): Live GPU job migration.
    - [Aryl: An elastic cluster scheduler for deep learning](/reading-notes/miscellaneous/arxiv/2022/aryl.md)
  - [2016](/reading-notes/miscellaneous/arxiv/2016.md)
    - [Wide & deep learning for recommender systems](/reading-notes/miscellaneous/arxiv/2016/wide-and-deep-learning-for-recommender-systems.md): A recommender system with a wide & deep model (WDL).
    - [Training deep nets with sublinear memory cost](/reading-notes/miscellaneous/arxiv/2016/training-deep-nets-with-sublinear-memory-cost.md): Reduce memory cost to store intermediate results and gradients.
- [MSR Technical Report](/reading-notes/miscellaneous/msr-technical-report.md)
  - [2011](/reading-notes/miscellaneous/msr-technical-report/2011.md)
    - [Heuristics for vector bin packing](/reading-notes/miscellaneous/msr-technical-report/2011/heuristics-for-vector-bin-packing.md)
