📜
Awesome Papers

2023

  • HexGen: Generative inference of foundation model over heterogeneous decentralized environment
  • High-throughput generative inference of large language models with a single GPU

Last updated 2 years ago