# DisaggRec: Architecting disaggregated systems for large-scale personalized recommendation

## Meta Info

Presented in [arXiv:2212.00939](https://arxiv.org/abs/2212.00939).

Authors: Liu Ke (*Meta AI & WashU*), Xuan Zhang (*WashU*), Benjamin Lee (*Meta AI & UPenn*), G. Edward Suh (*Meta AI & Cornell*), Hsien-Hsin S. Lee (*Meta AI & Intel*).

## Understanding the paper

### TL;DRs

This paper presents **DisaggRec**, *a disaggregated system* for *large-scale recommendation serving* that decouples compute and memory resources so that each can be provisioned independently.

### Terminology

* Node
  * **Compute nodes (CNs)**: supply high-performance processors but only a limited amount of memory.
  * **Memory nodes (MNs)**: supply high-capacity DRAM devices.
* Strategy
  * **Scale-up**: equip *a single server* with *sufficient resources* to serve end-to-end model inference.
  * **Scale-out**: the model's SparseNet is *sharded and distributed across multiple servers* when *the embedding tables* cannot fit into a single server's memory.
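
The scale-up versus scale-out distinction can be illustrated with a minimal sketch. It assumes whole embedding tables are placed on servers with a first-fit-decreasing heuristic and ignores memory used by anything other than the tables. The function names, table names, and sizes are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def shard_tables(table_gb: Dict[str, float], server_mem_gb: float) -> List[List[str]]:
    """Place whole embedding tables onto servers (first-fit decreasing).

    Hypothetical helper: real systems may also split a single table
    across servers, which this sketch does not model.
    """
    shards: List[List[str]] = []  # table names held by each server
    free: List[float] = []        # remaining memory on each server, in GB
    # Largest tables first, so big tables claim space before small ones.
    for name, size in sorted(table_gb.items(), key=lambda kv: kv[1], reverse=True):
        if size > server_mem_gb:
            raise ValueError(f"table {name!r} does not fit on a single server")
        for i, room in enumerate(free):
            if size <= room:
                shards[i].append(name)
                free[i] -= size
                break
        else:
            # No existing server has room: open a new one.
            shards.append([name])
            free.append(server_mem_gb - size)
    return shards


def serving_strategy(table_gb: Dict[str, float], server_mem_gb: float) -> str:
    """Return 'scale-up' if one server holds every table, else 'scale-out'."""
    return "scale-up" if len(shard_tables(table_gb, server_mem_gb)) == 1 else "scale-out"


# Illustrative sizes only (GB).
tables = {"user": 40, "item": 30, "ad": 20}
print(serving_strategy(tables, server_mem_gb=128))
print(shard_tables(tables, server_mem_gb=64))
```

With the illustrative sizes above, a 128 GB server holds all three tables, while 64 GB servers force the tables onto two servers.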

### Problems

* Monolithic servers provision computing and memory in *fixed proportions*, leading to *idle resources* and *wasted costs*.
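
A small worked example makes the stranded-resource argument concrete. This is a sketch under simplifying assumptions: demand is expressed in abstract compute units and GB of memory, the memory local to CNs is ignored, and all capacities are hypothetical numbers rather than figures from the paper.

```python
import math


def monolithic_plan(compute_demand: int, memory_demand_gb: int,
                    server_compute: int, server_mem_gb: int) -> dict:
    """Servers needed when compute and memory come in a fixed ratio."""
    # The scarcer resource dictates the server count; the other sits idle.
    servers = max(math.ceil(compute_demand / server_compute),
                  math.ceil(memory_demand_gb / server_mem_gb))
    return {
        "servers": servers,
        "idle_compute": servers * server_compute - compute_demand,
        "idle_memory_gb": servers * server_mem_gb - memory_demand_gb,
    }


def disaggregated_plan(compute_demand: int, memory_demand_gb: int,
                       cn_compute: int, mn_mem_gb: int) -> dict:
    """CNs and MNs needed when each resource is provisioned on its own."""
    cns = math.ceil(compute_demand / cn_compute)
    mns = math.ceil(memory_demand_gb / mn_mem_gb)
    return {
        "cns": cns,
        "mns": mns,
        "idle_compute": cns * cn_compute - compute_demand,
        "idle_memory_gb": mns * mn_mem_gb - memory_demand_gb,
    }


# A memory-intensive workload: little compute, 2 TB of embeddings.
print(monolithic_plan(4, 2048, server_compute=2, server_mem_gb=256))
print(disaggregated_plan(4, 2048, cn_compute=2, mn_mem_gb=512))
```

In this toy setting the monolithic fleet needs 8 servers to hold the embeddings and leaves 12 of its 16 compute units idle, whereas 2 CNs plus 4 MNs cover the same demand with nothing stranded.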

### Technical Details

* *Co-optimize* the partitioning strategies for recommendation models together with the design of the disaggregated CNs and MNs.
* Minimize the cost subject to latency targets and availability requirements.
* Focus on *two industry-grade models*: a memory-intensive RM1 and a compute-intensive RM2.
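
The objective in the list above (minimize cost subject to latency and availability constraints) can be sketched as a brute-force search over candidate configurations. This is not the paper's optimizer: the cost model, latency model, and availability rule below are hypothetical placeholders supplied by the caller, chosen only to show the shape of the problem.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

Config = Tuple[int, int]  # (number of CNs, number of MNs)


def cheapest_feasible(configs: Iterable[Config],
                      cost: Callable[[Config], float],
                      latency_ms: Callable[[Config], float],
                      latency_target_ms: float,
                      available: Callable[[Config], bool]) -> Optional[Config]:
    """Return the lowest-cost config meeting both constraints, or None."""
    feasible = [c for c in configs
                if latency_ms(c) <= latency_target_ms and available(c)]
    return min(feasible, key=cost, default=None)


# Toy models, assumptions for illustration only.
def toy_cost(c: Config) -> float:
    return 10 * c[0] + 4 * c[1]          # a CN costs more than an MN


def toy_latency_ms(c: Config) -> float:
    return 40 / c[0] + 24 / c[1] + 2     # compute + lookup + fixed network hop


def toy_available(c: Config) -> bool:
    return c[0] >= 2 and c[1] >= 2       # at least one spare node of each kind


candidates = list(product(range(1, 5), range(1, 5)))
best = cheapest_feasible(candidates, toy_cost, toy_latency_ms, 21.0, toy_available)
print(best)
```

Exhaustive search is only practical for small design spaces. It is used here because it states the constraint structure plainly, not because it reflects how the paper explores configurations.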

### System Architecture

<figure><img src="https://819228986-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MkzeiawY8SkBarQBDVm-659326392%2Fuploads%2FqBC1z15aNHal8EEtisY5%2Fimage.png?alt=media&#x26;token=5116d6cb-83ca-4e83-8251-c021815a8af0" alt=""><figcaption><p>Disaggregated System Architecture.</p></figcaption></figure>

### Model Serving

<figure><img src="https://819228986-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MkzeiawY8SkBarQBDVm-659326392%2Fuploads%2FH3FkmrXVMebzGQEWzI5v%2Fimage.png?alt=media&#x26;token=b233ede6-0272-4690-b84d-f79b75bf6f49" alt=""><figcaption><p>RPC-based Model Serving.</p></figcaption></figure>
