# DisaggRec: Architecting disaggregated systems for large-scale personalized recommendation

## Meta Info

Published on arXiv: [arXiv:2212.00939](https://arxiv.org/abs/2212.00939).

Authors: Liu Ke (*Meta AI & WashU*), Xuan Zhang (*WashU*), Benjamin Lee (*Meta AI & UPenn*), G. Edward Suh (*Meta AI & Cornell*), Hsien-Hsin S. Lee (*Meta AI & Intel*).

## Understanding the paper

### TL;DRs

This paper presents **DisaggRec**, *a disaggregated system* for *large-scale recommendation serving* that decouples compute and memory resources so each can be provisioned independently.

### Terminology

* Node
  * **Compute nodes (CNs)**: supply high-performance processors but only a limited amount of memory.
  * **Memory nodes (MNs)**: supply high-capacity DRAM devices.
* Strategy
  * **Scale-up**: equip *a single server* with *sufficient resources* to serve end-to-end model inference.
  * **Scale-out**: the model's SparseNet is *sharded and distributed across multiple servers* when *the embedding tables* cannot fit into a single server's memory.
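
To make the scale-up versus scale-out distinction concrete, here is a minimal sketch that places embedding tables onto servers by memory capacity. It is not the paper's partitioning algorithm: the function `shard_tables`, the first-fit-decreasing heuristic, the table names, and all sizes are assumptions chosen for illustration, and it assumes whole tables are never split across servers.

```python
from typing import Dict, List


def shard_tables(table_sizes_gb: Dict[str, float],
                 server_capacity_gb: float) -> List[List[str]]:
    """First-fit-decreasing placement of embedding tables onto servers.

    Returns one list of table names per server. A single list corresponds
    to the scale-up case; several lists correspond to scale-out.
    """
    servers: List[List[str]] = []
    free_gb: List[float] = []
    # Place the largest tables first so they are not stranded at the end.
    for name, size in sorted(table_sizes_gb.items(),
                             key=lambda kv: kv[1], reverse=True):
        if size > server_capacity_gb:
            raise ValueError(f"table {name!r} ({size} GB) exceeds one server's memory")
        for i, free in enumerate(free_gb):
            if size <= free:
                servers[i].append(name)
                free_gb[i] -= size
                break
        else:
            # No existing server has room: add another shard.
            servers.append([name])
            free_gb.append(server_capacity_gb - size)
    return servers


# Hypothetical table sizes in GB.
tables = {"user_id": 40.0, "item_id": 30.0, "category": 20.0, "region": 10.0}

print(shard_tables(tables, server_capacity_gb=128.0))
# [['user_id', 'item_id', 'category', 'region']]  -> fits one server (scale-up)
print(shard_tables(tables, server_capacity_gb=64.0))
# [['user_id', 'category'], ['item_id', 'region']]  -> two shards (scale-out)
```

With 128 GB per server the whole SparseNet fits on one machine; with 64 GB the same tables must be spread over two servers.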

### Problems

* Monolithic servers provision computing and memory in *fixed proportions*, leading to *idle resources* and *wasted costs*.
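
A small back-of-the-envelope calculation illustrates how fixed proportions strand resources. The workload and server shape below are hypothetical numbers, not measurements from the paper, and `monolithic_servers` is a helper invented for this sketch.

```python
import math


def monolithic_servers(compute_units: float, memory_gb: float,
                       server_compute: float, server_memory_gb: float) -> int:
    """Servers needed when compute and memory only come bundled together."""
    return max(math.ceil(compute_units / server_compute),
               math.ceil(memory_gb / server_memory_gb))


# Hypothetical memory-heavy workload and server shape (illustrative only).
need_compute, need_memory_gb = 4, 2048
server_compute, server_memory_gb = 2, 256

n = monolithic_servers(need_compute, need_memory_gb,
                       server_compute, server_memory_gb)
idle_compute = n * server_compute - need_compute
utilization = need_compute / (n * server_compute)

print(n, idle_compute, utilization)  # 8 12 0.25
```

Memory demand forces 8 servers, so 12 of the 16 purchased compute units sit idle. If compute and memory could be bought separately in the same unit sizes, 2 compute-only nodes and 8 memory-only nodes would cover the same demand.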

### Technical Details

* *Co-optimize the partitioning strategies* for recommendation models together with the design of the disaggregated CNs and MNs.
* Minimize the cost subject to latency targets and availability requirements.
* Focus on *two industry-grade models*: a memory-intensive RM1 and a compute-intensive RM2.
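
The "minimize cost subject to latency and availability" objective can be sketched as a brute-force search over the number of CNs and MNs. Everything below is a toy under stated assumptions, not the paper's model: the per-node costs, the per-CN throughput, the queueing-style latency formula, and the availability rule (survive the loss of `spare_nodes` nodes of each kind) are all hypothetical.

```python
from itertools import product
from typing import Optional, Tuple


def cheapest_config(qps: float, model_gb: float, latency_sla_ms: float,
                    cn_cost: float = 3.0, mn_cost: float = 1.0,
                    cn_qps: float = 500.0, mn_gb: float = 256.0,
                    base_latency_ms: float = 20.0, spare_nodes: int = 1,
                    max_nodes: int = 32) -> Optional[Tuple[int, int, float]]:
    """Return (num_CNs, num_MNs, cost) of the cheapest feasible design, or None."""
    best: Optional[Tuple[int, int, float]] = None
    for n_cn, n_mn in product(range(1, max_nodes + 1), repeat=2):
        # Availability (toy rule): the model must still fit and the load must
        # still be served after losing `spare_nodes` nodes of each kind.
        live_cn, live_mn = n_cn - spare_nodes, n_mn - spare_nodes
        if live_cn < 1 or live_mn * mn_gb < model_gb:
            continue
        load = qps / (live_cn * cn_qps)
        if load >= 1.0:
            continue  # CNs are saturated
        # Latency (toy model): grows sharply as CN load approaches 1.
        latency_ms = base_latency_ms / (1.0 - load)
        if latency_ms > latency_sla_ms:
            continue
        cost = n_cn * cn_cost + n_mn * mn_cost
        if best is None or cost < best[2]:
            best = (n_cn, n_mn, cost)
    return best


print(cheapest_config(qps=2000, model_gb=1024, latency_sla_ms=50))  # (8, 5, 29.0)
print(cheapest_config(qps=2000, model_gb=1024, latency_sla_ms=10))  # None
```

Because CNs and MNs are sized independently, the search can add memory capacity without paying for extra processors, and the reverse. The second call returns `None` because the SLA is below the toy model's base latency, so no configuration is feasible.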

### System Architecture

<figure><img src="/files/0U1KX1FqFxxg7gbjvJCY" alt=""><figcaption><p>Disaggregated System Architecture.</p></figcaption></figure>

### Model Serving

<figure><img src="/files/QobriUqOafjvEuNgA6DZ" alt=""><figcaption><p>RPC-based Model Serving.</p></figcaption></figure>

