# SoCC 2023

## Meta Info

Homepage: <https://acmsocc.org/2023/>

Paper list: <https://acmsocc.org/2023/accepted-papers.html>

## Papers

### Resource Allocation

* Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624646)]
  * UofT
  * Adaptively configure resource parameters
  * Built on contextual bandit techniques
  * Balance between performance and resource cost
* Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624650)]
  * SJTU & Huawei
  * Shared-state schedulers: A central state view *periodically* updates the *global* cluster status to *distributed* schedulers
  * Shadow resources: Resources invisible to *shared-state schedulers* until the next view update
  * Resource Miner (RMiner) includes a *shadow resource manager* to manage shadow resources, an *RM filter* to select suitable tasks as RM tasks, and an *RM scheduler* to allocate shadow resources to RM tasks
* Gödel: Unified large-scale resource management and scheduling at ByteDance \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624663)]
  * ByteDance & UVA
  * Industry Paper
  * A unified infrastructure for all business groups to run their diverse workloads
  * Built upon Kubernetes
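
To make the shadow-resource definition from the RMiner notes above concrete, here is a minimal toy sketch. The `Cluster`, `StateView`, and `shadow_resources` names are hypothetical and are not taken from the paper; the model only tracks free CPUs per node and assumes the view is refreshed by an explicit call.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Ground-truth free CPUs per node (hypothetical toy model)."""
    free: dict[str, int] = field(default_factory=dict)

    def release(self, node: str, cpus: int) -> None:
        # A finished task returns its CPUs to the node.
        self.free[node] = self.free.get(node, 0) + cpus


@dataclass
class StateView:
    """Snapshot seen by the distributed schedulers; refreshed only periodically."""
    snapshot: dict[str, int] = field(default_factory=dict)

    def refresh(self, cluster: Cluster) -> None:
        self.snapshot = dict(cluster.free)


def shadow_resources(cluster: Cluster, view: StateView) -> dict[str, int]:
    """Free resources that exist in the cluster but are missing from the view."""
    shadow = {}
    for node, actual in cluster.free.items():
        hidden = actual - view.snapshot.get(node, 0)
        if hidden > 0:
            shadow[node] = hidden
    return shadow


cluster = Cluster(free={"node-a": 2, "node-b": 0})
view = StateView()
view.refresh(cluster)                    # schedulers see node-a: 2, node-b: 0

cluster.release("node-b", 4)             # a task finishes between two view updates
print(shadow_resources(cluster, view))   # {'node-b': 4}

view.refresh(cluster)                    # the next update makes the CPUs visible
print(shadow_resources(cluster, view))   # {}
```

In this toy model, the four CPUs on `node-b` are exactly what a component like the shadow resource manager could hand to short RM tasks before the next view update.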

### Machine Learning

* Anticipatory Resource Allocation for ML Training Clusters \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624669)]
  * Microsoft Research & UW
  * Schedule based on *predictions of future job arrivals and durations*
  * Deal with prediction errors
* tf.data service: A Case for Disaggregating ML Input Data Processing \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624666)]
  * Google & ETH
  * Industry Paper
  * A disaggregated input data processing service built on top of tf.data in TensorFlow
  * Horizontally scale out to right-size host resources (CPU/RAM) for data processing in each job
  * Share ephemeral preprocessed data results across jobs
  * Coordinated reads to avoid stragglers
* Is Machine Learning Necessary for Cloud Resource Usage Forecasting? \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624790)]
  * IMDEA Software Institute
  * Vision Paper
  * Question: Are *complex machine learning models* actually necessary?
  * Proposal: Practical memory management systems need to first identify the extent to which simple solutions can be effective.
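
As an illustration of what "simple solutions" for usage forecasting might look like, the sketch below compares two common baselines: a persistence forecast (repeat the last observation) and a moving average. The choice of baselines and the sample data are assumptions made for illustration; the notes above do not say which simple solutions the paper evaluates.

```python
def persistence_forecast(series):
    """Predict each value as the previous observation."""
    return series[:-1]


def moving_average_forecast(series, window):
    """Predict each value as the mean of the previous `window` observations."""
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series))
    ]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical CPU utilization samples (percent), made up for illustration.
usage = [40, 42, 41, 45, 44, 46, 47, 45]

window = 3
actual = usage[window:]                              # targets both baselines can predict
persist = persistence_forecast(usage)[window - 1:]   # align predictions with `actual`
average = moving_average_forecast(usage, window)

print("persistence MAE:", mean_absolute_error(actual, persist))
print("moving-average MAE:", mean_absolute_error(actual, average))
```

A learned model would need to beat both errors on the same targets before its extra complexity is justified.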

### Serverless Computing

* Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624645)]
  * HKUST & WeBank
  * **Best Paper Award!**
  * A scheduling system for serverless functions to *minimize resource provisioning costs* while *meeting the function latency requirements*
  * Overcommit functions based on their past resource usage; Identify nine low-level metrics (e.g., request load, resource allocation, contention on shared resources); Use the Mondrian Forest to predict the function performance
  * Employ a conservative exploration-exploitation strategy for request routing; By default, route requests to non-overcommitted instances; Explore the use of overcommitted instances
  * Vertical scaling to dynamically adjust the concurrency of overcommitted instances
* Parrotfish: Parametric Regression for Optimizing Serverless Functions \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624654)]
  * UBC & UTokyo & INSAT
  * Find optimal configurations through an online learning process
  * Use parametric regression to choose the right memory configurations for serverless functions
* AsyFunc: A High-Performance and Resource-Efficient Serverless Inference System via Asymmetric Functions \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624664)] \[[Code](https://github.com/peiqiangyu/AsyFunc)]
  * HUST & Huawei & Peng Cheng Laboratory
  * Problem: The time-consuming and resource-hungry model-loading process when scaling out function instances
  * Observation: The sensitivity of each layer to the computing resources is mostly anti-correlated with its memory resource usage
  * Asymmetric Functions
    * The original Body Function loads a complete model to meet stable demands
    * The proposed lightweight Shadow Function only loads a portion of resource-sensitive layers to deal with sudden demands effortlessly
  * AsyFunc — an inference serving system with an auto-scaling and scheduling engine; Built on top of Knative
* Chitu: Accelerating Serverless Workflows with Asynchronous State Replication Pipeline \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624794)] \[[Code](https://github.com/sigserverless/chitu)]
  * ISCAS & ICT, CAS
  * **Asynchronous State Replication Pipelines (ASRP)** to speed up serverless workflows for general applications
  * Three insights
    * Provide differentiable data types (DDT) at the programming model level to support incremental state sharing and computation
    * Continuously deliver changes of DDT objects in real-time
    * Direct communication and change propagation
  * Built atop OpenFaaS
* How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624783)] \[[Trace](https://github.com/sir-lab/data-release)]
  * Huawei
  * Industry Paper
  * Two new serverless traces in Huawei Cloud
    * The first trace: Huawei's *internal* workloads; Per-second statistics for 200 functions
    * The second trace: Huawei's public FaaS platform; Per-minute arrival rates for over 5000 functions
  * Characterize resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity.
  * Findings
    * Requests vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day
    * Scheduling time, execution time and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails
    * Function invocation counts demonstrate strong periodicity for many individual functions and on an aggregate level
  * The need for further research in *estimating resource reservations and time-series prediction*
* Function as a Function \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624648)]
  * ETH
  * Vision Paper
  * Dandelion -- a clean-slate FaaS system; Treat serverless functions as pure functions; Explicitly separate computation and I/O; Hardware acceleration; Enable dataflow-aware function orchestration
* The Gap Between Serverless Research and Real-world Systems \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624785)]
  * SJTU & Huawei Cloud
  * Vision Paper
  * Five open challenges
    * Optimize cold start latency: Most existing works only consider synchronous starts, whereas industry also uses asynchronous starts
    * Declarative approach: Is Kubernetes the right system for serverless computing?
    * Scheduling cost
    * Balance different scheduling policies within a serverless system
    * Costs of sidecar
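
The conservative routing idea in the Golgi notes above (route to non-overcommitted instances by default, occasionally explore overcommitted ones) can be sketched roughly as follows. This is a simplified illustration, not Golgi's actual algorithm: the `Instance` type, the `route` function, the fixed exploration probability, and the toy predictor standing in for the Mondrian Forest model are all assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    overcommitted: bool


def route(request_load, instances, predict_latency_ms, slo_ms,
          explore_prob=0.1, rng=random):
    """Return the instance that should serve the next request.

    Default: a non-overcommitted instance. With probability `explore_prob`,
    try an overcommitted instance, but only if the (hypothetical) predictor
    expects it to stay within the latency target.
    """
    safe = [i for i in instances if not i.overcommitted]
    risky = [i for i in instances if i.overcommitted]

    if risky and rng.random() < explore_prob:
        candidate = rng.choice(risky)
        if predict_latency_ms(candidate, request_load) <= slo_ms:
            return candidate
    # Conservative fallback; if no safe instance exists, use what is available.
    return rng.choice(safe or risky)


instances = [Instance("normal-0", False), Instance("packed-0", True)]


def toy_predictor(instance, load):
    # Stand-in for a trained performance model; the formula is made up.
    return 20.0 + 0.5 * load


chosen = route(request_load=40, instances=instances,
               predict_latency_ms=toy_predictor, slo_ms=100.0)
print(chosen.name)
```

Keeping the predictor as a plain callable makes it easy to swap in any model; the routing decision only depends on whether the predicted latency meets the target.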

### Sustainable Computing

* Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale \[[Paper](https://dl.acm.org/doi/10.1145/3620678.3624793)]
  * MIT & NEU
  * GPU power capping yields significant decreases in both temperature and power draw, reducing power consumption and potentially *improving hardware life-span*, with *minimal impact on job performance*

