SoCC 2023
Meta Info
Homepage: https://acmsocc.org/2023/
Paper list: https://acmsocc.org/2023/accepted-papers.html
Papers
Resource Allocation
Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud [Paper]
UofT
Adaptively configure resource parameters
Built on contextual bandit techniques
Balance between performance and resource cost
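To make the contextual-bandit idea concrete, here is a minimal sketch. It is not the paper's algorithm: it uses a plain epsilon-greedy policy, and the reward function, the workload model, the context names (`low`, `high`), and the candidate CPU settings are all assumptions made up for illustration.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Toy contextual bandit: one running reward average per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the running mean reward.
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def compute_reward(latency_ms, cpu_cores, cost_weight=20.0):
    """Hypothetical reward: lower latency and fewer cores are both better."""
    return -(latency_ms + cost_weight * cpu_cores)


def simulated_latency(load, cpu_cores):
    """Hypothetical workload model, used only to drive the example."""
    demand = {"low": 100.0, "high": 400.0}[load]
    return demand / cpu_cores


def run(rounds=2000):
    bandit = EpsilonGreedyContextualBandit(arms=[1, 2, 4, 8], epsilon=0.1, seed=0)
    rng = random.Random(1)
    for _ in range(rounds):
        load = rng.choice(["low", "high"])       # observed context
        cores = bandit.select(load)              # chosen resource parameter
        latency = simulated_latency(load, cores)
        bandit.update(load, cores, compute_reward(latency, cores))
    return bandit


def best_arm(bandit, context):
    return max(bandit.arms, key=lambda arm: bandit.means[(context, arm)])
```

Under this made-up workload model, the learned policy settles on fewer cores for the `low` context than for the `high` one, which is the performance/cost balance the notes describe.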
Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture [Paper]
SJTU & Huawei
Shared-state schedulers: A central state view periodically pushes the global cluster status to the distributed schedulers
Shadow resources: Resources invisible to shared-state schedulers until the next view update
Resource Miner (RMiner) includes a shadow resource manager to manage shadow resources, an RM filter to select suitable tasks as RM tasks, and an RM scheduler to allocate shadow resources to RM tasks
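The definition of shadow resources can be illustrated with a toy model: whatever is freed on a node after the last view update is invisible to a scheduler until the next sync. The class and function names below are hypothetical and do not reflect RMiner's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Ground truth: free CPU cores per node."""
    free: dict


@dataclass
class SchedulerView:
    """A distributed scheduler's possibly stale copy of the cluster state."""
    snapshot: dict = field(default_factory=dict)

    def sync(self, state):
        # Periodic view update: copy the current ground truth.
        self.snapshot = dict(state.free)


def shadow_resources(state, view):
    """Cores that are free in reality but not yet visible in the view."""
    shadow = {}
    for node, free_now in state.free.items():
        seen = view.snapshot.get(node, 0)
        if free_now > seen:
            shadow[node] = free_now - seen
    return shadow


state = ClusterState(free={"n1": 2, "n2": 0})
view = SchedulerView()
view.sync(state)

# A task finishes on n2 between two view updates and releases 4 cores.
state.free["n2"] += 4
before_sync = shadow_resources(state, view)

view.sync(state)
after_sync = shadow_resources(state, view)
```

Here `before_sync` holds the 4 cores on `n2` that only a component with access to the ground truth could hand out, and `after_sync` is empty because the view has caught up.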
Gödel: Unified large-scale resource management and scheduling at ByteDance [Paper]
ByteDance & UVA
Industry Paper
A unified infrastructure for all business groups to run their diverse workloads
Built upon Kubernetes
Machine Learning
Anticipatory Resource Allocation for ML Training Clusters [Paper]
Microsoft Research & UW
Schedule based on predictions of future job arrivals and durations
Deal with prediction errors
tf.data service: A Case for Disaggregating ML Input Data Processing [Paper]
Google & ETH
Industry Paper
A disaggregated input data processing service built on top of tf.data in TensorFlow
Horizontally scale out to right-size host resources (CPU/RAM) for data processing in each job
Share ephemeral preprocessed data results across jobs
Coordinated reads to avoid stragglers
Is Machine Learning Necessary for Cloud Resource Usage Forecasting? [Paper]
IMDEA Software Institute
Vision Paper
Question: Are complex machine learning models really necessary for cloud resource usage forecasting?
Proposal: Practical resource management systems should first identify the extent to which simple solutions can be effective.
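One example of a "simple solution" is a persistence forecast, which predicts each value as the previous observation. The sketch below shows how such a baseline could be scored, so that any ML model has something to beat. The CPU utilization series is made-up data, and the choice of mean absolute error as the metric is an assumption.

```python
def persistence_forecast(series):
    """Predict each value as the previous observation (shift by one step)."""
    return series[:-1]


def mean_absolute_error(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical CPU utilization samples (percent), slowly varying over time.
cpu_util = [40, 42, 41, 43, 45, 44, 46, 48, 47, 49]

predicted = persistence_forecast(cpu_util)  # forecasts for steps 1..n-1
actual = cpu_util[1:]                       # what was actually observed
baseline_mae = mean_absolute_error(actual, predicted)
```

When consecutive samples differ only slightly, as in this series, the baseline error is already small, which leaves little room for a more complex model to improve on it.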
Serverless Computing
Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing [Paper]
HKUST & WeBank
Best Paper Award!
A scheduling system for serverless functions to minimize resource provisioning costs while meeting the function latency requirements
Overcommit functions based on their past resource usage; Identify nine low-level metrics (e.g., request load, resource allocation, contention on shared resources); Use a Mondrian Forest to predict function performance
Employ a conservative exploration-exploitation strategy for request routing; By default, route requests to non-overcommitted instances; Explore to use overcommitted instances
Vertical scaling to dynamically adjust the concurrency of overcommitted instances
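The routing strategy above can be sketched as follows. This is a simplified illustration rather than Golgi's implementation: `stub_predictor` is a hypothetical stand-in for the Mondrian Forest model, and the instance fields, SLO value, and exploration probability are assumptions.

```python
import random


def route_request(instances, predict_latency_ms, slo_ms, explore_prob, rng):
    """Pick an instance for one request.

    Default to non-overcommitted instances; with probability explore_prob,
    try an overcommitted instance that is predicted to meet the SLO.
    """
    if not instances:
        raise ValueError("no instances available")
    safe = [inst for inst in instances if not inst["overcommitted"]]
    overcommitted = [inst for inst in instances if inst["overcommitted"]]
    promising = [inst for inst in overcommitted
                 if predict_latency_ms(inst) <= slo_ms]

    if promising and (not safe or rng.random() < explore_prob):
        return rng.choice(promising)
    if safe:
        return rng.choice(safe)
    # No safe instance and none predicted to meet the SLO: least-bad option.
    return min(overcommitted, key=predict_latency_ms)


def stub_predictor(instance):
    """Hypothetical latency model: latency grows with resource contention."""
    return 50.0 + 200.0 * instance["contention"]


instances = [
    {"id": "a", "overcommitted": False, "contention": 0.1},
    {"id": "b", "overcommitted": True, "contention": 0.2},
    {"id": "c", "overcommitted": True, "contention": 0.9},
]

rng = random.Random(0)
chosen = route_request(instances, stub_predictor, slo_ms=100.0,
                       explore_prob=0.1, rng=rng)
```

Keeping `explore_prob` small is what makes the strategy conservative: most traffic stays on non-overcommitted instances, and overcommitted ones only receive requests when the predictor expects them to meet the latency target.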
Parrotfish: Parametric Regression for Optimizing Serverless Functions [Paper]
UBC & UTokyo & INSAT
Find optimal configurations through an online learning process
Use parametric regression to choose the right memory configurations for serverless functions
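As a rough illustration of parametric regression for memory sizing, the sketch below fits an assumed model `t(m) = a + b / m` to a few (memory, duration) samples and then picks the cheapest configuration that meets a deadline. The model form, the cost proxy (memory times duration), the sample values, and the deadline are all assumptions for this example, not Parrotfish's actual model.

```python
def fit_inverse_model(samples):
    """Least-squares fit of t(m) = a + b / m from (memory_mb, duration_ms) pairs."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [1.0 / m for m, _ in samples]   # regress duration on 1 / memory
    ys = [t for _, t in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_duration_ms(a, b, memory_mb):
    return a + b / memory_mb


def cheapest_memory(a, b, options_mb, deadline_ms):
    """Among options meeting the deadline, minimize memory * duration."""
    feasible = [m for m in options_mb
                if predict_duration_ms(a, b, m) <= deadline_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m * predict_duration_ms(a, b, m))


# Hypothetical measurements: (memory in MB, duration in ms).
samples = [(128, 450.0), (256, 250.0), (512, 150.0), (1024, 100.0)]
a, b = fit_inverse_model(samples)
choice = cheapest_memory(a, b, [128, 256, 512, 1024, 2048], deadline_ms=200.0)
```

The appeal of a parametric model is that a handful of measurements is enough to predict untested configurations, such as the 2048 MB option here, without running the function at every memory size.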
AsyFunc: A High-Performance and Resource-Efficient Serverless Inference System via Asymmetric Functions [Paper] [Code]
HUST & Huawei & Peng Cheng Laboratory
Problem: The time-consuming and resource-hungry model-loading process when scaling out function instances
Observation: The sensitivity of each layer to the computing resources is mostly anti-correlated with its memory resource usage
Asymmetric Functions
The original Body Function loads a complete model to meet stable demands
The proposed lightweight Shadow Function only loads a portion of resource-sensitive layers to deal with sudden demands effortlessly
AsyFunc: an inference serving system with an auto-scaling and scheduling engine; Built on top of Knative
Chitu: Accelerating Serverless Workflows with Asynchronous State Replication Pipeline [Paper] [Code]
ISCAS & ICT, CAS
Asynchronous State Replication Pipelines (ASRP) to speed up serverless workflows for general applications
Three insights
Provide differentiable data types (DDT) at the programming model level to support incremental state sharing and computation
Continuously deliver changes of DDT objects in real-time
Direct communication and change propagation
Built atop OpenFaaS
How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads [Paper] [Trace]
Huawei
Industry Paper
Two new serverless traces in Huawei Cloud
The first trace: Huawei's internal workloads; Per-second statistics for 200 functions
The second trace: Huawei's public FaaS platform; Per-minute arrival rates for over 5000 functions
Characterize resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity.
Findings
Requests vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day
Scheduling time, execution time and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails
Function invocation counts demonstrate strong periodicity for many individual functions and on an aggregate level
The need for further research in estimating resource reservations and time-series prediction
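Two of the characterized properties, periodicity and per-second versus per-minute burstiness, can be quantified with simple statistics. The sketch below uses lag autocorrelation and the peak-to-mean ratio; both metric choices and the synthetic series are assumptions for illustration and are not taken from the traces.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def peak_to_mean(series):
    """Simple burstiness proxy: maximum divided by the average."""
    return max(series) / (sum(series) / len(series))


def aggregate(series, width):
    """Sum consecutive windows, e.g. per-second counts into per-minute counts."""
    return [sum(series[i:i + width]) for i in range(0, len(series), width)]


# Synthetic per-minute invocation counts with a period of 4 minutes.
per_minute = [10, 50, 10, 10] * 6
at_period = autocorrelation(per_minute, lag=4)
off_period = autocorrelation(per_minute, lag=2)

# Synthetic per-second counts: one minute with a single 60-request burst,
# followed by one minute with a steady 1 request per second.
per_second = [60] + [0] * 59 + [1] * 60
burst_per_second = peak_to_mean(per_second)
burst_per_minute = peak_to_mean(aggregate(per_second, 60))
```

In this synthetic example both minutes carry the same total load, so the per-minute view looks perfectly smooth while the per-second view exposes the burst. That is one reason per-second statistics are valuable alongside per-minute arrival rates.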
Function as a Function [Paper]
ETH
Vision Paper
Dandelion: a clean-slate FaaS system; Treat serverless functions as pure functions; Explicitly separate computation and I/O; Hardware acceleration; Enable dataflow-aware function orchestration
The Gap Between Serverless Research and Real-world Systems [Paper]
SJTU & Huawei Cloud
Vision Paper
Five open challenges
Optimize cold-start latency: Most existing work considers only synchronous starts, whereas industry uses asynchronous starts
Declarative approach: Is Kubernetes the right system for serverless computing?
Scheduling cost
Balance different scheduling policies within a serverless system
Costs of sidecar
Sustainable Computing
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale [Paper]
MIT & NEU
GPU power capping significantly decreases both temperature and power draw, reducing power consumption and potentially improving hardware life-span, with minimal impact on job performance