SoCC 2023
Homepage:
Paper list:
Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud []
UofT
Adaptively configure resource parameters
Built on contextual bandit techniques
Balance between performance and resource cost
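As a concrete illustration, here is a minimal epsilon-greedy contextual bandit that picks a per-container CPU allocation and balances a made-up performance score against resource cost. The contexts, arms, and reward model are invented for illustration and are not the paper's actual algorithm:

```python
import random

# Candidate resource configurations (hypothetical): CPU cores for a container.
ARMS = [1, 2, 4, 8]

class EpsilonGreedyBandit:
    """Per-context epsilon-greedy bandit over resource configurations."""
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.stats = {}  # (context, arm) -> (total_reward, pulls)

    def choose(self, context):
        if random.random() < self.epsilon:
            return random.choice(ARMS)  # explore a random configuration
        best, best_avg = ARMS[0], float("-inf")
        for arm in ARMS:
            total, pulls = self.stats.get((context, arm), (0.0, 0))
            avg = total / pulls if pulls else 0.0
            if avg > best_avg:
                best, best_avg = arm, avg
        return best

    def update(self, context, arm, reward):
        total, pulls = self.stats.get((context, arm), (0.0, 0))
        self.stats[(context, arm)] = (total + reward, pulls + 1)

def reward(context, cores):
    # Illustrative reward: performance saturates once the container has
    # as many cores as the workload needs; each core has a fixed cost.
    perf = min(1.0, cores / context)
    cost = 0.05 * cores
    return perf - cost

random.seed(0)
bandit = EpsilonGreedyBandit()
for _ in range(2000):
    ctx = random.choice([2, 4])  # observed workload context (cores needed)
    arm = bandit.choose(ctx)
    bandit.update(ctx, arm, reward(ctx, arm))
```

After training, the greedy choice converges to the cheapest configuration that still saturates performance for each context.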
Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture []
SJTU & Huawei
Shared-state schedulers: A central state view periodically pushes the global cluster status to distributed schedulers
Shadow resources: Resources invisible to shared-state schedulers until the next view update
Resource Miner (RMiner) comprises a shadow resource manager to manage shadow resources, an RM filter to select suitable tasks as RM tasks, and an RM scheduler to allocate shadow resources to RM tasks
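A toy sketch of the shadow-resource idea, assuming tasks release CPU between view updates. The class and function names echo the summary above, but all details (fields, filter criterion) are invented:

```python
class Node:
    def __init__(self, name, free_cpu):
        self.name = name
        self.free_cpu = free_cpu     # ground truth on the node
        self.viewed_cpu = free_cpu   # what the central state view last saw

class ShadowResourceManager:
    """Tracks resources freed since the last state-view update ("shadow"
    resources), which ordinary shared-state schedulers cannot see yet."""
    def __init__(self, nodes):
        self.nodes = nodes

    def shadow_cpu(self, node):
        # Freed after the last view update => invisible to schedulers
        # that rely on the stale central view.
        return max(0, node.free_cpu - node.viewed_cpu)

    def sync_view(self):
        # Periodic view update: shadow resources become visible again.
        for n in self.nodes:
            n.viewed_cpu = n.free_cpu

def rm_filter(tasks):
    # Select small, short-lived tasks as "RM tasks" (illustrative criterion).
    return [t for t in tasks if t["cpu"] <= 1 and t["short"]]

def rm_schedule(manager, tasks):
    # Place RM tasks onto shadow resources only.
    placements = []
    for task in rm_filter(tasks):
        for node in manager.nodes:
            if manager.shadow_cpu(node) >= task["cpu"]:
                node.free_cpu -= task["cpu"]
                placements.append((task["name"], node.name))
                break
    return placements
```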
Gödel: Unified large-scale resource management and scheduling at ByteDance []
ByteDance & UVA
Industry Paper
A unified infrastructure for all business groups to run their diverse workloads
Built upon Kubernetes
Anticipatory Resource Allocation for ML Training Clusters []
Microsoft Research & UW
Schedule based on predictions of future job arrivals and durations
Deal with prediction errors
tf.data service: A Case for Disaggregating ML Input Data Processing []
Google & ETH
Industry Paper
A disaggregated input data processing service built on top of tf.data in TensorFlow
Horizontally scale out to right-size host resources (CPU/RAM) for data processing in each job
Share ephemeral preprocessed data results across jobs
Coordinated reads to avoid stragglers
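The cross-job sharing idea can be sketched as a refcounted cache keyed by a hash of the dataset and preprocessing pipeline. This is a simplification of what a disaggregated data service might do, not tf.data service's actual API:

```python
import hashlib

class EphemeralCache:
    """Shares preprocessed results across jobs that request the same
    (dataset, preprocessing pipeline) pair; entries are ephemeral and
    dropped once no job is reading them (refcount sketch)."""
    def __init__(self):
        self.entries = {}  # key -> [data, refcount]

    @staticmethod
    def key(dataset_id, transform_src):
        h = hashlib.sha256()
        h.update(dataset_id.encode())
        h.update(transform_src.encode())
        return h.hexdigest()

    def get_or_compute(self, dataset_id, transform_src, compute):
        k = self.key(dataset_id, transform_src)
        if k not in self.entries:
            self.entries[k] = [compute(), 0]  # first job pays the cost
        self.entries[k][1] += 1               # register a reader
        return self.entries[k][0]

    def release(self, dataset_id, transform_src):
        k = self.key(dataset_id, transform_src)
        self.entries[k][1] -= 1
        if self.entries[k][1] <= 0:
            del self.entries[k]               # ephemeral: drop when unused
```

The second job that asks for the same preprocessed data gets the cached result instead of recomputing it.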
Is Machine Learning Necessary for Cloud Resource Usage Forecasting? []
IMDEA Software Institute
Vision Paper
Question: Are complex machine learning models necessary for cloud resource usage forecasting?
Proposal: Practical memory management systems need to first identify the extent to which simple solutions can be effective.
Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing []
HKUST & WeBank
Best Paper Award!
A scheduling system for serverless functions to minimize resource provisioning costs while meeting the function latency requirements
Overcommit functions based on their past resource usage; Identify nine low-level metrics (e.g., request load, resource allocation, contention on shared resources); Use the Mondrian Forest to predict the function performance
Employ a conservative exploration-exploitation strategy for request routing; By default, route requests to non-overcommitted instances; Explore to use overcommitted instances
Vertical scaling to dynamically adjust the concurrency of overcommitted instances
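A minimal sketch of the conservative exploration-exploitation routing strategy, with the Mondrian-Forest performance predictor replaced by a caller-supplied `predict_ok` stub; the function signature and the exploration probability are assumptions:

```python
import random

def route(request, safe_instances, overcommitted, predict_ok,
          explore_prob=0.05):
    """Conservatively route a request across function instances.

    predict_ok(instance) stands in for a learned performance predictor:
    True if the instance is predicted to meet the latency requirement.
    """
    # Explore: occasionally try an overcommitted instance that the
    # predictor deems safe, to gather performance feedback.
    if overcommitted and random.random() < explore_prob:
        candidates = [i for i in overcommitted if predict_ok(i)]
        if candidates:
            return random.choice(candidates)
    # Default (exploit): non-overcommitted instances are always safe.
    if safe_instances:
        return random.choice(safe_instances)
    # Fall back to the least risky overcommitted instance.
    return min(overcommitted, key=lambda i: 0 if predict_ok(i) else 1)
```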
Parrotfish: Parametric Regression for Optimizing Serverless Functions []
UBC & UTokyo & INSAT
Find optimal configurations through an online learning process
Use parametric regression to choose the right memory configurations for serverless functions
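A sketch of the parametric-regression idea, assuming the common model t(m) ≈ a + b/m for function runtime versus memory size and a GB-second-style cost m · t(m); the model form, the SLO handling, and the objective are assumptions, not necessarily Parrotfish's exact formulation:

```python
def fit_runtime_model(samples):
    """Least-squares fit of t(m) = a + b/m from (memory_mb, runtime_s)
    samples. The model is linear in the feature x = 1/m, so we solve
    the 2x2 normal equations directly (no numpy needed)."""
    n = len(samples)
    sx = sum(1.0 / m for m, _ in samples)
    sxx = sum((1.0 / m) ** 2 for m, _ in samples)
    sy = sum(t for _, t in samples)
    sxy = sum(t / m for m, t in samples)
    det = n * sxx - sx * sx
    b = (n * sxy - sx * sy) / det
    a = (sy * sxx - sx * sxy) / det
    return a, b

def best_memory(samples, choices, slo_s=None):
    """Pick the memory size minimizing predicted cost ~ memory * runtime
    (the dominant term in typical FaaS billing), subject to an optional
    latency SLO."""
    a, b = fit_runtime_model(samples)
    feasible = [m for m in choices if slo_s is None or a + b / m <= slo_s]
    if not feasible:
        feasible = [max(choices)]  # nothing meets the SLO: take the fastest
    return min(feasible, key=lambda m: m * (a + b / m))
```

A few profiled samples are enough to fit the curve and then answer "which memory size is cheapest while still meeting the latency target?" without exhaustively benchmarking every configuration.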
AsyFunc: A High-Performance and Resource-Efficient Serverless Inference System via Asymmetric Functions [] []
HUST & Huawei & Peng Cheng Laboratory
Problem: The time-consuming and resource-hungry model-loading process when scaling out function instances
Observation: The sensitivity of each layer to computing resources is mostly anti-correlated with its memory resource usage
Asymmetric Functions
The original Body Function loads a complete model to meet stable demands
The proposed lightweight Shadow Function loads only a portion of resource-sensitive layers to absorb sudden demand
AsyFunc — an inference serving system with an auto-scaling and scheduling engine; Built on top of Knative
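The body/shadow split can be sketched as a greedy selection that loads only layers with high compute sensitivity per MB of memory under a budget. The scoring rule and the layer data below are invented for illustration:

```python
def pick_shadow_layers(layers, mem_budget_mb):
    """Greedy sketch: prefer layers whose compute sensitivity is high
    relative to their memory footprint. Per the observation above, these
    two are mostly anti-correlated, so the most sensitive layers tend to
    be cheap to load."""
    ranked = sorted(layers,
                    key=lambda l: l["sensitivity"] / l["mem_mb"],
                    reverse=True)
    chosen, used = [], 0
    for layer in ranked:
        if used + layer["mem_mb"] <= mem_budget_mb:
            chosen.append(layer["name"])
            used += layer["mem_mb"]
    return chosen
```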
Chitu: Accelerating Serverless Workflows with Asynchronous State Replication Pipeline [] []
ISCAS & ICT, CAS
Asynchronous State Replication Pipelines (ASRP) to speed up serverless workflows for general applications
Three insights
Provide differentiable data types (DDT) at the programming model level to support incremental state sharing and computation
Continuously deliver changes of DDT objects in real-time
Direct communication and change propagation
Built atop OpenFaaS
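A toy differentiable data type illustrating incremental state sharing: the producer records per-key deltas that can be streamed to consumers before the function finishes. The API is invented and far simpler than Chitu's DDTs:

```python
class DiffDict:
    """Toy differentiable data type (DDT): records per-key changes so a
    producer can stream deltas to consumers instead of shipping the full
    state once at the end of the function."""
    def __init__(self):
        self._data = {}
        self._log = []  # ordered change log: (key, value)

    def __setitem__(self, key, value):
        self._data[key] = value
        self._log.append((key, value))

    def drain_changes(self):
        """Return and clear pending deltas (what the pipeline would push)."""
        out, self._log = self._log, []
        return out

def apply_changes(state, changes):
    """Consumer side: fold received deltas into a local replica, so the
    downstream function can start computing on partial state."""
    for key, value in changes:
        state[key] = value
    return state
```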
How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads [] []
Huawei
Industry Paper
Two new serverless traces in Huawei Cloud
The first trace: Huawei's internal workloads; Per-second statistics for 200 functions
The second trace: Huawei's public FaaS platform; Per-minute arrival rates for over 5000 functions
Characterize resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity.
Findings
Requests vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day
Scheduling time, execution time and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails
Function invocation counts demonstrate strong periodicity for many individual functions and on an aggregate level
The need for further research in estimating resource reservations and time-series prediction
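Two simple statistics recover the burstiness and periodicity findings on any per-minute arrival series: the coefficient of variation and the autocorrelation at the daily lag. This is a sketch of how one might measure such traces, not the paper's exact methodology:

```python
import math

def coeff_of_variation(xs):
    """CV = std / mean; higher means burstier arrivals."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return math.sqrt(var) / mean

def autocorr(xs, lag):
    """Autocorrelation at a given lag; a value near 1.0 at lag = one day
    (1440 minutes) indicates strong daily periodicity."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean)
              for i in range(len(xs) - lag)) / (len(xs) - lag)
    return cov / var
```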
Function as a Function []
ETH
Vision Paper
Dandelion -- a clean-slate FaaS system; Treat serverless functions as pure functions; Explicitly separate computation and I/O; Hardware acceleration; Enable dataflow-aware function orchestration
The Gap Between Serverless Research and Real-world Systems []
SJTU & Huawei Cloud
Vision Paper
Five open challenges
Optimize cold-start latency: Most existing works only consider synchronous starts, whereas asynchronous starts are common in industry
Declarative approach: Is Kubernetes the right system for serverless computing?
Scheduling cost
Balance different scheduling policies within a serverless system
Costs of sidecars
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale []
MIT & NEU
GPU power capping yields significant decreases in both temperature and power draw, potentially improving hardware life-span, with minimal impact on job performance