VLDB 2024
Meta Info
Homepage: https://vldb.org/2024/
Paper list
Papers
Resource Management
Big data analytics workloads
Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service [Paper]
Microsoft
Predict usage patterns using a hybrid ML model; optimize the pool size dynamically.
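The proactive pool-sizing idea can be sketched roughly as follows: predict near-term demand from recent usage and provision the warm pool ahead of requests. This is a minimal sketch; the class name, the headroom parameter, and the simple moving-average predictor are illustrative stand-ins for the paper's hybrid ML model.

```python
from collections import deque


class ProactivePool:
    """Illustrative sketch of proactive provisioning: predict demand
    from a sliding window of usage samples and size the pool with a
    safety buffer (not the paper's actual model)."""

    def __init__(self, window=5, headroom=0.2):
        self.history = deque(maxlen=window)  # recent usage samples
        self.headroom = headroom             # safety buffer fraction

    def observe(self, used):
        """Record one usage sample."""
        self.history.append(used)

    def target_size(self):
        """Predicted demand plus headroom, rounded up."""
        if not self.history:
            return 0
        predicted = sum(self.history) / len(self.history)
        return int(predicted * (1 + self.headroom)) + 1
```

A real system would replace the moving average with the paper's learned predictor and re-optimize the pool size on every scheduling tick.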
Job scheduling
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling
ByteDance
Autoscaling
OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud [arXiv]
Ant Group
Serverless
Resource Management in Aurora Serverless [Paper]
AWS
Industry Paper
Model Serving
Approximate Inference
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines [Paper] [arXiv] [Code]
CUHK
Approximate input features to accelerate inference pipelines.
Trade-off between latency and accuracy.
Evaluation: all inference pipelines were implemented in Python with scikit-learn and run on CPU servers.
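The latency/accuracy trade-off above can be sketched as a confidence-gated fallback: run the model on cheap approximate features first, and recompute exact features only when the prediction looks uncertain. Function names and the confidence-threshold mechanism here are illustrative assumptions, not Biathlon's actual API.

```python
def approximate_then_refine(cheap_feat, exact_feat, predict, confidence,
                            threshold=0.9):
    """Illustrative sketch: exploit model resilience by predicting on
    approximate inputs and falling back to exact featurization only
    when confidence is below the threshold."""
    y = predict(cheap_feat())        # fast path: approximate features
    if confidence(y) >= threshold:
        return y                     # approximation was good enough
    return predict(exact_feat())     # slow path: exact features
```

The gate is where the trade-off lives: a higher threshold buys accuracy at the cost of more exact (slow) feature computations.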
InferDB: In-Database Machine Learning Inference Using Indexes [Paper]
Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago
Approximate ML inference pipelines using index structures available in DBMS.
Predictions are preserved in the embedding space; select binned features for indexing.
IMO: Aggressive...
Edge Computing
SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments [Paper] [Code]
ZJU & Alibaba
SmartLite is a lightweight DBMS-based serving system.
Store the parameters and structural information of neural networks as database tables.
Implement neural network operators inside the DBMS engine.
Quantize model parameters as binarized values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value lookup operations of the DBMS.
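The "tensor manipulations as DBMS lookups" idea can be sketched with SQLite: store a layer's weight matrix as a relational table and evaluate the matrix-vector product with a SQL join and aggregate. The schema and query are illustrative assumptions, not SmartLite's actual design (which additionally binarizes and prunes the parameters).

```python
import sqlite3


def matvec_in_db(weights, x):
    """Illustrative sketch: weights stored as (row, col, val) tuples,
    matrix-vector product computed inside the database engine."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE w (r INT, c INT, v REAL)")
    con.execute("CREATE TABLE xv (c INT, v REAL)")
    con.executemany("INSERT INTO w VALUES (?, ?, ?)",
                    [(r, c, v) for r, row in enumerate(weights)
                               for c, v in enumerate(row)])
    con.executemany("INSERT INTO xv VALUES (?, ?)", list(enumerate(x)))
    rows = con.execute(
        "SELECT w.r, SUM(w.v * xv.v) FROM w JOIN xv ON w.c = xv.c "
        "GROUP BY w.r ORDER BY w.r").fetchall()
    con.close()
    return [s for _, s in rows]
```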
Notebook
Feature Stores
RALF: Accuracy-Aware Scheduling for Feature Store Maintenance [Paper]
UC Berkeley
Limitations of existing works
They naively apply a one-size-fits-all policy for when and how to update features.
They do not consider query access patterns or the impact on prediction accuracy.
Feature store regret: a metric for how much featurization degrades downstream accuracy.
Leverage downstream error feedback to minimize feature store regret.
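The error-feedback scheduling idea can be sketched as a budgeted top-k refresh: given per-key downstream error feedback (a proxy for feature store regret), update only the keys whose staleness hurts accuracy most. The input/output shapes here are illustrative assumptions, not RALF's API.

```python
import heapq


def schedule_updates(error_feedback, budget):
    """Illustrative sketch of accuracy-aware maintenance: refresh the
    `budget` keys with the highest downstream error feedback, instead
    of applying a one-size-fits-all update policy to every key."""
    return heapq.nlargest(budget, error_feedback, key=error_feedback.get)
```

Under a fixed update budget, this prioritization is what lets the scheduler spend feature-refresh work where it reduces regret the most.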
Data Pre-processing
FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation [Paper] [Code]
UNIST
Cooperatively utilizes both CPUs and GPUs to accelerate the data-augmentation-heavy preprocessing stage of DL training.
Orchestrate data preprocessing tasks across CPUs and GPUs while minimizing interference with GPU-based model training.
Deep Learning Recommendation Model (DLRM)
Graph Neural Network (GNN)
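The CPU-GPU cooperation above can be sketched as a spill policy: fill CPU workers with preprocessing tasks first, then offload the remainder onto GPU cycles left idle by training, so the GPU helps without stalling the training job. The slot-counting abstraction is an illustrative assumption, not FusionFlow's actual scheduler.

```python
def assign_tasks(num_tasks, cpu_slots, gpu_idle_slots):
    """Illustrative sketch: place augmentation tasks on CPUs first,
    spill the overflow to idle GPU slots, and report any backlog.
    Returns (on_cpu, on_gpu, backlog)."""
    on_cpu = min(num_tasks, cpu_slots)            # CPUs absorb first
    on_gpu = min(num_tasks - on_cpu, gpu_idle_slots)  # GPUs take overflow
    backlog = num_tasks - on_cpu - on_gpu         # deferred to next round
    return on_cpu, on_gpu, backlog
```

Keeping the GPU as the overflow target (rather than the default) is what minimizes interference with GPU-based model training.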