# VLDB 2024

## Meta Info

Homepage: <https://vldb.org/2024/>

### Paper list

* <https://vldb.org/2024/?program-schedule>
* <https://www.vldb.org/pvldb/volumes/17/>

## Papers

### Resource Management

* DL training workloads
  * Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Saturn%3A%20An%20Optimized%20Data%20System%20for%20Multi-Large-Model%20Deep%20Learning%20Workloads)] \[[Code](https://github.com/knagrecha/saturn)]
    * UCSD
    * **Saturn** tackles the *SPASE* problem: Select a Parallelism, Allocate resources, and Schedule; the three decisions are formulated jointly as an MILP.
* Big data analytic workloads
  * Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Intelligent%20Pooling%3A%20Proactive%20Resource%20Provisioning%20in%20Large-scale%20Cloud%20Service)]
    * Microsoft
    * Predict usage patterns using a hybrid ML model; optimize the pool size dynamically.
* Job scheduling
  * ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling
    * ByteDance
* Autoscaling
  * OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud \[[arXiv](https://arxiv.org/abs/2311.12864)]
    * Ant Group
* Serverless
  * Resource Management in Aurora Serverless \[[Paper](https://www.amazon.science/publications/resource-management-in-aurora-serverless)]
    * AWS
    * Industry Paper
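
Saturn's joint SPASE formulation can be sketched in miniature: pick a parallelism technique per model, allocate GPUs, and schedule all models concurrently to minimize makespan. Here exhaustive search stands in for the paper's MILP solver, and the technique names and throughput curves are purely illustrative:

```python
from itertools import product

# Hypothetical throughput (samples/sec) per parallelism technique as a
# function of GPU count; the numbers are purely illustrative.
TECHNIQUES = {
    "data_parallel": lambda g: 90 * g,
    "pipeline_parallel": lambda g: 60 * g + 80,
}

def best_runtime(samples, gpus):
    """Select the parallelism technique minimizing runtime for this allocation."""
    if gpus == 0:
        return float("inf"), None
    return min((samples / f(gpus), name) for name, f in TECHNIQUES.items())

def spase_exhaustive(workloads, total_gpus):
    """Jointly Select parallelism, Allocate GPUs, and Schedule all models
    concurrently, minimizing makespan (exhaustive search in place of an MILP)."""
    best_makespan, best_plan = float("inf"), None
    for alloc in product(range(1, total_gpus + 1), repeat=len(workloads)):
        if sum(alloc) != total_gpus:
            continue  # only consider allocations that use the whole cluster
        plans = [best_runtime(w, g) for w, g in zip(workloads, alloc)]
        makespan = max(t for t, _ in plans)
        if makespan < best_makespan:
            best_makespan = makespan
            best_plan = [(g, tech) for g, (_, tech) in zip(alloc, plans)]
    return best_makespan, best_plan
```

The point of the joint formulation is that the best technique depends on the allocation, so the two choices cannot be optimized separately.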
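
Intelligent Pooling's proactive provisioning can likewise be sketched with exponential smoothing standing in for the paper's hybrid ML predictor; the smoothing factor and safety margin below are assumptions, not values from the paper:

```python
import math

def forecast_demand(history, alpha=0.5):
    """Exponentially smoothed demand forecast (a simple stand-in for the
    paper's hybrid ML prediction model)."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def pool_size(history, safety=1.2):
    """Provision the pool proactively: forecast plus a safety margin,
    trading idle-resource cost against cold-start latency."""
    return math.ceil(forecast_demand(history) * safety)
```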

### Model Serving

* Approximate Inference
  * Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines \[[Paper](https://dl.acm.org/doi/abs/10.14778/3675034.3675052)] \[[arXiv](https://arxiv.org/abs/2405.11191)] \[[Code](https://github.com/ChaokunChang/Biathlon)]
    * CUHK
    * *Approximate* input features to accelerate inference pipelines.
    * Trade-off between latency and accuracy.
    * Evaluation: all inference pipelines are implemented in Python with scikit-learn and run on CPU servers.
  * InferDB: In-Database Machine Learning Inference Using Indexes \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/InferDB%3A%20In-Database%20Machine%20Learning%20Inference%20Using%20Indexes)]
    * Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago
    * Approximate ML inference pipelines using index structures available in DBMS.
    * Predictions are preserved in the embedding space; select binned features for indexing.
    * IMO: Aggressive...
* Edge Computing
  * SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/SmartLite%3A%20A%20DBMS-based%20Serving%20System%20for%20DNN%20Inference%20in%20Resource-constrained%20Environments)] \[[Code](https://github.com/lynn2089/SmartLite)]
    * ZJU & Alibaba
    * **SmartLite**, a lightweight DBMS-based serving system
      * Store the parameters and structural information of neural networks as database tables.
      * Implement neural network operators inside the DBMS engine.
      * Quantize model parameters as binarized values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value lookup operations of the DBMS.
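
A toy illustration of the SmartLite idea of keeping network parameters in relational tables and evaluating operators inside the DBMS: a binarized (+1/-1) linear layer stored as weight rows in SQLite, with the matrix-vector product expressed as a join plus aggregation. The schema and SQL are invented for illustration, not the system's actual design:

```python
import sqlite3

# Each row is one weight bit (+1/-1) of a 2x2 binarized layer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (neuron INT, idx INT, w INT)")
conn.executemany(
    "INSERT INTO weights VALUES (?, ?, ?)",
    [(0, 0, 1), (0, 1, -1), (1, 0, -1), (1, 1, 1)],
)

def forward(x):
    """Binarized matvec as SQL: join inputs against the weight table,
    aggregate per neuron, then apply a sign activation."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS inputs (idx INT, v INT)")
    conn.execute("DELETE FROM inputs")
    conn.executemany("INSERT INTO inputs VALUES (?, ?)", list(enumerate(x)))
    rows = conn.execute(
        "SELECT w.neuron, SUM(w.w * i.v) FROM weights w "
        "JOIN inputs i ON w.idx = i.idx GROUP BY w.neuron ORDER BY w.neuron"
    ).fetchall()
    return [1 if s >= 0 else -1 for _, s in rows]
```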
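
InferDB's replacement of model invocation with an index over binned features can be sketched similarly; fixed bin edges and a majority-vote index are simplifications of the paper's learned, prediction-preserving binning:

```python
from collections import Counter, defaultdict

def bin_feature(x, edges):
    """Map a continuous value to a bin id via sorted edge thresholds."""
    for i, e in enumerate(edges):
        if x < e:
            return i
    return len(edges)

def build_index(rows, labels, edges_per_feature):
    """Precompute one prediction per bin combination so that inference
    becomes a pure index lookup instead of a model invocation."""
    votes = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(bin_feature(x, e) for x, e in zip(row, edges_per_feature))
        votes[key][y] += 1
    return {k: c.most_common(1)[0][0] for k, c in votes.items()}

def predict(index, row, edges_per_feature, default=0):
    """Inference as index lookup; fall back to a default for unseen bins."""
    key = tuple(bin_feature(x, e) for x, e in zip(row, edges_per_feature))
    return index.get(key, default)
```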

### Notebook

* ElasticNotebook: Enabling Live Migration for Computational Notebooks \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/ElasticNotebook%3A%20Enabling%20Live%20Migration%20for%20Computational%20Notebooks)] \[[arXiv](https://arxiv.org/abs/2309.11083)] \[[Code](https://github.com/illinoisdata/ElasticNotebook)]
  * UIUC & UMich
  * Live migration via checkpointing/restoration.
  * Reconstruct all variables from a subset of variables.
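
A toy sketch of the checkpoint-then-reconstruct idea: serialize only a small subset of variables and replay recorded cell code to rebuild the rest on the destination. ElasticNotebook decides per-variable whether storing or recomputing is cheaper; the replay log here is hand-written:

```python
# A live session holds both a small root variable and a large derived one.
session = {}
exec("a = 2", session)
exec("big = [a] * 5", session)      # derivable from `a`

# Checkpoint: serialize only the cheap-to-store root; record the code
# needed to rebuild the expensive-to-serialize variable instead.
checkpoint = {"a": session["a"]}
replay_log = ["big = [a] * 5"]

# Restore on the destination: load the checkpoint, then replay.
restored = dict(checkpoint)
for cell in replay_log:
    exec(cell, restored)
```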

### Feature Stores

* RALF: Accuracy-Aware Scheduling for Feature Store Maintenance \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/RALF%3A%20Accuracy-Aware%20Scheduling%20for%20Feature%20Store%20Maintenance)]
  * UC Berkeley
  * Limitations of existing works
    * Naively apply a one-size-fits-all policy for when and how to update features.
    * Do not consider query access patterns or impacts on prediction accuracy.
  * *Feature store regret*: a metric for how much featurization degrades downstream accuracy.
  * Leverage *downstream error feedback* to minimize feature store regret.
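
The regret-driven scheduling idea can be sketched as a budgeted refresh that prioritizes the keys whose staleness hurts downstream accuracy most; the per-key error values here are hypothetical stand-ins for the downstream error feedback:

```python
def schedule_updates(staleness_error, budget):
    """Accuracy-aware maintenance sketch: refresh the feature keys with
    the highest attributed downstream error, subject to an update budget,
    instead of applying a one-size-fits-all update policy."""
    ranked = sorted(staleness_error, key=staleness_error.get, reverse=True)
    return ranked[:budget]
```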

### Data Pre-processing

* FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/FusionFlow%3A%20Accelerating%20Data%20Preprocessing%20for%20Machine%20Learning%20with%20CPU-GPU%20Cooperation)] \[[Code](https://github.com/omnia-unist/FusionFlow)]
  * UNIST
  * Cooperatively utilizes *both CPUs and GPUs* to accelerate the data preprocessing stage of DL training, which runs the data augmentation algorithms.
  * Orchestrate data preprocessing tasks across CPUs and GPUs while minimizing *interference* with GPU-based model training.
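
A toy sketch of the CPU-GPU cooperation: route part of each batch to spare "GPU" capacity and the rest to a CPU worker pool. The fixed `gpu_slots` split is an invented stand-in for FusionFlow's interference-aware scheduler, which decides dynamically how much work the GPU can absorb without slowing training:

```python
from concurrent.futures import ThreadPoolExecutor

def augment_cpu(x):
    """Stand-in data augmentation running on a CPU worker."""
    return x * 2

def augment_gpu(x):
    """The same transform on the (simulated) GPU path."""
    return x * 2

def preprocess_batch(batch, gpu_slots):
    """Split one batch's preprocessing between spare GPU cycles and a
    CPU worker pool, then merge the results in order."""
    gpu_part, cpu_part = batch[:gpu_slots], batch[gpu_slots:]
    with ThreadPoolExecutor() as pool:
        cpu_out = list(pool.map(augment_cpu, cpu_part))
    gpu_out = [augment_gpu(x) for x in gpu_part]
    return gpu_out + cpu_out
```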

### Deep Learning Recommendation Model (DLRM)

* DLRover: Resource Optimization for Deep Recommendation Models Training at AntGroup \[[arXiv](https://arxiv.org/abs/2304.01468)] \[[Code](https://github.com/intelligent-machine-learning/dlrover)]
  * Ant Group & Sichuan University

### Graph Neural Network (GNN)

* Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Accelerating%20Sampling%20and%20Aggregation%20Operations%20in%20GNN%20Frameworks%20with%20GPU%20Initiated%20Direct%20Storage%20Accesses)] \[[Code](https://github.com/jeongminpark417/GIDS)]
  * UIUC & NVIDIA
  * **GIDS** (GPU Initiated Direct Storage Access) -> a data loader that utilizes all hardware resources (i.e., CPU memory, storage, and GPU memory).
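
The multi-tier loading idea can be sketched as a cache hierarchy over GPU memory, CPU memory, and storage; a dict stands in for NVMe here, whereas GIDS actually issues GPU-initiated direct storage reads to bypass the CPU on misses:

```python
class TieredLoader:
    """Toy tiered feature loader: check the GPU cache first, then CPU
    memory, then fall back to storage, promoting hot entries upward."""

    def __init__(self, storage):
        self.gpu_cache, self.cpu_cache, self.storage = {}, {}, storage

    def get(self, key):
        if key in self.gpu_cache:
            return self.gpu_cache[key]
        if key in self.cpu_cache:
            val = self.cpu_cache[key]
        else:
            val = self.storage[key]     # the direct-storage-access path
            self.cpu_cache[key] = val
        self.gpu_cache[key] = val       # promote hot features to GPU memory
        return val
```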
