# VLDB 2024

## Meta Info

Homepage: <https://vldb.org/2024/>

### Paper list

* <https://vldb.org/2024/?program-schedule>
* <https://www.vldb.org/pvldb/volumes/17/>

## Papers

### Resource Management

* DL training workloads
  * Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Saturn%3A%20An%20Optimized%20Data%20System%20for%20Multi-Large-Model%20Deep%20Learning%20Workloads)] \[[Code](https://github.com/knagrecha/saturn)]
    * UCSD
    * **Saturn** -> SPASE: Select a Parallelism, Allocate resources, and Schedule; formulate the joint SPASE problem as an MILP.
* Big data analytic workloads
  * Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Intelligent%20Pooling%3A%20Proactive%20Resource%20Provisioning%20in%20Large-scale%20Cloud%20Service)]
    * Microsoft
    * Predict usage patterns using a hybrid ML model; optimize the pool size dynamically.
* Job scheduling
  * ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling
    * ByteDance
* Autoscaling
  * OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud \[[arXiv](https://arxiv.org/abs/2311.12864)]
    * Ant Group
* Serverless
  * Resource Management in Aurora Serverless \[[Paper](https://www.amazon.science/publications/resource-management-in-aurora-serverless)]
    * AWS
    * Industry Paper

### Model Serving

* Approximate Inference
  * Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines \[[Paper](https://dl.acm.org/doi/abs/10.14778/3675034.3675052)] \[[arXiv](https://arxiv.org/abs/2405.11191)] \[[Code](https://github.com/ChaokunChang/Biathlon)]
    * CUHK
    * *Approximate* input features to accelerate inference pipelines.
    * Trade-off between latency and accuracy.
    * Evaluation: All inference pipelines were implemented using Python and scikit-learn; run on CPU servers.
  * InferDB: In-Database Machine Learning Inference Using Indexes \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/InferDB%3A%20In-Database%20Machine%20Learning%20Inference%20Using%20Indexes)]
    * Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago
    * Approximate ML inference pipelines using index structures available in DBMS.
    * Predictions are preserved in the embedding space; select binned features for indexing.
    * IMO: Aggressive...
* Edge Computing
  * SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/SmartLite%3A%20A%20DBMS-based%20Serving%20System%20for%20DNN%20Inference%20in%20Resource-constrained%20Environments)] \[[Code](https://github.com/lynn2089/SmartLite)]
    * ZJU & Alibaba
    * SmartLite, a lightweight DBMS
      * Store the parameters and structural information of neural networks as database tables.
      * Implement neural network operators inside the DBMS engine.
      * Quantize model parameters as binarized values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value lookup operations of the DBMS.
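
To make the SmartLite notes above more concrete, here is a minimal sketch of the general idea of keeping layer parameters in database tables and evaluating an operator with a query. It assumes SQLite as a stand-in DBMS, weights and inputs binarized to +1/-1, and hypothetical table and function names (`layer_weights`, `layer_input`, `binarized_dense_layer`). It uses a join plus aggregation rather than SmartLite's value-lookup scheme, so treat it as an illustration, not the paper's implementation.

```python
import sqlite3


def binarized_dense_layer(weights, inputs):
    """Evaluate sign(W @ x) inside SQLite, with W and x stored as tables.

    Assumption: all values are +1 or -1 (binarized), and a sum of 0 maps to +1.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE layer_weights (out_idx INTEGER, in_idx INTEGER, val INTEGER)"
    )
    conn.execute("CREATE TABLE layer_input (in_idx INTEGER, val INTEGER)")

    # One row per weight: (output neuron, input neuron, binarized value).
    conn.executemany(
        "INSERT INTO layer_weights VALUES (?, ?, ?)",
        [(i, j, w) for i, row in enumerate(weights) for j, w in enumerate(row)],
    )
    conn.executemany(
        "INSERT INTO layer_input VALUES (?, ?)", list(enumerate(inputs))
    )

    # The matrix-vector product becomes a join plus a grouped sum,
    # followed by a sign activation expressed as a CASE.
    rows = conn.execute(
        """
        SELECT w.out_idx,
               CASE WHEN SUM(w.val * x.val) >= 0 THEN 1 ELSE -1 END
        FROM layer_weights AS w
        JOIN layer_input AS x ON w.in_idx = x.in_idx
        GROUP BY w.out_idx
        ORDER BY w.out_idx
        """
    ).fetchall()
    conn.close()
    return [activation for _, activation in rows]


outputs = binarized_dense_layer([[1, -1, 1], [-1, -1, 1]], [1, 1, 1])
print(outputs)
```

The point of the sketch is only that model structure and parameters can live in ordinary tables, so inference reuses the engine's existing query machinery.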

### Notebook

* ElasticNotebook: Enabling Live Migration for Computational Notebooks \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/ElasticNotebook%3A%20Enabling%20Live%20Migration%20for%20Computational%20Notebooks)] \[[arXiv](https://arxiv.org/abs/2309.11083)] \[[Code](https://github.com/illinoisdata/ElasticNotebook)]
  * UIUC & UMich
  * Live migration via checkpointing/restoration.
  * Reconstruct all variables from a subset of variables.
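
The idea of restoring a session from a subset of variables can be sketched as follows. This is a toy model, not ElasticNotebook's implementation: the `cells` structure, the hand-picked stored subset, and the helper names are all hypothetical, and it assumes cells are deterministic and can be re-run in their original order.

```python
import pickle

# Hypothetical notebook: each cell is source code plus the variables it defines.
cells = [
    {"code": "data = list(range(5))", "defines": ["data"]},
    {"code": "squares = [v * v for v in data]", "defines": ["squares"]},
    {"code": "total = sum(squares)", "defines": ["total"]},
]


def run_notebook(cells):
    """Execute every cell in order and return the session namespace."""
    namespace = {}
    for cell in cells:
        exec(cell["code"], namespace)
    return namespace


def checkpoint(namespace, stored_vars):
    """Serialize only the chosen subset of variables."""
    return pickle.dumps({name: namespace[name] for name in stored_vars})


def restore(blob, cells):
    """Load the stored variables, then re-run cells whose outputs are missing."""
    namespace = pickle.loads(blob)
    for cell in cells:
        if not all(name in namespace for name in cell["defines"]):
            exec(cell["code"], namespace)
    return namespace


original = run_notebook(cells)
blob = checkpoint(original, ["data"])  # store one variable, recompute the rest
restored = restore(blob, cells)
print(restored["total"])
```

Here the stored subset is chosen by hand; deciding which variables to store and which to recompute is the interesting part and is left out of this sketch.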

### Feature Stores

* RALF: Accuracy-Aware Scheduling for Feature Store Maintenance \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/RALF%3A%20Accuracy-Aware%20Scheduling%20for%20Feature%20Store%20Maintenance)]
  * UC Berkeley
  * Limitations of existing works
    * Naively apply a one-size-fits-all policy for when and how to update features.
    * Do not consider query access patterns or impacts on prediction accuracy.
  * *Feature store regret*: a metric for how much featurization degrades downstream accuracy.
  * Leverage *downstream error feedback* to minimize feature store regret.
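
One way to picture error-feedback-driven maintenance is a scheduler that spends a limited update budget on the feature keys whose stale values have caused the most downstream error. The sketch below is an assumption-laden illustration: the greedy "largest accumulated error first" policy, the class name `ErrorFeedbackScheduler`, and its methods are hypothetical and are not taken from the RALF paper.

```python
from collections import defaultdict


class ErrorFeedbackScheduler:
    """Greedy scheduler: refresh the keys with the most accumulated error."""

    def __init__(self):
        # Accumulated downstream prediction error observed per feature key.
        self.accumulated_error = defaultdict(float)

    def report_error(self, key, error):
        """Record error feedback for a prediction that used `key`'s feature."""
        self.accumulated_error[key] += error

    def next_updates(self, budget):
        """Pick up to `budget` keys to refresh and reset their error."""
        ranked = sorted(
            self.accumulated_error,
            key=self.accumulated_error.get,
            reverse=True,
        )
        chosen = ranked[:budget]
        for key in chosen:
            self.accumulated_error[key] = 0.0
        return chosen


scheduler = ErrorFeedbackScheduler()
scheduler.report_error("user_a", 0.1)
scheduler.report_error("user_b", 0.7)
scheduler.report_error("user_c", 0.3)
scheduler.report_error("user_a", 0.5)
first_round = scheduler.next_updates(budget=2)
print(first_round)
```

In contrast to a one-size-fits-all refresh interval, a policy of this shape updates frequently queried, high-error keys first and leaves rarely used keys stale.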

### Data Pre-processing

* FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/FusionFlow%3A%20Accelerating%20Data%20Preprocessing%20for%20Machine%20Learning%20with%20CPU-GPU%20Cooperation)] \[[Code](https://github.com/omnia-unist/FusionFlow)]
  * UNIST
  * Cooperatively use *both CPUs and GPUs* to accelerate the data preprocessing stage of DL training, which runs data augmentation algorithms.
  * Orchestrate data preprocessing tasks across CPUs and GPUs while minimizing *interference* with GPU-based model training.

### Deep Learning Recommendation Model (DLRM)

* DLRover: Resource Optimization for Deep Recommendation Models Training at AntGroup \[[arXiv](https://arxiv.org/abs/2304.01468)] \[[Code](https://github.com/intelligent-machine-learning/dlrover)]
  * Ant Group & Sichuan University

### Graph Neural Network (GNN)

* Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses \[[Paper](https://www.vldb.org/pvldb/volumes/17/paper/Accelerating%20Sampling%20and%20Aggregation%20Operations%20in%20GNN%20Frameworks%20with%20GPU%20Initiated%20Direct%20Storage%20Accesses)] \[[Code](https://github.com/jeongminpark417/GIDS)]
  * UIUC & NVIDIA
  * GIDS: GPU Initiated Direct Storage Access -> A data loader to utilize all hardware resources (i.e., CPU memory, storage, and GPU memory)

