VLDB 2024

Meta Info

Homepage: https://vldb.org/2024/

Paper list

Papers

Resource Management

  • DL training workloads

    • Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads [Paper] [Code]

      • UCSD

      • Saturn -> SPASE: Select a Parallelism, Allocate resources, and Schedule; formulate the joint SPASE problem as a mixed-integer linear program (MILP). A toy sketch of such a formulation follows.
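
A minimal sketch of what a joint parallelism-selection / GPU-allocation MILP can look like, written with the PuLP modeling library. The model names, candidate configurations, and runtime estimates below are invented, and the formulation is simplified (all jobs are assumed to run concurrently, so there is no scheduling-over-time dimension); this is not Saturn's actual formulation.

```python
# Toy SPASE-flavored MILP: pick one (parallelism, GPU count) per model so that
# allocations fit on the cluster and the longest training run is minimized.
import pulp

models = ["bert", "gpt"]
total_gpus = 6

# est_runtime[(model, parallelism, gpus)] -> hours (synthetic estimates)
est_runtime = {
    ("bert", "data", 2): 10, ("bert", "data", 4): 6,
    ("bert", "pipeline", 2): 9, ("bert", "pipeline", 4): 7,
    ("gpt", "data", 2): 20, ("gpt", "data", 4): 12,
    ("gpt", "pipeline", 2): 16, ("gpt", "pipeline", 4): 11,
}
choices = list(est_runtime)

prob = pulp.LpProblem("spase_toy", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", choices, cat="Binary")  # config selectors
makespan = pulp.LpVariable("makespan", lowBound=0)
prob += makespan                                       # minimize finish time

for m in models:
    mine = [c for c in choices if c[0] == m]
    prob += pulp.lpSum(x[c] for c in mine) == 1        # exactly one config
    prob += pulp.lpSum(est_runtime[c] * x[c] for c in mine) <= makespan

# simplification: all models train at once, so GPU allocations must coexist
prob += pulp.lpSum(c[2] * x[c] for c in choices) <= total_gpus

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([c for c in choices if x[c].value() > 0.5], pulp.value(makespan))
```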

  • Big data analytic workloads

    • Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service [Paper]

      • Microsoft

      • Predict usage patterns with a hybrid ML model and dynamically optimize the pool size; a simplified sketch follows.
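
A hypothetical sketch of proactive pool sizing: forecast demand for the next window, then provision enough pooled instances to cover it with headroom. The naive moving-average forecast stands in for the paper's hybrid ML model; the class and parameter names are invented.

```python
# Proactive (rather than reactive) pool sizing: resize ahead of demand.
from collections import deque

class ProactivePool:
    def __init__(self, window=12, headroom=1.2, floor=2):
        self.history = deque(maxlen=window)  # recent per-interval demand
        self.headroom = headroom             # over-provisioning factor
        self.floor = floor                   # minimum pool size

    def observe(self, demand):
        self.history.append(demand)

    def target_size(self):
        if not self.history:
            return self.floor
        forecast = sum(self.history) / len(self.history)  # naive predictor
        return max(self.floor, round(forecast * self.headroom))

pool = ProactivePool()
for demand in [3, 5, 4, 8, 6]:
    pool.observe(demand)
print("pre-provision", pool.target_size(), "instances for the next window")
```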

  • Job scheduling

    • ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling

      • ByteDance

  • Autoscaling

    • OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud [arXiv]

      • Ant Group

  • Serverless

    • Resource Management in Aurora Serverless [Paper]

      • AWS

      • Industry Paper

Model Serving

  • Approximate Inference

    • Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines [Paper] [arXiv] [Code]

      • CUHK

      • Approximate input features to accelerate inference pipelines.

      • Trade off latency against accuracy; a toy sketch of the idea follows.

      • Evaluation: all inference pipelines are implemented in Python with scikit-learn and run on CPU servers.
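
An illustration of the latency/accuracy trade-off only, not Biathlon's actual mechanism (which reasons about each pipeline operator's resilience to input error): approximate an expensive aggregate feature from a sample, and fall back to the exact computation when the estimate looks unstable. Function names and the tolerance are invented.

```python
import random
import statistics

def expensive_feature(rows):
    return sum(rows) / len(rows)              # stand-in for a costly aggregate

def approx_feature(rows, n=100, rel_tol=0.05):
    sample = random.sample(rows, min(n, len(rows)))
    est = statistics.mean(sample)
    err = statistics.stdev(sample) / (len(sample) ** 0.5)  # standard error
    if est != 0 and err / abs(est) <= rel_tol:
        return est                            # estimate tight enough: use it
    return expensive_feature(rows)            # otherwise pay the exact cost

rows = [random.gauss(10, 2) for _ in range(100_000)]
print(approx_feature(rows), expensive_feature(rows))  # close, former is faster
```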

    • InferDB: In-Database Machine Learning Inference Using Indexes [Paper]

      • Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago

      • Approximate ML inference pipelines using index structures available in DBMS.

      • Bin features so that predictions are preserved in the embedding space, and select binned features to build the index; a toy lookup-based version follows this entry.

      • IMO: Aggressive...
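
A toy version of index-backed inference under strong simplifications: equal-width binning replaces InferDB's learned, prediction-preserving binning and feature selection, and a Python dict stands in for the DBMS index. All helper names are invented.

```python
from collections import Counter, defaultdict

def make_bins(values, k=8):
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0
    return lambda v: min(k - 1, max(0, int((v - lo) / width)))

def build_index(X, predict, binners):
    votes = defaultdict(Counter)               # bin tuple -> prediction votes
    for row in X:
        key = tuple(b(v) for b, v in zip(binners, row))
        votes[key][predict(row)] += 1
    return {k: c.most_common(1)[0][0] for k, c in votes.items()}

def lookup_predict(index, binners, row, default=0):
    return index.get(tuple(b(v) for b, v in zip(binners, row)), default)

predict = lambda row: int(row[0] + row[1] > 1.0)   # toy "model" to approximate
X = [(i / 50, j / 50) for i in range(50) for j in range(50)]
binners = [make_bins([r[d] for r in X]) for d in range(2)]
index = build_index(X, predict, binners)
print(lookup_predict(index, binners, (0.9, 0.4)))  # 1, without running the model
```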

  • Edge Computing

    • SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments [Paper] [Code]

      • ZJU & Alibaba

      • SmartLite is a lightweight DBMS-based serving system:

        • Store the parameters and structural information of neural networks as database tables.

        • Implement neural network operators inside the DBMS engine.

        • Quantize model parameters into binarized values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value-lookup operations of the DBMS; a toy SQL illustration follows.
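
A toy illustration of serving a neural-network layer from a DBMS: one linear layer's weights live in a SQLite table, and the matrix-vector product plus ReLU becomes a join-and-aggregate query. Table and column names are invented; SmartLite additionally binarizes and prunes the models and runs the operators inside the engine itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (neuron INT, idx INT, w REAL)")
conn.execute("CREATE TABLE inputs (idx INT, x REAL)")

# a 2-neuron linear layer over a 3-dimensional input, stored as rows
layer = {0: [0.5, -1.0, 2.0], 1: [1.0, 0.0, -0.5]}
conn.executemany("INSERT INTO weights VALUES (?, ?, ?)",
                 [(n, i, v) for n, ws in layer.items() for i, v in enumerate(ws)])
conn.executemany("INSERT INTO inputs VALUES (?, ?)", enumerate([1.0, 2.0, 3.0]))

# y[n] = relu(sum_i w[n][i] * x[i]) expressed as a join + group-by
rows = conn.execute("""
    SELECT neuron, MAX(0, SUM(w * x)) AS y
    FROM weights JOIN inputs USING (idx)
    GROUP BY neuron
""").fetchall()
print(rows)  # [(0, 4.5), (1, 0)] -> per-neuron ReLU outputs
```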

Notebook

  • ElasticNotebook: Enabling Live Migration for Computational Notebooks [Paper] [arXiv] [Code]

    • UIUC & UMich

    • Live migration via checkpointing/restoration.

    • Reconstruct all variables from a checkpointed subset by replaying the cells that produced the rest; see the sketch below.
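
A sketch of the reconstruct-from-subset idea: checkpoint only part of the session state, remember which cell produced each remaining variable, and replay those cells on restore. The cells and the variable-to-cell map are hand-written stand-ins for the lineage that ElasticNotebook captures automatically.

```python
import pickle

cells = {
    "c1": "import random; data = [random.random() for _ in range(3)]",
    "c2": "total = sum(data)",
}
producer = {"data": "c1", "total": "c2"}   # variable -> cell that defines it

def checkpoint(ns, keep):
    return pickle.dumps({k: ns[k] for k in keep})   # serialize only a subset

def restore(blob, want):
    ns = pickle.loads(blob)
    for var in want:
        if var not in ns:                  # recompute instead of migrating
            exec(cells[producer[var]], ns)
    return ns

ns = {}
for code in cells.values():                # "run the notebook"
    exec(code, ns)

blob = checkpoint(ns, keep=["data"])       # migrate only `data`
ns2 = restore(blob, want=["data", "total"])
print(ns2["total"] == ns["total"])         # True: `total` was rebuilt
```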

Feature Stores

  • RALF: Accuracy-Aware Scheduling for Feature Store Maintenance [Paper]

    • UC Berkeley

    • Limitations of existing works

      • Naively apply a one-size-fits-all policy for when and how to update features.

      • Do not consider query access patterns or impacts on prediction accuracy.

    • Feature store regret: a metric for how much featurization degrades downstream accuracy.

    • Leverage downstream error feedback to minimize feature store regret; a minimal scheduling sketch follows.
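
A minimal sketch of regret-driven maintenance: spend a limited per-round update budget on the feature keys whose staleness has accumulated the most downstream error. The scheduler class, keys, and error numbers are invented; RALF's actual policy is more involved.

```python
import heapq

class RegretScheduler:
    def __init__(self):
        self.regret = {}                   # key -> accumulated downstream error

    def feedback(self, key, error):
        # error observed on predictions that read this (possibly stale) feature
        self.regret[key] = self.regret.get(key, 0.0) + error

    def pick_updates(self, budget):
        chosen = heapq.nlargest(budget, self.regret, key=self.regret.get)
        for key in chosen:
            self.regret[key] = 0.0         # a refresh clears the regret
        return chosen

sched = RegretScheduler()
for key, err in [("user:1", 0.9), ("user:2", 0.1),
                 ("user:1", 0.5), ("user:3", 0.4)]:
    sched.feedback(key, err)
print(sched.pick_updates(budget=2))        # ['user:1', 'user:3']
```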

Data Pre-processing

  • FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation [Paper] [Code]

    • UNIST

    • Cooperatively utilize both CPUs and GPUs to accelerate the data-augmentation stage of data preprocessing for DL training.

    • Orchestrate preprocessing tasks across CPUs and GPUs while minimizing interference with GPU-based model training; a simplified sketch follows.
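
A heavily simplified sketch of CPU-GPU cooperative preprocessing: workers pull augmentation tasks from a shared queue, and the GPU worker participates only while training leaves the GPU idle. The placement logic and idleness signal are stand-ins; FusionFlow's scheduler is considerably more careful about interference.

```python
import queue
import threading

tasks = queue.Queue()
for i in range(32):
    tasks.put(i)                           # sample indices awaiting augmentation
done, gpu_busy = [], threading.Event()     # gpu_busy would track training steps

def augment(sample, device):
    return (sample, device)                # stand-in for real augmentation work

def worker(device):
    while True:
        if device == "gpu" and gpu_busy.is_set():
            continue                       # yield the GPU to the training step
        try:
            sample = tasks.get_nowait()
        except queue.Empty:
            return
        done.append(augment(sample, device))

threads = [threading.Thread(target=worker, args=(dev,))
           for dev in ("cpu", "cpu", "gpu")]
for t in threads: t.start()
for t in threads: t.join()
print(len(done), "samples preprocessed cooperatively")
```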

Deep Learning Recommendation Model (DLRM)

  • DLRover: Resource Optimization for Deep Recommendation Models Training at AntGroup [arXiv] [Code]

    • AntGroup & Sichuan University

Graph Neural Network (GNN)

  • Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses [Paper] [Code]

    • UIUC & NVIDIA

    • GIDS (GPU Initiated Direct Storage Access) -> a data loader that utilizes all hardware resources (GPU memory, CPU memory, and storage); a simplified lookup-order sketch follows.
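
A loose Python analogy for the tiered lookup a GIDS-style loader performs: serve a node's features from GPU memory first, then CPU memory, then storage. Only the lookup order is mirrored here; the point of GIDS is that storage reads are issued directly by GPU threads, which a host-side sketch cannot reproduce.

```python
gpu_cache, cpu_cache = {}, {}              # stand-ins for GPU / host memory

def read_from_storage(node_id):
    return [0.0] * 8                       # stand-in for an NVMe read

def fetch_feature(node_id):
    if node_id in gpu_cache:               # fastest: already on the GPU
        return gpu_cache[node_id]
    if node_id in cpu_cache:               # next: host memory over PCIe
        feat = cpu_cache[node_id]
    else:                                  # slowest: go to storage
        feat = read_from_storage(node_id)
        cpu_cache[node_id] = feat
    gpu_cache[node_id] = feat              # promote hot features to the GPU
    return feat

print(fetch_feature(42) is fetch_feature(42))  # second call hits the GPU cache
```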
