SAND: A New Programming Abstraction for Video-based Deep Learning

Solves the heavy video preprocessing problem with an abstraction, planning, and priority scheduling that enable reuse


Venue: SOSP 2025 doi

Topic: Video-based Deep Learning, abstraction, scheduling, redundancy elimination


Summary

Solves the heavy video preprocessing problem with an abstraction, planning, and priority scheduling that enable reuse


Background

Preprocessing for video-based deep learning is difficult because processing compressed video data is complex and becomes a bottleneck

Complex Implementation: Pipelines

Preprocessing pipelines impose a heavy implementation burden: about 2x as much code as model training.

Overhead

Preprocessing normally runs on the CPU, but

the overhead is up to 6.5x higher than in the GPU case -> it becomes the bottleneck of the entire process

Offload to GPU

Reduces the overhead only partially, and reduces the memory available for training


Resources

Repeated decoding

effect

GPU underutilization -> drastically reduces the training throughput

Root cause: lack of system-level support for sharing decoded objects across independent jobs.


Related Work/ Existing Solutions

Existing solutions partially mitigate the problem, but they are inefficient and resource-constrained

Programming abstraction for video analysis

only application level

Image preprocessing overhead

does not address video-specific overhead: the repeated decoding problem

Streamed video pipeline

only basic platform-level capabilities, doesn’t handle the iterative nature of training


Solution: Storage level abstraction

Provides a file system that exposes handles to critical objects in the video training pipeline

Complex Implementation: Pipelines -> abstraction

Abstracting the pipeline away reduces developers’ workload and simplifies preprocessing management. Developers no longer need to build complex preprocessing pipelines or maintain relationships between data objects.

Fewer than 10 lines of code…

Overhead -> System Level Optimization

Solves the CPU-case overhead by leveraging system-level object reuse, which eliminates redundant computation and enables system-wide decisions for caching and scheduling data processing

Caching

intermediate objects
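To make the caching idea concrete, here is a minimal sketch; it is not SAND's actual implementation. It assumes a hypothetical `decode_video` function standing in for a real decoder and a hypothetical in-memory `DecodedCache`, both invented for illustration. Two independent jobs read the same videos, and the expensive decode step runs only once per video.

```python
from typing import Callable, Dict, List


class DecodedCache:
    """Shares decoded objects across jobs (hypothetical, illustration only)."""

    def __init__(self, decode: Callable[[str], List[int]]) -> None:
        self._decode = decode
        self._store: Dict[str, List[int]] = {}
        self.decode_calls = 0  # how many times the expensive step actually ran

    def get(self, video_id: str) -> List[int]:
        # Decode on first access only; later accesses reuse the stored object.
        if video_id not in self._store:
            self.decode_calls += 1
            self._store[video_id] = self._decode(video_id)
        return self._store[video_id]


def decode_video(video_id: str) -> List[int]:
    # Stand-in for real video decoding: returns four fake "frames".
    return [len(video_id) + i for i in range(4)]


def run_job(cache: DecodedCache, videos: List[str]) -> int:
    # Each job touches every video and returns the number of frames it saw.
    return sum(len(cache.get(v)) for v in videos)


videos = ["a.mp4", "b.mp4", "c.mp4"]
cache = DecodedCache(decode_video)
frames_job1 = run_job(cache, videos)
frames_job2 = run_job(cache, videos)  # second job reuses the decoded objects
```

Without the shared cache, the two jobs would decode six times in total; here `cache.decode_calls` stays at three, one per video.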


Design

1. Abstracts the preprocessing workflow with “views”

view

high level abstraction representing virtual objects that encapsulate the intermediate stages of video preprocessing
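One way to picture a view is as a lazily materialized node that knows which operation produces it from its parent. The sketch below is an assumption-laden toy: the `View` class, its fields, and the three stages are hypothetical names chosen for illustration and do not reflect SAND's real interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class View:
    """A virtual object for one preprocessing stage (hypothetical sketch)."""
    name: str
    op: Callable[[List[int]], List[int]]  # how to derive this view from its parent
    parent: Optional["View"] = None
    _data: Optional[List[int]] = None  # filled only once the view is materialized

    def materialize(self) -> List[int]:
        # Compute on first use, walking up the parent chain as needed.
        if self._data is None:
            source = self.parent.materialize() if self.parent else []
            self._data = self.op(source)
        return self._data


# Toy stages: "decoded" yields 8 fake frames, "sampled" keeps every other
# frame, "augmented" applies a stand-in transformation.
decoded = View("decoded", lambda _: list(range(8)))
sampled = View("sampled", lambda frames: frames[::2], parent=decoded)
augmented = View("augmented", lambda frames: [f * 10 for f in frames], parent=sampled)

result = augmented.materialize()
```

Nothing is computed until `materialize()` is called, and each intermediate view keeps its result, so a second consumer of `decoded` or `sampled` would not trigger recomputation.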

2. Constructs a view materialization plan (models the video preprocessing workflow)

Generates an abstract view dependency graph per task, which serves as a blueprint. From it, constructs a materialization plan as a concrete graph per chunk of k epochs, and prunes the graph so that materialization fits under the storage limit.
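The pruning step can be illustrated with a simple greedy heuristic. This is a stand-in technique, not the algorithm from the paper: it ignores dependencies between nodes and keeps whichever candidates save the most recomputation time per megabyte. `PlanNode`, `prune_plan`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlanNode:
    name: str
    size_mb: int      # storage needed if the node is materialized
    reuse_count: int  # how many reads it gets within the epoch chunk
    cost_s: float     # time to recompute it once


def prune_plan(nodes: List[PlanNode], storage_limit_mb: int) -> List[str]:
    """Greedy stand-in for graph pruning: keep the best time-saved-per-MB nodes."""

    def benefit_per_mb(n: PlanNode) -> float:
        # Materializing a node saves (reuse_count - 1) recomputations.
        return (n.reuse_count - 1) * n.cost_s / n.size_mb

    kept: List[str] = []
    used = 0
    for node in sorted(nodes, key=benefit_per_mb, reverse=True):
        if used + node.size_mb <= storage_limit_mb:
            kept.append(node.name)
            used += node.size_mb
    return kept


candidates = [
    PlanNode("decoded_frames", size_mb=800, reuse_count=6, cost_s=40.0),
    PlanNode("sampled_clips", size_mb=200, reuse_count=6, cost_s=5.0),
    PlanNode("augmented_clips", size_mb=300, reuse_count=1, cost_s=2.0),
]
plan = prune_plan(candidates, storage_limit_mb=1000)
```

With a 1000 MB limit, the sketch keeps the two reused nodes and drops `augmented_clips`, which is read only once and so gains nothing from being stored.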

3. Reduces redundant decoding operations by reusing decoded objects, acting like a well-functioning cache.

4. Parallelizes tasks and applies priority-based scheduling
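Priority-based scheduling combined with parallel execution can be sketched with the standard library alone. The example assumes that a lower number means higher priority (for instance, data needed soonest by training); that convention, the `run_by_priority` helper, and the task names are all hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def run_by_priority(
    tasks: List[Tuple[int, str, Callable[[], int]]], workers: int = 2
) -> List[Tuple[str, int]]:
    """Run tasks in priority order (lower number first), `workers` at a time."""
    # The sequence index breaks ties so the callables are never compared.
    heap = [(prio, seq, name, fn) for seq, (prio, name, fn) in enumerate(tasks)]
    heapq.heapify(heap)

    results: List[Tuple[str, int]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while heap:
            # Pop the next batch of highest-priority tasks and run it in parallel.
            batch = [heapq.heappop(heap) for _ in range(min(workers, len(heap)))]
            futures = [(name, pool.submit(fn)) for _, _, name, fn in batch]
            results.extend((name, fut.result()) for name, fut in futures)
    return results


tasks = [
    (2, "augment_clip_7", lambda: 7),
    (0, "decode_video_3", lambda: 3),
    (1, "sample_clip_5", lambda: 5),
]
order = run_by_priority(tasks)
```

Batching keeps the sketch deterministic and easy to follow; a real scheduler would more likely dispatch the next task as soon as any worker becomes free.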


Evaluation

SAND achieves significant improvements in training time and GPU utilization compared to both CPU and GPU preprocessing baselines.

Compared to CPU baseline, SAND improves training time up to 10.2× and GPU utilization 12.3× in hyperparameter search.

Compared to GPU baseline, SAND improves training time up to 2.8× and GPU utilization 2.9× in hyperparameter search.


Inputs

1. Running GPU consumes more energy than running CPU

2. Creativity comes from knowing prior work

The “view” concept comes from databases, and SAND adopts it

3. Having a plan is always better than not having a plan

strategic planning of preprocessing sequences to maximize the reuse of intermediate objects across multiple tasks.


Thoughts.