June 3, 2026

SAND: A New Programming Abstraction for Video-based Deep Learning

Solves the heavy video preprocessing problem with abstraction, planning, and priority scheduling for reuse

Venue: SOSP 2025 doi

Topic: Video-based Deep Learning, abstraction, scheduling, redundancy elimination

Summary

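SAND addresses the heavy preprocessing cost of video-based deep learning with a storage-level abstraction ("views"), materialization planning, and priority-based scheduling, so that intermediate objects are reused instead of recomputed.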

Background

Preprocessing for video-based deep learning is difficult because processing compressed video data is both complex to implement and a performance bottleneck.

Complex Implementation: Pipelines

Preprocessing pipelines impose a heavy implementation burden: about 2x as much code as the model training itself.

Overhead

Normally the CPU does the preprocessing, but

its overhead is up to 6.5x higher than in the GPU case -> bottleneck of the entire process

Offload to GPU

Reduces the overhead only partially, and reduces the memory available for training

Resources

Repeated decoding

The preprocessing workflow (from decoding to augmentation) must be repeated for each video in every epoch,
because the frames from previous epochs are rarely reusable. Decoded frames are never reused within the same epoch.
Decoding produces frames that go unused: due to video codec dependencies, extracting the required frames necessitates decoding many additional frames that are immediately discarded.
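
To make the wasted decoding concrete, here is a minimal sketch under a simplified codec model (an assumption for these notes, not from the paper): a keyframe every `gop_size` frames, and every other frame depends on all frames since the preceding keyframe. The function name and the numbers are illustrative only.

```python
def frames_decoded(target_frames, gop_size):
    """Return the set of frame indices that must be decoded to get target_frames.

    Simplified assumption: one keyframe every gop_size frames, and each
    non-keyframe needs every frame since its preceding keyframe.
    """
    decoded = set()
    for t in target_frames:
        keyframe = (t // gop_size) * gop_size  # nearest keyframe at or before t
        decoded.update(range(keyframe, t + 1))
    return decoded


targets = [7, 20, 58]  # frames the training sample actually needs
decoded = frames_decoded(targets, gop_size=30)
wasted = len(decoded) - len(targets)  # decoded, then immediately discarded
print(len(decoded), wasted)
```

With these made-up numbers, 3 useful frames cost 50 decoded frames. Real codecs with bidirectional frames have more involved dependencies, so this only shows the shape of the problem.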

Effect

GPU underutilization -> drastically reduces the training throughput

Existing approaches partially mitigate this, but are inefficient & resource-constrained

Programming abstraction for video analysis

Works only at the application level

Image preprocessing overhead

Does not address video-specific overhead: the repeated decoding problem

Streamed video pipeline

Offers only basic platform-level capabilities; doesn’t handle the iterative nature of training

Solution: Storage level abstraction

Provides a file system that exposes handles to the critical objects in the video training pipeline

Complex Implementation: Pipelines -> abstraction

Abstracting away the preprocessing -> reduces developers’ workload (simplified preprocessing management). No need to build complex preprocessing pipelines or maintain relationships between data objects.

Fewer than 10 lines of code…

Overhead -> System Level Optimization

Solves the CPU-case problem by leveraging system-level object reuse -> eliminates redundant computation. The abstraction also enables system-wide decisions for caching & scheduling data processing.

Caching

intermediate objects
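
A minimal sketch of why caching intermediate objects removes redundant work. The class, method names, and file name are hypothetical and are not SAND's API. Decoded frames are keyed by `(video, frame)`, so a second task asking for the same frame does not trigger another decode.

```python
class IntermediateCache:
    """Toy cache for intermediate preprocessing objects (hypothetical, not SAND's API)."""

    def __init__(self):
        self._store = {}
        self.decode_calls = 0

    def _decode(self, video, frame):
        # Stand-in for an expensive video decode.
        self.decode_calls += 1
        return f"{video}:frame{frame}"

    def get_frame(self, video, frame):
        key = (video, frame)
        if key not in self._store:
            self._store[key] = self._decode(video, frame)
        return self._store[key]


cache = IntermediateCache()
# Two hypothetical tasks (e.g. two hyperparameter trials) sampling the same frames.
for task in ("trial_a", "trial_b"):
    for frame in (0, 8, 16):
        cache.get_frame("clip.mp4", frame)
print(cache.decode_calls)
```

Six requests result in three decodes here. The sketch ignores eviction and storage limits, which is where planning comes in.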

Design

1. Abstracts the preprocessing workflow with “views”

view

A high-level abstraction: virtual objects that encapsulate the intermediate stages of video preprocessing

2. Constructs a view materialization plan (models the video preprocessing workflow)

Per task: generates an abstract view dependency graph, which serves as a blueprint.
Per chunk of k epochs: constructs the materialization plan as a concrete graph from that blueprint.
Under the storage limit: prunes the graph to decide what gets materialized.
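
A rough sketch of the planning idea above, under my own assumptions: each task is a chain of operations on a video, a concrete node is `(video, ops_prefix)`, and nodes shared by several tasks are the reuse candidates. The greedy pruning rule and the sizes are illustrative placeholders, not the paper's algorithm.

```python
from collections import Counter


def build_plan(tasks):
    """Merge per-task op chains into one concrete graph.

    Each node is (video, ops_prefix); the counter records how many
    task requests depend on that node.
    """
    uses = Counter()
    for video, ops in tasks:
        for i in range(1, len(ops) + 1):
            uses[(video, tuple(ops[:i]))] += 1
    return uses


def prune(uses, size_of, limit):
    """Illustrative greedy pruning: keep the most reused nodes that fit in limit."""
    kept, used = [], 0
    for node, count in sorted(uses.items(), key=lambda kv: (-kv[1], kv[0])):
        if count > 1 and used + size_of(node) <= limit:
            kept.append(node)
            used += size_of(node)
    return kept


# Hypothetical requests collected over one chunk of epochs.
tasks = [
    ("v1", ["decode", "resize", "flip"]),
    ("v1", ["decode", "resize", "crop"]),
    ("v1", ["decode", "blur"]),
]
uses = build_plan(tasks)
# Assumed sizes: decoded frames are large, later stages are smaller.
size_of = lambda node: 100 if len(node[1]) == 1 else 40
kept = prune(uses, size_of, limit=120)
print(kept)
```

Here the decoded frames of `v1` are needed three times and fit the budget, so they are kept; the resized view is shared twice but no longer fits and is pruned.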

3. Reduces redundant decoding operations by reusing intermediate objects, acting like a well-functioning cache.

4. Parallelizes the tasks and applies priority-based scheduling
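
A minimal sketch of priority-based scheduling, assuming the priority is "how soon training needs the result". That rule, the job names, and the step numbers are my assumptions, not details from the paper. Parallel workers would each pop the next job from the same queue.

```python
import heapq


def schedule(jobs):
    """Return job names in execution order, earliest-needed first.

    jobs: list of (name, needed_at_step). The priority rule is an assumption.
    """
    # The index i breaks ties so jobs with equal deadlines keep submission order.
    heap = [(needed_at, i, name) for i, (name, needed_at) in enumerate(jobs)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


jobs = [("decode v3", 5), ("decode v1", 1), ("augment v1", 2), ("decode v2", 3)]
order = schedule(jobs)
print(order)
```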

Evaluation

SAND achieves significant improvements in training time and GPU utilization compared to both CPU and GPU preprocessing baselines.

Compared to CPU baseline, SAND improves training time up to 10.2× and GPU utilization 12.3× in hyperparameter search.

Compared to GPU baseline, SAND improves training time up to 2.8× and GPU utilization 2.9× in hyperparameter search.

Inputs

1. Running GPU consumes more energy than running CPU

2. Creativity comes from knowing prior work

The “view” concept comes from databases; SAND borrows it

3. Having a plan is always better than not having a plan

strategic planning of preprocessing sequences to maximize the reuse of intermediate objects across multiple tasks.

Thoughts.

Have good APIs.
Literally “solve” the problem, e.g. SAND shifts the focus from data processing to model development.