Kinetic Modeling of Data Eviction in Cache
AET — composable, linear-time Miss Ratio Curve profiling via average eviction time sampling
Venue: ATC 2016
Link: USENIX ATC ‘16
Topic: Use average eviction time (AET) measured by sampling to compute Miss Ratio Curves in linear time with extremely low space overhead, enabling shared-cache MRC profiling.
Summary
Prior MRC profiling methods either lacked shared-cache support or had high overhead. AET models the average eviction time — a composable metric — allowing MRC computation in O(N) time with O(1) space via reservoir sampling. Key benefit: MRCs for multi-programmed workloads in shared cache can be computed directly from the AET of individual programs.
Background
Why MRC matters
- Miss Ratio Curves (MRCs) characterize a program’s locality — essential for cache partitioning and shared-cache optimization.
- MRC gives the miss ratio at each possible cache size.
Prior work limitations
- Counter Stacks and SHARDS: good accuracy, low time/space overhead.
- BUT: cannot characterize shared cache behavior by modeling individual programs.
- StatStack: assumes every reference has a next reuse → inaccurate for working-set-sized programs.
Key concept: reuse distance vs. eviction time
- Reuse distance: number of distinct accesses between two consecutive accesses to the same location.
- AET replaces stack-position tracking with time: “a block moves position whenever its reuse time exceeds the arrival time.”
- Miss ratio at cache size c = probability that reuse time > AET(c).
Key Idea
Average Eviction Time (AET)
- AET is composable: the shared-cache MRC of a multi-programmed workload can be derived from the AET of individual programs.
- Cache sharing means all co-run programs have the same AET.
- AET runs in linear time O(N) using sampled Reuse Time Histograms (RTH).
Efficiency via sampling
- Random sampling + reservoir sampling reduce space from O(M) to O(1).
- Phase sampling: for programs with unstable access patterns, divide execution into phases and average MRCs.
Design
MRC computation
miss_ratio(c)= probability that reuse time > AET(c).- Cold misses have infinite reuse time by definition.
Implementation
- Reservoir sampling — reduces RTH space complexity from O(M) to O(1).
- Phase sampling — evenly divide execution, compute per-phase MRC and average. Better accuracy for programs with non-stationary locality.
Evaluation
- Evaluated on CPU workloads and storage workloads.
- AET has the lowest time and space overhead among compared techniques (Counter Stacks, SHARDS, StatStack).
- AET in O(N); Counter Stacks in O(N log M) — significant practical speedup.
- Composability: AET can model shared cache, Counter Stacks and SHARDS cannot.
- Average accuracy improvement over StatStack: 35.8%.
- Phase AET sampling is more accurate than non-phase in most benchmarks.
New Knowledge
- Locality characterization techniques: useful keyword for Working Set Size estimation research.
- Reuse distance = number of distinct data accesses between two consecutive accesses to the same location.
- Cache sharing → all co-run programs share the same AET.
- Storage workloads are much larger than CPU workloads; their lifespan may last weeks or more.
Insight
- To estimate cache usage, exact stack position is unnecessary — relative time (reuse time) is sufficient.
- Trade-off: AET makes a statistical assumption (good accuracy in O(N)), Counter Stacks makes no assumption (perfect accuracy in O(N log M)).
- ABF (bursty interval sampling) is simpler but loses accuracy outside burst intervals.
Meeting Notes
(to be filled)