Replayable Execution Optimized for Page Sharing for a Managed Runtime Environment

Checkpoint with GC-compacted heap → share pages across containers → speedy restoration with fewer page faults

Venue: EuroSys

Topic: Extend checkpointing to optimize page sharing across containers for managed runtime environments, enabling speedy restoration with minimal page faults.


Summary

Standard checkpointing saves a process image to disk and restores it later. Key insight: triggering garbage collection (GC) immediately before the checkpoint compacts the heap → live objects sit adjacent in memory → the heap occupies fewer pages → more pages can be shared across container instances during restoration. Result: speedy restoration via shared memory pages and fewer page faults.
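
To make the "fewer pages" step concrete, here is a toy model of a heap before and after compaction. It is a sketch, not the paper's implementation: the 4 KiB page size, the object layout, and the helper names `pages_touched` and `compact` are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_touched(objects, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by (address, size) objects."""
    pages = set()
    for addr, size in objects:
        first = addr // page_size
        last = (addr + size - 1) // page_size
        pages.update(range(first, last + 1))
    return pages


def compact(objects, base=0):
    """Sliding compaction: pack live objects back to back, keeping address order."""
    compacted = []
    cursor = base
    for _, size in sorted(objects):
        compacted.append((cursor, size))
        cursor += size
    return compacted


# Hypothetical heap: 64 live objects of 256 bytes each, one per page.
# The rest of every page is garbage left behind by dead objects.
live = [(i * PAGE_SIZE, 256) for i in range(64)]

before = len(pages_touched(live))
after = len(pages_touched(compact(live)))
print(f"pages before compaction: {before}, after: {after}")
```

In this layout the same 16 KiB of live data spans 64 pages before compaction and 4 pages after, so a checkpoint taken after compaction has far fewer heap pages to write, map, and fault in.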


Background

Checkpointing in managed runtimes

The page sharing opportunity


Key Idea

GC before checkpoint → compact heap → better page sharing

  1. Trigger Garbage Collection before taking the checkpoint.
    • GC compacts the heap: live objects are packed together, eliminating gaps.
    • Objects end up closer together in virtual address space.
  2. Fewer pages needed to represent the heap.
  3. More page sharing across instances — containers can share the same physical pages during restoration.
  4. Fewer page faults on restoration — less memory needs to be re-faulted in from disk.
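
Steps 3 and 4 can be modeled with a small simulation of instances that start out mapped to the same checkpoint pages. This is a sketch under the assumption that sharing works like a copy-on-write mapping of the checkpoint image; the notes do not spell out the mechanism, and the `Instance` class and `physical_pages` helper are hypothetical names.

```python
class Instance:
    """A restored container whose heap pages initially point at the checkpoint image."""

    def __init__(self, image_pages):
        self.image = image_pages  # shared, read-only checkpoint pages
        self.private = {}         # page index -> private copy made on first write

    def write(self, page_index, data):
        # Copy-on-write: a write gives this instance its own copy of the page.
        self.private[page_index] = data

    def read(self, page_index):
        return self.private.get(page_index, self.image[page_index])


def physical_pages(image_pages, instances):
    """Count image pages still mapped by some instance, plus all private copies."""
    still_shared = sum(
        1
        for i in range(len(image_pages))
        if any(i not in inst.private for inst in instances)
    )
    private = sum(len(inst.private) for inst in instances)
    return still_shared + private


# Hypothetical checkpoint image: 4 heap pages, restored into 3 containers.
image = [bytes([i]) * 4096 for i in range(4)]
containers = [Instance(image) for _ in range(3)]
containers[0].write(1, b"\xff" * 4096)  # one container dirties one page

with_sharing = physical_pages(image, containers)
without_sharing = len(image) * len(containers)
print(f"physical pages with sharing: {with_sharing}, without: {without_sharing}")
```

In this model a page only costs extra memory once an instance writes to it, so the smaller the compacted image, the fewer pages each instance has to fault in and the more of its footprint stays shared.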

Design


Evaluation


Meeting Notes

(to be filled)