Reducing Minor Page Fault Overheads through Enhanced Page Walker

Hardware-software co-design offloads minor page fault critical path → 33× latency improvement, 6.6% runtime improvement

Venue: ACM TACO (Journal)
Link: ACM DL

Topic: Growing application memory footprints induce increasing minor page fault rates. Each fault takes thousands of CPU cycles. MFOE offloads the fault critical path to dedicated hardware and parallelizes pre-fault work in a background thread.


Summary

Minor page faults occur when a page is already in physical memory but the faulting process's page table has no valid mapping for it yet, so the MMU cannot translate the access. As application memory footprints grow, minor page fault overhead can reach 29% of execution time. MFOE (Minor Fault Offload Engine) splits fault handling into pre-fault, critical-path, and post-fault phases.
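
To relate the 29% overhead figure to end-to-end gains, a back-of-envelope Amdahl-style estimate can help. The sketch below is not from the paper: the function names are hypothetical, and it assumes the speedup applies to the entire fault-handling share of runtime, whereas the 33× figure in these notes refers to fault latency.

```python
def runtime_saved(fault_fraction, fault_speedup):
    """Fraction of total runtime removed when the fault-handling share of
    runtime becomes `fault_speedup` times faster (Amdahl-style estimate)."""
    if not 0.0 <= fault_fraction <= 1.0:
        raise ValueError("fault_fraction must be in [0, 1]")
    if fault_speedup <= 1.0:
        raise ValueError("fault_speedup must be greater than 1")
    return fault_fraction * (1.0 - 1.0 / fault_speedup)


def implied_fault_fraction(saved, fault_speedup):
    """Inverse of runtime_saved: the fault share that would yield `saved`."""
    if fault_speedup <= 1.0:
        raise ValueError("fault_speedup must be greater than 1")
    return saved / (1.0 - 1.0 / fault_speedup)


if __name__ == "__main__":
    # Worst case from the notes: faults take 29% of execution time.
    print(f"{runtime_saved(0.29, 33):.3f}")
    # Fault share that a 6.6% runtime improvement would imply.
    print(f"{implied_fault_fraction(0.066, 33):.3f}")
```

Under these assumptions, a workload spending 29% of its time in minor faults could save at most about 28% of runtime, while a 6.6% runtime improvement would correspond to a fault share of roughly 6.8%. The notes do not give the per-workload breakdown, so these numbers are only a sanity check.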


Background

Minor page fault definition

Growing problem


Key Idea

Decompose fault handling into three phases

  1. Pre-fault tasks: can be done before the fault occurs → run ahead of time in a background thread.
  2. Critical-path tasks: must happen during the fault → hardware-amenable → offload to MFOE.
  3. Post-fault tasks: can be done after the fault → run in the background thread.

Pre/post-fault functions combined into a background thread → their latency is removed from the critical path of the faulting program.
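
The three-phase split above can be sketched as a small user-space simulation. This is a toy model under stated assumptions, not the MFOE implementation: the class `FaultSim`, its queues, and the dictionary standing in for a page table are all hypothetical, and the critical path here is ordinary Python code rather than hardware.

```python
import queue
import threading

PAGE_SIZE = 4096  # assumed page size for the toy model


class FaultSim:
    """Toy model of three-phase minor fault handling (all names hypothetical)."""

    def __init__(self, pool_size=8):
        self.free_pages = queue.Queue(maxsize=pool_size)  # pre-allocated frames
        self.post_work = queue.Queue()  # deferred post-fault bookkeeping
        self.page_table = {}            # virtual page number -> frame
        self.accounted = []             # pages the post-fault phase has processed
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._background, daemon=True)
        self._worker.start()

    def _background(self):
        """Background thread: runs pre-fault and post-fault work."""
        while not self._stop.is_set():
            # Pre-fault phase: keep the pool of zeroed frames topped up.
            try:
                self.free_pages.put(bytearray(PAGE_SIZE), timeout=0.01)
            except queue.Full:
                pass
            # Post-fault phase: drain one deferred bookkeeping item, if any.
            try:
                vpn = self.post_work.get_nowait()
            except queue.Empty:
                continue
            self.accounted.append(vpn)
            self.post_work.task_done()

    def handle_fault(self, vpn):
        """Critical path: take a ready frame and install the mapping."""
        if vpn in self.page_table:
            return  # already mapped, no fault
        frame = self.free_pages.get()  # blocks only if the pool is empty
        self.page_table[vpn] = frame
        self.post_work.put(vpn)        # defer everything else

    def close(self):
        """Wait for deferred work to finish, then stop the background thread."""
        self.post_work.join()
        self._stop.set()
        self._worker.join()


if __name__ == "__main__":
    sim = FaultSim(pool_size=4)
    for vpn in (1, 2, 3, 2):
        sim.handle_fault(vpn)
    sim.close()
    print(sorted(sim.page_table), sorted(sim.accounted))
```

The point of the sketch is that `handle_fault` does only two things, grabbing a prepared frame and recording the mapping, while allocation, zeroing, and bookkeeping run off the faulting thread. How a real design reacts to an empty pool is not covered in these notes; blocking is simply the easiest choice for a simulation.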


Design

Software: parallel pre-fault page allocation

Hardware: MFOE (Minor Fault Offload Engine)


Evaluation


Meeting Notes

(to be filled)