Criticality-Aware Instruction-Centric Bandwidth Partitioning for Data Center Applications

Static partitioning can't react quickly to load changes → use fine-grained control signals and core allocation instead

Venue: HPCA 2025
Link: Paper PDF

Topic: Hardware resource partitioning for interference isolation is too slow to react to rapidly changing workloads. A better approach uses criticality-aware control signals and fast core allocation to mitigate interference at fine granularity.


Summary

Tail latency is critical in data centers, but co-located tasks compete for shared resources (cores, memory bandwidth, LLC, execution units), which causes interference. Hardware partitioning is the common solution, but it reacts too slowly: static assignment wastes utilization, and dynamic adjustment takes tens of seconds. The key insight: core allocation can achieve performance isolation faster and more efficiently than resource partitioning, given the right control signals.


Background

Why tail latency matters

Types of CPU interference

  1. Hyperthreading interference (within a physical core): serious when shared execution resources are contended.
  2. Memory bandwidth interference: impacts all cores sharing the same physical CPU.
  3. LLC interference: impacts all cores sharing the same physical CPU.

Limitation of partitioning-based approaches


Key Idea

Core allocation as the primary isolation mechanism

Challenges

  1. Sensitivity: many interference types → need control signals that accurately detect each type.
  2. Scalability: software overhead of gathering signals and adjusting allocations must be minimal.
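
To make the two challenges concrete, here is a minimal sketch of one control-loop step: one signal per interference type, and a cheap core-count adjustment. It is not taken from the paper. The signal names, the thresholds, the "best-effort task" framing, and the one-core-at-a-time policy are all assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Interference(Enum):
    HYPERTHREAD = auto()
    MEMORY_BANDWIDTH = auto()
    LLC = auto()


@dataclass
class Signals:
    """One sample of per-interference control signals (all hypothetical)."""
    sibling_busy_us: float     # time a request ran beside a busy sibling hyperthread
    mem_bw_utilization: float  # fraction of peak memory bandwidth in use, 0..1
    llc_miss_rate: float       # LLC miss fraction of the latency-sensitive task, 0..1


# Placeholder thresholds; real values would need tuning per machine.
SIBLING_BUSY_LIMIT_US = 25.0
MEM_BW_LIMIT = 0.8
LLC_MISS_LIMIT = 0.3


def detect(s: Signals) -> set[Interference]:
    """Map each signal to the interference type it is meant to detect."""
    found = set()
    if s.sibling_busy_us > SIBLING_BUSY_LIMIT_US:
        found.add(Interference.HYPERTHREAD)
    if s.mem_bw_utilization > MEM_BW_LIMIT:
        found.add(Interference.MEMORY_BANDWIDTH)
    if s.llc_miss_rate > LLC_MISS_LIMIT:
        found.add(Interference.LLC)
    return found


def adjust_best_effort_cores(current: int, max_cores: int,
                             found: set[Interference]) -> int:
    """Take one core from the best-effort task under interference, else return one."""
    if found:
        return max(0, current - 1)
    return min(max_cores, current + 1)


# Usage: a sample showing only memory bandwidth pressure.
sample = Signals(sibling_busy_us=5.0, mem_bw_utilization=0.92, llc_miss_rate=0.1)
cores = adjust_best_effort_cores(4, max_cores=8, found=detect(sample))
```

Each step does constant work per sample, which is the property the scalability challenge asks for; a real controller would also have to keep the cost of collecting the signals low.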

Design

Dedicated scheduler core

Controllers

KSCHED: Fast and Scalable Scheduling


Insights


Meeting Notes

(to be filled)