Parallelizing Packet Processing in Container Overlay Networks

Serialized softirqs on a single core bottleneck overlay networks → pipeline them across multiple cores with Falcon

Venue: EuroSys

Topic: Container overlay networks cause significant throughput and latency degradation compared to physical networks. The main bottleneck is serialization of softirqs on a single core. Falcon pipelines softirqs across multiple cores to remove this bottleneck.


Summary

Overlay networks are widely adopted in production container environments but cause significant performance degradation (throughput and latency) compared to physical networks. Root cause: a single flow traverses several network devices, each of which raises its own software interrupts (softirqs), and all of those softirqs are serialized on a single core. Falcon prevents this serialization via softirq pipelining, splitting, and dynamic balancing, which enables fine-grained, low-cost flow parallelization on multicore machines.
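
To make the serialization argument concrete, here is a back-of-the-envelope model, not taken from the paper. The stage names and per-packet costs are hypothetical, and the model ignores the cost of handing packets between cores. It compares the packet rate when all softirq stages of a flow share one core with the rate when each stage has its own core.

```python
# Hypothetical per-packet CPU cost (microseconds) of each softirq stage that
# one overlay flow passes through. Names and numbers are illustrative only.
STAGE_COST_US = {
    "physical NIC": 1.0,
    "VXLAN device": 1.5,
    "bridge + veth": 1.0,
}


def serialized_rate(costs):
    """Packets per second when every stage runs back to back on one core."""
    return 1e6 / sum(costs.values())


def pipelined_rate(costs):
    """Packets per second when each stage has its own core.

    The slowest stage becomes the limit. Cross-core handoff cost is ignored.
    """
    return 1e6 / max(costs.values())


if __name__ == "__main__":
    serial = serialized_rate(STAGE_COST_US)
    piped = pipelined_rate(STAGE_COST_US)
    print(f"serialized: {serial:,.0f} pkt/s")
    print(f"pipelined:  {piped:,.0f} pkt/s")
    print(f"speedup:    {piped / serial:.2f}x")
```

Under these assumed costs the speedup is the sum of the stage costs divided by the largest one (3.5 / 1.5), so the gain shrinks as one stage dominates. That is one way to read why pipelining alone is listed alongside splitting.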


Background

Overlay networks and performance

The bottleneck


Key Idea

Falcon: softirq pipelining

Distribute softirq processing for a single flow across multiple cores to prevent any one core from being overwhelmed.

Three key designs:

  1. Softirq pipelining: chain softirq processing stages across cores.
  2. Softirq splitting: divide softirq work within a stage across cores.
  3. Dynamic balancing: adapt the distribution based on load.
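
The first design can be sketched as a core-selection rule. This is a minimal illustration, not Falcon's kernel code: the device list, the helper names (`flow_hash`, `baseline_core`, `pipelined_core`, `placement`), and the "flow hash plus stage index" rule are assumptions chosen for clarity. It covers pipelining only; splitting and dynamic balancing are not modeled.

```python
import zlib

# Hypothetical order of devices that one overlay flow traverses on receive.
DEVICES = ["physical NIC", "VXLAN device", "veth"]


def flow_hash(flow):
    """Deterministic hash of a flow tuple (a stand-in for the packet's flow hash)."""
    return zlib.crc32(repr(flow).encode())


def baseline_core(flow, stage, num_cores):
    """Steering by flow only: every stage of the flow lands on the same core."""
    return flow_hash(flow) % num_cores


def pipelined_core(flow, stage, num_cores):
    """Steering by flow and stage: consecutive stages land on different cores.

    A given (flow, stage) pair always maps to the same core, so packets of a
    flow are still handled in order within each stage.
    """
    return (flow_hash(flow) + stage) % num_cores


def placement(pick_core, flow, num_cores=8):
    """Return the core chosen for each device stage of the flow."""
    return [pick_core(flow, stage, num_cores) for stage in range(len(DEVICES))]


if __name__ == "__main__":
    flow = ("10.0.0.1", 5000, "10.0.0.2", 80, "udp")
    for name, rule in (("baseline", baseline_core), ("pipelined", pipelined_core)):
        cores = placement(rule, flow)
        print(name, dict(zip(DEVICES, cores)))
```

With the baseline rule the three stages share one core; with the pipelined rule they occupy three distinct cores whenever there are at least as many cores as stages.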

Design


Evaluation


Meeting Notes

(to be filled)