June 3, 2026 SAND: A New Programming Abstraction for Video-based Deep Learning Solve the heavy preprocessing problem with abstraction and planning, plus priority scheduling for reuse #paper-review #SOSP #2025
June 3, 2026 Tiga: Accelerating Geo-Distributed Transactions with Synchronized Clocks #paper-review #SOSP #2025
June 3, 2026 Sleeping with One Eye Open: Fast, Sustainable Storage with Sandman #paper-review #SOSP #2025
May 27, 2026 cache_ext: Customizing the Page Cache with eBPF OS page cache eviction policy: only one -> supports multiple policies, FLEXIBILITY #paper-review #SOSP #2025 #flexlibility #eBPF
May 22, 2026 A large scale analysis of hundreds of in-memory cache clusters at Twitter 5 important facts about in-memory caching #paper-review #OSDI #2020 #survey
May 20, 2026 Jenga: Effective Memory Management for Serving LLM with Heterogeneity a memory allocation framework for managing heterogeneous embeddings by leveraging layer properties #paper-review #SOSP #2025 #add-granularity
May 11, 2026 PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications We only need to store 1 layer. Let's not do unnecessary things and UTILIZE this condition. #paper-review #SOSP #2025 #remove-unnecessary
May 6, 2026 IC-Cache: Efficient Large Language Model Serving via In-context Caching In-context caching system: leverage caching to let a small LLM serve like a giant one. #paper-review #SOSP #2025 #caching #off-loading
April 29, 2026 COpter: Efficient Large-Scale Resource-Allocation via Continual Optimization remove overhead of round-based resource allocation -> sequence of interconnected problems #paper-review #SOSP #2025
April 22, 2026 Mitigating Application Resource Overload with Targeted Task Cancellation Eliminate head-of-line-blocking requests to solve the resource overload problem. #paper-review #SOSP #2025 #Head-of-Line-Blocking
February 6, 2026 Kinetic Modeling of Data Eviction in Cache AET — composable, linear-time Miss Ratio Curve profiling via average eviction time sampling #paper-review #ATC #2016
February 5, 2026 Sailor: Automating Distributed Training over Dynamic, Heterogeneous, and Geo-distributed Clusters Co-optimize resource allocation and parallelization plan for heterogeneous GPU training with accurate simulation #paper-review #SOSP #2025
January 28, 2026 Robust LLM Training Infrastructure at ByteDance Automated failure diagnosis and recovery for large-scale LLM training — minimize unproductive time across 16K+ GPUs #paper-review #SOSP #2025
January 15, 2026 Spirit: Fair Allocation of Interdependent Resources in Remote Memory Systems Fair allocation when resources are interdependent — remote memory bandwidth, capacity, and compute interact #paper-review #SOSP #2025
January 7, 2026 Oasis: Pooling PCIe Devices Over CXL to Boost Utilization Use CXL shared memory as a message channel to pool PCIe devices (NICs, SSDs) across hosts in a CXL pod #paper-review #SOSP #2025
October 23, 2025 Disentangling the Dual Role of NIC Receive Rings Split Rx ring into allocation (Ax) and reception (Bx) rings → reduce I/O working set, improve throughput up to 37% #paper-review #OSDI #2025
October 10, 2025 Criticality-Aware Instruction-Centric Bandwidth Partitioning for Data Center Applications Static partitioning can't react quickly to load changes → use fine-grained control signals and core allocation instead #paper-review #HPCA #2025
October 6, 2025 Extending Applications Safely and Efficiently EIM model abstracts extension resources for fine-grained safety/interconnectedness tradeoffs; bpftime enforces it efficiently #paper-review #OSDI #2025
October 5, 2025 Tile Size Selection Using Cache Organization and Data Layout Select tile sizes that fit data working sets in cache, accounting for layout and associativity #paper-review #PLDI
October 5, 2025 SOCK: Rapid Task Provisioning with Serverless-Optimized Containers Lean containers + generalized Zygote provisioning + three-tier package-aware caching → 45× cold-start speedup #paper-review #ATC #2018
October 5, 2025 Shared Address Translation Revisited Share page tables across processes for shared libraries → reduce TLB overhead and page faults on Android #paper-review #EUROSYS #2016
October 5, 2025 SEUSS: Skip Redundant Paths to Make Serverless Fast Deploy serverless functions from unikernel snapshots to skip boot, runtime init, and import overhead — 51× throughput improvement #paper-review #EUROSYS #2020
October 5, 2025 Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider Characterize FaaS workloads → propose per-function keep-alive and pre-warm policies to cut cold starts #paper-review #ATC
October 5, 2025 Replayable Execution Optimized for Page Sharing for a Managed Runtime Environment Checkpoint with GC-compacted heap → share pages across containers → speedy restoration with fewer page faults #paper-review #EUROSYS
October 5, 2025 Benchmarking, Analysis, and Optimization of Serverless Function Snapshots Functions access stable working sets across invocations → prefetch from disk to cut cold-start latency by 3.7× #paper-review #ASPLOS #2022
October 5, 2025 Prebaking Functions to Warm the Serverless Cold Start The timing of the snapshot determines cold-start latency — prebake at the right execution point #paper-review #Middleware
October 5, 2025 Reducing Minor Page Fault Overheads through Enhanced Page Walker Hardware-software co-design offloads minor page fault critical path → 33× latency improvement, 6.6% runtime improvement #paper-review #Journal
October 5, 2025 MEGA: Overcoming Traditional Problems with OS Huge Page Management Analyze and address the fundamental problems with Linux huge page management — fragmentation, bloat, fault latency, non-swappability #paper-review #SYSTOR #2019
October 5, 2025 Coordinated and Efficient Huge Page Management with Ingens Treat memory contiguity as a first-class resource; track utilization and access frequency for principled huge page management #paper-review #OSDI #2016
October 5, 2025 Memory Efficient Fork-based Checkpointing Mechanism for In-Memory Database Systems Fork-based checkpointing with efficient copy-on-write management to minimize memory overhead for in-memory databases #paper-review #SAC
October 5, 2025 FlashCube: Fast Provisioning of Serverless Functions with Streamlined Container Runtimes Streamline container runtimes to eliminate unnecessary initialization overhead for fast serverless provisioning #paper-review #PLOS
October 5, 2025 Parallelizing Packet Processing in Container Overlay Networks Serialized softirqs on a single core bottleneck overlay networks → pipeline them across multiple cores with Falcon #paper-review #EUROSYS
October 5, 2025 FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute Decentralize container provisioning across host VMs in function-tree structures to eliminate cold-start bottleneck #paper-review #ATC #2021
October 5, 2025 Architectural Implications of Function-as-a-Service Computing FaaS containerization brings up to 20× slowdown vs. native; cold start can exceed 10× a function's execution time #paper-review #MICRO #2019
October 5, 2025 Cold Start Influencing Factors in Function as a Service Programming language, package size, and memory/CPU settings significantly affect FaaS cold-start latency #paper-review #UCC
October 5, 2025 Caladan: Mitigating Interference at Microsecond Timescales Dedicated scheduler core + fast core allocation reacts to interference in microseconds — no slow hardware partitioning #paper-review #OSDI #2020