Low-latency programming is increasingly important across a variety of use cases. Still, many low-latency tips and tricks survive only as developer folklore. This document attempts to codify that knowledge so that people can (re)discover the art of low-latency programming.
- Colocate compute and data.
- Avoid dynamic memory management.
- Avoid context switching.
- Use wait-free data synchronization.
- Partition data to avoid sharing (and therefore synchronization).
- Make shared data structures read-only (when possible).
- Use busy-polling instead of wakeups.
- Use non-blocking I/O.
- Use kernel-bypass networking such as DPDK or XDP.
- Use hardware offload with accelerators and FPGAs.
- Avoid coordinated omission when measuring latency.
- Configure your system (for example, use a preemptible kernel and watch out for bad device drivers).
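
Of the tips above, coordinated omission is probably the easiest to get wrong without noticing, so here is a minimal sketch of the effect. It assumes a closed-loop load generator that intends to send one request per fixed interval but cannot send the next request until the previous one completes. The function name `simulate_closed_loop` and the service times are hypothetical and chosen only for illustration; this is a deterministic simulation, not a benchmarking tool.

```python
def simulate_closed_loop(service_times, interval):
    """Simulate a client that intends to send one request every `interval` time units.

    Returns two lists of latencies:
      naive:     completion time minus the ACTUAL send time
                 (what a simple timer around each request reports)
      corrected: completion time minus the INTENDED send time
                 (includes the time a request waited behind a stalled one)
    """
    naive, corrected = [], []
    now = 0  # time at which the client becomes free to send again
    for i, service in enumerate(service_times):
        intended = i * interval        # when the request should have been sent
        actual = max(now, intended)    # a busy client delays the send
        done = actual + service
        naive.append(done - actual)
        corrected.append(done - intended)
        now = done
    return naive, corrected


# Hypothetical workload: the third request hits a 100-unit stall.
naive, corrected = simulate_closed_loop([1, 1, 100, 1, 1], interval=10)
# naive     == [1, 1, 100, 1, 1]
# corrected == [1, 1, 100, 91, 82]
```

The naive numbers show one slow request out of five, while the corrected numbers show that the two requests scheduled during the stall were also slow from the caller's point of view. Measuring from the intended send time, or driving load from an open-loop schedule, is one way to keep the stall from hiding in the results.
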
- 11 Best Practices for Low Latency Systems by Ben Darfler (2014).
- Optimizing web servers for high throughput and low latency by Alexey Ivanov (2017).
- The Tail at Scale by Jeffrey Dean and Luiz André Barroso (2013).
- Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency by Jialin Li et al. (2014).
- Amdahl’s Law for Tail Latency by Christina Delimitrou and Christos Kozyrakis (2018).