2024-07-30

Page Cache Benchmarking

  • The original hypothesis: Page Cache is too slow to be used with modern NVMe devices.
$ fio --direct=0 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2
2952MiB/s

$ fio --direct=1 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2
6587MiB/s
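
  • Note: the commands above are abbreviated; fio also requires a job name. A fuller, reproducible form might look like the following (the --name and --ioengine values are assumptions, not taken from the original runs):

$ fio --name=buffered-write --ioengine=psync --direct=0 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2
$ fio --name=direct-write --ioengine=psync --direct=1 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2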

Interesting params

  • Parameters to control writeback when benchmarking the page cache:
# default 10; background writeback starts when this share of memory is dirty
vm.dirty_background_ratio=80

# default 20; when reached, writing processes block and flush dirty pages themselves
vm.dirty_ratio=90
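
  • A minimal way to apply these settings for a benchmarking session and to watch the dirty-page counters (reverting to the defaults afterwards is left out):

$ sudo sysctl -w vm.dirty_background_ratio=80
$ sudo sysctl -w vm.dirty_ratio=90
$ grep -E '^(Dirty|Writeback):' /proc/meminfo    # dirty pages not yet written back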

Fio with block device

  • O_DIRECT behavior is different for a block device and a regular file.
  • With block device:
$ head -5 /proc/meminfo
MemTotal:       64951852 kB
MemFree:        29651472 kB
MemAvailable:   63573604 kB
Buffers:        33556620 kB     # !!!
Cached:           132944 kB

Fio with regular file

  • With regular file:
$ head -5 /proc/meminfo
MemTotal:       64951852 kB
MemFree:        30639684 kB
MemAvailable:   63801840 kB
Buffers:            2196 kB
Cached:         33703140 kB     # !!!

Virtual Memory

  • Cached works as expected: it is the page cache for regular-file data.
  • Buffers is the block-device I/O buffer cache and does not outlive the issuing process (a quick check is sketched below).
    • It is used while updating on-disk metadata (inode tables, allocation bitmaps, …).
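
  • A quick check of which counter a buffered write lands in (the /mnt/test file path is hypothetical; writing to the raw partition is destructive, as in the runs above):

$ grep -E '^(Buffers|Cached|Dirty):' /proc/meminfo                                        # before
$ fio --name=bufcheck --direct=0 --rw=write --size=1G --bs=1M --filename=/dev/nvme1n1p2   # raw block device
$ grep -E '^(Buffers|Cached|Dirty):' /proc/meminfo                                        # Buffers grows
$ fio --name=bufcheck --direct=0 --rw=write --size=1G --bs=1M --filename=/mnt/test/file   # regular file
$ grep -E '^(Buffers|Cached|Dirty):' /proc/meminfo                                        # Cached grows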

Flamegraph: Block device

$ fio --direct=0 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2
2952MiB/s
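
  • How the flamegraphs in these notes were generated is not shown; one common way to capture such a profile, assuming Brendan Gregg's FlameGraph scripts are checked out in ./FlameGraph:

$ sudo perf record -F 99 -a -g -- fio --name=bdev --direct=0 --rw=write --size=32G --bs=1M --filename=/dev/nvme1n1p2
$ sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > pagecache.svg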

Block Device

  • Similar to the first run on a regular file (following section).

Flamegraph: Regular File (None Cached)

$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ fio --direct=0 --rw=write --size=32G --bs=1M --iodepth=1 --numjobs=1 # First run (Nothing in page cache)
3795MiB/s

Regular File Uncached

Regular File (All Cached)

$ fio --direct=0 --rw=write --size=32G --bs=1M --iodepth=1 --numjobs=1 # Second write (All cached)
10.9GiB/s
$ fio --direct=0 --rw=read --size=32G --bs=1M --iodepth=1  --numjobs=1 # Read (All cached)
18.2GiB/s
  • The raw memory bandwidth limit is around 96GB/s (2x DDR5-6000 = 2x48GB/s).
  • With C++ code I can read:
    • 55GB/s 1-thread
    • 70GB/s 2-threads
    • No CCD (core chiplet die) pinning

Flamegraph: Regular File (All Cached)

  • Flamegraph link
  • Dominated by copy_page_from_iter_atomic, but memory throughput is not saturated.

Regular File Cached

2024-08-16

More benchmarks

  1. Populate the page cache, then write a different file, to check whether allocation overhead can be eliminated (see the sketch after this list).

    • Same performance; allocation is not the bottleneck.
    • The flamegraph looks the same.
  2. A simple memory-read benchmark scales with the number of threads (1t = 55GB/s, 2t = 70GB/s); does the page cache scale as well?

    • No. With a varying number of fio jobs, page cache performance stays the same.
    • This suggests a big lock on the page cache limiting parallelism.
  • Surprising detail: the Btrfs code path is faster (by 10+%) than ext4 on pure page cache operations.
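
  • Sketches of the two numbered experiments above (file paths and job names are assumptions, not from the original setup):

# 1. Pre-populate the page cache with one file, then write a different one:
$ fio --name=warmup --direct=0 --rw=write --size=32G --bs=1M --filename=/mnt/test/warmup
$ fio --name=target --direct=0 --rw=write --size=32G --bs=1M --filename=/mnt/test/target

# 2. Check whether buffered writes scale with the number of jobs (one file per job):
$ fio --name=scaling --direct=0 --rw=write --size=8G --bs=1M --numjobs=4 --directory=/mnt/test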

Experimenting with maximal memory throughput

  • The goal: get a sense of what memory throughput is practically achievable.
  • Writing memory with AVX-512
  • 1 thread, 500GB in 17.2s = 29GB/s
  • 2 threads, 1000GB in 18.059s = 55GB/s
  • Interesting special case: write only zeroes, 1 thread, 500GB in 6.95s = 72GB/s

Revisiting flamegraphs

  • Writing to already-cached pages in the page cache: 11GB/s.
  • copy_page_from_iter_atomic accounts for ~50% of the time (see the perf sketch below).
  • pagecache_get_page accounts for most of the rest.
  • The maximal single-threaded write throughput is 29GB/s.
    • 11GB/s is reasonable for a ~50% share.
  • The page cache has a spinlock, which prevents multiple threads from writing in parallel.
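
  • One way to reproduce the per-symbol breakdown with perf (the fio job and file path are assumptions about the setup, not from the slides):

$ sudo perf record -g -- fio --name=cachedwrite --direct=0 --rw=write --size=32G --bs=1M --filename=/mnt/test/file
$ sudo perf report --stdio --sort symbol | head -n 20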

The flush part of the page cache

  • The page cache is populated with 50GB of data; nothing is persisted yet.
  • Issue sync and measure the flush throughput (sketched below).
  • 50GB / 19.029s = 2.63GB/s
  • For comparison, direct I/O to the disk runs at ~6GB/s.
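
  • A simple way to measure the flush path, assuming a scratch file under /mnt/test (the path is hypothetical): populate the page cache with a buffered write, then time sync.

$ fio --name=flush --direct=0 --rw=write --size=50G --bs=1M --filename=/mnt/test/flush
$ time sync    # the notes above report ~19s for 50GB of dirty data, i.e. ~2.6GB/s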

Sync
