Comments (7)

cyyself commented on June 12, 2024

Update: after flashing my RPi 4 directly with OpenWRT 23.05.2, which runs Linux v5.15.137 as built by OpenWRT, I got 1.01 Gbits/sec!

| Raspberry Pi 4 / BCM2711*      | OpenWRT 23.05.2 / 5.15.137 | 1.01 Gbits/sec |

from wg-bench.

cyyself commented on June 12, 2024

One interesting finding: with CONFIG_PREEMPT_NONE instead of CONFIG_PREEMPT in the kernel config, we can reach ~700 Mbps on the 6.1.y kernel. CONFIG_PREEMPT_NONE is set by default in the OpenWRT kernel.

Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 47296 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  78.2 MBytes   656 Mbits/sec    0    402 KBytes       
[  5]   1.00-2.00   sec  80.2 MBytes   672 Mbits/sec    0    441 KBytes       
[  5]   2.00-3.00   sec  79.6 MBytes   668 Mbits/sec    0    441 KBytes       
[  5]   3.00-4.00   sec  80.3 MBytes   674 Mbits/sec    0    441 KBytes       
[  5]   4.00-5.00   sec  80.8 MBytes   678 Mbits/sec    0    441 KBytes       
[  5]   5.00-6.00   sec  81.0 MBytes   679 Mbits/sec    0    441 KBytes       
[  5]   6.00-7.00   sec  79.5 MBytes   667 Mbits/sec    0    441 KBytes       
[  5]   7.00-8.00   sec  80.1 MBytes   672 Mbits/sec    0    441 KBytes       
[  5]   8.00-9.00   sec  80.1 MBytes   672 Mbits/sec    0    441 KBytes       
[  5]   9.00-10.00  sec  79.7 MBytes   668 Mbits/sec    0    441 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   799 MBytes   671 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   798 MBytes   669 Mbits/sec                  receiver

iperf Done.
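
To compare runs like the one above without reading the numbers by eye, the two summary lines can be extracted programmatically. This is a minimal sketch, assuming the human-readable iperf3 output format shown in this thread; `summary_mbps` is a hypothetical helper name and not part of iperf3 or wg-bench.

```python
import re

# Matches iperf3 summary lines such as:
# [  5]   0.00-10.00  sec   799 MBytes   671 Mbits/sec    0             sender
# Assumption: the bitrate is printed with a K/M/G prefix, as in this thread.
SUMMARY_RE = re.compile(
    r"\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"(?P<rate>[\d.]+)\s+(?P<unit>[KMG])bits/sec.*?(?P<role>sender|receiver)\s*$"
)

# Conversion factors from the printed unit to Mbits/sec.
SCALE = {"K": 1e-3, "M": 1.0, "G": 1e3}


def summary_mbps(iperf_output: str) -> dict:
    """Return {'sender': Mbps, 'receiver': Mbps} from iperf3 text output."""
    result = {}
    for line in iperf_output.splitlines():
        m = SUMMARY_RE.search(line)
        if m:
            result[m["role"]] = float(m["rate"]) * SCALE[m["unit"]]
    return result
```

Per-interval lines are skipped because they do not end in `sender` or `receiver`. The JSON output of iperf3 (`-J`) would be a more robust input if it is available; the text parser is used here only because the thread shows text output.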

cyyself commented on June 12, 2024

Another interesting finding: turning off CONFIG_FTRACE, together with using CONFIG_PREEMPT_NONE, lets us reach ~1.1 Gbps with bcm2711_defconfig on rpi-6.1.y.

Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 37182 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   135 MBytes  1.13 Gbits/sec    0    818 KBytes       
[  5]   1.00-2.00   sec   130 MBytes  1.09 Gbits/sec    0    860 KBytes       
[  5]   2.00-3.00   sec   126 MBytes  1.05 Gbits/sec    0    975 KBytes       
[  5]   3.00-4.00   sec   130 MBytes  1.09 Gbits/sec    0   1022 KBytes       
[  5]   4.00-5.00   sec   130 MBytes  1.09 Gbits/sec    0   1.07 MBytes       
[  5]   5.00-6.00   sec   132 MBytes  1.11 Gbits/sec    0   1.07 MBytes       
[  5]   6.00-7.00   sec   132 MBytes  1.11 Gbits/sec    0   1.14 MBytes       
[  5]   7.00-8.00   sec   132 MBytes  1.11 Gbits/sec    0   1.26 MBytes       
[  5]   8.00-9.00   sec   129 MBytes  1.08 Gbits/sec    0   1.26 MBytes       
[  5]   9.00-10.01  sec   130 MBytes  1.08 Gbits/sec    0   1.48 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec  1.28 GBytes  1.09 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  1.27 GBytes  1.09 Gbits/sec                  receiver

iperf Done.

However, turning off CONFIG_FTRACE also turns off a series of options that depend on it. Further investigation is therefore needed to see which option actually hinders performance. The resulting config diff:

104d103
< # CONFIG_BPF_LSM is not set
139d137
< CONFIG_TASKS_RUDE_RCU=y
260d257
< CONFIG_TRACEPOINTS=y
1603d1599
< # CONFIG_BATMAN_ADV_TRACING is not set
1637d1632
< # CONFIG_NET_DROP_MONITOR is not set
2965d2959
< # CONFIG_ATH6KL_TRACING is not set
8041d8034
< # CONFIG_PSTORE_FTRACE is not set
8492d8484
< # CONFIG_TRACE_MMIO_ACCESS is not set
8726d8717
< # CONFIG_DEBUG_PAGE_REF is not set
8803,8804d8793
< CONFIG_TRACE_IRQFLAGS=y
< CONFIG_TRACE_IRQFLAGS_NMI=y
8837d8825
< CONFIG_NOP_TRACER=y
8845d8832
< CONFIG_TRACER_MAX_TRACE=y
8847,8853d8833
< CONFIG_RING_BUFFER=y
< CONFIG_EVENT_TRACING=y
< CONFIG_CONTEXT_SWITCH_TRACER=y
< CONFIG_RING_BUFFER_ALLOW_SWAP=y
< CONFIG_PREEMPTIRQ_TRACEPOINTS=y
< CONFIG_TRACING=y
< CONFIG_GENERIC_TRACER=y
8855,8895c8835
< CONFIG_FTRACE=y
< # CONFIG_BOOTTIME_TRACING is not set
< CONFIG_FUNCTION_TRACER=y
< CONFIG_FUNCTION_GRAPH_TRACER=y
< CONFIG_DYNAMIC_FTRACE=y
< CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
< CONFIG_FUNCTION_PROFILER=y
< CONFIG_STACK_TRACER=y
< CONFIG_IRQSOFF_TRACER=y
< CONFIG_SCHED_TRACER=y
< # CONFIG_HWLAT_TRACER is not set
< # CONFIG_OSNOISE_TRACER is not set
< # CONFIG_TIMERLAT_TRACER is not set
< # CONFIG_FTRACE_SYSCALLS is not set
< CONFIG_TRACER_SNAPSHOT=y
< CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
< CONFIG_BRANCH_PROFILE_NONE=y
< # CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
< # CONFIG_PROFILE_ALL_BRANCHES is not set
< CONFIG_BLK_DEV_IO_TRACE=y
< CONFIG_KPROBE_EVENTS=y
< # CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
< # CONFIG_UPROBE_EVENTS is not set
< CONFIG_BPF_EVENTS=y
< CONFIG_DYNAMIC_EVENTS=y
< CONFIG_PROBE_EVENTS=y
< CONFIG_FTRACE_MCOUNT_RECORD=y
< CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y
< # CONFIG_SYNTH_EVENTS is not set
< # CONFIG_HIST_TRIGGERS is not set
< # CONFIG_TRACE_EVENT_INJECT is not set
< # CONFIG_TRACEPOINT_BENCHMARK is not set
< # CONFIG_RING_BUFFER_BENCHMARK is not set
< # CONFIG_TRACE_EVAL_MAP_FILE is not set
< # CONFIG_FTRACE_RECORD_RECURSION is not set
< # CONFIG_FTRACE_STARTUP_TEST is not set
< # CONFIG_RING_BUFFER_STARTUP_TEST is not set
< # CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
< # CONFIG_PREEMPTIRQ_DELAY_TEST is not set
< # CONFIG_KPROBE_EVENT_GEN_TEST is not set
< # CONFIG_RV is not set
---
> # CONFIG_FTRACE is not set
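
To narrow down which option matters, it can help to list only the options whose effective value changed between two configs, ignoring lines that merely moved from "is not set" to absent. The following is a rough sketch under that assumption; `parse_config` and `changed_options` are hypothetical helper names, and the input is assumed to be the text of two kernel `.config` files.

```python
def parse_config(text: str) -> dict:
    """Map option name -> value ('y', 'm', a string, ...) or None if disabled."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO is disabled
            options[line[2:-len(suffix)]] = None
    return options


def changed_options(before: str, after: str) -> dict:
    """Return {name: (old, new)} for options whose effective value differs.

    Assumption: an option that is absent is treated the same as one that
    "is not set", so pure bookkeeping changes are filtered out.
    """
    old, new = parse_config(before), parse_config(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

Applied to the two configs behind the diff above, this would drop entries such as CONFIG_BPF_LSM (disabled before, absent after) and keep those such as CONFIG_TRACEPOINTS and CONFIG_IRQSOFF_TRACER, which are the candidates to re-enable one at a time.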

cyyself commented on June 12, 2024

Yet another interesting finding: turning off CONFIG_IRQSOFF_TRACER, along with using CONFIG_PREEMPT_NONE, also reaches ~1.1 Gbps.

Turning on CONFIG_IRQSOFF_TRACER also changes the following options:

8803a8804,8805
> CONFIG_TRACE_IRQFLAGS=y
> CONFIG_TRACE_IRQFLAGS_NMI=y
8849a8852
> CONFIG_PREEMPTIRQ_TRACEPOINTS=y
8861c8864
< # CONFIG_IRQSOFF_TRACER is not set
---
> CONFIG_IRQSOFF_TRACER=y

fakemanhk commented on June 12, 2024

On my RPi 4B, running OpenWrt 23.05.2 (64-bit), the tested result was 881 Mbps.

fakemanhk commented on June 12, 2024

BTW, I believe 32-bit vs. 64-bit should show some difference; should we indicate this?

cyyself commented on June 12, 2024

> BTW, I believe 32-bit vs. 64-bit should show some difference; should we indicate this?

On an out-of-order CPU, it is normal for 32-bit and 64-bit to show the same performance; sometimes 64-bit may even be slower, because its wider pointers consume more cache capacity. The intuition that 64-bit is faster comes from the doubled register width: something like a 64-bit arithmetic operation takes only one instruction. But keep in mind that 64-bit operations also have longer latency in the CPU's physical circuit, which may require a lower clock frequency or more cycles per operation. The same applies to SIMD.

The crypto algorithms in WireGuard are ChaCha20 and Poly1305, which also use SIMD (i.e. Arm NEON) for their computation. If the microarchitecture does not provide wide enough SIMD processing in a single cycle, we will get the same performance on both 32-bit and 64-bit.
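
As a small illustration of why register width matters little here: the ChaCha20 core works on 32-bit words using only addition, XOR, and rotation. Below is a minimal sketch of the quarter round as specified in RFC 8439, written in plain Python for readability. It is not how the kernel implements it (the kernel uses NEON-vectorized code on Arm), and the helper names are illustrative.

```python
MASK32 = 0xFFFFFFFF  # ChaCha20 state words are 32 bits wide


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(a: int, b: int, c: int, d: int) -> tuple:
    """One ChaCha20 quarter round: only 32-bit add, xor, and rotate."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d
```

Because every operation fits in a 32-bit lane, a SIMD unit can run several quarter rounds in parallel regardless of whether the general-purpose registers are 32 or 64 bits wide, which is consistent with the argument above.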
