
Comments (25)

ggmartins avatar ggmartins commented on July 17, 2024 3

wow, code from branch sandbox-roberto-remove-memory-alloc really unlocked the beast

also, the encrypted measurements against your staging look quite promising, thanks a lot for this!
when do you think we'll have both server and client fixes merged into the main branches?

if no one has anything further, we can close this one. thanks again!

./ndt7-client-go/cmd/ndt7-client/ndt7-rob -scheme ws

    upload: complete
             Server: ndt-mlab1-ord06.mlab-oti.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.3 ms
           Download:   912.9 Mbit/s
             Upload:    43.1 Mbit/s
     Retransmission:    0.00 %

    upload: complete
             Server: ndt-mlab3-ord05.mlab-oti.measurement-lab.org
             Client: [REDACTED]
            Latency:     3.7 ms
           Download:   872.6 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %

    upload: complete
             Server: ndt-mlab3-ord06.mlab-oti.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.4 ms
           Download:   920.9 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %

./ndt7-client-go/cmd/ndt7-client/ndt7-rob -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest -scheme=ws

    upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.5 ms
           Download:   919.6 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %
     
     upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.1 ms
           Download:   898.8 Mbit/s
             Upload:    43.1 Mbit/s
     Retransmission:    0.00 %
     
    upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.0 ms
           Download:   891.3 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %

./ndt7-client-go/cmd/ndt7-client/ndt7-rob -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest

    upload: complete
             Server: ndt-mlab4-ord03.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     3.1 ms
           Download:   841.9 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %
     
     upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.0 ms
           Download:   850.7 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.02 %
     
     upload: complete
             Server: ndt-mlab4-ord04.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     3.9 ms
           Download:   855.8 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %

./ndt7-client-go/cmd/ndt7-client/ndt7-rob

    upload: complete
             Server: ndt-mlab2-ord05.mlab-oti.measurement-lab.org
             Client: [REDACTED]
            Latency:     2.9 ms
           Download:   193.9 Mbit/s
             Upload:    43.0 Mbit/s
     Retransmission:    0.00 %

from ndt7-client-go.

laiyi-ohlsen avatar laiyi-ohlsen commented on July 17, 2024 3

Another thing to note is that the changes represented here only affect the subset of clients that are low-resourced embedded devices with gigabit connections. The ndt7-client-go change that @ggmartins mentioned above seems to have made the most significant difference to the results reported above, and we are currently setting up a framework to test the impact of the server-side change on other clients, devices, and operating systems. We'll be publishing a blog post that goes into more detail about all the recent changes to NDT.


ggmartins avatar ggmartins commented on July 17, 2024 2

more info on this:
running ndt7 from multiple "vantage" points inside the same network:

Jetson NX: ~450Mbps (ndt7 dw) Linux appflow 4.9.140-tegra 1 SMP PREEMPT Fri Oct 16 12:25:00 PDT 2020 aarch64 aarch64 aarch64 GNU/Linux (Ubuntu 20)
Jetson Nano: ~450Mbps (ndt7 dw) Linux netrics 4.9.201-tegra 1 SMP PREEMPT Fri Feb 19 08:40:32 PST 2021 aarch64 aarch64 aarch64 GNU/Linux (Ubuntu 20)
Raspberry Pi: ~150Mbps (ndt7 dw) Linux netrics 5.4.0-1035-raspi 38-Ubuntu SMP PREEMPT Tue Apr 20 21:37:03 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux (Ubuntu 20)
Full output here: https://chicago-cdac.github.io/nm-exp-active-netrics/debug/output.ndt7.txt

good to mention that the degraded ndt7 speed (~150Mbps) is also happening for my teammate, who has a Comcast connection > 300Mbps dw. He runs the same Raspberry Pi model / OS.

Thanks,

G


ggmartins avatar ggmartins commented on July 17, 2024 2

well, certainly less than 100% for individual cores/threads,

image

full ps
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.1 169808 12044 ?        Ss   May18   0:37 /sbin/init fixrtc splash
root           2  0.0  0.0      0     0 ?        S    May18   0:00 [kthreadd]
root           3  0.0  0.0      0     0 ?        I<   May18   0:00 [rcu_gp]
root           4  0.0  0.0      0     0 ?        I<   May18   0:00 [rcu_par_gp]
root           8  0.0  0.0      0     0 ?        I<   May18   0:00 [mm_percpu_wq]
root           9  0.0  0.0      0     0 ?        S    May18   0:19 [ksoftirqd/0]
root          10  0.0  0.0      0     0 ?        I    May18   0:17 [rcu_preempt]
root          11  0.0  0.0      0     0 ?        S    May18   0:01 [migration/0]
root          12  0.0  0.0      0     0 ?        S    May18   0:00 [idle_inject/0]
root          14  0.0  0.0      0     0 ?        S    May18   0:00 [cpuhp/0]
root          15  0.0  0.0      0     0 ?        S    May18   0:00 [cpuhp/1]
root          16  0.0  0.0      0     0 ?        S    May18   0:00 [idle_inject/1]
root          17  0.0  0.0      0     0 ?        S    May18   0:01 [migration/1]
root          18  0.0  0.0      0     0 ?        S    May18   0:02 [ksoftirqd/1]
root          20  0.0  0.0      0     0 ?        I<   May18   0:03 [kworker/1:0H-kblockd]
root          21  0.0  0.0      0     0 ?        S    May18   0:00 [cpuhp/2]
root          22  0.0  0.0      0     0 ?        S    May18   0:00 [idle_inject/2]
root          23  0.0  0.0      0     0 ?        S    May18   0:01 [migration/2]
root          24  0.0  0.0      0     0 ?        S    May18   0:01 [ksoftirqd/2]
root          27  0.0  0.0      0     0 ?        S    May18   0:00 [cpuhp/3]
root          28  0.0  0.0      0     0 ?        S    May18   0:00 [idle_inject/3]
root          29  0.0  0.0      0     0 ?        S    May18   0:01 [migration/3]
root          30  0.0  0.0      0     0 ?        S    May18   0:01 [ksoftirqd/3]
root          33  0.0  0.0      0     0 ?        S    May18   0:00 [kdevtmpfs]
root          34  0.0  0.0      0     0 ?        I<   May18   0:00 [netns]
root          35  0.0  0.0      0     0 ?        S    May18   0:00 [rcu_tasks_kthre]
root          36  0.0  0.0      0     0 ?        S    May18   0:00 [kauditd]
root          37  0.0  0.0      0     0 ?        S    May18   0:00 [khungtaskd]
root          38  0.0  0.0      0     0 ?        S    May18   0:00 [oom_reaper]
root          39  0.0  0.0      0     0 ?        I<   May18   0:00 [writeback]
root          40  0.0  0.0      0     0 ?        S    May18   0:00 [kcompactd0]
root          41  0.0  0.0      0     0 ?        SN   May18   0:00 [ksmd]
root          90  0.0  0.0      0     0 ?        I<   May18   0:00 [kintegrityd]
root          91  0.0  0.0      0     0 ?        I<   May18   0:00 [kblockd]
root          92  0.0  0.0      0     0 ?        I<   May18   0:00 [blkcg_punt_bio]
root          93  0.0  0.0      0     0 ?        I<   May18   0:00 [tpm_dev_wq]
root          94  0.0  0.0      0     0 ?        I<   May18   0:00 [ata_sff]
root          95  0.0  0.0      0     0 ?        I<   May18   0:00 [md]
root          96  0.0  0.0      0     0 ?        I<   May18   0:00 [edac-poller]
root          97  0.0  0.0      0     0 ?        I<   May18   0:00 [devfreq_wq]
root          98  0.0  0.0      0     0 ?        S    May18   0:00 [watchdogd]
root         101  0.0  0.0      0     0 ?        S    May18   0:00 [kswapd0]
root         102  0.0  0.0      0     0 ?        S    May18   0:00 [ecryptfs-kthrea]
root         104  0.0  0.0      0     0 ?        I<   May18   0:00 [kthrotld]
root         105  0.0  0.0      0     0 ?        S    May18   0:00 [irq/41-aerdrv]
root         107  0.0  0.0      0     0 ?        I<   May18   0:00 [DWC Notificatio]
root         109  0.0  0.0      0     0 ?        S<   May18   0:00 [vchiq-slot/0]
root         110  0.0  0.0      0     0 ?        S<   May18   0:00 [vchiq-recy/0]
root         111  0.0  0.0      0     0 ?        S<   May18   0:00 [vchiq-sync/0]
root         112  0.0  0.0      0     0 ?        I<   May18   0:00 [ipv6_addrconf]
root         121  0.0  0.0      0     0 ?        I<   May18   0:00 [kstrp]
root         124  0.0  0.0      0     0 ?        I<   May18   0:00 [kworker/u9:0]
root         130  0.0  0.0      0     0 ?        I<   May18   0:00 [cryptd]
root         154  0.0  0.0      0     0 ?        S    May18   0:00 [spi0]
root         155  0.0  0.0      0     0 ?        I<   May18   0:00 [sdhci]
root         156  0.0  0.0      0     0 ?        S    May18   0:00 [irq/28-mmc0]
root         157  0.0  0.0      0     0 ?        I<   May18   0:00 [charger_manager]
root         166  0.0  0.0      0     0 ?        I<   May18   0:00 [mmc_complete]
root         167  0.0  0.0      0     0 ?        I<   May18   0:04 [kworker/0:1H-mmc_complete]
root         210  0.0  0.0      0     0 ?        I<   May18   0:03 [kworker/2:1H-kblockd]
root         211  0.0  0.0      0     0 ?        I<   May18   0:02 [kworker/3:2H-kblockd]
root         754  0.0  0.0      0     0 ?        I<   May18   0:00 [raid5wq]
root         813  0.0  0.0      0     0 ?        S    May18   0:03 [jbd2/mmcblk0p2-]
root         814  0.0  0.0      0     0 ?        I<   May18   0:00 [ext4-rsv-conver]
root         892  0.0  0.2  67452 19080 ?        S<s  May18   0:06 /lib/systemd/systemd-journald
root         919  0.0  0.0  20164  4604 ?        Ss   May18   0:01 /lib/systemd/systemd-udevd
root         951  0.0  0.0      0     0 ?        S    May18   0:00 [vchiq-keep/0]
root         952  0.0  0.0      0     0 ?        S<   May18   0:00 [SMIO]
root        1194  0.0  0.0      0     0 ?        I<   May18   0:00 [mmal-vchiq]
root        1195  0.0  0.0      0     0 ?        I<   May18   0:00 [cfg80211]
root        1199  0.0  0.0      0     0 ?        I<   May18   0:00 [mmal-vchiq]
root        1209  0.0  0.0      0     0 ?        I<   May18   0:00 [mmal-vchiq]
root        1213  0.0  0.0      0     0 ?        I<   May18   0:00 [mmal-vchiq]
root        1323  0.0  0.0      0     0 ?        I<   May18   0:00 [brcmf_wq/mmc1:0]
root        1325  0.0  0.0      0     0 ?        S    May18   0:00 [brcmf_wdog/mmc1]
root        1530  0.0  0.0      0     0 ?        I<   May18   0:00 [kaluad]
root        1531  0.0  0.0      0     0 ?        I<   May18   0:00 [kmpath_rdacd]
root        1532  0.0  0.0      0     0 ?        I<   May18   0:00 [kmpathd]
root        1533  0.0  0.0      0     0 ?        I<   May18   0:00 [kmpath_handlerd]
root        1534  0.0  0.2 280232 16604 ?        SLsl May18   0:31 /sbin/multipathd -d -s
root        1545  0.0  0.0      0     0 ?        S<   May18   0:00 [loop0]
root        1548  0.0  0.0      0     0 ?        S<   May18   0:00 [loop1]
root        1549  0.0  0.0      0     0 ?        S<   May18   0:00 [loop2]
systemd+    1573  0.0  0.0  89940  6240 ?        Ssl  May18   0:01 /lib/systemd/systemd-timesyncd
systemd+    1613  0.0  0.0  26136  5964 ?        Ss   May18   0:16 /lib/systemd/systemd-networkd
systemd+    1615  0.0  0.1  24124 11860 ?        Ss   May18   0:35 /lib/systemd/systemd-resolved
root        1649  0.0  0.0 237504  6736 ?        Ssl  May18   0:11 /usr/lib/accountsservice/accounts-daemon
message+    1650  0.0  0.0   8304  4628 ?        Ss   May18   0:03 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root        1653  0.0  0.0  80940  1508 ?        Ssl  May18   0:49 /usr/sbin/irqbalance --foreground
root        1654  0.0  0.2  29108 16644 ?        Ss   May18   0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
syslog      1656  0.0  0.0 221180  4172 ?        Ssl  May18   0:01 /usr/sbin/rsyslogd -n -iNONE
root        1659  0.0  0.0  16468  6608 ?        Ss   May18   0:02 /lib/systemd/systemd-logind
root        1661  0.0  0.0  12376  4752 ?        Ss   May18   0:02 /sbin/wpa_supplicant -u -s -O /run/wpa_supplicant
root        1688  0.0  0.0   8336  2576 ?        Ss   May18   0:01 /usr/sbin/cron -f
root        1692  0.0  0.4  48524 32780 ?        Ss   May18   0:01 /usr/bin/python3 /usr/bin/salt-minion
root        1705  0.0  0.2 107832 19316 ?        Ssl  May18   0:00 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
daemon      1707  0.0  0.0   3592  1816 ?        Ss   May18   0:00 /usr/sbin/atd -f
root        1711  0.0  0.0   6836  1892 ttyS0    Ss+  May18   0:00 /sbin/agetty -o -p -- \u --keep-baud 115200,38400,9600 ttyS0 vt220
root        1713  0.0  0.0   5312  1468 tty1     Ss+  May18   0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root        1726  0.0  0.0  12208  6408 ?        Ss   May18   0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root        1732  0.0  0.0 232936  6072 ?        Ssl  May18   0:00 /usr/lib/policykit-1/polkitd --no-debug
root        1811  0.1  0.7 986536 57444 ?        Sl   May18   6:06 /usr/bin/python3 /usr/bin/salt-minion
root        1813  0.0  0.3 125164 26592 ?        S    May18   0:00 /usr/bin/python3 /usr/bin/salt-minion
root        2125  0.0  0.0      0     0 ?        S<   May18   0:00 [loop3]
root        2193  0.0  0.0      0     0 ?        S<   May18   0:00 [loop4]
root        2253  0.0  0.3 1148884 29436 ?       Ssl  May18   1:10 /usr/lib/snapd/snapd
root        2544  0.0  0.0      0     0 ?        S<   May18   0:00 [loop5]
avahi       6392  0.0  0.0   7152  3072 ?        Ss   May18   0:01 avahi-daemon: running [netrics-2.local]
avahi       6393  0.0  0.0   6888   328 ?        S    May18   0:00 avahi-daemon: chroot helper
root       37722  0.0  0.0      0     0 ?        I    17:55   0:00 [kworker/3:0-events]
root       39183  0.0  0.0      0     0 ?        I    19:00   0:01 [kworker/2:1-events]
root       40291  0.0  0.0      0     0 ?        I    20:03   0:00 [kworker/0:0-events]
root       40493  0.0  0.0      0     0 ?        I    20:15   0:00 [kworker/u8:1-events_power_efficient]
root       40671  0.0  0.0      0     0 ?        I    20:25   0:00 [kworker/0:1-events]
root       40750  0.0  0.0      0     0 ?        I<   20:25   0:00 [kworker/3:1H]
root       40753  0.0  0.0      0     0 ?        I<   20:25   0:00 [kworker/1:2H]
root       40756  0.0  0.0      0     0 ?        I    20:25   0:00 [kworker/2:2-events]
root       40757  0.0  0.0      0     0 ?        I    20:25   0:00 [kworker/1:0-events]
ubuntu     40761  0.1  0.1  19372  9920 ?        Ss   20:26   0:00 /lib/systemd/systemd --user
ubuntu     40762  0.0  0.0 171608  6876 ?        S    20:26   0:00 (sd-pam)
root       40845  0.0  0.0      0     0 ?        I    20:26   0:00 [kworker/3:2-events]
root       40872  0.0  0.0      0     0 ?        I<   20:26   0:00 [kworker/2:0H]
root       40891  0.0  0.0  15460  7808 ?        Ss   20:27   0:00 sshd: ubuntu [priv]
ubuntu     40965  0.0  0.0  15596  4364 ?        S    20:27   0:00 sshd: ubuntu@pts/0
ubuntu     40966  0.0  0.0   9664  4608 pts/0    Ss   20:27   0:00 -bash
root       40976  0.0  0.0  15460  7680 ?        Ss   20:28   0:00 sshd: ubuntu [priv]
ubuntu     41050  0.0  0.0  15596  4464 ?        S    20:28   0:00 sshd: ubuntu@pts/1
ubuntu     41051  0.0  0.0   9664  4608 pts/1    Ss+  20:28   0:00 -bash
root       41093  0.0  0.0      0     0 ?        I    20:30   0:00 [kworker/0:2-events]
root       41164  0.0  0.0      0     0 ?        I<   20:31   0:00 [kworker/0:2H]
root       41165  0.0  0.0      0     0 ?        I    20:31   0:00 [kworker/1:1-events]
root       41166  0.0  0.0      0     0 ?        I    20:31   0:00 [kworker/u8:0-events_power_efficient]
root       41206  0.0  0.0      0     0 ?        I    20:32   0:00 [kworker/2:0-events]
root       41218  0.0  0.0      0     0 ?        I    20:32   0:00 [kworker/3:1-mm_percpu_wq]
root       41236  0.0  0.0      0     0 ?        I<   20:32   0:00 [kworker/3:0H]
root       41255  0.0  0.0      0     0 ?        I<   20:33   0:00 [kworker/1:1H]
root       41262  0.0  0.0      0     0 ?        I<   20:35   0:00 [kworker/2:2H]
root       41344  0.0  0.0      0     0 ?        I    20:36   0:00 [kworker/1:2-events]
root       41345  0.0  0.0      0     0 ?        I    20:36   0:00 [kworker/u8:2-events_unbound]
ubuntu     41346  0.0  0.0  11140  3144 pts/0    R+   20:36   0:00 ps -auxww


ggmartins avatar ggmartins commented on July 17, 2024 2

same problem with a fresh raspbian install 32bits:
Linux raspberrypi 5.10.17-v7l+ #1414 SMP Fri Apr 30 13:20:47 BST 2021 armv7l GNU/Linux

go1.16.4.linux-armv6l.tar.gz

log
pi@raspberrypi:~ $ go get -v github.com/m-lab/ndt7-client-go/cmd/ndt7-client
go: downloading github.com/m-lab/ndt7-client-go v0.4.1
go: downloading github.com/m-lab/go v0.1.43
go: downloading github.com/gorilla/websocket v1.4.2
go: downloading github.com/m-lab/locate v0.4.1
go: downloading github.com/m-lab/ndt-server v0.20.2
go: downloading github.com/araddon/dateparse v0.0.0-20200409225146-d820a6159ab1
go: downloading github.com/m-lab/tcp-info v1.5.2
github.com/m-lab/ndt-server/metadata
runtime/cgo
github.com/araddon/dateparse
github.com/m-lab/go/rtx
github.com/m-lab/locate/api/v2
github.com/m-lab/ndt7-client-go/internal/params
github.com/m-lab/tcp-info/tcp
github.com/m-lab/go/flagx
net
net/textproto
github.com/m-lab/go/anonymize
crypto/x509
vendor/golang.org/x/net/http/httpproxy
github.com/m-lab/tcp-info/inetdiag
github.com/m-lab/ndt-server/ndt7/model
vendor/golang.org/x/net/http/httpguts
github.com/m-lab/ndt7-client-go/spec
github.com/m-lab/ndt7-client-go/cmd/ndt7-client/internal/emitter
crypto/tls
net/http/httptrace
net/http
github.com/m-lab/locate/api/locate
github.com/gorilla/websocket
github.com/m-lab/ndt7-client-go/internal/websocketx
github.com/m-lab/ndt7-client-go/internal/download
github.com/m-lab/ndt7-client-go/internal/upload
github.com/m-lab/ndt7-client-go
github.com/m-lab/ndt7-client-go/cmd/ndt7-client
pi@raspberrypi:~ $ cd ~/go/bin/
pi@raspberrypi:~/go/bin $ ./ndt7-client 
download in progress with ndt-mlab2-ord06.mlab-oti.measurement-lab.org
Avg. speed  :   121.0 Mbit/s
download: complete
upload in progress with ndt-mlab2-ord06.mlab-oti.measurement-lab.org
Avg. speed  :    23.7 Mbit/s
upload: complete
         Server: ndt-mlab2-ord06.mlab-oti.measurement-lab.org
         Client: [REDACTED]
        Latency:     7.2 ms
       Download:   121.0 Mbit/s
         Upload:    23.7 Mbit/s
 Retransmission:    0.00 %
pi@raspberrypi:~/go/bin $ ./ndt7-client 
download in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :   122.0 Mbit/s
download: complete
upload in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :    24.0 Mbit/s
upload: complete
         Server: ndt-mlab1-ord05.mlab-oti.measurement-lab.org
         Client: [REDACTED]
        Latency:     6.2 ms
       Download:   122.0 Mbit/s
         Upload:    24.0 Mbit/s
 Retransmission:    0.00 %
pi@raspberrypi:~/go/bin $ ./ndt7-client 
download in progress with ndt-mlab3-ord06.mlab-oti.measurement-lab.org
Avg. speed  :   121.3 Mbit/s
download: complete
upload in progress with ndt-mlab3-ord06.mlab-oti.measurement-lab.org
Avg. speed  :    24.1 Mbit/s
upload: complete
         Server: ndt-mlab3-ord06.mlab-oti.measurement-lab.org
         Client: [REDACTED]
        Latency:     5.8 ms
       Download:   121.3 Mbit/s
         Upload:    24.1 Mbit/s
 Retransmission:    0.00 %
pi@raspberrypi:~/go/bin $ 


ggmartins avatar ggmartins commented on July 17, 2024 2

sure, here you go:

./ndt7-client-go/cmd/ndt7-client/ndt7-rob -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest -scheme=wss

    upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     3.9 ms
           Download:   814.9 Mbit/s
             Upload:    42.9 Mbit/s
     Retransmission:    0.01 %
    upload: complete
             Server: ndt-mlab4-ord06.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     3.7 ms
           Download:   827.5 Mbit/s
             Upload:    42.9 Mbit/s
     Retransmission:    0.00 %
    upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.3 ms
           Download:   851.3 Mbit/s
             Upload:    42.9 Mbit/s
     Retransmission:    0.00 %
    upload: complete
             Server: ndt-mlab4-ord05.mlab-staging.measurement-lab.org
             Client: [REDACTED]
            Latency:     4.0 ms
           Download:   844.3 Mbit/s
             Upload:    42.9 Mbit/s
     Retransmission:    0.00 %

FYI, your PR is being tested on multiple connections ranging from 100Mbps to 1000Mbps and with RPis and Jetson, a total of ~10 devices. If you don't hear anything else from me, that means your PR is nailing it.


feamster avatar feamster commented on July 17, 2024 2

This looks much more sane for me now, too.

image


feamster avatar feamster commented on July 17, 2024 1

Same issue for me from my home, compare Ookla to NDT7.

image


laiyi-ohlsen avatar laiyi-ohlsen commented on July 17, 2024 1

@feamster are those results also using ndt7-client-go on a raspi?


bassosimone avatar bassosimone commented on July 17, 2024 1

Thank you both for following up! I am going to help people in the core M-Lab team to better understand those issues! 🤗


robertodauria avatar robertodauria commented on July 17, 2024 1

@ggmartins Thank you for reporting this and for the kind words! May I ask which version of Go you used to build the client running on the Raspberry Pi?

On some CPUs (namely, armv7 without a hardware AES implementation) TLS adds a lot of overhead because the crypto is implemented in Go. In those cases, I would recommend using -scheme=ws to disable TLS, which usually makes a big difference. However, what's strange is that you're seeing poor performance on what I believe is an arm64 CPU that supports AES. Could you please make sure you're using a recent Go release (ideally, 1.16) to build the client?

Also, the output of cat /proc/cpuinfo on the Raspberry Pi would be helpful. Thanks!
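For anyone who wants to check this without reading the full cpuinfo dump, here is a minimal Go sketch that looks for the aes flag. It assumes an ARM Linux system, where the flags appear on "Features" lines (x86 kernels use "flags" instead); hasCPUFeature is a hypothetical helper written for this illustration and is not part of ndt7-client-go.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hasCPUFeature reports whether any "Features" line in cpuinfo-style text
// lists the given flag (for example "aes"). The helper name is hypothetical.
func hasCPUFeature(cpuinfo, flag string) bool {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		// Each line looks like "Features\t: fp asimd evtstrm aes ...".
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) != 2 || strings.TrimSpace(parts[0]) != "Features" {
			continue
		}
		for _, f := range strings.Fields(parts[1]) {
			if f == flag {
				return true
			}
		}
	}
	return false
}

func main() {
	// Assumption: /proc/cpuinfo exists, i.e. we are running on Linux.
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/cpuinfo:", err)
		os.Exit(1)
	}
	fmt.Println("hardware AES advertised:", hasCPUFeature(string(data), "aes"))
}
```

A result of false only means the kernel does not advertise the AES extension; it says nothing about how fast the software fallback is on a given board.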


robertodauria avatar robertodauria commented on July 17, 2024 1

@ggmartins Yes, AES is (only) used by TLS encryption and the lack of hardware support is definitely the culprit. :)
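To get a rough feel for how much the missing AES instructions cost on a given board, one option is a quick single-core AES-GCM micro-benchmark using only the Go standard library. This is a sketch under assumptions: aesGCMThroughput is a hypothetical helper, the buffer size and round count are arbitrary, and a real TLS connection may negotiate a different cipher suite, so the number is only an upper-bound hint rather than a prediction of ndt7 results.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
	"time"
)

// aesGCMThroughput seals `rounds` buffers of `size` bytes with AES-128-GCM
// and returns the observed rate in Mbit/s. Hypothetical helper for this sketch.
func aesGCMThroughput(size, rounds int) (float64, error) {
	// All-zero key and a reused nonce: acceptable for timing only,
	// never for protecting real data.
	block, err := aes.NewCipher(make([]byte, 16))
	if err != nil {
		return 0, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return 0, err
	}
	nonce := make([]byte, gcm.NonceSize())
	plain := make([]byte, size)
	out := make([]byte, 0, size+gcm.Overhead())

	start := time.Now()
	for i := 0; i < rounds; i++ {
		// Reuse the output buffer so allocation does not dominate the timing.
		out = gcm.Seal(out[:0], nonce, plain, nil)
	}
	elapsed := time.Since(start).Seconds()
	if elapsed <= 0 {
		elapsed = 1e-9 // guard against a zero reading from a coarse clock
	}
	bits := float64(size) * float64(rounds) * 8
	return bits / elapsed / 1e6, nil
}

func main() {
	mbps, err := aesGCMThroughput(1<<20, 200) // 200 buffers of 1 MiB each
	if err != nil {
		fmt.Println("benchmark failed:", err)
		return
	}
	fmt.Printf("AES-128-GCM seal: %.0f Mbit/s on one core\n", mbps)
}
```

If the figure printed on the Raspberry Pi is well below the link speed, that would be consistent with encryption being the bottleneck.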

Would you be willing to build and test the sandbox-roberto-remove-memory-alloc branch? I've made some optimizations that improved the performance on an armv7 device I have here quite significantly when running the measurement without TLS.

Additionally, it's probably also worth testing with the next release (v0.20.6) of ndt-server that is available on our staging environment. You can use a server from the staging environment by running the ndt7-client like this:

./ndt7-client -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest -scheme=ws

(I expect the staging environment to perform a bit better even when using TLS, but you won't get reasonable speeds without hardware encryption anyway.)

A test with just -scheme=ws and one with -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest -scheme=ws would allow me to understand what's slowing the client down a bit better. Thanks!


ggmartins avatar ggmartins commented on July 17, 2024 1

sure, I can do that in the next couple of hours, bwm.


ggmartins avatar ggmartins commented on July 17, 2024 1

@jlivingood let me break this down for you:

  • segment @ ~150Mbps, rpi4 wss (encrypted, old client code)
  • segment @ ~740Mbps, rpi4 ws (unencrypted, old client code)
  • segment @ ~900Mbps, rpi4 ws (unencrypted, new client code)
    (all of this using production servers)

not in the graph:
~850Mbps rpi4 wss (encrypted, new client code on staging servers, to be released soon)
https://locate-dot-mlab-staging.appspot.com/v2/nearest

my takeaway here: even with ChaCha20 we'll still see encrypted measurements underperforming unencrypted ones. the question is, do we really need encryption? imho, I don't think so, but we do need obfuscation. so I think the holy grail here would be having WebSockets support an ultra-lightweight obfuscation method that outperforms the lightest encryption method available. I'm not an expert in the matter; maybe this already exists.
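To put the segments listed above side by side, here is a small Go sketch that expresses each approximate rate as a percentage below the fastest unencrypted run. The numbers are the rounded figures quoted in this comment; percentBelow is a hypothetical helper, and the wss/new-client figure was measured against staging servers, so it is not a like-for-like comparison with the production runs.

```go
package main

import "fmt"

// percentBelow returns how far value falls below baseline, in percent.
// Hypothetical helper for this illustration.
func percentBelow(value, baseline float64) float64 {
	return (1 - value/baseline) * 100
}

func main() {
	// Approximate rpi4 download rates quoted above, in Mbit/s.
	const baseline = 900.0 // ws, new client code, production servers
	runs := []struct {
		name string
		mbps float64
	}{
		{"wss, old client (production)", 150},
		{"ws, old client (production)", 740},
		{"ws, new client (production)", 900},
		{"wss, new client (staging)", 850},
	}
	for _, r := range runs {
		fmt.Printf("%-30s %4.0f Mbit/s  %5.1f%% below baseline\n",
			r.name, r.mbps, percentBelow(r.mbps, baseline))
	}
}
```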

cheers,

G


bassosimone avatar bassosimone commented on July 17, 2024

What is the CPU load when running the test? Is ndt7-client-go using 100% of the CPU?


bassosimone avatar bassosimone commented on July 17, 2024

@ggmartins Thank you very much for providing detailed information! 🙏 💯

I find it interesting that in the htop screenshot you posted there is a Go thread using 150% of the CPU. This strikes me as a goroutine doing too much work, even though I don't know very well what this goroutine may be doing.

It would probably help shed more light on what's happening to use the diff at #60 to collect a CPU profile. Would you mind building ndt7-client using my fork at the branch indicated in #60? This PR adds a CLI flag, -profile <file>, which collects CPU profile information. If you don't mind sharing the collected cpuprofile file, then looking at it would certainly help me and the core team get a much better understanding of what could be the bottleneck there.

I suppose the ideal is to gather a profile for both the arm64 and the arm32 devices you mentioned.
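For readers unfamiliar with Go CPU profiling, the following is a minimal sketch of how a -profile flag could be wired up with the standard runtime/pprof package. It is an illustration only: the actual diff in #60 may be structured differently, and startCPUProfile is a hypothetical helper name.

```go
package main

import (
	"flag"
	"log"
	"os"
	"runtime/pprof"
)

// startCPUProfile begins writing a CPU profile to path and returns a
// function that stops profiling and closes the file.
// Hypothetical helper; not taken from the PR.
func startCPUProfile(path string) (func(), error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	return func() {
		pprof.StopCPUProfile()
		f.Close()
	}, nil
}

func main() {
	profile := flag.String("profile", "", "write a CPU profile to this file")
	flag.Parse()

	if *profile != "" {
		stop, err := startCPUProfile(*profile)
		if err != nil {
			log.Fatal(err)
		}
		// Flush the profile when main returns.
		defer stop()
	}

	// ... the measurement would run here ...
}
```

The resulting file can then be inspected with go tool pprof, passing the binary and the profile file as arguments.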


bassosimone avatar bassosimone commented on July 17, 2024

@ggmartins another useful data point we could collect here just occurred to me (it's been a long time since I looked at this ndt7 codebase, but I'm still interested in understanding this bug, which could also affect us at @ooni).

There is a way to run ndt7-client w/o encryption: ./ndt7-client -scheme ws. This forces the unencrypted ws scheme. A basic test is to execute the client in this mode and check whether there are differences in performance. If there are striking performance differences, this is a reasonable indicator that the problem is encryption. (I know the Go codebase does not support hardware acceleration for AES on arm32, though I am not sure this could be the root cause of your issue because the problem you reported occurs with an arm64 device, for which there should be support.)

Thanks again for investigating this problem! 🤗


bassosimone avatar bassosimone commented on July 17, 2024

@feamster with more info on your setup we can make it faster! Is that arm32, arm64, or desktop? Thanks for reporting that!


ggmartins avatar ggmartins commented on July 17, 2024

@bassosimone running w/o enc unlocked better numbers:

ubuntu@netrics:~/ndt7profile/ndt7-client-go/cmd/ndt7-client$ ./ndt7-client-prof
download in progress with ndt-mlab3-ord03.mlab-oti.measurement-lab.org
Avg. speed  :   155.1 Mbit/s
download: complete
upload in progress with ndt-mlab3-ord03.mlab-oti.measurement-lab.org
Avg. speed  :    24.5 Mbit/s
upload: complete
...
 Retransmission:    0.00 %
ubuntu@netrics:~/ndt7profile/ndt7-client-go/cmd/ndt7-client$ ./ndt7-client-prof -scheme ws
download in progress with ndt-mlab1-ord06.mlab-oti.measurement-lab.org
Avg. speed  :   414.3 Mbit/s
download: complete
upload in progress with ndt-mlab1-ord06.mlab-oti.measurement-lab.org
Avg. speed  :    24.6 Mbit/s
upload: complete
...
 Retransmission:    0.00 %
ubuntu@netrics:~/ndt7profile/ndt7-client-go/cmd/ndt7-client$ ./ndt7-client-prof -scheme ws
download in progress with ndt-mlab3-ord04.mlab-oti.measurement-lab.org
Avg. speed  :   409.3 Mbit/s
download: complete
upload in progress with ndt-mlab3-ord04.mlab-oti.measurement-lab.org
Avg. speed  :    24.6 Mbit/s
upload: complete
...
 Retransmission:    0.00 %

although Jetson Nano can do better:

./ndt7-client-prof 
download in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :   456.4 Mbit/s
download: complete
upload in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :    23.5 Mbit/s
upload: complete
         Server: ndt-mlab1-ord05.mlab-oti.measurement-lab.org
         Client: [REDACTED]
        Latency:    27.6 ms
       Download:   456.4 Mbit/s
         Upload:    23.5 Mbit/s
 Retransmission:    0.00 %


feamster avatar feamster commented on July 17, 2024

@bassosimone yes, this: Linux raspberrypi 5.10.17-v7l+ #1414 SMP Fri Apr 30 13:20:47 BST 2021 armv7l GNU/Linux

Disabling encryption sped things up a bit, but still nowhere near iperf3

image


ggmartins avatar ggmartins commented on July 17, 2024

Hi @robertodauria, thanks for helping on this.
yes, we see the problem, no aes on hw for the rpi, that explains the cheap price, at least :-)
is AES only used for encryption? because we tried disabling it with no joy for speeds > 500Mbps (see Nick's graphics above). I mean, there's a performance gap relative to Ookla's speedtest that gets noticeable at higher speeds, and we desperately need the retransmission rate from ndt7 :-) Let us know your thoughts.

Jetson Nano
processor	: 0
model name	: ARMv8 Processor rev 1 (v8l)
BogoMIPS	: 38.40
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 1

processor	: 1
model name	: ARMv8 Processor rev 1 (v8l)
BogoMIPS	: 38.40
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 1

processor	: 2
model name	: ARMv8 Processor rev 1 (v8l)
BogoMIPS	: 38.40
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 1

processor	: 3
model name	: ARMv8 Processor rev 1 (v8l)
BogoMIPS	: 38.40
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 1
Jetson NX
cat /proc/cpuinfo 
processor	: 0
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445

processor	: 1
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445

processor	: 2
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445

processor	: 3
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445

processor	: 4
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445

processor	: 5
model name	: ARMv8 Processor rev 0 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer	: 0x4e
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x004
CPU revision	: 0
MTS version	: 50168445
Raspberry Pi 4 Model B Rev 1.4
processor	: 0
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 1
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 2
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 3
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

Hardware	: BCM2835
Revision	: d03114
Serial		: 100000005d172b32
Model		: Raspberry Pi 4 Model B Rev 1.4

On both the RPi and the Nano we're mostly using 64-bit Ubuntu, and with go1.16.4.linux-arm64.tar.gz we get the same results on the RPi:

go version
go version go1.16.4 linux/arm64
ubuntu@netrics:~/golang$ ~/go/bin/ndt7-client 
download in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :   154.3 Mbit/s
download: complete
upload in progress with ndt-mlab1-ord05.mlab-oti.measurement-lab.org
Avg. speed  :    23.6 Mbit/s
upload: complete
         Server: ndt-mlab1-ord05.mlab-oti.measurement-lab.org
         Client: [REDACTED]
        Latency:    26.8 ms
       Download:   154.3 Mbit/s
         Upload:    23.6 Mbit/s
 Retransmission:    0.00 %

FWIW, I was able to replicate the problem on 32-bit Raspbian:
Linux raspberrypi 5.10.17-v7l+ #1414 SMP Fri Apr 30 13:20:47 BST 2021 armv7l GNU/Linux
go1.16.4.linux-armv6l.tar.gz

Thanks,

G

from ndt7-client-go.

robertodauria avatar robertodauria commented on July 17, 2024

@ggmartins Very happy to hear that! 🎉

The changes to the client have just been merged to the master branch with the PR above.

There is also an additional change I've made to automatically detect if the AES crypto extensions are available (on x86, ARMv7 and ARMv8) and default to using WS if not, so if you are building the latest code you won't need to specify the scheme anymore.
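For anyone curious what that kind of check could look like, here is a minimal sketch. It is not the code from the PR: it assumes a Linux host and simply looks for the `aes` entry in `/proc/cpuinfo`-style text ("Features" on ARM, "flags" on x86). The function names `hasAESFeature` and `defaultScheme` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAESFeature reports whether any "Features" (ARM) or "flags" (x86) line
// in /proc/cpuinfo-style text lists the "aes" capability.
// Assumption: parsing cpuinfo text; the real client may detect this differently.
func hasAESFeature(cpuinfo string) bool {
	for _, line := range strings.Split(cpuinfo, "\n") {
		parts := strings.SplitN(line, ":", 2)
		if len(parts) != 2 {
			continue
		}
		key := strings.ToLower(strings.TrimSpace(parts[0]))
		if key != "features" && key != "flags" {
			continue
		}
		// Compare whole tokens so that a longer feature name never matches.
		for _, f := range strings.Fields(parts[1]) {
			if f == "aes" {
				return true
			}
		}
	}
	return false
}

// defaultScheme picks "wss" when hardware AES is available and "ws" otherwise.
func defaultScheme(hasAES bool) string {
	if hasAES {
		return "wss"
	}
	return "ws"
}

func main() {
	// Feature lines copied from the cpuinfo dumps earlier in this thread.
	jetson := "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32"
	rpi4 := "Features\t: fp asimd evtstrm crc32 cpuid"
	fmt.Println(defaultScheme(hasAESFeature(jetson))) // wss
	fmt.Println(defaultScheme(hasAESFeature(rpi4)))   // ws
}
```

With the dumps posted above, the Jetson boards would default to `wss` and the Raspberry Pi 4 (no `aes` in its Features line) would fall back to `ws`.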

I'm quite surprised by the improvements with TLS and the staging ndt-server. On my test device TLS was ~50% faster (~130Mb/s -> ~190Mb/s), but you are seeing 4.4x the previous speed. Could you please confirm that result by explicitly setting -scheme=wss -locate.url=https://locate-dot-mlab-staging.appspot.com/v2/nearest?

The promotion of ndt-server v0.20.6 from staging to production should happen before the end of the month -- likely next week. :)

from ndt7-client-go.

robertodauria avatar robertodauria commented on July 17, 2024

OK, it makes more sense now. Thanks for testing again! :)

Go 1.16 changed the TLS negotiation so that if the client does not signal that it supports hardware AES, it defaults to ChaCha20, for which there is an assembly implementation in Go's crypto library for ARM64 but not for ARMv7. That explains why I wasn't seeing that much improvement (my ODROID runs on ARMv7).

However, without TLS my ODROID is pretty consistently giving me the same or slightly better results than Ookla's command-line client after that fix.

Closing this issue, for now. Please feel free to reopen if needed.

from ndt7-client-go.

jlivingood avatar jlivingood commented on July 17, 2024

What's the exact increase in speed above? Eyeballing it looks like a ~300% increase in measurement results or alternatively, the prior measurements under-reported "speeds" by around 80%? Anyway - it would be good to understand both of these stats.
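To make the relationship between those two stats concrete, here is a small sketch of the arithmetic. It only uses the "4.4x the previous speed" ratio mentioned earlier in the thread as an illustrative input, not as a measured answer; the function names are hypothetical.

```go
package main

import "fmt"

// percentIncrease returns how much larger newV is than oldV,
// expressed as a percentage of oldV.
func percentIncrease(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

// percentUnderReported returns the share of newV that oldV failed to
// report, expressed as a percentage of newV.
func percentUnderReported(oldV, newV float64) float64 {
	return (newV - oldV) / newV * 100
}

func main() {
	// Illustrative ratio only: 1.0 is an arbitrary base, 4.4 is the
	// "4.4x" figure quoted above.
	oldV, newV := 1.0, 4.4
	fmt.Printf("increase: %.0f%%\n", percentIncrease(oldV, newV))
	fmt.Printf("under-reported: %.0f%%\n", percentUnderReported(oldV, newV))
}
```

The two numbers describe the same change from opposite baselines: a 4x result is a 300% increase and a 75% under-report, while a 4.4x result is roughly a 340% increase and a 77% under-report.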

from ndt7-client-go.

feamster avatar feamster commented on July 17, 2024

These improvements look great.

Many of our initial deployments of ndt7 were seeing 150 Mbps, then 740 Mbps, before a fix was eventually deployed. (ndt1 never even came close to 1 Gbps.) Some clients are still seeing 850 Mbps in comparison to other speedtests. Presumably there are many clients who have downstream speeds higher than 150 Mbps at home, and some of the ndt7 data in the public dataset may also be invalid.

The changes here do affect the subset of clients we tested on, but you don't have any way of knowing that they only affect those clients without extensive testing. The issues were related to Go's garbage collector and the default encryption used in an old version of Go, both of which would be exacerbated by running on an embedded device, but nonetheless omnipresent and possibly an issue for other tests. The particular garbage collection slowdown is a known issue for WebSockets in Go more broadly: gorilla/websocket#134

And it is in fact pretty common for people to run these kinds of tests on embedded devices that are always on, attached to routers, etc. That's how we've been doing it since 2010, and many projects deploy router-based speed tests. Anyone who was using this test prior to June 4, 2021 may very well have injected bad data into the database.

In hindsight, if you have metadata about the nature of the device on which the test was being performed, it may be possible to go back and look for patterns in the old data to try to clean it up. There may be other ways to clean things up. For example, if you see a client that consistently measures 900 Mbps but periodically experiences drops, that's less likely this particular bug. But if the measurement is always 150 Mbps, you really have no way of knowing what's causing that absent more metadata.
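As a rough sketch of that pattern check (not an M-Lab tool, and with a threshold invented purely for illustration), one could label each client's download series like this. The names `classify` and `suspectCeilingMbps` are hypothetical.

```go
package main

import "fmt"

// suspectCeilingMbps is a hypothetical cut-off: results below it could be
// limited by the client rather than by the network.
const suspectCeilingMbps = 200.0

// classify labels one client's series of download results (Mbit/s).
// A client whose results are mostly above the ceiling has shown it can
// exceed it, so occasional drops are less likely to be this bug. A client
// that rarely or never exceeds the ceiling cannot be judged without
// more metadata.
func classify(samples []float64) string {
	if len(samples) == 0 {
		return "no-data"
	}
	above := 0
	for _, s := range samples {
		if s > suspectCeilingMbps {
			above++
		}
	}
	// Strict majority of samples above the ceiling.
	if above*2 > len(samples) {
		return "less-likely-affected"
	}
	return "indeterminate"
}

func main() {
	fmt.Println(classify([]float64{900, 910, 300, 905})) // mostly high, one drop
	fmt.Println(classify([]float64{150, 152, 149}))      // always low
}
```

The first series comes out as "less-likely-affected" and the second as "indeterminate", which matches the point above: a flat 150 Mbps series tells you nothing by itself.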

Absent a cleanup effort, the only conclusion we can draw is that some non-zero and possibly significant amount of NDT test data may be subject to client-side limitations that affect the accuracy of the test, and that any data prior to June 4, 2021 should be discarded. Ethical practice suggests that you should expunge the old, inaccurate data from the M-Lab servers (or at the least, annotate it or move it to a legacy table) so that it doesn't continue to be misused by the public.

from ndt7-client-go.
