Comments (19)

blyxxyz avatar blyxxyz commented on May 21, 2024 1

I think 8 MB/s is just within the ballpark of what the current implementation can manage. You only run into it if your connection is fast enough. It's more of an upper cap than a constant slowdown.

I've had to test on a VPS before when my home connection didn't cut it.

from xh.

ducaale avatar ducaale commented on May 21, 2024 1

xh v0.9.2 has been released which has a download speed comparable to wget 🎉

blyxxyz avatar blyxxyz commented on May 21, 2024

Hm, I can't reproduce this myself.

Does it happen when downloading from localhost?

Does it happen with a trivial reqwest-based downloader? e.g.

main.rs

fn main() {
    let url = std::env::args().nth(1).unwrap();
    let out = std::env::args().nth(2).unwrap();
    let mut file = std::fs::OpenOptions::new()
        .write(true)
        .create(true)
        // Truncate so a shorter download doesn't leave stale bytes from an existing file.
        .truncate(true)
        .open(&out)
        .unwrap();
    reqwest::blocking::get(&url)
        .unwrap()
        .copy_to(&mut file)
        .unwrap();
}

Cargo.toml

[package]
name = "simpledl"
version = "0.0.0"
edition = "2018"

[dependencies]
reqwest = { version = "0.11", default-features = false, features = ["blocking", "rustls-tls", "stream", "json", "gzip", "brotli", "multipart"] }

(features copied from ht's, plus blocking)

svenstaro avatar svenstaro commented on May 21, 2024

I can actually reproduce it like this, yes:

fn main() {
    let url = "http://mirror.pseudoform.org/community/os/x86_64/cuda-11.2.0-3-x86_64.pkg.tar.zst";
    let out = "/tmp/cuda.zst";
    let mut file = std::fs::OpenOptions::new()
        .write(true)
        .create(true)
        // Truncate so a shorter download doesn't leave stale bytes from an existing file.
        .truncate(true)
        .open(&out)
        .unwrap();
    reqwest::blocking::get(url)
        .unwrap()
        .copy_to(&mut file)
        .unwrap();
}

takes 192s to run while

wget http://mirror.pseudoform.org/community/os/x86_64/cuda-11.2.0-3-x86_64.pkg.tar.zst

takes 41s.

Perhaps this is related to latency?

blyxxyz avatar blyxxyz commented on May 21, 2024

Thank you, I've narrowed it down a little. When I download that file ht is significantly slower than wget, but if I recompile ht with default-tls instead of rustls-tls the difference disappears.

(There could also be some relation to HTTP/2, as it's used when I download with rustls-tls but not default-tls. But curl --http2 is as fast as wget.)

@ducaale, what was the reason for switching to rustls? Could it be worth switching back?

ducaale avatar ducaale commented on May 21, 2024

I made the switch from default-tls to rustls-tls after encountering some problems in configuring CI/CD for ht. Some people liked the fact that ht is now using rustls-tls but I suppose we could switch back.

svenstaro avatar svenstaro commented on May 21, 2024

Hm, it's a tough call, and maybe this is just a matter of setting some buffer sizes correctly. I actually like rustls: it makes deployment a bit easier and allows for entirely static builds, which is definitely a plus.

Perhaps we can leave this issue open and investigate a bit? It might also be an issue in how reqwest does things. Since this appears to be latency-related, my buffer size hypothesis might be a lead.

blyxxyz avatar blyxxyz commented on May 21, 2024

For what it's worth, I use a large buffer for downloading in #41 and that doesn't make a difference. So if it is a buffer then it's not one we can easily reach.
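To check the buffer hypothesis in isolation, a copy loop with an explicit buffer size can be timed on its own. The following is a minimal sketch, not the code from #41: `copy_timed` is a hypothetical helper, and the in-memory `Cursor` is only a stand-in for a response body.

```rust
use std::io::{self, Read, Write};
use std::time::{Duration, Instant};

/// Copy `reader` into `writer` through a buffer of `buf_size` bytes,
/// returning the number of bytes copied and the elapsed wall-clock time.
/// (Hypothetical helper for experiments; not taken from xh.)
fn copy_timed<R: Read, W: Write>(
    mut reader: R,
    mut writer: W,
    buf_size: usize,
) -> io::Result<(u64, Duration)> {
    let mut buf = vec![0u8; buf_size];
    let mut total = 0u64;
    let start = Instant::now();
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
    writer.flush()?;
    Ok((total, start.elapsed()))
}

fn main() -> io::Result<()> {
    // In-memory stand-in for a download; sizes below are arbitrary.
    let data = vec![7u8; 1 << 20];
    for &size in &[8 * 1024, 64 * 1024, 1024 * 1024] {
        let mut out = Vec::new();
        let (bytes, elapsed) = copy_timed(io::Cursor::new(&data), &mut out, size)?;
        println!("buffer {:>7} B: {} bytes in {:?}", size, bytes, elapsed);
    }
    Ok(())
}
```

reqwest's blocking `Response` implements `std::io::Read`, so it could be passed as the reader in place of the `Cursor`. If the timings barely change across buffer sizes, the bottleneck is probably below this layer.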

Followup questions could be:

  • Does it happen with other hosts?
    • Is it related to HTTP/2?
    • Is it related to Caddy (mirror.pseudoform.org's server)?
  • Does it happen when calling hyper directly, without reqwest?

svenstaro avatar svenstaro commented on May 21, 2024

There are multiple other mirrors for the same file that you can check:

So a pattern emerges: downloading is slow over HTTP/2. This would seem to fit very well with your earlier research about the TLS library.

Now the question is: What exactly makes this slow since my CPU isn't anywhere near busy with downloading the file using xh?

svenstaro avatar svenstaro commented on May 21, 2024

Ok so after a bit of trial and error I was at least able to find a workaround to the problem. I set

    let client = Client::builder()
        .http2_initial_stream_window_size(100_000_000)
        .http2_initial_connection_window_size(100_000_000)
        .redirect(redirect)
        .build()?;

So basically I set giant initial window sizes and that really made it go! I'm not sure this is a proper solution but perhaps we can now investigate a bit more along these lines?

svenstaro avatar svenstaro commented on May 21, 2024

There's this issue which seems somewhat related: hyperium/hyper#1960

blyxxyz avatar blyxxyz commented on May 21, 2024

Aha! How does the download fare with a connection window of 5MB and stream window of 3MB, as suggested in that issue?

svenstaro avatar svenstaro commented on May 21, 2024

I believe that's actually the current default (and it's not great). I think the proper fix is to enable adaptive window sizing and I posted the fix upstream: seanmonstar/reqwest#1194

blyxxyz avatar blyxxyz commented on May 21, 2024

reqwest's docs list 65,535 for both windows.
I can't easily find that default implemented in the source code though.
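For a rough sense of why a small window would matter on a high-latency link: HTTP/2 flow control limits how much unacknowledged data can be in flight, so throughput is bounded by roughly the window size divided by the round-trip time. Below is a minimal sketch of that arithmetic. The function names and RTT values are illustrative assumptions, not measurements from this thread, and the model ignores TCP slow start and protocol overhead.

```rust
use std::time::Duration;

/// Upper bound on throughput (bytes/second) when at most `window_bytes`
/// can be in flight per round trip. Simplified model: ignores TCP slow
/// start, TLS and framing overhead.
fn max_throughput(window_bytes: u32, rtt: Duration) -> f64 {
    window_bytes as f64 / rtt.as_secs_f64()
}

/// Window (bytes) needed to sustain `bytes_per_sec` at the given RTT,
/// i.e. the bandwidth-delay product.
fn window_needed(bytes_per_sec: f64, rtt: Duration) -> f64 {
    bytes_per_sec * rtt.as_secs_f64()
}

fn main() {
    // Window size listed in reqwest's docs, as quoted above.
    let documented_window = 65_535;
    // Hypothetical round-trip times in milliseconds.
    for &ms in &[1u64, 20, 100] {
        let rtt = Duration::from_millis(ms);
        println!(
            "RTT {:>3} ms: at most {:.2} MB/s",
            ms,
            max_throughput(documented_window, rtt) / 1e6
        );
    }
    // Window that would be needed for a hypothetical 18 MB/s at 100 ms RTT.
    println!(
        "window for 18 MB/s at 100 ms: {:.0} bytes",
        window_needed(18e6, Duration::from_millis(100))
    );
}
```

Under this model, the same window that is harmless on localhost becomes a hard cap once the RTT grows, which would be consistent with the slowdown only showing up against remote hosts.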

svenstaro avatar svenstaro commented on May 21, 2024

Progress: This is merged in reqwest and should be turned on in xh in the next release.

blyxxyz avatar blyxxyz commented on May 21, 2024

reqwest v0.11.2 has been released.

Enabling http2_adaptive_window raises the max speed (in my test, 15 MB/s vs wget's 18 MB/s), but it causes the speed to ramp up much more slowly initially. I'm not sure why. http2_initial_stream_window_size() and http2_initial_connection_window_size() don't seem to help.

svenstaro avatar svenstaro commented on May 21, 2024

I think we should still turn it on in xh, as the advantages outweigh the disadvantages.

blyxxyz avatar blyxxyz commented on May 21, 2024

Yes, but I'm wondering if there's some other knob that can improve it. My knowledge is very surface-level. Do you have any ideas?

ducaale avatar ducaale commented on May 21, 2024

Not sure why, but I am seeing comparable speeds between xh (using HTTP/2 and rustls) and wget:

$ wget https://mirror.23media.com/archlinux/community/os/x86_64/cuda-11.2.2-1-x86_64.pkg.tar.zst
--2021-03-24 22:34:07--  https://mirror.23media.com/archlinux/community/os/x86_64/cuda-11.2.2-1-x86_64.pkg.tar.zst
Resolving mirror.23media.com (mirror.23media.com)... 212.83.32.30, 2a00:f48:1007:32::30
Connecting to mirror.23media.com (mirror.23media.com)|212.83.32.30|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1264829973 (1.2G) [application/octet-stream]
Saving to: ‘cuda-11.2.2-1-x86_64.pkg.tar.zst’

cuda-11.2.2-1-x86_64.pkg.tar. 100%[=================================================>]   1.18G  8.22MB/s    in 2m 28s

2021-03-24 22:36:35 (8.17 MB/s) - ‘cuda-11.2.2-1-x86_64.pkg.tar.zst’ saved [1264829973/1264829973]
$ xh https://mirror.23media.com/archlinux/community/os/x86_64/cuda-11.2.2-1-x86_64.pkg.tar.zst -d
HTTP/2.0 200 OK
accept-ranges: bytes
content-length: 1264829973
content-type: application/octet-stream
date: Wed, 24 Mar 2021 22:37:08 GMT
etag: "6054d3da-4b63c615"
last-modified: Fri, 19 Mar 2021 16:39:54 GMT
server: nginx
strict-transport-security: max-age=31536000; includeSubDomains
x-content-type-options: nosniff
x-download-options: noopen
x-permitted-cross-domain-policies: none
x-robots-tag: none
x-xss-protection: 1; mode=block

Downloading 1.18GB to "cuda-11.2.2-1-x86_64.pkg.tar.zst"
  [00:02:27] [##################################################################################] 1.18GB 8.22MB/s ETA 0s

PS: this was tested on both a Windows machine and a MacBook.
