
Comments (7)

GuillaumeGomez commented on August 22, 2024

Merged doc tests are tied to the Rust 2024 edition because the change can break behavior.

No, that's not the reason. It's because it changes the output and adds a new code example attribute.

I believe (and some code analysis via grep.app would support this) that the majority of doc tests can be run in parallel in the same binary.

I have to admit that I don't see how any script would make it possible to detect such cases. If it's code from a dependency, there is no way to detect it (and let's not even mention FFI).

Cases that depend on some kind of singleton, logger initialization, etc. can be marked with an attribute as requiring a standalone process.

Again, I don't see how any script could detect that.
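
To make that concrete, here is a rough sketch (function name invented) of the kind of doctest the quoted suggestion is about: it mutates process-global state, so it can't safely share a process with other tests, and nothing in its surface text lets a script detect that, especially once the mutation hides behind a dependency call.

/// Demo function whose example touches process-global state.
///
/// # Example
///
/// ```
/// // Changing the working directory is process-global: every other doctest
/// // sharing this process would suddenly resolve relative paths differently.
/// std::env::set_current_dir(std::env::temp_dir()).unwrap();
/// assert!(std::env::current_dir().unwrap().is_absolute());
/// ```
pub fn chdir_example() {}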

I think the only reasonable way to do it is to add an option or a crate doc attribute to say "run all tests in the same thread" which would be disabled by default.

And finally: did you compare the before/after using this new feature? Are the numbers you provided from the current doctests or with the feature enabled?


Voultapher commented on August 22, 2024

Again, I don't see how any script could detect that.

No script, I just looked at a hundred or so examples of doc-tests with my eyes. My argument is that it seems to be the more common case, so we should optimize for that and make it the default.

I think the only reasonable way to do it is to add an option or a crate doc attribute to say "run all tests in the same thread" which would be disabled by default.

Maybe I'm missing something, but why can't this be a per doc-test attribute?

And finally: did you compare the before/after using this new feature? Are the numbers you provided from the current doctests or with the feature enabled?

I tried it out yesterday on master after the feature had been merged and saw no difference for my use-case, but maybe that's because the bootstrap wasn't updated yet?


GuillaumeGomez commented on August 22, 2024

No script, I just looked at a hundred or so examples of doc-tests with my eyes. My argument is that it seems to be the more common case, so we should optimize for that and make it the default.

That doesn't sound like a good metric. ^^'

Maybe I'm missing something, but why can't this be a per doc-test attribute?

You would prefer to mark all code examples manually instead?

I tried it out yesterday on master after the feature had been merged and saw no difference for my use-case, but maybe that's because the bootstrap wasn't updated yet?

rustc isn't compiled using the 2024 edition, so the feature is not enabled there yet.

But again: before trying to build even more options on top of this feature, maybe it'd be better to first check the performance impact if all doctests were run in the same thread.
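
For reference, the merged behavior is tied to the crate's edition, so opting in looks roughly like this in Cargo.toml (crate name is a placeholder; at the time of writing this likely also requires a nightly toolchain, since the 2024 edition isn't stable yet):

cargo-features = ["edition2024"]   # presumably still needed on nightly right now

[package]
name = "doctest-experiment"        # placeholder
version = "0.1.0"
edition = "2024"                   # merged doctests apply from this edition on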


Voultapher commented on August 22, 2024

That doesn't sound like a good metric. ^^'

It's not great, but better than nothing. We could enable it and do a crater run. I think it's worth a shot.

You would prefer to mark all code examples manually instead?

Not all, only those that need it. Rust is a lot about explicitness, and while for some projects every doctest will need it, I think that's the exception, and for 80+% of cases no attributes are needed at all.

But again: before trying to build even more options on top of this feature, maybe it'd be better to first check the performance impact if all doctests were run in the same thread.

To be clear, I'm suggesting multiple threads inside one process. I'm all for giving it a try; how much effort do you think such an experiment would be for you? I can look into trying it myself, but I haven't touched the doctest code at all until now. Starting a process usually costs 20+ million cycles of work before main even gets executed, which adds up.
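
(Not part of any measurement here, just a rough sketch of one way to sanity-check that per-process cost: spawn a trivial child process repeatedly and average the wall time.)

use std::process::Command;
use std::time::Instant;

fn main() {
    // Rough proxy for per-doctest process overhead: spawn a no-op child
    // process many times and average the wall time per spawn.
    let runs: u32 = 100;
    let start = Instant::now();
    for _ in 0..runs {
        // "true" is assumed to be on PATH (Unix); substitute any no-op binary.
        Command::new("true").status().expect("failed to spawn child");
    }
    println!("average spawn + exit time: {:?}", start.elapsed() / runs);
}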


Voultapher commented on August 22, 2024

I ran an approximation of a best-case, where all tests can be run in the same
process and the tests themselves are relatively lightweight. The same code with
minor variations is added as 1000 integration tests and as 1000 doctests. The
code in question is:

/// This is a demo function that contains a doc example.
///
/// # Example
///
/// ```
/// let mut vec_example_fn_0 = vec![1, 2, 85590];
/// vec_example_fn_0.push(3);
/// assert_eq!(vec_example_fn_0, [1, 2, 85590, 3]);
/// ```
pub fn example_fn_0() {}

#[test]
fn example_fn_0() {
    let mut vec_example_fn_0 = vec![1, 2, 16996];
    vec_example_fn_0.push(3);
    assert_eq!(vec_example_fn_0, [1, 2, 16996, 3]);
}
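
Purely for illustration, a rough sketch of a generator that stamps out N such near-identical functions and matching integration tests (not necessarily how the repro repo linked below does it):

use std::fmt::Write;

/// Generates `n` doc-tested functions and `n` matching integration tests,
/// each with a different magic number so the bodies aren't identical.
fn generate(n: usize) -> (String, String) {
    let mut lib = String::new();
    let mut tests = String::new();
    for i in 0..n {
        let magic = 1_000 + i; // stand-in for the varying constant
        writeln!(
            lib,
            "/// ```\n/// let mut v = vec![1, 2, {magic}];\n/// v.push(3);\n/// assert_eq!(v, [1, 2, {magic}, 3]);\n/// ```\npub fn example_fn_{i}() {{}}\n"
        )
        .unwrap();
        writeln!(
            tests,
            "#[test]\nfn example_fn_{i}() {{\n    let mut v = vec![1, 2, {magic}];\n    v.push(3);\n    assert_eq!(v, [1, 2, {magic}, 3]);\n}}\n"
        )
        .unwrap();
    }
    (lib, tests)
}

fn main() {
    let (lib, tests) = generate(1_000);
    // Paths assume this runs from the experiment crate's root.
    std::fs::write("src/lib.rs", lib).unwrap();
    std::fs::write("tests/it.rs", tests).unwrap();
}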

Results Linux

Setup

Linux 6.10
rustc 1.82.0-nightly (2c93fabd9 2024-08-15)
AMD Ryzen 9 5900X 12-Core Processor (Zen 3 micro-architecture)
CPU boost enabled.

edition = "2021"

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):     10.545 s ±  0.022 s    [User: 104.292 s, System: 95.885 s]
  Range (min … max):   10.522 s … 10.565 s    3 runs

Breakdown: Clean compile ~1.4s, integration tests ~0.02s, doctests ~16.8s [1]

edition = "2024"

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):      2.846 s ±  0.015 s    [User: 3.548 s, System: 0.743 s]
  Range (min … max):    2.829 s …  2.859 s    3 runs

Breakdown: Clean compile ~1.4s, integration tests ~0.02s, doctests ~0.1s [1]

edition = "2024" and doctest = false

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):      1.470 s ±  0.001 s    [User: 1.552 s, System: 0.219 s]
  Range (min … max):    1.469 s …  1.471 s    3 runs

Breakdown: Clean compile ~1.4s, integration tests ~0.02s [1]

Results Windows

Setup

Windows 10
rustc 1.82.0-nightly (2c93fabd9 2024-08-15)
AMD Ryzen 9 5900X 12-Core Processor (Zen 3 micro-architecture)
CPU boost enabled.

edition = "2021"

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):     55.053 s ±  1.733 s    [User: 103.338 s, System: 108.279 s]
  Range (min … max):   53.398 s … 56.855 s    3 runs

Breakdown: Clean compile ~2.4s, integration tests ~0.08s, doctests ~51.3s [1]

edition = "2024"

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):      7.107 s ±  0.015 s    [User: 7.170 s, System: 8.052 s]
  Range (min … max):    7.097 s …  7.124 s    3 runs

Breakdown: Clean compile ~2.4s, integration tests ~0.08s, doctests ~2.3s [1]

edition = "2024" and doctest = false

$ hyperfine --min-runs 3 --prepare 'cargo clean' 'cargo t'
  Time (mean ± σ):      2.575 s ±  0.066 s    [User: 2.920 s, System: 0.677 s]
  Range (min … max):    2.518 s …  2.648 s    3 runs

Breakdown: Clean compile ~2.4s, integration tests ~0.08s [1]

[1] Numbers as reported by cargo; in reality the felt latency can be quite
different, and very small numbers are inaccurate. E.g. the integration test
binary alone takes ~60ms when measured with perf stat, which accounts for
process startup; that is 3x what cargo reported. Doctests claim to run in
100ms, but that does not account for some other step, maybe compiling them,
beforehand. Without any code changes, perf stat -d cargo t --test it is ~90ms.
In contrast, running perf stat -d cargo t --doc the wall time rises to
~2750ms, for the exact same test coverage. I'm not sure how to further drill
into how much of that is process startup and how much is compiling the
doctests.

Key takeaways

  • #126245 is a huge improvement!
  • The numbers reported by cargo and the wall time users actually wait can
    show large discrepancies.
  • Given a sufficient quantity of doctests, iterative cargo t will become slow,
    no matter how little content is tested.
  • Windows is maybe not the best OS for software development.
  • Comparing cargo t --test it vs cargo t --doc, and assuming the doctest
    build doesn't get cached, and that compiling the doctests takes a similar
    amount of time as the integration test binary, we get a "run as individual
    processes" overhead of ~1s on Linux and ~3s on Windows for 1k doc-tests.

Repro

All code can be found here: https://github.com/Voultapher/doctest-parallel-experiment


Kobzol commented on August 22, 2024

That overhead seems to be quite low (surprisingly, even on Windows). That being said, it would be nice if we could default to running doctests in the same process, especially since that's already the default mode of operation for normal unit tests. Unfortunately, that ship has kind of sailed, because doctests have been run as separate processes since Rust 1.0.

In theory, we could switch the default in Edition 2024, but it might be too disruptive of a change; it would require people to manually mark problematic tests with an attribute to opt out of thread execution. I don't think that we would need a separate attribute though; we could reuse the standalone attribute for this, because it has similar semantics: when a doctest is compiled as a separate binary, it cannot be executed in the same process as the rest of the tests anyway, so standalone means separate binary + separate process.

Since some people actually have problems with running unit tests in threads, maybe a more general solution could be to add some option like --use-processes/threads, which would allow people to explicitly state how tests should be executed. The existing standalone attribute could then be used as an opt-out for thread execution. I even think that libtest already supports thread/process execution currently, but it's not exposed outside (but maybe I'm confusing this with compiletest).


Voultapher commented on August 22, 2024

Good point, standalone already implies running in a separate process.
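
E.g., a rough sketch of what that per-example opt-out could look like (the attribute spelling and function name here are illustrative, nothing settled in this thread):

/// ```standalone
/// // Installs a process-wide panic hook, so this example should keep being
/// // compiled and run as its own process even when the other doctests are merged.
/// std::panic::set_hook(Box::new(|info| eprintln!("custom hook: {info}")));
/// ```
pub fn install_custom_panic_hook() {}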

Regarding the breaking change, isn't this feature already a breaking change? If so, I think it's reasonably easy to explain that from edition 2024 onwards doctests will be compiled and run the same way as unit and integration tests. My understanding is that the edition mechanism exists exactly to avoid the C++ situation of "yeah we'd like to change that but we can't".

Regarding a --use-processes option, cargo nextest fills that niche quite nicely; maybe we could do a better job of surfacing that information to users? In my experience, code that requires process-isolated testing usually shows little concern for re-entrancy and parallel usage, often out of ignorance rather than intent. I'd like to avoid promoting such use cases by default for unit and integration tests. Reading the linked issue, the desire comes from having to interface with software that was written without such considerations. Rust is thread-safe by default, and that by necessity requires thoughtful usage of shared global state.

