book's People

Contributors

brson, dbdr, dhardy, enet4, golddranks, jbrudant, jruderman, limira, mt-caret, ndebuhr, noslaver, rlnt, thomwiggers, timdegroote, vks, vlad-shcherbina

book's Issues

Update RNG performance stats

The rngs repository benchmarks have been updated to use Criterion and report cycles-per-byte (rust-random/rngs#34); we should probably use that, either by duplicating the framework in the rand repository or by adding dev-dependencies in rngs on all the RNGs we want to benchmark.

We should wait until the next rand release, at least to get all RNGs on the same version of rand_core.

Guide page for generating seeded random floats

Hi all,

I'd love to see a recipe page on the best way to generate seeded random floats for procedural generation applications.

This request comes from a recent unsuccessful experience I had trying to upgrade from 0.3 to 0.6 of rand. My 0.3 code uses a {SeedableRng, from_seed(), next_f32()} recipe I found on the web somewhere that's not suitable for 0.6, plus a --seed command line option to set the seed.

  • Do I need to use from_seed_u64() now?
  • Do I need to implement a trait on a typedef like in the tests?
  • Do I need to add rand_core as well as the usual rand to Cargo.toml?
  • Do I need to decide on a PRNG? Can you suggest a suitable one for procedural generation work?

Thanks!
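
To illustrate the kind of recipe requested above, here is a minimal std-only sketch of seeded float generation. It deliberately does not use the rand crate: `SplitMix64`, `next_f32` and `heights` are hypothetical names written for this example, and SplitMix64 is chosen only because it is short, not as a recommendation. With rand 0.8, the corresponding steps should be `SeedableRng::seed_from_u64` on a chosen PRNG followed by `Rng::gen::<f32>()`.

```rust
/// Minimal SplitMix64 generator, standing in for a seedable PRNG from a crate.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    /// Creates a generator whose whole output sequence is fixed by `seed`.
    pub fn seed_from_u64(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advances the state and returns the next 64 random bits.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f32 in [0, 1): keep the top 24 bits and scale by 2^-24.
    pub fn next_f32(&mut self) -> f32 {
        (self.next_u64() >> 40) as f32 * (1.0 / (1u32 << 24) as f32)
    }
}

/// Generates `len` values for a hypothetical procedural height map.
/// The same `seed` always yields the same values.
pub fn heights(seed: u64, len: usize) -> Vec<f32> {
    let mut rng = SplitMix64::seed_from_u64(seed);
    (0..len).map(|_| rng.next_f32()).collect()
}
```

A `--seed` command line option would simply be parsed into the `u64` passed to `heights`; rerunning with the same seed reproduces the same world.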

Broken links due to rand_distr split

Overview

Severity: Low

I do not have the time and am not familiar with this codebase, but there are several broken links in the "guide-dist" page. This is because the distributions were split out from this crate at some point in the past into the rand_distr crate.

Verification of Issue

Go to the documentation generated from this page and find this text:

The Normal distribution (also known as Gaussian) simulates sampling from the Normal distribution ("Bell curve") with the given mean and standard deviation.

The word "Normal" is pointing to the link

https://rust-random.github.com/rust-random/rand/rand/distributions/struct.Normal.html

but it should point to

https://rust-random.github.io/rand/rand_distr/struct.Normal.html

Explanation

This is because it is generated from the text

The [`Normal`] distribution (also known as Gaussian) simulates sampling from
the Normal distribution ("Bell curve") with the given mean and standard
deviation.

which uses a wikilink-style link.

Proposed Mitigation

This would need to be updated to use a full Markdown link (in the style [text](URI)) to be fixed.

I have checked, and there are more instances of this on the page, probably many, so I don't want to try to fix only 1 or 2. If anyone more familiar with the project sees this, they might have a better idea which links need fixing.

Incorrect formula

The formula reported in the book for the probability of overlap:

1 - e^(-u * n^2 / (2 * p))

is slightly off; the actual (approximated) formula is

1 - e^(-u * n(n-1) / (p-1))

The off-by-ones are not so relevant, but the spurious factor 2 halves the actual probability.

I also think it would be nice to tell the user that if un^2 is much smaller than p, the formula is very well approximated by

un^2 / p

(see http://prng.di.unimi.it/#remarks) as it is much easier to understand.
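
The three expressions in this issue can be compared numerically with a small sketch like the one below. The function names are hypothetical, and the symbols u, n and p are used exactly as written in the issue; reading them as n sequences of length u drawn from a generator of period p is an assumption, since the issue does not define them.

```rust
/// Formula proposed in the issue: 1 - e^(-u * n(n-1) / (p-1)).
pub fn overlap_proposed(u: f64, n: f64, p: f64) -> f64 {
    let x = u * n * (n - 1.0) / (p - 1.0);
    // 1 - e^(-x) computed as -expm1(-x), which stays accurate for tiny x.
    -(-x).exp_m1()
}

/// Formula currently in the book, per the issue: 1 - e^(-u * n^2 / (2 * p)).
pub fn overlap_book(u: f64, n: f64, p: f64) -> f64 {
    let x = u * n * n / (2.0 * p);
    -(-x).exp_m1()
}

/// Linear approximation u * n^2 / p, meant for u * n^2 much smaller than p.
pub fn overlap_linear(u: f64, n: f64, p: f64) -> f64 {
    u * n * n / p
}
```

For example, with u = 2^64, n = 2^10 and p = 2^128, the proposed formula and the linear approximation agree to within about 0.1%, while the book's formula gives roughly half of either.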

Conversion to 64-bit float

In the book, I read:

"f64: we treat this as an approximation of the real numbers, and, by convention, restrict to the range 0 to 1 (if not otherwise specified). Note that this type has finite precision, so we use the coin-flipping method above (but with random bits instead of coins) until we get as much precision as the type can represent; however, since floating-point numbers are much more precise close to 0 than they are near 1, we typically simplify here and stop once we have enough precision to differentiate between 1 and the next smallest value representable (1 - ε/2)."

This looks like you're generating float values using the high-precision methods described here: http://prng.di.unimi.it/random_real.c

But when I look at the code (sorry, I don't speak Rust, so this might be wrong) it looks like you're using the multiplication-free method that gives 52 bits of precision instead of 53. Is that so? If it is, maybe this should be specified in the book.
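
For readers unfamiliar with the two conversions being contrasted here, the following std-only sketch shows both. The function names are hypothetical and the code is not taken from rand; it only illustrates the 53-bit multiplication method and the 52-bit multiplication-free method in general terms.

```rust
/// 53-bit method: keep the top 53 bits and scale by 2^-53.
/// Every multiple of 2^-53 in [0, 1) can be produced.
pub fn u64_to_f64_53(x: u64) -> f64 {
    (x >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
}

/// 52-bit multiplication-free method: write the top 52 bits into the
/// mantissa of a float in [1, 2), then subtract 1.
/// Only multiples of 2^-52 in [0, 1) can be produced.
pub fn u64_to_f64_52(x: u64) -> f64 {
    const ONE_BITS: u64 = 0x3FF << 52; // bit pattern of 1.0_f64
    f64::from_bits(ONE_BITS | (x >> 12)) - 1.0
}
```

The largest value from the 53-bit method is 1 - ε/2, matching the text quoted from the book, whereas the 52-bit method tops out at 1 - ε (with ε = `f64::EPSILON`).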

Document how to make a type randomizable

I don't see any documentation for how to make a type creatable with random() and rng.gen(). I think the book could use a page for type implementers with an example of impl Distribution<T> for Standard.
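
The shape of such an implementation can be sketched as follows. To keep the example compilable without external crates, `RngCore`, `Distribution`, `Standard` and `generate` below are local stand-ins that only mimic the corresponding rand items (in rand 0.8: `rand::RngCore`, `rand::distributions::Distribution`, `rand::distributions::Standard` and `Rng::gen`); `Rgb` and `SplitMix64` are hypothetical. With the real crate, only the `impl Distribution<Rgb> for Standard` block would be needed.

```rust
/// Stand-in for rand's `RngCore`: a source of random bits.
pub trait RngCore {
    fn next_u64(&mut self) -> u64;
}

/// Stand-in for rand's `Distribution<T>` trait.
pub trait Distribution<T> {
    fn sample<R: RngCore + ?Sized>(&self, rng: &mut R) -> T;
}

/// Stand-in for rand's `Standard` distribution.
pub struct Standard;

impl Distribution<u8> for Standard {
    fn sample<R: RngCore + ?Sized>(&self, rng: &mut R) -> u8 {
        (rng.next_u64() >> 56) as u8 // use the high byte
    }
}

/// A user-defined type we want to generate randomly.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

/// The part a type implementer writes: build the value field by field
/// from distributions that already exist.
impl Distribution<Rgb> for Standard {
    fn sample<R: RngCore + ?Sized>(&self, rng: &mut R) -> Rgb {
        let r: u8 = self.sample(&mut *rng);
        let g: u8 = self.sample(&mut *rng);
        let b: u8 = self.sample(&mut *rng);
        Rgb { r, g, b }
    }
}

/// Stand-in for `rng.gen()`: works for any T that `Standard` can sample.
pub fn generate<T, R: RngCore + ?Sized>(rng: &mut R) -> T
where
    Standard: Distribution<T>,
{
    Standard.sample(rng)
}

/// Small deterministic generator so the sketch can be exercised.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }
}

impl RngCore for SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}
```

Once the impl exists, `let c: Rgb = generate(&mut rng);` works the same way as for built-in types, which is the behaviour the issue asks to have documented.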

Document how to run RNGs in parallel

There was some discussion in rust-random/rand#997:

Perhaps you mean that one can safely seed multiple independent streams. For this purpose we specifically chose RNGs supporting ~128-bit seeds or larger. There are some potential issues with PCG streams being too similar, so not all of those RNGs are suitable (but the ones with 64-bit output already use 128-bit internal state).

You also need to ensure the seeding mechanism does not create similarities in state — some people recommend using a different type of RNG for seeding each thread's generator; I'd recommend using a (near) crypto-grade generator such as ChaCha since any such similarities would violate the requirements on a crypto generator. Seeding one RNG from another is easy; see the book.

Use case is you want to split a Monte-Carlo simulation over N processors.
This can be done with skip-ahead, e.g. with ChaCha, but typically we recommend using a cryptographic master generator to seed each parallel generator. Our ChaCha implementation (and most, I believe) uses a 64-bit counter, which likely isn't enough for parallel usage; however, it uses a 256-bit seed. The parallel generator (ChaCha or other) still needs to support at least 128 bits of state and independent streams (so PCG is not recommended), but a 64-bit period is acceptable.

I think it would be good to document how to use StdRng (with random seeds) and maybe rand_xoshiro (with skipping) in parallel.
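
The master-generator approach discussed above can be sketched with the standard library alone. Everything named here is an assumption made for illustration: `SplitMix64` is only a placeholder for the master generator and is not crypto-grade (the discussion recommends something like ChaCha for that role), the workers use a hand-written xoshiro256++, and `parallel_pi` is a hypothetical Monte-Carlo workload.

```rust
use std::thread;

/// Placeholder master generator. NOT crypto-grade; a real setup would
/// use a (near) crypto-grade generator here, as the discussion suggests.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Per-worker generator with 256 bits of state (xoshiro256++).
pub struct Xoshiro256PlusPlus {
    s: [u64; 4],
}

impl Xoshiro256PlusPlus {
    /// Fills the whole state from the master generator.
    pub fn from_master(master: &mut SplitMix64) -> Self {
        let s = [
            master.next_u64(),
            master.next_u64(),
            master.next_u64(),
            master.next_u64(),
        ];
        Self { s }
    }

    pub fn next_u64(&mut self) -> u64 {
        let result = self.s[0]
            .wrapping_add(self.s[3])
            .rotate_left(23)
            .wrapping_add(self.s[0]);
        let t = self.s[1] << 17;
        self.s[2] ^= self.s[0];
        self.s[3] ^= self.s[1];
        self.s[1] ^= self.s[2];
        self.s[0] ^= self.s[3];
        self.s[2] ^= t;
        self.s[3] = self.s[3].rotate_left(45);
        result
    }

    /// Uniform f64 in [0, 1) from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
    }
}

/// Estimates pi using `workers` threads with `samples` points each.
pub fn parallel_pi(master_seed: u64, workers: usize, samples: u64) -> f64 {
    let mut master = SplitMix64 { state: master_seed };
    // Seed all worker generators on one thread, before spawning, so the
    // result does not depend on thread scheduling.
    let rngs: Vec<Xoshiro256PlusPlus> = (0..workers)
        .map(|_| Xoshiro256PlusPlus::from_master(&mut master))
        .collect();
    let handles: Vec<_> = rngs
        .into_iter()
        .map(|mut rng| {
            thread::spawn(move || {
                let mut hits = 0u64;
                for _ in 0..samples {
                    let (x, y) = (rng.next_f64(), rng.next_f64());
                    if x * x + y * y < 1.0 {
                        hits += 1;
                    }
                }
                hits
            })
        })
        .collect();
    let hits: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    4.0 * hits as f64 / (workers as u64 * samples) as f64
}
```

Because each worker owns its generator and all seeds come from one master sequence, the estimate is reproducible for a given master seed and worker count.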

Confusing lack of full name for StdRng on guide-rngs.html

The table in https://rust-random.github.io/book/guide-rngs.html#cryptographically-secure-pseudo-random-number-generators-csprngs lists StdRng with known properties for "performance", "initialization", "memory", "security (predictability)", "forward secrecy", but notably does NOT give its full name: ChaCha12. This is confusing.

Like, what are we trying to do here? If we don't want to tie the choice of RNG down to ChaCha12, we would have no definite answers for "performance", "initialization", "memory", etc. either. If we have definite answers for these columns, it naturally follows that we definitely know what generator it is using.

While we're at it, it's a little fishy that ChaCha12 is running slower than ChaCha20. Is this a fluke in the benchmark, outdated table, or some real performance issue?

"Crates and features" implies that `rand` subsumes `rand_chacha`

Based on the dependency graph on https://rust-random.github.io/book/crates.html, it looks like rand provides access to rand_chacha. However, according to https://docs.rs/rand/0.8.3/rand/rngs/struct.StdRng.html, rand_chacha is actually a private dependency of rand, and if you want access to the ChaCha family of PRNGs you should depend on rand_chacha yourself, as with rand_distr. (I assume the same holds for rand_pcg/rand_hc, and I think rand_pcg isn't even a dependency of rand anymore.)

"This is not a property of true randomness."

This is a statement appearing in the book about equidistribution.

This is a very slippery statement. Any property of the full period is not a property of true randomness.

Say, you have a w-bit generator with w-bit output. To be "truly random", every output must be possible. OK, so your generator generates every output.

But now it is necessarily equidistributed: it generates each value exactly once along its period. If it were truly random, after O(sqrt(2^w)) outputs you should find collisions (i.e., duplicate outputs), but you won't because collisions can happen only after 2^w outputs.

You can do the same with any generator with state size kw—just look for blocks of kw bits. They must all appear (it's random, right) but then you won't find collisions at the right time.

In essence, whenever you make a statement about the full period, you can turn it into whatever you want, depending on the viewpoint.
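
The collision argument above can be checked at small scale with w = 16. In this sketch (all names hypothetical), a full-period 16-bit LCG that outputs its entire state never repeats within 2^16 outputs, while 16-bit values cut from a generator with a larger state repeat early, as the birthday bound predicts. The LCG constants are chosen to satisfy the Hull-Dobell full-period conditions; SplitMix64 is just a convenient larger-state generator.

```rust
use std::collections::HashSet;

/// Returns the index of the first output that repeats an earlier one,
/// or None if the first `limit` outputs are all distinct.
pub fn first_repeat(mut next: impl FnMut() -> u16, limit: usize) -> Option<usize> {
    let mut seen = HashSet::new();
    for i in 0..limit {
        if !seen.insert(next()) {
            return Some(i);
        }
    }
    None
}

/// 16-bit state, 16-bit output, full period 2^16: every value appears
/// exactly once per period, so no repeat is possible within 2^16 draws.
pub fn repeat_in_full_period_lcg() -> Option<usize> {
    let mut x: u16 = 0;
    first_repeat(
        || {
            x = x.wrapping_mul(25173).wrapping_add(13849);
            x
        },
        1 << 16,
    )
}

/// Top 16 bits of SplitMix64 (64-bit state): repeats are expected after
/// roughly sqrt(2^16) = 256 draws, as for a truly random source.
pub fn repeat_in_truncated_generator() -> Option<usize> {
    let mut state: u64 = 0;
    first_repeat(
        || {
            state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            ((z ^ (z >> 31)) >> 48) as u16
        },
        1 << 16,
    )
}
```

The first function returns `None`, which is exactly the "too regular" behaviour the issue describes for any generator viewed over its full period.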

"Not a crypto library" warning is confusing

I've seen a few projects use rand in security sensitive code.
A reviewer may eventually point them to this warning in the book:
https://github.com/rust-random/book/blame/master/src/guide-rngs.md#L263-L271
The reviewer then infers that rand does not provide cryptographically secure PRNGs and that they should use a different random library.

That warning was added ~5 years ago.
However, in what looks like the same commit, there's a section on cryptographically secure pseudo-random number generators (CSPRNGs).
https://github.com/rust-random/book/blame/master/src/guide-rngs.md#L62

Is this warning out of date?

If it is not out of date, should it be interpreted to mean:
While this library has CSPRNGs that are in fact cryptographically secure, this is not a general purpose cryptographic library providing other algorithms like encryption and authentication? (And if you want encryption and authentication you should go to the referenced libraries instead of building your own using rand.)

If it is neither out of date nor a warning against rolling your own crypto, then it seems like the rand library documentation should be updated to remove CSPRNG references. https://docs.rs/rand/0.8.5/src/rand/rngs/mod.rs.html#53-62

Example issue:
confidential-containers/confidential-containers#44 (comment)

Claim about obsolescence of non-crypto PRNGs

In the book, there's the statement "since we now have fast cryptographic generators, some people argue that the non-cryptographic ones are now obsolete".

I think this should at least be qualified carefully, for example by restricting the statement to programming in the large on sufficiently powerful hardware.

If you're programming a small embedded system with multiple threads, you certainly cannot spend hundreds of bits on a CSPRNG, nor the electricity for generating the data: we tend to overlook these problems, but not every computation needing randomness happens in a large box with an unlimited supply of electrical power. The same holds for embedded microprocessors. And if you're running in a language like Go, in which you can have hundreds of thousands of lightweight threads, you don't want to pollute your cache with millions of bits from PRNGs.

I agree that in a large number of situations today you can go for crypto: as I state in my page on PRNGs, with dedicated hardware and a bit of vectorization you can stream AES in about 1.2ns/64bit, which is more than fast enough for every application. But that assumes dedicated hardware and unlimited power.

If you consider randen, Google's crypto super-fast generator, with full hardware support it is still 10 times slower, say, than a vectorized xoshiro256++. I think the gap is too large to claim that standard PRNGs are of no use.
