Giter VIP home page Giter VIP logo

Comments (7)

stephentyrone avatar stephentyrone commented on May 17, 2024 4

Reasonable. I'm on vacation until next week, and probably won't get a chance to write a more detailed response until I return, but a couple quick things to consider:

  • I plan to provide a random module in SN, including both sampling from the usual distributions and some additional generators beyond what the stdlib has (especially one or more fast seeded generators; I have a branch somewhere with a quick draft of a PCG implementation, for example). Obviously this would be an easy first thing for such a module.

  • There are some interesting questions around tradeoffs in quality vs speed for these transforms, and also how to design an API that allows improving the transform in ways that may produce better results (i.e. users should be able to explicitly opt-into stable-forever behavior, if they need to get the exact same results for all time, even if we have a better algorithm or fix a bug in the future). There's no really great precedent for what this should look like yet, AFAIK (though some fairly obvious options).

  • There are a few workloads that fundamentally depend on correctly-rounded samples of distributions--e.g. if the noise added in differential privacy is not a correctly-rounded sample of the appropriate distribution, the entire scheme collapses. So we may want to think about what an API that supports that sort of guarantee would look like (you wouldn't want it to be the default behavior, because it's prohibitively slow for many other uses).

from swift-numerics.

stephentyrone avatar stephentyrone commented on May 17, 2024 1

Oh, well, that doesn't require anything of sincos; you still get to benefit from doing half as many sqrt and log operations (sqrt is fast on some recent hardware, but log is about as costly as sin or cos is), so that seems like a no-brainer.

from swift-numerics.

karwa avatar karwa commented on May 17, 2024

Yes please :)

Here's TensorFlow's version, for reference.

from swift-numerics.

NevinBR avatar NevinBR commented on May 17, 2024

I wonder—is sincos optimized enough to make it worthwhile to generate pairs of values from a normal distribution? And how about, um, sincos(piTimes:)?

extension Real {
  static func boxMullerTransform(_ x: Self, _ y: Self) -> (Self, Self) {
    let r = sqrt(-2 * log(x))
    return (r * sin(piTimes: 2*y), r * cos(piTimes: 2*y))
  }
}

If it is well-optimized, then for use-cases that require multiple values a sequence can be built two at a time. And when only one number is needed, a convenience method can provide it.

extension Real where Self: BinaryFloatingPoint, RawSignificand: FixedWidthInteger {
  static func randomNormalPair() -> (Self, Self) {
    return boxMullerTransform(random(in: 0...1), random(in: 0...1))
  }
  
  static func randomNormal(mean: Self = 0, standardDeviation sigma: Self = 1) -> Self {
    return randomNormalPair().0 * sigma + mean
  }
}

extension Complex where RealType: BinaryFloatingPoint, RealType.RawSignificand: FixedWidthInteger {
  static func randomNormal(mean: Self = 0, standardDeviation sigma: RealType = 1) -> Self {
    let (a, b) = RealType.randomNormalPair()
    return Self(a * sigma, b * sigma) + mean
  }
}

from swift-numerics.

stephentyrone avatar stephentyrone commented on May 17, 2024

I wonder—is sincos optimized enough to make it worthwhile to generate pairs of values from a normal distribution? And how about, um, sincos(piTimes:).

No need; when the host system libm provides __sincospi (Darwin), LLVM will automatically fuse separate calls to __sinpi and __cospi (Evan Cheng added this optimization way back when I put those functions in the Darwin libm). When we're not using a host system libm function, the optimizer can see both functions and do outlining+CSE to eliminate the redundant operations.

from swift-numerics.

NevinBR avatar NevinBR commented on May 17, 2024

My question was not whether we need to write sincos.

My question was whether we should design the API to include a function which generates pairs of normally-distributed values.

from swift-numerics.

peteroupc avatar peteroupc commented on May 17, 2024

There are a few workloads that fundamentally depend on correctly-rounded samples of distributions--e.g. if the noise added in differential privacy is not a correctly-rounded sample of the appropriate distribution, the entire scheme collapses. So we may want to think about what an API that supports that sort of guarantee would look like (you wouldn't want it to be the default behavior, because it's prohibitively slow for many other uses).

Note that the paper you cite introduces an attack that applies only to floating-point algorithms (more-specifically, floating-point implementations of the Laplacian distribution); as the paper says: "Fixed-point or integer-valued algorithms are immune to our attack". In fact, as I write in "Randomization with Real Numbers", "random non-integer numbers are rarely if ever seen in serious information security applications", and I wasn't much aware of the use of floating-point numbers in differential privacy until today.

I can't think of a security application where random integers (or at most random fixed-point numbers) could not have been used instead of random floating-point numbers.

Also, as for generating normally-distributed random numbers, the Box-Muller transform is only one possibility, and is not the only one that generates pairs of numbers at a time (another is the polar method). Other possibilities include ratio of uniforms, CDF inversion, Karney's algorithm, and ziggurat, which don't necessarily generate pairs of random numbers.

from swift-numerics.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.