
governor's Issues

[SECURITY] New release needed to fix audit warnings

In the current version of governor on crates.io (0.5.1), the quanta dependency version (0.9.3) depends on mach, which is abandoned.

quanta's version was bumped in 3c2dc99 but no corresponding release has been made.

A new release should be made, as this abandoned dependency causes cargo audit warnings in any crate that uses governor.

KeyedRateLimiter with different quota per key

Thanks for this fantastic library. I took a look at the implementation and it's extremely extensible.

I'm trying to rate limit API requests, but each URL has a different quota. As far as I understand, this cannot currently be done with governor. However, it doesn't appear to be impossible either.

At its core, gcra takes the rate limiter configuration and returns a positive or negative outcome, but the function itself doesn't modify &self, so gcra only provides read-only, quota-related data. Would it be possible to store the gcra quota values in a keyed store? That way the quota for a particular key could be retrieved and then tested.

The mapping of keys to quotas can be immutable and defined when the rate limiter is initialized. A default quota can also be assigned to any keys that are not part of the mapping. This way the QuotaStore is read-only and does not require complex blocking/synchronization.

    /// Tests a single cell against the rate limiter state and updates it at the given key.
    pub(crate) fn test_and_update<
        K,
        P: clock::Reference,
        S: StateStore<Key = K>,
        MW: RateLimitingMiddleware<P>,
    >(
        &self,
        start: P,
        key: &K,
        state: &S,
        t0: P,
    ) -> Result<MW::PositiveOutcome, MW::NegativeOutcome> {

Is it possible to have something like this? Are you planning on supporting such a rate limiter?

I believe there is a somewhat related issue on multiple quotas, #156. I think this case is slightly different and might need fewer changes. I'm happy to contribute.
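
For illustration, a minimal sketch of the read-only quota store idea described above. QuotaStore and quota_for are hypothetical names, not governor API:

use std::collections::HashMap;
use std::num::NonZeroU32;

use governor::Quota;

// Hypothetical read-only quota store: built once at construction time,
// with a fallback quota for keys that have no explicit entry.
struct QuotaStore<K> {
    per_key: HashMap<K, Quota>,
    default: Quota,
}

impl<K: std::hash::Hash + Eq> QuotaStore<K> {
    fn quota_for(&self, key: &K) -> &Quota {
        // Read-only lookup, so no locking is needed after construction.
        self.per_key.get(key).unwrap_or(&self.default)
    }
}

fn main() {
    let store = QuotaStore {
        per_key: HashMap::from([(
            "/expensive",
            Quota::per_second(NonZeroU32::new(1).unwrap()),
        )]),
        default: Quota::per_second(NonZeroU32::new(50).unwrap()),
    };
    // The rate limiter would fetch the per-key quota first, then run the
    // usual GCRA test against that key's state.
    let _q = store.quota_for(&"/expensive");
}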

[QUESTION] Chaining multiple rate-limiters

Hey @antifuchs! Appreciate your effort on this library. One use-case that I've run into is that I'd like to combine multiple rate-limiters into an and-chain.

Example:
Service foo has an overall quota of 100 calls/s, but certain costly endpoints should have a lower quota of, say, 10 calls/s:

foo/ ( 100 c/s overall)
foo/costly_op (only 10 c/s on this one)
foo/cheap_op1
...
foo/cheap_opn

Right now this doesn't seem possible, since a cell is always consumed when checked. The missing piece is a way to check for a cell without consuming it and then update the state on demand (which would require some sort of locking to be useful), or alternatively a way of reversing failed attempts to consume from all limiters. This is of course the tricky part, and potential solutions might not behave well for your existing use-cases. (It seems like I need a lock instead of an atomic at the heart of the GCRA state.)
What would be your take on this? Is this outside of the scope of this project?

Combining rate limiters

Hey!

We have a use case in vectordotdev/vector#14280 where it would be useful to have multiple rate limiters apply to the same input stream, where if any one of them fails to check we consider it limited. The trick is: if one rate limiter fails to take tokens, we don't want to have consumed tokens from the other rate limiters that have already been checked (or we want to re-add tokens to the checked rate limiters).

Is there a way to do this that I'm missing? Or would this require some changes to this crate?

Characterize Memory Usage for Keyed Variants

Hello!

Thank you for implementing this!

A common rate limiting scenario in the HTTP world involves keyed limiting based on IP addresses. The keyed variants in this crate look like a great tool to implement this, with the IP address as the key. However, I'm worried that this route would make it somewhat trivial to DoS a server by exhausting its memory.

The specific scenario I'm worried about is an attacker controlling many IP addresses and issuing requests that go through the rate limiting code. On that code path, the key is looked up and stored by the rate limiter. If keys are stored and never cleared, then each IP address increases memory usage by at least 4 bytes, or for a savvy attacker, at least 16. After only 2B requests, the attacker will have forced usage of at least 32GiB of memory (or just 62M requests to consume 1GiB). A /64 IPv6 range contains way more than 2B IPs, so this attack is trivially mountable.

What recourse exists in this crate to prevent this from happening? Does calling retain_recent() and shrink_to_fit() periodically alleviate the concern? Or is there something else that can be done externally?

Thanks!
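
For what it's worth, a minimal sketch of the periodic-cleanup approach using the two methods mentioned above. This assumes a tokio runtime and a recent governor with the DefaultKeyedRateLimiter alias; the 60-second interval is arbitrary:

use std::num::NonZeroU32;
use std::sync::Arc;
use std::time::Duration;

use governor::{DefaultKeyedRateLimiter, Quota, RateLimiter};

#[tokio::main]
async fn main() {
    // One GCRA state per client IP (String keys for brevity).
    let limiter: Arc<DefaultKeyedRateLimiter<String>> = Arc::new(
        RateLimiter::keyed(Quota::per_second(NonZeroU32::new(10).unwrap())),
    );

    let gc = Arc::clone(&limiter);
    tokio::spawn(async move {
        loop {
            tokio::time::sleep(Duration::from_secs(60)).await;
            // Drop per-key state that has fully replenished...
            gc.retain_recent();
            // ...and return the freed map capacity to the allocator.
            gc.shrink_to_fit();
        }
    });

    // Per request: check the client's key as usual.
    let _ = limiter.check_key(&"198.51.100.7".to_string());
}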

Unexpected Behaviour for Batched Cells

Hello,

I'm looking for a crate to implement HTTP rate limiting (similar to #39 (comment) I believe), in particular rules of the form "Cannot send more than X requests per Y units of time."

From my dabbling it looks like this crate isn't the one for me (it seems more focused on cells replenishing at a constant rate than on limiting within a fixed time window; I'd love to hear if you know of a way to implement that behaviour using this crate).

Anyway, as part of my testing I came across some unexpected behaviour, so I made a test to demonstrate it below, and thought I'd share 😄

Would be very happy to hear your thoughts, but regardless keep up the great work!

use std::num::NonZeroU32;
use std::time::Duration;

use governor::clock::Clock;
use governor::NegativeMultiDecision;

/// When you send cells in bursts (as is the case for weighted API calls), you seem able
/// to fire more cells than you might expect.
///
/// In this test, cells replenish at a rate of 1200 / minute, but we're mostly sending them in bursts of 5.
/// `one_unit` is the time taken for 5 cells to replenish.
///
/// We use up capacity at the start, then elapse 2 units of time, and we're somehow able to fire 1.2 units,
/// instead of the expected 1.0.
#[test]
fn demonstrate_extra_capacity() {
    let one_minute: u64 = Duration::from_secs(60).as_nanos().try_into().unwrap();
    let one_unit = one_minute / 1200 * 5;

    let clock = governor::clock::FakeRelativeClock::default();
    let quota = governor::Quota::per_minute(NonZeroU32::new(1200).unwrap())
        .allow_burst(NonZeroU32::new(5).unwrap());
    let rate_limiter = governor::RateLimiter::direct_with_clock(quota, &clock);

    // We can fire five out straight away.
    assert!(rate_limiter.check_n(NonZeroU32::new(5).unwrap()).is_ok());

    // Check that we can't fire another 5 again before a unit has passed
    clock.advance(Duration::from_nanos(one_unit - 1));
    assert!(rate_limiter.check_n(NonZeroU32::new(5).unwrap()).is_err());

    // Advance one more nanosecond - now we have replenished one unit so can fire
    clock.advance(Duration::from_nanos(1));
    assert!(rate_limiter.check_n(NonZeroU32::new(5).unwrap()).is_ok());

    // Now that we've fired, back to not having any capacity - we can't even fire 1 cell.
    // We would need to wait for a fifth of a unit to pass
    let res = rate_limiter.check_n(NonZeroU32::new(1).unwrap());
    match res.err().unwrap() {
        NegativeMultiDecision::BatchNonConforming(num, until) => {
            assert_eq!(num, 1);
            assert_eq!(
                until.earliest_possible().as_u64() - clock.now().as_u64(),
                one_unit / 5
            );
        }
        _ => panic!(),
    }

    // Now the interesting bit: we advance two units of time.
    clock.advance(Duration::from_nanos(2 * one_unit));

    // We can fire one unit (our maximum burst) just fine
    assert!(rate_limiter.check_n(NonZeroU32::new(5).unwrap()).is_ok());

    // BUT we now have some magical extra capacity to fire one cell???
    assert!(rate_limiter.check_n(NonZeroU32::new(1).unwrap()).is_ok());

    // Not another one though.
    assert!(rate_limiter.check_n(NonZeroU32::new(1).unwrap()).is_err());
}

Burst documentation

Thanks for this amazing crate!

Struct governor::Quota:

"(..) However, the quota of Quota::per_minute(60) has a burst size of 60 cells, meaning it is possible to accommodate 60 cells in one go, followed by a minute of waiting."

Based on the wording, a reader may misunderstand this to mean that after the initial burst, the limiter will enforce a one-minute wait during which no cells are allowed through. Would you be willing to consider rephrasing it as: "... meaning it is possible to accommodate 60 cells in one go, after which the equivalent of a minute of inactivity is required for the burst allowance to be fully restored?"

The sentence that appears later makes the intention clear: "In other words, the burst size is the maximum number of cells that the rate limiter will ever allow through without replenishing them." In my opinion, it's just hard to parse if you've already misread the sentence above. The misconception may be more likely for users like me who approach from the angle of average/max events per period rather than cell replenishment.
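
To illustrate the intended semantics, a quick test sketch (not taken from the docs) in the style of other tests in this thread, using FakeRelativeClock:

use std::num::NonZeroU32;
use std::time::Duration;

use governor::clock::FakeRelativeClock;
use governor::{Quota, RateLimiter};

// With Quota::per_minute(60), the full burst of 60 is available
// immediately, and afterwards capacity trickles back at one cell per
// second. There is no enforced minute-long dead period.
#[test]
fn burst_then_gradual_replenishment() {
    let clock = FakeRelativeClock::default();
    let quota = Quota::per_minute(NonZeroU32::new(60).unwrap());
    let lim = RateLimiter::direct_with_clock(quota, &clock);

    for _ in 0..60 {
        assert!(lim.check().is_ok()); // the initial burst
    }
    assert!(lim.check().is_err()); // burst exhausted

    clock.advance(Duration::from_secs(1));
    assert!(lim.check().is_ok()); // one cell has replenished after one second
    assert!(lim.check().is_err()); // but only one
}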

Question: Is governor suited to limiting IO throughput?

I would like to limit an AsyncWrite, for instance, to write at 10 MB/s.
The documentation only mentions discrete things: "API requests, emails, phone calls".

The API does allow such usage, but is governor suited to this use case?
For instance, suppose I have two writers sharing the same rate limiter, one writing 1kb at a time and the other 9kb at a time. Will they end up taking a 50/50 split of the overall throughput limit, or will it be closer to 90/10?
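
For reference, a minimal sketch of the byte-weighted approach the API suggests, treating one cell as one byte (assuming the async support, and the DefaultDirectRateLimiter alias and until_n_ready method that appear elsewhere in this document):

use std::num::NonZeroU32;

use governor::{DefaultDirectRateLimiter, Quota, RateLimiter};

// One cell == one byte, so a 10 MB/s budget is 10_000_000 cells per second.
fn throughput_limiter() -> DefaultDirectRateLimiter {
    RateLimiter::direct(Quota::per_second(NonZeroU32::new(10_000_000).unwrap()))
}

// Before each write, wait until `buf.len()` byte-cells are available.
// Chunks larger than the burst size fail with InsufficientCapacity.
async fn limited_write(limiter: &DefaultDirectRateLimiter, buf: &[u8]) {
    let n = NonZeroU32::new(buf.len() as u32).expect("non-empty chunk");
    limiter
        .until_n_ready(n)
        .await
        .expect("chunk exceeds the configured burst size");
    // ... perform the actual AsyncWrite::write here ...
}

How the 50/50 vs. 90/10 question shakes out then depends on how often each writer wakes up and takes cells; as far as I know, governor does not guarantee fairness between concurrent waiters.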

Unexpected rejections with rate limiters that allow more than a billion requests/second and are accessed from more than one thread

Here is a minimal repro:

use governor::{
    clock::QuantaClock,
    middleware::StateInformationMiddleware,
    state::{InMemoryState, NotKeyed},
    Quota, RateLimiter,
};
use std::sync::Arc;
use std::thread;

fn rlspin(rl: Arc<RateLimiter<NotKeyed, InMemoryState, QuantaClock, StateInformationMiddleware>>) {
    for _ in 0..1_000_000 {
        rl.check().map_err(|e| dbg!(e)).unwrap();
    }
}

fn main() {
    let clock = QuantaClock::default();
    let quota = Quota::per_second(1_000_000_001.try_into().unwrap());
    dbg!(quota);

    let rate_limiter: Arc<
        RateLimiter<NotKeyed, InMemoryState, QuantaClock, StateInformationMiddleware>,
    > = Arc::new(
        RateLimiter::direct_with_clock(quota, &clock)
            .with_middleware::<StateInformationMiddleware>(),
    );

    let rate_limiter2 = rate_limiter.clone();

    thread::spawn(move || {
        rlspin(rate_limiter2);
    });
    rlspin(rate_limiter);
}

(with a Cargo.toml that just depends on governor 0.6.0)

This reliably fails for me:

$ cargo run --release
   Compiling govtest v0.1.0 (/home/robday/src/mntnlake/govtest)
    Finished release [optimized] target(s) in 0.98s
     Running `target/release/govtest`
[src/main.rs:19] quota = Quota {
    max_burst: 1000000001,
    replenish_1_per: 0ns,
}
[src/main.rs:12] e = NotUntil {
    state: StateSnapshot {
        t: Nanos(0ns),
        tau: Nanos(0ns),
        time_of_measurement: Nanos(224.941µs),
        tat: Nanos(224.941µs),
    },
    start: QuantaInstant(
        Nanos(170098.543484062s),
    ),
}
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: NotUntil { state: StateSnapshot { t: Nanos(0ns), tau: Nanos(0ns), time_of_measurement: Nanos(224.941µs), tat: Nanos(224.941µs) }, start: QuantaInstant(Nanos(170098.543484062s)) }', src/main.rs:12:41
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

However, if I change line 18 to let quota = Quota::per_second(1_000_000_000.try_into().unwrap()); it reliably passes:

$ cargo run --release
   Compiling govtest v0.1.0 (/home/robday/src/mntnlake/govtest)
    Finished release [optimized] target(s) in 0.88s
     Running `target/release/govtest`
[src/main.rs:19] quota = Quota {
    max_burst: 1000000000,
    replenish_1_per: 1ns,
}
$

and if I comment out the thread::spawn it also passes:

$ cargo run --release
   Compiling govtest v0.1.0 (/home/robday/src/mntnlake/govtest)
warning: unused import: `std::thread`
 --> src/main.rs:8:5
  |
8 | use std::thread;
  |     ^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused variable: `rate_limiter2`
  --> src/main.rs:28:9
   |
28 |     let rate_limiter2 = rate_limiter.clone();
   |         ^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_rate_limiter2`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: `govtest` (bin "govtest") generated 2 warnings (run `cargo fix --bin "govtest"` to apply 2 suggestions)
    Finished release [optimized] target(s) in 0.73s
     Running `target/release/govtest`
[src/main.rs:19] quota = Quota {
    max_burst: 1000000001,
    replenish_1_per: 0ns,
}

I hit this when attempting to disable rate-limiting by configuring a rate-limit of u32::MAX, so I have an easy workaround of configuring a rate-limit of one billion, which is much higher than I need - but I thought you might appreciate the bug report anyway!

Possible regression from 0.4 to 0.5

Maintainer of actix-governor here.

After upgrading actix-governor's version of governor, our test cases started to fail.
It doesn't look like the rate-limiting itself is broken, but rather the reporting of the remaining burst size.

While looking through the changes I noticed that this test was added:

https://github.com/antifuchs/governor/blob/b8d99cf6b01c1969bce2e3497986b2ae59d33ac4/governor/tests/middleware.rs#L73-L102

According to the test, everything seemed to work as expected: After one check, the remaining burst size goes down by one.
Yet in our tests it went down from two to zero in just one step.

After some testing I found that using FakeRelativeClock made a difference.
Running with the fake clock works, but with a real clock it doesn't anymore.

This test case shows this behavior:

#[test]
fn state_snapshot_tracks_quota_accurately() {
    use governor::clock::FakeRelativeClock;
    use governor::middleware::StateInformationMiddleware;
    use governor::{Quota, RateLimiter};
    use std::num::NonZeroU32;
    use std::time::Duration;

    let burst_size = NonZeroU32::new(2).unwrap();
    let period = Duration::from_millis(90);
    let quota = Quota::with_period(period).unwrap().allow_burst(burst_size);

    let clock = FakeRelativeClock::default();

    // First test
    let lim = RateLimiter::direct_with_clock(quota, &clock)
        .with_middleware::<StateInformationMiddleware>();

    assert_eq!(lim.check().unwrap().remaining_burst_capacity(), 1);
    assert_eq!(lim.check().unwrap().remaining_burst_capacity(), 0);
    assert_eq!(lim.check().map_err(|_| ()), Err(()), "should rate limit");

    // Now with a real clock
    let lim = RateLimiter::direct(quota)
        .with_middleware::<StateInformationMiddleware>();

    assert_eq!(lim.check().unwrap().remaining_burst_capacity(), 1); // <- This returns 0 instead
    assert_eq!(lim.check().unwrap().remaining_burst_capacity(), 0);
    assert_eq!(lim.check().map_err(|_| ()), Err(()), "should rate limit");
}

Storing and re-storing rate-limit state?

I've got a use case for which I need to store the state of the rate limit on disk when quitting the program, and restore it when starting again. What I'm currently considering is deriving Serialize and Deserialize on a bunch of the types in governor, and hiding that behind a serde feature flag. Do you think that makes sense to include here?

Combination of rate_limiters

I think there is a use case where you want to protect a resource with multiple rate limiters, e.g. one direct and one keyed, or multiple keyed ones.
Example usecase:

  • you only want to allow at most 10 req/s for a particular API call A, at most 5 req/s per user group B, and at most 1 req/s per individual user, except user C, who can have 2 req/s

You cannot just consecutively call check() or check_key(), because it becomes a mess if you only have partial success. E.g. if you have two rate limiters and the first check() returns a positive outcome but the second one returns a negative outcome, you cannot continue, yet you have spent some of the first rate limiter's quota regardless.
Moreover, the current architecture does not allow you to easily have different quotas (you have to make a different rate limiter for each).

I am not sure how to solve it.
Maybe we have to introduce a third type of RateLimiter: the DictionaryRateLimiter.
For a given set of (key, value) pairs, it accepts a Quota at construction time.
Example

let tuples = vec![
    ("api", Some("API_A"), Quota::per_second(nonzero!(10u32))),
    ("usergroup", Some("B"), Quota::per_second(nonzero!(5u32))),
    ("user", Some("C"), Quota::per_second(nonzero!(2u32))),
    ("user", None, Quota::per_second(nonzero!(1u32)))
];
let lim = RateLimiter::dictionary(tuples);

The usage would then be:

//Maybe use a Map<> instead - not sure
let tuples = vec![
    ("api", "API_A"),
    ("usergroup", "B"),
    ("user", "D")
];
lim.check_dictionary_n(tuples, 4)

This last function call would only have a positive outcome if the quotas for all the different 'levels' (i.e. "api", "usergroup", "user") are satisfied.

Let me know what you think.

No support for 32-bit platforms (e.g. armv5te)

I want to ship a program using governor to an armv5te target. However, because AtomicU64 doesn't exist on that target, I get the following:

   |
56 | pub use std::sync::atomic::{AtomicI64, AtomicU64};
   |                             ^^^^^^^^^  ^^^^^^^^^ no `AtomicU64` in `sync::atomic`
   |                             |
   |                             no `AtomicI64` in `sync::atomic`
   |
help: a similar name exists in the module
   |
56 | pub use std::sync::atomic::{AtomicI8, AtomicU64};
   |                             ^^^^^^^^
help: a similar name exists in the module
   |
56 | pub use std::sync::atomic::{AtomicI64, AtomicU8};
   |                                        ^^^^^^^^

Is there any way to work around this?

How to find out the quota after a rate limiter is created?

Essentially I want to do this. It's a simple example that doesn't account for the other possible constructors for Quota.

use std::num::NonZeroU32;
use nonzero_ext::*;
use governor::{Quota, RateLimiter};

let mut lim = RateLimiter::direct(Quota::per_second(nonzero!(50u32))); // Allow 50 units per second

assert_eq!(lim.get_quota(), nonzero!(50u32));

The ability to update/refund the value of token bucket directly from user code

I cannot find this functionality; please excuse me if I'm wrong.

Suppose I execute a request for which I don't have a good way to estimate the cost in advance. After the request is done, I know its real cost more precisely. Then I could take the difference between the initial cost estimate and the real cost and refund/penalize the token bucket by that amount.

Another case is when the backend has free resources or is severely overloaded. I may want to accelerate or slow down the token drip in all buckets. This could be done if I had the ability to change the current value one way or another.

Support for rate limiting something that takes a non-trivial amount of time and throttles based on completion time

As a simplified example, imagine there is a service that does some slow computation on demand. The amount of time the computation takes is non-deterministic and essentially random. The service also requires that you wait a specific amount of time before you can submit another request. However, that amount of time is small compared to how long the work takes.

E.g. let's say the service requires 1 second of downtime between when a previous request finishes (not starts! finishes!) and when a new one can be submitted, and requests take anywhere from 0 to 100 seconds to complete. If you naively create a governor with 1 cell that takes 1 second to replenish, then you can easily start submitting requests before previous ones are done. If you use 100 seconds as the replenish interval, you will end up being suboptimal and the service will be idle most of the time.

Note that in this simple case, governor crate is arguably an overkill, but there are more general cases of this.

It doesn't seem to be possible with the current API. Does this make sense? Is this in scope for this crate?

Test whether a cell can be allowed

I have a use case where I want to test if the rate limiter can currently allow n cells and then make a decision based on that. It is possible that I don't actually end up allowing cells through so I don't want to actually consume the quota.

Currently, check & check_n don't simply check whether the cell can be allowed through, which I think makes the naming confusing.

I'm thinking check & check_n could be renamed to something like try_* & try_*_n, so that we avoid confusion with the naming, and we could add check & check_n methods that simply perform a check. Ideally it would be consistent with until_ready, but try_until_ready would be pretty misleading naming. Maybe try_wait & try_wait_n?

Sync rate double async rate

Hello! Thank you for this great library, it is really helpful for my project where I want to set the frequency I want to poll a controller.

However, I noticed that I was generating more data than expected at the configured polling rate. I did some experimenting and found that my synchronous code was polling twice as fast as the expected rate. I am not quite sure whether my code is wrong, but it doesn't seem to be, as I based it on governor's async code.

To work around this I am halving the rate in my codebase, but that is not a proper solution and might not hold in every situation...

I actually do not want to add an async runtime as a dependency (to reduce binary size), so I would like to understand what is wrong. I attached sample code and the results to this issue.

Thanks for reading, and I hope we can get this fixed :)

Test machine

Kubuntu 20.04 and Windows 10 (MSYS2)
Intel Core i5-4570

Expected output

[Async] Count: 250
[Sync] Count: 250

Actual result

[Async] Count: 250
[Sync] Count: 500

Sample Code

Cargo.toml:

[profile.release]
lto = true

[package]
name = "test-governor"
edition = "2018"
version = "1.0.0"

[dependencies]
governor = "0.3"
tokio = { version = "1", features = ["full"] }

main.rs:

use std::{num::NonZeroU32, thread, time::{Duration, Instant}};

use governor::{Quota, RateLimiter, clock::{self, Clock}};

#[tokio::main]
pub async fn main() {
    let polling_rate = 250; // Hz
    let sync_thread = thread::spawn(move || sync(polling_rate));

    async_(polling_rate).await;

    let _ = sync_thread.join();
}

async fn async_(polling_rate: u32) {
    let limiter = RateLimiter::direct(
        Quota::per_second(NonZeroU32::new(polling_rate)
        .unwrap())
        .allow_burst(NonZeroU32::new(1u32).unwrap())
    );

    let mut i = 0;
    let mut last = Instant::now();
    let interval = Duration::from_secs(1);
    loop {
        i = i + 1;
        let now = Instant::now();
        if now > last + interval {
            println!("[Async] Count: {}", i);
            i = 0;
            last = now;
        }

        limiter.until_ready().await;
    }
}

fn sync(polling_rate: u32) {
    let clock = clock::DefaultClock::default();
    let limiter = RateLimiter::direct_with_clock(
        Quota::per_second(NonZeroU32::new(polling_rate).unwrap())
                .allow_burst(NonZeroU32::new(1u32).unwrap()),
        &clock
    );

    let mut i = 0;
    let mut last = Instant::now();
    let interval = Duration::from_secs(1);
    loop {
        i = i + 1;
        let now = Instant::now();
        if now > last + interval {
            println!("[Sync] Count: {}", i);
            i = 0;
            last = now;
        }

        if let Err(negative) = limiter.check() {
            thread::sleep(negative.wait_time_from(clock.now()));
        }
    }
}

Quota exceeded when used in tokio executor

I'm getting behavior where the per-minute quota I set is exceeded in a multi-threaded environment. Specifically, the effective quota seems to be about 50% higher than the one I set.

Here is the minimal replication code.

[package]
name = "governor_rate_limit"
version = "0.1.0"
authors = ["jean-airoldie <[email protected]>"]
edition = "2018"

[dependencies]
governor = { git = "https://github.com/antifuchs/governor" }
tokio = { version = "0.2", features = ["rt-threaded", "time", "macros"] }
nonzero_ext = "0.2.0"

use {
    governor::{
        clock::DefaultClock,
        state::{direct::NotKeyed, InMemoryState},
        Quota, RateLimiter,
    },
    nonzero_ext::nonzero,
};

use std::{sync::{Arc, atomic::{AtomicU32, Ordering}}, time::{Duration, Instant}};

const LIMIT: u32 = 1_200;

async fn fetch_add_forever(
    limiter: Arc<RateLimiter<NotKeyed, InMemoryState, DefaultClock>>,
    count: Arc<AtomicU32>,
) {
    let n = nonzero!(10u32);
    loop {
        limiter.until_n_ready(n).await.unwrap();
        count.fetch_add(n.get(), Ordering::Relaxed);
    }
}

#[tokio::main]
async fn main() {
    let start = Instant::now();

    let quota = Quota::per_minute(nonzero!(LIMIT));
    let limiter = Arc::new(RateLimiter::direct(quota));
    let count = Arc::new(AtomicU32::new(0));

    for _ in 0..4 {
        let fut = fetch_add_forever(limiter.clone(), count.clone());
        tokio::spawn(fut);
    }

    // Wait some time for the spawned tasks to use the rate limiter,
    // but not enough that the quota is reset.
    tokio::time::delay_for(Duration::from_secs(30)).await;

    let value = count.load(Ordering::Acquire);

    // Make sure that the total time of the experiment is
    // less than a minute.
    let delta = Instant::now() - start;
    assert!(delta < Duration::from_secs(60));

    dbg!(value);

    assert!(value <= LIMIT);
}

Which prints

[src/main.rs:49] value = 1800
thread 'main' panicked at 'assertion failed: value <= LIMIT', src/main.rs:51:5

when executed.

Keyed vs Direct

Performance-wise, is it preferable to construct a direct RateLimiter for each new connection, or to use a keyed one?

[QUESTION] Using governor for HTTP Rate Limiting weirdness

Hello!

Thanks for writing and providing this library, it's been a big help!
I had a few questions about how it works and if you had any suggestions to my problem.

  1. It seems that on first start of using this rate limiter, it allows more requests than the defined rate limit, e.g. with a limit of 10 rp/s, for the first few seconds of sending requests around 15-ish rp/s make it through.

  2. One idea I had was to allow "hot configuration" of the rate limiter so I can change the rate limit value on the fly. I'm currently doing this by creating a new RateLimiter with the new quota and using a RwLock to ensure all threads can safely use the new rate limiter. Similarly to issue 1, when this happens a large number of extra requests make it through on top of the rate limit.

Extra Context:

  • Using RateLimiter::direct(Quota::per_second(NonZeroU32::new(10).unwrap())) to create the RateLimiter
  • Using bucket.check() to check whether I should process a request; if the bucket is full I immediately respond with a 429

I appreciate this could just be due to my implementation; I just wondered if you had any ideas.
Thanks!
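
(A note on point 1 for context: a Quota::per_second(10) also carries a burst allowance of 10 cells on top of the steady replenishment, which can look like extra requests right after startup. A sketch of constraining the burst explicitly with allow_burst, as used elsewhere in this document:)

use std::num::NonZeroU32;

use governor::{Quota, RateLimiter};

fn main() {
    // 10 requests per second on average, but at most 1 cell available
    // at any instant, so there is no initial burst of 10.
    let quota = Quota::per_second(NonZeroU32::new(10).unwrap())
        .allow_burst(NonZeroU32::new(1).unwrap());
    let limiter = RateLimiter::direct(quota);

    assert!(limiter.check().is_ok());
    assert!(limiter.check().is_err()); // the next cell frees up ~100ms later
}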

Tests for proceeds take creation time of the RateLimiter into account

I am uncertain if this is intended, I think it is not, but...

My tests were randomly failing with a proceed time of more than 200ms, which I found odd.

Moving the creation of the timestamp to after the creation of the limiter gave stable and reasonable results.

However, I am not a very experienced Rust coder, so I might be misjudging this. Ignore my PR in that case.

Thank you for this great crate.

Link related projects in README

I think it would be nice to have a list of related projects in the README.md. I just found this crate because of a post on r/rust about tide-governor, so I hope having links in this repository will help people find an implementation of governor for their favourite framework.

I really like the idea to allow different ecosystems to work on the same foundation with only framework-specific wrappers instead of having unique implementations that are bound to a single framework.

The projects I was thinking of were tide-governor and actix-governor and hopefully there will be more in the future :)

Thanks a lot for your work 👍

Any reason for MonotonicClock to be an empty tuple struct?

Hi, I'm wondering if there's a reason that struct MonotonicClock() is defined as such rather than just struct MonotonicClock. It's extremely, terribly minor, but the second is generally considered better style AFAIK.

EDIT: typo'd second example

DirectRateLimiter::check_all does not error on exceeded cap

If I correctly understand the DirectRateLimiter::check_all doc, it should error when an n greater than the max_burst set when creating the Quota is passed. However, that is not the case:

#[test]
fn errors_on_exceeded_cap() {
    let lim = RateLimiter::direct(Quota::per_second(nonzero!(10u32)));

    // This does not error even though we exceed the capacity.
    block_on(lim.check_all(nonzero!(11u32))).unwrap_err();
}

until_ready weird behavior

I am trying to consume a very simple API, but I am hitting some unexpected behavior when using until_ready to wait for the rate limiter to be available again.

My code looks like this

use std::num::NonZeroU32;

use anyhow::{anyhow, Context, Result};

struct RateLimitedClient {
    client: reqwest::Client,
    limiter: governor::DefaultDirectRateLimiter,
}

impl RateLimitedClient {
    fn new(reqs_per_minute: u32) -> Result<RateLimitedClient> {
        let reqs_per_minute =
            NonZeroU32::new(reqs_per_minute).ok_or(anyhow!("reqs_per_minute can't be zero."))?;
        let quota = governor::Quota::per_minute(reqs_per_minute);
        Ok(RateLimitedClient {
            client: reqwest::Client::new(),
            limiter: governor::RateLimiter::direct(quota),
        })
    }
}

async fn make_request(
    url: &str,
    rate_limited_client: &RateLimitedClient
) -> Result<()> {
    rate_limited_client.limiter.until_ready().await;
    if rate_limited_client.limiter.check().is_err() {
        return Err(anyhow!("Rate limit blocked request"));
    };
    let _res = rate_limited_client
        .client
        .get(url)
        .send()
        .await
        .context("Error on the request.")?;
    Ok(())
}

But I am hitting the rate limiter error (i.e. check() not being ready) every time I reach the limit, which I understood I shouldn't be hitting because of the previous until_ready() call. Am I missing something?

Thanks in advance!

Missing license

Hi,
How is this crate licensed? 🙂

Edit: I now found the license field in Cargo.toml. A separate file would make this more obvious.

Support WASM

Heya, I'm interested in adding WASM support to governor, especially because I've tried using both this and async-std's stream::interval for rate limiting, and governor's performance is much better.

First bit is to use instant::Instant in place of std::time::Instant.

I'll have a go and post back once I've made some progress.

Unclear how to store RateLimiter in struct

Minor thing I encountered while updating a small project from rate_limiter to governor. It's using it in a very basic way, but roughly like:

struct Example {
    rate_limit: DirectRateLimiter<GCRA>,
}

However on first glance, it's not clear how to go from the example in the governor docs:

use std::num::NonZeroU32;
use nonzero_ext::*;
use governor::{Quota, RateLimiter};

let mut lim = RateLimiter::direct(Quota::per_second(nonzero!(50u32))); // Allow 50 units per second
assert_eq!(Ok(()), lim.check());

to something along these lines:

use std::num::NonZeroU32;
use nonzero_ext::*;
use governor::{Quota, RateLimiter};

struct Example {
    limiter: RateLimiter<???>,
}

let eg = Example {
    limiter: RateLimiter::direct(Quota::per_second(nonzero!(50u32))),
};
assert_eq!(Ok(()), eg.limiter.check()); // although these are in methods on Example

All the examples I could find just create the limiter in a function (so just a simple let x = binding), and working out the generic parameters is slightly unclear.

Is this an "expected" use of the API, or is there a more conventional approach which avoids needing to specify the generic parameters?
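
For reference, a minimal sketch that spells out the generic parameters of a direct limiter, using the same types that appear in other issues in this document (NotKeyed, InMemoryState, DefaultClock); recent governor releases also ship a DefaultDirectRateLimiter alias that hides them:

use governor::clock::DefaultClock;
use governor::state::{direct::NotKeyed, InMemoryState};
use governor::{Quota, RateLimiter};
use nonzero_ext::nonzero;

struct Example {
    // The fully spelled-out type returned by RateLimiter::direct.
    limiter: RateLimiter<NotKeyed, InMemoryState, DefaultClock>,
}

impl Example {
    fn new() -> Self {
        Example {
            limiter: RateLimiter::direct(Quota::per_second(nonzero!(50u32))),
        }
    }

    fn allowed(&self) -> bool {
        self.limiter.check().is_ok()
    }
}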

smallvec/parking_lot vulnerability

Hey 👋

There was a vulnerability in smallvec:

I came across this while using cargo deny on our project radicle-link, where there was a transitive dep from governor to parking_lot.

I created a pull-request for the parking_lot repo Amanieu/parking_lot#276 and I wanted to track an issue here for updating this project with the fixed version too.

Hope that works for you, and let me know if I can do anything to help ✌️

Plans to separate check from token operation?

Hi, we have a use case where we need to perform the check and the token-taking operation independently. For example: when a request comes in, we check whether the remaining quota is enough; if so, we let the request hit the DB, derive the actual token amount from the DB response, and finally try to take n tokens out.

Is the above scenario supported?

`StateSnapshot::remaining_burst_capacity()` never "replenishes"

StateSnapshot::remaining_burst_capacity() does not seem to replenish correctly, despite correct rate limiting behaviour being observed; once the initial burst quota is exceeded, it is permanently set at zero, even after the replenishment time has passed (and the rate limiter is not rejecting requests). Some tracing indicates that the calculation uses tat as if it were relative to the last "replenishment window", but it is always relative to the RateLimiter's start time, leading to the calculation being as if no time had passed since the first construction of the RateLimiter.

I will update this issue shortly with a reproducible test case.

Support waiting for an arbitrary weight

In some contexts the rate can be limited by a budget that gets refilled at a given rate where certain types of requests have a higher cost or weight than others.

Do you think this use case could be supported by the DirectRateLimiter?

I'm thinking of something along the lines of:

pub async fn until_n_ready(&self, n: NonZeroU32) -> Result<(), InsufficientCapacity> {
    loop {
        match self.check_all(n) {
            Ok(()) => return Ok(()),
            Err(NegativeMultiDecision::BatchNonConforming(_, deadline)) => {
                tokio::time::delay_until(deadline).await;
            }
            Err(NegativeMultiDecision::InsufficientCapacity(cap)) => {
                return Err(InsufficientCapacity(cap));
            }
        }
    }
}

This would allow the user to specify a weight for each request.

cargo audit finds two addressable warnings

First off, thanks for this awesome crate!

However, cargo audit finds:

Crate: mach
Version: 0.3.2
Warning: unmaintained
Title: mach is unmaintained
Date: 2020-07-14
ID: RUSTSEC-2020-0168
URL: https://rustsec.org/advisories/RUSTSEC-2020-0168
Dependency tree:
mach 0.3.2
โ””โ”€โ”€ quanta 0.9.3
โ””โ”€โ”€ governor 0.5.1

Crate: serde_cbor
Version: 0.11.2
Warning: unmaintained
Title: serde_cbor is unmaintained
Date: 2021-08-15
ID: RUSTSEC-2021-0127
URL: https://rustsec.org/advisories/RUSTSEC-2021-0127
Dependency tree:
serde_cbor 0.11.2
โ””โ”€โ”€ criterion 0.3.6
โ””โ”€โ”€ governor 0.5.1

The first warning can be handled by upgrading quanta to 0.11.

The second warning can be handled by upgrading criterion to 0.4.0.

Observe current rate

Is it possible to observe the current rate at which we're allowing cells through? I'm trying to collect metrics on the current rate for each of my rate limiters.

Currently, I'm using a separate metric to arrive at the same number (basically a prometheus counter and I'm applying a rate function on it, over time). I'd be surprised if these matched up perfectly. I'd rather expose the rate directly as a gauge if possible.

Looking at the code, there's currently no way to determine that. I wonder if this could just work like the check function, but without allowing a cell through. I realize this would change the code a lot, because most of it appears to be using compare_exchange or other functions that mutate values.

Gate `rand` behind a feature gate by making `Jitter` optional

Heya 👋,

I'm writing a library that targets the web, and would like to be able to turn off dependencies as much as possible -- mainly to reduce compilation times as well as load times on web pages. One of these is rand, which is only used by Jitter when that is required.

I've got a PR coming up that does what this says, though I couldn't seem to get documentation to generate the helpful note that says:

This is only available with the "jitter" feature enabled.

The dependency tree with `--no-default-features --features "jitter"`:

governor v0.3.0-dev (/mnt/data/work/github/antifuchs/governor)
├── no-std-compat v0.4.0
│   └── hashbrown v0.6.3
│       └── ahash v0.2.18
│           └── const-random v0.1.8
│               ├── const-random-macro v0.1.8
│               │   ├── getrandom v0.1.14
│               │   │   ├── cfg-if v0.1.10
│               │   │   └── libc v0.2.70
│               │   └── proc-macro-hack v0.5.15
│               └── proc-macro-hack v0.5.15
├── nonzero_ext v0.2.0
├── parking_lot v0.10.2
│   ├── lock_api v0.3.4
│   │   └── scopeguard v1.1.0
│   └── parking_lot_core v0.7.2
│       ├── cfg-if v0.1.10
│       ├── libc v0.2.70
│       └── smallvec v1.4.0
└── rand v0.7.3
    ├── getrandom v0.1.14 (*)
    ├── libc v0.2.70
    ├── rand_chacha v0.2.2
    │   ├── ppv-lite86 v0.2.6
    │   └── rand_core v0.5.1
    │       └── getrandom v0.1.14 (*)
    └── rand_core v0.5.1 (*)
Without `--features "jitter"` it's:

governor v0.3.0-dev (/mnt/data/work/github/antifuchs/governor)
├── no-std-compat v0.4.0
│   └── hashbrown v0.6.3
│       └── ahash v0.2.18
│           └── const-random v0.1.8
│               ├── const-random-macro v0.1.8
│               │   ├── getrandom v0.1.14
│               │   │   ├── cfg-if v0.1.10
│               │   │   └── libc v0.2.70
│               │   └── proc-macro-hack v0.5.15
│               └── proc-macro-hack v0.5.15
├── nonzero_ext v0.2.0
└── parking_lot v0.10.2
    ├── lock_api v0.3.4
    │   └── scopeguard v1.1.0
    └── parking_lot_core v0.7.2
        ├── cfg-if v0.1.10
        ├── libc v0.2.70
        └── smallvec v1.4.0

Hm, looks like we only save on these:

    ├── libc v0.2.70
    ├── rand_chacha v0.2.2
    │   ├── ppv-lite86 v0.2.6
    │   └── rand_core v0.5.1

haha, looks like the value is not as high as I thought. I'll submit the PR anyway, in case you think it's worth including.

Clean up nonzero! macro usage with const generics?

Hiya. I like this crate, but I feel that using the nonzero! macro on every construction makes for quite a bit of visual noise, and what it does is mostly enforceable at compile time using const generics.
Here's a link to the playground with the idea (try it with 0 values to prove it doesn't compile):
https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=805e21d247907ef83759e8066de5957d
This allows more contexts to provide parameters for constructing quotas, namely named constants and const-evaluable functions in addition to literals, and removes the need to specify u32 constantly.
It does require a specific minimum Rust version, but would you be amenable to expanding the API with this kind of constructor?
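
For illustration, a minimal sketch of the const-generics idea (hypothetical free function, not governor API; the playground link above has the full version):

use std::num::NonZeroU32;

// The zero check moves from a runtime macro to a compile-time
// assertion on a const generic parameter.
struct AssertNonZero<const N: u32>;

impl<const N: u32> AssertNonZero<N> {
    const OK: () = assert!(N != 0, "quota parameters must be non-zero");
}

fn per_second<const N: u32>() -> NonZeroU32 {
    // Referencing the associated const forces the check during
    // monomorphization, so per_second::<0>() fails to compile.
    let () = AssertNonZero::<N>::OK;
    NonZeroU32::new(N).unwrap()
}

fn main() {
    let fifty = per_second::<50>(); // compiles fine
    // let zero = per_second::<0>(); // compile error
    println!("{fifty}");
}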
