Comments (20)
@kumy apologies for the delay resolving this, I just pushed a commit that should mitigate this issue. I will cut a new release soon but if you / any of your users would like to test the updated binary before I do so that would be great! Artifacts can be found here.
from bkt.
I think this is rust-lang/rust#100141 - surprising!
from bkt.
can you also share any more details about how your tool is invoking bkt?
We're using that pattern:
bkt \
--scope="${ENVIRONMENT}.${params[0]}.foo" \
--ttl="${TTL}" \
--discard-failures \
-- \
$THE_COMMAND "${params[@]}"
is it using a long
--ttl
We're using ttl: 15min, 1h, 1d, 1w, depending on the queried data
from bkt.
Hi @dimo414
Someone had the issue this morning. This is the stack trace.
We still have a copy of the cache directory, in case you need something else.
Hope it'll be useful π€
thread 'main' panicked at 'overflow when subtracting duration from instant', library/std/src/time.rs:424:33
stack backtrace:
0: 0x106f5dde4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hb34d93b518a69020
1: 0x106f9caeb - core::fmt::write::ha76b1d04ecc430e1
2: 0x106f67e90 - std::io::Write::write_fmt::h4e16facc4367e611
3: 0x106f5dc00 - std::sys_common::backtrace::print::hdb50332c504929b8
4: 0x106f687f5 - std::panicking::default_hook::{{closure}}::h59fe0a3fe5f24ecc
5: 0x106f685d1 - std::panicking::default_hook::h907db83ced102c19
6: 0x106f68dfa - std::panicking::rust_panic_with_hook::h428ca3503467e89a
7: 0x106f5e0b3 - std::panicking::begin_panic_handler::{{closure}}::h051de3881f27cd34
8: 0x106f5deb9 - std::sys_common::backtrace::__rust_end_short_backtrace::hda22941d8908552b
9: 0x106f689ad - _rust_begin_unwind
10: 0x106faef33 - core::panicking::panic_fmt::hed2e16a438b6f44b
11: 0x106faed2a - core::option::expect_failed::haa40f9b5b7a5c9b9
12: 0x106f77dd7 - <std::time::Instant as core::ops::arith::Sub<core::time::Duration>>::sub::hcccab8d2f746e34c
13: 0x106ef443c - bkt::Bkt::retrieve_impl::hcf0543033a636198
14: 0x106f0080f - bkt::main::hf321a34913da4dc1
15: 0x106ef0ca6 - std::sys_common::backtrace::__rust_begin_short_backtrace::h6f324c6f687f5aea
16: 0x106ef9cdc - std::rt::lang_start::{{closure}}::h6e3e120d3a431146
17: 0x106f68894 - std::panicking::try::ha03e99799e87108d
18: 0x106f8a9de - std::rt::lang_start_internal::hbf92d065385ce277
19: 0x106f0494c - _main
20: 0x7ff803e053a6 - <unknown>
from bkt.
Someone else had it this morning
thread 'main' panicked at 'overflow when subtracting duration from instant', library/std/src/time.rs:424:9
stack backtrace:
0: rust_begin_unwind
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
2: core::panicking::panic_display
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:72:5
3: core::panicking::panic_str
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:56:5
4: core::option::expect_failed
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/option.rs:1880:5
5: core::option::Option<T>::expect
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/option.rs:738:21
6: <std::time::Instant as core::ops::arith::Sub<core::time::Duration>>::sub
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/time.rs:424:9
7: bkt::Bkt::retrieve_streaming
8: bkt::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
And with RUST_BACKTRACE=full
thread 'main' panicked at 'overflow when subtracting duration from instant', library/std/src/time.rs:424:9
stack backtrace:
0: 0x10d4389f2 - std::backtrace_rs::backtrace::libunwind::trace::h3274e49b4ebf7e53
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x10d4389f2 - std::backtrace_rs::backtrace::trace_unsynchronized::hd211d632c7624604
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x10d4389f2 - std::sys_common::backtrace::_print_fmt::h1e774fa26a8ee490
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:66:5
3: 0x10d4389f2 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd2e8cbde22b780fc
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:45:22
4: 0x10d455bba - core::fmt::write::hd6692086cdd356a7
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/fmt/mod.rs:1202:17
5: 0x10d43577c - std::io::Write::write_fmt::h6043124a2486acbb
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/io/mod.rs:1679:15
6: 0x10d43a12b - std::sys_common::backtrace::_print::h3d78a31fac988d9b
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:48:5
7: 0x10d43a12b - std::sys_common::backtrace::print::hdfd67a42b253a0c5
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:35:9
8: 0x10d43a12b - std::panicking::default_hook::{{closure}}::h87a12b8b06887dd7
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:295:22
9: 0x10d439e37 - std::panicking::default_hook::h7f68dad17e0bfaa4
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:314:9
10: 0x10d43a74f - std::panicking::rust_panic_with_hook::hd57427cbbfc3717a
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:698:17
11: 0x10d43a693 - std::panicking::begin_panic_handler::{{closure}}::h33aab6d04e2bba70
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:588:13
12: 0x10d438e88 - std::sys_common::backtrace::__rust_end_short_backtrace::h0e7a76f927db9964
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:138:18
13: 0x10d43a35d - rust_begin_unwind
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
14: 0x10d461403 - core::panicking::panic_fmt::hcf6f3c517c6f3cb3
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
15: 0x10d4546eb - core::panicking::panic_display::ha4c5bde01ffb9415
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:72:5
16: 0x10d45469c - core::panicking::panic_str::h93da7ca3f4eb85fd
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:56:5
17: 0x10d461299 - core::option::expect_failed::h7ed1f1962cc2062e
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/option.rs:1880:5
18: 0x10d4385c2 - core::option::Option<T>::expect::h43a9229dec635650
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/option.rs:738:21
19: 0x10d4385c2 - <std::time::Instant as core::ops::arith::Sub<core::time::Duration>>::sub::h71b7739ee54a949a
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/time.rs:424:9
20: 0x10d3b02b5 - bkt::Bkt::retrieve_streaming::h77a38c05e52cbe27
21: 0x10d3ba5f0 - bkt::main::hd77d3c47a29551e3
22: 0x10d3b2c66 - std::sys_common::backtrace::__rust_begin_short_backtrace::hba5bd1edd81f5832
23: 0x10d3b678c - std::rt::lang_start::{{closure}}::h0f91855c29255625
24: 0x10d43001e - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hb30f3c124d6422bd
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:283:13
25: 0x10d43001e - std::panicking::try::do_call::h6ab65f1fbf11caa5
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:492:40
26: 0x10d43001e - std::panicking::try::hb963b3a8541f11db
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:456:19
27: 0x10d43001e - std::panic::catch_unwind::hd044952603e5f56c
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panic.rs:137:14
28: 0x10d43001e - std::rt::lang_start_internal::{{closure}}::h3a2dc7ba533401e5
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/rt.rs:148:48
29: 0x10d43001e - std::panicking::try::do_call::h88480942a6edc3bf
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:492:40
30: 0x10d43001e - std::panicking::try::ha84f131c957dae86
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:456:19
31: 0x10d43001e - std::panic::catch_unwind::hab613c35aec6b94f
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panic.rs:137:14
32: 0x10d43001e - std::rt::lang_start_internal::ha5deaf08dab8765b
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/rt.rs:148:20
33: 0x10d3bf76f - _main
34: 0x7ff802ee841f - <unknown>
from bkt.
Ooh interesting, yes please let me know if you uncover a backtrace. I have never observed this but we should definitely fix it. Will take a deeper look this evening but if you identify a fix in the meantime feel free to send a PR.
from bkt.
Sounds to be our issue. Do you think the workaround will help?
from bkt.
Yeah I'm sure we can work around it :) If you're able to get a backtrace or reproduction steps that'll help pinpoint things but I'll do some more investigating this weekend.
from bkt.
Unfortunately not soon. I'm a Linux user, so I'm not concerned. And no one reported it since I asked internallyβ¦
π€
from bkt.
Could you add RUST_BACKTRACE=1
to your internal tool (if you haven't already)?
from bkt.
@kumy can you also share any more details about how your tool is invoking bkt
? E.g. is it using a long --ttl
, or passing any other flags? I tried to reproduce the trace by caching a command with a long TTL and rebooting, but I think my Mac isn't affected by this bug - it appears to use a signed type on my machine rather than a u64
.
I suspect the most reasonable fix will unfortunately be to stop using Instant
and instead swap to chrono::DateTime
or similar. I don't see a reasonable way to fix a line like this one without changing either the type or the behavior.
from bkt.
(From memory, more details to come on monday)
From the last user report, the failing call was with a --ttl 15min
. But it may not be related especially to this cache entry, I think this is the first call we have in all the script, so maybe related to pruning routine when bkt is launched.
I hope to be able to provide a trace as soon as I have a new user report.
EDIT 2023-09-29: no user reported the issue yet
from bkt.
Anything else we can do to help here?
from bkt.
π tell my employer to prioritize this bug! I will try to figure out the right fix for this soon, it's a little messy.
from bkt.
Took a deeper look at this just now. The good news is it looks like the only problematic line is the one linked above, where we construct the CacheStatus::Hit
. The bad news is I don't think there's a simple fix. I see a couple of options (assuming Rust doesn't fix the Instance
implementation on their end...):
- Swap to
Instant::checked_sub()
. This is the least-invasive change, but it's not clear what to do with failures. The only safe value to return isInstant::now()
(i.e. reporting that the cached data is brand new, regardless of its age) which would break--warm
- the data should eventually still be cleared when the TTL expires.- We could invalidate the cache when this happens, which is probably harmless but I worry this could invalidate data too often. The linked Rust bug indicates the issue occurs on reboots, but I'm not sure if that's the only circumstance where this happens. If I could reproduce this locally that would help.
- If we can confirm this issue is tied to reboots we could adjust the default cache dir to not carry over across reboots (
/tmp
is typically deleted on boot, but given that the reported cases above aren't using a custom dir it sounds like OSX doesn't reliably do so).- This could even be a usage-specific fix, i.e. instead of changing
bkt
callers could use a different--cache_dir
per-boot. Not elegant, of course, but it might be easiest π
- This could even be a usage-specific fix, i.e. instead of changing
- The implication in the Rust bug is that
Instant
is simply not the right type to use. This is a bit surprising to me and the intended alternative isn't clear, but we could stop usingInstant
. This would be a breaking API change so I'd rather avoid / delay it unless we're clearly migrating to a type that is intended to supportbkt
's use case and won't have these sort of surprising sharp edges.- Maybe this isn't so risky -
bkt
relies onSystemTime
today to determine when the TTL is expired, so perhaps we should just use that consistently and not worry about situations whereSystemTime
doesn't work as expected (namely when the system clock is changed / updated).
- Maybe this isn't so risky -
from bkt.
- Given the few number of occurrences we had on our side. I think we could leave with: destroying the failing cache entry and return the real content from "the command". Maybe displaying a line on
stderr
to warn when this error arise π€· . (That is my own personal preferred choice) - I asked users, and they cold boot every morning. We are not storing the cache in a volatile dir, and we would like to keep it's content between reboots.
- Sorry I'm not in the internal of
bkt
, I won't comment.
Thanks for bkt
and for all your time!
from bkt.
Thanks for the context on #2, that does make sense. I would actually suggest changing the cache dir to be cleared on reboot to align with the behavior your Linux users experience. According to this post /tmp
is cleared on boot while /var/tmp
(which I believe is what OSX uses by default as $TMPDIR
) is not, so passing --cache_dir=/tmp
for your OSX users might be sufficient to prevent this issue.
I played around a little more with swapping to SystemTime
everywhere, and despite its limitations it's looking like a viable change. I could swap to e.g. time
but I'm not yet sure if there's sufficient value to doing so. Will flesh it out and publish a fix soon, probably this weekend.
from bkt.
from bkt.
Oh you didn't mention using a custom cache directory above. In that case your OSX users will need to pick up this fix (once it's released).
from bkt.
FTR, one user reported the error this morning after powering his Mac...
from bkt.
Related Issues (20)
- BrokenPipe panic when stdout is closed early
- cli::discard_failures_in_background is flaky HOT 6
- Provide arm64 Darwin build HOT 5
- Record cache statistics
- Feature Request: return when value changes (differs from previous invocation) HOT 4
- [Feature suggestion] Flag to report bkt invocation values, and bkt context HOT 2
- cannot build rev 0.5.4 without updating cargo.lock HOT 5
- Flag to exit immediately on cache misses HOT 3
- Clean up cached data linked to old file modtimes HOT 1
- Feature request: stream the subprocess' stdout and stderr while it's running HOT 8
- Consider %LocalAppData% for Windows
- Default cache directory should be user-specific HOT 9
- Executable missing HOT 6
- Possible Race When Starting Many Parallel Bkt Instances HOT 1
- cache_tests::stable_hash test fails on s390x HOT 1
- Several tests occasionally flake due to missing cwd HOT 3
- bkt: stdout pipe failed: failed to write whole buffer HOT 4
- Mechanism to report cache age to callers
- bkt should make scope "filename-safe" HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bkt.