Comments (6)
I'll have to go over the continue_time related code later to double check for the same bug (and revisit alternative implementations), but for the time being, I've applied your fix.
from hexbright.
I'm glad to help! :)
By the way, for a more detailed explanation of what happens, below is the
email I sent to my dad just after trying out the fix. As for the project
where this occurred, as part of a recovery from a serious accident, I used
the hexbright to build an animated tron-themed cane to help remove some of
the stigma. It was fantastically successful (garnered tons of praise at PAX
Prime), and I intend to post a video of it and share the source code once
it gets approved by the lawyer for my case (widely-shared information is
apparently something to be careful about).
Thank you for making the hexbright easier to use! Given how people
responded, the end product felt like a magic show. :)
Playing with numbers here told me what to data-storage and time-units to
look for: http://www.wolframalpha.com/input/?i=2%5E32+microseconds
Then I found this throttle in the library I'm using, which uses 32-bit
unsigned storage and microseconds, exactly the suspect I'm looking for!
do {
now = micros();
} while (continue_time > now); // not ready for update
This is a classic bug, usually seen with occasional hangs in programs that
have been running for 49.7 days (2^32 milliseconds). The fixed code is
simply:
do {
now = micros();
} while ((signed long)(continue_time - now) > 0); // not ready for
update
It's an evil bug, because each hour when the values returned by "micros()"
wrap around, there's a random chance that "now" overshoots "continue_time",
wrapping around and then being an entire hour behind it (which is why the
cane appeared to lock up for about an hour). Even worse, when that hour is
up, there's a small chance it will overshoot again and be hung for
another hour.
I'll test it tomorrow, but it should be easier for me to sleep knowing that
I'm damn-sure I found the bug!
Once I test out the fix and become damn certain it's correct, I'll notify
the hexbright programmer so he can fix the bug in their library. :)
-Austin
P.S. While this was a relatively easy bug to fix, it's definitely
rejuvenating to do something like that!
On Fri, Sep 6, 2013 at 5:23 AM, David Hilton [email protected]:
I'll have to go over the continue_time related code later to double check
for the same bug (and revisit alternative implementations), but for the
time being, I've applied your fix.—
Reply to this email directly or view it on GitHubhttps://github.com//issues/38#issuecomment-23936129
.
from hexbright.
As soon as you sent in the bug I knew what condition you were talking about; I'm mostly surprised as I've never seen the bug in practice.
I've run code for at least 8 hours and not hit this bug. That's got to be due to a slight timing issue. Maybe your code is taking a little too long to run? If you set DEBUG to DEBUG_LOOP in libraries/hexbright/hexbright.h, it will tell you your average loop time.
I've just run a sample program;
In a non-debug mode, continue_time is initially set to 1136 microseconds.
I plugged in the exact formula into a loop that would cycle, and print the difference from maxint if we were within 20000 of maxint. Here is the output (how close continue_time gets to maxint):
starting values: 1136
starting values: 9470
starting values: 17803
countdown to overflow: -19711
countdown to overflow: -11263
countdown to overflow: -2815
overflowed!: 1
overflowed!: 8334
overflowed!: 16667
countdown to overflow: -12287
countdown to overflow: -3839
overflowed!: 1
overflowed!: 8334
overflowed!: 16667
countdown to overflow: -12287
countdown to overflow: -3839
overflowed!: 1
overflowed!: 8334
overflowed!: 16667
countdown to overflow: -12287
countdown to overflow: -3839
overflowed!: 1
overflowed!: 8334
overflowed!: 16667
countdown to overflow: -12287
countdown to overflow: -3839
Here is the relevant code:
const float update_delay = 8.3333333; // in lock-step with the accelerometer
unsigned long continue_time = 1136; // actual startup time
unsigned long maxint = 4294967295;
while(true) {
continue_time = continue_time+(1000*update_delay);
if(continue_time>4294947296) {
Serial.print("countdown to overflow: -");
Serial.println(maxint-continue_time);
} else if (continue_time<20000) {
Serial.print("overflowed!: ");
Serial.println(continue_time-maxint);
}
}
I'm actually quite surprised by the lack of proper overflow - though it does make this analysis quite simple. It does seem to indicate that your program is taking a little too long to execute - effectively, you're missing some of the accelerometer samples and light change operations take a little longer than requested (and you hit this freezing bug).
By the way, there is an onboard reset switch that could have been useful.
from hexbright.
Indeed, at the time I was running into it because some lazy serial-output
was dragging the flashlight down to 10 Hz. Come to think of it, that
explains quite clearly why I was seeing it reliably once an hour, and why
it wasn't sticking for subsequent hours (the first iteration was lapsing in
my heinously slow output code, but the throttle was breaking out due to the
shear update-frequency). Hmm, it even explains why it seemed to only stick
while in awake-mode (versus sleep-mode), since awake-mode was pouring out
about three times the debug text (especially trying to disambiguate
walking-steps from other motions).
That night I of course went through and trimmed down all the serial output
to be just event-based messages. Besides, the spam of orientation and
light-level data was no longer needed.
However, I learned my lesson that day, since I had not brought the tools
needed to open the cane and get to the flashlight to reset it. It did lead
to a funny situation where I said "I'm sorry, but my cane appears to have
crashed!", much to the amazement of those I was apologizing to.
On Fri, Sep 6, 2013 at 2:43 PM, David Hilton [email protected]:
As soon as you sent in the bug I knew what condition you were talking
about; I'm mostly surprised I've never seen the bug in practice.I've run code for at least 8 hours and not hit this bug. That's got to be
due to a slight timing issue. Maybe your code is taking a little too long
to run? If you set DEBUG to DEBUG_LOOP in libraries/hexbright/hexbright.h,
it will tell you your average loop time.I've just run a sample program;
In a non-debug mode, continue_time is initially set to 1136 microseconds.
I plugged in the exact formula into a loop that would cycle, and print the
difference from maxint if we were within 20000 of maxint. Here is the
output (how close continue_time gets to maxint):
starting values: 1136
starting values: 9470
starting values: 17803
countdown to overflow: -19711
countdown to overflow: -11263
countdown to overflow: -2815
overflowed!: 1
overflowed!: 8334
overflowed!: 16667
countdown to overflow: -12287
countdown to overflow: -3839
overflowed!: 1
overflowed!: 8334
overflowed!: 16667I'm actually quite surprised by the lack of proper overflow - though it
does make this analysis quite simple. It does seem to indicate that your
program is taking a little too long to execute - effectively, you're
missing some of the accelerometer samples and light change operations take
a little longer than requested (and you hit this freezing bug).By the way, there is an onboard reset switch that could have been useful.
—
Reply to this email directly or view it on GitHubhttps://github.com//issues/38#issuecomment-23971330
.
from hexbright.
So, this is executed as a floating point operation, and it has the reset to 1 problem.
continue_time = continue_time+(1000*update_delay);
This doesn't have the problem, though.
continue_time = continue_time+(int)(1000*update_delay);
I've applied this fix.
from hexbright.
I'm confused, since continue_time has to wrap around about once an hour, as the restriction is micros() itself. Are you saying that this was in addition to fixing the while loop?
That's a good find regarding the implicit float-to-int cast. I'm not sure how 8333 compares versus the resolution near (float)2^32, but the primary danger would be continue_time failing to increment, rather than it wrapping around (large-float plus tiny-float equals unchanged large-float).
Hmm, attacking the root of the problem, would it make sense to instead redefine update_delay as update_delay_micros?
In any case, thank you for writing the library, and good luck!
On Sep 7, 2013, at 6:12 AM, David Hilton [email protected] wrote:
So, this is executed as a floating point operation, and it has the reset to 1 problem.
continue_time = continue_time+(1000*update_delay);
This doesn't have the problem, though.
continue_time = continue_time+(int)(1000*update_delay);
I've applied this fix.
—
Reply to this email directly or view it on GitHub.
from hexbright.
Related Issues (20)
- investigate cost of non-functional rled brightness option HOT 1
- print_number does not print 0 HOT 1
- "Text Instructions" incomplete HOT 2
- hexbright.cpp fails to compile on ubuntu 12.04.1 "precise" HOT 4
- add click counter, with variable timeout and info on completion HOT 1
- set_light is not guaranteed to reach end_level in given time HOT 12
- Modification to the storbe in Tactical program HOT 2
- Serial.h: No such file or directory HOT 10
- unable to make up_n_down work.. HOT 4
- print_binary.cpp is linked/included unintentionally and bloats program HOT 3
- Change structure of library to organize things better HOT 5
- Replace Serial with Debug? HOT 2
- Accelerometer functions HOT 1
- If i try to upload it to my hexbright it gives me a error HOT 7
- Suggestion: High beam/low beam toggle in BikeLight HOT 1
- Suggestion: Two programs at once HOT 3
- Getting an Error when Compileing Tactical in Arduino 1.0.6 HOT 1
- Hard blink functionality? (Emulating a flakey / bad flashlight) HOT 3
- Where to locate a complete set of schematics
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from hexbright.