Comments (16)
Do you know when it started happening? We released httpuv 1.6.7 on 2022-12-14, and previously we released 1.6.6 on 2022-09-08.
In 1.6.7, the only real change was a new implementation of timegm
, and I doubt that has anything to do with the crash you're seeing.
In 1.6.6, we updated to libuv 1.43.0, and that conceivably could have changed the behavior.
Just looking at those stack traces, it appears that the crash is happening when the background thread is started, which happens the first time an httpuv server is started within an R session. If that is correct, I would expect code like this to be able to reproduce the crash:
for i in {1..100}; do
Rscript -e "s <- httpuv::startServer('0.0.0.0', 8000, list()); cat('.')"
done
However, when I run it on my machine, I don't get any errors (with R 4.1.2, httpuv 1.6.7, on an M1 Mac). I see that the GHA runners use R 4.2.2 and x86_64, however, and it's possible that those differences matter.
If you are able to reliably reproduce the problem, that would really help for debugging it.
from httpuv.
I am sorry this is so difficult to reproduce, it happens very randomly.
However, when I run it on my machine, I don't get any errors (with R 4.1.2, httpuv 1.6.7, on an M1 Mac). I see that the GHA runners use R 4.2.2 and x86_64, however, and it's possible that those differences matter.
I have not been able to make it crash locally either but it keeps happening in my actions. Here is a GitHub action run with your test example from above and indeed it crashes, but not reliably: https://github.com/jeroen/httpuvtest/actions/runs/3883434668/jobs/6624740869
I hope this helps. For some reason this Rscript action does not generate MacOS crashreports in the same way that R CMD check
does, I don't understand why. Therefore I haven't been able to get the backtrace, but I'm pretty sure it's the same issue. Let me know if you figure out how to do this.
from httpuv.
Actually your example seems to crash on GitHub actions on all versions of httpuv that I tested back to 1.6.0. So perhaps this is a long standing bug, and something has changed in GitHub actions that now triggers it.
- Run httpuv 1.6.0 with R-4.1
- Run httpuv 1.6.4 with R-4.1
- Run httpuv 1.6.5 with R-4.1
- Run httpuv 1.6.6 with R-4.1
from httpuv.
Just curious: what platforms did you use when testing locally?
Also, I suspect that the crashes are more likely when the system is slowed down by a heavy computational load, so it may "help" to run lots of other stuff in the background.
from httpuv.
I tested on my MacOS 11.7 (arm) MacOS 13.1 (intel), and in vmware MacOS 10.13.
from httpuv.
I tested on the following but wasn't able to reproduce the problem:
- Intel, macOS 12.6, R 4.1.1, httpuv 1.6.7 (CRAN binary build)
- M1, macOS 12.5.1, R 4.1.2, httpuv 1.6.7 (CRAN binary build)
On the Intel machine, I also ran yes > /dev/null &
eight times to increase the CPU load, but the crash still didn't happen.
from httpuv.
Oh wait, I think I was just able to reproduce this on the M1 machine, by running the yes
command many times before running the test script.
[winston@MBP16:~]$ yes > /dev/null &
[1] 9698
[winston@MBP16:~]$ yes > /dev/null &
[2] 9699
[winston@MBP16:~]$ yes > /dev/null &
[3] 9700
[winston@MBP16:~]$ yes > /dev/null &
[4] 9701
[winston@MBP16:~]$ yes > /dev/null &
[5] 9702
[winston@MBP16:~]$ yes > /dev/null &
[6] 9703
[winston@MBP16:~]$ yes > /dev/null &
[7] 9704
[winston@MBP16:~]$ yes > /dev/null &
[8] 9705
[winston@MBP16:~]$ yes > /dev/null &
[9] 9706
[winston@MBP16:~]$ yes > /dev/null &
[10] 9707
[winston@MBP16:~]$ yes > /dev/null &
[11] 9708
[winston@MBP16:~]$ yes > /dev/null &
[12] 9709
[winston@MBP16:~]$ yes > /dev/null &
[13] 9710
[winston@MBP16:~]$ yes > /dev/null &
[14] 9711
[winston@MBP16:~]$ yes > /dev/null &
[15] 9712
[winston@MBP16:~]$ for i in {1..100}; do
> Rscript -e "s <- httpuv::startServer('0.0.0.0', 8000, list()); cat('.')"
> done
..........Abort trap: 6
....................................................................................Abort trap: 6
....
from httpuv.
I modified the code in libuv's thread.c
to print out the error code from pthread_mutex_unlock
.
The error code is EINVAL
, which means that the mutex was not initialized.
from httpuv.
I think I understand what's happening.
The main thread calls ensure_io_thread()
, which does a few things.
- It creates a
Barrier
object (namedblocker)
, which contains aCondWait
object, which in turn contains auv_mutex_t
. - It also starts up the background thread, and on that thread it calls
io_thread()
.
Both the main thread and background thread call blocker.wait()
at some point; after both have called it, then both are allowed to continue.
The problem is that ensure_io_thread()
(on the main thread) finishes and returns before blocker->wait()
on the background thread returns. When ensure_io_thread()
finishes, it triggers the destructor for the Barrier
object, which in turns triggers the destructor for its CondWait
object, which in turn calls uv_mutex_destroy()
. That happens before the background thread returns from Barrier.wait()
; when it does return from that function, it calls the destructor on a guard
object (which was created with the mutex), and that calls uv_mutex_unlock
. But the mutex was already destroyed, and so it results in an error.
Relevant code:
ensure_io_thread()
:
Lines 148 to 162 in a1211df
io_thread()
:
Lines 112 to 127 in a1211df
The guard,
CondWait, and
Barrier` classes:
Lines 30 to 130 in a1211df
from httpuv.
What are you thinking for a fix? shared_ptr
?
Also, a comment on the Barrier class says "uv_barrier_t causes crashes on Windows (with libuv 1.15.0), so we have our own implementation here." Is it possible that crash was caused by a similar race condition?
from httpuv.
The lifetime of the object used for synchronization will need to to be longer than the ensure_io_thread()
call; the current way that it's implemented, where it gets allocated on the stack within ensure_io_thread()
, isn't viable.
I don't recall the details of the crashes on Windows, unfortunately.
from httpuv.
Just a quick summary of how to reproduce the error:
for i in {1..16}; do
yes > /dev/null &
done
for i in {1..50}; do
Rscript -e "s <- httpuv::startServer('0.0.0.0', 8000, list()); cat('.')"
done
killall yes
from httpuv.
@jeroen I have a potential fix in #352. Can you try it out in GHA with your test repository?
from httpuv.
Thanks, looks like this fixes the problem.
from httpuv.
Thanks for getting this diagnosed and fixed so quickly.
from httpuv.
No problem, I'm glad you found it.
from httpuv.
Related Issues (20)
- staticPath logging HOT 1
- HTTP/2 support HOT 2
- Unable to install httput - R 4.2.2 HOT 3
- Warning with clang 16 HOT 2
- Possible to use multiple worker threads? HOT 2
- Unable to install httpuv - R 3.5.3 HOT 2
- macOS: Assertion fails when closing `{httpuv}` server HOT 2
- CRAN firedrill 2023-05 HOT 4
- clang-ASAN: "runtime error: applying zero offset to null pointer"
- httpuv v1.6.11 release checklist HOT 1
- httpuv 1.6.10 crashes with "uv__finish_close: Assertion `handle->flags & UV_HANDLE_CLOSING' failed" HOT 4
- Don't dynamically link to libuv on macOS
- CI workflows for sanitizers HOT 1
- Simple server API HOT 1
- Heartbeat in discord HOT 1
- Compile error with clang on WIndows HOT 1
- Release httpuv 1.6.12
- warning: format string is not a string literal
- Release httpuv 1.6.13
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from httpuv.