Comments (5)
I've tested the kernel patch from axboe in axboe/liburing#665 (comment) and indeed it fixes the bug.
from eio.
Sorry, I had missed the original message. Yes, it hangs for me, and rather fast:
RUN 579
Testing `eio_linux'.
This run has ID `26FSRR2T'.
... io 0 copy.
with
sam:eio: local i=0;while true; do echo RUN $i; i=$((i+1)); ./_build/default/lib_eio_linux/tests/test.exe;done
Linux sam 5.18.16-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 03 Aug 2022 11:25:04 +0000 x86_64 GNU/Linux
I reduced the hanging program as much as I could, down to the code below. I think my initial impression is correct: somehow the reader never sees the EOF.
let test_copy () =
  Eio_linux.run ~queue_depth:10 @@ fun _stdenv ->
  Eio.Switch.run @@ fun sw ->
  let from_pipe, to_pipe = Eio_linux.pipe sw in
  let buffer = Cstruct.create 20 in
  Eio.Flow.copy (Eio.Flow.string_source "a") to_pipe;
  Eio.Flow.close to_pipe;
  let () =
    try
      while true do
        ignore (Eio.Flow.read from_pipe buffer)
      done
    with End_of_file -> ()
  in
  Eio.Flow.close from_pipe
strace here: https://gist.github.com/haesbaert/437fd9e30e4568cc3f5ba95f0387d63a
The writer is FD 6, which is actually closed during the hang (checked by looking at /proc/foo); FD 5 (the reader) is still open and we are blocked in io_uring_wait_cqe().
The pattern I see is that if the close happens before a readv is queued, sometimes the readv will never see EOF. If the readv is submitted before the close, it always sees the EOF. I've tested this on two machines with slightly different kernels (5.18 and 5.19), with both the released and the current code base for eio and uring, and the behaviour is the same.
My only theory for why you can't trigger the bug is that in your tests the writer/reader dance always finishes in the order where the close happens only after the read is queued; the order really depends on which CQE comes back first in the Fiber.both() tests. The program above should always trigger the bad case: I can hang it in under 5 seconds.
Tomorrow I want to peek at the uring stats, like dropped requests and whatnot. I'll also write the equivalent in C and try to trigger it there.
At this point, it smells like a kernel bug though.
I can confirm the bug with a simple C program: https://gist.github.com/haesbaert/10d3e3bb5fa9171dfcf65e1f5b58e95c
The non-blocking version works.
cc -o uring_tests uring_tests.c -Wall -luring && while true;do date; ./uring_tests;done
I can confirm the bug with a simple C program
OK, that hangs for me after a while too! Using Linux 5.19.9.