Comments (16)
For comparison, GNU parallel is limited to 15 MB of RAM on my system on the same workload.
from parallel.
I'll be landing a large set of changes soon that refactors a decent portion of the source code and adds a quiet mode. Once that is done, I'll look into improving the resource consumption of inputs. I may need to remove the feature of counting the total number of jobs and instead use an iterator for reading inputs from standard input.
from parallel.
I spent a whole day trying to fix a bug that turned out to be caused by the beta and nightly compilers. Anyway, I've landed the changes that I meant to land yesterday, so now I will begin working on handling standard input in a more efficient manner.
from parallel.
Ah, I should have warned you about the nightly compiler - I've already figured out that it's buggy with your parallel.
I've already tried the chars() iterator on BufReader in my Rust version of tr, and in addition to being an unstable language feature, it's also very slow. lines() is probably a better idea.
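A minimal sketch of the lines() approach, assuming newline-delimited inputs on standard input:

```rust
// Read stdin lazily, one line at a time, so memory use stays flat
// no matter how many inputs are piped in.
use std::io::{self, BufRead};

fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        match line {
            Ok(input) => println!("got input: {}", input),
            Err(why) => eprintln!("read error: {}", why),
        }
    }
}
```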
Also, in 0.4.1 your parallel was 2x slower than GNU parallel when reading arguments from stdin.
from parallel.
Updating the issue to say that I have Nightly builds fixed by eliminating the unsafe { mem::uninitialized::<Child>() } usage. Additionally, I have been working on fixing this issue, but it will take some time, as I want to implement it as efficiently as possible the first time, such as by trying not to use Vectors.
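A minimal sketch of the safe alternative, using an Option to model the not-yet-spawned state instead of uninitialized memory (the Vec here is purely for illustration, not the project's actual code):

```rust
use std::process::{Child, Command};

fn main() -> std::io::Result<()> {
    // One slot per job; None until a process is actually spawned,
    // so no uninitialized Child value is ever read or dropped.
    let mut slots: Vec<Option<Child>> = (0..4).map(|_| None).collect();
    slots[0] = Some(Command::new("true").spawn()?);
    for slot in &mut slots {
        if let Some(child) = slot.as_mut() {
            child.wait()?; // reap whichever children were spawned
        }
    }
    Ok(())
}
```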
I plan to solve the issue by buffering 64K worth of arguments at a time and writing them to an unprocessed file on disk in reverse-newline-delimited order, then creating an iterator that buffers 64K worth of arguments at a time and truncates the unprocessed file after reading arguments. As arguments are completed, they will be written to a processed file. This should let me retain the ability to determine the total number of jobs and to get the Nth job, whether it's currently in memory or in either the processed or unprocessed file, while keeping memory usage very low. Once everything is working, I'll benchmark the program with perf stat and time to measure memory consumption and cycles/time spent, and tune the size of the buffer to reduce the number of syscalls.
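A rough sketch of the writing half of that design; the file name, function name, and buffer handling are illustrative assumptions, not the project's actual code:

```rust
use std::fs::OpenOptions;
use std::io::{BufWriter, Write};

const BUFFER_CAP: usize = 64 * 1024; // batch writes in 64K chunks

fn spill_arguments<I>(inputs: I) -> std::io::Result<()>
where
    I: Iterator<Item = String>,
{
    // Hypothetical "unprocessed" file holding newline-delimited arguments.
    let file = OpenOptions::new()
        .create(true)
        .append(true)
        .open("unprocessed")?;
    let mut writer = BufWriter::with_capacity(BUFFER_CAP, file);
    for argument in inputs {
        writer.write_all(argument.as_bytes())?;
        writer.write_all(b"\n")?;
    }
    writer.flush() // one syscall per 64K rather than one per argument
}
```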
from parallel.
I'm pretty close to resolving this problem. The issue of memory is fixed with my local changes now that inputs are being buffered to and from the disk as byte arrays. However, I've yet to resolve the issue of OS error 11, which is being caused by Rust failing to close child processes for some reason. I'm not sure how to ensure that child processes are closed, so I'm asking the community for help with this issue.
from parallel.
Processes that you're done working with stay around as zombie processes; this means that they have terminated, and the only thing left of them is an entry in the process table and the exit code. As soon as the exit code is read by parallel, the process table entry will be gone. This is done by the waitpid syscall, and I believe the appropriate function to call from Rust is https://doc.rust-lang.org/std/process/struct.Child.html#method.wait
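For example, a minimal sketch of reaping a child with Child::wait:

```rust
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    let mut child = Command::new("echo")
        .arg("hello")
        .stdout(Stdio::null())
        .spawn()?;
    // wait() reads the exit status (via waitpid under the hood),
    // which removes the zombie's entry from the process table.
    let status = child.wait()?;
    println!("child exited with: {}", status);
    Ok(())
}
```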
from parallel.
I've been able to fix it by borrowing the Child process as a mutable reference and then borrowing the child's fields with the as_mut() method. Previously, I was not able to use the wait() method because it caused a borrow checker conflict with the child's fields being borrowed.
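Roughly, the pattern looks like this (a sketch, not the project's exact code); the mutable borrow of the stdout field ends before wait() needs to borrow the whole Child:

```rust
use std::io::Read;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    let mut child = Command::new("echo")
        .arg("hi")
        .stdout(Stdio::piped())
        .spawn()?;
    let mut output = String::new();
    if let Some(stdout) = child.stdout.as_mut() {
        stdout.read_to_string(&mut output)?;
    } // borrow of child.stdout ends here
    child.wait()?; // now the whole Child can be mutably borrowed again
    print!("{}", output);
    Ok(())
}
```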
It will still be a while before I push the fixes though. I'm in the middle of refactoring a large portion of the code I've written so far, which has caused some bugs that I'm having to track down.
from parallel.
The good news is that I just successfully processed 100,000 inputs, seq 1 100000, using only 13 MB according to the maximum resident set size reported by time.
from parallel.
Later today I'll have the changes landed for you to test out. It's going to be quite the update.
20 files changed, 5265 insertions(+), 612 deletions(-)
And some benchmarks:
Rust Parallel
~/D/parallel (master) $ seq 1 10000 | time -v target/release/parallel echo > /dev/null
Command being timed: "target/release/parallel echo"
User time (seconds): 0.48
System time (seconds): 2.48
Percent of CPU this job got: 59%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.93
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 12928
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 2198164
Voluntary context switches: 73174
Involuntary context switches: 36678
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
GNU Parallel
~/D/parallel (master) $ seq 1 10000 | time -v parallel echo > /dev/null
Command being timed: "parallel echo"
User time (seconds): 97.04
System time (seconds): 29.17
Percent of CPU this job got: 232%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:54.17
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 66848
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 15070207
Voluntary context switches: 250452
Involuntary context switches: 113320
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
from parallel.
The new release has been made, so you can try it out to see if it's working as you like.
from parallel.
It doesn't seem to leave a lot of zombie processes around anymore. Huzzah!
I've tried it on 10,000 files so far, and it's now significantly slower than GNU parallel on a simple 'cat' workload. The command line is:
find '/folder/with/lots/of/text/files/' -type f | head -n 10000 | parallel -j 6 cat '{}' > /dev/null
Runtime and peak memory usage for each:
rust: 1:40, 36.8 MiB
rust, --no-shell: 1:36, 58.9 MiB
gnu: 0:31, 14.6 MiB
Additionally, the memory usage for Rust parallel grows over time while GNU parallel uses a fixed amount of memory.
For the record, the regular use case for this is piping all that stuff to grep instead of /dev/null to get aggregate statistics for the entire dataset.
from parallel.
This issue is resolved by the 0.5.0 release.
from parallel.
Shall I open another issue for the lack of performance parity with GNU?
from parallel.
It should be opened as a bug. I'm guessing that memory consumption is rising because the standard output and error of each task are temporarily buffered in memory and only dropped after that process has had its turn being printed. The solution will be to modify the piping to use the DiskBuffer mechanism I created for inputs.
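A sketch of the general idea, assuming the tempfile crate for the backing file; the project's actual DiskBuffer API isn't shown here:

```rust
use std::io::{self, copy, Seek, SeekFrom};
use std::process::{Command, Stdio};

fn main() -> io::Result<()> {
    let mut child = Command::new("seq")
        .args(&["1", "5"])
        .stdout(Stdio::piped())
        .spawn()?;
    // Stream the task's output straight to an unlinked temp file on
    // disk instead of accumulating it in memory.
    let mut buffer = tempfile::tempfile()?;
    copy(child.stdout.as_mut().unwrap(), &mut buffer)?;
    child.wait()?;
    // When this task's turn comes, replay the file to stdout.
    buffer.seek(SeekFrom::Start(0))?;
    copy(&mut buffer, &mut io::stdout())?;
    Ok(())
}
```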
from parallel.
I think you'll find with the latest version, 0.7.0, the issue of memory consumption has been thoroughly resolved.
from parallel.