
Comments (3)

AlbanSagouis commented on September 25, 2024

Well actually, gnparser by default runs on as many cores as there are on the system, see here:

-j, --jobs (positive integer, default is a number of CPUs on a machine)

The number of jobs running concurrently. This flag is ignored when parsing one name:

gnparser -j 200 names.txt

So I think we should offer the user the possibility to change this value but, if possible, we should let gnparser determine the number of threads by itself. Maybe rgnparser::gnparser_cmd() has to pass this argument?

I see two use cases where users might want to set the number of cores themselves:

  • they are already running the function on a distributed pipeline and want to use only one thread
  • they don't want to saturate their system

Additional point: whether we implement this or not, running several parallel calls to rgnparser::gn_parse(x, threads = 4) at the same time on any system seems wrong. Maybe we should warn users that gnparser runs in parallel by default? Especially if we detect that the system is an HPC?

from rgnparser.

AlbanSagouis commented on September 25, 2024

Oh... actually, threads defaults to 4 in gn_parse_tidy() but not in gn_parse(), where it is NULL and works correctly, so that solves most of my concerns.

What remains is the question of using several cores when running parallel calls to rgnparser::gn_parse(): is there something to worry about? Should we try to detect these situations and warn users?
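
To make the concern concrete, here is a rough sketch of what such a check could look like. It is written in Python purely for illustration and is not part of rgnparser; the function name `check_oversubscription` and its parameters are hypothetical. It assumes that a call with no thread count uses one job per CPU, as the gnparser help text quoted above says.

```python
import os
import warnings


def check_oversubscription(n_workers, threads_per_call, n_cores=None):
    """Warn and return True if workers x threads exceeds the available cores.

    Hypothetical helper, not an rgnparser function.
    threads_per_call=None models gnparser's default of one job per CPU.
    """
    if n_cores is None:
        # os.cpu_count() can return None, so fall back to a single core
        n_cores = os.cpu_count() or 1
    effective = n_cores if threads_per_call is None else threads_per_call
    total = n_workers * effective
    if total > n_cores:
        warnings.warn(
            f"{n_workers} parallel calls x {effective} threads = {total} "
            f"threads requested on {n_cores} cores"
        )
        return True
    return False
```

With 8 cores, four parallel calls at `threads = 4` would ask for 16 threads and trigger the warning, while four calls at one thread each would not. Detecting that we are inside someone else's parallel pipeline is the hard part and is not covered here; the caller would have to supply `n_workers`.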


joelnitta commented on September 25, 2024

I think the correct course of action here is to set the default number of threads for gn_parse() and gn_parse_tidy() to 1. Setting it to NULL could then correspond to gnparser's own default (i.e., the number of threads available on the machine).

Reasons:

  • gnparser is already quite fast (approx 9,000 names / sec on one thread), which is probably fine for most use cases.
  • It can be problematic for software to automatically use all threads especially on shared machines.
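
The proposed behaviour could be sketched roughly as follows. This is an illustrative Python sketch, not rgnparser's actual code; the helper name `gnparser_args` is hypothetical. The only things taken from the thread are the `-j` flag and the `gnparser -j 200 names.txt` call shape. The idea is that a default of 1 passes `-j 1`, while NULL (here `None`) omits the flag so that gnparser picks its own default.

```python
def gnparser_args(path, threads=1):
    """Build a gnparser command line for a file of names.

    Hypothetical helper illustrating the proposal:
    threads=1 by default; threads=None leaves out -j entirely,
    so gnparser falls back to its own default (number of CPUs).
    """
    if threads is None:
        return ["gnparser", path]
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(threads, bool) or not isinstance(threads, int) or threads < 1:
        raise ValueError("threads must be a positive integer or None")
    return ["gnparser", "-j", str(threads), path]
```

Called as `gnparser_args("names.txt", threads=200)`, this produces the same command as the example quoted from the gnparser help text earlier in the thread.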

