Comments (9)

rj-patrick commented on August 21, 2024

Sorry for the slow response. I wouldn't expect three datasets to take a long time to merge. How many peaks are in each of the files?

from sierra.

alanlamsiu commented on August 21, 2024

Hi @rj-patrick. We are facing a similar issue. With a total of 37 samples, the merge step has been running for three weeks and has not finished yet. There are more than 2.8 million peaks across all samples. "ncores" is set to 50. Do you have any insights regarding our situation? Thanks.

rj-patrick commented on August 21, 2024

Hi @alanlamsiu,
Thanks for that. Do you have output from the MergePeaks function? It would be helpful to know how far it has gotten. Three weeks is too long, but this is a much larger dataset than what we've tested it on. There are a couple of potential issues, but the output would be the best way to diagnose the problem.
Cheers,
Ralph

alanlamsiu commented on August 21, 2024

Thanks @rj-patrick for getting back.
I can see that the log file keeps growing; it now has 27,482,128 lines, yet the output specified by "output.file" has not been generated. Based on the log file, I guess that "internal peak merging" is done for all 37 samples, and the messages "Comparing peaks from to remaining data-sets" have already been printed. For computing resources, the average memory usage is 4.5 GB and peak CPU usage is 39.05.

I recall that when I ran a test using five samples, with a total of more than 490 thousand peaks, it took less than a week to complete.

Please let me know if you need more details.

Thanks.

rj-patrick commented on August 21, 2024

Thanks, to clarify, the message "Comparing peaks from [DATASET X] to remaining data-sets" is printed for how many of the 37 datasets?
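
With a log of millions of lines, one way to answer this is to scan it for that message and collect the distinct dataset names. The following is a minimal sketch in Python, assuming the log contains the message exactly as quoted in this thread; the pattern, the function name `datasets_compared`, and the sample lines are illustrative assumptions, not Sierra output.

```python
import re
from typing import Iterable, List

# Pattern built from the message quoted in this thread (assumption:
# the real log uses exactly this wording around the dataset name).
PATTERN = re.compile(r"Comparing peaks from (.*?) to remaining data-sets")


def datasets_compared(lines: Iterable[str]) -> List[str]:
    """Return the dataset names seen in the message, in order of first appearance."""
    seen = {}  # dict preserves insertion order and drops duplicates
    for line in lines:
        match = PATTERN.search(line)
        if match:
            seen.setdefault(match.group(1).strip(), None)
    return list(seen)


# Made-up log lines for illustration only.
sample_log = [
    "Comparing peaks from sampleA to remaining data-sets",
    "some other progress message",
    "Comparing peaks from sampleB to remaining data-sets",
    "Comparing peaks from sampleA to remaining data-sets",
]
print(len(datasets_compared(sample_log)))
```

For a real run, the same function can be given an open file handle instead of a list, so the log is streamed line by line rather than loaded into memory.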

alanlamsiu commented on August 21, 2024

The message has been printed for all 37 datasets.

rj-patrick commented on August 21, 2024

Thanks. I think I know where the problem is. There is a final step where peaks are iteratively checked for merging, but with your dataset, my guess is it's getting stuck in a loop. Perhaps the best thing at this point would be to merge whatever is remaining, but for now I've set a limit on the number of iterations to go through. Pull the latest update and see if that fixes the issue.
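
To illustrate the general idea of capping an iterative merge, here is a toy sketch in Python. It is not Sierra's implementation: the peak representation (`(start, end)` tuples), the pairwise merge rule, and the names `merge_pass`, `merge_until_stable`, and `max_iter` are all assumptions made for the example.

```python
from typing import List, Tuple

Peak = Tuple[int, int]  # (start, end); hypothetical representation


def merge_pass(peaks: List[Peak]) -> List[Peak]:
    """One pass: merge each overlapping neighbouring pair at most once."""
    peaks = sorted(peaks)
    merged: List[Peak] = []
    i = 0
    while i < len(peaks):
        if i + 1 < len(peaks) and peaks[i + 1][0] <= peaks[i][1]:
            # Overlap: combine the pair and skip past both peaks.
            merged.append((peaks[i][0], max(peaks[i][1], peaks[i + 1][1])))
            i += 2
        else:
            merged.append(peaks[i])
            i += 1
    return merged


def merge_until_stable(peaks: List[Peak], max_iter: int = 10) -> Tuple[List[Peak], bool]:
    """Repeat passes until nothing changes or max_iter passes have run.

    Returns the peaks and a flag saying whether the result is stable.
    """
    current = sorted(peaks)
    for _ in range(max_iter):
        updated = merge_pass(current)
        if updated == current:
            return current, True
        current = updated
    return current, False  # cap reached; remaining peaks may still be mergeable


peaks = [(1, 5), (4, 8), (7, 12), (20, 25)]
print(merge_until_stable(peaks))
print(merge_until_stable(peaks, max_iter=1))
```

The cap guarantees the loop ends after a bounded number of passes, at the cost of possibly returning peaks that could still be merged, which is why the sketch reports whether it converged.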

alanlamsiu commented on August 21, 2024

Thanks @rj-patrick for the fix. I tried the updated version, and the run finished within a few days. I think I am good to go with it.

cghchuwudai commented on August 21, 2024

Thank you for solving the problem!
