Comments (9)

warrenlr avatar warrenlr commented on August 19, 2024

Is the arcs_test_original.gv file empty?

from arcs.

stephenrdoyle avatar stephenrdoyle commented on August 19, 2024

No. 422155 lines of the following.

head arcs_test_original.gv
graph G {
0 [id=1];
1 [id=58];
2 [id=107];
3 [id=610];
4 [id=691];
5 [id=744];
6 [id=1019];
7 [id=1176];
8 [id=1327];

tail arcs_test_original.gv
446--4157 [label=2, weight=1];
4156--7197 [label=0, weight=1];
11166--2331 [label=0, weight=3];
10678--10475 [label=0, weight=1];
19656--7384 [label=0, weight=1];
4235--4157 [label=2, weight=1];
5964--4768 [label=3, weight=1];
4768--2541 [label=2, weight=1];
19060--428 [label=2, weight=1];
}

sarahyeo avatar sarahyeo commented on August 19, 2024

How are the headers of your FASTA file formatted? The script makeTSVfile.py expects headers to be in the form:

>12345 other_info

where 12345 is a unique sequence id.

If your seq IDs contain non-digit characters (e.g. >abc123efg4), ARCS will try to extract all the digits from the name (abc123efg4 becomes 1234).

If this is the case, does replacing lines 24-26 in makeTSVfile.py with:

test = re.findall(r'\d+', line.rstrip().split()[0])
counter += 1
links_numbering["".join(test)] = str(counter)

fix the problem?
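
To see what the digit extraction does to a given set of headers, here is a minimal standalone sketch. It is not part of ARCS: the helper names `digits_only_id` and `find_collisions` are made up for illustration, and the only thing taken from the snippet above is the `re.findall(r'\d+', ...)` call on the first token of the header. It also shows how two different headers can collapse to the same digit string, which is one way IDs could end up inconsistent.

```python
import re


def digits_only_id(header_line):
    """Join every run of digits found in the first token of a FASTA header.

    Hypothetical helper: mirrors the re.findall call quoted above,
    not the actual makeTSVfile.py code.
    """
    first_token = header_line.rstrip().split()[0]
    return "".join(re.findall(r"\d+", first_token))


def find_collisions(headers):
    """Return {digit_id: [headers...]} for digit IDs shared by several headers."""
    seen = {}
    for header in headers:
        seen.setdefault(digits_only_id(header), []).append(header)
    return {key: group for key, group in seen.items() if len(group) > 1}


# Illustrative headers only, not taken from any real assembly.
headers = [">12345 other_info", ">abc123efg4", ">abc12efg34"]

for header in headers:
    print(header, "->", digits_only_id(header))

print("collisions:", find_collisions(headers))
```

If `find_collisions` reports anything for your real headers, digit extraction alone cannot give each sequence a unique ID.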

stephenrdoyle avatar stephenrdoyle commented on August 19, 2024

sarahyeo avatar sarahyeo commented on August 19, 2024

If you renumbered the headers to contain only a unique ID, you will have to rerun ARCS using the renumbered FASTA to ensure the seq IDs remain consistent throughout the pipeline. Likewise, you may need to rerun the alignments to generate a BAM file with the correct renumbered seq IDs to input to ARCS.
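
For reference, renumbering could look something like the sketch below. This is an assumption about one reasonable way to do it, not a script shipped with ARCS: `renumber_fasta` is a hypothetical name, and the function simply replaces each header with a sequential integer while keeping a map back to the original header text.

```python
def renumber_fasta(lines):
    """Replace each FASTA header with a sequential integer ID.

    Hypothetical helper, not part of ARCS. Returns the rewritten lines
    and a dict mapping each new ID (as a string) to the original header
    text, so results can be traced back to the original names.
    """
    renamed = []
    id_map = {}
    counter = 0
    for line in lines:
        if line.startswith(">"):
            counter += 1
            id_map[str(counter)] = line[1:].strip()
            renamed.append(">%d" % counter)
        else:
            renamed.append(line.rstrip("\n"))
    return renamed, id_map


# Illustrative input only.
example = [">abc123efg4 len=8", "ACGTACGT", ">abc12efg34", "GGCC"]
new_lines, id_map = renumber_fasta(example)
print("\n".join(new_lines))
print(id_map)
```

The same renumbered FASTA would then need to be the reference for the alignments and for the ARCS run, as described above.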

stephenrdoyle avatar stephenrdoyle commented on August 19, 2024

sarahyeo avatar sarahyeo commented on August 19, 2024

Yes, that might be the case. I just pushed a bug fix to the "develop" branch which allows the use of strings for seq IDs instead of ints. Let me know if that resolves the issue!

DeWitP avatar DeWitP commented on August 19, 2024

Hi Sarah and others, I am trying to run the makeTSVfile.py script (using the new version from the develop branch). I am not getting any error messages, but it is taking a very long time: I had to abort the last run after 5 days. Is it normal for it to take this long? My input graph file is not huge (around 15,000 lines), but my assembly is quite fragmented.
It would be great to hear about your experiences running this script! Thanks!

lcoombe avatar lcoombe commented on August 19, 2024

Closing this old issue -- feel free to re-open if you still have questions.
