Giter VIP home page Giter VIP logo

Comments (8)

ksahlin avatar ksahlin commented on July 19, 2024 1

I see :) let’s see if this happens for others and I'll revisit then.

from ultra.

ksahlin avatar ksahlin commented on July 19, 2024

Do you have the possibility to share/link to your test data so that I can see if I can reproduce it?

from ultra.

sguizard avatar sguizard commented on July 19, 2024

I recreated the alz.chunk3_tama.fa fasta file.
You'll find everything here.

from ultra.

ksahlin avatar ksahlin commented on July 19, 2024

Thank you! downloaded the data and ran with your command on my MacBook. I could not reproduce the bug.

I ran both with head of master and head of develop (commit c0ac67c and commit 8313e9b, respectively)

This is my call:

./uLTRA     pipeline     --t 2     --prefix alz.chunk3     --isoseq --disable_infer     ~/Downloads/ultra_bug/Homo_sapiens.GRCh38.dna.chromosome.19.fasta     ~/Downloads/ultra_bug/Homo_sapiens.GRCh38.104.chr.13_18_19.gtf     ~/Downloads/ultra_bug/alz.chunk3_tama.fa     ~/Downloads/ultra_bug/ultra_out

This is my output log of uLTRA:

creating /Users/kxs624/Downloads/ultra_bug/ultra_out
total_flanks2: 20352
total_flank_size 20395551
total_unique_segment_counter 7303161
total_segments_bad 5517888
bad 89736
total parts size: 7391334
total exons size: 20040918
min_intron: 1
Number of ref seqs in gff: 15570
Number of ref seqs in fasta: 1
6622546.0 Unique kmers in reference part sequences with abundance > 1
AAAAAAAAAAAAAAAAAAAA 538
TTTTTTTTTTTTTTTTTTTT 496
CCTCCCAAAGTGCTGGGATT 385
GCCTCCCAAAGTGCTGGGAT 382
[...]
ACGGTGAAACCCCGTCTCTA 200
GAGTGCAGTGGCGTGATCTC 200
AAAAAATTAGCCGGGCGTGG 200
CCAAAGTGCTAGGATTACAG 200
CACACCCAGCTAATTTTTGT 200
GATTGCGCCACTGCACTCCA 200
CCAGGAGGTGGAGGTTGCAG 200
11038 11038 out of 20352 sequences has been modified in masking step.
<multiprocessing.context.SpawnContext object at 0x7fc20382c820>
Environment set: <multiprocessing.context.SpawnContext object at 0x7fc20382c820>
Using 2 cores.
Filtering reads aligned to unindexed regions with minimap2 
Done filtering. Reads filtered:21
batch nt: 97383 total_nt: 194765
37
36
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Time for slaMEM to find mems:45.91997790336609 seconds.
Starting aligning reads.
Nr reads: 73 nr batches: 2 [37, 36]
READ 37 RECORDS.
READ 37 RECORDS.
Number of instances solved with quadratic collinear chainer solution: 84
Number of instances solved with n*log n collinear chainer solution: 15
READ 36 RECORDS.
READ 36 RECORDS.
Number of instances solved with quadratic collinear chainer solution: 103
Number of instances solved with n*log n collinear chainer solution: 8
Time elapesd multiprocessing: 1.5507452487945557
Time to align reads:1.5510809421539307 seconds.
Time to merge SAM-files:0.00915384292602539 seconds.
21 reads were not attempted to be aligned with ultra, instead alternative aligner was used.
6 reads with primary alignments were replaced with alternative aligner because they were unaligned with uLTRA.
1 primary alignments had best fit with uLTRA.
4 primary alignments had equal fit.
3 primary alignments had best fit with alternative aligner.
Table of score-difference between alignment methods (negative number: alternative aligner better fit, positive number is uLTRA better fit)
Score is calculated as sum(identities) - sum(ins, del, subs)
Format: Score difference: Number of primary alignments 
[-4294967296 - -100): 2
[-100 - -50): 0
[-50 - -20): 1
[-20 - -10): 0
[-10 - -5): 0
[-5 - -4): 0
[-4 - -3): 0
[-3 - -2): 0
[-2 - -1): 0
[-1 - 0): 0
[0 - 1): 4
[1 - 2): 0
[2 - 3): 0
[3 - 4): 0
[4 - 5): 0
[5 - 10): 0
[10 - 20): 0
[20 - 50): 0
[50 - 100): 0
[100 - 4294967296): 1
defaultdict(<class 'int'>, {'unaligned': 65, 'FSM': 4, 'NO_SPLICE': 2, 'Insufficient_junction_coverage_unclassified': 1, 'NIC_novel': 1})
total alignment coverage: 7.350311311286634
Deleting temporary files...
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/seeds_batch_0_rc.txt
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/seeds_batch_1.txt
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/seeds_batch_0.txt
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/seeds_batch_1_rc.txt
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stdout_1.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stderr_0_rc.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stderr_0.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stderr_1_rc.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stdout_0_rc.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stderr_1.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stdout_1_rc.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/slamem_stdout_0.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_0.fa
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_1_rc.fa
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_1.fa
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_0.sam
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_1.sam
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_batch_0_rc.fa
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/minimap2.sam
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/minimap2_errors.1
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/uLTRA_batch_0.stderr
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/uLTRA_batch_1.stderr
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/reads_after_genomic_filtering.fasta
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/indexed.sam
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/unindexed.sam
removed: /Users/kxs624/Downloads/ultra_bug/ultra_out/refs_sequences.fa
Done.

from ultra.

ksahlin avatar ksahlin commented on July 19, 2024

My version of python (that executes uLTRA) is Python 3.8.2.

from ultra.

sguizard avatar sguizard commented on July 19, 2024

I'm not surprised, it seems to be very random :/
On the 22 To of isoseq a analyzed, I noticed some failing uLTRA jobs, but each time they succeeded after resubmit.
uLTRA is install packed with python 3.9.13.

from ultra.

ksahlin avatar ksahlin commented on July 19, 2024

ouch..

I wonder if it is because of slaMEM that uLTRA fails (as this is a package I have no control/understanding about). If --use_NAM_seeds option is used, which calls StrobeMap instead, I would be the developer of all the steps in uLTRA, but --use_NAM_seeds is slightly lower in accuracy than the default. Still debating if it is worth the tradeoff (would probablyneed eval on more datasets to make this decision).

from ultra.

sguizard avatar sguizard commented on July 19, 2024

From my point of view, it's not a big deal.
It's a very small number of failing job, 0, 1, 2, 3 on 300 jobs. And as I said, they succeed at the second try.
The only annoying point (in nf-core/isoseq context) is that the job do not stop and is killed by the walltime.
It won't stop me to use it :)

from ultra.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.