
Comments (11)

rob-p commented on June 10, 2024

Ok, SAM output of the mapping information is available in 0.7.2 :).

from salmon.

rob-p commented on June 10, 2024

@rbenel,

If you are using a version prior to 0.14.0, you will also have to pass --no-version-check to avoid contamination of stdout by the versioning message.


rob-p commented on June 10, 2024

Hi @roryk,

Salmon doesn't currently have the ability to output a pseudobam, but that is definitely possible (and not too difficult). We have a related feature planned; perhaps you could tell me if it suits your use case. However, first, I should mention that if you'd simply like a pseudobam for all the mapping locations of the reads, you can use RapMap. RapMap implements the quasi-mapping algorithm upon which Salmon and Sailfish are based (and RapMap is used as a library in the Salmon and Sailfish codebases). Given an index and set of reads, RapMap will report all of the multi-mapping locations that Salmon and Sailfish would consider during quantification.

The other feature we have in the works is to have Salmon optionally output a .bam file (with actual alignments) post-quantification. It turns out that, given the quasi-mapping information and the quantification results, taking the extra step from quasi-mapping to an actual alignment can be done fairly efficiently. In this mode, Salmon would make one more pass over the reads and, considering the estimated abundances, sample a single alignment for each multi-mapping read proportional to the relative abundance of the different multi-mapping targets (i.e. it would perform a sampling over the multi-mapping locations that would, in expectation, give the same abundances as the soft assignments computed by the optimization algorithm). This feature will be very useful for transrate. However, given that your goal is to use outside information to perform the filtering yourself, this option may not be ideal for you.
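To make the sampling idea above concrete, here is a minimal sketch of choosing one target for a multi-mapping read with probability proportional to its estimated abundance. It is only an illustration: `pick_target`, the `target:abundance` argument format, and the transcript names and abundances are all hypothetical and are not taken from Salmon's code. The uniform draw is passed in as an argument so the behaviour is easy to check by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of Salmon: pick one target for a
# multi-mapping read with probability proportional to its abundance.
# $1 is a uniform draw in [0,1); the remaining arguments are
# "target:abundance" pairs (an assumed format for this sketch).
pick_target() {
  local u=$1; shift
  printf '%s\n' "$@" | awk -F: -v u="$u" '
    { name[NR] = $1; w[NR] = $2; total += $2 }
    END {
      threshold = u * total        # scale the draw to the total abundance
      for (i = 1; i <= NR; i++) {
        cum += w[i]                # running cumulative abundance
        if (threshold < cum) { print name[i]; exit }
      }
      print name[NR]               # fallback for rounding at the upper edge
    }'
}

# Made-up abundances: txB is three times as abundant as txA, so draws
# below 0.25 select txA and the rest select txB.
low=$(pick_target 0.10 txA:1 txB:3)
high=$(pick_target 0.50 txA:1 txB:3)
echo "$low $high"
```

In real use the draw would be random, for example `pick_target "$(awk 'BEGIN { srand(); print rand() }')" txA:1 txB:3`. Over many reads, the fraction assigned to each target then approaches its relative abundance.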


vals commented on June 10, 2024

Is removing the read name a new thing in Kallisto? I counted UMIs with Kallisto pseudobams a while ago. Here's one record from that pseudobam:

SRR1545849.1:CELL_A12:UMI_CTGGCA    0   ERCC-00002  1   255 8S33M   *   0   0   GGGAATTCTCCAGATTACTTCCATTTCCGCCCAAGCTGCTC   ggcccccbccccbbcbcccccbbcdd]__a_ccc_^bb`b_   NH:i:1

Regarding what you want to do, you should look at RapMap: https://github.com/COMBINE-lab/RapMap

It's the pseudoalignment part of Salmon without the quantification. I'm using it in my single cell processing: https://github.com/vals/umis

Are you using something that puts UMIs on cDNA fragments? I ask because Salmon and Kallisto assume coverage over the full transcript.


roryk commented on June 10, 2024

Hi Valentine,

You're totally right about kallisto; I was looking at the wrong FASTQ file when comparing the names. Thanks for the RapMap tip and for pointing to your repo. Maybe we can roll that into bcbio-nextgen.

Rob, it sounds like what you are suggesting would be great. Having an actual alignment would be so useful.


rob-p commented on June 10, 2024

The implementation to output mapping information from within salmon (not yet full alignments) is almost complete. The feature needs some testing, but it will definitely make it into the next release.


InesdeSantiago commented on June 10, 2024

@rob-p hi,
Any chance of supporting BAM output in future releases? In some cases (e.g. TCGA files) the SAM output can be really huge...


mdshw5 commented on June 10, 2024

You can just add a pipe to samtools to your existing command: --writeMappings | samtools view -Sb - | samtools sort -T sort.tmp -o - > out.bam


rbenel commented on June 10, 2024

@mdshw5, did that work for you? I am trying to pipe samtools to my current salmon quant command, using the same line that you wrote above and it is not successful.
salmonCommand="${salmon} quant -i $index -l SF -r ${sample1} -p 8 -o ${output_folder}/${basename_sample}_quant --writeMappings | samtools view -Sb - | samtools sort -T sort.tmp -o - > ${output_folder}/${basename_sample}_quant/${basename_sample}.bam"


mdshw5 commented on June 10, 2024

That makes sense. I’ve always been disabling the version check.


rbenel commented on June 10, 2024

@rob-p @mdshw5
Thanks for the quick response! For this specific run I am running an older version.
However, even with the --no-version-check flag I can't seem to pipe writeMappings to samtools (same command as above).

(mapping-based mode) Exception : [unrecognised option '-b'].
Please be sure you are passing correct options, and that you are running in the intended mode.
alignment-based mode is detected and enabled via the '-a' flag. Exiting.

On the other hand, --writeMappings=output.sam works fine. I would just like to save the hassle of converting all of the .sam files to .bam files following the run ...
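A note on the error above: the thread does not show how the salmonCommand string is executed, so the following is an assumption rather than a confirmed diagnosis. If the string is run by plain variable expansion (for example `$salmonCommand`), bash does not re-parse the `|` as a pipe. Everything after it is passed to the first program as ordinary arguments, which would be consistent with salmon complaining about samtools' `-b` option. The sketch below shows the behaviour with stand-ins only: `echo` plays the role of salmon and `tr` the role of samtools.

```shell
#!/usr/bin/env bash
# Stand-in demo: `echo` plays the role of salmon, `tr` the role of samtools.

# 1) Pipeline stored in a string and expanded unquoted: the `|` is NOT
#    treated as a pipe; it and everything after it become arguments to echo.
cmd="echo mapped reads | tr a-z A-Z"
as_string=$($cmd)

# 2) The same pipeline wrapped in a function: the shell parses the pipe
#    when the function is defined, so the output really flows through tr.
run_pipeline() {
  echo mapped reads | tr a-z A-Z
}
as_function=$(run_pipeline)

printf 'string variable: %s\n' "$as_string"     # mapped reads | tr a-z A-Z
printf 'function:        %s\n' "$as_function"   # MAPPED READS
```

If that is what is happening here, writing the salmon-to-samtools pipeline directly in the script, or wrapping it in a function as in the second case, should let the shell set up the pipe as intended.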

