Comments (8)

valeu commented on September 2, 2024

Do you use window=0 or window=500?
readCountThreshold is applied to the number of read starts in a window or exon in the control dataset. If you see strange behavior from FREEC, could you please send me an example by email?
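
To make the thresholding described above concrete, here is a minimal sketch of filtering regions by their read count in the control. The function name, the dictionary layout, and the use of `>=` for the comparison are assumptions made for illustration; they are not taken from the FREEC source.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end); hypothetical layout


def regions_passing_threshold(
    control_counts: Dict[Region, int], read_count_threshold: int
) -> List[Region]:
    """Keep regions whose control read count reaches the threshold.

    Assumption: the comparison is inclusive (>=); the real tool may differ.
    """
    return [
        region
        for region, count in control_counts.items()
        if count >= read_count_threshold
    ]


# Made-up control counts for three capture regions.
control_counts = {
    ("chr1", 1000, 1120): 30,
    ("chr1", 1200, 1320): 48,
    ("chr1", 5000, 5120): 120,
}

passing_25 = regions_passing_threshold(control_counts, 25)
passing_50 = regions_passing_threshold(control_counts, 50)
```

With these made-up numbers, all three regions pass a threshold of 25, while only the last one passes a threshold of 50.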

from freec.

igordot commented on September 2, 2024

I use window=0. I think that means the regions should match the BED file. With 500, regions should be 500 bp windows.

I think the point about the read starts may explain my problem. I am using the capture probes as the regions BED file. Most DNA fragments should be larger than the capture probes, so most of the read starts would fall outside my given regions. Should I be padding the regions to avoid this? Then I would have a lot of overlapping regions due to adjacent probes. Is that acceptable, or should they be merged?
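
For reference, padding the probe regions and then merging the resulting overlaps could look roughly like the sketch below. It assumes BED-style half-open coordinates; the function name, the 100 bp padding, and the probe coordinates are made up for illustration. Whether merged regions are the right input for FREEC is exactly the open question here.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), BED-style half-open


def pad_and_merge(regions: List[Interval], pad: int) -> List[Interval]:
    """Extend each region by `pad` bp on both sides, then merge overlaps."""
    # Clamp the padded start at 0 so it never goes negative.
    padded = sorted((c, max(0, s - pad), e + pad) for c, s, e in regions)
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous region: extend it.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical probes: the first two become overlapping once padded.
probes = [("chr1", 1000, 1120), ("chr1", 1200, 1320), ("chr1", 5000, 5120)]
merged = pad_and_merge(probes, pad=100)
```

Here the first two probes collapse into one region (900 to 1420) and the third stays separate (4900 to 5220).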

valeu commented on September 2, 2024

In previous versions of FREEC, with window=0, regions were extended by something like 150 so that read starts would fall within the regions surrounding exons. However, I can no longer find this piece of code in the latest version. Apparently, it was removed by one of my collaborators when he was rewriting some functions. :-/ I am really sorry about this. I should be back in the lab on Monday and will put it back. Thanks a lot for pointing this out.

valeu commented on September 2, 2024

So I read the code of v11.0; here is how it works now:

  • For a single-end BAM: a read is assigned to an exon if it overlaps the exon (the whole read is considered, not just its start).
  • For a paired-end BAM: a read is assigned to an exon if the corresponding fragment overlaps the exon.
  • For a pileup input file (instead of BAM): the read is treated as 150 bp long and is assigned to an exon if it overlaps it.
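
The three rules above can be sketched as simple interval tests. This is only an illustration of the described behavior, not the FREEC code: the function names are hypothetical, coordinates are assumed to be half-open and on the same chromosome, strand is ignored, and the pileup read is assumed to extend forward from its start.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end), half-open

PILEUP_READ_LENGTH = 150  # read length assumed for pileup input, per the list above


def overlaps(a: Span, b: Span) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def assigned_single_end(read: Span, exon: Span) -> bool:
    # Single-end: the whole read is tested, not just its start.
    return overlaps(read, exon)


def assigned_paired_end(mate1: Span, mate2: Span, exon: Span) -> bool:
    # Paired-end: test the fragment spanned by both mates.
    fragment = (min(mate1[0], mate2[0]), max(mate1[1], mate2[1]))
    return overlaps(fragment, exon)


def assigned_pileup(read_start: int, exon: Span) -> bool:
    # Pileup: only the start is known, so a fixed length is assumed.
    return overlaps((read_start, read_start + PILEUP_READ_LENGTH), exon)


exon = (1000, 1120)
mate1, mate2 = (800, 900), (1150, 1250)  # made-up mates flanking the exon

se_hits = (assigned_single_end(mate1, exon), assigned_single_end(mate2, exon))
pe_hit = assigned_paired_end(mate1, mate2, exon)
pileup_hits = (assigned_pileup(900, exon), assigned_pileup(800, exon))
```

In this example neither mate overlaps the exon on its own, but the fragment they span does, so the pair is assigned to the exon only under the paired-end rule.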

If you run FREEC with mateOrientation=0, all PE reads are considered as SE.

If you think this is not what happens with your data, please share the corresponding BAM file and .cnp file and I will have a look.

igordot commented on September 2, 2024

Is there a good way to share the BAMs since they are so large?

Here is an example screenshot (maybe you can see a problem from that alone):
image
The first two regions are present with the 25x cutoff, but not with the 50x cutoff. All of them have coverage above 50x or even 100x.

valeu commented on September 2, 2024

Igor, I believe that in the corresponding _control.cnp you have values below 50 for these regions, right?
Do you use mateOrientation=0?

igordot commented on September 2, 2024

The first region has coverage below 25x at the beginning, but it still passes the 25x threshold. The second region has coverage above 50x everywhere, but only passes the 25x threshold.

These are paired-end reads and mateOrientation is set to FR.

Is it possible that overlapping mate pairs are counted as 2 for the IGV coverage track, but counted as 1 by Control-FREEC?

valeu commented on September 2, 2024

With mateOrientation=0, the reads in a pair are counted as 2. With mateOrientation=FR, each pair is counted as 1. I think I should mention this behavior in the README.
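
To illustrate the difference, here is a small sketch that counts the same pairs both ways. It is not the FREEC implementation: the function names, the half-open coordinates, and the example pairs are assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), half-open
Pair = Tuple[Span, Span]  # the two mates of one read pair


def overlaps(a: Span, b: Span) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def count_for_region(pairs: List[Pair], region: Span, pairs_as_one: bool) -> int:
    """Count reads for a region, per pair or per mate.

    pairs_as_one=True mimics counting each pair once via its fragment;
    pairs_as_one=False mimics treating every mate as a single-end read.
    """
    count = 0
    for mate1, mate2 in pairs:
        if pairs_as_one:
            fragment = (min(mate1[0], mate2[0]), max(mate1[1], mate2[1]))
            count += int(overlaps(fragment, region))
        else:
            count += int(overlaps(mate1, region)) + int(overlaps(mate2, region))
    return count


# 30 made-up pairs whose mates overlap each other inside the region.
pairs = [((1000, 1100), (1050, 1150))] * 30
region = (1040, 1110)

count_as_pairs = count_for_region(pairs, region, pairs_as_one=True)
count_as_single = count_for_region(pairs, region, pairs_as_one=False)
```

With these made-up pairs, the per-pair count is 30 and the per-mate count is 60, so a region could pass a threshold of 50 under one counting mode and fail it under the other.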
