
Comments (16)

davhum commented on August 21, 2024

Hi Saskia,

It appears that the chromosome entries in the GTF file do not match those in the genome index used for mapping. At this point I am not sure whether the "GL" contigs come from the BAM file or the GTF file. You should be able to work this out by examining the chromosome entries in the object "peak.merge.output.file". If they exist in this object, you could exclude them before passing it to the function. Alternatively, if they exist in the GTF file, you could bypass the error by editing the file.

This is not a very satisfactory workaround, so I'll work up a fix that handles it within the AnnotatePeaksFromGTF function.
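One way to run that check outside of R is sketched below. It compares the sequence names in a peak table with those in a GTF and reports which names appear on only one side. The helper names and the in-memory sample data are hypothetical; the "Chr" column name is taken from the peak table posted later in this thread, and nothing here is part of Sierra.

```python
import csv
import io


def seqnames_from_gtf(handle):
    """Collect the distinct values of column 1 (seqname) from GTF text."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def seqnames_from_peaks(handle, column="Chr"):
    """Collect the distinct chromosome names from a tab-separated peak table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[column] for row in reader}


# Tiny in-memory stand-ins for the real files (illustrative data only).
gtf_text = (
    '1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "A";\n'
    'MT\tsrc\texon\t5\t50\t.\t+\t.\tgene_id "B";\n'
    'GL000009.2\tsrc\texon\t5\t50\t.\t-\t.\tgene_id "C";\n'
)
peak_text = "Gene\tChr\tStrand\nA\t1\t1\nB\tM\t1\n"

gtf_names = seqnames_from_gtf(io.StringIO(gtf_text))
peak_names = seqnames_from_peaks(io.StringIO(peak_text))

only_in_peaks = sorted(peak_names - gtf_names)  # names the GTF lacks
only_in_gtf = sorted(gtf_names - peak_names)    # names the peaks lack
shared = sorted(peak_names & gtf_names)         # names present in both
```

With real data, the two `io.StringIO` objects would be replaced by open file handles for the peak file and the GTF.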

Cheers,
Dave

from sierra.

davhum commented on August 21, 2024

Hi Saskia,

Upon closer examination I now see that the error arises because none of your peaks fall within regions of the annotation file. I have added some code that will provide a better error message. The question now is whether you are using the right GTF file for your aligned data.

Perhaps this comes down to something simple, like the naming of chromosomes. Currently the annotation function within Sierra by default prepends "chr" to the chromosome names in the peak data. If your BAM file was built on a reference that already has a "chr" prefix, then this could be the problem (i.e. chr1 would then look like chrchr1). The solution in this case would be to set append.chr.peaks = FALSE in AnnotatePeaksFromGTF.
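The failure mode is easy to reproduce in isolation. The sketch below is not Sierra's implementation; both function names are hypothetical and only contrast an unconditional prefix with one that first checks whether the prefix is already present.

```python
def naive_prefix(seqname, prefix="chr"):
    """Always prepend the prefix, even if the name already carries it."""
    return prefix + seqname


def add_chr_prefix(seqname, prefix="chr"):
    """Prepend the prefix only when the name does not already start with it."""
    if seqname.startswith(prefix):
        return seqname
    return prefix + seqname


# A reference that already uses "chr" names gets mangled by the naive version.
doubled = naive_prefix("chr1")
guarded = add_chr_prefix("chr1")
plain = add_chr_prefix("1")
```

Here `doubled` ends up as "chrchr1", which cannot match any entry in the annotation, while the guarded version leaves "chr1" untouched and still converts "1" to "chr1".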

Let me know how you go.
Regards,
Dave


SaskiaFreytag commented on August 21, 2024

Hi Dave,

I have finally had time to test your new function (Sorry for taking ages). I get the following error message now:

Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': M
  - in 'y': MT, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000213.1, GL000218.1, GL000219.1, KI270711.1, KI270713.1, KI270721.1, KI270726.1, KI270727.1, KI270728.1, KI270731.1, KI270734.1
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
No peaks aligned to any entry within gtf reference.
 Sanity check: 24 seqnames (i.e. chromosomes) match between peak and reference file
Error in `rownames<-`(`*tmp*`, value = as.character(all.peaks)) : 
  attempt to set 'rownames' on an object with no dimensions

I am using the same gtf file that was used to make the cellranger output. So I am not sure what is going on here.

Cheers,

Saskia


davhum commented on August 21, 2024

Hi Saskia,
So no peaks align to reference_gr. Can you send me a few entries from peak.merge.output.file? That might be the fastest way to see what is happening.

Cheers,
D


SaskiaFreytag commented on August 21, 2024

This is what it looks like:

Gene    Chr     Strand  Fit.start       Fit.end polyA_ID        PeakClass       OriginalPeak    DataOrigin      exon.intron
RALBP1  18      1       9475447 9517196 RALBP1:18:9475447-9517196:1     Merged  RALBP1:18:9475454-9517153:1     AD_1    non-juncs
DGKG    3       -1      186320066       186362147       DGKG:3:186320066-186362147:-1   Merged  DGKG:3:186320135-186362109:-1   AD_1    non-juncs
MYO9A   15      -1      72045760        72118214        MYO9A:15:72045760-72118214:-1   Merged  MYO9A:15:72045822-72118214:-1   AD_1    non-juncs
KIAA1107        1       1       92168344        92178343        KIAA1107:1:92168344-92178343:1  Merged  KIAA1107:1:92168643-92177836:1  AD_1    non-juncs
MAP4K4  2       1       101697409       101824267       MAP4K4:2:101697409-101824267:1  Merged  MAP4K4:2:101697451-101824189:1  AD_1    non-juncs
EIF4G3  1       -1      20806949        20810922        EIF4G3:1:20806949-20810922:-1   Merged  EIF4G3:1:20806989-20810902:-1   AD_1    non-juncs
PAK3    X       1       111095936       111163914       PAK3:X:111095936-111163914:1    Merged  PAK3:X:111095936-111163889:1    AD_1    non-juncs
GNB1    1       -1      1810928 1891117 GNB1:1:1810928-1891117:-1       Merged  GNB1:1:1824919-1891117:-1       AD_1    non-juncs
BRINP1  9       -1      119311804       119369467       BRINP1:9:119311804-119369467:-1 Merged  BRINP1:9:119311804-119369467:-1 AD_1    non-juncs


SaskiaFreytag commented on August 21, 2024

I found the error. It is in the function annotate_gr_from_gtf. My genes are not labeled "gene" but rather "exon", so line 228 does not work for me. Maybe you could introduce an extra argument.
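A quick way to see which feature types a GTF actually contains is to tally its third column. The sketch below does that with a hypothetical helper and made-up sample records; it is independent of Sierra and of annotate_gr_from_gtf.

```python
from collections import Counter
import io


def count_feature_types(handle):
    """Tally the feature type (GTF column 3) of every record."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Illustrative GTF with only "exon" records, as in a pre-mRNA style annotation.
sample_gtf = (
    "#!genome-build example\n"
    '18\tsrc\texon\t9475447\t9517196\t.\t+\t.\tgene_id "RALBP1";\n'
    '3\tsrc\texon\t186320066\t186362147\t.\t-\t.\tgene_id "DGKG";\n'
)

feature_counts = count_feature_types(io.StringIO(sample_gtf))
has_gene_entries = feature_counts["gene"] > 0
```

If `has_gene_entries` is false, any code that filters the annotation on the "gene" feature type will be left with nothing to match against.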


davhum commented on August 21, 2024

Hi Saskia,

I have modified the code so that annotation no longer depends on having a "gene" entry. Note that if your GTF really only has "exon" entries, then I expect it won't identify UTRs. Let me know if this has resolved your issue.


rj-patrick commented on August 21, 2024

Hi Saskia,

May I ask, are you working with snRNA-seq and a 'pre-mRNA' reference? If so, it should be okay to run Sierra using the original GTF that the pre-mRNA reference was derived from. You can still specify to include intronic peaks in the differential usage analysis (DUTest) by setting the feature.type parameter to c("UTR3", "exon", "intron").


SaskiaFreytag commented on August 21, 2024

Yes, I am. I may have an old pre-mRNA GTF file though. Thanks for letting me know.


SaskiaFreytag commented on August 21, 2024

Unfortunately, I am encountering a new error:

Analysing genomic motifs surrounding peaks (this can take some time)
  |======                                                                |   8%Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i],  :
  sequence chrMT not found

I have already tried making a new GTF with all the unusual sequences removed, but that does not seem to be the solution. While doing this I also found that your code throws an error if the global variable reference.file is not defined.


davhum commented on August 21, 2024

Is it possible that your GTF has the mitochondrial genome annotated as chrMT while your reference uses chrM (or vice versa)?


SaskiaFreytag commented on August 21, 2024

I am not sure how that is possible, given that I used this GTF file to create the cellranger output.


davhum commented on August 21, 2024

Ah... I bet this is because I use the genome from Bioconductor, which names the mitochondrial sequence chrM. I will make a fix.
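The mismatch can be illustrated with a small renaming sketch. It is not the fix that went into Sierra; the function name is hypothetical, and it assumes a human reference with Ensembl-style names (1 to 22, X, Y, MT) on one side and UCSC-style names (chr1, chrM) on the other. Unplaced contigs such as GL000009.2 have no simple renaming rule, so they are returned as None.

```python
# Primary human chromosomes in Ensembl-style naming (assumption: human data).
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def ensembl_to_ucsc(seqname):
    """Map an Ensembl-style name to a UCSC-style one, or None if unmapped."""
    if seqname in ("MT", "M"):
        return "chrM"  # blindly prepending "chr" to "MT" would give "chrMT"
    if seqname in PRIMARY:
        return "chr" + seqname
    return None  # scaffolds and unplaced contigs are left unmapped


mito = ensembl_to_ucsc("MT")
autosome = ensembl_to_ucsc("18")
contig = ensembl_to_ucsc("GL000009.2")
```

Dropping the names that map to None before any sequence lookup avoids requesting a sequence that the genome object does not contain.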


davhum commented on August 21, 2024

Just added a fix which should work.


rj-patrick commented on August 21, 2024

Regarding the reference.file global variable, the latest version should fix that as well.


SaskiaFreytag commented on August 21, 2024

Sweet, I just wanted to let you know.

