Comments (16)
Hi Saskia,
It appears that the chromosome entries in the GTF file do not match those in the genome index used for mapping. At this point I am not sure whether the "GL" contigs come from the BAM file or the GTF file. You should be able to work this out by examining the chromosome entries in the object "peak.merge.output.file". If they exist in this object, you could exclude them before passing it to the function. Alternatively, if they exist in the GTF file, you could bypass the error by editing the file.
This is not a very satisfactory solution, so I'll work up a fix that handles it within the AnnotatePeaksFromGTF function.
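The manual check described above can also be done outside of R. The following is a minimal sketch in Python, not part of Sierra: it assumes the peak table is tab-separated with a `Chr` column, and the helper names, the `GL`/`KI` prefix list, and the `FAKE1` row are made up for illustration.

```python
import csv
import io
from collections import Counter

# Assumed prefixes of unplaced scaffolds; adjust to your own reference.
SCAFFOLD_PREFIXES = ("GL", "KI")


def chromosome_counts(rows):
    """Count peaks per value of the 'Chr' column."""
    return Counter(row["Chr"] for row in rows)


def drop_scaffolds(rows, prefixes=SCAFFOLD_PREFIXES):
    """Keep only rows whose chromosome name does not start with a scaffold prefix."""
    return [row for row in rows if not row["Chr"].startswith(prefixes)]


# Tiny in-memory stand-in for the peak table (FAKE1 is an invented row).
example = (
    "Gene\tChr\tStrand\n"
    "RALBP1\t18\t1\n"
    "DGKG\t3\t-1\n"
    "FAKE1\tGL000194.1\t1\n"
)
rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))

print(chromosome_counts(rows))          # which chromosome names are present
kept = drop_scaffolds(rows)
print([row["Gene"] for row in kept])    # peaks left after excluding scaffolds
```

If scaffold names show up in the counts, the peaks themselves carry them; if not, they most likely come from the GTF side.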
Cheers,
Dave
from sierra.
Hi Saskia,
Upon closer examination I now see that the error arises because none of your peaks fall within regions of the annotation file. I have added some code that will provide a better error message. The question now is whether you are using the right GTF file for your aligned data.
Perhaps this comes down to something simple like the naming of chromosomes. Currently the annotation function within Sierra by default adds a "chr" prefix to the chromosome names in the peak data. If your BAM file was built on a reference that already has a "chr" prefix, then this could be the problem (i.e. chr1 would then look like chrchr1). The solution in this case would be to set append.chr.peaks = FALSE in AnnotatePeaksFromGTF.
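To make the double-prefix problem concrete, here is a small Python sketch. It is not Sierra's implementation; both function names are hypothetical, and `append_chr` only mirrors the idea behind `append.chr.peaks`.

```python
def naive_prefix(name):
    """Always prepend 'chr', regardless of what the name already looks like."""
    return "chr" + name


def with_chr_prefix(name, append_chr=True):
    """Prepend 'chr' only when requested and only if it is not already there."""
    if not append_chr or name.startswith("chr"):
        return name
    return "chr" + name


for chrom in ["1", "chr1"]:
    print(chrom, naive_prefix(chrom), with_chr_prefix(chrom))
```

Unconditional prefixing is harmless for Ensembl-style names such as `1`, but breaks names that already follow the UCSC convention.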
Let me know how you go.
Regards,
Dave
from sierra.
Hi Dave,
I have finally had time to test your new function (Sorry for taking ages). I get the following error message now:
Each of the 2 combined objects has sequence levels not in the other:
- in 'x': M
- in 'y': MT, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000213.1, GL000218.1, GL000219.1, KI270711.1, KI270713.1, KI270721.1, KI270726.1, KI270727.1, KI270728.1, KI270731.1, KI270734.1
Make sure to always combine/compare objects based on the same reference
genome (use suppressWarnings() to suppress this warning).
No peaks aligned to any entry within gtf reference.
Sanity check: 24 seqnames (i.e. chromosomes) match between peak and reference file
Error in `rownames<-`(`*tmp*`, value = as.character(all.peaks)) :
attempt to set 'rownames' on an object with no dimensions
I am using the same gtf file that was used to make the cellranger output. So I am not sure what is going on here.
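The warning above is essentially a set comparison of sequence names. A rough Python sketch of that comparison follows; the function name and the example name lists are assumptions chosen to resemble the message, not data taken from the actual files.

```python
def seqlevel_mismatch(x_levels, y_levels):
    """Report sequence names unique to each side and those shared by both."""
    x, y = set(x_levels), set(y_levels)
    return {
        "only_in_x": sorted(x - y),
        "only_in_y": sorted(y - x),
        "shared": sorted(x & y),
    }


# Illustrative inputs: 22 autosomes plus X and Y on both sides.
common = [str(i) for i in range(1, 23)] + ["X", "Y"]
peak_names = common + ["M"]
gtf_names = common + ["MT", "GL000009.2"]

result = seqlevel_mismatch(peak_names, gtf_names)
print(result["only_in_x"])
print(result["only_in_y"])
print(len(result["shared"]))
```

A mismatch limited to the mitochondrial name and a few scaffolds, as here, would not by itself explain why no peaks overlap the annotation.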
Cheers,
Saskia
from sierra.
Hi Saskia,
So no peaks align to the reference_gr. Can you send me a few entries from peak.merge.output.file? That might be the fastest way to see what is happening.
Cheers,
D
from sierra.
This is what it looks like:
Gene Chr Strand Fit.start Fit.end polyA_ID PeakClass OriginalPeak DataOrigin exon.intron
RALBP1 18 1 9475447 9517196 RALBP1:18:9475447-9517196:1 Merged RALBP1:18:9475454-9517153:1 AD_1 non-juncs
DGKG 3 -1 186320066 186362147 DGKG:3:186320066-186362147:-1 Merged DGKG:3:186320135-186362109:-1 AD_1 non-juncs
MYO9A 15 -1 72045760 72118214 MYO9A:15:72045760-72118214:-1 Merged MYO9A:15:72045822-72118214:-1 AD_1 non-juncs
KIAA1107 1 1 92168344 92178343 KIAA1107:1:92168344-92178343:1 Merged KIAA1107:1:92168643-92177836:1 AD_1 non-juncs
MAP4K4 2 1 101697409 101824267 MAP4K4:2:101697409-101824267:1 Merged MAP4K4:2:101697451-101824189:1 AD_1 non-juncs
EIF4G3 1 -1 20806949 20810922 EIF4G3:1:20806949-20810922:-1 Merged EIF4G3:1:20806989-20810902:-1 AD_1 non-juncs
PAK3 X 1 111095936 111163914 PAK3:X:111095936-111163914:1 Merged PAK3:X:111095936-111163889:1 AD_1 non-juncs
GNB1 1 -1 1810928 1891117 GNB1:1:1810928-1891117:-1 Merged GNB1:1:1824919-1891117:-1 AD_1 non-juncs
BRINP1 9 -1 119311804 119369467 BRINP1:9:119311804-119369467:-1 Merged BRINP1:9:119311804-119369467:-1 AD_1 non-juncs
from sierra.
I found the error. It is in the function annotate_gr_from_gtf. My genes are not labeled "gene" but rather "exon", so line 228 does not work for me. Maybe you could introduce an extra argument.
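A quick way to see which feature types a GTF actually contains is to tally its third column, which holds the feature type in the GTF format. The sketch below is a standalone Python helper, not Sierra code; the function name and the example lines are invented for illustration.

```python
from collections import Counter


def gtf_feature_counts(lines):
    """Tally the feature type (third tab-separated column) of each GTF record."""
    counts = Counter()
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Invented records standing in for a GTF that only has "exon" entries.
example_gtf = [
    "#!genome-build example\n",
    '1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1";\n',
    '1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "G1";\n',
]
counts = gtf_feature_counts(example_gtf)
print(counts)
print("gene" in counts)
```

For a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.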
from sierra.
Hi Saskia,
I have modified the code so that annotation no longer depends on having a "gene" entry. Note that if your GTF really only has "exon" entries, then I predict it won't identify UTRs. Let me know if this has resolved your issue.
from sierra.
Hi Saskia,
May I ask, are you working with snRNA-seq and a 'pre-mRNA' reference? If so, it should be okay to run Sierra using the original GTF that the pre-mRNA reference was derived from. You can still specify to include intronic peaks in the differential usage analysis (DUTest) by setting the feature.type parameter to c("UTR3", "exon", "intron").
from sierra.
Yes I am. I may have an old pre-mRNA gtf file though. Thanks for letting me know.
from sierra.
Unfortunately I am encountering a new error
Analysing genomic motifs surrounding peaks (this can take some time)
|====== | 8%
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i], :
sequence chrMT not found
I have already tried making a new GTF where I get rid of all the weird sequences, but that does not seem to be the solution. While doing this I also found that if you do not define a global variable reference.file, your code throws an error.
from sierra.
Is it possible that your GTF has the mitochondrial genome annotated as chrMT while your reference has chrM (or vice versa)?
from sierra.
I am not sure how that is possible if I used the gtf file to create the cellranger output.
from sierra.
Ah ... I bet this is because I use a genome from Bioconductor, which names the mitochondrial sequence chrM. I will make a fix.
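The mismatch comes from two naming conventions: Ensembl-style references call the mitochondrial sequence MT, while UCSC-style genomes call it chrM, so blindly prefixing MT gives chrMT. One possible normalisation is sketched below in Python; this is an assumption about how such a mapping could look, not the fix that went into Sierra, and the names used are hypothetical.

```python
# Spellings treated as the mitochondrial sequence (an assumed, non-exhaustive list).
MITO_ALIASES = {"M", "MT", "chrM", "chrMT"}


def to_ucsc_style(name):
    """Map a chromosome name to UCSC style: 'chr' prefix, mitochondrion as 'chrM'."""
    if name in MITO_ALIASES:
        return "chrM"
    return name if name.startswith("chr") else "chr" + name


for chrom in ["1", "X", "MT", "chrMT", "chr2"]:
    print(chrom, "->", to_ucsc_style(chrom))
```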
from sierra.
Just added a fix which should work.
from sierra.
Regarding the reference.file global variable, the latest version should fix that as well.
from sierra.
Sweet, I just wanted to let you know.
from sierra.