dewseq's People

Contributors

hentzelab, hpages, jwokaty, nturaga, sudeepsahadevan

Forkers

fulaibaowang

dewseq's Issues

question about annotObj in DESeqDataSetFromSlidingWindows

Hi,

I followed the htseq-clip tutorial and generated all of the output files.

However, the sliding-window annotation file that htseq-clip created is in BED6 format.

It cannot be used as annotObj in DESeqDataSetFromSlidingWindows.

Do you know how to convert the htseq-clip annotation into an annotObj that DESeqDataSetFromSlidingWindows recognizes?

citation?

Good day, citation("DEWSeq") produces no output. How do I properly cite this package? Thank you.

resultRegions() and toBED() function

Hi,

I am running the vignette and would really appreciate it if you could explain the output in a bit more detail.

1. extractRegions
As written there, the extractRegions function combines overlapping significant windows.
But the result still contains overlapping regions, for example in the vignette:

##  4 chr1           28648620   28648730 +                      5           110
##  5 chr1           28648620   28648733 +                      5           113

So should the real number of significant binding regions be lower than the total number of rows in the resultRegions table (218)?

2. toBED
If I run:
resultRegions <- extractRegions(windowRes  = resultWindows,
                                padjCol    = "p_adj_IHW",
                                padjThresh = 0.01, 
                                log2FoldChangeThresh = 0.5) %>% as_tibble

and

toBED(windowRes = resultWindows,
      regionRes = resultRegions,
      fileName  = "enrichedWindowsRegions.bed",
      padjCol    = "p_adj_IHW",
      padjThresh = 0.01,
      log2FoldChangeThresh = 0.5)

the output file "enrichedWindowsRegions.bed" has many more rows than the resultRegions table. Why?

Thank you!
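
One way to check how many distinct binding regions remain is to merge overlapping intervals per chromosome and strand and count the result. The following is a minimal base R sketch on a toy table, not DEWSeq code: the column names (chromosome, begin, end, strand) and the helper merge_overlaps are assumptions made for illustration. The first two toy rows reuse the coordinates quoted above; the third row is made up.

```r
# Toy regions; column names are assumptions for illustration.
regions <- data.frame(
  chromosome = c("chr1", "chr1", "chr1"),
  begin      = c(28648620, 28648620, 28650000),
  end        = c(28648730, 28648733, 28650100),
  strand     = c("+", "+", "+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: merge overlapping (or touching) intervals
# within each chromosome/strand group.
merge_overlaps <- function(df) {
  groups <- split(df, list(df$chromosome, df$strand), drop = TRUE)
  merged <- lapply(groups, function(g) {
    g <- g[order(g$begin, g$end), ]
    # Largest end coordinate seen before each row (-Inf for the first row)
    prev_max_end <- c(-Inf, head(cummax(g$end), -1))
    # A new merged region starts whenever a row begins after prev_max_end
    cluster <- cumsum(g$begin > prev_max_end)
    data.frame(
      chromosome = g$chromosome[1],
      begin      = as.numeric(tapply(g$begin, cluster, min)),
      end        = as.numeric(tapply(g$end, cluster, max)),
      strand     = g$strand[1],
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, merged)
  rownames(out) <- NULL
  out
}

merged <- merge_overlaps(regions)
nrow(regions)  # number of rows before merging
nrow(merged)   # number of non-overlapping regions after merging
```

Comparing nrow() before and after merging on the real table would show how many of the reported regions overlap one another.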

contrast argument in resultsDEWSeq

Hi,

I see that the order of the groups in the contrast argument changes the result of resultsDEWSeq. Basically, the results of

resultWindows <- resultsDEWSeq(ddw,
                              contrast = c("type", "group1", "group2"),
                              tidy = TRUE) %>% as_tibble
resultWindows <- resultsDEWSeq(ddw,
                              contrast = c("type", "group2", "group1"),
                              tidy = TRUE) %>% as_tibble

are different.

Can you elaborate a bit more about this? Thank you!
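
For context: in DESeq2, on which DEWSeq builds, a contrast of the form c(factor, numerator, denominator) reports the log2 fold change of the numerator level over the denominator level, so swapping the last two entries flips the sign of the estimate. The base R sketch below shows only this arithmetic on made-up mean counts. It does not call DEWSeq, and the variable names and values are hypothetical. If the test is one-sided (enrichment of the numerator group), as the DEWSeq vignette appears to describe, the p-values would be expected to differ between the two calls as well.

```r
# Hypothetical mean normalized counts for one window (made-up numbers)
mean_group1 <- 80
mean_group2 <- 20

# contrast = c("type", "group1", "group2"): group1 is the numerator
lfc_1_vs_2 <- log2(mean_group1 / mean_group2)

# contrast = c("type", "group2", "group1"): group2 is the numerator
lfc_2_vs_1 <- log2(mean_group2 / mean_group1)

lfc_1_vs_2  # positive: the window is higher in group1
lfc_2_vs_1  # same magnitude, opposite sign
```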

error when running the vignette

I think the last toBED call in the vignette,

toBED(windowRes = resultWindows,
      regionRes = resultRegions,
      fileName  = "enrichedWindowsRegions.bed")

should be

toBED(windowRes = resultWindows,
      regionRes = resultRegions,
      fileName  = "enrichedWindowsRegions.bed",
      padjCol = "p_adj_IHW")

function not found

Hello!
After I run the code in your R package, I get the following error:
Error in DESeqDataSetFromSlidingWindows(countData = count_matrix, colData = col_data, :
There is no "DESeqDataSetFromSlidingWindows" function.
The source code is as follows:
ddw <- DESeqDataSetFromSlidingWindows(countData=count_matrix, colData=col_data, annotObj=annotation_file, design=~-type)

How can I solve this problem? I look forward to your answer. Thank you!

Do the count matrix dimensions have to match the annotation file dimensions?

Hi,

I followed the htseq-clip + DEWSeq pipeline to process CLIP data.

But I got an error when using DESeqDataSetFromSlidingWindows:

Error in SummarizedExperiment(assays = SimpleList(counts = countData), :
the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
RangedSummarizedExperiment object (or derivative) to construct

Is this because the dimensions of countData (1462654 rows, 3 columns) do not match those of annotationData (65312564 rows, 12 columns)?

I got all count matrix files and sliding-window annotation files from htseq-clip.

Is there any way to fix this?

Thanks

specific design and model

Hello,
Thank you very much for the great tool!
I have an eCLIP dataset with two conditions (before and after treatment): 8 IP replicates plus SMI controls in replicates.
I would like to use DEWSeq to compare the binding profile of an RBP at two different time points.
I was wondering whether the DEWSeq pipeline can be applied to search for differentially bound regions between IP samples (rather than a one-sided IP vs. SMI comparison). If so, how can I make this comparison while accounting for the negative controls? What would be the design formula and model for this experiment?

Here is the sample info:

Sample ID  Condition1  Condition2
1          T0 A        IP
2          T0 B        IP
3          T0 C        IP
4          T0 D        IP
5          T2 A        IP
6          T2 B        IP
7          T2 C        IP
8          T2 D        IP
9          T0 A        Input
10         T0 B        Input
11         T0 C        Input
12         T0 D        Input
13         T2 A        Input
14         T2 B        Input
15         T2 C        Input
16         T2 D        Input

The conditions are T0 and T2 (non-stimulated and stimulated).

For example:
"T0 A IP" is the non-stimulated sample A with the IP
"T0 A Input" is the input for the same sample

"T2 A IP" is the stimulated sample A with the IP
"T2 A Input" is the input for the same sample

Thank you in advance!
All the best,

Memory issues

I'm running out of memory trying to create the ddw object using DESeqDataSetFromSlidingWindows.

Code is:
ddw <- DESeqDataSetFromSlidingWindows(countData=count_matrix, annotObj = data.frame(annotation_file), colData=col_data, design=~type)

Result is:
Error: cannot allocate vector of size 1024.0 Mb

I checked the dimensions of the matrices:
> dim(count_matrix)
[1] 381366 16
> dim(annotation_file)
[1] 88574203 12

Does the annotation seem a bit large? I struggled to load it into R in the first place using fread, so I used ff instead.

I followed the examples from https://link.springer.com/protocol/10.1007%2F978-1-0716-1851-6_10 to generate the annotation file so I'm not sure how to fix it.

Any help would be much appreciated.
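
Since the annotation here has far more rows than the count matrix, one thing that might reduce memory use is to keep only the annotation rows for windows that actually appear in the count matrix before building the object. The base R sketch below uses small toy stand-ins for count_matrix and annotation_file. The ID column name unique_id, the toy IDs and the name annotation_small are assumptions for illustration, and whether this resolves the allocation error on the real data is untested.

```r
# Toy stand-ins for the real objects (assumed layout, not real data)
count_matrix <- matrix(
  c(1, 0, 2, 0, 0, 1),
  nrow = 3,
  dimnames = list(
    c("geneA:exon0001W00001", "geneA:exon0001W00002", "geneB:intron0001W00001"),
    c("sample1", "sample2")
  )
)
annotation_file <- data.frame(
  chromosome = "chr1",
  unique_id  = c("geneA:exon0001W00001", "geneA:exon0001W00002",
                 "geneA:exon0001W00003", "geneB:intron0001W00001",
                 "geneC:exon0001W00001"),
  begin      = c(100, 120, 140, 500, 900),
  end        = c(150, 170, 190, 550, 950),
  stringsAsFactors = FALSE
)

# Keep only annotation rows whose ID (assumed column: unique_id)
# matches a row name of the count matrix
keep <- annotation_file$unique_id %in% rownames(count_matrix)
annotation_small <- annotation_file[keep, , drop = FALSE]

dim(annotation_file)   # before filtering
dim(annotation_small)  # after filtering
```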

How to set the -e/--mate parameter for non-strand-specific or single-end sequencing libraries?

Hi,

I have a question regarding the -e/--mate parameter for the extract command.

Sometimes a sequencing library is either non-strand-specific or single-end. According to the documentation, the -e/--mate parameter selects the read/mate from which crosslink sites are extracted in paired-end sequencing, with choices 1 or 2 (1 for the first mate and 2 for the second mate).
Could you please provide guidance on how to set this parameter for:

Non-strand-specific libraries
Single-end sequencing libraries

Thank you for your help!

Best regards,

DESeqDataSetFromSlidingWindows issue

When trying to generate the DESeq object, I get this error:
ddw <- DESeqDataSetFromSlidingWindows(countData=count_matrix, annotObj = annotation_file, colData=col_data, design=~type)

Warning in DESeqDataSetFromSlidingWindows(countData = count_matrix, annotObj = annotation_file, :
Cannot find chromosomal positions for all entries in countData.
countData rows with missing annotation will be removed !
Error in DESeqDataSet(se, design = design, ignoreRank) :
all samples have 0 counts for all genes. check the counting script.

head(count_matrix) gives me:
                                   smb-hk-2-22-fhevci1-rep1-20200123-ju_trimmed
ENSG00000227232.5:intron0005W00067                                            1
ENSG00000227232.5:exon0004W00079                                              0
ENSG00000227232.5:intron0002W00089                                            2
ENSG00000227232.5:intron0001W00221                                            0
ENSG00000279457.4:intron0008W00009                                            0
ENSG00000279457.4:intron0007W00029                                            0
                                   smb-hk-2-23-fhevci8-rep1-20200123-ju_trimmed
ENSG00000227232.5:intron0005W00067                                            0
ENSG00000227232.5:exon0004W00079                                              0
ENSG00000227232.5:intron0002W00089                                            1
ENSG00000227232.5:intron0001W00221                                            0
ENSG00000279457.4:intron0008W00009                                            0
ENSG00000279457.4:intron0007W00029                                            1

I thought the chromosomal positions came from the annotation object rather than the matrix file?
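
The warning suggests that the row names of the count matrix are looked up in the annotation, so a quick diagnostic is to compare the two sets of IDs directly. The base R sketch below uses made-up ID vectors. In practice count_ids would be rownames(count_matrix) and annot_ids the annotation's ID column (assumed here to be called unique_id). The mismatch shown, a gene version suffix present on one side only, is a hypothetical example and not a diagnosis of this dataset.

```r
# Toy IDs (made up); in practice:
#   count_ids <- rownames(count_matrix)
#   annot_ids <- annotation_file$unique_id   # column name is an assumption
count_ids <- c("ENSG00000227232.5:intron0005W00067",
               "ENSG00000227232.5:exon0004W00079")
annot_ids <- c("ENSG00000227232:intron0005W00067",
               "ENSG00000227232:exon0004W00079")

# Fraction of count rows that have a matching annotation ID
matched <- count_ids %in% annot_ids
mean(matched)

# IDs present in the counts but absent from the annotation
head(setdiff(count_ids, annot_ids))
```

If the matched fraction is zero or close to it, every count row would be dropped, which would be consistent with the "all samples have 0 counts" error above.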
