
qsvar's Issues

k_qsvs todos

getBonfTx todos

get_qsvs todos

> seq_len(0)
integer(0)
> 1:0
[1] 1 0
> seq_len(-3)
Error in seq_len(-3) : argument must be coercible to non-negative integer
> 1:(-3)
[1]  1  0 -1 -2 -3
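The console output above is the reason to prefer seq_len() over 1:n whenever n can be zero. A minimal sketch of the difference inside a loop; the helper name count_iterations is hypothetical and only used for illustration:

```r
# Count how many times a loop body runs for a given index vector.
count_iterations <- function(idx) {
    n_iter <- 0L
    for (i in idx) {
        n_iter <- n_iter + 1L
    }
    n_iter
}

k <- 0L

# 1:k counts down from 1 to 0, so the loop body runs even though k is 0.
bad <- count_iterations(1:k)

# seq_len(k) is integer(0), so the loop body is skipped entirely.
good <- count_iterations(seq_len(k))
```

Using seq_len() also turns a negative length into an explicit error instead of a silent countdown.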

Figure 1.

add abcd
move correlation value inside the plot

Provide access to all statistical model results

We will likely deposit to Bioconductor's ExperimentHub the full output from limma and the full RangedSummarizedExperiment object for the 119 degradation samples used in this study. Doing so will allow users to re-use the data from this study in different ways.

We will also add an equivalent function to spatialLIBD::fetch_data() in order to download this data.

Design a 2 day workshop / short course

Similar to LieberInstitute/spatialLIBD#68

Develop a long-format workshop that teaches, over 2 days, the basics of differential expression analysis using RNA-seq data from postmortem human brain, which is affected by degradation. To make this workshop more general, it will also cover how you can select transcripts associated with degradation if you were to generate the relevant data.

We might want to learn about the format used by https://carpentries.org/index.html for short courses prior to designing this short course / long workshop. The target audience would be users who have some basic familiarity with Bioconductor, SummarizedExperiment, limma and/or similar tools.

Update DESCRIPTION

  • Title
  • Authors info: check other packages like megadepth for example.
  • Description: about 3 sentences. Should end with a period. Note the spacing (see other packages).
  • Add other biocViews terms. Aim for at least 5 terms.

Figure 4

facet grid
labels
tile plot
variable names need to be cleaned up

Add DEqual plots to package

Use code for creating DEqual plot in /dcl01/lieber/ajaffe/lab/degradation_experiments/Joint/all/SCRIPTS/qsva_purr.R to create a helper function for assessing the extent of degradation.
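As a rough sketch of what such a helper might compute, assuming the DEqual plot compares differential expression t-statistics against degradation t-statistics for the same features and reports their correlation. The function name deg_cor, its arguments, and the toy values are all hypothetical; this is not the code from the script mentioned above:

```r
# Hypothetical helper: correlate DE t-statistics with degradation
# t-statistics across the features present in both named vectors.
deg_cor <- function(de_t, degradation_t) {
    shared <- intersect(names(de_t), names(degradation_t))
    stopifnot(length(shared) > 2)
    cor(de_t[shared], degradation_t[shared])
}

# Toy values for illustration only (not real results).
de_t <- c(tx1 = 2.1, tx2 = -0.5, tx3 = 1.3, tx4 = -1.8)
degradation_t <- c(tx4 = -3.6, tx3 = 2.6, tx2 = -1.0, tx1 = 4.2, tx5 = 0.3)

r <- deg_cor(de_t, degradation_t)
```

Matching by name rather than by position keeps the comparison valid when the two result tables are ordered differently or cover different feature sets.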

qsvaR workflow overview image

We plan to improve the qsvaR workflow image and further expand the related background documentation for understanding the different steps of the process, what data you need, what are the outputs of qsvaR, and how you can use them for downstream analyses.

Helper functions for reproducing statistical modeling results that can be applied to subsets of the available data or new degradation datasets

We will write helper functions for reproducing the "main" and "interaction" model results based on the data provided in #32. These functions can then be applied to subsets of the data to compute new statistical results that users might want to use as input (related to #33). Alternatively, these functions could be applied to new degradation data, either from new experiments carried out at LIBD or from data generated elsewhere for other tissues / organisms.

Video guides for qsvaR

We want to create a collection of short videos (likely shared on YouTube) demonstrating how to use the different features of qsvaR. We will collate these videos into a new user guide (a new vignette). These videos will be helpful to explain the different components related to qsvaR such as the experimental design for RNA degradation experiments, the selection of transcripts associated with degradation (related to #36 #37), computing the qSVs, and using the qSVs in downstream analyses (related to #35).

Some of these videos will be inspired by actual use cases we hear from our users using publicly available data.

This will be an evolving process as new videos will have to be made to reflect new features added to qsvaR.

Interactive tool for assessing confounding of DE results with RNA degradation

We will build an interactive website, likely powered by shiny, such that users can upload their differential expression results at the gene-level and make DEqual plots to assess if their results appear to be confounded by RNA degradation.

This tool (website) will enable users to check publicly available differential expression results and identify genes which could potentially be false positives. The website will generate an automated report that can then be shared with collaborators. The function for making this report will also be available as an R function that can be used in a non-interactive way.

If possible (due to memory constraints), this tool will also support exon-level, transcript-level, and/or exon-exon junction level DE results.

Document how to use Salmon and SPEAQeasy output in the vignette

Explore whether we can use the example data from tximport to create an RSE from Salmon transcript count output files and run the qSVA functions on it.

See:

@Nick-Eagles might be able to help compare the code from tximport and/or rnaseqDTU with what we have in SPEAQeasy.

You could also use the SPEAQeasy-example data (see https://github.com/LieberInstitute/SPEAQeasy-example/blob/master/pipeline_outputs/count_objects/rse_tx_Jlab_experiment_n42.Rdata) to document how to import those files and use them for this package.

getDegTx todos


See https://github.com/LieberInstitute/recount3/blob/master/R/create_rse_manual.R#L3-L6 for another example of how I use it in a sentence.

Expand functionality for downstream analyses

We plan to add helper functions for visualizing the relationship between qSVs and other covariates, as is typically done in several analyses. This will likely involve making heatmaps with ComplexHeatmap. We will study what qsvaR users have done for other analyses when designing these helper functions.

[Feature Request] Support ENSEMBL IDs

We should add an is_gencode = TRUE default argument such that, when it's set to FALSE, it matches using ENSEMBL IDs instead of Gencode IDs. That is, it removes the trailing version suffix (.[0-9]) from the IDs.

This should include a unit test that checks the results using the same data with Gencode IDs, then manually makes them ENSEMBL IDs, and checks that with is_gencode = FALSE we get exactly the same results (might have to use set.seed() on this unit test).
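A minimal sketch of the ID conversion such a unit test would rely on, assuming the IDs end in a dot followed by the version number. The helper name strip_version is hypothetical and the IDs are only illustrative inputs:

```r
# Hypothetical helper: drop the trailing Gencode version suffix (e.g. ".2")
# so that IDs can be matched as plain ENSEMBL IDs.
strip_version <- function(ids) {
    sub("\\.[0-9]+$", "", ids)
}

gencode_ids <- c("ENST00000456328.2", "ENST00000450305.2", "ENST00000488147.1")
ensembl_ids <- strip_version(gencode_ids)
```

The pattern is anchored at the end of the string, so IDs that are already unversioned pass through unchanged; IDs that carry an extra suffix after the version number would need additional handling.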

Figure 2

telegraph subsetting
Main and interaction should be horizontal
label arrows
pair line plot with matrices

Document concepts and link to external resources for more detail

Package data

We will need help making the data in data/ available to the package so that the examples work.

Translate course and intro documentation to Spanish

Translate the short workshop (#38) and other intro-level materials into Spanish to increase access to these technologies for Spanish-speaking individuals. This will help increase the diversity of our user base and stimulate the use of these analytical technologies in other parts of the world.

K_qsvs coverage

  • a test that shows full rank error

  • a test that shows low expression error (sva doesn't work)
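For the full rank case, a toy rank-deficient model matrix could serve as the test input. The sketch below only uses base R; the is_full_rank helper is hypothetical and is not necessarily how k_qsvs() performs its own check:

```r
# Toy phenotype table: `dup` is an exact copy of `group`.
pheno <- data.frame(
    group = factor(c("a", "a", "b", "b")),
    age = c(30, 45, 52, 61)
)
pheno$dup <- pheno$group

# A well-formed design and one with two identical columns.
mod_ok <- model.matrix(~ group + age, data = pheno)
mod_bad <- model.matrix(~ group + dup, data = pheno)

# Hypothetical check: a matrix is full rank when its rank equals
# its number of columns.
is_full_rank <- function(mod) {
    qr(mod)$rank == ncol(mod)
}
```

Here is_full_rank(mod_ok) is TRUE and is_full_rank(mod_bad) is FALSE, so mod_bad is the kind of input a test for the full rank error could pass in.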

Figure 3

coord equal
facet wrap
black line is hard to see (maybe red)
correlation value moved on to plot
density graphs need a scale
MAIN VS INT
old version of plot is clearer.
A bar plot is clearer to show region distribution

[BUG] Complete qSVA() wrapper

It seems like a bug that sig_transcripts at

qsvaR/R/qSVA.R, line 23 in 498143f

qSVA <- function(rse_tx, type = "cell_component", sig_transcripts = select_transcripts(type), mod, assayname) {

is not used later in the qSVA() function.

Figure 5

add dot size to legend
move to ggplot

Guide for downstream analyses

We will write a user guide (vignette) for downstream analyses using qSVs generated with qsvaR. This guide will showcase the helper functions from #34. Potentially not all the code will be evaluated given testing restrictions on Bioconductor, though we will explore the use of "long tests". The goal of this user guide is to help orient users toward the analyses they should likely run once they have estimated qSVs. It will also help users determine whether they should use qSVs in their analyses at all, for example, if they are analyzing data from a brain region or another tissue for which no degradation data exists.

Complete select_transcripts()

You might want to use code like this to create the vector of transcript IDs.

x <- letters
x
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "u" "v" "w" "x" "y" "z"
cat(paste0('c("', paste(x, collapse = '", "'), '")'))
#> c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z")

Created on 2022-03-15 by the reprex package (v2.0.1)

Then you can copy and paste it into

return("TODO")
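As an alternative to building the string by hand, base R can produce the code for a vector directly. A minimal sketch, using letters as a stand-in for the transcript IDs as in the reprex above:

```r
x <- letters

# deparse() returns the R code that recreates `x` as character strings
# (possibly split over several lines); dput(x) prints the same code.
code <- paste(deparse(x), collapse = "")

# Evaluating the generated code gives back the original vector.
roundtrip <- eval(parse(text = code))
```

The round trip is a cheap way to confirm that the pasted vector matches the original IDs exactly.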
