hbctraining / scrna-seq
Home Page: https://hbctraining.github.io/scRNA-seq/
the elbow plot markdown (quantitative approach) needs a figure update
AND an overview of when to use integration
Talk to Rory and Sarah
#1. Warning in irlba(A = t(x = object), nv = npcs, ...) :
You're computing too large a percentage of total singular values, use a standard svd instead.
#2. In PrepDR(object = object, features = features, verbose = verbose) :
The following 15 features requested have not been scaled (running reduction without them): RAD51, CDC45, E2F8, DTL, EXO1, UHRF1, ANLN, GTSE1, NEK2, HJURP, DLGAP5, PIMREG, KIF2C, CDC25C, CKAP2L
Include as note, with how to run with svd instead
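For the note, a minimal base R sketch of what "use a standard svd instead" means: PCA computed from a full svd() of the scaled matrix rather than a truncated irlba() approximation. The toy matrix and the names x, npcs, cell_embeddings and feature_loadings are made up for illustration. In Seurat this most likely corresponds to calling RunPCA() with approx = FALSE, but that should be checked against the installed version.

```r
set.seed(42)

# Toy scaled expression matrix: 20 features (rows) x 12 cells (columns)
x <- matrix(rnorm(20 * 12), nrow = 20)
x <- t(scale(t(x)))  # center and scale each feature across cells

npcs <- 5

# Full (standard) SVD on the cells x features matrix
s <- svd(t(x))

# Cell embeddings and feature loadings for the first npcs components
cell_embeddings  <- s$u[, 1:npcs] %*% diag(s$d[1:npcs])
feature_loadings <- s$v[, 1:npcs]

# Standard deviation of each component, as used in an elbow plot
sdev <- s$d[1:npcs] / sqrt(ncol(x) - 1)
```

With a small number of features or cells the full decomposition is cheap, which is the situation the irlba warning is pointing at.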
slide deck to wrap-up like we do with RNA-seq/ChIP-seq
We should add a markdown, or some text, about how to specify whether raw or normalized counts are used. The following code stores the raw counts and the normalized data as separate assays after normalization:
# Raw counts from the RNA assay
RNA_raw_assay <- seurat_integrated@assays$RNA@counts
seurat_integrated[['RNA_raw']] <- CreateAssayObject(counts = RNA_raw_assay)
# Normalized values are passed through `data`, not `counts`
RNA_norm_assay <- seurat_integrated@assays$RNA@data
seurat_integrated[['RNA_norm']] <- CreateAssayObject(data = RNA_norm_assay)
DefaultAssay(seurat_integrated) <- "RNA_norm"
For clustering and marker id
Split Introduction to scRNA-seq markdown (including replicates from DGE too) AND present before Sarah.
Have the raw-counts-to-matrix content presented after Sarah
Add cellranger and the process of getting counts as a separate 2-day workshop (kind of like how we have Intro to RNA-seq + DGE). Include Intro plus experimental design.
The second workshop would be R-based, starting with counts. Cellranger is not a prerequisite for this second workshop. Include some points from Intro here too.
https://github.com/hbctraining/scRNA-seq/blob/master/lessons/mitoRatio.md
I ran into a problem when I did the last step.
metadata$mtUMI <- Matrix::colSums(counts[which(rownames(counts) %in% mt),], na.rm = T)
error in evaluating the argument 'x' in selecting a method for function 'colSums': object of type 'closure' is not subsettable
I am not familiar with the R language, and I have tried many ways to solve it. Hope someone can help me. Thank you.
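The "closure" message usually means that the name being subset (here counts) refers to a function rather than a matrix, for example because the count matrix was never created in the current session. A minimal sketch of the same calculation on a toy sparse matrix follows; the gene names, cell names and the mt vector are made up for illustration and are not taken from the lesson data.

```r
library(Matrix)

# Toy sparse count matrix (genes x cells); real data would come from readMM()
counts <- sparseMatrix(
  i = c(1, 2, 3, 1, 4),
  j = c(1, 1, 2, 3, 3),
  x = c(5, 2, 7, 1, 3),
  dims = c(4, 3),
  dimnames = list(c("MT-CO1", "ACTB", "MT-ND1", "GAPDH"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical vector of mitochondrial gene names
mt <- c("MT-CO1", "MT-ND1")

# Guard against the 'closure' error: counts must be a matrix, not a function
stopifnot(!is.function(counts))

metadata <- data.frame(row.names = colnames(counts))
metadata$nUMI  <- Matrix::colSums(counts)
metadata$mtUMI <- Matrix::colSums(counts[rownames(counts) %in% mt, , drop = FALSE],
                                  na.rm = TRUE)
metadata$mitoRatio <- metadata$mtUMI / metadata$nUMI
```

Running class(counts) before the failing line is a quick way to see whether counts is the expected sparse matrix or something else.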
To create the count data object, you use the readMM() function from the Matrix package to turn the standard matrix into a sparse matrix. However, the data I downloaded is not in the standard format. Where can I download the standard data with zeros?
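A small sketch that may help with this question: a Matrix Market (.mtx) file stores only the non-zero entries, and the zeros are implied, so a separate download "with zeros" should not be needed. The toy matrix, gene names and temporary file below are made up for illustration.

```r
library(Matrix)

# Toy dense count matrix with explicit zeros (genes x cells)
dense <- matrix(c(0, 3, 0,
                  0, 0, 1,
                  2, 0, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("geneA", "geneB", "geneC"),
                                c("cell1", "cell2", "cell3")))

# Convert to a sparse matrix and write it in Matrix Market format
sparse <- Matrix(dense, sparse = TRUE)
mtx_file <- tempfile(fileext = ".mtx")
writeMM(sparse, mtx_file)

# Read it back; readMM() does not keep row or column names,
# so they are re-attached here (in practice from the genes/barcodes files)
counts <- readMM(mtx_file)
rownames(counts) <- rownames(dense)
colnames(counts) <- colnames(dense)

# Only the non-zero entries are stored, yet the full matrix is recoverable
n_stored <- nnzero(counts)
recovered <- as.matrix(counts)
```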
this new lesson will incorporate content from single sample marker identification + integration marker identification.
Since we will be using integrated data, we will use FindConservedMarkers and run it on all clusters. For clusters that have few cells per group this will fail, giving us a chance to run/introduce FindAllMarkers as well.
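A possible pre-check for the lesson, sketched in base R: count cells per cluster and condition to see which clusters are likely to fail. The metadata values are made up, and the threshold of 3 cells per group is an assumption about the Seurat default that should be verified.

```r
# Hypothetical per-cell metadata: cluster identity and condition
meta <- data.frame(
  cluster = c(rep("0", 6), rep("1", 5), rep("2", 4)),
  sample  = c(rep("ctrl", 3), rep("stim", 3),
              rep("ctrl", 4), rep("stim", 1),
              rep("ctrl", 2), rep("stim", 2))
)

# Cells per cluster (rows) and condition (columns)
n_cells <- table(meta$cluster, meta$sample)

# Assumed minimum number of cells required in every group
min_cells <- 3
ok <- apply(n_cells >= min_cells, 1, all)

# Clusters with enough cells in every condition, and the remaining ones
conserved_clusters <- names(ok)[ok]
fallback_clusters  <- names(ok)[!ok]
```

Here conserved_clusters would be passed to FindConservedMarkers, while fallback_clusters are the ones where FindAllMarkers can be introduced.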
QC set-up: add unknown # of cells expected - look into paper and see if we can find this number
Download BAM from SRA and check the paper.
We are not sure if there are only 2 BAMS, because the data was pooled (as described in the study) or if individual samples were also supplied to SRA.
Update the setup markdown to reflect this
this is the first lesson (not exploration). May not need to split if using sctransform (reassess as we develop!)
Split at "Determining PCs" sections
Maybe from the NGS single cell lessons - link to a markdown which shows what bad data would look like
Thank you so much for sharing this; it has benefited me a lot.
I have 10 samples in 2 conditions, with 5 samples per condition. When integrating the data for analysis, how should I split the object?
Should I split by condition or by the 10 samples in Seurat?
data.list <- SplitObject(data, split.by = "sample") or by condition.
Looking forward to your reply.
so we have clusters labeled when presenting
Zhu got the error: 'calculateQCMetrics' is defunct. Use 'perCellQCMetrics' instead. See help("Defunct")
She tried perCellQCMetrics but it is not the same as calculateQCMetrics - need to look into this error (especially using R 4.0)
if mention of it in the lessons - remove it.
our new UMAP install instructions do not require this library.
To our README add a link to the UMAP installation markdown - students will need this for the workshop.
@mistrm82 did this - but double check to make sure it is using Anaconda and NOT requiring reticulate
In QC lesson, merged unprocessed seurat object was saved as raw_seurat.RData. Filtered seurat object was saved as seurat_raw.RData. The names are confusing for students.
Find out from the paper if they have a list of barcodes to identify which cell came from which sample (of the 8)
Hi,
I got an error while running the following code
# Filter out low quality reads using selected thresholds - these will change with experiment
filtered_seurat <- subset(x = merged_seurat,
                          subset = (nUMI >= 500) &
                            (nGene >= 250) &
                            (log10GenesPerUMI > 0.80) &
                            (mitoRatio < 0.20))
The error:
Error in CellsByIdentities(object = object, cells = cells) :
Cannot find cells provided
Could it be that the metadata with renamed variables is causing the error?
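One thing worth checking, offered as an assumption rather than a confirmed diagnosis: if the row names of the modified metadata no longer match the cell names of the object, cells cannot be looked up. The sketch below applies the same thresholds to a toy metadata table in base R and checks the row names; the barcodes and values are made up, and cell_names stands in for colnames(merged_seurat).

```r
# Hypothetical cell barcodes, standing in for colnames(merged_seurat)
cell_names <- c("ctrl_AAAC", "ctrl_AAAG", "stim_AACT", "stim_AAGT")

# Toy metadata with the renamed QC columns used in the lesson
metadata <- data.frame(
  nUMI      = c(1200, 300, 900, 5000),
  nGene     = c(600, 100, 200, 2100),
  mitoRatio = c(0.05, 0.10, 0.02, 0.35),
  row.names = cell_names
)
metadata$log10GenesPerUMI <- log10(metadata$nGene) / log10(metadata$nUMI)

# Row names must still be the cell barcodes after editing the metadata
rownames_match <- all(rownames(metadata) %in% cell_names)

# Same thresholds as in the subset() call, applied to the data frame
keep <- metadata$nUMI >= 500 &
  metadata$nGene >= 250 &
  metadata$log10GenesPerUMI > 0.80 &
  metadata$mitoRatio < 0.20
cells_to_keep <- rownames(metadata)[keep]
```

If rownames_match is FALSE on the real object (for example because the row names became 1, 2, 3, ...), restoring the barcodes as row names before assigning the metadata back is a reasonable first thing to try.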
After integration, it feels a bit redundant. Maybe only go through with integration using sctransform.
Since we have two samples, start with integration of the two samples. Have a section that describes a single sample scenario (and the differences)
Maybe one for experimental design (expanding on what Sarah has)
and one for the analysis: QC, integration, marker id, differential expression, traj analysis
Some packages are no longer needed for the current analysis workflow: Matrix.utils, devtools, AnnotationHub, ensembldb. They could be removed from the pre-work installation instructions. Note that AnnotationHub and ensembldb are still needed if people want to generate the annotation file themselves.
Hello,
I have been following some of the tutorial provided by hbc training specifically on integrating different datasets: https://hbctraining.github.io/scRNA-seq/lessons/06_SC_SCT_and_integration.html
I believe I have encountered a slight issue. I followed much of the code that was given on the page; I had all of my samples in one Seurat object, split them, and then performed SCTransform on EACH SEPARATELY (NOTE: I didn't regress out cell cycle):
split_srt <- SplitObject(sample.merge, split.by = "Sample.Name")
for (i in seq_along(split_srt)) {
  split_srt[[i]] <- NormalizeData(split_srt[[i]], verbose = TRUE)
  split_srt[[i]] <- SCTransform(split_srt[[i]], vars.to.regress = c("percent.MT"))
}
I then performed the suggested integration steps:
integ_features <- SelectIntegrationFeatures(object.list = split_srt,
                                            nfeatures = 3000)
split_srt <- PrepSCTIntegration(object.list = split_srt,
                                anchor.features = integ_features)
integ_anchors <- FindIntegrationAnchors(object.list = split_srt,
                                        normalization.method = "SCT",
                                        anchor.features = integ_features)
seurat_integrated <- IntegrateData(anchorset = integ_anchors,
                                   normalization.method = "SCT")
Running a PCA and t-SNE yielded a dimensionality reduction that looked quite integrated:
But the issue is that when I try to find marker genes, it appears that the expression of most genes is seen as background; i.e. there are no white dots on a FeaturePlot:
seurat_integrated <- FindNeighbors(seurat_integrated, dims = 1:30)
seurat_integrated <- FindClusters(seurat_integrated, resolution = 0.5)
Merged.markers <- FindAllMarkers(seurat_integrated, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
plotting some of these markers
I am slightly unsure what I have done wrong/if I missed some steps. I would greatly appreciate any help I get.