smcclatchy / mapping Goto Github PK
View Code? Open in Web Editor NEWQuantitative Trait Mapping Lesson
Home Page: https://smcclatchy.github.io/mapping/
License: Other
Quantitative Trait Mapping Lesson
Home Page: https://smcclatchy.github.io/mapping/
License: Other
kinship = proportion of shared alleles between individuals
heatmap(kinship[1:20, 1:20], symm = TRUE)
some individuals don't appear to be related to self because they are highly heterozygous, thus don't share alleles with self; each chromosome doesn't share alleles with its partner
discuss with your partner why diagonal doesn't equal 1, and the relationship between this and homo and heterozygosity
in addition to R and programming language, ask about genetics and biology and statistics background
graphic of lod plot showing 90, 95, 99% credible interval
make a histogram of permuted iron data
have a look at the permuted iron data (head)
I think this page is very well constructed! It flows very nicely and the explanations a very clear.
# How did you get 0.002?
pr <- calc_genoprob(cross=iron, map=map, error_prob=0.002)
after explaining model accounting for relatedness, include example of kinship loco calculation
fit model leave on chr out
model effect of allele minus current allele; chr minus current chr
During the Kinship Matrix section, there was discussion about using allele probabilities vs genotype probabilities. There was a specific question about why you would choose to use one over the other, however, for me I wasn't even sure what was meant by the difference between allele probabilities and genotype probabilities. It turns out when I spoke to Dan during break, I actually had the two concepts reversed.
as a reference so that people can run through commands quickly instead of searching through lesson for individual commands.
Karl's documentation should have it.
1st column = sample ids
columns 2-5 = phenotypes
column 6-7 = covariates
table 2 = marker genotypes
superior table = marker map
output as svg
... except the genotype under consideration. Why?
because you're putting the same genotype (predictor) into the model twice
overfitting
multi-collinearity
we only use X-chromosome covariates in the lesson
change episode "Special covariates for the X chromosome" to "Covariates" w/ additive covar as subhead 1 and special Xcovar as subhead 2
for addcovar use sex and direction of cross
for Xchr covar do what's written
update all subsequent episodes - all genome scans and perms
Students were confused by the threshold in find_peaks(). Move permutations after performing a genome scan and before find peaks.
showing spread of permutations
thrs = rep(0, 100)
for(i in 1:100) {
tmp = scan1perm(genoprobs = pr, pheno = iron$pheno[,'liver', drop = F], Xcovar = Xcovar,
n_perm = 10)
thrs[i] = quantile(tmp, probs = 0.95)
}
thrs2 = rep(0, 100)
for(i in 1:100) {
tmp = scan1perm(genoprobs = pr, pheno = iron$pheno[,'liver', drop = F], Xcovar = Xcovar,
n_perm = 100)
thrs2[i] = quantile(tmp, probs = 0.95)
}
thrs3 = rep(0, 100)
for(i in 1:100) {
print(i)
tmp = scan1perm(genoprobs = pr, pheno = iron$pheno[,'liver', drop = F], Xcovar = Xcovar,
n_perm = 1000, cores = 4)
thrs3[i] = quantile(tmp, probs = 0.95)
}
data.frame(perm10 = thrs, perm100 = thrs2, perm1000 = thrs3) %>%
pivot_longer(cols = everything()) %>%
ggplot() +
geom_density(aes(value, color = name), size = 1) +
labs(title = 'Comparison of 0.05 threshold using 10 vs 100 permutation',
x = 'LOD', color = 'Num. Perms')
# Redudant:
grav2 <- read_cross2( system.file("extdata", "grav2.zip", package="qtl2geno") )
grav2 <- read_cross2("http://kbroman.org/qtl2/assets/sampledata/grav2/grav2.zip")
grav2 <- read_cross2("~/my_data/grav2.zip")
zip_datafiles("~/my_data/grav2.yaml")
grav2 <- read_cross2("~/my_data/grav2.yaml")
# From all these I think the easiest method is:
grav2 <- read_cross2( system.file("extdata", "grav2.zip", package="qtl2geno") )
# but it would be good for everyone to download the sample data into their own directory
# and learn to load data from their own machines:
grav2 <- read_cross2("~/my_data/grav2.yaml")
setwd("~/Desktop")
dir.create("./mapping")
setwd("~/Desktop/mapping")
dir.create("./data")
dir.create("./scripts")
dir.create("./results")
Downloading DOQTL data:
It should be said that once the data have been downloaded, it should be put into the ~/Desktop/mapping/data directory.
Initial steps that demonstrate reading data is a bit confusing because grav2.yaml wasn't downloaded yet. Where do you get it? There is no link. Also It might be better to either have participants go download the grav2.yaml first and put it into their data directory then the demo, or just read from Karl's website directly but showing all the different ways might be a bit confusing:
library(qtl2geno)
grav2 <- read_cross2("http://kbroman.org/qtl2/assets/sampledata/grav2/grav2.zip")
#or download into directory first then:
grav2 <- read_cross2("~/my_data/grav2.yaml")
?write_control_file()
write_control_file("~/my_data/grav2.yaml",
crosstype="riself",
geno_file="grav2_geno.csv",
gmap_file="grav2_gmap.csv",
pheno_file="grav2_pheno.csv",
phenocovar_file="grav2_phenocovar.csv",
geno_codes=c(L=1L, C=2L),
alleles=c("L", "C"),
na.strings=c("-", "NA"))
new_order <- sample(rownames(iron$pheno))
pheno_perm <- iron$pheno
rownames(pheno_perm) <- new_order
xcovar_perm <- Xcovar
rownames(xcovar_perm) <- new_order
p <- scan1(genoprobs = pr, pheno = pheno_perm, Xcovar = xcovar_perm)
plot(p, map)
head(new_order)
paste your max lod into etherpad
load all max lods into a vector
create a histogram of max lods
do 1000 perms as in the lesson
plot the max lod in a histogram
max a hist of operm[,1]
now on your own, hist of operm[,2]
draw abline for 0.2 or 0.05 threshold for liver or spleen onto histogram
abline(v = 3.36, col = "red")
now on your own, do the same for spleen, for 0.2 , etc. use blue, gray etc.
create a script of the following for people to copy and paste, then change lodcolumns to 2 for spleen and col 2 for spleen
create challenge and provide url to script for copy-paste
plot(out, map, lodcolumn=1,
col=color[1],
main=colnames(iron$pheno)[1],
ylim=c(0, ymx*1.02))
plot(out_pg, map, lodcolumn=1, col=color[2], add=TRUE)
plot(out_pg_loco, map, lodcolumn=1,
col=color[3], add=TRUE, lty=2)
legend("topleft", lwd=2, col=color,
c("H-K", "LMM", "LOCO"), bg="gray90", lty=c(1,1,2))
plot(out, lodcolumn="liver", map)
plot(out, lodcolumn="liver", map, col="black", add=TRUE)
plot(out_pg, lodcolumn="liver", map, col="red", add=TRUE)
plot(out_pg_loco, lodcolumn="liver", map, col="yellow", add=TRUE)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.