mikemc / mgs-bias-manuscript
Analysis for McLaren, Willis, and Callahan (2019)
Home Page: https://doi.org/10.7554/eLife.46923
License: Other
Currently, I have LazyData: true in the DESCRIPTION file, which means that the datasets in data/ are available without calling the data() function. If I keep this setting, then I can remove the calls to data() from the Rmd's.
Calculating the error with adist as in
error <- pred %>%
  group_by(Mixture_type, Bias_type) %>%
  summarize(
    RAdist = sqrt(mean(adist(Observed, Predicted))),
    Adist2 = mean(adist(Observed, Predicted)^2),
    (...)
  )
is incorrect, as the Aitchison distance must be calculated within samples and then averaged. This is currently done twice in the Brooks2015 analysis, though the results are not currently used or referenced in the manuscript.
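A minimal sketch of the within-sample calculation, in base R. The toy data frame, its Sample and Taxon columns, and the aitchison_dist() helper are hypothetical stand-ins, and the sketch assumes Observed and Predicted are strictly positive proportions. The repository's own adist function and the exact summary statistics wanted may differ.

```r
# Aitchison distance between two compositions over the same taxa:
# the Euclidean distance between their centered log-ratio (clr) transforms.
aitchison_dist <- function(x, y) {
  clr <- function(v) log(v) - mean(log(v))
  sqrt(sum((clr(x) - clr(y))^2))
}

# Hypothetical long-format predictions: one row per sample-taxon pair.
pred <- data.frame(
  Sample    = rep(c("s1", "s2"), each = 3),
  Taxon     = rep(c("A", "B", "C"), times = 2),
  Observed  = c(0.5, 0.3, 0.2, 0.2, 0.2, 0.6),
  Predicted = c(0.4, 0.4, 0.2, 0.2, 0.2, 0.6)
)

# Step 1: compute one distance per sample, using only that sample's taxa.
per_sample <- sapply(
  split(pred, pred$Sample),
  function(d) aitchison_dist(d$Observed, d$Predicted)
)

# Step 2: average the per-sample distances (here, the mean squared
# distance and its square root; an assumed choice of summary).
error <- c(
  RAdist = sqrt(mean(per_sample^2)),
  Adist2 = mean(per_sample^2)
)
```

In the real analysis the same two steps would be applied within each Mixture_type and Bias_type group, with the sample identifier added to the first grouping.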
Possible ways to improve documentation include adding a README.md to the folder giving the overall purpose and a brief description of each script, and adding a numerical prefix to the scripts for the costea2017 dataset in the order in which they need to be run.
In particular, the estimated CN of 1 for L. iners is suspicious. Should compare the estimates from the refseq genomes to the closest relatives in the rrnDB, and if warranted, use the median of the rrnDB numbers instead of the refseq annotation.
also make sure coord_fixed() is used
Specifically in case the new nest() and unnest() syntax breaks the existing code. (See https://tidyr.tidyverse.org/articles/in-packages.html.) If so, either update the code or create a requirement for tidyr < 1.0.0 and a warning in the ReadMe.
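If the version requirement route is taken, a runtime check could accompany it. The sketch below is one possible guard, not code from the repository; the helper name tidyr_is_legacy() is hypothetical.

```r
# Hypothetical helper: TRUE if the given version string predates tidyr 1.0.0,
# the release that changed the nest()/unnest() interface.
tidyr_is_legacy <- function(v) {
  package_version(as.character(v)) < "1.0.0"
}

# Warn at the top of an Rmd if a newer tidyr is installed.
if (requireNamespace("tidyr", quietly = TRUE) &&
    !tidyr_is_legacy(utils::packageVersion("tidyr"))) {
  warning("This analysis was written for tidyr < 1.0.0; ",
          "nest() and unnest() calls may need updating.")
}

# The matching hard requirement in DESCRIPTION would read:
#   Imports: tidyr (< 1.0.0)
```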
Many thanks for providing the tools for correcting bias in metagenomic sequencing measurements. I still need to fully understand the paper, but it is very interesting.
However, I couldn't use the method implemented in the paper after loading the package with library(). The package only exports the following: "%>%", "brooks2015_counts", "brooks2015_sample_data", "brooks2015_species_info", "costea2017_metaphlan2_profiles", "costea2017_mock_composition", "costea2017_sample_data".
To correct this, you need to add the @export tag to each function that you want to make available to the user. I could provide a pull request if you wanted.
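For illustration, a roxygen2 block with the tag looks like the sketch below. The function close_example() is a made-up placeholder, not one of the package's functions.

```r
#' Normalize a vector of abundances to proportions (hypothetical example)
#'
#' @param x A numeric vector of non-negative abundances.
#' @return The vector `x` divided by its sum.
#' @export
close_example <- function(x) {
  x / sum(x)
}

# After adding the tag, regenerate NAMESPACE (not run here):
#   devtools::document()
```

Once NAMESPACE contains the corresponding export() entry, the function is available after library() without the ::: operator.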
Currently the figures make heavy use of ggplot's default colors for a three-level categorical variable (red, green, and blue), making essential variables impossible to distinguish for people who are red/green colorblind. To fix, need to pick new colors for the taxa in Figures 1 and 2, and change the text references to e.g. "the red taxon", and also need to pick new colors to distinguish the three protocols in the shotgun dataset.
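One option is a fixed palette drawn from the Okabe-Ito colorblind-safe set. The sketch below is only a suggestion; the level names "Taxon 1" to "Taxon 3" are placeholders, and the particular three colors chosen are an assumption.

```r
# Three colors from the Okabe-Ito palette, keyed by placeholder level names.
taxon_colors <- c(
  "Taxon 1" = "#E69F00",  # orange
  "Taxon 2" = "#56B4E9",  # sky blue
  "Taxon 3" = "#CC79A7"   # reddish purple
)

# In a ggplot2 figure the palette would be applied with (not run here):
#   p + ggplot2::scale_color_manual(values = taxon_colors)
```

Text references such as "the red taxon" would then need to name the new colors.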
Just a note to archive a copy of this repository somewhere more permanent (e.g. Zenodo) before publication. Can use Zenodo's GitHub integration: https://zenodo.org/account/settings/github/
Cowplot is back to overriding the default ggplot theme, and this is messing up the sizing in some figures in the html documents
in the Brooks2015 dataset.
Using the simple linear model of Krehenwinkel2017 and the all-taxa three-way interactions model of Brooks2015. Add to the brooks2015 analysis and save for SI figures.