surh / hmvar
Human Microbiome Variant Analysis in R
License: GNU General Public License v3.0
Obtain a DOI via Zenodo.
When the minor allele frequency is equal to 0.5 and freq_thres = 0.5, the locus is always assigned to the major allele. This should be modified to discard such ties. Most will come from cases where there are only two reads.
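The tie-discarding behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the package's R code: the function name `assign_allele`, its arguments, and the exact meaning given to `freq_thres` here (a single cutoff on the within-sample minor allele frequency) are assumptions made for the example.

```python
import math
from typing import Optional


def assign_allele(major_count: int, minor_count: int,
                  freq_thres: float = 0.5) -> Optional[str]:
    """Hypothetical helper: return 'major', 'minor', or None for no call."""
    depth = major_count + minor_count
    if depth == 0:
        return None  # no reads at this site, nothing to assign
    minor_freq = minor_count / depth
    # An exact tie at the threshold (e.g. 1 vs 1 reads) is ambiguous,
    # so it is discarded instead of defaulting to the major allele.
    if math.isclose(minor_freq, freq_thres):
        return None
    return "minor" if minor_freq > freq_thres else "major"
```

With two reads split one to one, this sketch returns `None`, so the site would be dropped for that sample rather than counted as major.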
For functional enrichment functions (#14): if Gene Ontology is being tested, use the structure of the ontology to get correct p-values, probably via topGO.
The benchmark for imputation via mice needs to be completed.
We need a test_annotations function that tests all annotations above some count threshold from a set.
We also need a more general process_annotations function that takes some form of table of genes with annotations, and a subset of significant genes to look for enrichments.
See functions in inst/scripts/mktest_enrichments
Calculate Ka/Ks per gene.
Either use the MK table or use another package that incorporates more information.
Script should take results from both vMWAs and MKtest and compare the results.
It should be able to:
Basically, from a file of genes with statistics, a file of annotations in eggNOG-mapper (emapper.py) format, and some significance thresholds, calculate enrichment for the desired annotations.
Ideally it should have an option to take either files or directories with the stats. If directories are passed, it should analyze all files within each directory.
It was originally written for mktest but probably has general utility. Must add an option to keep only CDS and set its default to TRUE for backwards compatibility.
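One way to read the enrichment step is as a one-sided hypergeometric test per annotation term. The sketch below is in Python for illustration only and is not the script's actual method: the function names, the dictionary inputs (gene to p-value, gene to list of terms) and the default threshold are all assumptions, and reading the emapper.py file format is left out.

```python
from math import comb


def hypergeom_upper_tail(k: int, n_sig: int, n_annot: int, n_genes: int) -> float:
    """P(X >= k) when drawing n_sig genes from n_genes, n_annot of them annotated."""
    upper = min(n_sig, n_annot)
    total = comb(n_genes, n_sig)
    tail = sum(comb(n_annot, i) * comb(n_genes - n_annot, n_sig - i)
               for i in range(k, upper + 1))
    return tail / total


def enrichment_pvalues(gene_pvals: dict, gene_terms: dict,
                       sig_thres: float = 0.05) -> dict:
    """Hypothetical wrapper: enrichment p-value for every annotation term."""
    genes = set(gene_pvals)
    significant = {g for g, p in gene_pvals.items() if p < sig_thres}
    # Invert gene -> terms into term -> genes, keeping only tested genes.
    members = {}
    for gene, terms in gene_terms.items():
        if gene in genes:
            for term in terms:
                members.setdefault(term, set()).add(gene)
    return {term: hypergeom_upper_tail(len(gs & significant), len(significant),
                                       len(gs), len(genes))
            for term, gs in members.items()}
```

The background here is the set of genes that have a statistic, which is a design choice; a different background (for example only annotated genes) would change the p-values.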
Should take output from MIDAS and plot/analyze number of variable sites per genome per sample.
The script should be able to:
Script should be able to plot SNPs from genes from MIDAS. Might require some functions.
Script must be able to:
Add an option that allows me to subset the genes I want to analyze.
Should take the output from MIDAS and select all the data from a subset of genes.
I am not sure if it is possible, but at least n_trials should be the same as 'size' in the output of the other functions.
Use DoS function (#11) in dos.r
Set command line arguments via argparser.
Build testing via Travis CI
Vignette should include
Incorporate unit testing via testthat.
Create a function to read the eggNOG-mapper (emapper.py) annotation format.
The function test_go already does this automatically via topGO, but the other enrichment functions, gsea and sign_test, rely exclusively on the annotations given by the user.
There is a major bug, of unknown cause, in the gsea function. Calling term_gsea on each term produces a list with the correct p-values and sizes, but using either bind_rows or rbind (or map_dfr, which uses bind_rows) changes all the numeric values.
I cannot reproduce the error with simple tibble creation.
Create function for DoS from mktest output.
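For reference, the direction of selection statistic (Stoletzki and Eyre-Walker, 2011) is DoS = Dn / (Dn + Ds) - Pn / (Pn + Ps), computed from the four counts of a McDonald-Kreitman table. A minimal Python sketch follows; the function name and the choice to return `None` when either denominator is zero are assumptions for illustration, not the behaviour of the package's R function.

```python
from typing import Optional


def direction_of_selection(dn: int, ds: int, pn: int, ps: int) -> Optional[float]:
    """DoS from MK table counts: fixed (Dn, Ds) and polymorphic (Pn, Ps) sites."""
    if dn + ds == 0 or pn + ps == 0:
        return None  # undefined without both substitutions and polymorphisms
    return dn / (dn + ds) - pn / (pn + ps)
```

Positive values are usually read as a sign of adaptive evolution and negative values as a sign of segregating slightly deleterious mutations.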
Track internal testing coverage via codecov
Currently, terms below min_size enter the testing function and return NULL. It would be better if they never called the testing function, and if the testing function didn't have a min_size argument.
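The proposed restructuring, where the size filter lives in the caller and the testing function never sees small terms, could look like the sketch below. It is written in Python for illustration; `run_term_tests`, `term_genes` and `test_fn` are hypothetical names and do not correspond to functions in the package.

```python
from typing import Callable, Dict, List


def run_term_tests(term_genes: Dict[str, List[str]],
                   test_fn: Callable[[List[str]], float],
                   min_size: int = 3) -> Dict[str, float]:
    """Apply test_fn only to terms with at least min_size genes."""
    results = {}
    for term, genes in term_genes.items():
        if len(genes) < min_size:
            continue  # filtered here, so test_fn needs no min_size argument
        results[term] = test_fn(genes)
    return results
```

With this layout the result contains only tested terms, so there are no NULL entries to drop afterwards.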
Need to include some test data.
Some basic output from MIDAS must be here. The data must include:
Add function for sign test for either DoS or mkratio.
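A sign test on a set of per-gene values reduces to a two-sided exact binomial test on the number of positive versus negative values, with zeros dropped. The Python sketch below illustrates that idea for DoS, which is centred on zero; the function name and the `null_value` argument are assumptions, and for mkratio a different null value (for instance 1) would presumably be needed.

```python
from math import comb
from typing import Iterable, Optional


def sign_test_pvalue(values: Iterable[float], null_value: float = 0.0) -> Optional[float]:
    """Two-sided exact sign test of the values against null_value."""
    diffs = [v - null_value for v in values]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg  # ties with the null value are discarded
    if n == 0:
        return None
    k = min(n_pos, n_neg)
    # Under the null, each sign is positive with probability 1/2.
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, p)
```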
Function to plot SNP abundances
It doesn't work with allele frequencies
Perform KS tests on functional groups. The set of functions should be similar to the generic functional enrichment functions (#14).
Probably there should be one wrapper function to call either method.
Alternatively, we should have consistent naming and output.
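As a rough illustration of the KS approach, the two-sample Kolmogorov-Smirnov statistic compares the empirical distribution of a gene-level score inside a functional group with the distribution in all other genes. The Python sketch below computes only the statistic D, not a p-value; the function name and inputs are assumptions, and an R implementation would more likely call `ks.test` directly.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(in_group: Sequence[float], out_group: Sequence[float]) -> float:
    """Maximum distance between the two empirical CDFs (both must be non-empty)."""
    xs, ys = sorted(in_group), sorted(out_group)
    d = 0.0
    # The CDFs only change at observed values, so checking those is enough.
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```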
Some analysis can be done via topGO.
To be consistent with other determine_* functions
Set AMOR under Suggests in DESCRIPTION.
Make sure it doesn't break anything.
Use Weir & Cockerham 1984 estimates for multiple loci
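For multiple loci, Weir and Cockerham (1984) combine loci as a ratio of sums of the per-locus variance components rather than as a mean of per-locus ratios: Fst = sum(a) / sum(a + b + c). The Python sketch below shows only that combination step and assumes the components a (between populations), b (between individuals within populations) and c (within individuals) have already been estimated for each locus; the function name and input layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def wc84_multilocus_fst(components: Iterable[Tuple[float, float, float]]) -> Optional[float]:
    """Combine per-locus (a, b, c) variance components into one Fst estimate."""
    sum_a = 0.0
    sum_total = 0.0
    for a, b, c in components:
        sum_a += a
        sum_total += a + b + c
    if sum_total == 0:
        return None  # no variation at any locus, estimate undefined
    return sum_a / sum_total
```

Summing before dividing means that loci with more total variance weigh more, and a monomorphic locus contributes nothing instead of producing an undefined per-locus ratio.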