polyploid-genotyping's Introduction

polyploid-genotyping

This is the GitHub repository for the manuscript "SNP genotyping and parameter estimation in polyploids using low-coverage sequencing data," which is currently under peer review. We have also posted a preprint on bioRxiv.

  • ebg/: C++ source code for EM/ECM algorithms for genotyping with our models.

  • helper-scripts/: Python, Perl, R, and Bash scripts for extracting and filtering allele depth info from a VCF file for use with ebg.

  • Rcode/: R and C++ code for performing simulations and for running analyses.

  • data/: example data sets from Betula pendula, B. pubescens, and Andropogon gerardii that were used in the manuscript.

  • docs/: Rmd and HTML files for GitHub pages site.

More details on the contents of each folder can be found in that folder's README file.

polyploid-genotyping's People

Contributors

pblischak

polyploid-genotyping's Issues

compiling error on Ubuntu 18.04

Hi Paul,
I tried installing EBG on our machine running Ubuntu 18.04. When I ran make, I got the error "'isnan' was not declared in this scope", as well as suggestions to use std::isnan.
Lindsay
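
A likely cause is an unqualified call to `isnan`: the `<cmath>` header is only guaranteed to declare `std::isnan`, and newer GCC versions do not always provide a global-namespace `isnan` as well. The sketch below shows the qualified form the compiler suggests. The helper name `is_usable` is hypothetical and is not taken from the ebg source.

```cpp
#include <cmath>  // declares std::isnan

// Hypothetical helper (not part of ebg): returns true when a computed
// value, such as a log likelihood, is not NaN.
bool is_usable(double value) {
    // Qualifying the call with std:: avoids relying on a global-namespace
    // isnan, which <cmath> is not required to provide.
    return !std::isnan(value);
}
```

Under this assumption, replacing each bare `isnan(x)` in the source with `std::isnan(x)` should let `make` proceed.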

Improper log-sum of exponentials in log likelihood

The log likelihoods of the different models are computed using log sums of exponentials, which can have nasty overflow/underflow problems (cf. this link). I've run into this problem when read counts are high (on the order of 1000x). Since sequencing coverage isn't typically this high, it hopefully shouldn't be an issue for most data sets. Nevertheless, I'll need to fix it.
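
One standard remedy is the log-sum-exp trick: subtract the largest log term before exponentiating, then add it back after taking the log. The following is a minimal sketch of that technique, not the fix that was applied in ebg. The function name `log_sum_exp` and its vector-based interface are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Illustrative helper (name and signature are assumptions, not ebg's API):
// computes log(sum_i exp(log_terms[i])) without overflow or underflow.
double log_sum_exp(const std::vector<double>& log_terms) {
    // The log of an empty sum is log(0) = -infinity.
    if (log_terms.empty()) {
        return -std::numeric_limits<double>::infinity();
    }
    const double max_term =
        *std::max_element(log_terms.begin(), log_terms.end());
    // If the maximum is infinite, the result equals the maximum itself,
    // and subtracting it below would produce NaN.
    if (std::isinf(max_term)) {
        return max_term;
    }
    double sum = 0.0;
    for (double term : log_terms) {
        // Every exponent is <= 0, so exp() cannot overflow, and the
        // largest term contributes exactly exp(0) = 1.
        sum += std::exp(term - max_term);
    }
    return max_term + std::log(sum);
}
```

For two terms of -1000 each, a naive `log(exp(-1000) + exp(-1000))` evaluates to negative infinity in double precision because `exp(-1000)` underflows to zero, whereas the shifted form returns -1000 + log(2).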

Potential memory error

I currently store the input data (total and alt read counts) by pushing them linearly into a single integer vector. A user has noted that large data sets are not read in properly, presumably because of a memory error (I haven't verified this yet), so the program will not run. I plan to store the input as two-dimensional vectors instead to see whether that fixes the issue.

#include <vector>

// Current implementation: one flat vector
std::vector<int> input;

// New implementation: 2D vector (replaces the declaration above)
std::vector< std::vector<int> > input;
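
To make the two layouts concrete, here is a small sketch of how an entry could be addressed in each. It assumes one row per locus and one column per individual, and the names `ReadMatrix`, `make_read_matrix`, and `flat_index` are all hypothetical. Whether the layout is actually responsible for the reported error is unverified, as noted above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D layout: one inner vector per locus, one entry per individual.
typedef std::vector< std::vector<int> > ReadMatrix;

// Builds an n_loci x n_individuals matrix of read counts, initialized to zero.
ReadMatrix make_read_matrix(std::size_t n_loci, std::size_t n_individuals) {
    return ReadMatrix(n_loci, std::vector<int>(n_individuals, 0));
}

// Position of the same entry in a flat, locus-major vector. Using std::size_t
// rather than int keeps the product from overflowing on large data sets.
std::size_t flat_index(std::size_t locus, std::size_t individual,
                       std::size_t n_individuals) {
    return locus * n_individuals + individual;
}
```

With the 2D form, an entry is read as `reads[locus][individual]`, while the flat form needs `flat[flat_index(locus, individual, n_individuals)]`.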