hmm4ga's Introduction

Hidden Markov Models for Genome Analysis

The goal of this project is a basic implementation of the pair hidden Markov model (pair HMM) forward algorithm for genomic sequence analysis (described in [1]), with concurrent computation introduced through the OpenMP API.

Further details are provided in the articles [2] and [3] and in the book Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (page 88, §4.2).
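As a rough illustration of the forward algorithm mentioned above, here is a minimal serial sketch that follows the textbook recurrences for a three-state pair HMM. It is not the project's code: the struct `PairHmmParams`, the function `pairHmmForward`, the parameters `delta`, `epsilon`, `tau` and the simple match/mismatch emission scheme are all assumptions made for this example. The matrix names M, I and D mirror the ones used in this README, with I assumed to hold a read symbol aligned to a gap and D a haplotype symbol aligned to a gap.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter bundle, not taken from the project.
struct PairHmmParams {
    double delta;        // probability of opening a gap (M -> I or M -> D)
    double epsilon;      // probability of extending a gap (I -> I or D -> D)
    double tau;          // probability of moving to the end state
    double matchProb;    // emission probability for a pair of equal symbols
    double mismatchProb; // emission probability for a pair of different symbols
    double gapEmission;  // emission probability for a symbol aligned to a gap
};

// Returns the total probability of the two sequences under the model,
// summed over all alignments (forward algorithm).
double pairHmmForward(const std::string& read, const std::string& hap,
                      const PairHmmParams& p) {
    assert(p.delta >= 0.0 && p.epsilon >= 0.0 && p.tau >= 0.0);
    assert(2.0 * p.delta + p.tau <= 1.0 && p.epsilon + p.tau <= 1.0);

    const std::size_t n = read.size();
    const std::size_t m = hap.size();
    using Matrix = std::vector<std::vector<double>>;
    Matrix M(n + 1, std::vector<double>(m + 1, 0.0));
    Matrix I = M;
    Matrix D = M;
    M[0][0] = 1.0; // the model starts in the match state with nothing emitted

    for (std::size_t i = 0; i <= n; ++i) {
        for (std::size_t j = 0; j <= m; ++j) {
            if (i == 0 && j == 0) continue;
            if (i > 0 && j > 0) {
                // Match state: consumes one symbol from each sequence.
                const double emit =
                    (read[i - 1] == hap[j - 1]) ? p.matchProb : p.mismatchProb;
                M[i][j] = emit *
                    ((1.0 - 2.0 * p.delta - p.tau) * M[i - 1][j - 1] +
                     (1.0 - p.epsilon - p.tau) *
                         (I[i - 1][j - 1] + D[i - 1][j - 1]));
            }
            if (i > 0) {
                // I state: consumes a read symbol only.
                I[i][j] = p.gapEmission *
                    (p.delta * M[i - 1][j] + p.epsilon * I[i - 1][j]);
            }
            if (j > 0) {
                // D state: consumes a haplotype symbol only.
                D[i][j] = p.gapEmission *
                    (p.delta * M[i][j - 1] + p.epsilon * D[i][j - 1]);
            }
        }
    }
    // Termination: every state moves to the end state with probability tau.
    return p.tau * (M[n][m] + I[n][m] + D[n][m]);
}
```

A call such as `pairHmmForward(readString, haplotypeString, params)` returns a single probability. For long sequences the values underflow quickly, so a practical version would probably work in log space or rescale the matrices.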

Code Description

The main files that make up the project are:

  • Sequence.h: class that represents a sequence of nucleotides. It contains the string of characters that compose the sequence and its SequenceGenerator.
  • SequenceGenerator.h: class that defines a random emission probability distribution for a sequence of nucleotides. Currently, an instance of the class Sequence is randomly generated according to its SequenceGenerator.
  • ProbabilityMatrix.h: class that represents a generic matrix of floating-point values, from which the classes DynamicMatrix and StateTransitionMatrix inherit common attributes and methods. DynamicMatrix adds the ability to append rows and columns dynamically, while StateTransitionMatrix provides a set of states and a mapping between them and the indices of the matrix.
  • PairHMM.h: class that implements the pair HMM forward algorithm. It holds 2 instances of the class Sequence (one for the read sequence and one for the haplotype sequence), 1 instance of the class StateTransitionMatrix (matrix T), and 3 instances of the class DynamicMatrix (matrices M, I and D).
  • main.cpp: the entry point of the program. It creates an instance of the class PairHMM and calls its method that runs the pair HMM forward algorithm.
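The matrix hierarchy described above could look roughly like the following sketch. Only the three class names come from this README. Every member name (`get`, `set`, `addRow`, `addColumn`, `transition`, `setTransition`) and the choice of `std::vector` and `std::map` for storage are assumptions made for illustration, and the real headers may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Generic matrix of floating-point values shared by both subclasses.
class ProbabilityMatrix {
public:
    ProbabilityMatrix(std::size_t rows, std::size_t cols)
        : data_(rows, std::vector<double>(cols, 0.0)), cols_(cols) {}
    virtual ~ProbabilityMatrix() = default;

    double get(std::size_t r, std::size_t c) const { return data_.at(r).at(c); }
    void set(std::size_t r, std::size_t c, double v) { data_.at(r).at(c) = v; }
    std::size_t rows() const { return data_.size(); }
    std::size_t cols() const { return cols_; }

protected:
    std::vector<std::vector<double>> data_;
    std::size_t cols_;
};

// Matrix that can grow by one row or one column at a time.
class DynamicMatrix : public ProbabilityMatrix {
public:
    using ProbabilityMatrix::ProbabilityMatrix;

    void addRow() { data_.emplace_back(cols_, 0.0); }
    void addColumn() {
        for (auto& row : data_) row.push_back(0.0);
        ++cols_;
    }
};

// Square matrix indexed by state name instead of by position.
class StateTransitionMatrix : public ProbabilityMatrix {
public:
    explicit StateTransitionMatrix(const std::vector<std::string>& states)
        : ProbabilityMatrix(states.size(), states.size()) {
        for (std::size_t i = 0; i < states.size(); ++i) index_[states[i]] = i;
    }

    double transition(const std::string& from, const std::string& to) const {
        return get(index_.at(from), index_.at(to));
    }
    void setTransition(const std::string& from, const std::string& to, double p) {
        assert(p >= 0.0 && p <= 1.0); // transition entries are probabilities
        set(index_.at(from), index_.at(to), p);
    }

private:
    std::map<std::string, std::size_t> index_; // state name -> row/column index
};
```

In this sketch the name-to-index mapping lives only in StateTransitionMatrix, so the M, I and D matrices stay plain numeric grids that can grow as the sequences are extended.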

Language and APIs

The code is written entirely in C++ and uses the following libraries and APIs (omitting the standard ones):

  • random: used for the random generation of sequences and the random definition of state transition probabilities
  • algorithm: used for shuffling sequences for randomization purposes
  • OpenMP: used in PairHMM.cpp to introduce thread-level parallelism in the algorithm
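One common way to add thread-level parallelism to this kind of dynamic programming is to sweep the matrices by anti-diagonals: every cell with the same i + j depends only on cells from the two previous anti-diagonals, so the cells of one anti-diagonal can be filled concurrently. The sketch below shows that idea with `#pragma omp parallel for`. It is an assumption about how OpenMP might be applied, not a description of what PairHMM.cpp actually does, and the function name `forwardWavefront` and all of its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Pair HMM forward probability computed one anti-diagonal at a time.
// delta: gap-open, epsilon: gap-extend, tau: termination probability.
double forwardWavefront(const std::string& read, const std::string& hap,
                        double delta, double epsilon, double tau,
                        double matchProb, double mismatchProb,
                        double gapEmission) {
    assert(2.0 * delta + tau <= 1.0 && epsilon + tau <= 1.0);

    const long n = static_cast<long>(read.size());
    const long m = static_cast<long>(hap.size());
    std::vector<std::vector<double>> M(n + 1, std::vector<double>(m + 1, 0.0));
    auto I = M;
    auto D = M;
    M[0][0] = 1.0;

    // d = i + j identifies an anti-diagonal; d = 0 is the start cell only.
    for (long d = 1; d <= n + m; ++d) {
        const long iMin = std::max(0L, d - m);
        const long iMax = std::min(n, d);
        // Each iteration writes only its own cell (i, d - i) and reads cells
        // from anti-diagonals d - 1 and d - 2, so iterations are independent.
        #pragma omp parallel for
        for (long i = iMin; i <= iMax; ++i) {
            const long j = d - i;
            if (i > 0 && j > 0) {
                const double emit =
                    (read[i - 1] == hap[j - 1]) ? matchProb : mismatchProb;
                M[i][j] = emit *
                    ((1.0 - 2.0 * delta - tau) * M[i - 1][j - 1] +
                     (1.0 - epsilon - tau) * (I[i - 1][j - 1] + D[i - 1][j - 1]));
            }
            if (i > 0) {
                I[i][j] = gapEmission * (delta * M[i - 1][j] + epsilon * I[i - 1][j]);
            }
            if (j > 0) {
                D[i][j] = gapEmission * (delta * M[i][j - 1] + epsilon * D[i][j - 1]);
            }
        }
    }
    return tau * (M[n][m] + I[n][m] + D[n][m]);
}
```

With GCC the pragma takes effect only when the code is compiled with `-fopenmp`. Without that flag the loop simply runs serially and gives the same result. For short sequences the cost of starting a parallel region on every anti-diagonal can outweigh the gain, so this layout tends to pay off only on longer inputs.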

How to run the code (Windows)

  1. Install MinGW-w64 version > 9.2 (otherwise the randomly generated sequence will be the same at each execution, as reported here and here)
  2. Install CMake
  3. Create a folder for building the project
  mkdir build
  cd build
  4. Generate the makefiles
  cmake -G "MinGW Makefiles" ..
  5. Build the project
  cmake --build .
  6. Run the program
  ./HMM4GA.exe
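The compiler requirement in step 1 matters because of how a random engine is typically seeded. The sketch below shows the usual pattern of seeding `std::mt19937` from `std::random_device` and drawing nucleotides from a weighted distribution. If `std::random_device` returns the same value on every run, as the note in step 1 describes for older toolchains, the generated sequence repeats as well. The function names `makeSeededEngine` and `generateSequence` and the weight layout are assumptions for this example, not the project's SequenceGenerator interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Builds an engine seeded from the platform's random device.
// A deterministic random_device yields the same engine state on every run.
std::mt19937 makeSeededEngine() {
    std::random_device rd;
    return std::mt19937(rd());
}

// Draws `length` nucleotides; weights are relative frequencies for A, C, G, T.
std::string generateSequence(std::size_t length,
                             const std::array<double, 4>& weights,
                             std::mt19937& rng) {
    static const char alphabet[] = {'A', 'C', 'G', 'T'};
    double total = 0.0;
    for (double w : weights) {
        assert(w >= 0.0);
        total += w;
    }
    assert(total > 0.0); // at least one symbol must be possible

    std::discrete_distribution<int> pick(weights.begin(), weights.end());
    std::string seq;
    seq.reserve(length);
    for (std::size_t k = 0; k < length; ++k) {
        seq.push_back(alphabet[pick(rng)]);
    }
    return seq;
}
```

Calling `generateSequence(50, weights, engine)` twice in separate runs of the program, with `engine` obtained from `makeSeededEngine()`, is a quick way to check whether the installed toolchain produces different sequences each time.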

Contributors

  • leogori
