The Biotite Project

Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:

  • Searching and fetching data from biological databases
  • Reading and writing popular sequence/structure file formats
  • Analyzing and editing sequence/structure data
  • Visualizing sequence/structure data
  • Interfacing external applications for further analysis

Biotite internally stores most of the data as NumPy ndarray objects, enabling

  • fast C-accelerated analysis,
  • intuitive usability through NumPy-like indexing syntax,
  • extensibility through direct access of the internal NumPy arrays.

As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
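
The NumPy-like indexing mentioned above can be illustrated with plain NumPy arrays. This is only a minimal sketch of the idea, not Biotite's API: the array names and the data are made up for illustration.

```python
import numpy as np

# Hypothetical per-atom annotations stored as parallel NumPy arrays
# (illustrative data, not read from a real structure file)
res_name = np.array(["CYS", "CYS", "GLY", "CYS", "ALA"])
atom_name = np.array(["CA", "SG", "CA", "SG", "CA"])
coord = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [5.0, 5.0, 5.0],
    [1.0, 2.0, 0.0],
    [9.0, 9.0, 9.0],
])

# Boolean mask selecting the sulfur atoms of cysteine residues
mask = (res_name == "CYS") & (atom_name == "SG")

# The same mask indexes every array that shares the atom dimension
sg_coord = coord[mask]

# Distance between the two selected atoms
distance = np.linalg.norm(sg_coord[0] - sg_coord[1])
print(distance)
```

Because annotations and coordinates share one axis, a single boolean mask filters all of them consistently, which is the usual benefit of storing such data in arrays.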

If you use Biotite in a scientific publication, please cite:

Kunzmann, P. & Hamacher, K. BMC Bioinformatics (2018) 19:346.

Installation

Biotite requires the following packages:

  • numpy
  • requests
  • msgpack
  • networkx

Some functions require additional packages:

  • mdtraj - Required for trajectory file I/O operations.
  • matplotlib - Required for plotting purposes.

Biotite can be installed via Conda...

$ conda install -c conda-forge biotite

... or pip

$ pip install biotite

Usage

Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:

import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez

# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)

# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])

Output:

MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA

TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT

DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
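
The tuple `gap_penalty=(-10, -1)` in the example requests an affine gap penalty, where opening a gap is penalized more heavily than extending it. The following plain-Python sketch shows how such a penalty compares to a linear one. The helpers `affine_gap_score` and `linear_gap_score` are hypothetical and not part of Biotite. The sketch assumes the common convention that the first gap position costs the opening penalty and each further position costs the extension penalty; consult the API reference for the exact convention used by `align_optimal()`.

```python
def affine_gap_score(gap_length, gap_open=-10, gap_extend=-1):
    """Score of one contiguous gap under an affine penalty.

    Hypothetical helper; assumes the first gap position costs
    'gap_open' and every further position costs 'gap_extend'.
    """
    if gap_length <= 0:
        return 0
    return gap_open + (gap_length - 1) * gap_extend


def linear_gap_score(gap_length, gap_penalty=-10):
    """Score of one contiguous gap under a linear penalty."""
    return gap_length * gap_penalty


# Compare both schemes for gaps of different lengths
for n in (1, 6, 21):
    print(n, affine_gap_score(n), linear_gap_score(n))
```

Under these assumptions, long gaps are much cheaper with the affine scheme than with the linear one, so an alignment tends to contain a few long gaps rather than many short ones.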

More documentation, including a tutorial, an example gallery, and the API reference, is available at https://www.biotite-python.org/.

Contribution

Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.
