Gnomic

Gnomic is a human- and computer-readable representation of microbial genotypes and phenotypes. The gnomic Python package contains a parser for the Gnomic grammar that can interpret changes over multiple generations.

The first formal guidelines for microbial genetic nomenclature were drawn up in the 1960s. These traditional nomenclatures are too ambiguous to be useful for modern computer-assisted genome engineering. The gnomic grammar improves on existing nomenclatures: it is designed to be clear, unambiguous and computer-readable, and to describe genotypes at various levels of detail.

Installation

pip install gnomic

Language grammar

The grammar consists of a list of genotype or phenotype designations, separated by spaces and/or commas. The designations are described using the following nomenclature:

Designation                                             Grammar expression
feature deleted                                         -feature
feature at locus deleted                                -feature@locus
feature inserted                                        +feature
site replaced with feature                              site>feature
site (multiple integration) replaced with feature       site>>feature
site at locus replaced with feature                     site@locus>feature
feature of organism                                     organism/feature
feature with type                                       type.feature
feature with variant                                    feature(variant)
feature with list of variants                           feature(var1, var2) or feature(var1; var2)
feature with accession number                           feature#GB:123456
feature by accession number                             #GB:123456
accession number                                        #database:id or #id
fusion of feature1 and feature2                         feature1:feature2
insertion of two fused features                         +feature1:feature2
insertion of a list of features or fusions              +{..insertables}
fusion of a list and a feature                          {..insertables}:feature
a non-integrated plasmid                                (plasmid) or (plasmid ...insertables)
integrated plasmid vector with required insertion site  site>(vector ..insertables)

Feature variants

Features may have one or more variants, separated by a semicolon ";" or a comma ",".

For example: geneX(cold-resistant; heat-resistant)

Variants can either be identifiers (using the characters a-z, 0-9, "-" and "_") or be sequence variants following the HGVS Sequence Variant Nomenclature.

For example: geneY(c.123G>T)
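
As a rough illustration of the variant syntax above, the following stdlib-only sketch splits a designation such as geneX(cold-resistant; heat-resistant) into the feature name and its variants. The helper name split_variants and the regular expression are assumptions made for this example. It is not the gnomic parser and only covers the simple name(variants) form, without organisms, types or accessions.

```python
import re

# Hypothetical helper for illustration only; not part of the gnomic package.
_FEATURE_RE = re.compile(r'^(?P<name>[^()\s]+)\((?P<variants>[^()]*)\)$')


def split_variants(designation):
    """Return (name, variants) for a simple 'name(var1; var2)' designation."""
    text = designation.strip()
    match = _FEATURE_RE.match(text)
    if match is None:
        # No variant list present: treat the whole text as the feature name.
        return text, ()
    # Variants may be separated by ';' or ','.
    parts = re.split(r'[;,]', match.group('variants'))
    return match.group('name'), tuple(p.strip() for p in parts if p.strip())


print(split_variants('geneX(cold-resistant; heat-resistant)'))
# ('geneX', ('cold-resistant', 'heat-resistant'))
print(split_variants('geneY(c.123G>T)'))
# ('geneY', ('c.123G>T',))
```

For real input, Genotype.parse remains the right entry point, since it also handles organisms, types, fusions and HGVS validation.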

Example usage

In this example, we parse "EcGeneA ΔsiteA::promoterB:EcGeneB ΔgeneC" and "ΔgeneA" in gnomic syntax:

>>> from gnomic import Genotype
>>> g1 = Genotype.parse('+Ec/geneA(variant) siteA>P.promoterB:Ec/geneB -geneC')
>>> g1.added_features
{Feature(organism='Ec', name='geneA', variant=('variant',)),
 Feature(organism='Ec', name='geneB'),
 Feature(type='P', name='promoterB')}
>>> g1.removed_features
{Feature(name='geneC'),
 Feature(name='siteA')}

>>> g2 = Genotype.parse('-geneA', parent=g1)
>>> g2.added_features
{Feature(type='P', name='promoterB'),
 Feature(name='geneB', organism='Ec')}
>>> g2.removed_features
{Feature(name='siteA'),
 Feature(name='geneC')}
>>> g2.changes()
(Change(multiple=False,
        after=Fusion(annotations=(Feature(type='P', name='promoterB'), Feature(organism='Ec', name='geneB'))),
        before=Feature(name='siteA')),
 Change(multiple=False, before=Feature(name='geneC')))

>>> g2.format()
'ΔsiteA→P.promoterB:Ec/geneB ΔgeneC'

Development

To rebuild the gnomic parser using grako (version 3.18.1), run:

grako gnomic-grammar/genotype.enbf -o gnomic/grammar.py -m Gnomic

gnomic's Issues

Parsing 'None' gives obscure error

Trying to parse None as a genotype gives an obscure error:

gnomic.Genotype.parse(None)

.... [lots of lines]
grako.exceptions.FailedToken: (1:1) expecting '/' :

^
FEATURE_ORGANISM
FEATURE
FUSION
REPLACEABLE
replacement
change
start

A clearer error message would make this much easier to debug.
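
One way such a check might look is sketched below: the input type is validated before the text ever reaches the grako parser, so the caller sees a short TypeError instead of a grammar trace. The function name require_genotype_string and the message wording are assumptions for this example, not existing gnomic API.

```python
def require_genotype_string(value):
    """Return value unchanged if it is a str, else raise a descriptive TypeError."""
    if not isinstance(value, str):
        raise TypeError(
            'genotype must be a string, got {}'.format(type(value).__name__))
    return value


try:
    require_genotype_string(None)
except TypeError as error:
    print(error)  # genotype must be a string, got NoneType
```

A caller could then write Genotype.parse(require_genotype_string(text)), or the same check could sit at the top of Genotype.parse itself.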

genotype_to_string and genotype_to_text failure

from gnomic.utils import genotype_to_string, genotype_to_text

from gnomic import Mutation, FeatureTree, Feature, Accession

g = Mutation(new=FeatureTree(Feature(type='reaction', accession=Accession(identifier='3OAR140'), variant='down-regulation(-0.996)')), old=FeatureTree(Feature(type='reaction', accession=Accession(identifier='3OAR140'))), multiple=False)
genotype_to_text(g)

fails with

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-db861f18a11d> in <module>()
      4 
      5 g = Mutation(new=FeatureTree(Feature(type='reaction', accession=Accession(identifier='3OAR140'), variant='down-regulation(-0.996)')), old=FeatureTree(Feature(type='reaction', accession=Accession(identifier='3OAR140'))), multiple=False)
----> 6 genotype_to_text(g)

/Users/niso/anaconda3/lib/python3.5/site-packages/gnomic/utils.py in genotype_to_text(genotype, fusions)
     24     """
     25     parts = []
---> 26     for change in genotype.changes(fusions=fusions):
     27         parts.append(change_to_text(change))
     28 

AttributeError: 'Mutation' object has no attribute 'changes'

Cannot parse gnomic generated string

Hi!

This simple example doesn't work:

>>> g1 = Genotype.parse('+Ec/geneA siteA>P.promoterB:Ec/geneB -geneC')
>>> g1.format(output='string')
# '-geneC +Escherichia coli/geneA +promoter.promoterB:Escherichia coli/geneB -siteA'
>>> Genotype.parse(g1.format(output='string'))
---------------------------------------------------------------------------
FailedParse                               Traceback (most recent call last)
/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in parse(self, text, rule_name, filename, buffer_class, semantics, trace, whitespace, **kwargs)
    212             rule = self._find_rule(rule_name)
--> 213             result = rule()
    214             self.ast[rule_name] = result

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in wrapper(self)
     60             name = name[1:-1]
---> 61             return self._call(rule, name, params, kwparams)
     62         return wrapper

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _call(self, rule, name, params, kwparams)
    468 
--> 469             node, newpos, newstate = self._invoke_rule(rule, name, params, kwparams)
    470 

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _invoke_rule(self, rule, name, params, kwparams)
    508             try:
--> 509                 rule(self)
    510 

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/gnomic/grammar.py in _start_(self)
    110             self._sep_()
--> 111         self._check_eof()
    112 

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _check_eof(self)
    648         if not self._buffer.atend():
--> 649             self._error('Expecting end of text.')
    650 

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _error(self, item, etype)
    443             list(reversed(self._rule_stack[:])),
--> 444             item
    445         )

FailedParse: (1:21) Expecting end of text. :
-geneC +Escherichia coli/geneA +promoter.promoterB:Escherichia coli/geneB -siteA
                    ^
start

During handling of the above exception, another exception occurred:

FailedParse                               Traceback (most recent call last)
<ipython-input-7-bee3877cab49> in <module>()
----> 1 Genotype.parse(g1.format(output='string'))

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/gnomic/__init__.py in parse(cls, string, parent, organisms, types, **kwargs)
    222         changes = cls._parse_string(string,
    223                                     organisms or DEFAULT_ORGANISMS,
--> 224                                     types or DEFAULT_TYPES)
    225         return Genotype(changes, parent=parent, **kwargs)
    226 

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/gnomic/__init__.py in _parse_string(cls, string, *args, **kwargs)
    186                             whitespace='',
    187                             semantics=semantics,
--> 188                             rule_name='start')
    189 
    190     @classmethod

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in parse(self, text, rule_name, filename, buffer_class, semantics, trace, whitespace, **kwargs)
    219         except FailedParse as e:
    220             self._set_furthest_exception(e)
--> 221             raise self._furthest_exception
    222         finally:
    223             self._clear_cache()

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _option(self)
    677         try:
    678             with self._try():
--> 679                 yield
    680             raise OptionSucceeded()
    681         except FailedCut:

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/gnomic/grammar.py in _VARIANT_(self)
    447                 self._token(')')
    448             with self._option():
--> 449                 self._BINARY_VARIANT_()
    450                 self.name_last_node('@')
    451             self._error('no available options')

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in wrapper(self)
     59             # that the parser generator added
     60             name = name[1:-1]
---> 61             return self._call(rule, name, params, kwparams)
     62         return wrapper
     63     return decorator

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _call(self, rule, name, params, kwparams)
    467             self._last_node = None
    468 
--> 469             node, newpos, newstate = self._invoke_rule(rule, name, params, kwparams)
    470 
    471             self._goto(newpos)

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _invoke_rule(self, rule, name, params, kwparams)
    507         try:
    508             try:
--> 509                 rule(self)
    510 
    511                 node = self.ast

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/gnomic/grammar.py in _BINARY_VARIANT_(self)
    481             with self._option():
    482                 self._token('-')
--> 483             self._error('expecting one of: + -')
    484 
    485     @graken()

/Users/joao/.virtualenvs/marsi/lib/python3.4/site-packages/grako/contexts.py in _error(self, item, etype)
    442             self._buffer,
    443             list(reversed(self._rule_stack[:])),
--> 444             item
    445         )
    446 

FailedParse: (1:31) expecting one of: + - :
-geneC +Escherichia coli/geneA +promoter.promoterB:Escherichia coli/geneB -siteA
                              ^
BINARY_VARIANT
VARIANT
FEATURE
FUSION
REPLACEABLE
replacement
change
start

Cheers
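
A possible reading of the first traceback, offered as an assumption rather than a confirmed diagnosis: the string output expands Ec to "Escherichia coli", and because designations are separated by spaces, the space inside the organism name splits one designation in two. The sketch below uses a naive split on whitespace and commas (not the real grammar) to show this, and checks that "coli" starts at the column reported as (1:21).

```python
import re

formatted = ('-geneC +Escherichia coli/geneA '
             '+promoter.promoterB:Escherichia coli/geneB -siteA')

# Naive approximation of the separator rule (spaces and/or commas);
# the real grammar is more involved.
tokens = [t for t in re.split(r'[\s,]+', formatted) if t]
print(tokens)

# 1-based column of the first character after '+Escherichia ',
# for comparison with the position in "FailedParse: (1:21)".
column = formatted.index('coli') + 1
print(column)  # 21
```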

Implement HTML format output

Implement this in utils as genotype_to_html(), change_to_html(), feature_to_html() and so on.

The output should be identical to the text formatting with some additions:

All identifiers, names etc. must be escaped HTML strings.

All features must be wrapped in <span class="gnomic-{class}">...</span> where class is one of plasmid, fusion, feature-set, feature.

Plasmid names must be wrapped in <span class="gnomic-plasmid-name">...</span>.

Feature variants must be wrapped in a <sup>...</sup> and rendered without any parentheses around them.

As much as possible, code should be shared with the text and also the string formatting functions. This will require some refactoring.
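
A minimal sketch of the feature-level rules above, using only the standard library. The signature feature_to_html(name, variants) and the "; " separator between variants are assumptions for illustration. The real function would take a Feature object and should follow whatever the text formatter does.

```python
from html import escape


# Hypothetical signature for illustration; the real feature_to_html() may differ.
def feature_to_html(name, variants=()):
    """Render one feature as escaped HTML with variants in a <sup> element."""
    html = escape(name)
    if variants:
        # Variants are rendered without surrounding parentheses.
        html += '<sup>{}</sup>'.format('; '.join(escape(v) for v in variants))
    return '<span class="gnomic-feature">{}</span>'.format(html)


print(feature_to_html('geneY', ('c.123G>T',)))
# <span class="gnomic-feature">geneY<sup>c.123G&gt;T</sup></span>
```

Escaping each name and variant before wrapping it keeps the markup well-formed even for HGVS variants that contain ">".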
