belbio / bep
BEL Enhancement Proposals
Home Page: http://bep.bel.bio
License: Apache License 2.0
How does the second week of November sound? That should give us time to scramble to write and pre-review any remaining BEPs.
Definitely on the docket:
Still needs BEP:
The BEL v2.1 release corresponds to this GitHub project: https://github.com/belbio/bep/projects/1. It includes the following BEPs:
I'm not absolutely sure if this is a good idea, but defining namespaces by the regular expressions defined by Identifiers.org would be a good way to defer semantic validation for the trickiest namespaces (such as dbSNP).
While BEL.bio may become the reference implementation for the BEL language, it would be prudent to encourage the language and community towards agnosticism with regard to these sorts of technical details.
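As a rough sketch of what pattern-based namespace validation could look like: the table and function names below are hypothetical, and the dbSNP pattern is assumed to mirror the one published in the Identifiers.org registry, so it should be checked against the registry before use.

```python
import re

# Hypothetical pattern table. The dbSNP entry is an assumption modelled on
# the Identifiers.org registry entry and should be verified there.
NAMESPACE_PATTERNS = {
    "DBSNP": re.compile(r"^rs\d+$"),
}


def is_valid_identifier(namespace: str, identifier: str) -> bool:
    """Return True if the identifier matches the pattern registered for the namespace."""
    pattern = NAMESPACE_PATTERNS.get(namespace.upper())
    if pattern is None:
        raise KeyError(f"no pattern registered for namespace {namespace!r}")
    return pattern.fullmatch(identifier) is not None
```

This only checks the shape of an identifier, not whether the entry actually exists, which is the point of deferring semantic validation.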
I've already started writing an opinionated style guide here: https://github.com/pharmacome/curation/blob/master/style-guide.rst
While some of it may be a bit picky, we should have some minimum community standards for BEL scripts. Example: is there a preferred maximum line length? Is it written anywhere that spacing is important?
@ncatlett I think we would best write this together
I'd like to see numeric/quantitative BEL annotations specifically supported as numbers. Use cases would include filtering knowledge (e.g., edges with 'cohort size' > 20, 'p-value' < 0.01) and potentially weighting edges based on significance or effect size.
In BEL Script 1.0, namespaces and annotations could be defined as a url (pointing to a belns or belanno file), a list of values, or a pattern/regular expression (see https://wiki.openbel.org/display/BLD/Control+Records). To some extent, this representation need is covered by patterns, but the original specification does not specifically cover handling of numeric annotations as numbers.
I am also interested in specifying time annotations, but am not sure if they belong in a separate BEP since units (e.g., minutes, hours, days, years) also need to be considered.
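To make the filtering use case concrete, here is a minimal sketch. The `Edge` class, the helper names, and the placeholder statements are all hypothetical; it assumes annotation values arrive as strings, as they would from a BEL script, and simply skips values that do not parse as numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """Hypothetical container for one BEL statement and its annotations."""
    statement: str
    annotations: dict = field(default_factory=dict)


def as_number(value):
    """Convert an annotation value to float, or return None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def filter_edges(edges, key, predicate):
    """Keep edges whose annotation `key` parses as a number and satisfies `predicate`."""
    kept = []
    for edge in edges:
        number = as_number(edge.annotations.get(key))
        if number is not None and predicate(number):
            kept.append(edge)
    return kept


# Placeholder statements and values, for illustration only.
edges = [
    Edge("p(HGNC:A) increases p(HGNC:B)", {"cohort size": "35", "p-value": "0.004"}),
    Edge("p(HGNC:A) decreases p(HGNC:C)", {"cohort size": "12", "p-value": "0.03"}),
    Edge("p(HGNC:B) increases p(HGNC:C)", {"cohort size": "n/a"}),
]

large = filter_edges(edges, "cohort size", lambda n: n > 20)
significant = filter_edges(edges, "p-value", lambda p: p < 0.01)
```

If annotations were declared as numeric in the language itself, the string-to-float step would move into the parser and bad values could be rejected at compile time instead of being silently skipped.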
Earlier this week, @johnbachman mentioned he wanted to do this. Now, we're working on curating how Tau antibodies are interacting with Tau, and we have the same need. Let's use this issue thread to start collating examples and discussion. CC: @AldisiRana
One idea:
p(HGNC:YFG, domain(start_end))
It would also be good to show some examples with their equivalences to InterPro terms.
Much like secretion() and cellSurfaceExpression(), it would be nice to have an intake() function that's syntactic sugar around the translocation() function that represents the opposite direction of movement from secretion().
This would have some nice examples for viruses pulling themselves into cells, for endocytosis of chemicals like neurotransmitters, and for chemicals activating receptors which then get endocytosed as well.
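A sketch of how the sugar might expand, assuming secretion() is shorthand for a translocation from an intracellular location to the extracellular space. The location terms and function names here are assumptions for illustration; the actual terms would be fixed by the BEP.

```python
# Location terms are assumptions modelled on how secretion() is commonly expanded.
INTRACELLULAR = "GO:intracellular"
EXTRACELLULAR = 'GO:"extracellular space"'


def expand_secretion(abundance: str) -> str:
    """Expand secretion(x) into the equivalent translocation() term."""
    return f"translocation({abundance}, fromLoc({INTRACELLULAR}), toLoc({EXTRACELLULAR}))"


def expand_intake(abundance: str) -> str:
    """Expand the proposed intake(x): the same translocation with the locations swapped."""
    return f"translocation({abundance}, fromLoc({EXTRACELLULAR}), toLoc({INTRACELLULAR}))"
```

For example, `expand_intake("a(CHEBI:dopamine)")` produces a translocation from the extracellular space to the intracellular location.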
BEL documents have either been written in 1.0, 2.0, a mixture of 1.0/2.0, or 2.0.0+.
Should BEL documents specify what version they're using? This wouldn't affect 1.0 or 2.0, but would make differentiating future versions with minor/major updates much easier.
Perhaps with something in the document section of the BEL script like SET BEL = "1.0.0".
Another question: how would we deal with documents that are a mixture of BEL 1.0/BEL 2.0?
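If a version line along the lines of the proposed SET BEL = "1.0.0" were adopted, a parser could pick it up with something like the sketch below. The exact syntax is only the proposal from this thread, and the function name is hypothetical.

```python
import re

# Matches the proposed (not yet standardized) version line, e.g. SET BEL = "2.1.0".
VERSION_LINE = re.compile(r'^\s*SET\s+BEL\s*=\s*"(\d+)\.(\d+)\.(\d+)"\s*$')


def detect_bel_version(script: str):
    """Return (major, minor, patch) from the first version line, or None if absent."""
    for line in script.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            return tuple(int(part) for part in match.groups())
    return None
```

Returning None for documents without the line leaves existing 1.0 and 2.0 scripts unaffected; how to treat mixed 1.0/2.0 documents is still an open question.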
We have a specific need to add the modification type Acylation/Acyl to the default namespace. This general term seems like a good fit to describe acylated Ghrelin, as this is how it is frequently referred to in the literature. (The specific modification appears to be o-octanoylation (MOD_00295) or o-decanoylation (MOD_00390), but many/most papers do not refer to the modification in this way.)
We need a mechanism to make additions and other changes to the BEL default namespace. This could be a BEP, but the level of discussion and review it needs, and the timing of the review and update cycle, may be different.
See also related issue belbio/bel_resources#98
Once this is done, we can accept BEP-0001.
It might be the case that text evidence is not available. Sometimes, it comes from a database or a table that has no text. This would need some thorough guidelines as well.
Receiving a web page that says:
404 Not Found
Code: NoSuchKey
Message: The specified key does not exist.
Key: bep
RequestId: 32A5B3D108FA8081
HostId: G9Po1vh9BUO0s4Tkw7k4Lb5WsgDmzI26jAm4dEP6AeTSzv/UbQxWzvCHI+nvYPxVPuYRct4RRBY=
Like the noCorrelation and equivalentTo relationships, a partOf relationship would allow much better expression of the semantics common to other knowledge assemblies. See FamPlex as a good example. Sometimes, partOf has the inverse, but broader, semantics of the hasComponent relationship.
We should consider the semantics of partOf and hasPart as described in the Gene Ontology as well:
In BEL 1.0/2.0 this made sense:
complex(FPLX:9_1_1) hasComponent p(HGNC:HUS1)
complex(FPLX:9_1_1) hasComponent p(HGNC:RAD1)
complex(FPLX:9_1_1) hasComponent p(HGNC:RAD9A)
It might be better to write in the other direction:
p(HGNC:HUS1) partOf complex(FPLX:9_1_1)
p(HGNC:RAD1) partOf complex(FPLX:9_1_1)
p(HGNC:RAD9A) partOf complex(FPLX:9_1_1)
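The rewrite from one direction to the other is mechanical. A minimal sketch, with a hypothetical function name and plain string handling, assuming each statement has the form "subject hasComponent object":

```python
def invert_has_component(statement: str) -> str:
    """Rewrite 'X hasComponent Y' as 'Y partOf X'."""
    subject, relation, obj = statement.partition(" hasComponent ")
    if not relation:
        raise ValueError(f"not a hasComponent statement: {statement!r}")
    return f"{obj.strip()} partOf {subject.strip()}"
```

Since partOf is described here as broader than the inverse of hasComponent, this rewrite is only safe in this direction; going from partOf back to hasComponent may not be valid.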
Basically, any direct paths containing only partOf and isA imply partOf.
A partOf B and B isA C implies A partOf C
For proteins, this would describe a hierarchy of complexes.
p(X) partOf complex(Y)
complex(Y) isA complex(Z)
# Infer:
p(X) partOf complex(Z)
A partOf B and B partOf C implies A partOf C
For proteins, this would describe a complex formed of other complexes.
p(X) partOf complex(Y)
complex(Y) partOf complex(Z)
# Infer:
p(X) partOf complex(Z)
Does FamPlex have an example for either of these first two headers (@johnbachman)? I will write a script to check https://github.com/sorgerlab/famplex/blob/master/relations.csv to see.
A isA B and B partOf C implies A partOf C
p(HGNC:PRKAA1) isA p(FPLX:AMPK_alpha)
p(HGNC:PRKAA2) isA p(FPLX:AMPK_alpha)
p(FPLX:AMPK_alpha) partOf p(FPLX:AMPK)
# Infer:
p(HGNC:PRKAA1) partOf p(FPLX:AMPK)
p(HGNC:PRKAA2) partOf p(FPLX:AMPK)
Does this make sense? Should we infer all AMPK complexes have PRKAA1? I don't think that's what FamPlex is trying to say. If p(X1) isA p(X) and p(X2) isA p(X) and p(X) partOf complex(Y), then should we infer that either X1 or X2 is present in Y?
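To experiment with these rules, here is a small sketch that applies them to a set of (subject, relation, object) triples until nothing new is inferred. The function name and the flag for switching off the debated third rule are hypothetical.

```python
def infer_part_of(triples, include_isa_then_part_of=True):
    """Return the partOf pairs implied by, but not asserted in, the given triples."""
    is_a = {(s, o) for s, r, o in triples if r == "isA"}
    asserted = {(s, o) for s, r, o in triples if r == "partOf"}
    part_of = set(asserted)

    while True:
        new = set()
        for a, b in part_of:
            # Rule 1: A partOf B and B isA C implies A partOf C
            new.update((a, c) for b2, c in is_a if b2 == b)
            # Rule 2: A partOf B and B partOf C implies A partOf C
            new.update((a, c) for b2, c in part_of if b2 == b)
        if include_isa_then_part_of:
            for a, b in is_a:
                # Rule 3 (debated): A isA B and B partOf C implies A partOf C
                new.update((a, c) for b2, c in part_of if b2 == b)
        added = new - part_of
        if not added:
            return part_of - asserted
        part_of |= added


ampk = [
    ("p(HGNC:PRKAA1)", "isA", "p(FPLX:AMPK_alpha)"),
    ("p(HGNC:PRKAA2)", "isA", "p(FPLX:AMPK_alpha)"),
    ("p(FPLX:AMPK_alpha)", "partOf", "p(FPLX:AMPK)"),
]
inferred = infer_part_of(ampk)
```

With the third rule enabled, the AMPK triples yield partOf statements for both PRKAA1 and PRKAA2. With it disabled, nothing is inferred, which makes it easy to compare the two readings against FamPlex.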