Comments (15)
I was not aware of Yet Another Newick Hack, but there you have it. I guess
you could either do this as multiple annotations on the "edge" element, or
by introducing an unbranched internal node for each stretch of mapped
character state.
from rnexml.
Note: this discussion continues on the nexml-discuss mailing list, where it might reach a broader audience.
Sounds like there's a good case for not modifying the topology. Meanwhile, here are notes to myself on how this might be done, with some outstanding questions to resolve, based on the discussion on the mailing list.
Perhaps an edge that changed from state 1 to state 2 might be annotated as:
<states>
  <state id="s1" label="description of state"/>
  <state id="s2" label="description of alternative state"/>
</states>
...
<edge id="e1" source="n1" target="n2" about="#e1" length="6.4">
<meta property="x:order" content="1" xsi:type="nex:LiteralMeta" id = "m1" about="#m1">
<meta property="x:length" content="3.4" xsi:type="nex:LiteralMeta" id = "m2" />
<meta property="x:hasState" content="s1" xsi:type="nex:LiteralMeta" id = "m3" />
</meta>
<meta property="x:order" content="2" xsi:type="nex:LiteralMeta" id = "m4" about="#m4">
<meta property="x:length" content="3.0" xsi:type="nex:LiteralMeta" id = "m5"/>
<meta property="x:hasState" content="s2" xsi:type="nex:LiteralMeta" id = "m6"/>
</meta>
</edge>
- Obviously I have just made up the properties in the x: namespace. How would I go about establishing a formal namespace for such properties?
- Clearly several alternative annotations could be proposed here. For instance, could I instead declare start, stop, and state times for each section?
- Also, I'm not sure if my nesting of meta elements is appropriate. Perhaps they should all be wrapped in a meta declaring something like hasStochasticCharacterMapping.
Cool, I really like @mtholder's suggestion on the mailing list; it more clearly reflects the logic of NeXML elements and helps me think about how (a typical user) would (most sensibly) extend nexml (as compared to hacking the newick format yet again).
Perhaps this use-case might be a nice one to illustrate in the manuscript as an example of how an R user might go about defining a meaningful extension to nexml?
<characters id="m1">
  <format>
    <states id="ss1">
      <state id="s1"/>
      <state id="s2"/>
    </states>
    <char id="cr1" states="ss1" label="reef-dwelling"/>
  </format>
</characters>
...
<tree>
  ...
  <edge id="e1" source="n1" target="n2">
    <meta>
      <simmap:reconstructions>
        <simmap:reconstruction character="cr1">
          <simmap:stateChange id="sc1" length="0.4" state="s2"/>
          <simmap:stateChange id="sc2" length="0.5" state="s1"/>
        </simmap:reconstruction>
      </simmap:reconstructions>
    </meta>
  </edge>
  ...
</tree>
A few minor changes from Mark's (@mtholder) suggestion. Mark annotates the node state, but it seems strange to do this in a meta element that is a child of an edge. I've also dropped the attribute edge = e1 from the simmap:stateChange node, since this annotation is already a child of the edge element e1; perhaps I should keep it anyway? (When phenoscape annotates a state element with a meta, it doesn't appear to explicitly reference the state id.)
One limitation is that this format doesn't explicitly state the order in which the changes occur. The order is of course implicit in the ordering of the stateChange elements, but I believe that's not quite consistent with NeXML design principles (e.g. that data should be explicit, not encoded in structure)? Happy for more feedback on this.
Given @hlapp and @rvosa's comments in issue #23, we probably want to consider using a meta-based format to define the simmap representation.
I think this is the straightforward translation into RDFa meta, based on the XML-based description I have above:
<edge id="e1" source="n1" target="n2" length="0.9">
  <meta property="simmap:reconstructions" id="m1">
    <meta property="simmap:reconstruction" id="m2">
      <meta property="nex:char" content="cr1" id="m3">
        <meta property="simmap:stateChange" id="m4">
          <meta property="nex:length" content="0.4"/>
          <meta property="nex:state" content="s2"/>
        </meta>
        <meta property="simmap:stateChange">
          <meta property="nex:length" content="0.5"/>
          <meta property="nex:state" content="s1"/>
        </meta>
      </meta>
    </meta>
  </meta>
</edge>
(would have ids and xsi-types on all meta elements)
Note that I've claimed that the state, length and char properties are defined in the nexml namespace, but probably that's not kosher? How should that be done properly?
Naturally this extension would have to come with a definition of new terms. If I understand correctly, while ideally that would be an OWL ontology, it would be permissible just to have a plain text definition like:
simmap definitions
- simmap:reconstruction: A mapping of a character state onto an edge of the phylogeny. The state may change along the length of the edge, as indicated by the stateChange child element.
- simmap:stateChange: An element indicating the character state given by the reconstruction and the duration (length) the edge was in this state. stateChange elements are given sequentially in the order of the state changes from root to tip. (Note that stochastic character mapping is not well-defined for an unrooted tree.) The sum of all lengths in a reconstruction of an edge should equal the length of the edge itself.
- ....
(add more text for additional attributes, explanatory diagram)
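The constraint in the stateChange definition (segment lengths within a reconstruction sum to the edge length) is easy to check mechanically. The following is only a sketch in Python, not part of RNeXML: the function names are made up for illustration, the sample mirrors the nested meta layout proposed in this comment, and namespace declarations are omitted as in the thread, so a real NeXML file would need namespace-qualified tag names.

```python
import math
import xml.etree.ElementTree as ET

# Sample edge following the nested meta layout from this comment
# (namespace declarations omitted, as in the thread).
EDGE_XML = """
<edge id="e1" source="n1" target="n2" length="0.9">
  <meta property="simmap:reconstructions" id="m1">
    <meta property="simmap:reconstruction" id="m2">
      <meta property="nex:char" content="cr1" id="m3">
        <meta property="simmap:stateChange" id="m4">
          <meta property="nex:length" content="0.4"/>
          <meta property="nex:state" content="s2"/>
        </meta>
        <meta property="simmap:stateChange" id="m5">
          <meta property="nex:length" content="0.5"/>
          <meta property="nex:state" content="s1"/>
        </meta>
      </meta>
    </meta>
  </meta>
</edge>
"""


def metas(parent, prop):
    """All descendant meta elements of `parent` with the given property."""
    return [m for m in parent.iter("meta") if m.get("property") == prop]


def reconstruction_lengths_ok(edge, length_prop="nex:length", tol=1e-9):
    """True if every reconstruction on the edge sums to the edge length."""
    edge_length = float(edge.get("length"))
    for rec in metas(edge, "simmap:reconstruction"):
        total = sum(
            float(seg.get("content"))
            for change in metas(rec, "simmap:stateChange")
            for seg in metas(change, length_prop)
        )
        if not math.isclose(total, edge_length, abs_tol=tol):
            return False
    return True


edge = ET.fromstring(EDGE_XML)
print(reconstruction_lengths_ok(edge))
```

The property holding the segment length is a parameter because the thread has not settled on whether it should be nex:length or a simmap-specific term.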
Is it poor form that I use the order of meta elements to indicate the order of the state changes?
@hlapp @rvosa Can a meta element have both child nodes and a content value (e.g. my meta property="nex:char" element in the RDFa version above)? If not, I'm not sure how to do this while avoiding re-listing the character id every time I list the state id.
Can a meta element have both child nodes and a content value? (e.g. my meta property="nex:char" element in the RDFa version above?
The schema doesn't seem to prohibit it, but the documentation says no:
Metadata annotations in which the object is a literal value. If the @content attribute is used, then the element should contain no children.
I'm not following yet why you have to have this. Can you perhaps give an example, such as what you think you'd be forced to do but don't want to?
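Since the schema itself does not enforce the documented rule, a quick check can flag meta elements that carry a content attribute and also have children. This is a minimal Python sketch under the same assumptions as the snippets in this thread (no namespace declarations; the function name is hypothetical), not a replacement for schema validation.

```python
import xml.etree.ElementTree as ET

# The nex:char element here has both @content and a child, as in the
# RDFa version being asked about.
SNIPPET = """
<edge id="e1" source="n1" target="n2" length="0.9">
  <meta property="simmap:reconstruction" id="m2">
    <meta property="nex:char" content="cr1" id="m3">
      <meta property="simmap:stateChange" id="m4">
        <meta property="nex:length" content="0.4"/>
        <meta property="nex:state" content="s2"/>
      </meta>
    </meta>
  </meta>
</edge>
"""


def literal_metas_with_children(root):
    """Meta elements that have a @content value and at least one child."""
    return [
        m for m in root.iter("meta")
        if m.get("content") is not None and len(m) > 0
    ]


offenders = literal_metas_with_children(ET.fromstring(SNIPPET))
print([m.get("property") for m in offenders])  # → ['nex:char']
```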
Is it poor form that I use the order of meta elements to indicate the order of the state changes?
Yes. Wouldn't it be possible to add a seq or ordering or other property to indicate order? Or use the same mechanism that NeXML uses for ordering characters, but now that I think about it I'm not sure how it does that.
Characters are sort of ordered. They have id attributes, which datum cells
then reference - so in that context they are actually unordered in the
sense that the location of the char element among its siblings is
meaningless. But, if there are no datum cells (i.e. with compact seq
elements) the convention is that the order in which tokens appear in the
seq corresponds with the order in which char elements are defined. Note
that char elements can also, optionally, have an integer attribute to
specify codon position. Maybe you can take this as precedent for an integer
property to store order?
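To make the compact seq convention concrete: with no datum cells, the i-th token of a row's seq is read as the state of the i-th char element in document order. The Python sketch below illustrates that pairing on a made-up, namespace-free fragment; it assumes whitespace-separated tokens, and the ids and function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: three chars and one row in compact seq form.
MATRIX_XML = """
<characters id="c1">
  <format>
    <char id="cr1"/>
    <char id="cr2"/>
    <char id="cr3"/>
  </format>
  <matrix>
    <row id="r1" otu="t1">
      <seq>0 1 1</seq>
    </row>
  </matrix>
</characters>
"""


def tokens_by_char(characters):
    """Pair each seq token with a char id by position, per row."""
    char_ids = [c.get("id") for c in characters.find("format").findall("char")]
    result = {}
    for row in characters.find("matrix").findall("row"):
        tokens = row.find("seq").text.split()
        if len(tokens) != len(char_ids):
            raise ValueError("row %s: token count does not match char count"
                             % row.get("id"))
        # Position in the seq is the only link to the char element.
        result[row.get("id")] = dict(zip(char_ids, tokens))
    return result


print(tokens_by_char(ET.fromstring(MATRIX_XML)))
```

Reordering the char elements in this fragment would silently change the meaning of every row, which is the kind of structural dependence an explicit integer order property avoids.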
@hlapp @rvosa Thanks for the feedback. Yeah, I would like this example to be as solid as possible if we're to use it as an exemplar how-to. Here's an example of the current version:
<edge id="e1" source="n1" target="n2" length="0.9">
  <meta property="simmap:reconstructions" id="m1">
    <meta property="simmap:reconstruction" id="m2">
      <meta property="nex:char" content="cr1"/>
      <meta property="simmap:stateChange" id="m4">
        <meta property="simmap:order" content="1"/>
        <meta property="nex:length" content="0.4"/>
        <meta property="nex:state" content="s2"/>
      </meta>
      <meta property="simmap:stateChange">
        <meta property="simmap:order" content="2"/>
        <meta property="nex:length" content="0.5"/>
        <meta property="nex:state" content="s1"/>
      </meta>
    </meta>
  </meta>
</edge>
(namespace definitions, id and about tags would be added automatically too, just omitted above).
I've added the property simmap:order to indicate the ordering of the state changes explicitly (rather than relying on the ordering of the elements). I've also moved the nex:char property to be sister rather than parent to simmap:stateChange. My thinking is that nex:char is annotating simmap:reconstruction, stating that this particular reconstruction is a reconstruction of the given character. I think this fixes most of my concerns. I have also added code that converts this to/from the simmap format used by the phytools R package.
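To show what reading this layout back might involve, here is a small Python sketch (not the R conversion code mentioned above; the helper names are hypothetical) that collects, for each reconstruction on an edge, the (state, length) segments sorted by simmap:order. The sample deliberately lists the second state change first, so the result depends on the explicit order property rather than on element position.

```python
import xml.etree.ElementTree as ET

# Same layout as the example above, with the stateChange elements
# written out of order and namespace declarations omitted.
EDGE_XML = """
<edge id="e1" source="n1" target="n2" length="0.9">
  <meta property="simmap:reconstructions" id="m1">
    <meta property="simmap:reconstruction" id="m2">
      <meta property="nex:char" content="cr1"/>
      <meta property="simmap:stateChange">
        <meta property="simmap:order" content="2"/>
        <meta property="nex:length" content="0.5"/>
        <meta property="nex:state" content="s1"/>
      </meta>
      <meta property="simmap:stateChange">
        <meta property="simmap:order" content="1"/>
        <meta property="nex:length" content="0.4"/>
        <meta property="nex:state" content="s2"/>
      </meta>
    </meta>
  </meta>
</edge>
"""


def meta_value(parent, prop):
    """Content of the first direct child meta with the given property."""
    for m in parent.findall("meta"):
        if m.get("property") == prop:
            return m.get("content")
    raise KeyError(prop)


def edge_maps(edge):
    """Return {char_id: [(state, length), ...]} ordered by simmap:order."""
    maps = {}
    for rec in edge.iter("meta"):
        if rec.get("property") != "simmap:reconstruction":
            continue
        changes = [c for c in rec.findall("meta")
                   if c.get("property") == "simmap:stateChange"]
        changes.sort(key=lambda c: int(meta_value(c, "simmap:order")))
        maps[meta_value(rec, "nex:char")] = [
            (meta_value(c, "nex:state"), float(meta_value(c, "nex:length")))
            for c in changes
        ]
    return maps


print(edge_maps(ET.fromstring(EDGE_XML)))
# → {'cr1': [('s2', 0.4), ('s1', 0.5)]}
```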
@hlapp @rvosa one outstanding concern I have is whether I'm okay using nex:length, nex:char, and nex:state as I do above, rather than defining new terms for these explicitly in the simmap context. I'm not sure if they are semantically identical concepts or not, e.g. simmap:length is the length of time an edge spends in a particular state, while nex:length is the length of an <edge>.
Would it be an idea to try to generate the output using the API for nested
semantic annotation and just look at what that looks like? I am also (as
you are) doubtful whether it is a good idea to re-use the nex:* names for
ever-so-slightly different concepts.
(In reply to Carl Boettiger's comment of Thu, Jan 16, 2014, 11:59 PM.)
one outstanding concern I have is if I'm okay using nex:length, nex:char, and nex:state as I do above, rather than defining new terms for these explicitly in the simmap context.
I think that's a bad idea. Not only is the semantic match unclear, as you see, but there is also no nex vocabulary. It's a schema, and XML Schema per se doesn't have semantics.
Okay, I've implemented my go at writing a simmap extension to NeXML along the lines we describe in this thread as an illustration of how RNeXML users can use the package to construct such extensions (rather than continuing to hack Newick formats as illustrated at the top of this thread). Could really use some critique from @rvosa and @hlapp on my stab at this, particularly with regards to defining a simmap namespace. I'm hoping to create an example that other users could reasonably follow themselves without expertise in RDFa or XML, but that is also a good model case that doesn't cut corners.
You can see my attempt at explaining this implementation in this section of the manuscript: https://github.com/ropensci/RNeXML/blob/devel/inst/doc/pubs/manuscript.md#extending-the-nexml-standard-through-metadata-annotation
Obviously in addition to refining the implementation, it would be good to improve the explanation as well. (Overall not sure how much of that I will have space for in the manuscript body and what will be left to a supplement, vignette, and/or blog post, but for now not worrying about space.) @sckott would be great to get your feedback on this as well from the practical R perspective more than the valid nexml perspective.
I had a look at it and I think it's pretty good. I find the syntax (line
730...) palatable enough, in any case. As regards telling people to create
something at a URL that the namespace points to and defining their
predicates there: that's nice advice (though technically nothing will break
if they don't do that). Doing it in plain text is probably the best we can
expect, people certainly aren't going to fire up protege to define a couple
of predicates in their own research.
(In reply to Carl Boettiger's comment of Mon, Mar 24, 2014, 11:32 PM.)
@rvosa Cool, thanks for the feedback. We'll probably need to keep working on the manuscript discussion of this as we get down the road. As it sounds like we have at least some acceptable basics for a simmap extension, I think I'll close this issue for now, but feel free to re-open.