Comments (11)
Do we expect people to alter the DOM tree a lot (i.e. are there risks of clashes if we use a simpler scheme)? Otherwise maybe tag name + a counter is more concise?
from rnexml.
@rvosa Yeah, I'm not sure -- still trying to wrap my head around this one. Most users will probably only use the top-level API for writing an ape::phylo tree or list of trees (ape::multiPhylo) to NeXML, in which case we can number them as we go. But the S4-class-based interface we have so far also allows users to just coerce ape::phylo trees into the S4 RNeXML::tree class, which can then be inserted into NeXML later. Perhaps we don't want users doing that, but this means they could modularly build up the DOM and we then have to watch out for collisions.
Or in more concrete terms, I have this setAs("phylo", "tree" ...) subroutine for mapping phylo objects to the S4 object that mimics the schema. Since the phylo object doesn't have an ID, I either have to generate one at this time, or otherwise add the id when adding the tree to an existing or new nexml/trees object. Does that make sense?
In other news, the validator complains that UUIDs aren't valid id attributes:
... is not a valid value of the atomic type 'xs:ID'
On Aug 14, 2013, at 10:59 PM, Carl Boettiger wrote:
In other news, the validator complains that UUIDs aren't valid id attributes:
That sounds like a validator bug.
What do the UUIDs look like? The schema specifies that the type of @id is
xs:ID, which is a non-colonized name (NCName), so instance documents must
conform to the production rules of NCNames (probably most importantly:
start with a letter or an underscore). If they don't, I don't see how the
validator is at fault here.
Dr. Rutger A. Vos
Bioinformaticist
Naturalis Biodiversity Center
Visiting address: Office A109, Einsteinweg 2, 2333 CC, Leiden, the
Netherlands
Mailing address: Postbus 9517, 2300 RA, Leiden, the Netherlands
http://rutgervos.blogspot.com
On Aug 15, 2013, at 5:44 AM, Rutger Vos wrote:
What do the UUIDs look like? [...] instance documents must conform to the production rules of NCNames (probably most importantly: start with a letter or an underscore). If they don't, I don't see how the
validator is at fault here.
UUIDs can start with a digit.
@cboettig: I suggest that if you choose UUIDs, you put them in the form of a urn:uuid: scheme. See http://www.ietf.org/rfc/rfc4122.txt
IDs need to be non-colonized names, i.e. strings without colons. If I
understand your suggestion correctly, the UUIDs would contain colons, which
would be a no-no.
On Aug 16, 2013, at 6:27 AM, Rutger Vos wrote:
IDs need to be non-colonized names, i.e. strings without colons. If I
understand your suggestion correctly, the UUIDs would contain colons, which
would be a no-no.
So HTTP URIs can't be IDs?
Not normally. However, IDs can become part of HTTP URIs when transforming documents to RDF as they are then made globally unique by prefixing them with either the location of the document or the value of xml:base of the nearest ancestor node that contains this attribute. (Note that I didn't just make this up or anything.)
I see where you're going with this line of questioning. If we want HTTP URIs as IDs (good id(ea)), use xml:base.
I was just using the uuid package, which generates uuids that look like:
> UUIDgenerate()
[1] "f7af80aa-dfb2-4134-aa82-db1c0e9e7980"
No colons, so I'm not sure why the validator (accessed with the R wrapper to libxml2) is unhappy.
Regardless, not sure uuids were a good idea for this purpose anyhow. The current workflow doesn't give the user the same flexibility over the DOM directly, so we probably don't have to worry about a user creating two S4 "tree" objects and then sticking them in the same nexml with duplicated IDs.
Instead, there is a method for phylo -> nexml that creates the ids for otus as t1, t2..., nodes as n1, n2..., edges as e1, e2... etc (done). A separate method for multiPhylo -> nexml will allow the user to add multiple trees while avoiding id conflicts (not written yet). With a sensible top-level API I think we should be fine using these simple ids(?)
Okay, I think we're happy with our only locally unique ids for the moment. (Though still unsure what was wrong with the uuid above according to the validator...). Anyway, closing this issue.
It appears that strings starting with a number were not valid ids (and uuids often start with numbers).
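For reference, the NCName rule quoted earlier in the thread (start with a letter or underscore, no colons) can be approximated in base R. This is only a rough ASCII-only sketch, not the full XML production and not what the validator runs; the function name is_ncname_ascii and the digit-leading UUID are made up for illustration.

```r
# Simplified, ASCII-only approximation of the XML NCName production:
# the first character is a letter or underscore; the remaining characters
# are letters, digits, '.', '-' or '_'. Colons are never allowed.
# (The real production also admits many non-ASCII characters.)
is_ncname_ascii <- function(x) {
  grepl("^[A-Za-z_][A-Za-z0-9._-]*$", x)
}

# The UUID printed earlier in this thread happens to start with a letter.
is_ncname_ascii("f7af80aa-dfb2-4134-aa82-db1c0e9e7980")           # TRUE

# Hypothetical UUID that starts with a digit.
is_ncname_ascii("7af80aaf-dfb2-4134-aa82-db1c0e9e7980")           # FALSE

# The urn:uuid: form contains colons, so it is rejected as well.
is_ncname_ascii("urn:uuid:7af80aaf-dfb2-4134-aa82-db1c0e9e7980")  # FALSE

# A fixed alphabetic prefix makes any UUID acceptable.
is_ncname_ascii("uuid-7af80aaf-dfb2-4134-aa82-db1c0e9e7980")      # TRUE
```

Under this approximation the sample UUID shown above would pass because it begins with "f", which is consistent with the failures coming only from UUIDs that begin with a digit.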
To address this, all functions that assign ids use the internal method nexml_id(), which creates local ids from a given character prefix; e.g. edges use nexml_id("e") to get ids like e1, e2, etc, using an internal counter. The counters start at 1 and increase each time the id of a given prefix is used in that R session, unless reset with reset_id_counter(). This local counter scheme is used by default.
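The per-prefix counter behaviour described above can be sketched roughly as follows. This is a hypothetical stand-in written for illustration, not RNeXML's actual nexml_id() or reset_id_counter(); the names make_id_generator, next_id and reset are assumptions.

```r
# Hypothetical stand-in for a prefix-based id counter; not RNeXML's own code.
make_id_generator <- function() {
  counters <- new.env()  # one integer counter per prefix, keyed by prefix
  list(
    next_id = function(prefix) {
      # Start at 1 for a new prefix, otherwise increment the stored value.
      n <- if (exists(prefix, envir = counters, inherits = FALSE)) {
        get(prefix, envir = counters) + 1L
      } else {
        1L
      }
      assign(prefix, n, envir = counters)
      paste0(prefix, n)
    },
    reset = function() {
      # Forget all counters so every prefix starts again at 1.
      rm(list = ls(counters, all.names = TRUE), envir = counters)
      invisible(NULL)
    }
  )
}

ids <- make_id_generator()
first_edge  <- ids$next_id("e")  # "e1"
second_edge <- ids$next_id("e")  # "e2"
first_node  <- ids$next_id("n")  # "n1"
ids$reset()
after_reset <- ids$next_id("e")  # "e1"
```

Because every id begins with its letter prefix, ids produced this way never start with a digit.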
The command options(uuid=TRUE) will make RNeXML use uuids for all id attributes instead. To avoid the validation error, these are prepended with uuid-. This option can be issued per session or put in the user's .Rprofile as persistent configuration. options(uuid=FALSE) sets the behavior back to the local identifiers.
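A minimal sketch of how such an option switch might look, assuming the uuid package is installed. The helper pick_id is hypothetical and only illustrates the idea of reading the uuid option and adding the uuid- prefix; it is not RNeXML's internal code.

```r
library(uuid)  # provides UUIDgenerate(), as used earlier in this thread

# Hypothetical helper: choose the id scheme from the `uuid` option.
pick_id <- function(prefix, n) {
  if (isTRUE(getOption("uuid", FALSE))) {
    # Global scheme: the fixed prefix guarantees the id starts with a letter.
    paste0("uuid-", UUIDgenerate())
  } else {
    # Local scheme: prefix plus a counter value supplied by the caller.
    paste0(prefix, n)
  }
}

old <- options(uuid = TRUE)    # remember the previous setting
global_id <- pick_id("e", 1L)  # something like "uuid-xxxxxxxx-xxxx-..."
options(uuid = FALSE)
local_id <- pick_id("e", 1L)   # "e1"
options(old)                   # restore the previous setting
```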
test_global_ids.R provides a unit test checking that we generate valid NeXML when using the global (uuid) id scheme.