
Comments (8)

rvosa commented on September 26, 2024

@hlapp, isn't this a job for the MIAPA ontology?

from rnexml.

hlapp commented on September 26, 2024

Metadata are generally about positively stating or asserting facts, not the absence of them. We developed a provenance documentation recommendation at the 2nd Phylotastic Hackathon, using W3C's PROV and MIAPA. You could of course use OWL to assert that some instance that is not of type cdao:Tree prov:wasDerivedFrom the trait matrix. But it'd probably be more powerful to assert instead what exactly was derived from it. See issue #26.


bomeara commented on September 26, 2024

I think it's ok to just have the comparative data with the tree with no special need to note that the tree came from a different dataset. For example, one popular sample dataset in R is the geospiza one from Geiger: it has a tree, and various bird measurements, but I don't think anyone expects that the bird tree came from the included data.


cboettig commented on September 26, 2024

@bomeara Thanks, that's good to hear! Of course, in the geiger case the data isn't being read in as a nexus file, so there isn't the same assumption. Have you seen anyone read in or write out comparative trait data in nexus format? Does your group tend to store character data in xlsx/csv formats, or nexus, or something else?

I'd like to make the case that in comparative phylogenetics we should start publishing the relevant trait data along with the phylogenies in a single nexml file, as it would facilitate reproducibility, metadata annotation, and data exchange across different platforms and software. I don't think many comparative methods people are using nexus files for their trait data at the moment though (perhaps/hopefully I'm wrong), so I'm wondering whether this will seem confusing to people.

@hlapp excellent point about documenting where the tree did come from. Perhaps I can parse that down into some simple user commands for common cases, even if it captures only the general notion (e.g. used "MrBayes" vs "simulated bd tree in R") and not the whole provenance.
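A minimal sketch of what such a simple user command might look like, in base R only. The function name `tree_source()` and its field names are hypothetical; they are not part of RNeXML, MIAPA, or PROV, and they capture only the general notion of how the tree was produced:

```r
# Hypothetical helper: records only the general notion of how a tree was made.
# The property names are illustrative, not MIAPA or PROV terms.
tree_source <- function(method = c("inferred", "simulated"),
                        software,
                        note = NA_character_) {
  method <- match.arg(method)
  data.frame(
    property = c("tree_method", "tree_software", "tree_note"),
    content  = c(method, software, note),
    stringsAsFactors = FALSE
  )
}

# The two common cases mentioned above
inferred  <- tree_source("inferred", software = "MrBayes")
simulated <- tree_source("simulated", software = "R", note = "birth-death tree")

print(inferred)
print(simulated)
```

Property/content pairs like these could later be mapped onto proper ontology terms once those are settled.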


hlapp commented on September 26, 2024

@cboettig Could you perhaps also file an issue on the MIAPA ontology tracker (referring back to this issue) about needing a term indicating that a matrix is trait data for comparative analysis?


bomeara commented on September 26, 2024

Note that DNA data could be comparative trait data. For example, I could make a tree from the usual phylogenetic markers and then use it to reconstruct a venom gene sequence down the tree. I'd try to deal with this as simply as possible: metadata that a tree is made from COI, 28S, and ef1a and that the comparative traits are venom genes.
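As a rough illustration of that idea, the role of each data block can be recorded separately from its data type. This is a base R sketch; the column names and role labels are assumptions made for the example, not ontology terms:

```r
# Illustrative only: column names and role labels are assumptions.
annotations <- data.frame(
  block = c("COI", "28S", "ef1a", "venom genes"),
  type  = "DNA",  # every block here is sequence data
  role  = c("tree inference", "tree inference", "tree inference",
            "comparative trait"),
  stringsAsFactors = FALSE
)

# Select by role, not by data type: DNA can play either part.
markers <- annotations$block[annotations$role == "tree inference"]
traits  <- annotations$block[annotations$role == "comparative trait"]

print(markers)
print(traits)
```

Filtering on `type` alone could not separate the venom genes from the markers, which is why the role needs its own piece of metadata.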

As for comparative data formats, I think xls may be most common (sigh), followed by csv and nexus (perhaps Mesquite-flavored nexus: title, multiple taxa blocks, etc).


cboettig commented on September 26, 2024

@bomeara Thanks! Perhaps that low adoption is in part due to problems with nexus parsers for character data in R? (E.g. the problems I've run into parsing morphobank nexus files, as mentioned in #42.)

At least that is something we could overcome by having both the tree and the character data in NeXML. For instance, a user with comparative trait data could serialize that data for easy exchange and archiving with this RNeXML package as it stands:

library(RNeXML)
library(geiger)
data(geospiza)

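nexml <- add_trees(geospiza$phy)
# Pass the existing object along so the tree is kept rather than overwritten
# (assumes add_character_data accepts it as a second argument, as add_trees does)
nexml <- add_character_data(geospiza$dat, nexml)
write.nexml(nexml, "geospiza.xml")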

This generates the NeXML file geospiza.xml, keeping the traits and tree together in a single file (to which more annotations/metadata could easily be added) that one could deposit on Dryad etc.


cboettig commented on September 26, 2024

I think we're good here. We still need to implement the rest of the MIAPA ontology; see #46.
