ecologicaltraitdata / traitdataform
A package to manage and compile functional trait data into predefined templates
Home Page: https://ecologicaltraitdata.github.io/traitdataform/
License: Other
The package should provide more datasets from the living spreadsheet (fdschneider/bexis_traits#20).
A standardised version of each dataset should be provided as well (linking to trait Thesauri and taxon Ontologies).
If a synonym was mapped to an accepted name, or spelling was corrected, an entry should show up in column warnings or taxonRemarks.
I want to include a suite of example trait data. Criteria:
Possible data:
An FYI, for some reason the apostrophes (') at the first header of your README (Package 'traitdataform') are not translating correctly to your .io landing page.
The apostrophes are showing up for me (Chrome Browser, English language) as " �"
add traitmap
as universal source for
mutate()
(e.g. to add ratios or indices, or logical traits)
There is an error in the pantheria.R code that is causing pulldata("pantheria") not to work.
amniota needs to be changed to pantheria and read.csv needs to be changed to read.delim.
Current code:
amniota <- utils::read.csv("PanTHERIA_1-0_WR05_Aug2008.txt",
fileEncoding = "UTF-8",
stringsAsFactors = FALSE)
Corrected code:
pantheria <- utils::read.delim("PanTHERIA_1-0_WR05_Aug2008.txt",
fileEncoding = "UTF-8",
stringsAsFactors = FALSE)
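The difference between the two readers can be reproduced on a small tab-delimited file. This is only a minimal sketch: the file and its column names are made up for illustration and are not the PanTHERIA data.

```r
# Write a tiny tab-delimited file (made-up columns, not the real dataset)
tmp <- tempfile(fileext = ".txt")
writeLines(c("species\tbody_mass_g",
             "Species a\t10.5",
             "Species b\t20"), tmp)

# read.csv() splits on commas, so every line ends up in a single column
wrong <- utils::read.csv(tmp, stringsAsFactors = FALSE)

# read.delim() splits on tabs and recovers both columns
right <- utils::read.delim(tmp, stringsAsFactors = FALSE)

ncol(wrong)
ncol(right)
```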
I composed the function mutate.traitdata() to modify a traitdata object and add derived traits.
Still open is the derivation of units for these traits: this requires units being given in the first place.
Since units can be provided in the trait standardization procedure, this is of minor importance.
This replaces issue #11.
After running the standardize.taxonomy function, the user-provided unit might need to be changed. Re-running as.traitdata is not valid at that point, and the mapping function does not offer this either. A dedicated function may not be necessary; a one-liner that maps units according to column "traitName" could be provided instead.
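Such a one-liner could look roughly like the following. It is a sketch in base R on a hypothetical long-format table; the column name traitUnit and the unit values are assumptions for illustration, not the package's actual behaviour.

```r
# Hypothetical long-format trait table (column names are assumptions)
traits <- data.frame(
  traitName  = c("body_length", "body_mass", "body_length"),
  traitValue = c(12.1, 0.4, 9.8),
  stringsAsFactors = FALSE
)

# Named lookup vector: trait name -> unit
unit_map <- c(body_length = "mm", body_mass = "g")

# The one-liner: index the lookup by traitName
# (as.character() guards against traitName being a factor)
traits$traitUnit <- unname(unit_map[as.character(traits$traitName)])
```

Traits missing from the lookup vector end up with NA as unit, which makes gaps easy to spot.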
On 11.04.2019 at 08:00, Prof Brian Ripley wrote:
This concerns packages
[...] traitdataform [...]
which are failing their checks in a strict Latin-1 locale: see the debian-clang results. (Several of these seem to stem from vcr.)
On Linux, such a locale can be ensured via LC_CTYPE=en_US (which may need installing for distros that micro-package). AFAWK it cannot be done on Windows.
- The character in don't is an (ASCII) apostrophe, not a right quote:
don’t
(with a right quote) is used in packages [...] (and others not failing).
en and em dashes are not portable, found in packages
[...] traitdataform [...].
Using \uxxxx coding for non-ASCII chars in R character strings should help in some cases (see 'Writing R Extensions').
Please correct before May 10 to safely retain the package on CRAN.
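To illustrate the \uxxxx suggestion, here is a small base-R sketch that detects non-ASCII characters in a string and replaces a right quote and an en dash by ASCII equivalents. The helper names has_non_ascii and to_ascii are made up for this example.

```r
# A string with a right single quote (U+2019) and an en dash (U+2013),
# written with \u escapes so the source file itself stays ASCII
x <- "don\u2019t use 1\u20132 dashes"

# TRUE if any code point lies outside the ASCII range
has_non_ascii <- function(s) any(utf8ToInt(s) > 127L)

# Replace the two offending characters by ASCII equivalents
to_ascii <- function(s) {
  s <- gsub("\u2019", "'", s, fixed = TRUE)
  gsub("\u2013", "-", s, fixed = TRUE)
}

y <- to_ascii(x)
```

For whole files, tools::showNonASCIIfile() reports the lines that contain non-ASCII characters.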
TOP and T-Sita, as well as some other physiological Ontologies could be tapped as source for looking up and matching trait names to definitions and get URIs as traitID.
The package that seems to allow access to Ontologies from R is ontoCAT.
I'm not experienced enough with API usage. Maybe someone wants to invest time in this for a later version.
can't install package on mac:
ERROR: dependency ‘units’ is not available for package ‘traitdataform’
> install.packages('units')
package ‘units’ is available as a source package but not as a binary
Warning in install.packages :
package ‘units’ is not available (for R version 3.1.2)
if GBIF matching does not return a match, apply other nameservers.
Example:
get_gbif_taxonomy("Carabus arvensis")
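One way such a fallback could be structured is sketched below. The resolver functions are stand-ins that make the example self-contained; real resolvers would query GBIF or another name service, and nothing here reflects how get_gbif_taxonomy is actually implemented.

```r
# Try each resolver in order and return the first usable result.
# `resolvers` is a named list of functions that take a name and return
# a character string, or NA / NULL when nothing was found.
resolve_with_fallback <- function(name, resolvers) {
  for (src in names(resolvers)) {
    hit <- tryCatch(resolvers[[src]](name), error = function(e) NULL)
    if (!is.null(hit) && length(hit) > 0 && !is.na(hit[1])) {
      return(list(name = hit[1], source = src))
    }
  }
  list(name = NA_character_, source = NA_character_)
}

# Stand-in resolvers for illustration only
resolvers <- list(
  gbif  = function(x) NA_character_,  # pretend GBIF finds nothing
  other = function(x) if (x == "Carabus arvensis") x else NA_character_
)

res <- resolve_with_fallback("Carabus arvensis", resolvers)
```

Recording the source alongside the name keeps it traceable which service supplied each match.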
improve metadata handling by providing standard object class. A list of named objects, each a named list of metadata.
Method print.metadata() should produce the output we see at print.traitdata.
to produce
With recent commits the package now produces output according to ETS v0.10.
This may break code that relies on columns with a *Std ending. To fix this, you should redirect those calls to the plain terms. Calls to plain terms, e.g. scientificName, should now point to verbatimScientificName. See definitions in ETS.
For data publications, always refer to the version of ETS that has been applied to avoid misunderstandings.
I will provide a wrapper function to produce output according to v0.9. (please vote here if you require it urgently)
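Until then, the renaming rule described above can be applied to a vector of column names, as in this rough sketch. The function name rename_to_v010 and the default list of affected terms are assumptions; the authoritative list of terms is in the ETS definitions.

```r
# Translate v0.9-style column names into v0.10-style names.
# `terms` lists the plain terms affected; this selection is an assumption.
rename_to_v010 <- function(nms, terms = c("scientificName", "traitName",
                                          "traitValue", "traitUnit")) {
  out   <- nms
  plain <- nms %in% terms
  std   <- grepl("Std$", nms) & sub("Std$", "", nms) %in% terms
  # plain term -> verbatim term, e.g. scientificName -> verbatimScientificName
  out[plain] <- paste0("verbatim",
                       toupper(substring(nms[plain], 1, 1)),
                       substring(nms[plain], 2))
  # *Std term -> plain term, e.g. scientificNameStd -> scientificName
  out[std] <- sub("Std$", "", nms[std])
  out
}

new_names <- rename_to_v010(c("scientificName", "scientificNameStd", "traitID"))
```

Both rules are evaluated against the original names, so a *Std column that becomes a plain term is not renamed a second time.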
The output created by the standardize functions is parsimonious, i.e. contains only the columns that have been explicitly provided. A template argument should provide the terms and order of terms that are desired as output. This would be used to create a standardized output, e.g. for upload to BExIS or other services that expect a particular structure.
The template could be just a vector of exact column names (from the vocabulary), or a named vector that renames columns according to the desired output. If wrapping around transform, this could even provide new computed columns.
This functionality would add considerable power to the package, since it allows mapping any input onto any output.
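A rough sketch of what such a template argument could do is shown below. The function apply_template and the convention that names are output columns and values are input columns are assumptions made for illustration, not an existing part of the package.

```r
# Reorder, rename and complete columns according to a template.
# Unnamed template: exact column names in the desired order.
# Named template: names are output columns, values are input columns.
apply_template <- function(x, template) {
  out_names <- if (is.null(names(template))) template else names(template)
  # in a partially named vector, fall back to the value where no name is given
  out_names[out_names == ""] <- template[out_names == ""]
  out <- lapply(unname(template), function(col) {
    # columns requested by the template but absent from the data become NA
    if (col %in% names(x)) x[[col]] else rep(NA, nrow(x))
  })
  names(out) <- out_names
  as.data.frame(out, stringsAsFactors = FALSE)
}

d <- data.frame(sp = c("a", "b"), len = c(1, 2), stringsAsFactors = FALSE)
res <- apply_template(d, c(scientificName = "sp",
                           traitValue     = "len",
                           traitUnit      = "unit"))
```

Filling absent columns with NA gives a fixed structure for upload while leaving the parsimonious default output untouched.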
When I try to install the package on Windows, I get the following error message:
Warning: running command 'curl -s -S "http://onlinelibrary.wiley.com/store/10.1002/ecy.1783/asset/supinfo/ecy1783-sup-0002-DataS1.zip?v=1&s=361647dd673d04c9b0838931cda1cf28e1f6eb1f" -o "C:\Users\mbiber\AppData\Local\Temp\RtmpkNdjTP\file2b702e22858.zip"' had status 127
Error in download.file("http://onlinelibrary.wiley.com/store/10.1002/ecy.1783/asset/supinfo/ecy1783-sup-0002-DataS1.zip?v=1&s=361647dd673d04c9b0838931cda1cf28e1f6eb1f", :
'curl' call had nonzero exit status
Error : unable to load R code in package 'traitdataform'
ERROR: lazy loading failed for package 'traitdataform'
removing 'C:/Users/mbiber/Documents/R/win-library/3.4/traitdataform'
Installation failed: Command failed (1)
Cheers,
Matthias
The function standardize.traits() is supposed to map factor levels provided into harmonized factor levels. For this, a more advanced mapping structure might be required in parameter traitmap.
Hi,
This hardly deserves an "issue" but I noticed that the get_gbif_taxonomy documentation says that the default is fuzzy = FALSE but I think the code as written has the default as fuzzy = TRUE. Should be a quick fix when you next update :-)
Hi,
I've encountered an issue with get_gbif_taxonomy breaking when trying to resolve a synonymous genus. See the example below:
traitdataform::get_gbif_taxonomy("Epiptera septentrionalis",subspecies = FALSE, verbose=TRUE, higherrank=FALSE, fuzzy=TRUE ,resolve_synonyms = TRUE )
The problem seems to be that a taxon is flagged as synonymous at any rank, but this function conducts a new get_gbifid_ search for only the species:
taxize::get_gbifid_(temp[[i]]$species[which.max(temp[[i]]$confidence)], messages = verbose)
Which is NULL, breaking the function
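A defensive way to pick the name could look like the sketch below. The helper best_name is hypothetical, and falling back to a canonicalname column is an assumption about the structure of the result table; the stub data frame stands in for one element of temp.

```r
# Pick the best-matching name from one result table, tolerating a missing
# `species` column (as can happen when the match is at genus rank).
best_name <- function(res) {
  if (is.null(res) || nrow(res) == 0) return(NA_character_)
  best <- which.max(res$confidence)
  for (col in c("species", "canonicalname")) {
    val <- res[[col]][best]  # NULL if the column does not exist
    if (!is.null(val) && length(val) == 1 && !is.na(val)) {
      return(as.character(val))
    }
  }
  NA_character_
}

# Stub result without a `species` column
genus_only <- data.frame(canonicalname = "Epiptera", confidence = 95,
                         stringsAsFactors = FALSE)
best_name(genus_only)
```

Returning NA instead of NULL lets the caller skip the second lookup rather than fail.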
each main dataset might contain a Std version, which is harmonized according to the traitdata standard. This requires
I am trying to apply the function standardize.taxonomy to the passerines dataset.
Unfortunately, the call to standardize.taxonomy aborts at the species (Acrocephalus familiaris kingi).
Here is the code:
library(traitdataform)
# Merge Genus and Species into one column
passerines <- tidyr::unite(passerines, Genus, Species, col="scientificName", sep=" ")
# Separate species and subspecies by " " rather than "_"
passerines$scientificName <- sapply(passerines$scientificName, function(x) paste0(strsplit(x, split="_")[[1]][1:2], collapse=" "))
passerines$scientificName <- factor(passerines$scientificName)
passerines_std <- standardize.taxonomy(passerines, return="scientificNameStd")
Thank you very much for your help.
An additional parameter can be set to produce data that can easily be uploaded to BExIS.
It should be a parameter in standardize.exploratories() which will be applied only within the wrapper function standardize() if one of its parameters is set.
I am trying to follow the steps in the README but stumbled over some issues.
I'm running traitdataform_0.2.6 (installed when using devtools::install_github()) in:
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: NixOS 18.03.133070.89ff9f94b67 (Impala)
As this might be a rather specific setup I'm also using an Ubuntu docker container for testing:
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS
In both instances I get the same errors. The first error encountered is:
> library(traitdataform)
> data(carabids)
Warning message:
In data(carabids) : data set 'carabids' not found
This dataset can be manually loaded via:
> source("/usr/local/lib/R/site-library/traitdataform/extdata/carabids.R")
> ls()
[1] "carabids"
Next, the creation of the thesaurus works but an error occurs in a subtask of standardize():
> thesaurus <- as.thesaurus(
+ body_length = as.trait("body_length",
+ expectedUnit = "mm",
+ identifier = "length"
+ ),
+ antenna_length = as.trait("antenna_length",
+ expectedUnit = "mm",
+ identifier = "antenna"
+ ),
+ metafemur_length = as.trait("metafemur_length",
+ expectedUnit = "mm",
+ identifier = "metafemur"
+ ),
+ eyewidth = as.trait("eyewidth_corr",
+ expectedUnit = "mm",
+ identifier = "eyewidth"
+ )
+ )
>
> traitdataset1 <- standardize(carabids,
+ thesaurus = thesaurus,
+ taxa = "name_correct",
+ units = "mm"
+ )
Input is taken to be a species -- trait matrix. If this is not the case, please provide parameters!
Error in taxize::get_gbifid_(resolved$matched_name2, verbose = verbose) :
unused argument (verbose = verbose)
Also setting verbose in standardize to any value explicitly does not help as the internal function taxize::get_gbifid_ does not seem to accept this parameter at all. At least not in the version installed.
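If the argument was renamed between versions of the dependency (an assumption; the error only shows that verbose is not accepted here), a wrapper could check the function's formal arguments before calling it. The functions old_api and new_api below are stand-ins, not taxize functions.

```r
# Call `f` with a verbosity flag under whichever argument name it accepts
call_with_verbosity <- function(f, x, verbose = FALSE) {
  args <- names(formals(f))
  if ("messages" %in% args) {
    f(x, messages = verbose)
  } else if ("verbose" %in% args) {
    f(x, verbose = verbose)
  } else {
    f(x)  # neither argument is known: call without a verbosity flag
  }
}

# Stand-ins mimicking two different function signatures
old_api <- function(sciname, verbose = TRUE) paste("old", sciname)
new_api <- function(sciname, messages = TRUE) paste("new", sciname)

r_old <- call_with_verbosity(old_api, "Carabus arvensis")
r_new <- call_with_verbosity(new_api, "Carabus arvensis")
```

Pinning a minimum version of the dependency in DESCRIPTION would be the simpler fix when older versions need not be supported.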
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] traitdataform_0.2.6
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 xml2_1.2.0 magrittr_1.5 units_0.6-0
[5] getPass_0.2-2 ape_5.1 lattice_0.20-35 R6_2.2.2
[9] rlang_0.2.2 foreach_1.4.4 httr_1.3.1 stringr_1.3.1
[13] plyr_1.8.4 tools_3.4.4 parallel_3.4.4 bold_0.5.0
[17] grid_3.4.4 data.table_1.11.4 nlme_3.1-131 iterators_1.0.10
[21] tibble_1.4.2 httpcode_0.2.0 taxize_0.9.4 crayon_1.3.4
[25] reshape2_1.4.3 codetools_0.2-15 bitops_1.0-6 triebeard_0.3.0
[29] RCurl_1.95-4.11 curl_3.2 crul_0.6.0 stringi_1.2.4
[33] pillar_1.3.0 compiler_3.4.4 urltools_1.7.1 XML_3.98-1.16
[37] jsonlite_1.5 reshape_0.8.7 zoo_1.8-3
Feeding in a data table to link to measurementID, occurrenceID or locationID (for georeferencing exploratories data).
The Ecological Trait-data Standard Vocabulary has been updated to v0.10 in the process of paper re-submission. Key terms have been modified to better reflect the verbatim or standardised character of entries. This now requires some major changes in the as.traitdata() as well as standardize() functions.
verbatim* terms.
There is need for a function that updates an already formatted traitdataset by
derived columns, via mutate() (e.g. to add ratios or indices, or logical traits)
to preview trait definitions
Since the reference to the external trait datasets on Dryad and other sources are not stable enough, a next version should include the raw data in the package. This would include
probably the other demo data would be removed to minimize these issues. Ideally, recipes for pulling datasets from published trait data (e.g. pantheria #43) would be provided elsewhere, e.g. through the Open Traits Network.
Hello traitdataform developers 👋
I don't know to what extent you're still working on traitdataform but I wanted to warn you that it is not available on CRAN with the following message:
Archived on 2021-04-30 as check problems were not corrected in time.
The check issues are available here: https://cran-archive.r-project.org/web/checks/2021/2021-04-30_check_results_traitdataform.html
It seems that there are two causes for the check errors:
arthropodtraits used in several places doesn't seem to be accessible through the package
row.names which make R complain
I can offer to try to make a PR to solve these issues and help have traitdataform back on CRAN as I think it's important that this package is accessible to a maximum number of people!
Thanks :)
Cleaning the package up for CRAN will require quite some work. Check produces plenty of warnings, mostly related to ASCII encoding, incomplete function definitions and documentation, generic method consistency, and some package dependencies.
library('traitdataform')
Error in library("traitdataform") :
there is no package called 'traitdataform'
Execution halted
include updated version,
function to resolve a locationID based on Biodiversities EPPlotID. matches provided ID against lookup table and fills in longitude and latitude.
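The lookup itself can be sketched in a few lines of base R. The plot IDs and coordinates below are placeholders, and the function name resolve_location and the output column names are assumptions made for illustration.

```r
# Hypothetical lookup table; IDs and coordinates are placeholders
plots <- data.frame(
  EPPlotID  = c("P1", "P2"),
  longitude = c(10.0, 11.5),
  latitude  = c(51.0, 52.5),
  stringsAsFactors = FALSE
)

# Match the provided IDs against the lookup table and fill in coordinates;
# IDs without a match receive NA
resolve_location <- function(x, lookup, id_col = "locationID") {
  idx <- match(x[[id_col]], lookup$EPPlotID)
  x$decimalLongitude <- lookup$longitude[idx]
  x$decimalLatitude  <- lookup$latitude[idx]
  x
}

obs <- data.frame(locationID = c("P2", "P9", "P1"), stringsAsFactors = FALSE)
res <- resolve_location(obs, plots)
```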
Loving get_gbif_taxonomy so far.
I just ran it on a list my colleagues maintain of about 20,000 "valid" names of hymenopteran species. About 16 came back with the warning " Selected first of multiple equally ranked concepts!". Of these, the majority meet the following condition: scientificName == scientificNameStd.
However, the ones that do not (at the threshold I used) seem likely to be mis-matched. It would be super helpful, I think, to provide a different warning for these two cases: when going through and manually checking results, it's great to have warnings in cases where the automation probably worked, but it's also nice to be able to focus easily on the ones most likely to be a problem.
Thanks!
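As a workaround on the user side, the two cases can be separated after the fact. The sketch below assumes an output table with columns scientificName, scientificNameStd and warnings; the helper flag_ambiguous, the review column and the placeholder names are made up for this example.

```r
# Split rows with the "multiple equally ranked" warning into two groups,
# depending on whether the standardized name equals the input name
flag_ambiguous <- function(x) {
  ambiguous <- grepl("multiple equally ranked", x$warnings, fixed = TRUE)
  # %in% TRUE treats NA comparisons as "not the same name"
  same_name <- (x$scientificName == x$scientificNameStd) %in% TRUE
  x$review <- ifelse(!ambiguous, "none",
              ifelse(same_name, "ambiguous, name unchanged",
                                "ambiguous, name changed"))
  x
}

# Placeholder names, not real taxa
out <- data.frame(
  scientificName    = c("Aa bb", "Cc dd", "Ee ff"),
  scientificNameStd = c("Aa bb", "Cc xx", "Ee ff"),
  warnings = c(" Selected first of multiple equally ranked concepts!",
               " Selected first of multiple equally ranked concepts!",
               ""),
  stringsAsFactors = FALSE
)
flagged <- flag_ambiguous(out)
```

Rows flagged as "ambiguous, name changed" are then the ones to check by hand first.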
a simple merger function that
This is applying reshape::cast(). Only for 'aggregated' data without multiple measurements of one trait.
Otherwise, definitions must be set for how to aggregate data, if one taxon has multiple measurements for one trait. It is impossible to make advanced assumptions about the data quality.
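To show what an explicit aggregation rule means in practice, here is a small sketch that uses base R's tapply() in place of reshape::cast(). The table and its column names are made up, and the mean is just one possible rule.

```r
# Made-up long table with a repeated measurement for sp1 / length
long <- data.frame(
  taxon = c("sp1", "sp1", "sp1", "sp2"),
  trait = c("length", "length", "mass", "length"),
  value = c(10, 12, 3, 20),
  stringsAsFactors = FALSE
)

# Cast to a taxon x trait matrix; the aggregation rule is stated explicitly
wide <- tapply(long$value, list(long$taxon, long$trait), mean)
```

Combinations without any measurement, such as sp2 and mass here, come out as NA.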