
bolognese's Introduction

DataCite

DataCite is a leading global non-profit organisation that provides persistent identifiers (DOIs) for research data. Our goal is to help the research community locate, identify, and cite research data with confidence.

About this repository

This is the generic DataCite repository for bugs, enhancements, and other issues. DataCite users can add their ideas through the DataCite Roadmap.

bolognese's People

Contributors

actions-user, ashwinisukale, chrisgorgo, cjcolvar, codycooperross, dependabot[bot], digitaldogsbody, jrhoads, kaysiz, kjgarza, larsgw, mfenner, orangewolf, prdanelli, richardhallett, svogt0511, wendelfabianchinsamy

bolognese's Issues

implementation details in the schema.org representation of articles

Hey @mfenner ,

Apologies if this really isn't the right place to discuss this, please direct me if there's a better thread.

I'm really looking for your input on the best implementation of an article citation in schema.org (I know so far it's only DataCite and not CrossRef that offers a schema.org format, but still).

In particular, I'm looking at the examples schema.org gives for ScholarlyArticle. One of the fascinating but also complete nuisance elements of their implementation is the explicit enumeration of issue and volume through isPartOf. Where bibliographic formats (say, BibTeX or CSL) treat volume and issue as 'top-level' attributes of an article, schema.org uses the more semantically precise structure: the article isPartOf an issue, which in turn isPartOf a volume of a periodical.

Similarly, the datePublished is given not as a property of the article but as a property of the issue. This is again perhaps more semantically precise, at least as far as the publication-print date (to use the CSL term) is concerned, but it is not the most convenient or perhaps expected structure. Following this structure also makes it harder to know what to do with online-only articles or entries otherwise missing volume/issue data.

Anyway, I guess these choices make sense, but I wanted to get your thoughts. (I do note that schema.org shows two different versions, both of which use isPartOf, but the third is less explicit and simply indicates that the ScholarlyArticle isPartOf both a PublicationIssue and a PublicationVolume / Periodical ... maybe this third example is the best compromise?)
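
To make the nesting concrete, here is a minimal Ruby sketch of the explicit form described above, built as a plain hash and serialized with the standard json library. The article name, journal name, date, and volume/issue numbers are made-up placeholders, and putting datePublished on the issue simply follows the description of the schema.org examples given in this thread.

```ruby
require "json"

# Hypothetical article: every name and number below is a placeholder.
article = {
  "@context" => "http://schema.org",
  "@type" => "ScholarlyArticle",
  "name" => "An example article",
  "isPartOf" => {
    "@type" => "PublicationIssue",
    "issueNumber" => "3",
    "datePublished" => "2017-05-01",
    "isPartOf" => {
      "@type" => "PublicationVolume",
      "volumeNumber" => "12",
      "isPartOf" => { "@type" => "Periodical", "name" => "Example Journal" }
    }
  }
}

# Walk up the isPartOf chain and collect the @type of each level.
def container_types(node)
  types = []
  while (node = node["isPartOf"])
    types << node["@type"]
  end
  types
end

puts JSON.pretty_generate(article)
```

A consumer that wants "top-level" volume and issue values has to walk this chain, which is the inconvenience raised above; an online-only article would simply have a shorter chain or none at all.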

Title field of DOI 10.1104/pp.111.178582

Some article titles contain italics, for instance for biological or chemical nomenclature. Most publishers do not seem to carry that markup into the DOI metadata.
Meanwhile I found an exception, https://doi.org/10.1104/pp.111.178582. The Crossref metadata already looks creepy (https://api.crossref.org/v1/works/http://dx.doi.org/10.1104/pp.111.178582). Bolognese returns "name": "\n Ectopic Expression of\n \n in\n \n : Creating a Metabolic Sink Has Tissue-Specific Consequences for the Jasmonate Metabolic Network and Silences Downstream Gene Expression\n " (note the missing "AtJMT" and "Nicotiana attenuata"). It would be cool to get the complete title without the newlines and the spaces.
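
The whitespace half of this request could be handled with a small normalization step, sketched below in plain Ruby. The helper name normalize_title is hypothetical and not part of bolognese, and the input is a shortened version of the string quoted above. It does not bring back the missing italic parts ("AtJMT", "Nicotiana attenuata"); those would have to be extracted from the nested markup in the Crossref record.

```ruby
# Collapse runs of whitespace (including newlines) into single spaces
# and trim both ends. Hypothetical helper, not a bolognese method.
def normalize_title(title)
  title.to_s.gsub(/\s+/, " ").strip
end

raw = "\n Ectopic Expression of\n \n in\n \n : Creating a Metabolic Sink\n "
puts normalize_title(raw)
```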

Undefined method `dig' in read_crossref

Running bolognese with the DOI 10.1111/nph.14619 fails:

$ bolognese https://doi.org/10.1111/nph.14619
/var/lib/gems/2.3.0/gems/bolognese-0.9.65/lib/bolognese/readers/crossref_reader.rb:156:in `read_crossref': undefined method `dig' for nil:NilClass (NoMethodError)
        from /var/lib/gems/2.3.0/gems/bolognese-0.9.65/lib/bolognese/metadata.rb:113:in `initialize'
        from /var/lib/gems/2.3.0/gems/bolognese-0.9.65/lib/bolognese/cli.rb:32:in `new'
        from /var/lib/gems/2.3.0/gems/bolognese-0.9.65/lib/bolognese/cli.rb:32:in `convert'
        from /var/lib/gems/2.3.0/gems/thor-0.20.0/lib/thor/command.rb:27:in `run'
        from /var/lib/gems/2.3.0/gems/thor-0.20.0/lib/thor/invocation.rb:126:in `invoke_command'
        from /var/lib/gems/2.3.0/gems/thor-0.20.0/lib/thor.rb:387:in `dispatch'
        from /var/lib/gems/2.3.0/gems/thor-0.20.0/lib/thor/base.rb:466:in `start'
        from /var/lib/gems/2.3.0/gems/bolognese-0.9.65/bin/bolognese:6:in `<top (required)>'
        from /usr/local/bin/bolognese:23:in `load'
        from /usr/local/bin/bolognese:23:in `<main>'
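
The error says that dig was called on nil, i.e. an expected part of the parsed Crossref record was missing. The sketch below shows two general Ruby ways to guard such a lookup. The hash layout and the keys "crossref", "journal", and "title" are assumptions made up for illustration, not the actual code at crossref_reader.rb:156.

```ruby
# Hypothetical parsed response in which an expected branch is missing.
meta = { "crossref" => nil }

# meta["crossref"].dig("journal", "title") would raise NoMethodError here.
# Safe navigation (Ruby 2.3+) returns nil instead of raising.
title = meta["crossref"]&.dig("journal", "title")

# Hash#dig on the outer hash also stops at the first nil intermediate value.
same = meta.dig("crossref", "journal", "title")

puts title.inspect
puts same.inspect
```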

crossref author id to orcid id?

Hi @mfenner,

Quick question: Is there a way to map the Crossref author ids I get back in the Crossref XML to ORCID iDs (assuming the author has one)? Do any Crossref content negotiation (cn) types return ORCID iD data for articles? Thanks!

Rename periodical to container

Container is a more generic term and a better fit to describe the relationship to a containing repository, journal, or event series. It should support the following information:

  • type (e.g. repository, journal, series)
  • identifier and identifier type (could be multiple)
  • name
  • volume
  • issue
  • page numbers
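
As a rough illustration of the fields listed above, a container could be modelled as a small Struct. The name Container, the field names, the to_h_compact helper, and all sample values are hypothetical; this is a sketch of the shape, not the structure bolognese uses.

```ruby
# Hypothetical container record; field names follow the list above.
Container = Struct.new(:type, :identifiers, :name, :volume, :issue, :pages) do
  # Drop fields that are nil or empty so sparse containers stay small.
  def to_h_compact
    to_h.reject { |_, v| v.nil? || (v.respond_to?(:empty?) && v.empty?) }
  end
end

# A journal container carries volume, issue, and pages (placeholder values).
journal = Container.new(
  "journal",
  [{ "id" => "1234-5678", "type" => "ISSN" }],
  "Example Journal", "12", "3", "45-67"
)

# A repository container typically has none of those.
repo = Container.new("repository", [], "Example Repository")

puts journal.to_h_compact
puts repo.to_h_compact
```

Keeping identifiers as an array leaves room for the "could be multiple" case without changing the record's shape.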

inconsistent name parsing in JSON-LD conversion

Consider the following pair of <creators> blocks:

<creator>
    <creatorName>Mahdi D.</creatorName>
    <nameIdentifier schemeURI="http://orcid.org/" nameIdentifierScheme="ORCID">0000-0001-5000-0007</nameIdentifier>
</creator>
<creator>
    <creatorName>Soubiran C.</creatorName>
</creator>
<creator>
    <creatorName>Blanco-Cuaresma S.</creatorName>
</creator>
<creator>
    <creatorName>Chemin L.</creatorName>
</creator>

and

<creator>
    <creatorName>D. Mahdi</creatorName>
    <nameIdentifier schemeURI="http://orcid.org/" nameIdentifierScheme="ORCID">0000-0001-5000-0007</nameIdentifier>
</creator>
<creator>
    <creatorName>C. Soubiran</creatorName>
</creator>
<creator>
    <creatorName>S. Blanco-Cuaresma</creatorName>
</creator>
<creator>
    <creatorName>L. Chemin</creatorName>
</creator>

For the first block, the bolognese JSON-LD export with the given name/surname parser works for all authors like magic. For the second block, it works correctly for the first <creator> only and is not applied to the subsequent entries. This seems like a bug.

The full XML of the first entry is attached in this zip file: test.zip
To be clear, this is all test data rather than real data, so the DOI prefix is borrowed.

This was with bolognese --version == 0.9.22
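
For reference, the behaviour I would expect can be sketched with a simple heuristic that treats initials as the given name wherever they appear. This is a hypothetical illustration with a made-up helper name, not how bolognese parses names, and it only covers the "initial plus surname" pattern used in these test records.

```ruby
# Hypothetical heuristic: tokens that look like initials ("D.", "C") form
# the given name; everything else forms the family name.
def split_name(creator_name)
  tokens = creator_name.strip.split(/\s+/)
  initials, rest = tokens.partition { |t| t.match?(/\A[[:upper:]]\.?\z/) }
  { "givenName" => initials.join(" "), "familyName" => rest.join(" ") }
end

["Mahdi D.", "D. Mahdi", "S. Blanco-Cuaresma", "Chemin L."].each do |name|
  puts split_name(name)
end
```

With this approach both orderings of every creator give the same given/family split, which is the consistency the two blocks above are meant to test.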

Support funding information

Support funding information in Crossref and DataCite metadata (both before and after schema 4.0 release).

Case study for JATS export

References for DOI https://doi.org/10.7554/elife.01567

[{
    "type": "CreativeWork",
    "id": "https://doi.org/10.1038/nature02100",
    "title": "APL regulates vascular tissue identity in Arabidopsis"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1534/genetics.109.104976",
    "title": "In the beginning was the worm"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1034/j.1399-3054.2002.1140413.x",
    "title": "Secondary xylem development in Arabidopsis: a model for wood formation"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1162/089976601750399335",
    "title": "Training nu-support vector classifiers: theory and algorithms"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1007/bf00994018",
    "title": "Support-vector Networks"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/j.semcdb.2009.09.009",
    "title": "Stem cell function during plant vascular development"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1242/dev.091314",
    "title": "WOX4 and WOX14 act downstream of the PXY receptor kinase to regulate plant vascular proliferation independently of any role in vascular organisation"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1371/journal.pgen.1002997",
    "title": "Plant vascular cell division is maintained by an interaction between PXY and ethylene signalling"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1038/msb.2010.25",
    "title": "Clustering phenotype populations by genome-wide RNAi and multiparametric imaging"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/j.biosystems.2012.07.004",
    "title": "BaSAR-A tool in R for frequency detection"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/j.pbi.2005.11.013",
    "title": "Developmental mechanisms regulating secondary growth in woody plants"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1105/tpc.110.076083",
    "title": "TDIF peptide signaling regulates vascular stem cell proliferation via the WOX4 homeobox gene in Arabidopsis"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1073/pnas.0808444105",
    "title": "Non-cell-autonomous control of vascular stem cell fate by a CLE peptide/receptor system"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/0092-8674(89)90900-8",
    "title": "Arabidopsis, a useful weed"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1126/science.1066609",
    "title": "Plants compared to animals: the broadest comparative study of development"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1104/pp.104.040212",
    "title": "A weed for wood? Arabidopsis as a genetic model for xylem development"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1038/nbt1206-1565",
    "title": "What is a support vector machine?"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1073/pnas.77.3.1516",
    "title": "Classification of cultured mammalian cells by shape analysis and pattern recognition"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1093/bioinformatics/btq046",
    "title": "EBImage–an R package for image processing with applications to cellular phenotypes"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1105/tpc.111.084020",
    "title": "Mobile gibberellin directly stimulates Arabidopsis hypocotyl xylem expansion"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.5061/dryad.b835k",
    "title": "Data from: Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/j.cub.2008.02.070",
    "title": "Flowering as a condition for xylem expansion in Arabidopsis hypocotyl and root"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1111/j.1469-8137.2010.03236.x",
    "title": "Evolution of development of vascular cambia and secondary growth"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1007/s00138-011-0345-9",
    "title": "Cell morphology classification and clutter mitigation in phase-contrast microscopy images using machine learning"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1016/j.cell.2012.02.048",
    "title": "Mechanical stress acts via katanin to amplify differences in growth rate between adjacent cells in Arabidopsis"
}, {
    "type": "CreativeWork",
    "id": "https://doi.org/10.1038/ncb2764",
    "title": "A screen for morphological complexity identifies regulators of switch-like transitions between discrete cell shapes"
}]

Bolognese at the RubyDataScience list

Dear @mfenner ,

we've recently added bolognese to our upcoming RubyDataScience resource list: https://github.com/arbox/data-science-with-ruby

You could help us spread the word about Ruby in non-web contexts and connect with the RubyDataScience network. It increases visibility and makes Ruby-based tools more useful for the community.

For that purpose please consider adding the rubydatascience topic to your repository.

Thank you in advance!

Implement proxyIdentifiers for Event Data

Filter out relatedIdentifiers of the following types and put them into a proxyIdentifiers array:

  • isIdenticalTo
  • isVersionOf
  • isPartOf
  • isSupplementTo

The assumption is that credit can be transferred from these related identifiers.
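
The filtering step could look roughly like the Ruby sketch below. The hash keys "id" and "relationType", the helper name, and the sample DOIs are assumptions for illustration; relation types are compared case-insensitively because the spelling in the metadata may differ from the list above.

```ruby
# Relation types from the list above, lowercased for comparison.
PROXY_RELATION_TYPES = %w[isidenticalto isversionof ispartof issupplementto].freeze

# Split related identifiers into proxy identifiers and the rest.
# Hypothetical helper; the input is an array of hashes.
def split_proxy_identifiers(related_identifiers)
  proxy, other = related_identifiers.partition do |r|
    PROXY_RELATION_TYPES.include?(r["relationType"].to_s.downcase)
  end
  { "proxyIdentifiers" => proxy, "relatedIdentifiers" => other }
end

# Placeholder records, not real metadata.
related = [
  { "id" => "https://doi.org/10.5555/example-a", "relationType" => "IsSupplementTo" },
  { "id" => "https://doi.org/10.5555/example-b", "relationType" => "References" }
]

result = split_proxy_identifiers(related)
puts result
```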
