Comments (7)

atomrab commented on September 26, 2024

I'm not sure what column mapped to localizedLabel, but I think it goes something like this.

The original label is "Early Bronze Age Period (Early Cypriote I)".

The localizedLabel is our editorial cleanup of the first part of that label, for display in the interface, always in English (or at least we called it label_en in the spreadsheet): "Early Bronze Age" (dropping the "Period", and moving the rest to the alternate label).

The alternateLabel is "Early Cypriote I". We didn't make it up; we separated it from the original label, which had an internal alternate (again, always in English, translated from the original label).

We always did alternate labels in English in the spreadsheet, even when we'd pulled them out of a combined period term in another language. Consider this example:

Original term=label: "Mesjetë /Bizantine" (Albanian)
Our localizedLabel (label_en in the spreadsheet): "Medieval" (translation of Albanian "Mesjetë")
Our alternateLabel (label_alt in the spreadsheet): "Byzantine" (translation of Albanian "Bizantine")

Is this enough for semantic consistency, or should we also provide the alternate term as a separate value in the original language, and then as an English translation? I'd prefer the current system for simplicity, but will it make search in other languages harder?

from periodo-data.

ptgolden commented on September 26, 2024

The localizedLabel is not always in English in the dataset. For example:

"id": "p0vn2fr2ft3",
"alternateLabel": [
  "Late Mesolithic"
],
"label": "Yngre mesolitikum",
"localizedLabel": {
  "eng-latn": "Late Mesolithic",
  "swe-latn": "Yngre mesolitikum"
},
...

rybesh commented on September 26, 2024

There is (purposefully) some redundancy here.

A period definition always has a label that is how it appeared in the source. Examples:

  • Early Bronze Age Period (Early Cypriote I)
  • Mesjetë e hershme

If the label is not English or it needs some cleanup/normalization, a period definition may be given one or more alternateLabels. These are always English. Examples:

  • Early Bronze Age III
  • Early Cypriot III
  • Early Medieval

The localizedLabel is simply there to assign language tags to the various labels. It should not contain any values that are not already a label or an alternateLabel.

Looking at the data, I see a few cases where there is a 2nd alternateLabel that does not appear in localizedLabel, which I guess is a data quality bug.
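The rule above (localizedLabel should only repeat values that are already the label or an alternateLabel) could be checked with something like the following. This is only a rough sketch: `stray_localized_values` is a made-up helper name, and it assumes each period definition is a plain dict shaped like the JSON fragment quoted earlier in this thread.

```python
def stray_localized_values(period):
    """Return localizedLabel values that are neither the label nor an alternateLabel."""
    # Everything localizedLabel is allowed to contain.
    allowed = {period["label"], *period.get("alternateLabel", [])}
    return sorted(
        value
        for value in period.get("localizedLabel", {}).values()
        if value not in allowed
    )


# The period definition quoted earlier in the thread.
period = {
    "id": "p0vn2fr2ft3",
    "label": "Yngre mesolitikum",
    "alternateLabel": ["Late Mesolithic"],
    "localizedLabel": {
        "eng-latn": "Late Mesolithic",
        "swe-latn": "Yngre mesolitikum",
    },
}

print(stray_localized_values(period))  # prints []
```

An empty list means the definition follows the rule; any value returned is one that was introduced only in localizedLabel.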

ptgolden commented on September 26, 2024

Does that mean that localizedLabel should be able to have multiple entries for a single language?

ptgolden commented on September 26, 2024

(i.e. a localizedLabel value for each different alternateLabel)?

rybesh commented on September 26, 2024

No. Now I remember why I only included at most one alternateLabel in localizedLabel: the JSON-LD language indexing stuff will only work with one value per language tag. So it's fine. If they add more than one alternateLabel, just put the first one in localizedLabel.
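Roughly, "just put the first one in localizedLabel" might look like the sketch below. The function name and parameters are hypothetical, the language tags follow the `eng-latn` / `swe-latn` style seen in the dataset, and it assumes that when the source label is itself English it keeps the English slot (the thread does not settle that case).

```python
def build_localized_label(label, label_lang, alternate_labels, alt_lang="eng-latn"):
    """Build a localizedLabel map holding at most one value per language tag."""
    # The source label is always tagged with its own language.
    localized = {label_lang: label}
    if alternate_labels:
        # Only the first alternateLabel is used. setdefault leaves the source
        # label in place if it already occupies alt_lang (an assumption).
        localized.setdefault(alt_lang, alternate_labels[0])
    return localized


localized = build_localized_label(
    "Yngre mesolitikum", "swe-latn", ["Late Mesolithic", "Younger Mesolithic"]
)
print(localized["eng-latn"])  # prints Late Mesolithic
```

The second alternate here ("Younger Mesolithic") is invented for illustration; it stays in alternateLabel only and never reaches localizedLabel.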

ptgolden commented on September 26, 2024

My issues are resolved in #19.
