Comments (5)
Agreed that this is too lax and we should probably remove this behaviour. It is used in practice though, so it would be a change from FoLiA v1.6 onward, as we can't demand this from older versions due to backward compatibility. On a related note: I think we should also be strict in demanding declarations for structural elements (token, paragraph, sentence). These are now optional, but if you really want the declarations to be meaningful, they'd better be strict too.
For the lazy users we can provide a tool that automatically generates some ad-hoc declarations.
from folia.
yes, let's do this:
- don't allow missing setnames for >= 1.6
- provide a simple upgrade script to 1.6 (might also include test fixing and some other goodies...)
I am still a bit reluctant towards making declarations mandatory. But in the long run it might be needed. So start requiring this too.
A conversion script that adds some default declarations might be more difficult, though.
from folia.
Just for clarity: of course the libraries should still remain capable of parsing pre-1.6 documents (with the missing setnames and all). The upgrade script is not a replacement for that but just an additional tool.
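To make the version gating concrete, here is a minimal sketch of how a parser could stay lenient for pre-1.6 documents while rejecting missing setnames from 1.6 on. It is plain Python, not FoLiApy code; the names `parse_version`, `missing_set_allowed` and `check_declaration` are hypothetical and only illustrate the idea as it stands at this point in the discussion.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as "1.5.1" into a comparable tuple (1, 5, 1)."""
    return tuple(int(part) for part in version.split("."))


def missing_set_allowed(document_version: str) -> bool:
    """Only documents older than 1.6 may omit the set name (assumed cut-off)."""
    return parse_version(document_version) < (1, 6)


def check_declaration(document_version: str, annotation_type: str,
                      set_name: Optional[str]) -> None:
    """Raise ValueError for a declaration without a set in a document >= 1.6."""
    if set_name is None and not missing_set_allowed(document_version):
        raise ValueError(
            f"{annotation_type}: missing set name is not allowed "
            f"in version {document_version}"
        )
```

With this, `check_declaration("1.5", "pos", None)` passes silently, while the same call with `"1.6"` raises, so old documents keep parsing and only new ones are held to the stricter rule.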
from folia.
We do have something extra to consider: for certain annotation types the set is optional (this applies to a lot of structure elements), or in rare cases perhaps not present at all. In such cases a declaration without a set is permitted.
I also want to enforce that if there is a set, then there must be a class on the annotations (and obviously if there is no set, there can't be a class on annotations).
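The set/class rule can be sketched as a small check. This is an illustration only, not how any of the libraries implement it; `set_class_error` is a hypothetical helper that returns an error message, or `None` when the combination is consistent.

```python
from typing import Optional


def set_class_error(set_name: Optional[str],
                    class_name: Optional[str]) -> Optional[str]:
    """Return a message if set and class are inconsistent, else None."""
    if set_name is not None and class_name is None:
        # A set is declared, so the annotation must carry a class.
        return "annotation has a set but no class"
    if set_name is None and class_name is not None:
        # No set is declared, so a class would be meaningless.
        return "annotation has a class but no set"
    return None
```

Both "set with class" and "neither set nor class" come back as `None`; the two mixed cases are reported.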
from folia.
In summary, for FoLiA 2.0:
- Certain declarations may be set-less
  - this is indicated in the new documentation
  - this is determined by whether `class` is a required property for an annotation type or not; if it is, then it can never be set-less.
- Annotations that assign classes must always have a set
- There is no "undefined" set anymore that may get assigned automatically, but:
  - For text-annotation and phon-annotation, a default set does get assumed (if no set is specified), but this is an explicit set now. This is a transition from the situation in which text was kind of set-less in FoLiA 1 but we had a predefined class (current).
- All annotation types (including structural ones and text itself) need to be declared (the FoLiApy library can do this automatically to a certain extent)
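
The rules in this summary could be checked roughly as follows. This is a sketch under assumptions, not FoLiApy's implementation: the `CLASS_REQUIRED` table and the `DEFAULT_SETS` values are made-up placeholders (the real per-type properties and default set names are whatever the FoLiA 2.0 documentation specifies), and the function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical table: does this annotation type require a class?
CLASS_REQUIRED: Dict[str, bool] = {
    "pos": True,
    "lemma": True,
    "sentence": False,
    "paragraph": False,
}

# Placeholder names standing in for the explicit default sets.
DEFAULT_SETS: Dict[str, str] = {
    "text": "placeholder-default-text-set",
    "phon": "placeholder-default-phon-set",
}


def resolve_declaration(annotation_type: str,
                        set_name: Optional[str]) -> Optional[str]:
    """Return the effective set of a declaration, or None if it is set-less."""
    if set_name is not None:
        return set_name
    if annotation_type in DEFAULT_SETS:
        # text and phon get an explicit default set, not an "undefined" one.
        return DEFAULT_SETS[annotation_type]
    if CLASS_REQUIRED.get(annotation_type, False):
        raise ValueError(f"{annotation_type}: declaration must have a set")
    return None  # a set-less declaration is permitted for this type


def check_annotation(declared: Dict[str, Optional[str]],
                     annotation_type: str,
                     class_name: Optional[str]) -> None:
    """Check one annotation against the resolved declarations."""
    if annotation_type not in declared:
        raise ValueError(f"{annotation_type}: annotation type is not declared")
    if class_name is not None and declared[annotation_type] is None:
        raise ValueError(f"{annotation_type}: class assigned without a set")
```

Here `declared` maps each declared annotation type to its effective set as returned by `resolve_declaration`, so an undeclared type, or a class on a set-less type, is rejected.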
from folia.