Comments (4)
The following cases are possible:
- The CAS contains exactly one `org.apache.uima.jcas.tcas.DocumentAnnotation` annotation.
- The CAS contains more than one `org.apache.uima.jcas.tcas.DocumentAnnotation` annotation (in DKPro Core, this would be considered a bug, but there is nothing in UIMA Core preventing this scenario).
- The first or second case, except that the annotation is actually a subclass of `DocumentAnnotation` (DKPro Core actually uses a subclass, namely `DocumentMetaData`).
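The cases above can be told apart by counting every feature structure whose type is the document annotation type or one of its subtypes. The following is only a minimal sketch in plain Python, not the cassis or UIMA API: `FeatureStructure`, `SUPERTYPES`, `is_subtype`, and `classify` are hypothetical names, and the parent map is a hand-written stand-in for a real type system.

```python
from dataclasses import dataclass

# UIMA type names (the JCas class org.apache.uima.jcas.tcas.DocumentAnnotation
# corresponds to the type uima.tcas.DocumentAnnotation).
DOCUMENT_ANNOTATION = "uima.tcas.DocumentAnnotation"
DOCUMENT_META_DATA = "de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData"

# Hypothetical parent map standing in for a real type system.
SUPERTYPES = {
    DOCUMENT_META_DATA: DOCUMENT_ANNOTATION,
    DOCUMENT_ANNOTATION: "uima.tcas.Annotation",
    "uima.tcas.Annotation": "uima.cas.AnnotationBase",
    "uima.cas.AnnotationBase": "uima.cas.TOP",
}


@dataclass
class FeatureStructure:
    """Toy feature structure that only remembers its type name."""
    type_name: str


def is_subtype(type_name, ancestor, supertypes=SUPERTYPES):
    """Return True if type_name equals ancestor or inherits from it."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = supertypes.get(current)
    return False


def classify(feature_structures):
    """Report how many document-level annotations (including subtypes) exist."""
    found = [
        fs for fs in feature_structures
        if is_subtype(fs.type_name, DOCUMENT_ANNOTATION)
    ]
    if not found:
        return "none"
    return "one" if len(found) == 1 else "multiple"
```

Checking against subtypes rather than the exact type name is what makes a `DocumentMetaData` instance count as the document annotation in the third case.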
The document annotation is often the first one created in the CAS, but I wouldn't rely on that. E.g. when a user creates a CAS via uimaFIT in a DKPro Core context and sets the document string, first a `DocumentAnnotation` is created. This can later be removed and replaced with a `DocumentMetaData`. I would expect this scenario to have an effect on the ID.
`org.apache.uima.jcas.tcas.DocumentAnnotation` is "predefined", but contrary to all other predefined types, it is not in UIMA Core. It is defined in a separate JAR and a user could decide to redefine the type as well as use their own custom JCas class with additional fields. `DocumentMetaData` is not "predefined" - it is a part of the DKPro Core typesystem.
The (J)CAS APIs assume that there is always some document-level annotation of type `DocumentAnnotation` (or compatible like `DocumentMetaData`) and return this when `CAS.getDocumentAnnotation()` is called. Furthermore, they assume that there is a field `language` in the document annotation which can be accessed via the shorthand `CAS.get/setDocumentLanguage()` (equiv. to `CAS.getDocumentAnnotation().get/setLanguage()`).
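To illustrate the delegation described above, here is a toy Python model, not the UIMA or cassis implementation. `ToyCas` and its methods are hypothetical names, `document_title` is only an illustrative extra field, and raising `LookupError` when no document annotation exists is a choice of this sketch rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DocumentAnnotation:
    """Stand-in for the document-level annotation with a language field."""
    language: Optional[str] = None


@dataclass
class DocumentMetaData(DocumentAnnotation):
    """Stand-in for a compatible subclass, as used by DKPro Core."""
    document_title: Optional[str] = None  # illustrative extra field


@dataclass
class ToyCas:
    feature_structures: List[object] = field(default_factory=list)

    def get_document_annotation(self):
        # Any instance of DocumentAnnotation or a subclass qualifies.
        for fs in self.feature_structures:
            if isinstance(fs, DocumentAnnotation):
                return fs
        # Assumption of this sketch: fail loudly if none is present.
        raise LookupError("no document annotation in this CAS")

    def get_document_language(self):
        # Shorthand: delegates to the document annotation's language field.
        return self.get_document_annotation().language

    def set_document_language(self, language):
        self.get_document_annotation().language = language
```

With this model, `ToyCas([DocumentMetaData()])` followed by `set_document_language("en")` stores the value on the `DocumentMetaData` instance itself, which is why a serializer that drops the document annotation also drops the document language.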
from dkpro-cassis.
Right now, I do not serialize it at all, which is not so good.
@Br0ce I made it so that we do not crash if a type system already defines a custom `DocumentAnnotation`. Can you try out the master branch?
@jcklie tested with and without `DocumentAnnotation`. Looks good to me.