cei2tei's People

Contributors

gvogeler, larkvi, maburg, ntsch

cei2tei's Issues

Clarify use cases close to current tags

Document the differences between our ODD's usage and existing TEI P5 usage, and the reasons for them, so that users can understand which to use and members of the community can understand the differences well enough either (1) not to accuse us of duplicating current tags or (2) to suggest ways the uses can be harmonized.

  • legalActor type="recipient" vs. correspAction type="received"
  • legalActor type="issuer" vs. correspAction type="sent"
  • diploPart type="protocol"/"datatio"/"salutatio" vs. opener/dateLine/salute
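
To make the first two pairs concrete, the sketch below sets the two encodings side by side. The correspDesc half is standard TEI P5. The legalActor half is an assumption: it presumes that legalActor may appear inside a paragraph and may contain persName, which should be checked against the ODD. The bracketed names are placeholders.

```xml
<!-- Standard TEI P5: correspondence metadata, placed in profileDesc -->
<correspDesc>
  <correspAction type="sent">
    <persName>[issuer name]</persName>
  </correspAction>
  <correspAction type="received">
    <persName>[recipient name]</persName>
  </correspAction>
</correspDesc>

<!-- This ODD: roles in the legal act (placement and content model assumed) -->
<p>
  <legalActor type="issuer"><persName>[issuer name]</persName></legalActor>
  <legalActor type="recipient"><persName>[recipient name]</persName></legalActor>
</p>
```

One distinction the documentation might draw is that correspAction records who sent and who received a message, whereas legalActor records a party's role in the legal act itself.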

How should traditioForm be structured?

Currently, traditioForm has a wide range of structures (see the following list). Obviously, the type of document needs to be typed in order to control the vocabulary, but how should the identification of the original be treated?

orig.

Original
cop.
ins.
orig.nein, unbekannt
orig.nein, verloren
cop.
orig.Orig.
orig.Abschrift vom 18.12.1686
orig.Kopie
orig.Abschrift
orig.Orig. Perg. Bulle Haus-, Hof- und Staatsarchiv Wien.
orig.mit einer Kopie und einem Begleitschreiben
orig.Breve mit zwei Abschriften
orig.Breve mit Abschrift
orig.Orig. und eine Kopie
orig.Breve mit drei Abschriften
orig.Orig. und 1 Kopie
orig.Drei Urkundenabschriften auf einem Blatt
orig.Abschrift von 146/1
orig.Druck
orig.Vordruck
orig.MS
Orig.
Kopie
Orig. Fragment
beglaubigte Abschrift
Cop.
Orig. Bulle
2 cop.
Abschrift
2 Cop.
Cop.
2 Cop.
Oeig.
2 Orig.
Kopie von 1620
Orig. + 1 Cop.
3 Orig.
fehlt
Depositum
DAL
OÖLA
orig.2 Ausfertigungen
orig.Bulle "Sincere devotionis"

orig.Bulle "Digna exaudicione vota"

orig.Bulle "Romanus pontifex beati Petri"

orig.2 Exemplare
orig.Bulle "Apostolatus officium"

orig.auf der Rückseite von 1515 Juni 27
orig.Bulle "Divine providentie altitudo"
orig.Breve
orig. (?), Libell, 4 fol.
orig.Libell, 8 fol.
orig.Libell, 12 fol.
orig.Libell, 32 fol. Pap.;
orig.Bulle "Apostolatus officium".
orig.Bulle "Gratis divine premium"

orig.Bulle "Apoatolatus officium"

orig.Libell
orig.Libell, 2 fol.
orig.Bulle "Apostolatus officium"

orig.Bulle Apostolicea Sedis consueta clementia"

orig.Bulle "Hodie ecclesiae"

orig.Bulle "Ad cumulum"

orig.Bulle "Cum nos nuper"

orig.Bulle "Cum nobis nuper"

orig.Bulle "Hodie ecclesie"

orig.Bulle "Apostolice sedis consueta clementia"

orig.Konsistorialratifikation vom 11. /12. April 1642;
orig.Libell, 6 fol.
orig.nein, Konzept
orig."Apostolatus officium"

orig.Bulle "Cum nos pridem"

orig.Bulle "Hodie ecclesiae"

orig.Bulle "Personam tuam"

orig.Bulle "Hodie venerabilem fratrem"

orig.Bulle "Nobilitas generia"

orig.Bulle "Hodie Ecclesiae"

orig.Bulle "Cum nos hodie"

orig.Bulle "Apostolice sedis consueta clementia"

orig.Bulle "Romani pontificis"

orig.Bulle "Hodie venerabilem fratrem"

orig.Bulle "Gratie divine premium"

orig.Libell, 14 fol.
orig.Libell, 6 fol., 2 Exemplare.
orig.3 Exemplare.
orig.Bulle "Apostolicae Sedis consueta clementia"

orig.Bulle "Nobilitas generis"

orig.Libell, 4 fol.
orig.3 Ausfertigungen
orig.Libell, 12 + 2 fol.
orig.Libell, 23 fol.
orig.in Bullenform
orig.Libell, 32 fol.
ins. Beglaubigte Abschrift Papierlibell 21,3 x 33,8, 28 fol., Beglaubigung 29. Juni 1728, eh. Franz Joseph Eder, Dr. jur., Lehenpropst und Landgerichtsverwalter der Herrschaft Ybbsitz.
ins. Unbeglaubigte Abschrift Papierlibell 20,7x 31, 26 fol.
orig.Bulle "Personam venerabilis fratris"

orig.2 Exempla
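
One possible direction, sketched under assumptions: a controlled value goes into an attribute while the archive's own wording stays as element content. The type and subtype attributes and their values ("original", "copy", "authenticated") are hypothetical and not part of the current ODD. The three source strings are taken from the list above.

```xml
<!-- source value: "Orig." -->
<traditioForm type="original">Orig.</traditioForm>

<!-- source value: "beglaubigte Abschrift" -->
<traditioForm type="copy" subtype="authenticated">beglaubigte Abschrift</traditioForm>

<!-- source value: "Kopie von 1620" -->
<traditioForm type="copy">Kopie von 1620</traditioForm>
```

Keeping the original string as content would let the vocabulary be controlled without losing details such as dates, folio counts, or incipits that do not fit a closed list.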

Additions Ontology

Additions needs to be referenceable as an IRI.

  • (draft complete)
  • (SKOS writeup)
  • add cross-references and intersections
  • validate
  • circulate for feedback

<availability>

Currently, the availability element is defined but unused in the public charters. Since it is a standard header element, what should we put in it when transforming the files? Would the Terms of Use serve?

Transforming the .odd

We encountered a few issues when trying to convert the .odd file.

ROMA

Converting the tei_cei.odd file with ROMA to .rnc or to .xsd is not possible:

  • the .rnc export returns an unreadable file
  • the .xsd export returns an unreadable file

However, exporting to .rng works.

Oxygen

Converting with Oxygen:

  • to .xsd returns an invalid schema
  • to .rnc and .rng works

TEIgarage

Converting with TEIgarage

  • to .xsd returns two files (document.xsd, xml.xsd)
  • to .rnc works
  • to .rng works

Is there a reason why the conversion process varies from tool to tool?
Is there or will there be a working .xsd version of the schema?

Is a chop or Jitsu-in type=signed for the purposes of authen?

There is an existing stamp element that can be turned into an authenticating element, like seal, but should stamps used as signatures be treated as such (like notarial marks or crosses), or should they be their own category?

Also: Should adhesive stamps with kinegrams, like those on computers, establishing authentic make, be treated as another category?

Is setPhrase useful in an era of text processors?

setPhrase seems to be an example of the concordance problem: something that we used to do by hand but that is now done more easily and better by computers. In contrast to the variable formulae of diplomaticParts, which still need to be controlled by hand, set phrases should by definition be easier to search for than to mark up, and searching should capture more results in the process.

Do we currently make use of typed formatting information?

In the charter with atom:id tag:www.monasterium.net,2011:/charter/IlluminierteUrkunden/1028-12-99_Bari:

There is stylistic information attached to the quote element:
<cei:quote type="italic">
Do we currently use this style information? Should it be converted to a type of highlight? Discarded?

Replaced attribute "type" in element "note" impedes using "note" outside of the transcription

The problem
In tei_cei.odd, lines 565-595, the attribute type in the element tei:core.note was replaced with a custom type attribute definition implementing restrictions on the values that may be used.

These values are clearly intended for the use of tei:core.note within the transcription of the charter text. There, note indicates that the legal document contains some text that can be classified as a note.
However, tei:core.note may occur in other contexts, such as the manuscript description (example at the bottom) or the apparatus (last example in section '12.1.2 Readings'). In these contexts, note usually marks a (meta-)note written by the authors of the TEI file or the editors of the text, not text of the legal document itself. The values enumerated for the type attribute don't fit these use cases, yet because the mode of the attribute definition is 'replace' (<attDef ident="type" mode="replace">), the type attribute cannot be used with any other custom value, as would have been possible with the original TEI definition.

This makes it impossible to accurately define the type of an editorial note outside of the context of source text annotation.

How to reproduce
E.g., try to add a note with a custom type in a witness list:

...
<listWit>
  <witness xml:id="sigil">Madrid, Archivo Histórico Nacional, Secc. Clero, Pergaminos, carp. 1234567
    <note type="msDetail" place="preface">The manuscript was thought to have been lost in a fire but gladly recovered later.</note>
  </witness>
</listWit>
...

When linked to a .xsd schema file generated from the tei_cei.odd, the document will not validate, because the value "msDetail" violates the enumeration restriction on the type attribute.

Suggestions for fixing this issue
Suggestion 1: Don't replace the type attribute of tei:core.note. Instead, add a new attribute, e.g. named diploNoteType or some other fitting name, that is restricted to the currently used values 'production', 'ownership', 'personal', 'impersonal', 'structural', 'other'.
This way, the note element can still be used to annotate notes in the source text (combined with the attribute diploNoteType) as well as to annotate editorial remarks or other notes (combined with the standard TEI type and subtype attributes).
This fix could be implemented easily without having to change much in the current tei_cei.odd.

Suggestion 2: If it is deemed important that a note annotation occurring in the transcription must always use one of the given values as its type, an additional element could be provided for annotating note text already present in the source. This element, e.g. called diploNote or some other fitting name, would then be a possible child of the body and could define its own restricted type attribute, as tei:core.note currently does.
This way, the diploNote element could be used for annotating the notes found in a legal document, while tei:core.note would be used for all other types of notes.
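
Suggestion 1 could look roughly like the following ODD fragment. This is a sketch, not a tested change: the attribute name diploNoteType is the placeholder proposed above, the desc wording is invented for illustration, and the existing replacement of the type attribute would have to be removed from tei_cei.odd at the same time.

```xml
<elementSpec ident="note" module="core" mode="change">
  <attList>
    <!-- Hypothetical attribute name, taken from Suggestion 1 -->
    <attDef ident="diploNoteType" mode="add" usage="opt">
      <desc>Classifies a note that is part of the source text of the charter.</desc>
      <datatype>
        <dataRef key="teidata.enumerated"/>
      </datatype>
      <valList type="closed">
        <valItem ident="production"/>
        <valItem ident="ownership"/>
        <valItem ident="personal"/>
        <valItem ident="impersonal"/>
        <valItem ident="structural"/>
        <valItem ident="other"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```

With a definition along these lines, the standard type attribute stays open, so the msDetail value from the witness example would no longer be rejected.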

Make legalActor personLike

Should the legalActor element belong to the class model.personLike, so that it can be used in elements where only personLike elements are allowed? This is, for example, the case in the element listPerson, used in the participant description to list persons occurring in the document and their relations to each other (see TEI guidelines). Currently, legalActor can only be used very unintuitively, leaving questions about where to put the corresp attribute:

<particDesc>
    <listPerson>
        <person corresp="#person1ID"> <!-- put @corresp here or... -->
            <legalActor type="issuer"/> <!-- here? -->
        </person>
        <person corresp="#person2ID">
            <legalActor type="recipient"/>
        </person>
         <listRelation>
             <relation type="personal" name="sibling" mutual="#person1ID #person2ID"/>
         </listRelation>
    </listPerson>
</particDesc>

If legalActor were personLike, it could be used easily and logically like this:

<particDesc>
    <listPerson>
        <legalActor type="issuer" corresp="#person1ID"/>
        <legalActor type="recipient" corresp="#person2ID"/>
         <listRelation>
             <relation type="personal" name="sibling" mutual="#person1ID #person2ID"/>
         </listRelation>
    </listPerson>
</particDesc>

The necessary change in the ODD would comprise only one line:

<elementSpec ident="legalActor" mode="add">
      <desc> Persons or organizations party to or otherwise mentioned in a an act or contract. </desc>
      <classes>
       <memberOf key="att.global"/>
       <memberOf key="att.typed"/>
       <memberOf key="model.inter"/>
       <memberOf key="model.pLike"/>
       <!-- change start -->
       <memberOf key="model.personLike"/>
       <!-- change end -->
      </classes>
       ...
</elementSpec>

Are subtypes of tokens actually subtypes of non-attached?

Currently, the typology has two categories of non-attached items:

  • "non-attached" deals with elements of authentication that have never been attached to a document, such as registration, deposit of copies in archives, or external attestation (an attestation written not on the document itself but in a separate document).
    • Certificate of authenticity
    • Duplicate
    • Registration
      • registry number
      • serial number
    • Warrant for Issue
    • Reference in other documents
    • Presence in archive / Chain of transmission
  • "token" deals with external tokens that are provided with documents to prove the authentic identity of the sender, as when you are required to attach a photocopy of your passport or driver's license to an application. They could theoretically be attached, but the actual authenticating document is separate, in the same way that the photocopy in the archive is separate.
    • Driver's license
    • Passport
    • ID card
    • cf. Digital Token

Add attributes or elements for missing seals

Per @GVogeler:
From the authentication point of view, a missing seal is a seal with an attribute "missing". From the textual transmission point of view, it is a physical feature (observable: holes; inferred: missing seal), so we could treat it with the same attribute. From the editorial point of view, people want to mark the position of the seal in relation to the text, even with a reference to a missing seal (SPD for sigillum pendens deperditum). Having something to encode this would be beneficial. There was even once a discussion on TEI-L, if I remember correctly, in which Thomas Staecker and I discussed the "L.S." abbreviation in charter copies (abbreviation for "locus sigilli").

Index

Currently, terms in the index have an @lemma attribute, which is not part of the ODD. Would it be preferable to:

  • convert the index to an ontology, and ref a URI (or)
  • use the @key attribute
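
The two options might look as follows, assuming index terms end up as TEI term elements (term is a member of att.canonical and so already carries both key and ref). The Latin word, the key value, and the example.org URI are invented for illustration.

```xml
<!-- Option 1: point at a concept in an ontology via a URI (hypothetical URI) -->
<term ref="https://example.org/index#mill">molendinum</term>

<!-- Option 2: carry the lemma as a project-internal key (hypothetical value) -->
<term key="mill">molendinum</term>
```

Neither option needs a new attribute in the ODD, which is the main difference from keeping @lemma.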

Authentication Ontology

Currently, a draft has been rendered into SKOS and validates. It needs to be finished.

  • (draft complete)
  • (SKOS writeup)
  • rework materials
  • rework formats
  • add cross-references and intersections
  • validate
  • circulate for feedback

textLang in seal?

I am wondering if the textLang element should be available within the seal element, to allow classification of the language(s) used in the seal legend, if there is one. This would be particularly useful for identifying legends in vernacular languages. Alternatively, the mainLang/otherLangs attributes could be made available on the legend element.

Class

Since the CEI version of class is unused in practice, it should be easy to extend the msContents @class list to cover all of our needs.

Example from documentation:

<msContents class="#sermons">
 <p>A collection of Lollard sermons</p>
</msContents>

In practice, what is the minimal list of classes we can define in order to make the material comparable?

Review all TEI content types

There was a lot of fiddling with content models to make everything nest properly. Review all content models in the ODD, with regard to:

  • do content models for existing elements need to be changed?
    • if so, would this be desirable to change for all users of TEI? (submit change)
  • are there constraints that need to be enforced that are not currently enforced?
  • how would a model extension handle the content types, versus the accommodations made for existing monasterium data?

Details of publicationStmt

Constructing a formal publication statement, what should the specifics be?

  • Are our TEI files CC-BY-NC, whereas the images are restricted? Is the text of the TEI files restricted, as implied by the existing use statement?
  • Given the mixed availability, should @status on availability be restricted, free, or unknown?
<publicationStmt>
  <distributor>Monasterium.net</distributor>
  <idno type="Monasterium" xml:id="monasterium">{$mom_id}</idno>
  <availability>
    <p>All texts and pictures are protected according to national copyrights and exploitation rights. Furthermore, all rights of publication and duplication of the pictorial reproductions of the documents are held by the respective archive’s proprietor. Any means of publication is therefore bound to above mentioned authorization and infringement is punishable.</p>
    <p>We would like to make all users aware that addresses, time and duration of access will be stored on our server. The place of jurisdiction for all disputes arising from this agreement is the court nearest to the respective archive.</p>
    <p>Conditions of use of printed editions and depictions apply in the same way to scientific utilization. Citation according to good scientific practice is therefore expected. (URL, author, archive)</p>
    <p>When publishing or duplicating research results (including unpublished theses and dissertations) obtained from data provided by Monasterium.Net, we would like to ask every user to pass a free sample copy to the respective holder of the originals (archive).</p>
  </availability>
  <date when="2018">2018</date>
</publicationStmt>
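
If the answer turns out to be that the encoded text is openly licensed while the images are not, one way to express the split is a single availability element that combines a licence with a prose statement. This is a sketch under that assumption only: the CC BY-NC licence, its version, and the status value have not been confirmed for Monasterium.net.

```xml
<publicationStmt>
  <distributor>Monasterium.net</distributor>
  <!-- status value is an assumption; "free" and "unknown" are the other TEI options -->
  <availability status="restricted">
    <!-- licence and version assumed for illustration -->
    <licence target="https://creativecommons.org/licenses/by-nc/4.0/">
      Encoded text: CC BY-NC 4.0
    </licence>
    <p>Images: all rights of publication and duplication are held by the respective archive.</p>
  </availability>
</publicationStmt>
```

TEI allows licence and p to be mixed inside availability, so the existing Terms of Use paragraphs could be kept alongside a machine-readable licence pointer.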

Types of annotation

The existing <rubrum> tag essentially duplicates msdescription's <additions> element, but the latter is a slightly larger concept, as it covers all types of marginalia. (See TEI documentation, 10.7.2.4.) Accordingly, I am adding att.typed to it, with a closed list of marginalia types. To start with, we could simply use 'archival' and 'marginalia', but a fuller ontology would be useful down the road. Some types of marginalia include:

  • Penmanship Exercises
    Probatio Pennae
    Scribbles
  • Instructions for Reading (marginal rubric?)
  • Ownership Marks
    Dedication
  • Cross-References / Indices
  • Records (births/deaths/family trees in bibles, etc.)
  • Corrections <- possibly overlaps with elements of textcrit
  • Notes on Textual Content
    Polemical Notes
    Reading Notes
    Highlight <- should this be a nested <hi> in the <additions> element?
    Underline
    Circling
    Manicules / Pointers
  • Commercial Notes (price, sale date, etc.)
  • Archival Annotations
    Pagination / Foliation <- possibly an overlap with the multiple ways that pages can be marked up
    Endorsement
  • Notarial Annotations
  • Postil / Apostil (VID 327)

What other types of marginalia or annotation might exist, and what is the minimal useable list of types for the list?
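
The starting point described above (att.typed on additions with the two initial values) could be written in the ODD roughly as follows. This is a sketch, not the change as committed; the desc texts are invented for illustration.

```xml
<elementSpec ident="additions" module="msdescription" mode="change">
  <classes mode="change">
    <memberOf key="att.typed" mode="add"/>
  </classes>
  <attList>
    <attDef ident="type" mode="change">
      <valList type="closed" mode="replace">
        <valItem ident="archival">
          <desc>Annotations made in the course of archival handling.</desc>
        </valItem>
        <valItem ident="marginalia">
          <desc>Any other marginal addition.</desc>
        </valItem>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```

An instance would then read <additions type="archival">, and further values from the list above could be added as valItem entries once the list is agreed.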

rnc out of date

Would it be possible to regenerate the relax_ng_compact schema, which seems to be out of date? For example, it lacks the "forged" value for attribute type on the copyStatus element. Thanks! (We are currently using your schema/ODD for the catalogue of charters we're developing at the Bodleian; if we can contribute to development, please let us know.)

Classifying references from Fontenay charters

Came across cases like the following in the Fontenay charters:

<witness n="b">
	<bibl default="false" status="draft">{Jobin, 1891 #5154}, p. 613</bibl>
</witness>
<witness n="b">
	{Jobin, 1891 #2360@p. 618-620 n° 50}
</witness>

I'm bringing them in as cei:witness for now, but, being printed sources, these seem like they would be a case for cei:listBiblEdition (which becomes listBibl type="edition") in CEI2TEI. Would this be 'ontologically' inconsistent?
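
For comparison, the same two references treated as editions would come out of CEI2TEI along these lines. The citation placeholders are kept verbatim from the source data, and whether bibl is the right wrapper for the second, unwrapped reference is an assumption.

```xml
<listBibl type="edition">
  <bibl>{Jobin, 1891 #5154}, p. 613</bibl>
  <bibl>{Jobin, 1891 #2360@p. 618-620 n° 50}</bibl>
</listBibl>
```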

specialized dispositiveStatus element?

Would it be useful for accommodating different charter traditions where charters witness legal acts, but are not themselves dispositive, to have a specialized status element?

diploDesc could include documentation of authentication methods

The description of diploDesc states that it conveys "diplomatic description and analysis of a document, including bibliographic references to studies; formal criticism of content and textual/legal form (physical form is physDesc); and discussions of transmission and authenticity." Discussion of authenticity could include descriptions of means of authentication as encoded by <authDesc> or <authen>.

Multiple archIdentifiers at same level?

Looking for examples of multiple archIdentifier elements at the same level, it appears that the cases found result from a failure to use separate <witness> elements for separate documents.

For example, from tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_1016:

<cei:chDesc>
<cei:witnessOrig n="A">
    <cei:archIdentifier>
        <cei:arch>Klosterarchiv Einsiedeln</cei:arch>
        <cei:idno>KAE, N.P.10</cei:idno>
    </cei:archIdentifier><cei:traditioForm>Original</cei:traditioForm>
    <cei:archIdentifier>
        <cei:arch>Klosterarchiv Einsiedeln</cei:arch>
        <cei:idno>KAE, N.P.11</cei:idno>
    </cei:archIdentifier>
    <cei:traditioForm>Original</cei:traditioForm>
</cei:witnessOrig>

Charters with more than one archIdentifier at the same level:

tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_1016
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_152
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_276
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_307
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_536
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_643
tag:www.monasterium.net,2011:/charter/CH-KAE/Urkunden/KAE_Urkunde_Nr_774
tag:www.monasterium.net,2011:/charter/DE-StaALohr/Urkunden/I_B_*35

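The following XQuery finds such cases:

xquery version "3.1";
declare namespace cei = "http://www.monasterium.net/NS/cei";
declare namespace atom = "http://www.w3.org/2005/Atom";

(: Select archIdentifier elements that are the second such child of their parent, capped at 1000 hits. :)
let $secondIdentifiers :=
	subsequence(collection('/db/MOMData/metadata.charter.public/')//cei:archIdentifier[2], 1, 1000)
(: Return the atom:id of each charter that contains a hit. :)
for $entry in $secondIdentifiers/ancestor::atom:entry/atom:id/text()
return $entry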
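
If the diagnosis is right, the KAE_Urkunde_Nr_1016 example could be restructured with one witness per document. The sketch below assumes that cei:witness accepts the same children as cei:witnessOrig and that cei:witListPar is an acceptable container; the sigla A1 and A2 are hypothetical.

```xml
<cei:witListPar xmlns:cei="http://www.monasterium.net/NS/cei">
  <cei:witness n="A1">
    <cei:archIdentifier>
      <cei:arch>Klosterarchiv Einsiedeln</cei:arch>
      <cei:idno>KAE, N.P.10</cei:idno>
    </cei:archIdentifier>
    <cei:traditioForm>Original</cei:traditioForm>
  </cei:witness>
  <cei:witness n="A2">
    <cei:archIdentifier>
      <cei:arch>Klosterarchiv Einsiedeln</cei:arch>
      <cei:idno>KAE, N.P.11</cei:idno>
    </cei:archIdentifier>
    <cei:traditioForm>Original</cei:traditioForm>
  </cei:witness>
</cei:witListPar>
```

Whether two originals belong in a parallel witness list or in two witnessOrig elements is a separate question that this sketch does not settle.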

Add desc to seal

Separate seal content from description by adding model.descLike to the seal element.

Write type/subtype constraints

Currently, the schema does not associate type and subtype, though the subtypes should be children of the types in the attached SKOS ontology. Writing a constraint would improve the consistency of the data, but might not be best for those using their own specialized subtypes?
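
One way to reconcile consistency with local subtypes is a Schematron rule that only warns. The fragment below is a sketch that would sit inside the relevant elementSpec. It assumes that the element is authen in the TEI namespace, that the tei prefix is bound when the Schematron is extracted, and that "passport" is a subtype of "token" as in the authentication typology; older TEI releases name the scheme "isoschematron".

```xml
<constraintSpec ident="authen-subtype-parent" scheme="schematron"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <desc>Checks that a known subtype is used under its parent type.</desc>
  <constraint>
    <sch:rule context="tei:authen[@subtype = 'passport']">
      <!-- role="warning" flags the mismatch without making the file invalid -->
      <sch:assert test="@type = 'token'" role="warning">
        The subtype "passport" is expected with type="token".
      </sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>
```

Because the rule only fires for subtypes it knows about, projects using their own specialized subtypes would not be affected.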

Provide abbreviation and glyph examples

In order to promote shared practices in the representation of abbreviations and glyphs across projects, document suggested forms for common abbreviations as well as selected special cases (special signs, notarial signs, monograms).

  • Check usage of cei:c across various projects to see how it has been used in the past
    (only 17 distinct uses of cei:c: uo, ui, Ov, e˛, ie, ae, oe, Oe, ue, Vo, Lvedel, we, e¸, ve, vo, ov, vo)
  • Provide code examples for:
    • the most common abbreviations (table from Clemens & Graham? Beginning of each subsection in Cappelli's introduction?)
    • example of printed representation of signature marks in manuscript
    • monogram
    • special notarial sign
  • discuss when it may be more appropriate to represent as a figure

One approach to representation (using <c>, rather than <g>) is:
http://www.helsinki.fi/varieng/series/volumes/14/honkapohja/
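
As a starting point for such documentation, an entry might pair an abbreviation with its expansion and reference special signs through g. The sketch uses only standard TEI P5 elements; the sample word is illustrative rather than drawn from the corpus, and the glyph id #notarial-sign-1 is hypothetical and would need a matching definition in a charDecl.

```xml
<ab>
  <!-- Abbreviation and expansion: "dns" for "dominus" -->
  <choice>
    <abbr>dns</abbr>
    <expan>d<ex>omi</ex>n<ex>u</ex>s</expan>
  </choice>
  <!-- Special sign referenced by id (hypothetical id) -->
  <g ref="#notarial-sign-1">[notarial sign]</g>
</ab>
```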

Fingerprints used to sign documents are type="biometric" or type="signed"?

As the title says: in some less-literate areas of the world, documents are authenticated using fingerprints in place of signatures. Are these treated like the various special marks that may be used to sign things, or are they treated as biometrics, a separate category that arises from the authenticating element of biometric passports and identity cards (driver's licenses, etc.)?
