niem-code-lists-spec's Issues

CSV Character Encoding

The following comment was received during the review of the NIEM-based Code List Specification being developed for the NATO Core Data Framework (NCDF):

For CSV files there is no mention of the character encoding. For interoperability it is useful to require a specific character encoding, so everybody interprets it in the same way. For example, state that it must be in UTF-8 or ISO-8859-1. On Windows, Excel probably can’t save UTF-8 encoded CSV directly. On Google I see workarounds using PowerShell, but this is another hurdle. OpenOffice/LibreOffice seems to have better support. For a practical solution, it needs to be determined what tools should be used and what they support.

Proposed solution: The spec should provide a method for identifying the character encoding of a CSV.
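
To illustrate why a declared encoding matters, here is a minimal Python sketch of a reader that decodes a CSV code list with an explicitly stated encoding and fails loudly on a mismatch instead of guessing. The function name read_csv_code_list and the default of UTF-8 are assumptions made for illustration; the spec does not currently define either.

```python
import codecs
import csv
import io


def read_csv_code_list(data: bytes, encoding: str = "utf-8") -> list[list[str]]:
    """Decode CSV bytes with an explicitly declared encoding and parse the rows.

    Hypothetical helper: the name and the UTF-8 default are assumptions,
    not something the code lists specification defines.
    """
    # Some tools prepend a UTF-8 byte order mark; drop it so it does not
    # end up inside the first header cell.
    if encoding.lower().replace("_", "-") == "utf-8" and data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # errors="strict" raises UnicodeDecodeError rather than silently
    # producing wrong characters when the declared encoding is incorrect.
    text = data.decode(encoding, errors="strict")
    # newline="" leaves line endings untouched, as the csv module expects.
    return list(csv.reader(io.StringIO(text, newline="")))
```

With this approach, a file saved as ISO-8859-1 but read as UTF-8 is rejected whenever it contains a non-ASCII character such as "ë", which is the kind of silent misinterpretation the comment is concerned about.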

Add section for Document Conventions

The NDR includes a section for document conventions and normative content. Recommend adding a similar section to the Code Lists spec; also, recommend standardizing these/similar sections across specifications.

Typo in Introduction

In second sentence of second paragraph of Introduction, change "including include:" to "including:".

Appendix D.5 Extension schema with schema-time code list binding methods

The following comment was received during the review of the NIEM-based Code List Specification being developed for the NATO Core Data Framework (NCDF):

Example D-5 documents two methods of binding a code list at schema time in the same example. This is a bit unclear at first, since from the example you can’t say whether both methods are needed in all cases, or whether they are indeed alternatives, as intended. I would suggest splitting it into two examples.

Proposed solution: Split Schema examples into 2 separate appendices; one defining the complex code list binding, and the other defining 2 simple code list bindings. This will entail additional narrative to clarify that they describe different sets of constraints.

Section 3.3 rules reference

Section 3.2 includes a sentence referring the reader to Section 6 for Genericode code list rules. Section 3.3 does not include such a reference. Recommend adding the following to Section 3.3: "Section 5, Comma-separated values (CSV) code lists, below, provides rules for CSV code list documents."

"Code list" vs "use of a code list" in Section 1.1

1.1. Code list example

The first paragraph and the first bullet under "In this example" call Table 1-1 a "code list".

...an example of a code list...a code list for vehicle makes and models...the code list is the table...

The following sentence (appears in the spec after the bulleted list) calls Table 1-1 a "use of a code list".

What we see above is a use of a code list, but there are things missing; in order to make code lists and their uses machine-readable and verifiable, this specification provides:

Missing punctuation section 1.5

Missing period (".") after sentence: "Figure 1-1, An XML instance with run-time code list binding, below, shows an example of run-time binding".

Incorrect values in example genericode file

I note in your one sample genericode file the incorrect expression of "short name" values. Per genericode rule R39, and repeated where short name values are discussed, such a value must not contain white-space:

http://docs.oasis-open.org/codelist/cs-genericode-1.0/doc/oasis-code-list-representation-genericode.html#_Toc190622861

It is also noted in the schema:

  <xsd:complexType name="ShortName">
    <xsd:annotation>
      <xsd:documentation>A short name without whitespace that is suitable for use in generating names for software artifacts.</xsd:documentation>
      <xsd:documentation><rule:text id="R39" category="document">Must not contain whitespace characters.</rule:text></xsd:documentation>
    </xsd:annotation>

Looking in make-model.gc I note a number of column short name values have embedded spaces.

I hope this helps.

G. Ken Holman
Chair, OASIS Code List Representation Technical Committee
http://www.oasis-open.org/committees/codelist
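
A check for this rule could be sketched along the following lines in Python. The function name short_names_with_whitespace is hypothetical, and the sketch assumes only that short names appear as the text of elements whose local name is ShortName, as in the schema excerpt above; it matches on local name so it does not depend on how namespaces are declared in the file.

```python
import xml.etree.ElementTree as ET


def short_names_with_whitespace(xml_text: str) -> list[str]:
    """Return every ShortName value that contains whitespace (violating R39).

    Hypothetical helper for illustration; it is not part of Genericode
    or of the NIEM code lists specification.
    """
    root = ET.fromstring(xml_text)
    offenders = []
    for element in root.iter():
        # Strip any "{namespace}" prefix to get the element's local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        value = element.text or ""
        if local_name == "ShortName" and any(ch.isspace() for ch in value):
            offenders.append(value)
    return offenders
```

Running it over the text of make-model.gc would list the column short names that need to be changed; an empty list means no ShortName value contains whitespace.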

Repetitive wording in Section 1.2

1.2. Machine-readable format for a code list

Paragraph 1:

This specification provides for representing code lists using CSV files and Genericode files...

Paragraph 2:

This specification provides for the use of CSV files and Genericode files as code lists.

Suggest removing the first sentence from paragraph 2 and changing the start of the next sentence to "This specification also ..."

Suggest moving structural terms definition reference in section 5

Section 5 - I suggest moving the text "The definitions of CSV structural terms are provided by [RFC4180], as noted in Section 2.10, RFC 4180: Comma-separated values (CSV) files, above." from after Rule 5-4, appending it to the section 5 opening paragraph. That way users are alerted to the source of the definitions of the structural terms before they are used.

Order of Conformance Targets in Section 3

The last sentence in section 3.1 refers to Section 6 then Section 5, which is reverse sequence. Recommend reordering the GC-CLD and CSV-CLD sections within Section 3, starting with Table 3-1.

Revise special column names in bindings

Revise binding column name of "code" to "#code", and column of "range" to "#range".

This is to distinguish these special column references from actual column names. A reference to "#code" is not just a reference to a column called "code"; Genericode and CSV rules have special instructions for how to resolve a reference to "code", and so it should be distinct from a regular column called "code". Same for "#range", which points into a table using minimum-inclusive, maximum-exclusive, etc.
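
The distinction can be sketched in Python as follows. This is only an illustration of the naming idea: the function resolve_column and the caller-supplied special table are assumptions, and the real resolution of "#code" and "#range" is whatever the Genericode and CSV rules in the spec define.

```python
SPECIAL_PREFIX = "#"


def resolve_column(reference: str, columns: list[str], special: dict[str, str]) -> str:
    """Return the actual column name that a binding reference points to.

    Hypothetical helper: `special` stands in for the spec's own resolution
    rules, which this sketch does not attempt to reproduce.
    """
    if reference.startswith(SPECIAL_PREFIX):
        # A special reference is never matched against literal column names.
        return special[reference]
    if reference not in columns:
        raise KeyError(f"no column named {reference!r}")
    # An ordinary reference must name an existing column exactly.
    return reference
```

In a table that happens to have a literal column named "code", a reference to "code" resolves to that column, while "#code" goes through the special lookup and may resolve to a different column entirely.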

"Machine readable and verifiable" in Section 1.1 might not be enough justification

1.1. Code list example

What we see above is a use of a code list, but there are things missing; in order to make code lists and their uses machine-readable and verifiable, this specification provides: ...

At this early point in the document, the spec is still justifying its need and its additional complexity. I would make the reasons a little more specific, since XML Schema enums, spreadsheets, and basic CSVs can all provide machine-readable codes that can be used in some manner for verification.

Maybe something like:

...in order to make code list tables and their uses in NIEM consistent, machine-readable, and not restricted to the limitations of XML Schema enumerations yet still bindable to instance values, this specification provides...

There may be better wording for this, but the point behind it is...XML Schema enums support basic code lists, but not code list tables. You could just throw random code list spreadsheets or CSVs into a release or IEPD, but that wouldn't be consistent. You could place random code list spreadsheets or CSVs into a designated spot in the release or IEPD, but that wouldn't be enough to tie it to instance values.

Repetitive wording in section 1.1

1.1. Code list example

List item # 1 has repetitive wording:

  1. Machine-readable formats for code lists. This specification provides for the use of Genericode, an XML representation, for code lists. It provides for the use of CSV files to represent code lists. It also provides a foundation for other representations of code lists, to be identified and implemented later.

Suggest combining Genericode and CSV into the same sentence, and maybe specifying what this spec actually provides for their use. Something like...

...This specification provides rules and guidance for the use of Genericode (an XML representation) and CSV files to represent code lists...

Add Rule 4-15 Clarifying text

The following comment was received during the review of the NIEM-based Code List Specification being developed for the NATO Core Data Framework (NCDF):

Specifically for Example D-5 for the complex binding, I wonder whether the elementName attribute should be e.g. “nc:VehicleMakeAbstract”, as that is an element of the nv:VehicleType used for the Vehicle element. And it is the substitution group used for the extension element “ext:VehicleMakeCode”. Looking at Rule 4-15, the current wording is not clear on whether the value of elementName needs to be an element that is in a substitution group, or the element that is the base of a substitution group. I recommend clearer rewording.

Proposed solution: Rule 4-15 states that the bound element is the first element in the substitution group named by elementName. Each element is in its own substitution group.
Add text after rule 4-15 clarifying that each element is in its own substitution group, and that elementName may indicate a concrete element or a different element within the substitution group of the named element.

Figure 7-2 Title

The title for Figure 7-2 should read "Definition of columns in CSV" vice "... Genericode". The preceding sentence beginning with "Figure 7-2, Definition of columns in Genericode, …" should also be corrected.

Vague text in Section 7

The following comment was received during the review of the NIEM-based Code List Specification being developed for the NATO Core Data Framework (NCDF):

Section 7. Well-known columns and references, 1st paragraph, last sentence: The phrase "These values may be used different ways in different contexts:" seems a little vague and not conducive to interoperability. Be more specific about how they can (or shall) be used.

Proposed solution: Omit this sentence, and replace the preceding "." with ":".

Big picture a little unclear in Abstract and Intro sections

Saw some potential questions coming off of the abstract and intro:

  1. Does this spec replace or coexist with traditional XML schema code lists?
  2. Are supported code list artifacts just CSVs and Genericode? Could I use JSON?
  3. Lead sentence just describes using code list artifacts. Seems to imply that you could just grab some existing Genericode or CSV code list and use it with your exchange. Think there should also be a mention of defining the artifact. Maybe something like "This document establishes rules and methods for defining and using enhanced code list artifacts with ..."

Current:

This document establishes methods for using code list artifacts with NIEM information exchange specifications. It provides for the use of Genericode documents, as well as for CSV code lists. It supports the use of code lists at run time and at schema definition time. It also includes identifiers for well-known columns that have semantics needed across the NIEM community.
