sandflow / regxmllib
Convert MXF to XML: RegXML (SMPTE ST 2001-1) tools and libraries
License: BSD 2-Clause "Simplified" License
INTRODUCTION
============

regxmllib is a collection of tools and libraries for the creation of RegXML
(SMPTE ST 2001-1) representations of MXF header metadata (SMPTE ST 377-1). A
RegXML Fragment example can be found at [1].

[1] src/test/resources/reference-files

Two implementations of regxmllib are provided:

* regxmllibj, which is implemented in pure Java; and

* regxmllibc, which is implemented in C++03.


KNOWN ISSUES AND LIMITATIONS
============================

regxmllib relies on SMPTE Metadata Registers that conform to SMPTE ST 335,
ST 395, ST 400, ST 2003. These registers are published at [1].

[1] https://smpte-ra.org/smpte-metadata-registry

regxmllib deviates from ST 2001-1:2013 in a few narrow instances. Such
deviations are noted in the source code and are expected to be submitted for
consideration at the next revision of ST 2001-1. In particular:

* no baseline metadictionary is used, instead one extension metadictionary per
  namespace is used

Bugs are tracked at [2].

[2] https://github.com/sandflow/regxmllib/issues


REGXMLLIBJ
==========

Prerequisites
-------------

Java 8 language and SDK

Maven (recommended)

Git (recommended)

SMPTE Metadata Registers (Types, Elements, Groups and Labels)


Quick Start
-----------

The following outputs to path PATH_TO_FRAGMENT an XML representation of the
header metadata of the MXF file at path PATH_TO_MXF_FILE:

* build the 'jar' target using the Maven 'package' goal

* choose one of the following:

  * OPTION 1

    * retrieve the four SMPTE Metadata Registers (see [1] above)

    * build the metadictionaries from the SMPTE registers

        java -cp <PATH_TO_JAR> com.sandflow.smpte.tools.XMLRegistersToDict -e <PATH_TO_ELEMENTS_REG> -l <PATH_TO_LABELS_REG> -g <PATH_TO_GROUPS_REG> -t <PATH_TO_TYPES_REG> PATH_TO_DICT_DIR

  * OPTION 2

    * retrieve metadictionaries from [3]

      [3] https://github.com/sandflow/IMF/tree/master/dict

* generate the RegXML fragment by running

    java -cp <PATH_TO_JAR> com.sandflow.smpte.tools.RegXMLDump -all -d <PATH_TO_DICT1> <PATH_TO_DICT2> ... -i <PATH_TO_MXF_FILE> > <PATH_TO_FRAGMENT>

* (optional) generate XSDs for RegXML Fragments by running

    java -cp <PATH_TO_JAR> com.sandflow.smpte.tools.GenerateDictionaryXMLSchema -d <PATH_TO_DICT1> <PATH_TO_DICT2> ... -o <PATH_TO_OUTPUT_DIR>

* (optional) generate XSDs for SMPTE registers by running

    java -cp <PATH_TO_JAR> com.sandflow.smpte.tools.GenerateXMLSchemaDocuments -cp <CLASS_PATH_TO_REGISTER_MODEL> -d <PATH_TO_OUTPUT_DIR>

  where <CLASS_PATH_TO_REGISTER_MODEL> is equal to
  com.sandflow.smpte.register.catsup.TypesRegisterModel, etc.


Architecture
------------

At the heart of regxmllib is the FragmentBuilder.fragmentFromTriplet() method
that creates an XML fragment from a KLV group given a RegXML metadictionary and
a collection of KLV groups from which strong references are resolved. The rules
engine implemented in FragmentBuilder.fragmentFromTriplet() is intended to
follow the rules specified in ST 2001-1 as closely as possible. A sister method,
XMLSchemaBuilder.fromDictionary(), creates a matching XML Schema that can be
used to validate RegXML fragments.

Metadictionaries can be imported and exported from XML documents that conform
to the schema specified in SMPTE ST 2001-1. They can also be created from SMPTE
Metadata Registers published in XML form.

regxmllib includes a minimal MXF and KLV parser library.


Packages
--------

com.sandflow.smpte.klv: Classes for processing SMPTE KLV triplets (ST 336)

com.sandflow.smpte.mxf: Classes for processing SMPTE MXF structures (ST 377-1)

com.sandflow.smpte.register: Classes for processing SMPTE metadata registers
(ST 335, ST 395, ST 400, ST 2003)

com.sandflow.smpte.regxml: Classes for managing RegXML metadictionaries and
creating RegXML representations of MXF structures

com.sandflow.smpte.tools: Application-level tools

com.sandflow.smpte.util: Utility classes


Tools
-----

RegXMLDump: dumps either the first essence descriptor or the entire header
metadata of an MXF file as a RegXML structure

XMLRegistersToDict: converts XML-based SMPTE metadata registers to RegXML
metadictionaries

GenerateXMLSchemaDocuments: generates XSDs for the SMPTE metadata registers

GenerateDictionaryXMLSchema: generates XSDs for RegXML Fragments from the
RegXML metadictionaries


Unit Test
---------

Unit testing is performed by generating RegXML fragments from sample files
located at [1] and registers located at [2]. The resulting RegXML fragments are
compared to reference RegXML fragments located at [3].

[1] src/test/mxf-files
[2] src/test/registers
[3] src/test/regxml-files

Reference RegXML fragments can be regenerated by running the package goal with
the build-reference-test-files profile:

    mvn package -Pbuild-reference-files


Maven Artifacts
---------------

* GroupId     com.sandflow
* ArtifactId  regxmllib

Snapshots are deployed at https://oss.sonatype.org/content/repositories/snapshots/

Releases are deployed at the central repository


REGXMLLIBC
==========

Prerequisites
-------------

C++03 toolchain

Metadictionaries generated by regxmllibj (see _Building Metadictionaries_ above)

Xerces-C++ Version 3.1.4 (or above) [1]

(recommended) CMake

[1] https://xerces.apache.org/xerces-c/


Architecture
------------

regxmllibc generally follows the architecture and idioms of regxmllibj.
Applications will typically call FragmentBuilder::fromTriplet() or
MXFFragmentBuilder::fromInputStream(), and the unit test at [1] provides an
example.

[1] src/test/cpp/com/sandflow/smpte/dict/MetaDictionaryTest.cpp

regxmllibc does not however support the conversion of the SMPTE Metadata
Registers to RegXML metadictionaries, and instead relies on the metadictionaries
generated by regxmllibj (see _Building Metadictionaries_ above).


Unit Test
---------

As with regxmllibj, unit testing is performed by generating RegXML fragments
from sample files located at [1] and reference metadictionaries located at [2].
The resulting RegXML fragments are compared to reference RegXML fragments
located at [3].

[1] src/test/mxf-files
[2] src/test/regxml-dicts
[3] src/test/regxml-files


DIRECTORIES AND NOTABLE FILES
=============================

build.xml                               Helper script (Ant)

pom.xml                                 Maven POM file

CMakeLists.txt                          CMake build file

target                                  Output of the Maven build process,
                                        including the JAR

src/java                                regxmllibj codebase

src/cpp                                 regxmllibc codebase

src/main/config/repoversion.properties  Template Java properties file used to
                                        host a unique source code version
                                        generated using git by the build system

src/main/resources/reg.xsd              Common XML Schema definitions used when
                                        generating XML Schemas for RegXML
                                        Fragments

src/test/resources/regxml-files         Reference RegXML fragments used for unit
                                        testing

src/test/resources/registers            Reference SMPTE registers used for unit
                                        testing

src/test/resources/mxf-files            Sample MXF files used for unit testing

src/test/resources/regxml-dicts         Reference metadictionaries used for unit
                                        testing

When processing smpte_registers-bbc_rd_db_exports-201412092156 I got quite a few warnings of the form:
Jan 08, 2015 3:26:44 PM com.sandflow.smpte.regxml.dict.importer.XMLRegistryImporter fromRegister
WARNING: Missing Target Set UL at Facet null for Type urn:smpte:ul:060e2b34.01040101.05010c00.00000000
What does this mean? Are these warnings genuine? Is there a problem with the Register entries?
I used smpte_registers-bbc_rd_db_exports-201412092156 and generated www-smpte-ra-org-reg-2003-2012.xml
In this I see:
<TypeDefinitionVariableArray>
<Identification>urn:smpte:ul:060e2b34.01040101.04020700.00000000</Identification>
<Symbol>dupIndexEntryArray1</Symbol>
<Name>IndexEntryArray</Name>
<ElementType>urn:smpte:ul:060e2b34.01040101.04100600.00000000</ElementType>
</TypeDefinitionVariableArray>
Why has the symbol changed from what is in Groups.xml (i.e. "IndexEntryArray")?
Include command line option to remove namespaces from Essence Descriptor
Boolean is defined as an "enumeration" {"1 -> true", "0 -> false"} in the register. ST 377 allows any non-zero value to mean "true".
applyRule5_2 in FragmentBuilder should accept any non-zero value to mean "true" for the special type "boolean".
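The behaviour requested above can be sketched as follows. This is only an illustration, not the actual applyRule5_2 code: the class and method names are hypothetical, and the "True"/"False" output strings are an assumption about the text used in the RegXML fragment.

```java
final class BooleanRuleSketch {

    /**
     * Maps a one-byte MXF boolean to text: zero means false and,
     * as ST 377 allows, any non-zero value means true.
     */
    static String booleanText(byte[] value) {
        if (value == null || value.length != 1) {
            throw new IllegalArgumentException("Boolean value must be exactly one byte");
        }
        // Assumed output strings; the register lists only 1 and 0.
        return value[0] != 0 ? "True" : "False";
    }

    public static void main(String[] args) {
        System.out.println(booleanText(new byte[] {(byte) 0xFF}));
        System.out.println(booleanText(new byte[] {0}));
    }
}
```

With this approach a value such as 0xFF is no longer rejected as an unknown enumeration element.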
This would make it more human readable.
Specifically, should all AAF-defined class 13 symbols use the AAF namespace?
I assume (from looking at various MXF files I've tried) that's the cause of this output:
No Primer Pack found
Exception in thread "main" java.lang.NullPointerException
at com.sandflow.smpte.klv.LocalSet.fromTriplet(LocalSet.java:113)
at com.sandflow.smpte.tools.RegXMLDump.main(RegXMLDump.java:176)
This situation is permitted by ST 377-1:2011 Section 9.1. Note that this MXF file uses a "legacy" KLV Fill key (when matching the KLV Fill key, byte 8 of the Key needs to be ignored).
I think that KLV Fill bytes not being correctly counted is probably also causing the following warning on a different MXF file:
WARNING: Index Table Segment encountered before Header Byte Count bytes read.
ST 377-1:2011 Section 9.1 notes that the KLV Fill at the end of the Header Metadata is included in the value of HeaderByteCount
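One way to recognise both the current and the "legacy" KLV Fill keys is to compare every byte of the key except byte 8. The sketch below is a minimal illustration with hypothetical names; the key value used is an assumption based on the commonly cited KLV Fill item key and should be checked against the standard.

```java
final class FillKeySketch {

    // Assumed KLV Fill item key; byte 8 (index 7) is the version byte.
    static final byte[] FILL_KEY = {
        0x06, 0x0e, 0x2b, 0x34, 0x01, 0x01, 0x01, 0x02,
        0x03, 0x01, 0x02, 0x10, 0x01, 0x00, 0x00, 0x00
    };

    /** Returns true if the 16-byte key matches the Fill key, ignoring byte 8. */
    static boolean isFill(byte[] key) {
        if (key == null || key.length != FILL_KEY.length) {
            return false;
        }
        for (int i = 0; i < FILL_KEY.length; i++) {
            if (i == 7) {
                continue; // byte 8 of the Key is ignored
            }
            if (key[i] != FILL_KEY[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] legacy = FILL_KEY.clone();
        legacy[7] = 0x01; // "legacy" variant differs only in byte 8
        System.out.println(isFill(legacy));
    }
}
```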
Properties whose definitions are not found should be included as XML comments
@reg:uid should be generated based on whether the class contains an element for which isUniqueIdentifier() is true
Arrays with a non-zero type size should have a dedicated type kind of FixedArray, since they are encoded without element count and size in MXF. This also maps directly to FixedArray definitions in RegXML.
This appears to be an artifact of JAXBContext.generateSchema()
The current XSL transform would be complex to adapt to multiple metadictionaries.
Add 2 sample XML files + RegXML generated from the two sample files so that output of future revisions can be tested against the previous output.
Create 'test' target.
SMPTE 30MR has deprecated the spreadsheet versions of the registries.
In the Registers, ProductReleaseType is incorrectly modelled as an Enumeration of UInt8 rather than UInt16. ProductReleaseType is used by one of the Record members of ProductVersionType. ProductVersionType is used by Identification.ToolkitVersion and Identification.ApplicationVersion (among other Elements, although most of those are (currently) erroneous uses of this Type...).
I ran regxmllib dump tool build 34efd99 on an MXF file with the final two bytes of Identification.ToolkitVersion set as 0002h. The result was:
<r2:BuildType>VersionUnknown</r2:BuildType>
If ProductReleaseType was correctly modelled then the answer would be:
<r2:BuildType>VersionDebug</r2:BuildType>
(Note: a further complication is that where the Registers currently say "BuildType" they should actually say "Release"...)
It seems that regxmllib finds 10 bytes for fields such as Identification.ToolkitVersion but the meta-dictionary tells it only 9 are needed, and so the final byte is silently discarded. My suggestion would be that an error is raised, as this is a serious fault (in this case with the Registers, but it could be a problem with the MXF file). In this case, the result of the data length mismatch is that incorrect data is output (i.e. VersionUnknown instead of VersionDebug).
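The suggested check could look roughly like the sketch below, which compares the number of bytes the type definition expects with the number of bytes actually present and fails loudly on a mismatch. The class and method names are hypothetical, and the member sizes in the example (four 2-byte members plus a 1-byte ProductReleaseType) are an assumption chosen to reproduce the 9-versus-10-byte case described above.

```java
final class RecordLengthCheckSketch {

    /**
     * Throws if the value length differs from the sum of the member sizes
     * declared by the type definition, instead of discarding extra bytes.
     */
    static void requireExactLength(String typeSymbol, int[] memberSizes, int valueLength) {
        int expected = 0;
        for (int size : memberSizes) {
            expected += size;
        }
        if (expected != valueLength) {
            throw new IllegalStateException(
                typeSymbol + ": definition expects " + expected
                + " bytes but value has " + valueLength);
        }
    }

    public static void main(String[] args) {
        // Assumed layout: ProductReleaseType modelled as UInt8 gives 9 bytes.
        int[] asModelled = {2, 2, 2, 2, 1};
        try {
            requireExactLength("ProductVersionType", asModelled, 10);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether a mismatch should be a hard error or a warning is a design choice; the point is only that it should not pass silently.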
The generated dictionaries cannot have “rootElement” and “rootObject” attributes.
It would be handy to have some "beginner's instructions" in the README to address the following:
Both UTF-16BE and ISO-7 (e.g. RFC5646LanguageCode) strings are present in MXF files.
Add switch to applyRule5_12() based on elementType of String Definition.
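A minimal sketch of such a switch is shown below. It is not the actual applyRule5_12() code: the enum and method names are hypothetical, ISO-7 is approximated here with US-ASCII, and truncating at the first null character is an assumption about how strings are terminated.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StringRuleSketch {

    /** Hypothetical stand-in for the element type of a String Definition. */
    enum CharKind { UTF16_CHAR, ISO7_CHAR }

    static String decode(byte[] value, CharKind kind) {
        Charset charset;
        switch (kind) {
            case UTF16_CHAR:
                charset = StandardCharsets.UTF_16BE;
                break;
            case ISO7_CHAR:
                charset = StandardCharsets.US_ASCII; // assumed equivalent of ISO-7
                break;
            default:
                throw new IllegalArgumentException("Unknown character kind: " + kind);
        }
        String text = new String(value, charset);
        int nul = text.indexOf('\0');
        return nul >= 0 ? text.substring(0, nul) : text; // assumed null termination
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x00, 0x65, 0x00, 0x6e}, CharKind.UTF16_CHAR));
        System.out.println(decode(new byte[] {0x65, 0x6e}, CharKind.ISO7_CHAR));
    }
}
```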
Metadictionary should use
http://www.smpte-ra.org/schemas/2001-1b/2013/metadict
for the Extension element defined in ST 2001-1 instead of
http://sandflow.com/ns/SMPTEST2001-1/baseline
Change "Array" type kind to "VariableArray"
Remove isSizeImplicit from Set type
regxmllib currently requires Java 8
Also make Record.UL mandatory in Groups
Note that this has been added to the Types Register but Reg-XML itself has not been amended (yet).
To help with tracking and debugging, add creation date and toolkit build version to XML Registers and MetaDictionaries as XML comments.
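As an illustration of the idea, the sketch below uses the standard DOM API to place an XML comment before the root element. The class name, method name, root element name and comment wording are all hypothetical; the build identifier is simply the one quoted elsewhere in these notes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

final class BuildCommentSketch {

    /** Creates a document whose first node is a comment with date and build. */
    static Document withBuildComment(String rootName, String created, String build) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
            doc.appendChild(doc.createElement(rootName));
            String text = " Created: " + created + "; toolkit build: " + build + " ";
            // The comment is inserted ahead of the root element.
            doc.insertBefore(doc.createComment(text), doc.getDocumentElement());
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = withBuildComment("Example", "2015-01-08", "34efd99");
        System.out.println(doc.getFirstChild().getNodeValue());
    }
}
```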
RegXMLDump ignores K and L length when computing header bytes.
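To make the accounting concrete, the sketch below computes the total number of bytes a KLV triplet occupies, counting the 16-byte key and the BER-encoded length field as well as the value. It is an illustration only, with hypothetical names, and it assumes a 16-byte key followed by a BER length in either short or long form.

```java
final class TripletSizeSketch {

    static final int KEY_SIZE = 16;

    /** Returns K + L + V size in bytes for the triplet starting at offset. */
    static long tripletSize(byte[] buf, int offset) {
        int lengthPos = offset + KEY_SIZE;
        int first = buf[lengthPos] & 0xFF;
        if (first < 0x80) {
            // Short form: the single byte is the value length.
            return KEY_SIZE + 1 + first;
        }
        int n = first & 0x7F; // number of length bytes that follow
        if (n == 0 || n > 8) {
            throw new IllegalArgumentException("Unsupported BER length form");
        }
        long valueLength = 0;
        for (int i = 1; i <= n; i++) {
            valueLength = (valueLength << 8) | (buf[lengthPos + i] & 0xFF);
        }
        return KEY_SIZE + 1 + n + valueLength;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[20];
        buf[16] = (byte) 0x83; // long form, three length bytes
        buf[18] = 0x01;        // value length 0x000100 = 256
        System.out.println(tripletSize(buf, 0));
    }
}
```

Summing this quantity over every triplet read, including KLV Fill items, gives a running count that can be compared with HeaderByteCount.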
E.g.:
Entry namespace: "http://www.smpte-ra.org/reg/335/2012"
XML schema namespace: "http://www.smpte-ra.org/schemas/335/2012"
FragmentBuilder (applyRule5_3) currently encodes labels as UL. This could be improved by using the Label symbol if available.
Generate XML schemas for XML Registers and Metadictionaries using JAXB schemagen at build time. Store the resulting schemas under /schemas.
Is this a concern?
regxmllib\regxmllib\src\main\java\com\sandflow\smpte\klv\adapters\ULValueAdapter.java:40: warning: [unchecked] fromValue(byte[]) in ULValueAdapter overrides <W>fromValue(byte[]) in TripletValueAdapter
public static UL fromValue(byte[] value) {
return type requires unchecked conversion from UL to W
where W is a type-variable:
W extends Object declared in method <W>fromValue(byte[])
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
1 warning
NetBeans info:
Product Version: NetBeans IDE 8.0.2 (Build 201411181905)
Updates: Updates available to version NetBeans 8.0.2 Patch 2
Java: 1.8.0_25; Java HotSpot(TM) 64-Bit Server VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b18
System: Windows 7 version 6.1 running on amd64; Cp1252; en_GB (nb)
Issues are tracked at [1]
[1] https://github.com/palemieux/regxml/issues
is incorrect. It should be:
https://github.com/sandflow/regxmllib/issues
The namespace should be something like http://www.smpte-ra.org/schemas/2001-1c/2013. We may also want a suffix on this, although "metadict" would not be appropriate. Not sure if it should be 2013 or not...
WeakReference can be either a UL or a UUID in an MXF file
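A possible way to tell the two apart is sketched below. It rests on an assumption that should be verified against ST 377-1: that a 16-byte identifier whose first byte has its most significant bit clear is a UL (ULs begin with 06h), and that otherwise it is a UUID. The class and method names are hypothetical.

```java
final class AuidKindSketch {

    /**
     * Returns true if the 16-byte identifier looks like a UL, false if it
     * looks like a UUID (assumed rule: MSB of the first byte is clear for ULs).
     */
    static boolean isUL(byte[] id) {
        if (id == null || id.length != 16) {
            throw new IllegalArgumentException("Identifier must be 16 bytes");
        }
        return (id[0] & 0x80) == 0;
    }

    public static void main(String[] args) {
        byte[] ul = new byte[16];
        ul[0] = 0x06; // ULs start with 06.0e.2b.34
        byte[] other = new byte[16];
        other[0] = (byte) 0x8f;
        System.out.println(isUL(ul) + " " + isUL(other));
    }
}
```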
There are issues with the 4-1 source, and the children are not needed in MXF.