jgeXml - The Just-Good-Enough XML Toolkit

jgeXml provides an event-driven parser to process XML 1.0 / 1.1. Both pull and push modes are supported. Tools are included for writing XML (documents or fragments) and to convert between XML and JSON.

The code has no dependencies on other modules or native libraries.

Setting up a push-parser is as simple as:

const jgexml = require('jgexml');
const result = jgexml.parse(xml, function(state, token) {
  //...
});

Events (stateCodes)

  • sDeclaration
  • sDocType
  • sDTD
  • sElement
  • sAttribute
  • sValue
  • sEndElement
  • sContent
  • sComment
  • sProcessingInstruction
  • sCData
  • sError
  • sEndDocument

Unlike SAX, no event is generated for ignorable whitespace. Empty elements are normalised into sElement/sEndElement pairs.

Notes

jgeXml is a non-validating parser. It attempts to report if the XML is well-formed or not.

Both when reading and writing, attributes follow after the element event, and in the order they are given in the source.

When converting to JSON, the attributePrefix (to avoid name clashes with child elements) is configurable per parse.

In JSON, child elements can be represented as properties (the default) or objects (exposing the parser's intermediary state).

By default the parser treats all content as strings when converting to JSON. Optionally, data can be coerced to primitive numbers or null values.

The xsd2json utility can convert most simple XML Schemas to JSON Schema draft 4. XSDs may, of course, also be converted to JSON simply as if they were ordinary XML documents.

Experimental JSONPath and JSONT utilities are under development.

Limitations

jgeXml is currently schema agnostic and staunchly atheist when it comes to DTDs. It can parse XML documents with schema information, but it is up to the consumer to interpret the namespace portions of element names. It can parse internal DTDs, but does nothing with them. xmlWrite minimally supports DTDs but you must build them and the DOCTYPE yourself.

The parser is string-based; to process streams, read the data into a string first. It may be memory intensive on large documents.

CLI commands

  • xml2json - convert XML to JSON.
  • json2xml - convert JSON to XML.
  • xsd2json - convert XSD to JSON Schema.

Examples

See in the examples directory: xml2xml.js for parsing XML to XML, fragment.js for writing XML fragments, jpath.js for JSONPath examples, jsont.js for JSONT examples and pullparser.js / pushparser.js for how to set up and run the parser.

jgexml's People

Contributors

gitter-badger, mikeralphson

jgexml's Issues

Some XML Schema constructs are not supported

Example:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="test">
    <xs:complexType>
        <xs:attribute name="id" fixed="123" />
    </xs:complexType>
</xs:element>

</xs:schema>

'xs:attribute' is not supported when there is no 'xs:sequence' -> 'xs:element'.
The 'fixed' attribute of 'xs:attribute' is not supported.

Enumeration returns an object instead of an array if there is only one enumeration value

Description

When parsing an XSD schema that contains an enumeration with a single value, the enum in the generated JSON Schema is an object instead of an array, which makes the schema invalid.

XSD Schema

Omitted beginning and end of XSD Schema

    <xs:complexType name="DateType1Code">
        <xs:restriction base="xs:string">
            <xs:enumeration value="UKWN"/>
        </xs:restriction>
    </xs:complexType>

Steps to Reproduce

Parse an XSD schema containing a single enumeration
Example XSD Schema

<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType name="DateType1Code">
        <xs:restriction base="xs:string">
            <xs:enumeration value="UKWN"/>
        </xs:restriction>
    </xs:complexType>
  </xs:element>
</xs:schema>

Expected behavior

[image not preserved]

Actual behavior

[image not preserved]

Workaround

Add a second enumeration whose value explicitly tells the user not to use it.
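A possible alternative to the workaround is to normalise the value before it is written to the JSON Schema. The following is a minimal sketch, not jgeXml's actual code: it assumes the intermediate object holds a single child as a plain object and repeated children as an array, and that attributes use an "@" prefix. The helpers toArray and enumValues are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: always return an array, whether the converter
// produced nothing, a single object, or an array of objects.
function toArray(value) {
  if (value === undefined || value === null) {
    return [];
  }
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: collect the enumeration values of a restriction.
// Assumes the "xs:" prefix and "@value" attribute naming shown in the issue.
function enumValues(restriction) {
  return toArray(restriction['xs:enumeration']).map(function (entry) {
    return entry['@value'];
  });
}

// A single enumeration (plain object) and a repeated one (array)
// both yield an array suitable for a JSON Schema "enum" keyword.
const single = enumValues({ 'xs:enumeration': { '@value': 'UKWN' } });
const repeated = enumValues({
  'xs:enumeration': [{ '@value': 'A' }, { '@value': 'B' }]
});
console.log(JSON.stringify(single), JSON.stringify(repeated));
```

With this kind of normalisation the single-value case and the multi-value case take the same code path, so the output shape no longer depends on how many enumerations the schema declares.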

Don't change the behavior of String.prototype.replaceAll globally

jgeXml unconditionally overwrites String.prototype.replaceAll with its own version, which does not handle a Function as its second argument according to the spec:

jgeXml/common.js

Lines 3 to 9 in 82cb8c7

Object.defineProperty(String.prototype,'replaceAll',{
    value: function(search, replacement) {
        var target = this;
        return target.split(search).join(replacement);
    },
    enumerable: false}
);

This can break other code in the same project in interesting ways. For example, running the code

console.log('Before:', 'example'.replaceAll('e', (match, offset) => offset ? 'e!' : 'E'));
require('jgexml');
console.log('After:', 'example'.replaceAll('e', (match, offset) => offset ? 'e!' : 'E'));

prints

Before: Example!
After: (match, offset) => offset ? 'e!' : 'E'xampl(match, offset) => offset ? 'e!' : 'E'

Note that jgexml is not used or called, simply required. This is surprising and was frustrating to track down.

Could you consider not modifying String.prototype.replaceAll (or any global variables or their properties) or, if you must, only polyfilling it when absent and using a spec-conformant polyfill?

Thanks for considering,
Kevin
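One way to address this request is to keep the split/join behaviour in a module-local function and leave String.prototype untouched. The sketch below is illustrative only: replaceAllLiteral is a hypothetical name, not something jgeXml exports. Unlike the native method it treats the replacement as a literal string (no "$&"-style patterns), and it rejects a function replacement explicitly rather than silently stringifying it.

```javascript
// Hypothetical module-local helper: replaces every literal occurrence of
// `search` in `target` without modifying any global object.
function replaceAllLiteral(target, search, replacement) {
  if (typeof replacement === 'function') {
    // The split/join approach cannot call a replacer function, so fail
    // loudly instead of inserting the function's source text.
    throw new TypeError('replaceAllLiteral only supports string replacements');
  }
  return String(target).split(search).join(replacement);
}

console.log(replaceAllLiteral('a&b&c', '&', '&amp;'));
```

Because the helper is an ordinary function, requiring the module has no side effects on other code in the same process.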

Non-consecutive repeated elements at same depth, converting to JSON

We handle this case badly:

<?xml version="1.0"?>
<root>
        <header>H1</header>
        <body>B1</body>
        <header>H2</header>
        <body>B2</body>
</root>

The intermediary state contains all the data, but no array has been created because the B1 element sits between the two header elements:

{"root": {"header":  "H1","body":  "B1","header":  "H2","body":  "B2"}}

JSON.parse then discards the earlier duplicates, keeping only the last value for each repeated key:

{
  "root": {
    "header": "H2",
    "body": "B2"
  }
}

We should probably aim for something like this:

{
  "root": {
    "header": [
      "H1",
      "H2"
    ],
    "body": [
      "B1",
      "B2"
    ]
  }
}
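The target shape above can be produced by grouping children by name regardless of whether the repeats are adjacent. The following is a minimal sketch under stated assumptions, not the library's implementation: it assumes the children are available as an ordered list of [name, value] pairs, and groupChildren is a hypothetical helper name.

```javascript
// Hypothetical helper: group an ordered list of [name, value] pairs so that
// any name seen more than once becomes an array, adjacent or not.
function groupChildren(pairs) {
  const result = Object.create(null); // no inherited keys to clash with
  const repeated = new Set();         // names already promoted to arrays
  for (const [name, value] of pairs) {
    if (!(name in result)) {
      result[name] = value;                  // first occurrence: plain value
    } else if (repeated.has(name)) {
      result[name].push(value);              // third and later occurrences
    } else {
      result[name] = [result[name], value];  // second occurrence: promote
      repeated.add(name);
    }
  }
  return result;
}

const root = groupChildren([
  ['header', 'H1'],
  ['body', 'B1'],
  ['header', 'H2'],
  ['body', 'B2']
]);
console.log(JSON.stringify({ root: root }));
```

Note that grouping this way preserves the order within each name but loses the interleaving between header and body, which matters for document-style XML.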

Union type crashes cli

The attached xsd crashes the cli with the error:

/home/kratib/.nvm/versions/node/v17.4.0/lib/node_modules/jgexml/xsd2json.js:304
        type = element[xsPrefix + "simpleType"][xsPrefix + "restriction"]["@base"];
                                                                         ^

TypeError: Cannot read properties of undefined (reading '@base')

The file is addressSchema.xsd with one added xs:attribute, which causes the crash:

 <xs:attribute name="lang">
  <xs:simpleType>
   <xs:union memberTypes="xs:language">
    <xs:simpleType>    
     <xs:restriction base="xs:string">
      <xs:enumeration value=""/>
     </xs:restriction>
    </xs:simpleType>
   </xs:union>
  </xs:simpleType>
 </xs:attribute>

which is taken from the MusicXML specification.

When I simplify the above by removing the xs:union clause, the cli no longer crashes:

 <xs:attribute name="lang">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:enumeration value=""/>
   </xs:restriction>
  </xs:simpleType>
 </xs:attribute>
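The stack trace points at a chained property lookup that assumes every xs:simpleType contains an xs:restriction, which is not the case for an xs:union. A minimal sketch of a more defensive lookup follows. It assumes the parsed schema is a plain object keyed by prefixed element names, as the failing line suggests. baseTypeOf is a hypothetical helper, not part of jgeXml, and returning undefined for a union is only one possible policy.

```javascript
// Hypothetical helper: read the restriction base of an element's simpleType,
// returning undefined instead of throwing when there is no xs:restriction
// (for example when the simpleType is defined by an xs:union).
function baseTypeOf(element, xsPrefix) {
  const simpleType = element[xsPrefix + 'simpleType'];
  const restriction = simpleType && simpleType[xsPrefix + 'restriction'];
  return restriction ? restriction['@base'] : undefined;
}

// Shape assumed for the union case reported in the issue.
const unionAttribute = {
  '@name': 'lang',
  'xs:simpleType': { 'xs:union': { '@memberTypes': 'xs:language' } }
};
// Shape assumed for the simplified case that does not crash.
const restrictedAttribute = {
  '@name': 'lang',
  'xs:simpleType': { 'xs:restriction': { '@base': 'xs:string' } }
};

console.log(baseTypeOf(unionAttribute, 'xs:'), baseTypeOf(restrictedAttribute, 'xs:'));
```

A caller could then fall back to a generic type, or inspect the union's member types, when the result is undefined.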
