StAEDI - Streaming API for EDI: Java library featuring a reader/parser, writer/generator, and validation

License: Apache License 2.0

Java 100.00%

edi edi-reading x12 edifact parser generator edi-api edi-stream edi-messages tradacoms

staedi's Introduction

StAEDI - Streaming API for EDI

StAEDI is a streaming API for EDI reading, writing, and validation in Java. [Support | Wiki]

The API follows the same conventions as StAX (XML API available in the standard JDK) using a "pull" processing flow for EDI parsing and an emit flow for EDI generation.

Features

Support for X12, EDIFACT, and TRADACOMS standards
Read structures from an EDI stream in sequence (start loop, start segment, element data, end segment, etc.)
EDI Schema that allows for user-specified validation rules
Validation of EDI standards (segment occurrences, element type, element length constraints, etc.)
Validation of industry implementations, for example HIPAA
Read and write EDI data using standard Java XML interfaces (StAX)
Read EDI data using standard Java JSON interfaces (Jakarta JSON Processing, aka JSR-353/JSR-374)
Support for X12 ISX segment (release character, element 01 only), introduced in version 007040

Maven Coordinates

<dependency>
  <groupId>io.xlate</groupId>
  <artifactId>staedi</artifactId>
  <version>CURRENT VERSION</version>
</dependency>

Support

Support is available to assist with incorporating StAEDI into your business's application. Available services include

Development of EDI validation schemas using your documentation (e.g. PDF)
Integrating StAEDI into your Java application
Troubleshooting issues with your existing integration (not including StAEDI bugs - please open an issue)

Please email contact at xlate dot io for more information.

Have a Question?

If you have a question about StAEDI that may not require the opening of an issue, please head to the StAEDI Gitter channel at https://gitter.im/xlate/staedi to discuss.

Reading EDI

Input data is provided using a series of events via the EDIStreamReader class. In addition to events such as the start of a segment or element, the looping/nested structure of the EDI stream is represented using derived events.

+ Start Interchange
| +-- Start Segment (ISA / UNB / STX)
| |     Element Data (repeats)
| +-- End Segment (ISA / UNB / STX)
| |
| +-- Start Functional Group (Optional for EDIFACT and TRADACOMS)
| |   +-- Start Segment (GS / UNG / BAT)
| |   |     Element Data (repeats)
| |   +-- End Segment (GS / UNG / BAT)
| |
| |   +-- Start Transaction/Message
| |   |  +-- Start Segment (ST / UNH / MHD)
| |   |  |     Element Data (repeats)
| |   |  +-- End Segment (ST / UNH / MHD)
| |   |
| |   |  // Segments / Loops specific to the transaction
| |   |
| |   |  +-- Start Segment (SE / UNT / MTR)
| |   |  |     Element Data (repeats)
| |   |  +-- End Segment (SE / UNT / MTR)
| |   +-- End Transaction/Message
| |
| |   +-- Start Segment (GE / UNE / EOB)
| |   |     Element Data (repeats)
| |   +-- End Segment (GE / UNE / EOB)
| +-- End Functional Group
| |
| +-- Start Transaction/Message (EDIFACT and TRADACOMS only, if functional group(s) not used)
| |   // Same content as messages within group
| +-- End Transaction/Message
| |
| +-- Start Segment (IEA / UNZ / END)
| |     Element Data (repeats)
| +-- End Segment (IEA / UNZ / END)
+ End Interchange

EDIInputFactory factory = EDIInputFactory.newFactory();

// Obtain an InputStream of the EDI document to read.
InputStream stream = new FileInputStream(...);

EDIStreamReader reader = factory.createEDIStreamReader(stream);

while (reader.hasNext()) {
  switch (reader.next()) {
  case START_INTERCHANGE:
    /* Retrieve the standard - "X12", "EDIFACT", or "TRADACOMS" */
    String standard = reader.getStandard();

    /*
     * Retrieve the version string array. An array is used to support
     * the componentized version element used in the EDIFACT standard.
     *
     * e.g. [ "00501" ] (X12) or [ "UNOA", "3" ] (EDIFACT)
     */
    String[] version = reader.getVersion();
    break;

  case START_SEGMENT:
    // Retrieve the segment name - e.g. "ISA" (X12), "UNB" (EDIFACT), or "STX" (TRADACOMS)
    String segmentName = reader.getText();
    break;

  case END_SEGMENT:
    break;

  case START_COMPOSITE:
    break;

  case END_COMPOSITE:
    break;

  case ELEMENT_DATA:
    // Retrieve the value of the current element
    String data = reader.getText();
    break;
  }
}

reader.close();
stream.close();

Disable Validation of Control Codes

Out of the box, instances of EDIStreamReader will validate the control structures of X12 and EDIFACT messages (interchange, group, and transaction headers and trailers). This validation includes checking that some fields contain one of an enumerated list of values (e.g. a known transaction set code in X12 segment ST, element 1).

If you wish to disable the validation of the code values but keep the validation of the structure, including field sizes and types, set the EDIInputFactory.EDI_VALIDATE_CONTROL_CODE_VALUES property to false on an instance of EDIInputFactory prior to creating a new EDIStreamReader, as shown below.

// Create an EDIInputFactory
EDIInputFactory factory = EDIInputFactory.newFactory();
factory.setProperty(EDIInputFactory.EDI_VALIDATE_CONTROL_CODE_VALUES, false);

// Obtain an InputStream of the EDI document to read.
InputStream stream = new FileInputStream(...);

// Create an EDIStreamReader from the stream using the factory
EDIStreamReader reader = factory.createEDIStreamReader(stream);

// Continue processing with the reader...

Sample Writing X12 EDI

The example below shows how X12 data could be written. The TRADACOMS and EDIFACT standards are also supported, using the segments specific to those standards.

EDIOutputFactory factory = EDIOutputFactory.newFactory();

// Obtain an OutputStream to write the EDI document to.
OutputStream stream = new FileOutputStream(...);

EDIStreamWriter writer = factory.createEDIStreamWriter(stream);
int groupCount = 0;

// Set a schema for the control structures being written (interchange, group, and transaction envelope segments)
SchemaFactory schemaFactory = SchemaFactory.newFactory();
/*
 * A control schema can be created with the factory by providing the standard
 * and version array. The version is an array to support multi-field versions
 * such as the composite element UNB01 for EDIFACT.
 */
Schema controlSchema = schemaFactory.getControlSchema(EDIStreamConstants.Standards.X12, new String[] { "00501" });
writer.setControlSchema(controlSchema);

writer.startInterchange();

// Write interchange header segment
writer.writeStartSegment("ISA")
      .writeElement("00")
      .writeElement("          ")
      .writeElement("00")
      .writeElement("          ")
      .writeElement("ZZ")
      .writeElement("ReceiverID     ")
      .writeElement("ZZ")
      .writeElement("Sender         ")
      .writeElement("203001")
      .writeElement("1430")
      .writeElement("^")
      .writeElement("00501")
      .writeElement("000000001")
      .writeElement("0")
      .writeElement("P")
      .writeElement(":")
      .writeEndSegment();

// Write functional group header segment
groupCount++;
int txCount = 0;
writer.writeStartSegment("GS");
writer.writeElement("FA");

// Continue writing remainder of group header and transactions, increment `txCount` for each transaction

writer.writeStartSegment("GE")
      /* Count of transactions here must match the actual count of ST/SE pairs */
      .writeElement(String.valueOf(txCount))
      /* Control number here must match the value in the group header */
      .writeElement("1")
      .writeEndSegment();

writer.writeStartSegment("IEA")
      /* Count of groups here must match the actual count of GS/GE pairs */
      .writeElement(String.valueOf(groupCount))
      /* Control number here must match the value in the interchange header */
      .writeElement("000000001")
      .writeEndSegment();

writer.endInterchange();

writer.close();
stream.close();

staedi's Issues

Validation of edifact fails when an empty segment starts a loop in an implementation schema

If the empty segment is removed (from schemas and edifact file), the validation works perfectly. Same if the empty segment is moved to second or following positions in the loop.

A failed test is attached.

empty-segment-loop-validation.zip

Schema files for standard transactions

Are there schema files for documents like the 850?

TOO_MANY_COMPONENTS BUG

Hi,

I have had this issue for over a week now and I thought to myself, let me consult you guys about it. I have a transaction schema that validates the codeco 95b. Here is a zip file containing the edi file and the transaction file.
edi.zip

What happens is that on the TDT segment I get a TOO_MANY_COMPONENTS error starting from element 8178. What I suspect is happening is that the error itself acts as a component, and for that reason the componentIndex cannot be "componentPosition - 1" but needs to be "componentPosition - 2".

I suspect that for various reasons. In the code you sent me earlier for a question I had about errors, I also needed to make the same fix because I received IndexOutOfBounds exceptions. See the code below:
if (segment == null && transactionSchema != null) {
    segment = (EDIComplexType) transactionSchema.getType(location.getSegmentTag());
}
if (segment != null) {
    // Obtain an EDIReference for the current element from the segment's list of
    // elements
    EDIReference reference = segment.getReferences().get(location.getElementPosition() - 1);
    ediWriteElement.setReference(reference);

    // The element will be either an EDISimpleType (i.e. a simple element) or an
    // EDIComplexType (if it is a composite element)
    EDIType element = reference.getReferencedType();
    if (element instanceof EDIComplexType) {
        List<EDIReference> references = ((EDIComplexType) element).getReferences();
        try {
            // Changed from "- 1" to "- 2" to avoid the IndexOutOfBoundsException
            EDIReference componentRef = references.get(location.getComponentPosition() - 2);
            EDISimpleType component = (EDISimpleType) componentRef.getReferencedType();
            ediWriteElement.setSimpleType(component);
            System.out.println("Not Normal Type");
        } catch (IndexOutOfBoundsException e) {
            e.printStackTrace();
        }
    } else {
        ediWriteElement.setSimpleType((EDISimpleType) element);
        System.out.println("Normal Type");
    }
} else {
    ediWriteElement.setRemedy("No element type defined for location [" + location + "]");
}

Another platform where I also test my files reports no errors on the TDT segment.

reference:

I hope I have explained this issue well. Let me know if there is any more information I can provide.

Option to disable validation of code values in a control schema

Provide an option to disable validation of the value enumerations defined in a control schema. The option/property will be a boolean and can be set on an EDIInputFactory instance using the constant EDIInputFactory.EDI_VALIDATE_CONTROL_CODE_VALUES. The default value should be true.

This option/property will allow a user to utilize control schema validation for structural and data type constraints, but still provide support for custom values that may not be defined in the list of enumerated values for an element in the control schema.

Provide a way to extract EDI Schema from JAXBContext

Validation counters not cleared after first transaction

Validator associated with transaction Schema must be reset before each transaction "loop".

An element/composite with minOccurs=0 at impl. schema doesn't override minOccurs=1 from trans. schema

Regarding question/issue #78, I'm trying to use implementation schemas to achieve the goal, but I have a problem "relaxing" the minOccurs attribute of some elements/composites.

If I make all components of segment TVL required (minOccurs="1") and then relax some of them (composites or elements) in the implementation schema (minOccurs="0"), the implementation doesn't override the transaction as expected.
But if I do the opposite (relaxed in the transaction schema, tightened in the implementation schema), it works as expected.

A failed test is attached. There is a FIXME comment in the implementation schema about the issue.

validation-multi-level-test.zip

Create Wiki pages for basic read/write functionality

Use schema validation from EDIStreamWriter

Enhance the EDIOutputFactory and EDIStreamWriter to use a Schema (or separate ones for control/message). Throw EDIStreamException when invalid data is written.

EDIFACT Binary Messages

Add support for EDIFACT binary data using the UNO and UNP segments.

Please comment on this issue if this feature is important to you.

Support syntax relationships between loops and/or segments

The schema syntax element can only be used currently for segments and composites to indicate relationships between elements. This enhancement will allow for syntax restrictions to be specified at the transaction and loop levels to place constraints on the loops and segments contained therein.

URI is not absolute Exception

Exception in thread "main" java.lang.IllegalArgumentException: URI is not absolute
at java.net.URI.toURL(Unknown Source)
at io.xlate.edi.internal.schema.SchemaUtils.getXmlSchema(SchemaUtils.java:117)
at io.xlate.edi.internal.schema.SchemaUtils.getControlSchema(SchemaUtils.java:94)
at io.xlate.edi.internal.stream.StaEDIStreamReader.nextEvent(StaEDIStreamReader.java:130)
at io.xlate.edi.internal.stream.StaEDIStreamReader.next(StaEDIStreamReader.java:179)

Validate implementation elements/composites when segment selected

Validate implementation constraints on the elements and composites not selected until implementation segment is selected

EDIFACT release element throws exception from writer with UNA

EDIFACT release element throws exception from writer when using the UNA segment.

java.lang.StringIndexOutOfBoundsException: String index out of range: 20
	at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:237)
	at java.lang.StringBuilder.charAt(StringBuilder.java:76)
	at io.xlate.edi.internal.stream.tokenization.EDIFACTDialect.parseVersion(EDIFACTDialect.java:97)
	at io.xlate.edi.internal.stream.tokenization.EDIFACTDialect.initialize(EDIFACTDialect.java:66)
	at io.xlate.edi.internal.stream.tokenization.EDIFACTDialect.processServiceStringAdvice(EDIFACTDialect.java:224)
	at io.xlate.edi.internal.stream.tokenization.EDIFACTDialect.appendHeader(EDIFACTDialect.java:172)
	at io.xlate.edi.internal.stream.StaEDIStreamWriter.write(StaEDIStreamWriter.java:314)
...

Delimiters outside ASCII range fail to parse

Describe the bug
Attempting to parse the included EDI, I'm getting the following Exception:

java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 8230

at io.xlate.edi.internal.stream.tokenization.CharacterSet.setClass(CharacterSet.java:193)
at io.xlate.edi.internal.stream.tokenization.X12Dialect.initialize(X12Dialect.java:95)
at io.xlate.edi.internal.stream.tokenization.X12Dialect.appendHeader(X12Dialect.java:167)
at io.xlate.edi.internal.stream.tokenization.Lexer.handleStateHeaderData(Lexer.java:370)
at io.xlate.edi.internal.stream.tokenization.Lexer.parse(Lexer.java:229)
at io.xlate.edi.internal.stream.StaEDIStreamReader.nextEvent(StaEDIStreamReader.java:155)
at io.xlate.edi.internal.stream.StaEDIStreamReader.next(StaEDIStreamReader.java:187)

To Reproduce

EDIInputFactory factory = EDIInputFactory.newFactory();
EDIStreamReader reader = factory.createEDIStreamReader(ediInputStream);

while (reader.hasNext()) {
  EDIStreamEvent ediStreamEvent = reader.next();
  switch (ediStreamEvent) {
    case START_INTERCHANGE:
      String[] version = reader.getVersion();
      break;
  }
}

With the following as the data used in the ediInputStream:

ISA*00*          *00*          *ZZ*XXXX           *ZZ*DDDDDD         *200910*1930*U*00400*000075776*0*P*>…
GS*QM*XXXX*DDDDDD*20200910*1930*75776*X*004010…
ST*214*757760001…
B10*0542550913*7727019*XXXX…
L11*900169*PO…
N1*SH*HEATCO INC…
N3*50 HEATCO CT NW…
N4*CARTERSVILLE*GA*30120…
N1*CN*ADDISON HVAC LLC…
N3*7050 OVERLAND RD…
N4*ORLANDO*FL*32810…
N1*BT*TRANSAVER FREIGHT SE…
N3*108 WASHINGTON S…
N4*MANLIUS*NY*13104…
LX*1…
AT7*AG*NS***20200911*1159*ET…
MS2*XXXX*902577*CV…
LX*2…
AT7*AF*NS***20200910*1135*ET…
MS2*XXXX*902577*CV…
AT8*G*L*2085*5…
SE*20*757760001…
GE*1*75776…
IEA*1*000075776…

@MikeEdgar mentioned this might/probably is a bug, "the parser should automatically detect the delimiters, but it's possible there is an issue if it's outside the ASCII range"
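
The index in the exception (8230) is the code point of the segment terminator used in the sample, `…` (U+2026), which is consistent with a lookup table sized for a smaller character range. The sketch below is not StAEDI's implementation; it only illustrates, under that assumption, how X12 delimiters can be read from the fixed-width ISA header (106 characters) and kept in plain `char` fields so that no range limit applies. The class and method names are hypothetical.

```java
/** Hypothetical holder for delimiters read from a fixed-width X12 ISA header. */
final class IsaDelimiters {
    /** "ISA" + 16 separators + 86 characters of element data + terminator. */
    static final int ISA_LENGTH = 106;

    final char elementSeparator;
    final char componentSeparator;
    final char segmentTerminator;

    private IsaDelimiters(char element, char component, char terminator) {
        this.elementSeparator = element;
        this.componentSeparator = component;
        this.segmentTerminator = terminator;
    }

    /** Reads the delimiters by position; any char value is accepted, ASCII or not. */
    static IsaDelimiters detect(String header) {
        if (header == null || header.length() < ISA_LENGTH || !header.startsWith("ISA")) {
            throw new IllegalArgumentException("Not a complete ISA header");
        }
        return new IsaDelimiters(
                header.charAt(3),    // character immediately after the "ISA" tag
                header.charAt(104),  // ISA16, the component separator
                header.charAt(105)); // character immediately after ISA16
    }
}
```

For a well-formed header such as the one in this report, `detect` would be expected to return `*`, `>`, and `…` as the element separator, component separator, and segment terminator.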

Elements of improperly sequenced segments are not reset

If a segment appears out of sequence in a loop and the segment has already appeared earlier in the loop, the elements of the additional occurrence are not reset before validation. This results in element occurrence errors in addition to the (correct) segment occurrence error for the improperly sequenced segment.

NPE on Validator#handleImplementationSelected, line 740, v1.10.2

It works with 1.10.1, but fails with 1.10.2. Test case is attached.

validation-npe-impl-schema.zip

Enable retrieval of current schema reference from reader/writer

Add methods to EDIStreamReader and EDIStreamWriter interfaces to retrieve the current standard EDIReference and the current implementation EDITypeImplementation instances.

Get Description from Reference Code

Hi, looking for a way to get the Description of an Element from its Reference Code.

I'm parsing an 837 file and want to know if I can get the element description from the reference code (reader.getReferenceCode())
In other words, is there a way for me to get (as an example) "Organization Name" from "NM103"?
Same question with Loops, how can I get (as an example) "1000B" from "L0001"?

Thanks in advance.

Modify XML schema for transaction and control

Hi,

I am setting up a codeco 96b validation and I noticed you had some predefined files in the library. So I copied your EDIFACT/v3.xml and tried adding a single validation on segment DAM 9500(Type of Damage).

However, I get the message UNEXPECTED_SEGMENT which is pointing me to the DAM segment. I know it has something to do with the grouping of the DAM segment in the interchange XML element, but I just can't find the right way to do it.

Can you please assist?

View codeco 95b:Truugo (DAM is under GRP5>GRP6>C821>7500(Type Of Damage))

My xml is as follows:
codeco95b.zip

Ignore repetition separator

PADIS EDIFACT from IATA says (source):

Repetition separator is not used in PNRGOV messages and therefore the default * separator does not need to be released.

So, our clients are sending files like:

UNA:+.?*'
...
LTS+0/O/SS/NT6604 Y 21FEB 5 LPADSS LK4 1200 1415/NN *1A/E*  /NT/ES/C/I/CAB Y//0////'
...
UNZ+1+0001'

Where those asterisks are content, not separators :(

Would it be possible to support a setting to ignore the repetition separator?
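
To make the request concrete, here is a minimal sketch of what such a setting would change. It is not StAEDI code: the `UnaDelimiters` class and the `honorRepetition` flag are hypothetical names used only for illustration. The six characters after `UNA` are read by position (component separator, element separator, decimal mark, release character, repetition separator, segment terminator), and an element is then split into repeats only when the flag is set.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the delimiters declared in an EDIFACT UNA segment. */
final class UnaDelimiters {
    final char componentSeparator;
    final char elementSeparator;
    final char decimalMark;
    final char releaseCharacter;
    final char repetitionSeparator;
    final char segmentTerminator;

    private UnaDelimiters(String una) {
        componentSeparator = una.charAt(3);
        elementSeparator = una.charAt(4);
        decimalMark = una.charAt(5);
        releaseCharacter = una.charAt(6);
        repetitionSeparator = una.charAt(7);
        segmentTerminator = una.charAt(8);
    }

    static UnaDelimiters parse(String una) {
        if (una == null || una.length() < 9 || !una.startsWith("UNA")) {
            throw new IllegalArgumentException("Not a complete UNA segment");
        }
        return new UnaDelimiters(una);
    }

    /**
     * Splits one element into its repeats. When honorRepetition is false
     * (the hypothetical "ignore" setting), the repetition separator is kept
     * as ordinary data and a single value is returned.
     */
    List<String> splitRepeats(String element, boolean honorRepetition) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < element.length(); i++) {
            char c = element.charAt(i);
            if (c == releaseCharacter && i + 1 < element.length()) {
                // A released character is always literal data
                current.append(element.charAt(++i));
            } else if (honorRepetition && c == repetitionSeparator) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

With the `UNA:+.?*'` header above, `splitRepeats("1415/NN *1A/E*  /NT", false)` keeps the asterisks as content, while passing `true` would cut the value into three repeats.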

Support schema inclusion of segment, composite, and element types

Support schema inclusion of segment, composite, and element types - without another top-level <interchange>, <transaction>, or <implementation> element - using the <include> element in v4 schema.

Implementation schema split from standard schema

Allow for implementation schemas to be specified in separate files that "extend" from a parent standard schema.

Prevent trailing empty elements

Do not write trailing empty components in a composite element or trailing empty elements/composites in a segment.
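
The rule can be illustrated with a small, library-independent sketch. The helper below is hypothetical and is not how StAEDI's writer is implemented; it only shows the intended output: empty values in the middle keep their position, while empty values at the end are dropped along with their separators.

```java
import java.util.List;

/** Hypothetical helper: joins values while dropping trailing empty ones. */
final class TrailingEmptyTrimmer {
    private TrailingEmptyTrimmer() {
    }

    static String joinWithoutTrailingEmpties(List<String> values, char separator) {
        // Find the position just past the last non-empty value
        int end = values.size();
        while (end > 0 && isEmpty(values.get(end - 1))) {
            end--;
        }

        StringBuilder out = new StringBuilder();
        for (int i = 0; i < end; i++) {
            if (i > 0) {
                out.append(separator);
            }
            if (!isEmpty(values.get(i))) {
                out.append(values.get(i));
            }
        }
        return out.toString();
    }

    private static boolean isEmpty(String value) {
        return value == null || value.isEmpty();
    }
}
```

The same function applies at both levels: once with the component separator for the values of a composite, and once with the element separator for the elements of a segment.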

EDIFACT Syntax Rule D7 (If first, then none of the others)

If first, then none of the others: `D7(030,040,050)`

If the first item (at position 03) is present, then the items at 04 and 05 cannot be included.
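
As a rough illustration of the rule's logic, independent of how it is expressed in a StAEDI schema, the check can be sketched as below. The class name and the convention of passing 1-based positions (3, 4, 5 for `D7(030,040,050)`) are assumptions made for this example.

```java
/** Hypothetical check for syntax rule D7: if first, then none of the others. */
final class SyntaxRuleD7 {
    private SyntaxRuleD7() {
    }

    /**
     * @param elements  element values of the segment; null or "" means absent
     * @param positions 1-based positions named by the rule, first one first
     * @return true when the rule is satisfied
     */
    static boolean isSatisfied(String[] elements, int... positions) {
        // The rule only constrains the others when the first item is present
        if (positions.length == 0 || !isPresent(elements, positions[0])) {
            return true;
        }
        for (int i = 1; i < positions.length; i++) {
            if (isPresent(elements, positions[i])) {
                return false;
            }
        }
        return true;
    }

    private static boolean isPresent(String[] elements, int position) {
        int index = position - 1;
        return index >= 0 && index < elements.length
                && elements[index] != null && !elements[index].isEmpty();
    }
}
```

Note that the rule is one-directional: when the first item is absent, any combination of the remaining items is allowed.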

Method to retrieve the current transaction version

Add getTransactionVersion to EDIStreamReader, align with existing getVersion used for the interchange/control version.

EDIFACT Support validation of empty segments

In transaction schema, if we put:

<segmentType name="SRC" />

an error occurs parsing the schema: io.xlate.edi.internal.schema.StaEDISchemaReadException: Unexpected XML event [2]

And if we put:

<segmentType name="SRC">
  <sequence />
</segmentType>

an event ELEMENT_OCCURRENCE_ERROR is returned by EDIStreamReader::next with error type TOO_MANY_DATA_ELEMENTS.

This issue is related to #10

Support X12 ISX Segment Release Character

X12 TR4 (Clarification Paper)

The ISX segment first appeared in version/release 007040 of the X12 EDI Standard, with ISX01 and ISX02. ISX03 and ISX04 were introduced in version/release 008010. The I11 value in the ISA segment determines the version of the ISX which may be used in the interchange.

The elements featured in ISX are listed below. This issue will address the first element.

Release Character - ISX01
Character Encoding - ISX02
Overriding X12 Version / Release Code - ISX03
Industry Identifier - ISX04
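
For context, a release character marks the character that follows it as literal data rather than a delimiter. The sketch below shows the general idea on the writing side only; it is not taken from StAEDI, and the class name, method name, and the characters used in the usage note are assumptions chosen for illustration.

```java
/** Hypothetical helper that applies a release character to outgoing data. */
final class ReleaseEscaper {
    private ReleaseEscaper() {
    }

    /**
     * Prefixes every delimiter found in the data, and the release character
     * itself, with the release character so a reader treats them as data.
     *
     * @param data       raw element value
     * @param release    release character in effect for the interchange
     * @param delimiters all delimiter characters in effect for the interchange
     */
    static String escape(String data, char release, String delimiters) {
        StringBuilder out = new StringBuilder(data.length() + 4);
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == release || delimiters.indexOf(c) >= 0) {
                out.append(release);
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

For example, with `\` as the release character and `*`, `~`, and `:` as delimiters, the value `A*B` would be written as `A\*B`.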

GE control schema elements reversed

Currently:
GE01 = 28
GE02 = 97

Correct:
GE01 = 97
GE02 = 28

EDIFACT Support empty (trigger) segments

Support empty segments like the "SRC" one in IATA PADIS EDIFACT (https://www.iata.org/contentassets/18a5fdb2dc144d619a8c10dc1472ae80/pnrgov20edifact20implementation20guide2016_1.pdf).

When an empty segment (SRC') is reached, this exception is thrown:
Message: EDIE003 - Invalid processing state: INVALID (previous: TAG_3); input: '''

We have "fixed" it, but we don't know if it's the best solution :)

Schema XML parse error with multiple syntax elements

io.xlate.edi.schema.EDISchemaException: EDISchemaException at [row,col]:[618,32]
Message: Unexpected XML event [1]
        at io.xlate.edi.internal.schema.StaEDISchemaFactory.createSchema(StaEDISchemaFactory.java:95)
        at io.xlate.fresno.schema.boundary.SchemaResource.getTransactionSchema(SchemaResource.java:73)

Implementation elements before segment selection not available

Need to -

Set implementation element and composite in validator when implementation segment is determined
Update elements and composites with implementations in pending event list when implementation segment is determined

Support different validations for the same segment at different levels

I'm not sure if this is a new feature or a question... Our problem is:

Some segments (with the same segment name) have different validation rules depending on the level they are.

One example from PADIS EDIFACT (spec here):

Segment "TVL" at Level 0 (page 57): many mandatory fields
Segment "TVL" in Gr.5 at Level 2 and Gr.12 at Level 4 (page 59): same fields, but fewer mandatory ones

Does the transaction schema support different validation rules depending on the level/loop of the segment?

Add support for partial validation

It would be appreciated if the transaction schema could support something like "wildcard sequences" inside a segmentType declaration.

The idea is to skip validation of the elements/composites of some segments. This can be useful when you are dealing with big "models" and are only interested in parsing and validating part of the "model".

The syntax could be something like:

<segmentType name="FOO">
    <sequence>
    	<any/>
    </sequence>
</segmentType>

It's similar to the "any" element in XSD (with processContents="skip"): https://www.w3schools.com/xml/el_any.asp

Thanks in advance

Support Implementation Loops

Provide a way to identify and validate implementation-specific loops via specific value(s) in an element using a schema.

Remedy for error

Hello,

I would like to know: if, say, I encounter a data error (DATA_ELEMENT_TOO_LONG), how can I retrieve what the length should be? A sort of "remedy" for the error.

Something like:


case ELEMENT_DATA_ERROR:
    if (reader.getErrorType() != null) {
        /*
         * createSchema.getElementById(reader.getId());
         * to get the schema defined element, so we can access things like
         * "minOccurrence, maxOccurrence, minLength etc..."
         *
         * and
         *
         * reader.getRemedyType()
         * so we can show how to fix the error
         */
        ediWriteElement.setError(reader.getErrorType().name());
    }
    break;

<s:SEG>
  <!-- SEG01 missing, treat as empty EDI element in output -->
  <e:SEG02>value</e:SEG02>
  <!-- SEG03 missing, treat as empty EDI element in output -->
  <c:SEG04>
    <e:C002-01>value</e:C002-01>
    <!-- C002-02 missing, treat as empty EDI component in output -->
    <e:C002-03>value</e:C002-03>
  </c:SEG04>
</s:SEG>