dmn-tck / tck Goto Github PK

View Code? Open in Web Editor NEW

50.0 14.0 35.0 30.9 MB

Decision Model and Notation Technology Compatibility Kit

Home Page: https://dmn-tck.github.io/tck

Java 67.41% FreeMarker 17.12% XSLT 3.64% C# 2.65% JavaScript 9.16%

dmn tck compatibility-kit decision-model dmn-model hacktoberfest

tck's People

Contributors

Stargazers

Watchers

tck's Issues

Problems opening DMN files in Camunda Modeler CE

I installed Camunda Modeler Community Edition 1.11.3 from December 6, 2017 on Windows 7. As of the date of this writing, it is the latest version of Camunda Modeler officially offered on the site: https://camunda.com/download/modeler/

I carefully followed all instructions on how to setup the tool on the page below to the link.

I have problems opening DMN models from DMN TCK test suite. I ve tried randomly around a dozen of TCK models, and all of them have roughly same warnings. Not much content is shown visually after opening. Also, I cannot edit models opened in this way.

Below are some screenshots and logs attached. Any advise on what should be done to fix these errors will be highly appreciated. Thank you.

testCases.xsd does not support empty context (without name)

What is the valid way to define an expected test result for an empty context? The FEEL Expression "{}" is an empty context. To define the test result for this case I guess the nillable attribute was added. But the name attribute is required. Therefore a dummy name must be used.

<testCase id="example123">
    <resultNode name="feel-ctx-abc" type="decision">
        <expected>
            <component xsi:nil="true" name="dummyName" />
         </expected>
    </resultNode>
</testCase>

Should we remove the required-flag for the name attribute, or is there another way to define an expected result value for an empty context?

Type of BKM CreditContingencyFactorTable in 0004-lending.dmn

Currently the type is <variable name="CreditContingencyFactorTable" typeRef="feel:string"/>

It should be <variable name="CreditContingencyFactorTable" typeRef="feel:number"/> as the output of the decision table is a number.

Increase test coverage for Decision Table compliance

Opening a ticket to link and consolidate discussions on the increase of test coverage for decision tables.

Our current test set covers the basics of decision table runtime semantics, but many common and edge cases are not covered. In order to increase test coverage, I feel we need to create tests to cover:

Test decision tables with all hit policies: unique, any, priority, first, output order, rule order and all types of collect
Test both single output and multiple column output for all hit policies that allow it
Test decision tables in BKMs, both with explicit and implicit parameters
Test decision tables embedded in context boxed expressions, where some of the inputs might be defined in previous rows of the context
Test use of variables in the input cells of the decision table
Test use of variables in the output cells of the decision table
Test use of expressions in the header of the decision table
On decision tables inside a BKM, test use of BKM parameters as part of the decision table expressions (header, input, and output).
Test semantics of lists in the input entries

I will update this description with the list of PRs, until we close this ticket. Here are the PRs created so far for this proposal:

PR #169
PR #170
PR #171
PR #173
PR #174
PR #178
PR #180

What is the meaning of "Skipped?"

We allow vendors to designate a result as "failed" or "skipped". What is the difference?

The question is one of credibility to the visitors of the results site. If every vendor was to mark all failures as skipped, what does it mean.

For me personally: I felt that skipped was to be used to designate areas that the vendor does not implement, and has a reason to not implement. For example: say that the spec had a feature for reading and writing Excel files, and vendor decides not to support that. So ... they mark all the tests for Excel file as "skipped" to tell users that they simply decided not to implement that. Is that what skipped means to others?

Since the tests are contributed, and because each implementation is different, the "skipped" setting is an escape valve to prevent one vendor from submitting a lot of tests just to make another vendor look bad. The ability for a vendor to avoid certain categories of tests seems important.

For this to work, I think some sort of explanation is needed. That explanation needs to persuade the visitors that there is a good reason that they will not need the functionality represented by the tests. There is no question that "skipped" means that the test does not function on that vendor.

One simple way to handle this is to expect that a vendor include these details on the overview sheet. Is that sufficient?

Or, should we require that any skipped test be mentioned on a "black list" and that some form of explanation be given. There IS a spot in the results to make this explanation. Should we require that a skipped result have a comment? Is that sufficient? Is that necessary?

PLEASE COMMENT - I am putting this issue on the site because I want to get everyone's input.

Shall we remove "cast" and "type" from schema

Questions have been raised in recent weeks about the meaning of cast and type which are on the inputNode and resultNode.

We asked Bruce to document these, but upon review he thinks it would be simpler to just remove them from the schema.

This is a proposal to remove "type" attribute and "cast" attribute from inputNode and also resultNode.

XML character entities (`>` and `<`) inside CDATA sections

Several test cases contain XML character entities like > and < inside CDATA sections, e.g.:
https://github.com/dmn-tck/tck/blob/master/TestCases/compliance-level-2/0004-simpletable-U/0004-simpletable-U.dmn
https://github.com/dmn-tck/tck/blob/master/TestCases/compliance-level-3/0013-sort/0013-sort.dmn

According to the XML specification, XML character entities like > and < must not be decoded in CDATA sections:

Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using " < " and " & ".

Proposal: Replace the entities with their decoded character.

Goldman Sachs results for September

All the tests are displayed as SKIPPED at https://dmn-tck.github.io/tck/ although the status is SUCCESS.

Looks like a file separator \ vs / issue. Edson could you advise?

Perform optional and opt-in DMN engines TCK Runners during Continuous Integration

As discussed on the 2018-06-01 meeting, I am opening this issue to track the work to allow on an optional and opt-in basis, the ability for the Continuous Integration tool (Travis CI) to perform the execution of the DMN engines TCK Runners, especially for newly raised PR.

Given the opt-in nature, for demonstrating this work I will use Camunda and Drools as example; other Vendors will be able to include their configuration or ask for it to be included, later.

I experimented with a couple options, both with their pros and cons which I describe in the following, in order for the whole group to make an informed decision when proceeding.

Demo description

In order to experiment without touching this repository, I made a clone of this repository and I created a dummy PR in order to show an example. The dummy PR demonstrate the proposal to add a new test: the DMN model is valid, however the test case xml file is intentionally broken. This is because the DMN model is composed of:

Where the decision Greetings is defined as:

The companion test xml is:

(...)
	<testCase id="001">
		<inputNode name="Name">
			<value>John</value>
		</inputNode>
		<resultNode name="Greetings" type="decision">
			<expected>
				<value>Hello John</value>
			</expected>
		</resultNode>
	</testCase>
	<testCase id="002">
		<inputNode name="Name">
			<value>John</value>
		</inputNode>
		<resultNode name="Greetings" type="decision">
			<expected>
				<value>Hello, John!</value> <!-- a WRONG test definition.. -->
			</expected>
		</resultNode>
	</testCase>

Naturally the test id 001 is valid while 002 is intentionally failing.
I am using a 002 intentionally failing test, to simulate the case where a specific Vendor runner, would not cover for any reason the correctness of execution.

Hence for the purpose of this demo, you should consider this dummy PR represent a case where a new test is proposed, the test is xml valid both in the DMN model and the xml file, the test is considered correct (here simulated in the PR), and some Vendor execution does not cover the case correctly.

Option A, use of Travis CI matrix configuration

This option uses the Travis CI matrix configuration to run the Validation as a mandatory configuration, while all the other Vendors are configured as optional/"allow-failure". In turn this means a failure in a Vendor runner, would not classify the full build as failed. To be explicit this means if the Validation (of xml) is failing, the build will be marked as Failed regardless of the Vendor runners results.

On the provided example, it would look like this:

And once clicking on the Details for the GitHub check would lead to the Travis CI page for this execution:

Being a simulation please consider the Vendor line failure, as simulating the case where the correctness of execution was not covered; because the Validation is passed, the build is marked as Success, regardless of the Vendor results.

PROs

this is using standard Travis CI features of configuration
the build is Success/Failure only with regards to Validation, which is the most important case anyway. This meaning we will have on the GitHub PR archive green/red only concerning the Validation phase, and regardless of any Vendor results.

CONs:

each Vendor need to click into the Details page in order to see if their engine results for the new test as proposed in the PRs.
The vendor execution are performed anyway, regardless of the result of the Validation.

Option B, use of Travis CI "Build Stages" (beta-feature)

This option uses the new Beta feature of Travis CI to group jobs separately, Validation being the first group, followed by all the other Vendors runner as the second group. Currently this beta feature requires for all Groups to pass, therefore a failure in a Vendor runner, would classify the full build as failed.

On the provided example, it would look like this:

And once clicking on the Details for the GitHub check would lead to the Travis CI page for this execution:

Being a simulation please consider the Vendor line failure, as simulating the case where the correctness of execution was not covered; because any of the Group failed, the build is marked as Failure.

PROs

any Vendor can immediately see on the PR page something has gone wrong, and can click into the Details page in order to see if their engine results for the new test as proposed in the PRs.
The vendor executions should be performed only if the Validation group is passed [I haven't tested this properly]

CONs:

this is using Beta-feature of Travis CI features of configuration
the build is Success/Failure also with regards to any vendor results. This meaning we will have on the GitHub PR archive green/red affected by any Vendor results: we might decide we merge a PR because is a valid case, but on the GitHub archive will show as red becayse any Vendor might have failed.

Final considerations

I preferred to show all possible options which I experimented with, by studying the Travis CI documentation for the online version we are using, despite I have naturally a preference.

My personal preference and my vote goes with Option A, because for the purpose of adding a new test, we should first most concerned with it being xml valid (for the DMN file and test xml file) and a human check the test make sense. A Vendor result should not influence with a build marked Success/Failure. Being Option A using standard features, it should make more easy later to expand Travis CI usage also to produce the webpage itself.

p.s.: @falko you might want to check the Camunda runner, for the test case 002 I expect it should have reported a failure, with the input "John"

		<resultNode name="Greetings" type="decision">
			<expected>
				<value>Hello, John!</value> <!-- a WRONG test definition.. -->
			</expected>
		</resultNode>

this should be a test failure, while it is marking success instead ;)

0012-list-functions.pdf doesn't show decision remove2nd

Joining the effort

Hi all,

First of all. Congrats to all for a fine specification and well done to the contributors here on the TCK effort.

Secondly, apologies for this posting in 'issues'.

I have followed the instructions for joining the TCK effort and things have been very quiet. I've sent of some emails and received no responses. Hence this post. Again apologies. I am aware it is pretty likely folks may be on vacation in the US/Canada at the moment. I am in Melbourne, Australia.

Also, my apologies for being a complete DMN newb. My relationship with DMN and FEEL started from absolute zero maybe three weeks ago.

As part of research for a possible new venture I am currently prototyping a v1.2 DMN and FEEL 'execution' implementation (no modelling as yet). After a couple of weeks or so of some pretty intense pizza-under-the-door coding I am nearing the end-game of the current TCK compliance tests. It may be of interest to some that the prototyping platform is Typescript. The TCK is being run by Nodejs and Mocha.

If I may, I wish to contribute back to the TCK. There are gaps in the suite so this post is a starting point to find out how I can help. There are also some large-ish gaps in my understanding and more than a few spec ambiguities that need resolving for me. I have some questions for sure. I doubt I can finish it without getting stuff resolved.

At any rate, I am keen to make a start here and get the ball rolling and understand what the process is.

Many thanks,

Greg McCreath.

Exorbitant input in "0004-lending"

In this example test case "001" has input node "RequestedProduct", which is not defined in information requirement of tested decision "Adjudication".

Documentation of Test File Schema

Simon has pointed out that there are aspects of the schema that are not clear what their meaning is. The case in point is the "type" attribute on the result tags. Currently, many tests have this set to "decision" so it is not a data type of the result (which is handled differently). It might be that this is a byproduct of the development where an early implementation thought this was needed, but later found it not necessary.

We need to have a pass that documents the parts of the schema, and explains how they are to be used, and possibly to proposing simplifying the schema if was can not find reasons for parts existing.

Simon said he could do some of this work, but would need help and input from Edson, Bruce, and probably others.

3-0020 test file

Expected value for only 1 decision is provided. There are 4 supporting decisions also. Shouldn't there be expected values for all decisions in the model?

What is the result of a filter that is not a boolean?

The spec says, on page 113:

[1, 2, 3, 4][item > 2] = [3, 4]
[ {x:1, y:2}, {x:2, y:3} ][x=1] = {x:1, y:2}

And:

The filter expression is evaluated for each item in list, and a list containing only items where the filter expression is true is returned.

Based on the above, what is the result when a filter evaluates to a non-boolean value and/or to an error?

E.g.:

{ 
    a list : [ { a : false, b : 2 }, { a : null, b : 3 }, { b : 4 }, { a : true, b : 5 } ],
    r : a list[a] 
}.r

Our understanding is that this should:

when evaluated for the first element { a : false, b : 2 }, a is a false boolean-literal, hence not selected,
when evaluated for the second element { a : null, b : 3 }, a is the null literal, hence not selected
when evaluated for the third element { b : 4 }, there is no a, the filter would be an error, hence return null, hence not selected
when evaluated for the fourth element { a : true, b : 5 }, a is the true boolean-literal, so it will be selected

The result of the expression then would be:

[ { a : true, b : 5 } ]

Mapping between DMN and test XML

I cannot find exact specification of correspondence between elements of DMN and test XML to establish universal translation of entries. Does it exist? If yes, I will be grateful for a link. If not, it might have sense to formalize.

Below are my guesses of intended mapping. On left is DMN. On right - XML. Please correct or extend where appropriate.

definitions -> testCases
DMN file name -> modelName
??? -> [label]
decision -> testCase
decision.id -> testCase.id ?
informationRequirement -> inputNode
??? -> inputNode.name
requiredInput -> component
inputData [href from requiredInput] -> value
inputData.variable.name -> component.name
??? -> component.component.... - according to nesting of boxed expressions?
decision.invocation -> resultNode
??? -> resultNode.name
binding -> expected ?

Students Writing Tests

Mihail from MITRE suggested that in his class on DMN modeling, it might be possible to have students write tests that the TCK could use. It would mean a minimal coordination (what tests are needed at the time). The Trisotech implementation allows modeling of the decision, as well as testing it. That would mean the students would need some free access to Trisotech, is that possible? Denis said that the format on the Trisotech is not exactly the same as TCK, so there is some adjustment that must be made. It is all one file instead of separate model from input/output. (Could we get this to automatically produce TCK conforming output?)

This issue is to discuss the possibility, and to understand what we need to do to make this happen.

Tests with nested components in decision results

We have several examples with complex nested types in inputs. Are there similar examples where decision results are complex types?

1115-feel-date-function

All the itemDefinitions associated with a decision that results in null do not content a typeRef or itemComponent. Doesn't look right to me. We should add typeRef:date in all of them.

Some tests are sensitive to timezone difference of client/server

After implementing the ability to run the DMN TCK tests in the Fujitsu DXP product, and after submitting the successful report, we had an engineer in India re-run the tests. She got a bunch of errors. Our test runner is running in the browser (in JavaScript) and her browser was in a different time zone from the server (that is not very surprising). The test client is running in a different time zone from the DMN server.

This problem causes tests to fail when there is nothing wrong with the engine. False negatives. A good test suite does not behave this way. We should be able to test any DMN model, but we need the test data constructed in such a way that the test always runs.

I have attached the 30 failed results so you can see which tests failed.

This issue is to come to an agreement on how this problem should be addressed so that the test suite is on sensitive to the configuration of the system they are being run on.

tck_results_that_failed.txt

One solution might be to require that the runner to always consider the time zone of the server when interpreting the date/time values in the test files.

Nested lists in 0010-concatenate.dmn

Decision literalNestedList has type List<List<String>> but returns a literal that is not compatible with its type. See below.

              <literalExpression>
			<text>[["w","x"],"y","z"]</text>
              </literalExpression>

We should change the decision to return

              <literalExpression>
			<text>[["w","x"],["y","z"]]</text>
	      </literalExpression>

and the tests accordingly.

Types of decision concatenate1 is List<List<String>>. However, the type of returned literal expression is List<String>. I suggest to change the type of the decision to List
Type of decisions concatenate2, concatenate3 and concatenate4 is List<List<String>>. However, the type of the returned results are not compatible with this type.

I suggest to change the type to List<String> and change the literal expression to match the type, for example flatten the second parameter.

		<literalExpression>
			<text>concatenate(simpleList,flatten(nestedList))</text>
		</literalExpression>

Improve the Test Case Schema with extension possibility and a name attribute for TestCase

I'm submitting a PR for a new schema for the Test Cases.

Two improvements:

Extensibility: Added the possibility for vendor to have extended attributes attached to test cases. We found that this would be pretty useful in our implementation.

Name attribute on Test Case: In our tool, we would like to have a Human Readable name for the test cases. It's optional in the schema.

Test case 0035-test-structure-output has wrong variable typeRefs for BKMs

As already mentioned and accepted with issue #112 the variable of a business knowledge model should have no type. A valid type for the variable would be the type "function". But since DMN does not define such a type, we already removed the typeRefs from all BKM variables in the DMN TCK test model files.

The concrete type (e.g. feel:number) can be specified for the inner expression, e.g. the literal expression.

Please remove the typeRef attribute for the variable elements of all 3 BKMs in test 0035-test-structure-output.

Expected behavior for list built-in functions in combination with singleton lists?

The behavior of singleton lists is not clearly specified in the DMN specification. Especially for list built-in functions there are some special cases:

operation	expected result 1	expected result 2	expected result 3
distinct values(["a", ["a"]])	["a"]	["a", ["a"]]	[["a"]]
union(["a"], [["a"]])	["a"]	["a", ["a"]]
index of(["a", ["a"]], "a")	[1,2]	[1]
count([[]])	0	1
list contains(["a", ["b"]], "b")	true	false

Should the elements of the lists be normalized (transform singleton lists to single element) before the operation is applied? If this is the case column "expected result 1" shows the return values.

Should the elements treated "as is", without any transformation. It this is the case column "expected result 2" shows the return values.

For distinct values() there is also a valid 3rd result (as alternative to expected result 1 if normalization was done). Should we return element 1 or 2 of the list. In this case element 2 is returned.

Which results do you calculate/expect in your engine/runtime? Which is the valid approach?

Should encapsulated decisions of a decision service include output decisions?

For us it is unclear what decisions should be stored in the DMN model as encapsulated decisions. Must the output decisions also be included in the list of encapsulated decisions (as stated on page 25, 26)? Or does the list of encapsulated decisions only hold the decisions contained in the lower compartment of a decision service (as mentioned on 56 since encapsulatedDecisions seems to be optional)?

Need an indicator for ERROR test cases

Actico reported that they had to remove error test cases from their recent submission because of problems with handling error cases with the TCK framework.

My impression was that we had decided that error test cases were merely tested by testing for null. Thus, if an expression hits an error, the spec says that it is defined to return null, and if the test cases is expecting null, then test case should PASS even though the evaluation hit an error condition. We can not distinguish non-error nulls from error nulls -- nobody is happy about that but at least we can test error cases.

Arthur mentioned that in their case, if the execution hits an error, that the test case will be marked as FAILED. I think this is a bug. If the test cases says it is expecting null, then hitting an error should not cause the test to be marked failed.

Separate from that, Arthur raised the concern that we really should start marking tests that are expecting errors from those that are expecting no errors. If we don't do that now, we will have a maintenance issue later is fixing up the test cases.

In the meeting we discussed more about errors. It seems that you might have a decision table with dozens of expressions that hit errors, but the table still might correctly produce non-null output. Does this count as an error case or not? I will make a separate issue for that, this issue is just about extending the schema to create a marker for error cases.

Should we pool funds to promote the TCK results?

This is a question of whether we should collect funds from members, and develop some sort of common marketing / promotion plan (maybe a small video or an email campaign) to raise awareness of the results of the TCK. Given that we all have invested effort into this, the more people who know about it, the more return we will get. What we want is end-users knowing about TCK and selecting vendors based on their demonstration of conformance with the spec.

I raised this issue last week meeting, and the conclusion is that members are NOT interested in this. If you, however, would like to see something like this, or if you have a better idea, then please respond to this issue with a comment.

If there are no comments and no suggestions I will close the issue for the time being.

Definition of Business Knowlege Model return type is wrong for 22 test cases

22 testcases listed in the attached file InvalidTestCases.txt are defining the function invocation return type using the Information Item of the Business Knowledge Model. This is wrong.

The specification says on page 50 for the variable/InformationItem: "The instance of InformationItem that is bound to the function. An invocation can reference this variable by name" and "This attribute defines a variable that holds the FunctionDefinition, allowing a Decision to invoke it by name".

The InformationItem of the Business Knowledge Model defines the name and typeRef corresponding to the included FunctionDefinition. The typeRef is always a Function. FEEL/DMN has no function type. Therefore the typeRef of the variable/InformationItem should be omitted.

To specify the type that is returned by the FunctionDefinition, the typeRef attribute of the body expression of the FunctionDefinition of the Business Knowledge Model should be used.

22 testcases are affected. The typeRef attribute must move from BusinessKnowledgeModel/Variable to BusinessKnowledgeModel/EncapsulatedLogic/Body expression.

Currently these 22 test cases do not comply with the DMN 1.1 specification.

There are OMG issues for DMN 1.2 about fixing the doubled description of variable/InformationItem in table 14. But the content is the same.

Please find attached:

a file listing the 22 wrong test cases
an example testcase that is invalid (renamed to *.txt) with an additional screenshot file
an example testcase that was modified (renamed to *.txt) and is valid with an additional screenshot file

InvalidTestCases.txt

INVALID_0009-invocation-arithmetic.txt

VALID_0009-invocation-arithmetic.txt

Is pre-processing of tests allowed?

It came up in the meeting on Mar 9 that it might be that some implementations do not read the test DMN files exactly as the TCK publishes them. Some of the implementation may require "pre-processing" of the test DMN files before they can be read and executed.

The problem might be the old 'spaces in names' issue we have discussed in the past. The specifics that Boris ran into should be clarified. However, the general question is: do we allow a product to claim conformance when the model must be modified to make it readable?

The TCK promise is that if any person makes a DMN model that conforms to the DMN spec, that it will run in a particular engine to the extent that the test results say that the engine will run that.

We lose credibility if there is a test DMN model, and a vendor claims runs, but when the user tries it it does not run. This is the worst possible scenario for the TCK. As soon as a user learns that tests claimed to run don't actually run, they lose all interest in the TCK, and our ability achieve our goals are compromised. We are wasting our time if we allow companies to claim to run a test when they don't.

One simple solution is to make the pre-processing available in the product. A product might have a "compatibility mode" checkbox when reading the file. A product that has such a feature, can legitimately claim conformance while in compatibility mode. It would not be acceptable to be forced to download a special version of the product, or to download a separate compatibility tool. If a company claims that XXX can run the test, then XXX should run the test. Anything else would appear to be a lie and ruin our credibility.

Even with that, there would have to be constraints on the "pre-processing" because obviously a (trivial extreme case) a pre-processor that converted all tests to a single model and a single set of inputs would not be acceptable. We need to have boundaries how the preprocessor works. But if the product reads the files and generates the right results, then it passes the test. It is that simple.

I am thinking of the Volkswagen example: the car has a special "mode" for detecting the test conditions where it behaved as expected, but did not behave that way when running outside the test. This is not acceptable. The test is expected to actually test the actual product in actual use.

We don't do anyone any favors by "relaxing" our definition of what passing a test means.

I propose that we discuss this in the March 16 meeting.

DMN Power function

I recently came across that the DMN Power function e1 ** e2 is defined as follow in the spec:

type(e1) is number. value(e2) is an integer ...
(Table 55 page 134 of the DMN 1.2 beta spec)

This means that 2 ** 2 is 4 but what about 2 ** 2.5? Is it 4 because we take the integer portion of 2.5) is it null with an error because the function clearly states that e2 needs to be an integer.

Some things to consider, a lot of financial formulas use interest rates (non integer) as exponent.

Basically, we (Trisotech) need a solution to provide 2 ** 2.5 in our execution.

2 choices:

We create a non standard power function: power(e1, e2) without that e2 is an integer requirement
We support the ** operator without the restriction of e2 is an integer.

I'm certainly not a big fan of solution 1 since it introduce a new function that does nearly the same thing as the first one.

I'm more leaning toward solution 2 and interpreting that DMN require the support for e1 ** e2 to "at least" support integer values for e2 and we would also support non integer values.

Anyway, I would certainly like to have a clarification on that in the DMN 1.3 specification.

CL3 1110-feel-contains contains an explicit reference to type null

AFAIK the DMN standard does not support the feel:null notation. See table 41, page 120 in the DMN 1.1. spec for a list of all the kind / datatypes.

It supports a null literal to define missing values, and an internal type null, compatible with all other types. But this type is internal and is not allowed for the user to use it.

I think

  <dmn:itemDefinition id="_JIz9cLpMEeeHcLDylYgVNA" name="tfeel-contains-function_1_Result">
    <dmn:typeRef>feel:null</dmn:typeRef>
  </dmn:itemDefinition>

should be replaced with

  <dmn:itemDefinition id="_JIz9cLpMEeeHcLDylYgVNA" name="tfeel-contains-function_1_Result">
    <dmn:typeRef>feel:boolean</dmn:typeRef>
  </dmn:itemDefinition>

After all, the result of contains is 3-value boolean (true, false and null).

Am I missing something?

Namespace prefixes in typeRefs

Quite a number of the tests use a namespace prefix on typeRefs such as 'tck:tFoo'. My reading of the spec has it that this is not how typeRefs should be expressed. The spec defines a typeRef as a qualified name (in FEEL, meaning 'dot separated') - not namespace prefixed.

When a typeRef refers to something imported then the 'name' of the import is to be the first element of the dot-separated typeRef. The usage of a namespace prefix is at odds with that.

Specifically, the (1.2) spec says in "12.3.2 References within the DMN XSD":

Attribute typeRef references ItemDefinitions and built-in types by name not ID. In order to support imported types, typeRef uses the namespace-qualified name syntax [qualifer].[local-name], where qualifier is specified by the name attribute of the Import element for the imported type. If the referenced type is not imported, the prefix SHALL be omitted.

It doesn't define what 'prefix' means here, but my guess is it is referring to the [qualifier]+dot.

So typeRef attributes referring to something in the same definitions should just be their name - not with a namespace or any other prefix.

Furthermore, in section 7.3.2 ItemDefinition metamodel

If the type language is FEEL the built-in types are the FEEL built-in data types: number, string, boolean, days and time duration, years and months duration, time, and date and time. A typeRef referencing a built-in type SHALL omit the prefix.

Perhaps the fix is simple - remove all the namespace prefixes from typeRefs in the tests.

Comments welcome.

Greg.

Nested lists in 0009-append-flatten

Decision literalNestedList has type List<List> but returns a literal that is not compatible with its type. See below.

              <literalExpression>
			<text>[["w","x"],"y","z"]</text>
              </literalExpression>

We should change the decision to return

              <literalExpression>
			<text>[["w","x"],["y","z"]]</text>
	      </literalExpression>

and the tests accordingly.

Decision append1 has type List<List>. The returned expression

		<literalExpression>
			<text>append(simpleList,literalSimpleList)</text>
		</literalExpression>

produces and output incompatible with the decision type.

We should replace the literal expression with

		<literalExpression>
			<text>append(nestedList,literalSimpleList)</text>
		</literalExpression>

and change tests accordingly.

Sepsis mitigation logic

This issue is created as a discussion point for preparation of submission of real life DMN example, which solves important problem in healthcare.

Provided are original DMN file and respective XML file created automatically from the model by our test case generator. DMN was created in Camunda Modeler. We kindly ask for evaluation of created XML in respect to complimentary DMN model.

When XML will be confirmed as valid, we will fill it with real data and submit it to main test list. Please do NOT edit XML file manually. It is automatically generated. We need directions on inconsistencies to improve the code, which was used to create this test file. Thank you.

0201-sepsis-mitigation.zip

Need example of decision table that produces non-null results when there is an error somewhere

During the processing of a decision table, any number of FEEL expressions might be evaluated. I don't see any examples in the spec, but I am assuming that any FEEL expression might be used in any cell in a decision table (particularly the input entries). If that is true, then a 100-row table might have several hundred expressions, and any number of them might hit errors and return null. The decision table is then expected to treat these as null, but there still might be rows that match the input correctly and produce output.

Is this an error case or not? An engine that is looking for errors and stopping processing might not evaluate this correctly according to the spec.

Is my understanding correct? If so, can someone design / supply a decision table with these properties?

0034-drg-scopes.dmn not schema-valid

xmllint is reporting the following issues:

0034-drg-scopes.dmn:489: element variable: Schemas validity error : Element '{http://www.omg.org/spec/DMN/20151101/dmn.xsd}variable', attribute 'id': '_3_-kYfQLEeecaNY7kk20yA' is not a valid value of the atomic type 'xs:ID'.
0034-drg-scopes.dmn:504: element variable: Schemas validity error : Element '{http://www.omg.org/spec/DMN/20151101/dmn.xsd}variable', attribute 'id': '_68UZsfQLEeecaNY7kk20yA' is not a valid value of the atomic type 'xs:ID'.
0034-drg-scopes.dmn:522: element variable: Schemas validity error : Element '{http://www.omg.org/spec/DMN/20151101/dmn.xsd}variable', attribute 'id': '_MvwmIfQMEeecaNY7kk20yA' is not a valid value of the atomic type 'xs:ID'.
0034-drg-scopes.dmn:537: element variable: Schemas validity error : Element '{http://www.omg.org/spec/DMN/20151101/dmn.xsd}variable', attribute 'id': '_H52AEPQMEeecaNY7kk20yA' is not a valid value of the atomic type 'xs:ID'.
0034-drg-scopes.dmn fails to validate

An online validator confirmed this:

Not valid.
Error - Line 489, 63: org.xml.sax.SAXParseException; lineNumber: 489; columnNumber: 63; cvc-id.2: There are multiple occurrences of ID value '_3_-kYfQLEeecaNY7kk20yA'.
Error - Line 489, 63: org.xml.sax.SAXParseException; lineNumber: 489; columnNumber: 63; cvc-attribute.3: The value '_3_-kYfQLEeecaNY7kk20yA' of attribute 'id' on element 'dmn:variable' is not valid with respect to its type, 'ID'.
Error - Line 504, 64: org.xml.sax.SAXParseException; lineNumber: 504; columnNumber: 64; cvc-id.2: There are multiple occurrences of ID value '_68UZsfQLEeecaNY7kk20yA'.
Error - Line 504, 64: org.xml.sax.SAXParseException; lineNumber: 504; columnNumber: 64; cvc-attribute.3: The value '_68UZsfQLEeecaNY7kk20yA' of attribute 'id' on element 'dmn:variable' is not valid with respect to its type, 'ID'.
Error - Line 522, 64: org.xml.sax.SAXParseException; lineNumber: 522; columnNumber: 64; cvc-id.2: There are multiple occurrences of ID value '_MvwmIfQMEeecaNY7kk20yA'.
Error - Line 522, 64: org.xml.sax.SAXParseException; lineNumber: 522; columnNumber: 64; cvc-attribute.3: The value '_MvwmIfQMEeecaNY7kk20yA' of attribute 'id' on element 'dmn:variable' is not valid with respect to its type, 'ID'.
Error - Line 537, 65: org.xml.sax.SAXParseException; lineNumber: 537; columnNumber: 65; cvc-id.2: There are multiple occurrences of ID value '_H52AEPQMEeecaNY7kk20yA'.
Error - Line 537, 65: org.xml.sax.SAXParseException; lineNumber: 537; columnNumber: 65; cvc-attribute.3: The value '_H52AEPQMEeecaNY7kk20yA' of attribute 'id' on element 'dmn:variable' is not valid with respect to its type, 'ID'.

Some IDs seem to be used for multiple elements.

parsing function definitions for sort

The TCK has some coverage for sort(). Presently, either just using primitive types for a & b, or context types with simple property names.

Those are pretty parseable. But consider the following (hopefully illustrative) nasty example - using things not yet covered by the TCK.

sort(precedes: function(a, b) b.tax + total + a.a + a > max(list: for i in )a.another + total ..5(, j in (2..10( return i*2 + j*a + total), list: foo)

It is utterly horrific, but, I believe valid. It show a function defn using complex identifiers, a function call using a multi-iteration-context for/in that has open/closed ranges as iterations contexts with a final 'return' that may (or may not) be using the value of 'a'.

In order to execute the sort() we need to parse the 'precedes' and 'list' named params.

at the time of parsing 'preceeds', we do not yet know what type 'list' is so

for 'b.tax + total + a'

we don't know whether "tax + total + a" is a property on b, or whether it is just "b.tax" etc or indeed whether "tax + total + a.a + a" is just a property on b. The usual weirdo stuff with identifiers with spaces and plus (etc) signs.

Function definitions are not delimited so we don't quite know where it ends ... so we can't grab the text and park it for later 'parsing on demand'.
We can't scan forward looking for a comma, because the for in is multi-iteration-context and it has commas.
We can't scan looking for a ':' because we may invocations using named params inside the func def
We can't scan looking for matching parenthesis because the iteration contexts are intervals, with closed / open end combination like ")x .. y(" and "(x .. y(". (not covered yet in TCK)

Note: the spec doesn't specifically say ranges are permitted in for/in loops, but follow the grammar and an expression, including a unary test is. At any rate, a range could be embedded in some other way inside an anon function definition.

If we take the given range of ")a.another + total ..5(" we don't know whether "another + total" is part property on 'a' or not.

If we know the type of 'list' we can - but, interestingly, probably not it if it is a list of type 'any'.

The thing is though, we have to parse 'preceeds' to get to 'list', and we have to parse them both before invoking sort(). My issue is I can't see how we can parse 'precedes' to an AST without first knowing the list type. We really need to understand the a & b type, and for that we need to understand 'list'.

If the function definition is not embedded but in a context, or is a BKM, then life is simpler, we get "sort(precedes: bkm, list: this + that)" and we're sweet. (well, sweeter ...)

Does that make sense?

... or do I have it all wrong .... ?

All advice appreciated,

Greg.

EDIT: typos.

Request for Proposal for testing functions

Bruce and Simon raised the issue that it is impossible to test a function by itself. Currently you have to create a model that uses the function, and pass input and output to the model. It is not impossible to test functions, but it is cumbersome.

They request that the schema for the test be extended to allow the specification of a specific function, as well as the normal input and expected output.

This extensions would then be possible to write a runner that would take the inputs, pass them directly to the function, and compare the output. We don't have any guarantee that a particular implementation would support the ability to call a function outside of a model. However, it seems likely that any given implementation would be able to accommodate this. Whether we use the capability within the TCK is a question we might have to grapple with, but this request comes from those who want to continue to use the TCK Test Format for things beyond what the TCK is directly involved in.

We presume that the schema change would be small, and so making that extension now might enable a lot of future value at very low cost today.

TCK Observations / notes / questions

All, maybe some stuff for discussion. Below are my notes re the current TCK. I mean no offence to anyone. The effort so far has been amazing. Well done. This is just a basic audit on what I have experienced and what my code coverage looks like. My first read of the spec was maybe 5 weeks ago and thus far, I have developed in isolation. So, I am a complete DMN newb and likely have it wrong. I am happy to be corrected/slapped. Apologies if this is covered elsewhere.

TCK Notes / observations / issues

Issues

I tested a few DMNs against the schema and some failed - so I didn't test the rest. Should it be a basic requirement of the test DMN that it passes schema validation?
the grammar states that a 'function definition' is not part of a "textual expression". Only a "boxed expression" may have a function definition. A textual function can only 'invoke' functions, not define them. Which make the tests and examples for the 'sort' function where a function is defined inline invalid. ? If it is not invalid then it must be legal to pass a function to (say) a BKM. Which sounds 'dangerous' as the BKM is no longer self-contained and means the BKM now has a dependency on something in (say) a decision.
some tests use the output of string() for their assertions. The output of string() is not defined by the spec.
0003 has a typeRefs of 'tck:tEmploymentStatus' and 'tns:tLoan' but they are not actually defined. There is a type 'tEmploymentStatus' and a 'tLoan' defined, but neither bound to the 'tck' namespace. What is the policy?
some tests have typeRefs that are non-existent (without or without namespace). What is the policy?

TCK Coverage

Large-ish

no DecisionService coverage
no equivalence coverage
no conformance coverage
no import coverage
very little add/subtract coverage beyond numbers and strings

Basic FEEL tests:

some coverage - but mostly 'incidental' in nature. Not specific tests.

Assert:

which chars are whitespace
which chars may be included in names
inclusion of extra chars in names
string escape chars
disambiguating names with spaces ()giving precedence to longest scope name) - wrap not longest scope name in ()
truth table matrix for AND/OR/NULL
type equality equivalence between types
type conformance
2.00 = 2
0 = -0
** is right associative (spec doesn't say it, but it is)
multi-line expressions
comments in expressions //-EOL and /* */
var names with cr/lf in them
order of precedence

Boxed Expressions

pretty sure there is no coverage for nested boxed expressions

Decision tables:

No coverage for:

'not' usage in unary tests (it is in a single test, but is incidental)
input clause expressions that fail unary tests should result to null
output entries that fail output clause unary tests should result to null
P/O hitpolicy - null values in output values and how they affect priority sorting
P/O hitpolicy - multi-column priority when some or all columns do not have unary tests
P/O hitpolicy - how to handle failures in output clause unary tests
'any' hit policy - verify the multiple results are actually the same (using type equivalence)
input/output clause unary tests with ranges - all tests seems to be on static strings
open/closed start/end ranges (might be but I haven't seem then ..)
defaultOutputEntry for multi column results
return value when no results and no defaultOutputEntry (for each hitPolicy)
return value for U hitPolicy when more than one result
'?' value in scope for generalised cell expression
an input entry generalised expression returning a value, not a boolean

Boxed expr List:

no coverage for list element that are 'expressions', test cover only string/numeric literals
no coverage for list cells that are other boxed expressions

Literal Expressions:

no coverage for inline contexts like {foo: "bar"}, {foo bar: a + 1}, or {a: 1, : b: a + 1}
no coverage for 'instance of'
no coverage for array property access with negative indexes
no coverage for array 'out of bounds'
path accessor: when left evaluates to null
path accessor: when left evaluates to neither a list nor a context
path accessor: on qname-like nested contexts/lists
negation: return null when negating a non-number
no coverage for 'between'
for/in: no coverage on ranges like 1..10, 5..x+10, -10..10
for/in: no coverage on reverse looping like 10..0, -10..-5

Function Definitions

grammar rule 58 says that a function definition (say for a context), the type of a formal param can be specified after a ':' like 'function(a:string, b:xsd:dateAndTime)' or 'function(a:string, b:feel:date and time)'

Built-in functions

for/every

Should return true when list is empty
no coverage for nested iteration contexts
no 'true' test (all test assertions are 'false' )

for/some:

Should return false when list is empty
no coverage for nested iteration contexts
no 'false' test (all test assertions are 'true' )

number() - no coverage
all() - no coverage
any() - no coverage
append() - no named params coverage, no type conformance assertion coverage
concatenate() - no named params coverage, no type conformance assertion coverage, no singleton array coverage
distinct values() - no 'equality' checking as per spec - with contexts and lists and range and functions etc.
indexOf() - no 'null' param coverage, no 'equality' checking as per spec
insert before() - no 'null' param coverage, no type conformance checking, no list bounds checking
list contains() - no 'equality' checking as per spec, only 'false' is tested. Maybe no named params test.
max() - return null for empty list not covered, non-numbers not covered. strings - dates, time, durations
max() - return null for empty list not covered, non-numbers not covered. strings - dates, time, durations
mean() - return null for empty list not covered, what to do when list if non-numbers.
median() - no coverage
mode() - no coverage
remove() - no null param coverage. no list index bounds coverage, position == 0 not covered
reverse() - no null param coverage
sort() - no null param coverage. refer in 'issues' section re embedded function defs in literal expressions. Tests assume 'stable' sort but spec has no requirement for that.
stddev() - no coverage. spec example assumes 'sample' stddev algorithm, not 'population' algorithm, though it is not specified.
sublist() - no null param coverage. No coverage for leaving off optional 'length' param
sum() - return null for empty list not covered, non-numbers not covered
union() - no type conformance checking.
abs() - no coverage
ceiling() - no coverage for non-number (inc null).
even() - no coverage.
floor() - no coverage for non-number (inc null)
log() - no coverage
modulo() - no coverage
exp() - no coverage
odd() - no coverage
sqrt() - no coverage
endsWith() - no coverage for non-string (inc null)
lowercase() - no coverage for non-string (inc null)
matches() - no coverage for non-string params (inc null). no coverage for invalid regex.
replace() - no coverage for non-string params (inc null). no coverage for invalid regex or invalid flags.
split() - no coverage
length() - no coverage for non-string params (inc null).
substring() - no coverage for non-string params (inc null).
substring after() - no coverage for non-string params (inc null).
substring before() - no coverage for non-string params (inc null).
uppercase() - no coverage for non-string params (inc null).

Other:

the word 'in' cannot appear in names for 'for .. in' statement. Should include in tests?

Questions

When a decision does not define a variable, were does the output go? What is a decision without a variable?
Are entries in context ordered to have peer-dependencies for later entries resolved by earlier entries?
context 'get value' and 'get entries' are not in the built-in functions list - should they be implemented? If should if 'key' of get value a string or an expression? If an expression that makes pre-compilation an type-inference not possible.
when a unary test within a unarys tests returns null how does this impact the overall result of the unary tests.

Notes

testCases.xsd: has a test type of 'bkm' but it is not used anywhere.
testCases.xsd: what is the 'computed' result node?

Bug: Wrong InputNode name in test file: 0001-filter-test-01.xml

The inputNode in the test file is called "Employee" whereas in the model the InputData is called "Emplyoees".

I think this is the result of a mistake during merge conflict resolution.

testCases.xsd schema validation

I've tried to validate the schema of dmn and xml files from TestCases folder.

The dmn files are fine. Some of the xml tests are failing validation due to missing nillable attribute in the testCases.xsd schema.

Adding nillable = "true" to complexType valueType solved the issue:

	<xs:choice>
		<xs:element name="value" type="xs:anySimpleType" nillable="true"/>
		<xs:element name="component" maxOccurs="unbounded">

I'll raise a pull request if we are happy with this change.

Literal normalization

There are a few tests (e.g. tests for string() function) where the same expected result can be expressed in several equivalent ways. For example,

"2016-01-01T12:00:00+01:00" is equivalent to "2016-01-01T12:00:00@Europe/Paris"
"2016-08-01T12:00:00+02:00" is equivalent to "2016-08-01T12:00:00@Europe/Paris" if DST is considered.

We need to define a normal format for the expected results for all the types. It will help us to avoid discrepancies when tests are written by different DMN providers.

Type of FinancialMetrics in 0014-loan-comparison

Type of BKM FinancialMetrics is List<Metric>. Hence, the loop

<contextEntry>
    <variable name="metricsTable" typeRef="tns:tMetrics"/>
    <literalExpression>
        <text>for i in Bankrates return FinancialMetrics(i,RequestedAmt)</text>
    </literalExpression>
</contextEntry>

produces a List<List<Metric>> which is incompatible with the type of the contextEntry

Changing the type of the BKM to typeRef="tns:tMetric" should solve the issue.

Minimal set of automatically generated test cases

We generated a set of test cases for all models in current DMN TCK. Please see appended. If somebody with ready and configured runner might run them for consistency, it might be curious. Of course, tests are very formal and schematic. But still, it is interesting, if we comply to semantics. Thank you.

TestCases_AutoGenerated_2018_05_10.zip

Test case 0031: semantics of instanceof are not clear in the spec

Test case CL3/0031 defines an item definition with a number of fields, and then declares variables of that type that do not contain those fields.

The spec, AFAIK, is not clear on the semantics of "instance of". For instance, if an item definition tCoordinate declares fields x and y. A context with a single entry x is declared as a coordinate, is this valid? I.e., is an object with only field x, but no field y, still a tCoordinate?

Some engine implementations define the semantics of "instance of" using duck-typing, due to the underspecification of types in the spec, and on those implementations, the object defined above is not a tCoordinate.

There is an open ticket with the RTF for DMN 1.2 to clearly define the semantics of "instance of" and types in general in FEEL.

Until the RTF decides on a course of action, I suggest we avoid tests that include such ambiguity.

Test case CL3/0031 should either be refactored to not fall into this problem or be moved to non-compliant until DMN 1.2 is published and the test adapted accordingly.

Node type for Adjudication in 0004-lending-test-01.xml

Adjudication is a decision that returns a constant. The associated node in the test case is

	<inputNode name="Adjudication" type="decision">
		<value>ACCEPT</value>
	</inputNode>

In my view an InputNode is associated with TInputDatas and ResultNodes with DRG Elements that produces values

The associated node should be

	<resultNode name="Adjudication" type="decision">
		<expected>
			<value>ACCEPT</value>
		</expected>
	</resultNode>

This approach will help us catch changes in the definition of constant DRG Elements.

0101-feel-constants.dmn '+' not allowed for numeric literal according to grammar rules

I am currently testing the TCK for ACTICO GmbH against our DMN runtime currently under development.
We plan to integrate the TCK tests as part of our continous builds.

One test, that is failing is 0101-feel-constants.dmn for compliance-level-2
Our grammar does not allow the '+' character to express positive numbers.

<decision name="Decision3" id="d_Decision3"> <variable name="Decision3" typeRef="feel:number"/> <literalExpression> <text>+.872</text> </literalExpression> </decision>

According to the spec, this test does not seem to be compliant:
chapter 10.3.1.2, page 108, grammar rule 37: numeric literal = [ "-" ] , ( digits , [ ".", digits ] | ".", digits ) ;

Only '-' is allowed to express negative numbers. Did we miss something here?

Thanks for your feedback and good to be on board with ACTICO.

Day Time Duration division

0007-date-time.dmn contains a literal expression

                 <literalExpression>
			<text>dtDuration2 / oneHour</text>
		</literalExpression>

where both operands are typeRef="feel:dayTimeDuration"

I believe this is not supported in DMN 1.1. (see Table 47: Specific semantics of multiplication and division )

I think we should replace the literal expression with

                 <literalExpression>
			<text>dtDuration2.hours</text>
		</literalExpression>

and test the "hours" member.

Special Characters in String Literals

As discussed in today's meeting, we have just realized that there is no way in DMN to specify a special value in a DMN string literal. There is no escape notation, no backslash sequences. The spec says that a string literal starts with a quote and ends with a quote and all the characters in between must not be a quote or (newline?). There is support for having 'special characters' in any string value expressed as a literal. By 'special character' we mean quotes, newlines, tabs, or any character that you can not type directly into the text containing the literal expression.

There is in DMN the ability to use some backslash escaped character values in the syntax for variable names and function names and such. But this does not extend to string literal expressions.

Someone from the TCK should make a proposal to the RTF to adopt Java-like or JSON-like string literal expressions. This would allow one to specify newline with \n tab with \t quote with \" and any unicode value with \uxxxx style expression (where the x are hex numerals). Of course, \\ would be needed to represent a backslash.

Here is the spec for JSON: http://www.json.org/ which follows the conventions of Java.

JSON includes: \" ,\\ ,\/, \b, \f, \n, \r, \t, \u four-hex-digits

0007-date-time test - decision cTimezone- uses wrong property to access offset information

There are some issues for test 0007-date-time:

The PDF does not match the DMN file: Decision cTimezone of the DMN file is not listed in the PDF. The diagram shows cOffset instead. I think the diagram is valid, but the decision in the DMN file should be renamed to cOffset.
The literal expression of decision cTimezone is "Date-Time2.timezone" and the expected type is feel:dayTimeDuration. I think this is a mismatch. The return type of the timezone property is of type feel:string. Therefore the literal expression should be changed to: "Date-Time2.time offset". Since the decision was already renamed to cOffset (see first bullet point), the decision is now consistent.