
Unicode & CLDR Data Driven Test

This repository provides tools and procedures for verifying that an implementation is working correctly according to the data-based specifications. The tests are implemented on several platforms including NodeJS (JavaScript), ICU4X (RUST), ICU4C, etc. Additional programming platforms may be added to use the test driver framework.

The goal of this work is to provide an easy-to-use framework for verifying that an implementation of ICU functions agrees with the required behavior. When a DDT test passes, it is a strong indication that output is consistent across platforms.

Data Driven Test (DDT) focuses on functions that accept data input such as numbers, date/time data, and other basic information. The specifications indicate the expected output from implementations when given the data and argument settings for each of the many individual data items.

Note that these tests are only part of the testing required for ICU-compliant libraries. Many additional tests are implemented in the

  • !!! TODO: reference to data specifications

Components of Data Driven Test

ICU versions for data and testing

Each ICU test program is built with a specific version of ICU & CLDR data. These versions are updated periodically. For each ICU version, there is a specific CLDR version, e.g., ICU73 uses data from CLDR 43, although multiple ICU releases may depend on the same CLDR data.

For this reason, a particular ICU version can be specified for the test data, the test executor, or both.

Each part of Data Driven Testing is designed to handle a specific ICU version.

  • Data generation uses specifications starting with ICU versions 70, 71, etc. For each ICU release, these data should be updated.

  • Test execution allows setting the data version explicitly with a command line argument --icuversion that points to the indicated test data. The ICU version of each test executor platform is requested at the start of the test driver. Output directories are created under each platform for the results of running against a particular ICU version, e.g., testOutput/node/icu73.

  • Test verification uses ICU version information in the test output files to match them with the corresponding expected results. Verification output appears in the testResults subdirectory for each executor, e.g., testOutput/rust/icu71.
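
Illustrative only: a tiny sketch of how the per-platform, per-ICU-version directories described above fit together. The real test driver and verifier build these paths themselves; the names below simply follow the examples in the text.

from pathlib import Path

def result_dir(root: str, platform: str, icu_version: str) -> Path:
    """e.g. result_dir("testOutput", "node", "icu73") -> testOutput/node/icu73"""
    return Path(root) / platform / icu_version

print(result_dir("testOutput", "node", "icu73"))    # testOutput/node/icu73
print(result_dir("testResults", "rust", "icu71"))   # testResults/rust/icu71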

Architectural Overview

Conceptually, there are three main functional units of the DDT implementation:

Conceptual model of Data Driven Testing

Data generation

Utilizes Unicode (UTS-35) specifications, CLDR data, and existing ICU test data. Existing ICU test data has the advantage of already being structured for data driven testing; in many cases it is formatted in a way that simplifies adding new tests, and it contains edge and error cases.

Data generation creates two files:

  • Test data instance: a JSON file containing the type of test and additional information on the environment and version of data.

The test type is indicated with the "Test scenario" field.

Individual data tests are stored as an array of items, each with a label and parameters to be set for computing a result.

Example line for collation_short:

{
  "description": "UCA conformance test. Compare the first data\n   string with the second and with strength = identical level\n   (using S3.10). If the second string is greater than the first\n   string, then stop with an error.",
  "Test scenario": "collation_short",
  "tests": [
    {
      "label": "0000000",
      "string1": "\u0009!",
      "string2": "\u0009?"
    },
  • A required test result file (JSON) containing the expected results from each of the inputs. This could be called the “golden data”.

    Sample verify data:

    {"Test scenario": "collation_short",
    "verifications": [
      {
        "label": "0000000",
        "verify": "True"
      },
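
A minimal sketch (not the project's actual verifier code) of how a test data file and its "golden" verify file could be joined by label, using the field names shown in the samples above:

import json

def load_expected(verify_path):
    """Map each test label to its expected result from the verify file."""
    with open(verify_path, encoding="utf-8") as f:
        verify = json.load(f)
    return {v["label"]: v["verify"] for v in verify["verifications"]}

def iter_cases(test_data_path):
    """Yield (label, test case) pairs from the test data instance file."""
    with open(test_data_path, encoding="utf-8") as f:
        data = json.load(f)
    for case in data["tests"]:
        yield case["label"], case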
    

Test Execution

Test execution consists of a Test Driver script and implementation-specific executables. The test driver executes each of the configured test implementation executables, specifying the input test data and the location for storing results. STDIN and STDOUT are the defaults.

Test executors

Each test executor platform contains a main routine that accepts a test request from the test driver, calling the tests based on the request data.

Each executor parses the data line sent by the test driver, extracting elements to set up the function call for the particular test.

For each test, the needed functions and other objects are created and the test is executed. Results are saved to a JSON output file.

See executors/README for more details.
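
The sketch below is a simplified, hypothetical executor loop in Python. It assumes one JSON test case per STDIN line and one JSON result per STDOUT line, mirroring the protocol described above; it is not one of the real executors (NodeJS, Rust, C++, Dart), and the placeholder comparison stands in for the actual library call.

import json
import sys

def run_one_test(case):
    """Hypothetical dispatch: run the library call named by the test type."""
    if case.get("test_type") == "coll_shift_short":
        # Placeholder comparison; a real executor would call the platform's collator.
        result = case["string1"] <= case["string2"]
        return {"label": case["label"], "result": str(result)}
    return {"label": case.get("label"), "error": "unknown test type"}

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        case = json.loads(line)
        print(json.dumps(run_one_test(case)))
    except Exception as exc:  # report the failure, keep processing other lines
        print(json.dumps({"error": str(exc), "input": line}))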

Verification

Each test is matched with the corresponding data from the required test results. A report of the test results is generated. Several kinds of status values are possible for each test item:

  • Success: the actual result agrees with the expected result.
  • Failure: a result is generated, but it does not match the expected value.
  • No test run: the test was not executed by the test implementation for the data item.
  • Error: the test resulted in an exception or other behavior not anticipated for the test case.
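
A small sketch of how a single test item might be classified into these status values, given the actual result from the executor and the expected value from the verify data. Field names follow the samples earlier in this document; the real verifier's logic may differ.

def classify(actual, expected):
    """Return one of the four status values for a single labeled test item."""
    if actual is None:
        return "No test run"   # the executor produced no result for this label
    if "error" in actual:
        return "Error"         # exception or other unanticipated behavior
    if actual.get("result") == expected:
        return "Success"
    return "Failure"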

Open questions for the verifier

  • What should be done if the test driver fails to complete? How can this be determined?

    • Proposal: each test execution shall output a completion message, indicating that the test driver finished its execution normally, i.e., did not crash. A sketch of such a check is shown below.
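
A hypothetical illustration of that proposal: the executor ends its output with a completion marker, and the driver treats a missing marker as an abnormal exit. The marker format here is assumed for illustration and is not part of the current protocol.

COMPLETION_MARKER = '{"completed": true}'  # assumed format, not the real protocol

def executor_finished(output_lines):
    """True only if the executor's final output line is the completion marker."""
    return bool(output_lines) and output_lines[-1].strip() == COMPLETION_MARKER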

How to use DDT

In its first implementation, Data Driven Test uses data files formatted with JSON structures describing tests and parameters. The data directory structure is set up as follows:

A directory testData containing

  • Test data files for each type of test, e.g., collation, numberformat, displaynames, etc. Each file contains tests with a label, input, and parameters.
  • Verify files for each test type. Each contains a list of test labels and expected results from the corresponding tests.

Directory testOutput

This contains a subdirectory for each executor. The output file from each test is stored in the appropriate subdirectory. Each test result contains the label of the test and the result of the test. This may be a boolean or a formatted string.

The results file contains information identifying the test environment as well as the result from each test. As an example, collation test results from the testOutput/node directory are shown here:

{
  "platform": {
    "platform": "NodeJS",
    "platformVersion": "v18.7.0",
    "icuVersion": "71.1"
  },
  "test_environment": {
    "test_language": "nodejs",
    "executor": "/usr/bin/nodejs ../executors/nodejs/executor.js",
    "test_type": "collation_short",
    "datetime": "10/07/2022, 16:19:00",
    "timestamp": "1665184740.2130146",
    "inputfile": "/usr/local/google/home/ccornelius/DDT_DATA/testData/icu73/collation_testt.json",
    "resultfile": "/usr/local/google/home/ccornelius/DDT_DATA/testOutputs/node/icu73/collation_test.json",
    "icu_version": "ICUVersion.ICU71",
    "cldr_version": "CLDRVersion.CLDR41",
    "test_count": "192707"
  },
  "tests": [
    {
      "label": "0000000",
      "result": "True"
    },
    {
      "label": "0000001",
      "result": "True"
    },
    ...
  ]
}

Directory testReports

This directory stores summary results from verifying the tests performed by each executor. Included in the testReports directory are:

  • index.html: shows all tests run and verified for all executors and versions. Requires a webserver to display this properly.

  • exec_summary.json: contains summarized results for each pair (executor, icu version) in a graphical form. Contains links to details for each test pair.

  • subdirectory for each executor, each containing verification of the tested icu versions, e.g., node/, rust/, etc.

Under each executor, one or more ICU version subdirectories are created, each containing:

  • verifier_test_report.html - for showing results to a user via a web server

  • verifier_test_report.json - containing verifier output for programmatic use

  • failing_tests.json - a list of all failing tests with input values

  • pass.json - list of test cases that match their expected results

  • test_errors.json - list of test cases where the executor reported an error

  • unsupported.json - list of test cases that are not expected to be supported in this version

The verifier_test_report.json file contains information on tests run and comparison with the expected results. At a minimum, each report contains:

  • The executor and test type
  • Date and time of the test
  • Execution information, from the testResults directory
  • Total number of tests executed
  • Total number of tests failing
  • Total number of tests succeeding
  • Number of exceptions identified in the test execution. This may include information on tests that could not be executed, along with the reasons for the problems.
  • Analysis of test failures, if available. This may include summaries of string differences such as missing or extra characters or substitutions found in output data.
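
As a rough sketch, the summary counts listed above could be derived from the per-test verification statuses as follows. The output keys here are illustrative and are not the report's actual field names.

from collections import Counter

def summarize(statuses):
    """Aggregate per-test statuses into overall totals."""
    counts = Counter(statuses)  # e.g. Counter({"Success": 190000, "Failure": 2500, ...})
    return {
        "test_count": sum(counts.values()),
        "passing_tests": counts["Success"],
        "failing_tests": counts["Failure"],
        "test_errors": counts["Error"],
        "not_run": counts["No test run"],
    }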

Contributor setup

Requirements to run Data Driven Testing code locally:

  • Install the Python package jsonschema
    • In a standard Python environment, you can run
      pip install jsonschema
      
    • Some operating systems (ex: Debian) might prefer that you install the OS package that encapsulates the Python package
      sudo apt-get install python-jsonschema
      
  • Install the minimum Rust version supported by ICU4X
    • The minimum supported Rust version ("MSRV") can be found in the rust-toolchain.toml file
    • To view your current default Rust version (and other locally installed Rust versions):
      rustup show
      
    • To update to the latest Rust version:
      rustup update
      
  • Install logrotate
      sudo apt-get install logrotate
      

History

Data Driven Test was initiated in 2022 at Google. The first release of the package was delivered in October 2022.

Copyright & Licenses

Copyright © 2022-2024 Unicode, Inc. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the United States and other countries.

The project is released under the terms of the LICENSE file.

A CLA is required to contribute to this project - please refer to the CONTRIBUTING.md file (or start a Pull Request) for more information.


conformance's Issues

More flexible source data download in testdata generator

testdata_gen.py hardcodes the source of data using a Github URL for a file from a specific version of ICU: https://github.com/unicode-org/conformance/blob/main/testgen/testdata_gen.py#L334

Instead, we should:

  • Add options to de-flake the download process
    • Separate the download step from the data generation step
    • Add an option to download a file vs. using a local copy
    • Show the user download progress
  • Handle versioning of data (allow different versions of input)

Number format tests include incorrect units

In many of the test failures for number format, the reason is that "furlong" is not a recognized unit. I think that the test data is incorrect, however. Perhaps the unit is not correctly set for many of the test cases.

test input issues for NumberFormatter / ICU4J

Some of these issues are a part of the test framework (ex: schema definition), some might be related to the ICU4J executor, some might be for the ICU4J NumberFormatter APIs.

ICU4X Collation failures

ICU4X in conformance testing shows more than 20% of the tests failing, seen here:
ICU4X/icu73

The actual collator options are seen in the test failure detail, with a few examples here. The inputs are s1 and s2, and the actual options used are given:

  • {"label":"0010001","s1":"𑜿!","s2":"𑜿?","line":8661,"ignorePunctuation":true} CollatorOptions { strength: Some(Tertiary), alternate_handling: Some(Shifted), case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }
  • {"label":"0243300","s1":"𑛁b","s2":"𑜱b","line":47434} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }
  • {"label":"0373766","s1":"龜a","s2":"龜a","line":177900} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }

We need some debugging help with this!

Integrate schema validation into executables

For the executables that we run (test data generator, test executor), we should validate the inputs to the executable against the schema within the executable, right before we use them.

So if step A generates output a that goes into step B that generates b, ..., then we want step B to validate the values in a right before it processes them.

That protects us against data inconsistency from the stale-data problem.
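
A minimal sketch of the idea, using the jsonschema package mentioned in the contributor setup: validate each incoming test case against its schema right before it is used. The schema file path is made up for illustration; the repository's actual schema files and loading code will differ.

import json
from jsonschema import ValidationError, validate

with open("schema/collation_short.schema.json", encoding="utf-8") as f:  # hypothetical path
    SCHEMA = json.load(f)

def validated(case):
    """Validate one incoming test case right before it is processed."""
    try:
        validate(instance=case, schema=SCHEMA)
    except ValidationError as err:
        raise ValueError(f"Rejecting test case {case.get('label')}: {err.message}") from err
    return case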

Add flexible pagination in test reports

For test reports, add pagination to speed review of test failures / errors / unimplemented options. This could use JSON data loaded directly rather than creating tables in the Python code.

Set locale field for collation tests

Also, the existing collation tests implicitly default to the root locale, which is und. Updating these tests to have a specified locale means that we set the locale to und.

end-to-end not exiting on fatal Rust executor errors

The Rust executor is getting an error when trying to execute sendOneLine, and it does so for every batch of 10,000 tests that it sends.

Ex:

Testing ../executors/rust/target/release/executor / coll_shift_short. 190,000 of 192,707
Testing ../executors/rust/target/release/executor / coll_shift_short. 191,000 of 192,707
Testing ../executors/rust/target/release/executor / coll_shift_short. 192,000 of 192,707
!!! sendOneLine fails: input => {"label": "0190000", "string1": "\u2eb6!", "string2": "\u2eb6?", "test_type": "coll_shift_short"}
{"label": "0190001", "string1": "\u2eb6?", "string2": "\u2eb7!", "test_type": "coll_shift_short"}
...
#EXIT<. Err = [Errno 2] No such file or directory: '../executors/rust/target/release/executor'
!!!!!! processBatchOfTests: "platform error": "None"

Issues:

  • The Python script running everything logs the entire batch of test cases upon this error. We shouldn't print those 10,000 lines.
  • In cases where the Python script can't get the executors to do basic things properly, the Python script should exit with a non-zero exit code

Bonus points: in the future, we can use a logging library so that we can more easily control the behavior differently on our local machines vs. on CI

Using logging instead of print

For the test driver and test data generator in Python, we should use logging instead of just printing to the console.

At the least, it's equivalent. But the potential benefits are:

  • logging methods (ex: logging.debug(), logging.error()) allow us to indicate the severity of a statement
  • we can control what level we view logs at for testing mode, debugging mode, and production mode
  • we can configure the format of the messages if needed (add timestamps, etc or not)
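
A sketch of the kind of change being proposed; the level and format shown are examples, not settled choices for the project.

import logging

logging.basicConfig(
    level=logging.INFO,  # e.g. DEBUG locally, WARNING on CI
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("testdriver")

log.debug("Sent batch of %d tests to %s", 10000, "rust")  # hidden unless DEBUG is enabled
log.error("Executor failed to start: %s", "../executors/rust/target/release/executor")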

Define schema of test case data JSON

Some options for defining a schema:

  • JSON Schema
  • Protobuf

JSON Schema is a natural first choice. Also, it would take more effort to deal with Protobuf (perhaps too prohibitive in statically typed languages, even if possible in dynamic ones).

We only need a single tool that uses JSON Schema, since the purpose is to validate the JSON test data cases once they are generated by the test generation tool.

Fix handling of non-matching surrogates in collation data.

The current test generator doesn't create tests for collation data when either of the test strings contains an incomplete surrogate. These are recorded in the logging files but they are not stored in any data or mentioned in any dashboards.

Leave input line untransformed in the error handling

Revisit #145 (comment), where an executor encounters an error in processing a test case. Instead of returning the test case input line as is in the error response, the error handling code is transforming the input line before including it in the error response. This transformation seems unintended, unless there is a good reason.

@sven-oly

verifier crashes

From a fresh checkout of main, when running sh generateDataAndRun.sh, I get the following:

#EXIT<. Err = [Errno 2] No such file or directory: '../executors/rust/target/release/executor'
!!!!!! processBatchOfTests: "platform error": "None"

Traceback (most recent call last):
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 111, in <module>
    main(sys.argv)
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 101, in main
    driver.runPlans()
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 91, in runPlans
    plan.runPlan()
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 86, in runPlan
    self.runOneTestMode()
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 219, in runOneTestMode
    numErrors = self.runAllSingleTests(per_execution)
  File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 279, in runAllSingleTests
    allTestResults.extend(self.processBatchOfTests(testLines))
TypeError: 'NoneType' object is not iterable
1
Verifier starting on 9 verify cases
  Verifying test coll_shift_short on rust executor
Cannot load ../TEMP_DATA/testResults/rust/coll_test_shift.json result data: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 500, in <module>
    main(sys.argv)
  File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 491, in main
    verifier.verifyDataResults()
  File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 189, in verifyDataResults
    self.compareTestToExpected()
  File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 267, in compareTestToExpected
    self.report.platform_info = self.resultData['platform']
AttributeError: 'Verifier' object has no attribute 'resultData'. Did you mean: 'result_path'?
1

Speed up end-to-end CI

We can speed up our end-to-end CI in different ways:

  • Cache Rust Cargo build artifacts
  • Split up executor work per-platform (or per-{platform, version})

Validate test case input and output at runtime

Now that we have schemas for test input and output, we should enable runtime validation of those test inputs & outputs across the board.

Doing so will enable the realization of a large chunk of the value proposition for having the schemas. It would ensure that all test cases passed to executors, and all data received from executors, adhere to the contracts defined by the schemas.

Executor for dart_native needs environment setup to execute

Testdriver with dart_native gives this in Linux environment. This needs to be fixed to run dart_native tests.

----> STDOUT= ><

!!!!!! !!!! ERROR IN EXECUTION: 255. STDERR = Unhandled exception:
UnimplementedError: Insert diplomat bindings here
#0 Collation4X.compareImpl (package:intl4x/src/collation/collation_4x.dart:16)
#1 Collation.compare (package:intl4x/src/collation/collation.dart:28)
#2 testCollator (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:74)
#3 main. (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:49)
#4 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#5 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#6 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#7 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#8 _StreamController._add (dart:async/stream_controller.dart:650)
#9 _StreamController.add (dart:async/stream_controller.dart:598)
#10 _Socket._onData (dart:io-patch/socket_patch.dart:2381)
#11 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#12 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#13 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#14 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#15 _StreamController._add (dart:async/stream_controller.dart:650)
#16 _StreamController.add (dart:async/stream_controller.dart:598)
#17 new _RawSocket. (dart:io-patch/socket_patch.dart:1899)
#18 _NativeSocket.issueReadEvent.issue (dart:io-patch/socket_patch.dart:1356)
#19 _microtaskLoop (dart:async/schedule_microtask.dart:40)
#20 _startMicrotaskLoop (dart:async/schedule_microtask.dart:49)
#21 _runPendingImmediateCallback (dart:isolate-patch/isolate_patch.dart:123)
#22 _RawReceivePort._handleMessage (dart:isolate-patch/isolate_patch.dart:190)
WARNING:root:!!!!!! process_batch_of_tests: "platform error": "!!!! ERROR IN EXECUTION: 255. STDERR = Unhandled exception:
UnimplementedError: Insert diplomat bindings here
#0 Collation4X.compareImpl (package:intl4x/src/collation/collation_4x.dart:16)
#1 Collation.compare (package:intl4x/src/collation/collation.dart:28)
#2 testCollator (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:74)
#3 main. (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:49)
#4 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#5 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#6 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#7 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#8 _StreamController._add (dart:async/stream_controller.dart:650)
#9 _StreamController.add (dart:async/stream_controller.dart:598)
#10 _Socket._onData (dart:io-patch/socket_patch.dart:2381)
#11 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#12 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#13 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#14 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#15 _StreamController._add (dart:async/stream_controller.dart:650)
#16 _StreamController.add (dart:async/stream_controller.dart:598)
#17 new _RawSocket. (dart:io-patch/socket_patch.dart:1899)
#18 _NativeSocket.issueReadEvent.issue (dart:io-patch/socket_patch.dart:1356)
#19 _microtaskLoop (dart:async/schedule_microtask.dart:40)
#20 _startMicrotaskLoop (dart:async/schedule_microtask.dart:49)
#21 _runPendingImmediateCallback (dart:isolate-patch/isolate_patch.dart:123)
#22 _RawReceivePort._handleMessage (dart:isolate-patch/isolate_patch.dart:190)
"

Create simple clustering of test failure/error results

When there are many test failures or errors, there are too many instances to report each one individually. Many of the test cases might look the same, but they are reported without any subgrouping.

It might be helpful to implement some simple unsupervised clustering of the input values (say, taking the top 10 most frequent values per input struct key) and report the top 10 counts.
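
A very simple version of this (frequency counting rather than true clustering) could look like the following, where each failing case is a dict of input fields:

from collections import Counter, defaultdict

def top_values_per_field(failing_cases, top_n=10):
    """For each input field, count its values and keep the top_n most frequent."""
    counters = defaultdict(Counter)
    for case in failing_cases:
        for field, value in case.items():
            counters[field][str(value)] += 1
    return {field: counter.most_common(top_n) for field, counter in counters.items()}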

Fix version labeling to use ICU4X version, not Rust

In the summary page and also in the detail page, the platform version is shown but not the ICU4X version, e.g.,
"platform: {'cldrVersion': '43.1.0', 'icuVersion': 'icu4x/2023-05-02/73.x', 'platform': 'rust', 'platformVersion': '1.73.0'}"

This should show the ICU4X version, e.g., 1.3 or 1.4, not "1.73".

Must deal with missing or incorrect icu testdata version

The testdriver code assumes that the --icu_version parameter for the test driver is defined and that it refers to existing data. However, the value may be missing or may not be one of the defined test sets.

Proposed solution: check all defined testdata directories. If icu_version is not defined or a bad value is given, use the highest-numbered ICU version, e.g., a value of "xyz" will look at subdirectory names and pick the one that sorts highest.

For example, if the directories are [icu73, icu72, and icu71], a missing or incorrect value for icu_version will select icu73 data for testing.
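
A sketch of the proposed fallback, assuming the test data root contains one subdirectory per ICU version as in the example above; the directory-scanning details are illustrative, not the test driver's actual code.

from pathlib import Path

def resolve_icu_version(test_data_root, requested=None):
    """Use the requested version if it exists; otherwise pick the highest-sorting icu* directory."""
    versions = sorted(p.name for p in Path(test_data_root).iterdir()
                      if p.is_dir() and p.name.startswith("icu"))
    if requested in versions:
        return requested
    return versions[-1]  # e.g. ["icu71", "icu72", "icu73"] -> "icu73", even for requested="xyz"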

Configure logging

Configure logging to have a single global settings file/config.

Also, make the logging level in CI high enough not to show test execution progress.

Use HTML files to do HTML templating

Created from comment at #67 (comment)

+1 from me on this. Doing so should be win-win for everyone. It will probably feel like using jQuery.

It seems like the best way to do this in Python is using the Beautiful Soup library (docs). I've used JSoup in Java before, and that was really nice (powerful and easy). Beautiful Soup and JSoup seem to be comparable.

Using a regular HTML file as the input for HTML templating, rather than some special syntax that requires some special engine to interpret, is a simpler way to go. (Examples of special syntax HTML templating that are all-too-common still: ex1, ex2). The simplicity is that you keep code in Python along with the caller to the library, you keep markup in HTML, and you don't mix the two. Not having to deal with yet another syntax is a follow-on benefit.
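
A sketch of the plain-HTML templating idea with Beautiful Soup; the template file name and element ids below are made up for illustration.

from bs4 import BeautifulSoup

with open("report_template.html", encoding="utf-8") as f:  # hypothetical template file
    soup = BeautifulSoup(f, "html.parser")

# Fill in values by element id; markup stays in HTML, logic stays in Python.
soup.find(id="test_type").string = "collation_short"
soup.find(id="failing_count").string = "2513"

with open("verifier_test_report.html", "w", encoding="utf-8") as f:
    f.write(str(soup))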

Remove `DDT_DATA` dir and scripts referencing it

The DDT_DATA directory is obsolete at this point, and it seems to be just a copy of a portion of the TEMP_DATA directory that gets created locally to store intermediate files.

We should remove the DDT_DATA directory. At this point, all scripts referencing that directory are obsolete, too.

Do not remove any Python code references to ddt_data. The Python identifier is the alias used for datasets.py when importing that Python file/module.

Rename 'rust' to 'icu4x' in testdriver, executor code

The code has been using "Rust" instead of "ICU4X". We should rename accordingly.

Since the thing under test is an i18n library, we should rename our code according to the library name under test. The version number of the language runtime needed for the library version is a separate thing, and may not correspond 1:1 anyways (ex: ICU4X 1.0 and ICU4X 1.1 were developed against Rust 1.61, ICU4X 1.2 was developed against Rust 1.68.2).
