Comments (7)
Hey @hsivonen, can you take a look at these? Here is how we configure the ICU4X collator:
let mut options = CollatorOptions::new();
options.strength = Some(Strength::Tertiary);
// Ignore punctuation only if using shifted test.
if let Some(ip) = ignore_punctuation {
    if *ip {
        options.alternate_handling = Some(AlternateHandling::Shifted);
    }
}
let collator: Collator = Collator::try_new(&locale!("en").into(), options).unwrap();
let comparison = collator.compare(str1, str2);
Here is the corresponding JS code which passes the tests:
// Locale if provided in the test data.
let testLocale = undefined;
if ('locale' in json) {
  testLocale = json['locale'];
}
let testCollOptions = {};
if ('ignorePunctuation' in json) {
  testCollOptions = { ignorePunctuation: json['ignorePunctuation'] };
}
// Get other fields if provided
let rules = undefined;
if ('rules' in json) {
  rules = json['rules'];
}
// Set up collator object with optional locale and testOptions.
let coll;
try {
  coll = new Intl.Collator(testLocale, testCollOptions);
  let d1 = json['s1'];
  let d2 = json['s2'];
  const compared = coll.compare(d1, d2);
These cases have no locale option, so I have no reason to believe that the locale is at fault.
Can you advise on how to get the Intl.Collator behavior via ICU4X APIs?
from conformance.
Acknowledging that I've seen this. This needs stepping through in a debugger.
Do I understand correctly that your harness wants the outcome for all three of these to be Less, but the outcome is Equal? If so, why is Less expected?
- {"label":"0010001","s1":"𑜿!","s2":"𑜿?","line":8661,"ignorePunctuation":true} CollatorOptions { strength: Some(Tertiary), alternate_handling: Some(Shifted), case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }
These are Equal because the inputs differ only by punctuation: the mechanism for ignoring punctuation shifts punctuation to the quaternary level, and you only compare on the tertiary level.
- {"label":"0243300","s1":"𑛁b","s2":"𑜱b","line":47434} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }
These are Equal because both strings are the same.
- {"label":"0373766","s1":"龜a","s2":"龜a","line":177900} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }
These are Equal because both compatibility ideographs normalize to U+9F9C, and the collator always normalizes.
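To make the first case concrete, here is a toy model of why strings that differ only by punctuation come out Equal. It is a sketch under loud assumptions, not ICU4X's algorithm: the hypothetical helper `compare_ignoring_punctuation` simply drops ASCII punctuation and compares the remaining code points, whereas a real collator shifts variable collation elements to a quaternary weight and compares collation weights.

```rust
use std::cmp::Ordering;

/// Toy stand-in for "shifted" alternate handling at tertiary strength.
/// Assumption: dropping ASCII punctuation approximates "punctuation only
/// matters at the quaternary level, which is never consulted".
/// The remaining characters are compared by code point, not by collation weight.
fn compare_ignoring_punctuation(s1: &str, s2: &str) -> Ordering {
    let keep = |c: &char| !c.is_ascii_punctuation();
    s1.chars().filter(keep).cmp(s2.chars().filter(keep))
}
```

With this model, `compare_ignoring_punctuation("a!", "a?")` is `Ordering::Equal`, while `compare_ignoring_punctuation("a!", "b?")` is still `Ordering::Less`, because only the punctuation is ignored.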
I timed out on attempting to run the tests locally. Anyway, my guess is that this is a test harness bug, and the most suspicious part of the test harness is this bit:
let result = comparison == Ordering::Less;
// TODO: How to do this easier?
let mut result_string = true;
if !result {
    result_string = false;
}
Thank you @hsivonen!
Yeah, @sven-oly, here is a difference in code:
In Rust you say:
let result = comparison == Ordering::Less;
In JS you say:
let result_bool = true;
if (compared > 0) {
  result_bool = false;
}
So it appears that Rust is checking for < while JS is checking for <=.
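The difference between the two checks can be sketched side by side in Rust. The function names `passes_strict` and `passes_lenient` are hypothetical, introduced only for illustration. The first mirrors the Rust snippet quoted above, and the second is one possible way to mirror the JS check, not necessarily how the harness should be changed.

```rust
use std::cmp::Ordering;

/// Strict check, as in the quoted Rust snippet: passes only when s1 < s2.
fn passes_strict(comparison: Ordering) -> bool {
    comparison == Ordering::Less
}

/// Lenient check, mirroring the quoted JS logic (fail only when compared > 0):
/// passes when s1 <= s2.
fn passes_lenient(comparison: Ordering) -> bool {
    comparison != Ordering::Greater
}
```

The two predicates agree on `Ordering::Less` and `Ordering::Greater` and differ only on `Ordering::Equal`, where `passes_strict` returns `false` and `passes_lenient` returns `true`. That matches the three cases discussed above, all of which compare as Equal.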