Comments (7)

sffc commented on August 12, 2024

Hey @hsivonen, can you take a look at these? Here is how we configure the ICU4X collator:

    let mut options = CollatorOptions::new();
    options.strength = Some(Strength::Tertiary);

    // Ignore punctuation only if using shifted test.
    if let Some(ip) = ignore_punctuation {
        if *ip {
            options.alternate_handling = Some(AlternateHandling::Shifted);
        }
    }

    let collator: Collator = Collator::try_new(&locale!("en").into(), options).unwrap();

    let comparison = collator.compare(str1, str2);

Here is the corresponding JS code which passes the tests:

    // Locale if provided in the test data.
    let testLocale = undefined;
    if ('locale' in json) {
      testLocale = json['locale'];
    }
    let testCollOptions = {};
    if ('ignorePunctuation' in json) {
      testCollOptions = {
        ignorePunctuation: json['ignorePunctuation'],
      };
    }

    // Get other fields if provided
    let rules = undefined;
    if ('rules' in json) {
      rules = json['rules'];
    }

    // Set up collator object with optional locale and testOptions.
    let coll;
    try {
      coll = new Intl.Collator(testLocale, testCollOptions);

      let d1 = json['s1'];
      let d2 = json['s2'];

      const compared = coll.compare(d1, d2);

These cases have no locale option so I don't have reason to believe that the locale is at fault.

Can you advise on how to get the Intl.Collator behavior via ICU4X APIs?

from conformance.

hsivonen commented on August 12, 2024

Acknowledging that I've seen this. This needs stepping through in a debugger.

hsivonen commented on August 12, 2024

Do I understand correctly that your harness expects the outcome for all three of these to be Less, but the actual outcome is Equal? If so, why is Less expected?

  • {"label":"0010001","s1":"𑜿!","s2":"𑜿?","line":8661,"ignorePunctuation":true} CollatorOptions { strength: Some(Tertiary), alternate_handling: Some(Shifted), case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }

These are Equal because the inputs differ only by punctuation. The mechanism for ignoring punctuation shifts punctuation to the quaternary level, and you only compare at the tertiary level.

  • {"label":"0243300","s1":"𑛁b","s2":"𑜱b","line":47434} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }

These are Equal, because both strings are the same.

  • {"label":"0373766","s1":"龜a","s2":"龜a","line":177900} CollatorOptions { strength: Some(Tertiary), alternate_handling: None, case_first: None, max_variable: None, case_level: None, numeric: None, backward_second_level: None }

These are Equal, because both compatibility ideographs normalize to U+9F9C, and the collator always normalizes.
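For the first case above, the effect of shifting can be illustrated with a toy model. This is only a sketch, not ICU4X code: `toy_element` and `toy_compare` are hypothetical helpers, and the weights are made up for illustration (real weights come from the collation data). It assumes a simplified sort-key comparison in which zero weights are skipped at each level.

```rust
use std::cmp::Ordering;

/// Toy collation element: one weight per level (primary..quaternary).
/// A weight of 0 means "ignorable at this level".
type Element = [u16; 4];

/// Hypothetical weights, invented for this sketch only.
fn toy_element(c: char, shifted: bool) -> Element {
    let (primary, variable): (u16, bool) = match c {
        '!' => (0x0261, true),
        '?' => (0x0266, true),
        other => (0x2000 + (other as u32 % 0x1000) as u16, false),
    };
    if shifted && variable {
        // Shifted: ignorable on levels 1-3, the primary moves to level 4.
        [0, 0, 0, primary]
    } else if shifted {
        // Non-variable characters get a high quaternary weight.
        [primary, 0x20, 0x02, 0xFFFF]
    } else {
        // Non-shifted: punctuation keeps its primary weight.
        [primary, 0x20, 0x02, 0]
    }
}

/// Compare level by level, stopping after `levels` (3 = tertiary).
fn toy_compare(a: &str, b: &str, levels: usize, shifted: bool) -> Ordering {
    for level in 0..levels {
        let key = |s: &str| -> Vec<u16> {
            s.chars()
                .map(|c| toy_element(c, shifted)[level])
                .filter(|w| *w != 0)
                .collect()
        };
        let ord = key(a).cmp(&key(b));
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}
```

In this model, `toy_compare("a!", "a?", 3, true)` stops at the tertiary level and never sees the punctuation difference, whereas passing `4` as the level count, or `false` for `shifted`, lets the difference decide the result.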

hsivonen commented on August 12, 2024

I timed out on attempting to run the tests locally. Anyway, my guess is that this is a test harness bug, and the most suspicious part of the test harness is this bit:

    let result = comparison == Ordering::Less;


    // TODO: How to do this easier?
    let mut result_string = true;
    if !result {
        result_string = false;
    }

sffc commented on August 12, 2024

Thank you @hsivonen!

Yeah, @sven-oly, here is a difference in code:

In Rust you say:

    let result = comparison == Ordering::Less;

In JS you say:

      let result_bool = true;
      if (compared > 0) {
        result_bool = false;
      }

So it appears that Rust is looking for < and JS is looking for <=, which means an Equal result fails in the Rust harness but passes in the JS one.
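The two checks can be put side by side in Rust. This is a minimal sketch using only `std::cmp::Ordering`; the function names `passes_strict` and `passes_non_strict` are hypothetical and do not come from the harness.

```rust
use std::cmp::Ordering;

/// Mirrors the Rust harness line quoted above:
/// passes only when s1 sorts strictly before s2.
fn passes_strict(comparison: Ordering) -> bool {
    comparison == Ordering::Less
}

/// Mirrors the JS harness logic quoted above:
/// fails only when s1 sorts after s2, so Equal passes.
fn passes_non_strict(comparison: Ordering) -> bool {
    comparison != Ordering::Greater
}
```

The two functions disagree only on `Ordering::Equal`, which is the outcome reported for all three cases in this thread. If the JS behavior is the intended one, `comparison.is_le()` from the standard library expresses the same non-strict check, and the resulting `bool` can be used directly without the extra `result_string` variable.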

sven-oly commented on August 12, 2024
