go-natural's People

Contributors

bisgardo

go-natural's Issues

Optimization: Don't process numbers redundantly

There are two small optimization opportunities for Natural:

  • There's no need to check whether the byte on the right-hand side is a digit if the one on the left-hand side isn't.

  • Iterate over both numbers simultaneously: If one number is shorter, we only need to keep reading the other number until it's been determined that it's larger.

Add benchmarks to measure if it makes a difference before implementing this.
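
A minimal sketch of the second point, not taken from the library: the function name `compareLeadingNumbers` and the helper `isDigit` are hypothetical, and the sketch assumes that leading zeros have already been skipped in both inputs. It walks both digit runs in lockstep, remembers the first differing digit, and stops as soon as one run ends.

```go
package main

import "fmt"

// isDigit reports whether b is an ASCII digit.
func isDigit(b byte) bool { return '0' <= b && b <= '9' }

// compareLeadingNumbers compares the digit runs at the start of a and b.
// It returns -1, 0, or 1. Assumption: leading zeros were skipped by the caller.
func compareLeadingNumbers(a, b string) int {
	diff := 0 // result of the first differing digit, if any
	for i := 0; ; i++ {
		aDigit := i < len(a) && isDigit(a[i])
		bDigit := i < len(b) && isDigit(b[i])
		switch {
		case aDigit && bDigit:
			if diff == 0 && a[i] != b[i] {
				if a[i] < b[i] {
					diff = -1
				} else {
					diff = 1
				}
			}
		case aDigit:
			return 1 // a's number has more digits, so it is larger
		case bDigit:
			return -1 // b's number has more digits, so it is larger
		default:
			return diff // same length: the first differing digit decides
		}
	}
}

func main() {
	fmt.Println(compareLeadingNumbers("129", "13"))   // 1
	fmt.Println(compareLeadingNumbers("12abc", "123")) // -1
	fmt.Println(compareLeadingNumbers("42x", "42y"))  // 0
}
```

Whether this is measurably faster than the current approach is exactly what the proposed benchmarks would have to show.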

Optimization: Don't compute numbers

Computing the numbers contained in the compared strings actually isn't necessary:

Longer numbers will always be larger. Therefore, after skipping leading zeros and resolving the lengths of the numbers, we can simply compare the individual digits as regular bytes.

The rule that numbers are always greater than non-numbers should take care of the rest.
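
The idea can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `compareDigitRuns` is a hypothetical name, and both arguments are assumed to consist of ASCII digits only.

```go
package main

import "fmt"

// compareDigitRuns compares two strings of ASCII digits by numeric value
// without converting them to integers. It returns -1, 0, or 1.
// Assumption: a and b contain only the bytes '0' through '9'.
func compareDigitRuns(a, b string) int {
	// Leading zeros do not affect the value.
	for len(a) > 0 && a[0] == '0' {
		a = a[1:]
	}
	for len(b) > 0 && b[0] == '0' {
		b = b[1:]
	}
	// Without leading zeros, the longer number is the larger one.
	if len(a) != len(b) {
		if len(a) < len(b) {
			return -1
		}
		return 1
	}
	// Equal length: byte-wise order of the digits equals numeric order.
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

func main() {
	fmt.Println(compareDigitRuns("007", "7"))   // 0
	fmt.Println(compareDigitRuns("10", "9"))    // 1
	fmt.Println(compareDigitRuns("123", "124")) // -1
	// Works for digit runs that would overflow a 64-bit integer.
	fmt.Println(compareDigitRuns("99999999999999999999999", "100000000000000000000000")) // -1
}
```

A side effect of never computing the value is that arbitrarily long digit runs are handled without overflow.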

Test correctness for strings with multi-byte characters

Strings are currently indexed on the byte level. As Natural should be correct given general UTF-8 encoded strings, this means that characters may be accessed "within" their multi-byte encoding.

In a UTF-8 encoded string, every byte whose leading bit is 0 is a single-byte (ASCII) character, so every byte of a multi-byte character has its leading bit set to 1. Since the ASCII digits all have a leading 0 bit, the bytes of multi-byte characters can never be mistaken for digits.
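
One way to check this claim exhaustively is sketched below, using only the standard `unicode/utf8` package. The function name `multiByteRunesAvoidASCII` is hypothetical; it encodes every valid non-ASCII code point and confirms that none of the resulting bytes falls in the ASCII range, which contains the digits.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// multiByteRunesAvoidASCII reports whether every byte in the UTF-8 encoding
// of every valid non-ASCII code point has its high bit set (value >= 0x80).
func multiByteRunesAvoidASCII() bool {
	var buf [utf8.UTFMax]byte
	for r := rune(utf8.RuneSelf); r <= utf8.MaxRune; r++ {
		if !utf8.ValidRune(r) {
			continue // surrogate halves cannot be encoded
		}
		n := utf8.EncodeRune(buf[:], r)
		for _, b := range buf[:n] {
			if b < utf8.RuneSelf {
				return false // an ASCII byte inside a multi-byte encoding
			}
		}
	}
	return true
}

func main() {
	fmt.Println(multiByteRunesAvoidASCII()) // true
	fmt.Printf("% x\n", "é7")               // c3 a9 37: only the last byte is a digit
}
```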

The only potential "issue" is that multi-byte characters may not compare to each other the "standard" way: as the characters are compared byte for byte, the order depends on the precise byte sequence used to represent a character (of which there may be more than one for some characters, for example precomposed versus base character plus combining mark) rather than on the character itself.

This is deemed an acceptable shortcoming: As long as the string is normalized (a reasonable requirement), the result is consistent ("longer" characters are "greater").
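
The effect can be illustrated with Go's built-in string comparison, which is also byte-wise. The constants below are chosen for this example only: `precomposed` is "é" as the single code point U+00E9 and `decomposed` is the same character written as "e" followed by the combining acute accent U+0301.

```go
package main

import "fmt"

const (
	precomposed = "\u00e9"  // é as one code point, UTF-8 bytes c3 a9
	decomposed  = "e\u0301" // e + combining acute, UTF-8 bytes 65 cc 81
	euro        = "\u20ac"  // €, a three-byte character, UTF-8 bytes e2 82 ac
)

func main() {
	// The two spellings of é render the same but are different byte sequences.
	fmt.Println(precomposed == decomposed) // false

	// Relative to "z" (0x7a), the two spellings sort on opposite sides.
	fmt.Println("z" < precomposed) // true:  0x7a < 0xc3
	fmt.Println("z" < decomposed)  // false: 0x7a > 0x65

	// With a single normalized form, "longer" characters are "greater".
	fmt.Println("z" < precomposed && precomposed < euro) // true
}
```

Normalizing the input before comparison (not shown here, as it needs a package outside the standard library) removes the first kind of inconsistency.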

As a final note, the library is not intended to support character encodings that are not compatible with UTF-8.

Based on the above, the task is the following two items:

  1. Verify that all of this is true by adding tests that explore the corner cases as well as possible.
  2. Add relevant reasoning, preconditions, and "nonstandard" behavior to the documentation.
