haxe-strings - StringTools on steroids.

  1. What is it?
  2. The Strings utility class
  3. The String8 type
  4. The spell checker
  5. The string collection classes
  6. The StringBuilder class
  7. The Ansi class
  8. Random string generation
  9. Semantic version parsing with the Version type
  10. Installation
  11. Using the latest code
  12. License

What is it?

A haxelib for consistent cross-platform UTF-8 string manipulation.

All classes are located in the package hx.strings or below.

The library has been extensively unit tested (over 1,700 individual test cases) on the targets C++, C#, Eval, Flash, HashLink, Java, JavaScript (Node.js and PhantomJS), Neko, PHP 7 and Python 3.

Note:

  • When targeting PHP ensure the php_mbstring extension is enabled in the php.ini file. This extension is required for proper UTF-8 support.

Haxe compatibility

haxe-strings       Haxe
1.0.0 to 4.0.0     3.2.1 or higher
5.0.0 to 5.2.4     3.4.2 or higher
6.0.0 or higher    4.0.5 or higher

The Strings utility class

The hx.strings.Strings class provides handy utility methods for string manipulations.

It also contains improved implementations of functions provided by Haxe's StringTools class.

Methods ending with the letter 8 (e.g. length8(), indexOf8(), toLowerCase8()) are improved versions of similar methods provided by Haxe's String class, offering UTF-8 support and consistent behavior across all platforms (including Neko).

The hx.strings.Strings class can also be used as a static extension.

Some examples

using hx.strings.Strings; // augment all Strings with new functions

class Test {

   static function main() {
      // Strings are extended:
      "".isEmpty();      // returns true
      "   ".isBlank();   // returns true
      "123".isDigits();  // returns true
      "a".repeat(3);     // returns "aaa"
      "abc".reverse();   // returns "cba"

      // Ints are extended too:
      32.toChar().isSpace();       // returns true
      32.toChar().toString();      // returns " "
      32.toChar().isAscii();       // returns true
      6000.toChar().isAscii();     // returns false
      6000.toChar().isUTF8();      // returns true
      74.toHex();                  // returns "4A"

      // all functions are null-safe:
      var nullString:String = null;
      nullString.length8();         // returns 0
      nullString.contains("cat");   // returns false

      // all methods support UTF-8 on all platforms:
      "кот".toUpperCase8();         // returns "КОТ"
      "кот".toUpperCaseFirstChar(); // returns "Кот"
      "はいはい".length8();          // returns 4

      // ANSI escape sequence processing:
      "\x1B[1;33mHello World!\x1B[0m".ansiToHtml();            // returns '<span style="color:yellow;font-weight:bold;">Hello World!</span>'
      "\x1B[1;33mHello World!\x1B[0m".ansiToHtml(CssClasses);  // returns '<span class="ansi_fg_yellow ansi_bold">Hello World!</span>'
      "\x1B[1mHello World!\x1B[0m".removeAnsi();               // returns "Hello World!"

      // It is also possible to fully customize the CSS class names via a callback:
      "\x1B[1;33mHello World!\x1B[0m".ansiToHtml(CssClassesCallback(function(st:hx.strings.AnsiState):String {
         var a : Array<String> = [];
         if (st.fgcolor != null) a.push("someprefix_fg_" + st.fgcolor + "_somesuffix");
         if (st.bgcolor != null) a.push("someprefix_bg_" + st.bgcolor);
         if (st.bold)            a.push("someprefix_bold");
         if (st.underline)       a.push("someprefix_underline");
         if (st.blink)           a.push("someprefix_blink");
         return a.join(" ");
      }));  // returns a <span> whose class attribute holds the names built by the callback

      // case formatting:
      "look at me".toUpperCamel();       // returns "LookAtMe"
      "MyCSSClass".toLowerCamel();       // returns "myCSSClass"
      "MyCSSClass".toLowerHyphen();      // returns "my-css-class"
      "MyCSSClass".toLowerUnderscore();  // returns "my_css_class"
      "myCSSClass".toUpperUnderscore();  // returns "MY_CSS_CLASS"

      // ellipsizing strings:
      "The weather is very nice".ellipsizeLeft(20);    // returns "The weather is ve..."
      "The weather is very nice".ellipsizeMiddle(20);  // returns "The weath...ery nice"
      "The weather is very nice".ellipsizeRight(20);   // returns "...ther is very nice"

      // string differences:
      "It's green".diffAt("It's red"); // returns 5
      "It's green".diff("It's red");   // returns { left: 'green', right: 'red', at: 5 }

      // hash codes:
      "Cool String".hashCode();       // generate a platform specific hash code
      "Cool String".hashCode(CRC32);  // generate a hash code using CRC32
      "Cool String".hashCode(JAVA);   // generate a hash code using the Java hash algorithm

      // cleanup:
      "/my/path/".removeLeading("/");       // returns "my/path/"
      "/my/path/".removeTrailing("/");      // returns "/my/path"
      "<i>So</i> <b>nice</b>".removeTags(); // returns "So nice"
   }
}

The String8 type

The hx.strings.String8 is an abstract type based on String. All exposed methods are UTF-8 compatible and have consistent behavior across platforms.

It can be used as a drop-in replacement for type String.

Example usage:

import hx.strings.String8;

class Test {
   static function main() {
      var str:String = "はいはい";  // create a string with UTF-8 chars
      str.length;  // returns different values depending on the default UTF-8 support of the target platform

      // assign the string to a variable of type String8; because of the nature of
      // Haxe's abstract types, this does not create a new object instance
      var str8:String8 = str;
      str8.length;  // returns the correct character length on all platforms

      str8.ellipsizeLeft(2); // String8 automatically has all utility functions provided by the Strings class
   }
}

The type declaration in the String8.hx file is nearly empty because all methods are auto-generated from the static methods provided by the hx.strings.Strings class.

The spell checker

The package hx.strings.spelling contains an extensible spell checker implementation that is based on ideas outlined by Peter Norvig in his article How to write a Spell Checker.

The SpellChecker#correctWord() method can for example be used to implement a Google-like "did you mean 'xyz'?" feature for a custom search engine.
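
The core idea of that article can be sketched in a few lines of plain Haxe. This is a minimal illustration, not the library's implementation: the class name SpellSketch, the function names and the tiny frequency map are hypothetical, only ASCII lowercase input is handled, and only candidates that are a single edit away are considered (the article also looks at two-edit candidates).

```haxe
class SpellSketch {
   static inline var LETTERS = "abcdefghijklmnopqrstuvwxyz";

   /** Returns all strings that are one deletion, transposition, replacement or insertion away from `word`. */
   public static function edits1(word:String):Array<String> {
      var result = new Array<String>();
      for (i in 0...(word.length + 1)) {
         var left = word.substring(0, i);
         var right = word.substring(i);
         if (right.length > 0)
            result.push(left + right.substring(1)); // deletion
         if (right.length > 1)
            result.push(left + right.charAt(1) + right.charAt(0) + right.substring(2)); // transposition
         for (j in 0...LETTERS.length) {
            var c = LETTERS.charAt(j);
            if (right.length > 0)
               result.push(left + c + right.substring(1)); // replacement
            result.push(left + c + right); // insertion
         }
      }
      return result;
   }

   /** Returns `word` if it is known, otherwise the most frequent known word that is one edit away. */
   public static function correct(word:String, freq:Map<String, Int>):String {
      if (freq.exists(word)) return word;
      var best = word;
      var bestCount = 0;
      for (candidate in edits1(word)) {
         var count = freq.get(candidate);
         if (count != null && count > bestCount) {
            best = candidate;
            bestCount = count;
         }
      }
      return best;
   }

   public static function main() {
      // hypothetical word frequencies; a real dictionary is trained from a large text
      var freq:Map<String, Int> = ["spelling" => 10, "spewing" => 2, "really" => 7];
      trace(correct("speling", freq)); // spelling (preferred over spewing because it is more frequent)
      trace(correct("realy", freq));   // really
      trace(correct("xyzzy", freq));   // xyzzy (no known word is one edit away)
   }
}
```

Ranking candidates by frequency is what makes a trained dictionary necessary: "speling" is one edit away from both "spelling" and "spewing", and only the word counts decide between them.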

Now let's do some spell checking...

import hx.strings.spelling.checker.*;
import hx.strings.spelling.dictionary.*;
import hx.strings.spelling.trainer.*;

class Test {
   static function main() {
      /*
       * first we use the English spell checker with a pre-trained dictionary
       * that is bundled with the library:
       */
      EnglishSpellChecker.INSTANCE.correctWord("speling");  // returns "spelling"
      EnglishSpellChecker.INSTANCE.correctWord("SPELING");  // returns "spelling"
      EnglishSpellChecker.INSTANCE.correctWord("SPELLING"); // returns "spelling"
      EnglishSpellChecker.INSTANCE.correctWord("spell1ng"); // returns "spelling"
      EnglishSpellChecker.INSTANCE.correctText("sometinG zEems realy vrong!"); // returns "something seems really wrong!"
      EnglishSpellChecker.INSTANCE.suggestWords("absance"); // returns [ "absence", "advance", "balance" ]

      /*
       * let's check the pre-trained German spell checker
       */
      GermanSpellChecker.INSTANCE.correctWord("schreibweise");  // returns "Schreibweise"
      GermanSpellChecker.INSTANCE.correctWord("Schreibwiese");  // returns "Schreibweise"
      GermanSpellChecker.INSTANCE.correctWord("SCHREIBWEISE");  // returns "Schreibweise"
      GermanSpellChecker.INSTANCE.correctWord("SCHRIBWEISE");   // returns "Schreibweise"
      GermanSpellChecker.INSTANCE.correctWord("Schre1bweise");  // returns "Schreibweise"
      GermanSpellChecker.INSTANCE.correctText("etwaz kohmische Aepfel ligen vör der Thür"); // returns "etwas komische Äpfel liegen vor der Tür"
      GermanSpellChecker.INSTANCE.suggestWords("Sistem");       // returns [ "System", "Sitte", "Sitten" ]

      /*
       * now we train our own dictionary from scratch
       */
      var myDict = new InMemoryDictionary("English");
      // download some training text with good vocabulary
      var trainingText = haxe.Http.requestUrl("http://www.norvig.com/big.txt");
      // populating the dictionary might take a while:
      EnglishDictionaryTrainer.INSTANCE.trainWithString(myDict, trainingText);
      // let's use the trained dictionary with a spell checker
      var mySpellChecker = new EnglishSpellChecker(myDict);
      mySpellChecker.correctWord("speling");  // returns "spelling"

      // since training a dictionary can be quite time consuming, we save
      // the analyzed words and their popularity/frequency to a file
      myDict.exportWordsToFile("myDict.txt");

      // the word list can later be loaded using
      myDict.loadWordsFromFile("myDict.txt");
   }
}

The string collection classes

The package hx.strings.collection contains some useful collection classes for strings.

  1. StringSet is a collection of unique strings. Each string is guaranteed to exist only once within the collection.

    var set = new hx.strings.collection.StringSet();
    set.add("a");
    set.add("a");
    set.add("b");
    set.add("b");
    // at this point the set only contains two elements: one 'a' and one 'b'
  2. SortedStringSet is a sorted collection of unique strings. A custom comparator can be provided to use a different sort order.

  3. SortedStringMap is a map that is sorted by its keys (which are of type String).

The StringBuilder class

The hx.strings.StringBuilder class is an alternative to the built-in StringBuf. It provides a fluent API, cross-platform UTF-8 support and the ability to insert Strings at arbitrary positions.

import hx.strings.StringBuilder;

class Test {
   static function main() {
      // create a new instance with initial content
      var sb = new StringBuilder("def");

      // insert / add some strings via fluent API calls
      sb.insert(0, "abc")
         .newLine()   // appends "\n"
         .add("ghi")
         .addChar(106);

      sb.toString();  // returns "abcdef\nghij"

      sb.clear();     // reset the internal state

      sb.addAll(["a", 1, true, null]);

      sb.toString();  // returns "a1truenull"
   }
}

The Ansi class

The hx.strings.ansi.Ansi class provides functions for writing ANSI escape sequences in a type-safe manner.

import hx.strings.ansi.Ansi;

class Test {
   static function main() {
      var stdout = Sys.stdout();

      stdout.writeString(Ansi.fg(RED));           // set the text foreground color to red
      stdout.writeString(Ansi.bg(WHITE));         // set the text background color to white
      stdout.writeString(Ansi.attr(BOLD));        // make the text bold
      stdout.writeString(Ansi.attr(RESET));       // reset all color or text attributes
      stdout.writeString(Ansi.clearScreen());     // clears the screen
      stdout.writeString(Ansi.cursor(MoveUp(2))); // moves the cursor 2 lines up

      // now let's work with the fluent API:
      var writer = Ansi.writer(stdout); // supports StringBuf, haxe.io.Output and hx.strings.StringBuilder
      writer
         .clearScreen()
         .cursor(GoToPos(10,10))
         .fg(GREEN).bg(BLACK).attr(ITALIC).write("How are you?").attr(RESET)
         .cursor(MoveUp(2))
         .fg(RED).bg(WHITE).attr(UNDERLINE).write("Hello World!").attr(UNDERLINE_OFF)
         .flush();
   }
}

Random string generation

The hx.strings.RandomStrings class contains methods to generate different types of random strings, e.g. UUIDs or alpha-numeric sequences.

import hx.strings.RandomStrings;
using hx.strings.Strings;

class Test {
   static function main() {
      RandomStrings.randomUUIDv4(); // generates a UUID according to RFC 4122 UUID Version 4, e.g. "f3cdf7a7-a179-464b-ae98-83f6659ae33f"
      RandomStrings.randomDigits(4); // generates a 4-char numeric string, e.g. "4832"
      RandomStrings.randomAsciiAlphaNumeric(8); // generates an 8-char ASCII alphanumeric string, e.g. "aZvDF34L"
      RandomStrings.randomSubstring("abcdefghijlkmn", 4); // returns a random 4-char substring of the given string, e.g. "defg"
   }
}

Semantic version parsing with the Version type

The hx.strings.Version type provides functions for parsing and working with version strings that follow the SemVer 2.0 Specification.

import hx.strings.Version;

class Test {
   static function main() {

      var ver:Version;

      ver = new Version(11, 2, 4);
      ver.major;                  // returns 11
      ver.minor;                  // returns 2
      ver.patch;                  // returns 4
      ver.toString();             // returns '11.2.4'
      ver.nextPatch().toString(); // returns '11.2.5'
      ver.nextMinor().toString(); // returns '11.3.0'
      ver.nextMajor().toString(); // returns '12.0.0'

      ver = Version.of("11.2.4-alpha.2+exp.sha.141d2f7");
      ver.major;            // returns 11
      ver.minor;            // returns 2
      ver.patch;            // returns 4
      ver.isPreRelease;     // returns true
      ver.preRelease;       // returns 'alpha.2'
      ver.buildMetadata;    // returns 'exp.sha.141d2f7'
      ver.hasBuildMetadata; // returns true
      ver.nextPreRelease().toString(); // returns "11.2.4-alpha.3"

      var v1_0_0:Version = "1.0.0";
      var v1_0_1:Version = "1.0.1";

      v1_0_0 < v1_0_1;       // returns true
      v1_0_1 >= v1_0_0;      // returns true

      v1_0_1.isGreaterThan(v1_0_0); // returns true
      v1_0_1.isLessThan(Version.of("1.0.0")); // returns false

      Version.isValid("foobar"); // returns false
   }
}

Installation

  1. install the library via haxelib using the command:

    haxelib install haxe-strings
    
  2. use in your Haxe project

    • for OpenFL/Lime projects add <haxelib name="haxe-strings" /> to your project.xml
    • for free-style projects add -lib haxe-strings to your *.hxml file or as command line option when running the Haxe compiler

Using the latest code

Using haxelib git

haxelib git haxe-strings https://github.com/vegardit/haxe-strings main D:\haxe-projects\haxe-strings

Using Git

  1. check-out the main branch

    git clone https://github.com/vegardit/haxe-strings --branch main --single-branch D:\haxe-projects\haxe-strings --depth=1
    
  2. register the development release with haxe

    haxelib dev haxe-strings D:\haxe-projects\haxe-strings
    

License

All files are released under the Apache License 2.0.

Individual files contain the following tag instead of the full license text:

SPDX-License-Identifier: Apache-2.0

This enables machine processing of license information based on the SPDX License Identifiers that are available here: https://spdx.org/licenses/.

haxe-strings's People

Contributors

bolovevo, dependabot[bot], grepsuzette, lguzzon, sebthom, singpolyma

haxe-strings's Issues

Other dictionaries

Are the dictionaries in a well known format? Is there a way to find other dictionaries?

RandomStrings alpha is wrong

65..92 is from A through \ whereas Z would be 90

67..123 overlaps with this and includes [ \ ] ^ _ and ` should be from 97
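
The character codes mentioned in this report can be checked with a short sketch that uses only the Haxe standard library. It does not reproduce the library's code: the class AlphaRanges and the helper charsBetween are hypothetical names, and the sketch only relies on the ASCII table and on Haxe's `...` range operator excluding its upper bound.

```haxe
class AlphaRanges {
   /** Builds a string from the char codes in the half-open range [from, toExclusive). */
   public static function charsBetween(from:Int, toExclusive:Int):String {
      var buf = new StringBuf();
      for (code in from...toExclusive) // the upper bound itself is not visited
         buf.addChar(code);
      return buf.toString();
   }

   public static function main() {
      trace(charsBetween(65, 91));  // ABCDEFGHIJKLMNOPQRSTUVWXYZ
      trace(charsBetween(97, 123)); // abcdefghijklmnopqrstuvwxyz
      trace(charsBetween(91, 97));  // [\]^_` (the six non-letters between Z and a)
   }
}
```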

Haxe 4.2 deprecates Std.is resulting in build warnings

The following warnings occur in my project, which depends on haxe-strings through another library:

/usr/share/haxe/lib/haxe-strings/6,0,3/src/hx/strings/collection/StringMap.hx:72: characters 11-60 : Warning : Std.is is deprecated. Use Std.isOfType instead.
/usr/share/haxe/lib/haxe-strings/6,0,3/src/hx/strings/collection/StringMap.hx:77: characters 11-62 : Warning : Std.is is deprecated. Use Std.isOfType instead.

This happens on Haxe version 4.2.1, and I think it's been deprecated since 4.2, so relatively close :).
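
One common way to silence this kind of warning while still supporting older compilers is conditional compilation. The sketch below is only an illustration, not the change made in the library: TypeCheckSketch and isString are hypothetical names, and the version guard assumes that Std.isOfType is available from Haxe 4.1 onward.

```haxe
class TypeCheckSketch {
   /** Hypothetical helper: checks whether `value` is a String without calling the deprecated Std.is on newer compilers. */
   public static function isString(value:Dynamic):Bool {
      #if (haxe_ver >= 4.1)
      return Std.isOfType(value, String); // assumption: available since Haxe 4.1
      #else
      return Std.is(value, String);       // fallback for older compilers
      #end
   }

   public static function main() {
      trace(isString("abc")); // true
      trace(isString(42));    // false
   }
}
```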

Demo code from README.md not compiling - missing isWhitespace & CallbackToCssClasses

COMPILER OUTPUT:
Source/Main.hx:28: characters 16-28 : String has no field isWhitespace
Source/Main.hx:43: lines 43-51 : Identifier 'CallbackToCssClasses' is not part of hx.strings.AnsiToHtmlRenderMethod
Source/Main.hx:43: lines 43-51 : For optional function argument 'renderMethod'

BUILDING TO: Neko -64

VERSIONS:
Haxe: 4.0.5
Haxe-strings: 5.2.4 from haxelib, current head of master
OpenFL: 8.9.5
OS: Peppermint Linux 10 Respin (Ubuntu 18.04.3)
IDE: Vim 8.2.121 w/ vaxe

Attached: project files
StringWiggle.zip

untyped __js__ is deprecated, use js.Syntax.code instead

Hi :)

I'm opening this because, as of Haxe 4.1.2 (or possibly 4.1.0), any use of untyped __js__ logs a warning asking to switch to js.Syntax.code.

...\.haxelib\haxe-strings/6,0,1/src/hx/strings/internal/OS.hx:24: characters 37-43 : Warning : __js__ is deprecated, use js.Syntax.code instead

Thanks!
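
For readers unfamiliar with the replacement API, here is a minimal sketch of js.Syntax.code. It is not the code from OS.hx, whose content is not shown in this report; the class name JsSyntaxSketch and the embedded JavaScript expression are chosen purely for illustration, and the call only exists on the JavaScript target.

```haxe
class JsSyntaxSketch {
   public static function main() {
      #if js
      // js.Syntax.code embeds a raw JavaScript expression; {0} is replaced by the first extra argument
      var kind:String = js.Syntax.code("typeof {0}", "abc");
      trace(kind); // string
      #else
      trace("this sketch only applies to the JavaScript target");
      #end
   }
}
```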

Haxe development branch breaks Macros.is()

Building the Haxe development branch from source, I discovered that it breaks this library.

$ haxe --version
4.2.0-rc.1
$ haxelib version
4.0.2
$ haxelib run ihx
haxe interactive shell v0.4.0
running in interp mode
type "help" for help
>> addlib haxe-strings
added lib: haxe-strings
>> import hx.strings.Strings;
error: C:\HaxeToolkit\haxe\lib\haxe-strings/6,0,2/src/hx/strings/internal/Macros.hx:90: characters 27-29 : Warning : Using "is" as an identifier is deprecated

It looks like the Macros.is() function will need to be renamed to support future versions of Haxe.
