Giter VIP home page Giter VIP logo

unicode-graphemes's Introduction

README for unicode-graphemes

This is a sample Java client for the ANTLR 4 Unicode grapheme cluster parser grammar:

https://github.com/antlr/grammars-v4/tree/master/unicode/graphemes

To Build and Install

% mvn install

Usage Example

import com.github.bhamiltoncx.UnicodeGraphemeParsing;

public class Example {
  public static void main(String[] strings) {
    for (String string : strings) {
      System.out.format("Parsing string: %s\n", string);
      for (UnicodeGraphemeParsing.Result grapheme : UnicodeGraphemeParsing.parse(string)) {
        String s = string.substring(grapheme.stringOffset, grapheme.stringOffset + grapheme.stringLength);
        String type = (grapheme.type == UnicodeGraphemeParsing.Result.Type.EMOJI ? "Emoji" : "Non-Emoji");
        System.out.format("%s: [%s] (offset=%d, length=%d)\n", type, s, grapheme.stringOffset, grapheme.stringLength);
      }
    }
  }
}

Full example

% mvn install
% javac -cp target/unicode-graphemes-0.1-SNAPSHOT.jar example/Example.java
% alias parse-graphemes="java -cp $HOME/.m2/repository/org/antlr/antlr4-runtime/4.7/antlr4-runtime-4.7.jar:$HOME/.m2//repository/com/github/bhamiltoncx/unicode-graphemes/0.1-SNAPSHOT/unicode-graphemes-0.1-SNAPSHOT.jar:example Example"
% parse-graphemes abc๐Ÿ˜€๐Ÿ’ฉ๐Ÿ‘ฎ๐Ÿฟโ€โ™€๏ธ๐Ÿ‘ฉโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง๐Ÿ‡ซ๐Ÿ‡ฎ
Parsing string: abc๐Ÿ˜€๐Ÿ’ฉ๐Ÿ‘ฎ๐Ÿฟโ€โ™€๏ธ๐Ÿ‘ฉโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง๐Ÿ‡ซ๐Ÿ‡ฎ
Non-Emoji: [a] (offset=0, length=1)
Non-Emoji: [b] (offset=1, length=1)
Non-Emoji: [c] (offset=2, length=1)
Emoji: [๐Ÿ˜€] (offset=3, length=2)
Emoji: [๐Ÿ’ฉ] (offset=5, length=2)
Emoji: [๐Ÿ‘ฎ๐Ÿฟโ€โ™€๏ธ] (offset=7, length=7)
Emoji: [๐Ÿ‘ฉโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง] (offset=14, length=11)
Emoji: [๐Ÿ‡ซ๐Ÿ‡ฎ] (offset=25, length=4)

unicode-graphemes's People

Contributors

bhamiltoncx avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.