deno_tokenizer's Introduction

Tokenizer

A simple tokenizer for Deno.

Example

import { Tokenizer } from "https://deno.land/x/tokenizer/mod.ts";

const tokenizer = new Tokenizer("abc 123 HELLO [a cool](link)", [
    { type: "HELLO",  pattern: "HELLO" },
    { type: "WORD",   pattern: /[a-zA-Z]+/ },
    { type: "DIGITS", pattern: /\d+/, value: m => Number.parseInt(m.match, 10) },
    { type: "LINK",   pattern: /\[([^\[]+)\]\(([^\)]+)\)/ },
    { type: "SPACE",  pattern: / /, ignore: true } // Or leave type blank and remove "ignore: true"
]);

// The first option:
// console.log(...tokenizer);
// => { type: "WORD", match: "abc", value: "abc", groups: [], position: { start: 0, end: 3 } },
//    { type: "DIGITS", match: "123", value: 123, groups: [], position: { start: 4, end: 7 } },
//    { type: "HELLO", match: "HELLO", value: "HELLO", groups: [], position: { start: 8, end: 13 } },
//    { type: "LINK", match: "[a cool](link)", value: "[a cool](link)", groups: [ "a cool", "link" ], position: { start: 14, end: 28 } }

// The second option:
while (!tokenizer.done) {
    console.log(tokenizer.next().value);
}
// => { type: "WORD", match: "abc", value: "abc", groups: [], position: { start: 0, end: 3 } }
// => { type: "DIGITS", match: "123", value: 123, groups: [], position: { start: 4, end: 7 } }
// => { type: "HELLO", match: "HELLO", value: "HELLO", groups: [], position: { start: 8, end: 13 } }
// => { type: "LINK", match: "[a cool](link)", value: "[a cool](link)", groups: [ "a cool", "link" ], position: { start: 14, end: 28 } }

// The third option:
// console.log(tokenizer.tokenize()); // Pass a string to tokenize() to override the source string
// => [ { type: "WORD", match: "abc", value: "abc", groups: [], position: { start: 0, end: 3 } },
//      { type: "DIGITS", match: "123", value: 123, groups: [], position: { start: 4, end: 7 } },
//      { type: "HELLO", match: "HELLO", value: "HELLO", groups: [], position: { start: 8, end: 13 } },
//      { type: "LINK", match: "[a cool](link)", value: "[a cool](link)", groups: [ "a cool", "link" ], position: { start: 14, end: 28 } } ]
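
In the example above, "HELLO" is reported as a HELLO token rather than a WORD, which suggests that rules are tried in the order they are listed and the first match wins. The following is a minimal, standalone sketch of that idea, not the library's actual implementation: the names `Rule`, `SketchToken`, `matchAt` and `sketchTokenize` are hypothetical, and value transforms are left out to keep it short.

```typescript
// Hypothetical sketch of first-match-wins tokenizing; not the library's source.
interface Rule {
  type: string;
  pattern: string | RegExp;
  ignore?: boolean;
}

interface SketchToken {
  type: string;
  match: string;
  groups: string[];
  position: { start: number; end: number };
}

// Try to match `pattern` exactly at `index` in `source`.
function matchAt(
  source: string,
  index: number,
  pattern: string | RegExp,
): { match: string; groups: string[] } | undefined {
  if (typeof pattern === "string") {
    return source.startsWith(pattern, index)
      ? { match: pattern, groups: [] }
      : undefined;
  }
  // The sticky flag ("y") anchors the regex at lastIndex instead of searching ahead.
  const flags = pattern.flags.includes("y") ? pattern.flags : pattern.flags + "y";
  const sticky = new RegExp(pattern.source, flags);
  sticky.lastIndex = index;
  const result = sticky.exec(source);
  // Reject empty matches so the scanner always makes progress.
  if (result === null || result[0].length === 0) return undefined;
  return { match: result[0], groups: result.slice(1) };
}

function sketchTokenize(source: string, rules: Rule[]): SketchToken[] {
  const tokens: SketchToken[] = [];
  let index = 0;
  while (index < source.length) {
    let advanced = false;
    for (const rule of rules) {
      const found = matchAt(source, index, rule.pattern);
      if (found === undefined) continue;
      const end = index + found.match.length;
      // Ignored rules consume input but produce no token.
      if (!rule.ignore) {
        tokens.push({
          type: rule.type,
          match: found.match,
          groups: found.groups,
          position: { start: index, end },
        });
      }
      index = end;
      advanced = true;
      break; // first matching rule wins
    }
    if (!advanced) {
      throw new Error(`No rule matches the input at index ${index}`);
    }
  }
  return tokens;
}
```

Under this assumption, rule order matters: if the WORD rule were listed before HELLO, the sketch would classify "HELLO" as a WORD.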

TODO

  • Custom patterns using functions
  • Add position information to Token
  • Array patterns (Multiple patterns for the same rule)
  • Documentation
  • Better error handling
  • Group matching
  • Value transform
  • More and better tests for everything
  • Examples
  • Line and column information? Or just a helper function to get line and column from index
  • BNF / EBNF ?
  • Generate a tokenizer
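
The "helper function to get line and column from index" item could look roughly like the sketch below. `lineAndColumn` is a hypothetical name and not part of the library. The sketch assumes 1-based lines and columns, treats "\n" as the only line terminator, and takes an index such as a token's `position.start`.

```typescript
// Hypothetical helper; not part of the library's API.
// Converts a 0-based character index into a 1-based line and column.
function lineAndColumn(
  source: string,
  index: number,
): { line: number; column: number } {
  if (index < 0 || index > source.length) {
    throw new RangeError(`Index ${index} is outside the source string`);
  }
  let line = 1;
  let lineStart = 0; // index of the first character on the current line
  for (let i = 0; i < index; i++) {
    if (source[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: index - lineStart + 1 };
}

// Usage: locate the "c" in a two-line source.
console.log(lineAndColumn("ab\ncd", 3)); // line 2, column 1
```

Computing the position on demand keeps tokens small. Storing line and column on every token would avoid rescanning the source for each lookup.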

Contributors

eliassjogreen, sirhall
