
lexer's Introduction

lexer is a DFA-based lexical analyzer written in JavaScript that supports multi-language extensions. For a quick look and hands-on experience, see the online website.

Document: 中文 / English

1、Background

(1) Situation

Most lexical analyzers are tightly coupled to a specific language and have a relatively large code base, which makes it hard to focus on the essential principles of a lexical analyzer.

(2) Task

To focus on how a lexical analyzer works without being distracted by the small differences between languages, the idea of a lexer project that is completely decoupled from the language was born.

(3) Solution

lexer decouples the lexical analyzer from the language through the following two files:

  • src/lexer.js is the core of the lexical analyzer, under 300 lines, including ISR and DFA
  • src/lang/{lang}-define.js is the language extension of the lexical analyzer. It supports different languages, such as src/lang/c-define.js
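
The split between a language-agnostic core and a per-language definition can be illustrated with a small table-driven sketch. This is not the code of src/lexer.js, and the shape of `toyDefine` is not the real format of src/lang/c-define.js; the names `toyDefine`, `classify`, `transitions`, `accepting`, and `tokenize` are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical language definition: NOT the real format of src/lang/c-define.js.
const toyDefine = {
  // Map a character to the character class used by the transition table.
  classify(ch) {
    if (/[A-Za-z_]/.test(ch)) return "letter";
    if (/[0-9]/.test(ch)) return "digit";
    if (/\s/.test(ch)) return "space";
    return "other";
  },
  // transitions[state][charClass] -> next state; a missing entry means "no move".
  transitions: {
    start: { letter: "ident", digit: "number" },
    ident: { letter: "ident", digit: "ident" },
    number: { digit: "number" },
  },
  // Accepting states and the token type each one produces.
  accepting: { ident: "Identifier", number: "Number" },
};

// Language-agnostic core: it only walks the tables it is given.
function tokenize(input, define) {
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    let state = "start";
    let j = i;
    // Follow transitions for as long as one exists (longest match).
    while (j < input.length) {
      const row = define.transitions[state] || {};
      const next = row[define.classify(input[j])];
      if (next === undefined) break;
      state = next;
      j++;
    }
    if (j === i) {
      // No move from the start state: skip whitespace, emit anything else as-is.
      if (define.classify(input[i]) !== "space") {
        tokens.push({ type: "Unknown", value: input[i] });
      }
      i++;
      continue;
    }
    tokens.push({ type: define.accepting[state], value: input.slice(i, j) });
    i = j;
  }
  return tokens;
}

const toyTokens = tokenize("int a1 = 10;", toyDefine);
console.log(toyTokens);
```

In this sketch, supporting another language would mean passing a different definition object to `tokenize`, while the loop itself stays unchanged.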

2、Features

(1) Complete lexical analysis

From reading the input character sequence to generating tokens, lexer covers every step of lexical analysis and provides 12 token types that cover most language extensions.

(2) Support multi-language extension

lexer supports different language extensions, such as Python, Go, etc. To learn how to create a language extension, see Contributions.

  • C: A popular programming language. Click here to see its lexical analysis
  • SQL: A popular database query language. Click here to see its lexical analysis
  • Goal: A goal parser problem from LeetCode. Click here to see its lexical analysis

(3) Provide state flow log

The core mechanism of a lexical analyzer is the state flow of a DFA. For this reason, lexer records a detailed state flow log, which supports the following:

  • Debug mode
  • Automatically generate DFA state flow diagram

3、Get project

After running git clone, no dependencies and no extra installation steps are needed.

4、Usage

(1) In your project

If you need to use lexer in your own project, such as a code editor:

Using NPM

npm install chain-lexer

// Load the package and pick the C lexer
var chainLexer = require('chain-lexer');
let lexer = chainLexer.cLexer;

// Analyze a C statement and read the generated tokens
let stream = "int a = 10;";
lexer.start(stream);
let parsedTokens = lexer.DFA.result.tokens;

// Switch to the SQL lexer and analyze a query
lexer = chainLexer.sqlLexer;
stream = "select * from test where id >= 10;";
lexer.start(stream);
parsedTokens = lexer.DFA.result.tokens;

Using Script

Import the package/{lang}-lexer.min.js file, then use the lexer variable to access the lexical analyzer object, and read lexer.DFA.result.tokens to get the tokens.

// 1. The code that needs lexical analysis
let stream = "int a = 10;";

// 2. Start lexical analysis
lexer.start(stream);

// 3. After the lexical analysis is done, get the generated tokens
let parsedTokens = lexer.DFA.result.tokens;

// 4. Do what you want to do
parsedTokens.forEach((token) => {
    // ... ...
});
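
As one possible way to fill in step 4, the sketch below counts tokens by type. The array `exampleTokens` is a stand-in for lexer.DFA.result.tokens, and its `{ type, value }` shape and type names are assumptions made for illustration; check the actual token objects before relying on these field names. The helper `countByType` is hypothetical.

```javascript
// Stand-in for lexer.DFA.result.tokens; the { type, value } shape is an assumption.
const exampleTokens = [
  { type: "Keyword", value: "int" },
  { type: "Identifier", value: "a" },
  { type: "Operator", value: "=" },
  { type: "Number", value: "10" },
  { type: "Identifier", value: "b" },
];

// Hypothetical helper: count how many tokens of each type were produced.
function countByType(tokens) {
  const counts = {};
  for (const token of tokens) {
    counts[token.type] = (counts[token.type] || 0) + 1;
  }
  return counts;
}

const typeCounts = countByType(exampleTokens);
console.log(typeCounts);
```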

As described under "Provide state flow log" in Features, flowModel.result.paths holds the detailed state flow log inside lexer. The data format is as follows:

[
    {
        state: 0, // current state
        ch: "a", // character read
        nextSstate: 2, // next state
        match: true, // whether the transition matched
        end: false, // whether this is the last character
    },
    // ... ...
]
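
A log in this format could, for example, be turned into Graphviz DOT text to draw the state flow diagram. The sketch below is not how lexer generates its diagrams; `pathsToDot` is a hypothetical helper, `samplePaths` contains made-up records that only follow the field names shown above, and skipping records where `match` is false is an assumption about what that field means.

```javascript
// Sample records in the format shown above (made-up values, not real lexer output).
const samplePaths = [
  { state: 0, ch: "a", nextSstate: 2, match: true, end: false },
  { state: 2, ch: "b", nextSstate: 2, match: true, end: false },
  { state: 2, ch: "b", nextSstate: 2, match: true, end: true },
];

// Hypothetical helper: convert state flow records into Graphviz DOT text.
function pathsToDot(paths) {
  const labels = new Map(); // "from->to" -> set of characters read on that edge
  for (const p of paths) {
    if (!p.match) continue; // assumption: only matched transitions are drawn
    const key = `${p.state}->${p.nextSstate}`;
    if (!labels.has(key)) labels.set(key, new Set());
    labels.get(key).add(p.ch);
  }
  const lines = ["digraph DFA {"];
  for (const [key, chars] of labels) {
    const [from, to] = key.split("->");
    // JSON.stringify yields a double-quoted, escaped string usable as a DOT label.
    const label = JSON.stringify([...chars].join(","));
    lines.push(`  ${from} -> ${to} [label=${label}];`);
  }
  lines.push("}");
  return lines.join("\n");
}

const dot = pathsToDot(samplePaths);
console.log(dot);
```

Repeated transitions collapse into a single labeled edge, so the diagram shows each state pair once regardless of how often it appears in the log.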

(2) Web preview and testing

To preview the lexer process in real time and to debug and test it, there is an index.html file in the root directory of this project. Open it directly in your browser and enter some code; the tokens generated by lexer are output automatically, as shown in the figure below.

int a = 10;
int b =20;
int c = 20;

float f = 928.2332;
char b = 'b';

if(a == b){
    printf("Hello, World!");
}else if(b!=c){
    printf("Hello, World! Hello, World!");
}else{
    printf("Hello!");
}

(Figure: tokens generated by lexer for the code above)

Alternatively, try the online website.

5、Contributions

(1) Project Statistics

(2) Source code explanation

For documentation on source code development, project design, unit testing, automated testing, development specifications, and how to create extensions for different languages, please read the source code explanation.

(3) Content contribution

  • Add new features
  • Add more language extensions (src/lang/{lang}-define.js)

(4) Release version

The project is released with version numbers of the form A-B-C. For the release log, see the CHANGELOG or the release record.

  • A: Major upgrade
  • B: Minor upgrade
  • C: Bug fixes / features / ...

(5) Q&A

If you have any problems or questions, please submit an issue

6、License

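lexer's Issues

Some questions

[image]
I have not studied compiler theory, so I would like to ask: doesn't this case need special handling? A string may contain a double quote or a comment marker, or a double quote may appear after a comment marker. Does none of this need to be handled?

It would be perfect with a syntax analyzer as well

Why didn't I come across this demo sooner? I worked hard on writing my own lexical analyzer, and after comparing the two I realized mine is rather poor...

Lexical analysis gives me the tokens, but generating the AST gives me a headache. Is there a more concise and elegant way to write syntax analysis, like the approach in your demo? I have never seen this style before. Elegant, truly elegant!!! It would be perfect if you went on to write a syntax analyzer...

This is my own syntax analyzer. A lot is still unhandled: logic such as loops and conditionals is not recognized yet, and for now it can only recognize some fairly simple declarations.
[image]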
