Lexical Analyzer

Translating high-level code, i.e., programming languages, into a format that a computer can understand (binary code) is the main job of a compiler. Put simply, a compiler can be split into three parts:

  • Lexical Analyzer (LA)
  • Syntax Analyzer (SA)
  • Semantic Analyzer (SMA)

The Lexical Analyzer is responsible for separating the source code into lexemes, which are the words that compose the code. After separating all lexemes, the LA classifies each one with a token class. Keywords, special symbols, identifiers, and operators are examples of tokens. Removing white space and comments from the source code is also a role of the Lexical Analyzer. The output of this process is a table containing the lexemes and their token classifications. Lexical errors, such as invalidly constructed lexemes like '12variableName' or 'na;;me', are also caught by the LA.
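
The pipeline described above (split the source into lexemes, classify each one, emit a table) can be sketched in a few lines of Java. This is a minimal illustration, not the project's actual code: the class name `LexemeTableSketch`, the `classify` and `analyze` helpers, the reduced set of token classes, and the identifier rule (a letter or underscore followed by letters, digits, or underscores) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names and the reduced token set are illustrative only.
class LexemeTableSketch {
    // A small subset of keywords, chosen for the example.
    static final Set<String> KEYWORDS = Set.of("for", "while", "if", "else");

    // Classifies a single lexeme into a token class name.
    static String classify(String lexeme) {
        if (KEYWORDS.contains(lexeme)) return "KEYWORD";
        if (lexeme.matches("[A-Za-z_][A-Za-z0-9_]*")) return "IDENTIFIER"; // assumed rule
        if (lexeme.matches("[0-9]+")) return "INTEGER";
        if (lexeme.equals("=")) return "ASSIGN_OP";
        if (lexeme.equals(";")) return "SEMICOLON";
        return "INVALID"; // e.g. "12variableName" or "na;;me"
    }

    // Splits the source on white space and returns rows of {lexeme, token}.
    static List<String[]> analyze(String source) {
        List<String[]> table = new ArrayList<>();
        for (String lexeme : source.trim().split("\\s+")) {
            if (lexeme.isEmpty()) continue; // blank input yields one empty string
            table.add(new String[] { lexeme, classify(lexeme) });
        }
        return table;
    }

    public static void main(String[] args) {
        // Prints one "lexeme  TOKEN" row per lexeme.
        for (String[] row : analyze("while count = 12variableName ;")) {
            System.out.printf("%-16s %s%n", row[0], row[1]);
        }
    }
}
```

Here `12variableName` falls through every rule and ends up as INVALID, which mirrors how a lexical error would surface in the output table.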

This project is an implementation of a simple Lexical Analyzer written in Java. It provides a GUI where the user can type code and get its tokens. It is also possible to load the code from a file and analyze it.

Recognized Tokens

The Lexical Analyzer of this project recognizes the following classes of tokens:

  • IDENTIFIER - Variable names;
  • STRING - Text between double quotes "";
  • INTEGER - Numbers with no dot ( . );
  • FLOAT - Floating-point numbers;
  • PLUS - ( + );
  • MINUS - ( - );
  • TIMES - ( * );
  • DIVIDE - ( / );
  • KEYWORD - for, while, do, if, else, print, switch, case, default and null;
  • INVALID;
  • ASSIGN_OP - Assignment operator ( = );
  • SEMICOLON - ( ; );
  • LEFT_PARENTHESIS - '(';
  • RIGHT_PARENTHESIS - ')';
  • LEFT_BRACE - ( { );
  • RIGHT_BRACE - ( } );
  • COMMA - ( , );
  • DOT - ( . );
  • DOTDOT - ( .. );
  • COLON - ( : );
  • EQUAL - ( == );
  • LOWER_OR_EQUALS - ( <= );
  • GREATER_OR_EQUALS - ( >= );
  • NOT_EQUALS - ( <> );
  • GREATER_THAN - ( > );
  • LOWER_THAN - ( < );
  • AT_SIGN - ( @ ).
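
One way to represent the token classes listed above is a Java enum plus a lookup table for the symbols that have a fixed spelling. The sketch below is an assumption about how this could be organized, not the project's actual code: `TokenType`, `SymbolTableSketch`, and `lookupSymbol` are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;

// Token classes from the list above (hypothetical enum; the real project may differ).
enum TokenType {
    IDENTIFIER, STRING, INTEGER, FLOAT, PLUS, MINUS, TIMES, DIVIDE, KEYWORD,
    INVALID, ASSIGN_OP, SEMICOLON, LEFT_PARENTHESIS, RIGHT_PARENTHESIS,
    LEFT_BRACE, RIGHT_BRACE, COMMA, DOT, DOTDOT, COLON, EQUAL,
    LOWER_OR_EQUALS, GREATER_OR_EQUALS, NOT_EQUALS, GREATER_THAN,
    LOWER_THAN, AT_SIGN
}

class SymbolTableSketch {
    // Fixed-spelling lexemes. Map.ofEntries is used because Map.of
    // accepts at most ten key/value pairs.
    static final Map<String, TokenType> SYMBOLS = Map.ofEntries(
        Map.entry("+", TokenType.PLUS),
        Map.entry("-", TokenType.MINUS),
        Map.entry("*", TokenType.TIMES),
        Map.entry("/", TokenType.DIVIDE),
        Map.entry("=", TokenType.ASSIGN_OP),
        Map.entry(";", TokenType.SEMICOLON),
        Map.entry("(", TokenType.LEFT_PARENTHESIS),
        Map.entry(")", TokenType.RIGHT_PARENTHESIS),
        Map.entry("{", TokenType.LEFT_BRACE),
        Map.entry("}", TokenType.RIGHT_BRACE),
        Map.entry(",", TokenType.COMMA),
        Map.entry(".", TokenType.DOT),
        Map.entry("..", TokenType.DOTDOT),
        Map.entry(":", TokenType.COLON),
        Map.entry("==", TokenType.EQUAL),
        Map.entry("<=", TokenType.LOWER_OR_EQUALS),
        Map.entry(">=", TokenType.GREATER_OR_EQUALS),
        Map.entry("<>", TokenType.NOT_EQUALS),
        Map.entry(">", TokenType.GREATER_THAN),
        Map.entry("<", TokenType.LOWER_THAN),
        Map.entry("@", TokenType.AT_SIGN)
    );

    // Returns the token class of a fixed symbol, or empty if the lexeme
    // is not one (it may still be an identifier, number, keyword, ...).
    static Optional<TokenType> lookupSymbol(String lexeme) {
        return Optional.ofNullable(SYMBOLS.get(lexeme));
    }

    public static void main(String[] args) {
        System.out.println(lookupSymbol("<>"));    // Optional[NOT_EQUALS]
        System.out.println(lookupSymbol("count")); // Optional.empty
    }
}
```

Because whole lexemes are looked up rather than single characters, `<=` and `<` never compete with each other in this sketch.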

P.S. 1: Sentences starting with // or chunks of text between /* */ are considered comments and are omitted from the output.

P.S. 2: Lexemes must be separated by at least one white space (' ') to be recognized as separate lexemes.
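
The two notes above suggest a simple preprocessing step: strip the comments, then split what remains on white space. The following sketch assumes that block comments are delimited by /* and */ and that // comments run to the end of the line. The class and method names are hypothetical, and string literals that contain comment markers or spaces are deliberately not handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of comment removal followed by white-space splitting.
class PreprocessSketch {
    static String stripComments(String source) {
        // (?s) lets '.' match newlines so a block comment can span lines;
        // the reluctant .*? stops at the first closing delimiter.
        String noBlock = source.replaceAll("(?s)/\\*.*?\\*/", " ");
        // Line comments: from // up to (not including) the end of the line.
        return noBlock.replaceAll("//[^\\n]*", " ");
    }

    static List<String> splitLexemes(String source) {
        return Arrays.stream(stripComments(source).split("\\s+"))
                .filter(s -> !s.isEmpty()) // drop the empty leading piece, if any
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String code = "x = 1 ; // set x\n/* skip\n this */ y = 2 ;";
        System.out.println(splitLexemes(code)); // [x, =, 1, ;, y, =, 2, ;]
    }
}
```

Replacing each comment with a single space, rather than with nothing, keeps the lexemes on either side of the comment separated.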

Screenshot

[Screenshot: image not available]

Conclusion

This is a very simple example that demonstrates how a Lexical Analyzer can be implemented. The project is also a usage example of finite-state automata, a powerful and useful tool.
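
To make the finite-state automaton idea concrete, here is a small hand-written automaton that classifies a lexeme as INTEGER, FLOAT, or INVALID. It is an illustrative sketch, not the project's actual automaton: the class `NumberAutomatonSketch`, the state names, and the rule that a FLOAT needs digits on both sides of the dot are all assumptions.

```java
// Hypothetical deterministic finite automaton for numeric lexemes.
class NumberAutomatonSketch {
    // START: nothing read yet; INT: digits only; DOT: digits then '.';
    // FRAC: digits, '.', digits; ERROR: dead state.
    enum State { START, INT, DOT, FRAC, ERROR }

    // One transition of the automaton for input character c.
    static State step(State s, char c) {
        boolean digit = c >= '0' && c <= '9';
        switch (s) {
            case START: return digit ? State.INT : State.ERROR;
            case INT:   return digit ? State.INT : (c == '.' ? State.DOT : State.ERROR);
            case DOT:   return digit ? State.FRAC : State.ERROR;
            case FRAC:  return digit ? State.FRAC : State.ERROR;
            default:    return State.ERROR; // ERROR has no way out
        }
    }

    // Runs the automaton over the whole lexeme; only INT and FRAC accept.
    static String classify(String lexeme) {
        State s = State.START;
        for (char c : lexeme.toCharArray()) {
            s = step(s, c);
        }
        if (s == State.INT) return "INTEGER";
        if (s == State.FRAC) return "FLOAT";
        return "INVALID";
    }

    public static void main(String[] args) {
        System.out.println(classify("42"));             // INTEGER
        System.out.println(classify("3.14"));           // FLOAT
        System.out.println(classify("12variableName")); // INVALID
    }
}
```

The same pattern (a set of states, a transition function, and a set of accepting states) extends to identifiers, strings, and multi-character operators.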

Contributors

felipetomazec
