pegasus

A PEG parser library for Crystal

Installation

Add this to your application's shard.yml:

dependencies:
  pegasus:
    github: pawandubey/pegasus

Usage

Let's look at a simple example for a URL query parser that can accept:

  • an optional & at the end of the query
  • an optional = at the end of the query

require "pegasus"

parser = Pegasus::Parser.define do |p|
  p.rule(:query) do |p| # complex base rule with alternatives, repetition and optional occurrence
    (p.rule(:pair) >> p.rule(:sep) >> p.rule(:pair).maybe?).repeat | p.rule(:pair)
  end

  p.rule(:pair) { |p| p.rule(:str).aka(:key) >> p.str("=") >> p.rule(:str).aka(:val).maybe? }

  p.rule(:sep) { |p| p.str("&").aka(:sep) } #=> rename nodes with .aka
  p.rule(:str) { |p| p.match(/\A[^\s\/\\\.&=]+/) } #=> use regex wherever it makes sense

  p.root(:query)
end

res = parser.parse("name=ferret&color=purple") # get back the parse result
#=> #<Pegasus::MatchResult:0x1e418e0>

res.success? # find if parse was successful
#=> true

parse_tree = res.parse_tree # get the parse tree
#=> #<Pegasus::ParseTree:0x1e418c0>

parse_tree.dump # dump the parse tree to JSON
#=> {"label":"rep","children":[{"label":"seq","children":[{"label":"seq","children":[{"label":"seq","children":[{"label":"seq","children":[{"label":"key","item":"name"},{"label":"terminal","item":"="}]},{"label":"rep","children":[{"label":"val","item":"ferret"}]}]},{"label":"sep","item":"&"}]},{"label":"rep","children":[{"label":"seq","children":[{"label":"seq","children":[{"label":"key","item":"color"},{"label":"terminal","item":"="}]},{"label":"rep","children":[{"label":"val","item":"purple"}]}]}]}]}]}
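
The dumped JSON can be walked with Crystal's standard library to pull out labelled nodes. The following is a minimal sketch, assuming the dump is a plain JSON string shaped like the output above; `collect_items` is a hypothetical helper and not part of pegasus.

```crystal
require "json"

# Hypothetical helper (not part of pegasus): walks a dumped tree and collects
# the "item" of every node whose "label" matches the requested one.
def collect_items(node : JSON::Any, label : String, found = [] of String) : Array(String)
  item = node["item"]?
  found << item.as_s if item && node["label"].as_s == label
  if children = node["children"]?
    children.as_a.each { |child| collect_items(child, label, found) }
  end
  found
end

# A fragment of the dump shown above, embedded as a string for illustration.
dump = %({"label":"seq","children":[{"label":"key","item":"name"},{"label":"terminal","item":"="}]})

tree = JSON.parse(dump)
p collect_items(tree, "key") #=> ["name"]
```

The same call with "val" or "sep" as the label would collect values or separators from a full dump.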

For a more complex (and standard) example:

# simple calculator example

require "pegasus"

parser = Pegasus::Parser.define do |p|
  p.rule(:add) do |p|
    p.rule(:mul).aka(:l) >> (p.rule(:addop) >> p.rule(:mul)).repeat | p.rule(:mul)
  end

  p.rule(:mul) do |p|
    p.rule(:int).aka(:l) >> (p.rule(:mulop) >> p.rule(:int)).repeat | p.rule(:int)
  end

  p.rule(:int) do |p|
    p.rule(:digit).aka(:i) >> p.rule(:space?).ignore
  end

  p.rule(:addop) { |p| p.match(/\A[\+\-]/).aka(:o) >> p.rule(:space?).ignore }
  p.rule(:mulop) { |p| p.match(/\A[\*\/]/).aka(:o) >> p.rule(:space?).ignore }
  p.rule(:digit) { |p| p.match(/\A\d+/) }
  p.rule(:space?) { |p| p.match(/\A\s*/) }

  p.root(:add) # define root node to start the parse from
end

res = parser.parse("0-1 + 2 /4 * 51 ")

res.success? #=> true
res.error #=> nil ; this will give you more context on a parse failure.

Contributing

  1. Fork it ( https://github.com/pawandubey/pegasus/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors

  • pawandubey

pegasus's Issues

Refactor Repeatable to use clamp

Repeatable currently raises an ArgumentError when the supplied limits are negative. It should also check for overflow and clamp the value to the range 0..UInt64::MAX.
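
The clamping described above could look roughly like this. It is only a sketch: `clamp_limit` is a hypothetical name, and how Repeatable stores its limits is an assumption.

```crystal
# Hypothetical helper, not part of pegasus: normalizes a repetition limit
# into the 0..UInt64::MAX range instead of raising on out-of-range input.
def clamp_limit(limit : Int) : UInt64
  return 0_u64 if limit < 0                  # negative limits become 0
  return UInt64::MAX if limit > UInt64::MAX  # oversized limits are capped
  limit.to_u64
end

p clamp_limit(-5) #=> 0
p clamp_limit(3)  #=> 3
```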

Better Errors

Currently, the error reporting mechanism is not ideal. We return a true/false result but we provide no extra info when a parse fails.

Ideally, upon failure, we should return:

  • The position (line/col) where the match failed.
  • What was expected to be matched.
  • What was actually encountered.
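
One possible shape for such a failure report is sketched below. `ParseFailure`, `line_and_column` and `build_failure` are hypothetical names, and the sketch assumes the parser can supply the zero-based character offset at which matching stopped.

```crystal
# Hypothetical error record; pegasus does not define this type.
record ParseFailure, line : Int32, column : Int32, expected : String, found : String

# Converts a zero-based character offset into one-based line and column numbers.
def line_and_column(input : String, offset : Int32) : Tuple(Int32, Int32)
  consumed = input[0, offset]
  line = consumed.count('\n') + 1
  last_newline = consumed.rindex('\n')
  column = last_newline ? offset - last_newline : offset + 1
  {line, column}
end

# Bundles position, expectation and the offending character into one record.
def build_failure(input : String, offset : Int32, expected : String) : ParseFailure
  line, column = line_and_column(input, offset)
  found = offset < input.size ? input[offset].to_s : "end of input"
  ParseFailure.new(line, column, expected, found)
end

failure = build_failure("a=1\nb=?", 6, "a value")
p failure.line   #=> 2
p failure.column #=> 3
p failure.found  #=> "?"
```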

Performance issues

Right now it fails to parse the simple calculator grammar with:

GC Warning: Repeated allocation of very large block (appr. size 402657280):
	May lead to memory leak and poor performance.
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
Program received and didn't handle signal IOT (6)

Investigate and make the following pending spec pass:

pending "matches simple calculator" do

Convert parse tree into AST

Right now the parse tree returned as the result is too crowded with junk because of the way the parsing is handled.

For non-terminals/sequences with only a single child, that child should be extracted in place of its parent.
Likewise, a node with no children should be removed.

This would help return a "clean" AST which is easier to operate on.
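
The two rules above can be sketched as a recursive pass over the tree. The `Node` class here is a hypothetical stand-in modelled on the JSON dump (a label plus either an item or children), not the library's actual ParseTree.

```crystal
# Hypothetical stand-in for a parse tree node: terminals carry an item,
# non-terminals carry children.
class Node
  getter label : String
  getter item : String?
  getter children : Array(Node)

  def initialize(@label : String, @item : String? = nil, @children : Array(Node) = [] of Node)
  end
end

# Returns a cleaned-up copy of the tree, or nil if nothing is left of it.
def simplify(node : Node) : Node?
  return node if node.item # terminals are kept as-is
  kids = node.children.compact_map { |child| simplify(child) }
  case kids.size
  when 0 then nil        # no children: drop the node
  when 1 then kids.first # single child: hoist it in place of the parent
  else        Node.new(node.label, nil, kids)
  end
end

wrapped = Node.new("rep", nil, [Node.new("val", "ferret")])
result = simplify(wrapped)
p result.try(&.label) #=> "val"
```

Note that hoisting discards the parent's label, so a real implementation might want to keep labels that were set explicitly with aka.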

Rename the aka method

I found your project and I love that its syntax is so similar to Ruby's parslet library. However, the aka method is kind of an odd name. Perhaps it could be renamed?

  • label
  • named
  • tag
  • id
  • ????

Your project's name conflicts with an existing PEG parser project.

Hi, I just discovered this project, and wanted to inform you that my PEG parser project has been using the name Pegasus since July of 2012 with our first release in August of the same year.

Here are some links to my project:
http://otac0n.com/Pegasus/
https://github.com/otac0n/Pegasus

I would kindly request that you consider renaming your project to avoid confusion when people are doing web searches for either of our projects.

I feel it is right for your project to rename for these reasons:

  • There is confusion when people search for "pegasus parser" as to whether our projects are associated
  • My project has been using the name "Pegasus" for 5 years longer than your project.
  • Your project is currently released as version v0.1.0 , which implies that it has not yet had its first official release. According to SemVer (officially recommended by Crystal), it is still entirely appropriate to make changes to the project name:
  1. Major version zero (0.y.z) is for initial development. Anything may change at any time. The public API should not be considered stable.
