veged / ometa-js

This project forked from alexwarth/ometa-js

OMeta for JavaScript

Home Page: http://veged.github.com/ometa-js/

License: MIT License

Makefile 0.06% JavaScript 99.94%

ometa-js's Introduction

   ____  __  ___     __            _______
  / __ \/  |/  /__  / /_____ _    / / ___/
 / / / / /|_/ / _ \/ __/ __ `/_  / /\__ \
/ /_/ / /  / /  __/ /_/ /_/ / /_/ /___/ /
\____/_/  /_/\___/\__/\__,_/\____//____/

OMetaJS

OMetaJS is a JavaScript implementation of OMeta, an object-oriented language for pattern matching.

This is a node.js module for developing and using such pattern matching grammars.

Installation

Installing npm (node package manager)

$ curl http://npmjs.org/install.sh | sh

Installing ometajs

$ [sudo] npm install ometajs -g

Note: If you are using ometajs programmatically you should not install it globally.

$ cd /path/to/your/project
$ npm install ometajs

Usage

Command line

$ ometajs2js --help

Usage:
  ometajs2js [OPTIONS] [ARGS]


Options:
  -h, --help : Help
  -v, --version : Version
  -i INPUT, --input=INPUT : Input file (default: stdin)
  -o OUTPUT, --output=OUTPUT : Output file (default: stdout)
  --root=ROOT : Path to root module (default: ometajs)

ometajs2js takes an input *.ometajs file and produces a CommonJS-compatible JavaScript file.

You may also require('*.ometajs') files directly, without compiling them first. (OMetaJS patches require.extensions, as CoffeeScript does.)

Usage as CommonJS module

var ometajs = require('ometajs');

var ast = ometajs.grammars.BSJSParser.matchAll('var x = 1', 'topLevel'),
    code = ometajs.grammars.BSJSTranslator.matchAll([ast], 'trans');

Example grammar

ometa Simple {
  top = [#simple] -> 'ok'
}

More information about OMetaJS syntax.

Use cases

Quickly prototype and build your own parser/language. Process/traverse complex ASTs.

Some projects that are using OMetaJS:

More information

To study OMetaJS or ask questions about its core you can reach out to the original repository author Alessandro Warth or me.

Here is the documented code.

Contributors

ometa-js's People

shanewholloway arikon coderpuppy the-grid michalliu emertechie pdubroy maserg sipayrt sevinf aredridel vkz kdawes grawk terryshisan

ometa-js's Issues

Problem: util.error is deprecated

Hello,

Here is what I got running ometajs2js --help (version 4.0.0) on node --version = 5.3.0:

(node) util.error is deprecated. Use console.error instead.

foreign rules

Right now foreign rules throw a "not implemented" exception.

Statements and ;

BSJS grammars seem to be quite frivolous with how they parse and particularly generate JS code.

var source = "function test() { for(var i = 0;i < 2;i++) { }; return [];}";

Of interest are ; before return and the one before the last }. Parsing and then generating JS using ometa BSJS grammars will produce:

function test(){for(var i = 0;i < 2;i++){};undefined;return []}

undefined; is there because empty statements are currently parsed into ['stmt', ['get', 'undefined']]. We should probably have a separate node type for those.

The ; after the very last statement gets dropped completely, because the curlyTrans rule in BSJSTranslator simply joins all statements with rs.join(';'). Another consequence of joining is a pervasive ; even after statements that had no semicolon in the original source. The latter is mostly a problem if we want to preserve (more or less) the code style of the original. Esprima + Escodegen do the right thing here.
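
The separate-node idea can be sketched in plain JavaScript, independent of the real BSJSTranslator. The `['empty']` and `['block']` node shapes and the `emitStatements` helper are hypothetical names chosen for illustration; each statement gets its own terminator instead of being joined with `;`.

```javascript
// Hypothetical statement emitter: `['empty']` is an assumed node type, not part of BSJS.
function emitStatements(stmts) {
  return stmts
    .map(function (stmt) {
      // An empty statement prints as a lone ';' instead of 'undefined;'.
      if (stmt[0] === 'empty') return ';';
      // Blocks and loops carry their own braces and need no terminator.
      if (stmt[0] === 'block') return stmt[1];
      // Every other statement is terminated individually.
      return stmt[1] + ';';
    })
    .join('');
}

var out = emitStatements([
  ['block', 'for(var i = 0;i < 2;i++){}'],
  ['empty'],
  ['stmt', 'return []']
]);
console.log(out);
```

With this shape the trailing semicolon survives and no `undefined;` is produced for the empty statement.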

nodejs hangs on (~'.')+ rule

Example:

prop
    = (~'.')+:prop -> [#prop, prop.join('')]
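
A likely explanation, not verified against the ometajs source: `~'.'` is a negative lookahead and consumes no input, so `(~'.')+` can succeed again and again at the same position. The plain JavaScript sketch below uses hypothetical combinators (`not`, `char`, `anything`, `seq`, `many1`) to show a repetition that stops when no progress is made, and the consuming form `(~'.' anything)+` that was probably intended.

```javascript
// A parser here is a function (input, pos) -> { pos, value } or null on failure.

// Negative lookahead: succeeds without consuming when `p` fails.
function not(p) {
  return function (input, pos) {
    return p(input, pos) === null ? { pos: pos, value: true } : null;
  };
}

// Match one specific character.
function char(c) {
  return function (input, pos) {
    return input[pos] === c ? { pos: pos + 1, value: c } : null;
  };
}

// Match any single character.
function anything(input, pos) {
  return pos < input.length ? { pos: pos + 1, value: input[pos] } : null;
}

// Run `a`, then `b`; the result is that of `b`.
function seq(a, b) {
  return function (input, pos) {
    var ra = a(input, pos);
    if (ra === null) return null;
    return b(input, ra.pos);
  };
}

// One-or-more repetition that refuses to loop on zero-width matches.
function many1(p) {
  return function (input, pos) {
    var values = [];
    for (;;) {
      var r = p(input, pos);
      if (r === null) break;
      values.push(r.value);
      if (r.pos === pos) break; // no progress: stop instead of spinning
      pos = r.pos;
    }
    return values.length ? { pos: pos, value: values } : null;
  };
}

// Equivalent of (~'.' anything)+ : consume characters up to the next dot.
var prop = many1(seq(not(char('.')), anything));
var result = prop('foo.bar', 0);
console.log(result.value.join(''), result.pos);
```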

XJST-like "local"

For deep context dependencies it would be useful to have something like local in XJST.

RegExp position error

ometa TestRe {
test = /test/i,
start = 'T',
end = 'T',
source = (start test end):x -> x,
}

console.log(TestRe.matchAll('TtestT', 'source'))
console.log(TestRe.matchAll('T.tesT', 'source')) // should not match!
console.log(TestRe.matchAll('T...testT', 'source')) // should match or not?

Maybe /test/i should be converted to /^test/i .
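
As an alternative to prepending `^`, the sticky flag (`y`, ES2015) pins a regexp to a given index. The sketch below is a minimal illustration in plain JavaScript; `matchAt` is a hypothetical helper, not part of ometajs.

```javascript
// Hypothetical helper: match `re` only at index `pos` of `input`.
function matchAt(re, input, pos) {
  // Rebuild the regexp with the sticky flag so it cannot skip ahead.
  var flags = re.flags.replace(/[gy]/g, '') + 'y';
  var sticky = new RegExp(re.source, flags);
  sticky.lastIndex = pos;
  var m = sticky.exec(input);
  return m === null ? null : { value: m[0], pos: pos + m[0].length };
}

console.log(matchAt(/test/i, 'TtestT', 1));    // 'test' starts at index 1
console.log(matchAt(/test/i, 'T.tesT', 1));    // null: index 1 is '.'
console.log(matchAt(/test/i, 'T...testT', 1)); // null: 'test' starts later
```

Under this reading the third input from the report would not match, because the regexp may not skip the dots.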

upgrade uglifyjs due to nodejs security advisories

https://nodesecurity.io/advisories/39
https://nodesecurity.io/advisories/48

Update to version 2.6.0 or later

Getting error when parsing "/" symbol in expressions

How to reproduce:

Create file "div.ometajs" with content:

ometa Div {
    digit    = ^digit:d            -> d.digitValue(),
    divExpr  = digit:x '/' digit:y -> (x / y)
}

Execute: ometajs2js -i div.ometajs -o div.js

Error message:

Error: Lexer failer at: "/ y)

I'm using version 3.1.12 from the npm repo.

division sign again ...

div.ometajs:

ometa CalcInterpreter {
  interp = ['num' anything:x]        -> x
         | ['add' interp:x interp:y] -> (x + y)
         | ['sub' interp:x interp:y] -> (x - y)
         | ['mul' interp:x interp:y] -> (x * y)
         | ['div' interp:x interp:y] -> (x / y)
}
// any comment line

ometajs2js -i div.ometajs
throws an error: Unexpected end of file, null expected.

But if I remove the last line (a comment), it works.
It also works if (x / y) is replaced with something without a division sign, for example (x * y).
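
Both this report and the earlier "/" report are consistent with the lexer reading `/` as the start of a regular expression literal rather than as division; that is an assumption, not something confirmed in the ometajs code. A common heuristic is to decide from the previous significant token. The sketch below is simplified (it ignores cases such as `this` or a `)` that closes an `if` condition), and the token shape `{ type, value }` and the function name are hypothetical.

```javascript
// Hypothetical lexer helper: decide whether a '/' starts a regexp literal
// or is the division operator, based on the previous significant token.
function slashStartsRegExp(prevToken) {
  // At the start of input only a regexp literal can appear.
  if (prevToken === null) return true;
  // After a value (identifier, number, string) a '/' divides.
  if (prevToken.type === 'name' || prevToken.type === 'number' ||
      prevToken.type === 'string') {
    return false;
  }
  // After a closing bracket the preceding expression is a value as well.
  if (prevToken.type === 'punc' &&
      (prevToken.value === ')' || prevToken.value === ']')) {
    return false;
  }
  // After operators, '(', ',', keywords such as `return`, and so on.
  return true;
}

// In (x / y) the token before '/' is the name x, so '/' is division.
console.log(slashStartsRegExp({ type: 'name', value: 'x' }));
```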

PEG.js is faster

PEG.js is quite a bit faster than OMetaJS; we need to find a way to speed up grammars.

Use full featured JavaScript parser

According to issue alexwarth#9, the OMetaJS built-in parser was never supposed to support all of JavaScript, so we should use a full-featured JavaScript parser.

Support for backreference?

For example, I want to match:
'aa...aa' or 'aaa...aaa'
With a regexp I can write a rule like /(a*).*\1/, but how can I do this in ometa?
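
One way to think about it is "capture, then compare": bind the first run to a variable and check later that the same text appears again, which in a grammar would presumably be done with a semantic predicate. The plain JavaScript sketch below shows that idea without ometajs; `matchSameEnds` is a hypothetical helper that behaves like `/^(a+).*\1$/`.

```javascript
// Hypothetical matcher for the shape  a+ <anything> a+  where the closing run
// repeats the captured opening run, i.e. what /^(a+).*\1$/ expresses.
function matchSameEnds(input) {
  // Measure the leading run of 'a'.
  var i = 0;
  while (i < input.length && input[i] === 'a') i++;
  // Try the longest prefix first, then shorter ones (backtracking).
  for (var len = i; len > 0; len--) {
    var prefix = input.slice(0, len);
    // The "backreference": the input must end with the captured text,
    // and the two occurrences must not overlap.
    if (input.length >= 2 * len && input.slice(-len) === prefix) {
      return { prefix: prefix, middle: input.slice(len, input.length - len) };
    }
  }
  return null;
}

console.log(matchSameEnds('aa...aa'));
console.log(matchSameEnds('aaa...aaa'));
console.log(matchSameEnds('aa...b'));
```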

How to enable left recursion with parameterized rules?

I'd like to write a left-recursive rule for logical operators. I can write something like:

logicPattern = logicPattern:lhs binop:op logicExpr:rhs -> makeOp(op, lhs, rhs)
    | logicExpr,
logicExpr = term | '(' logicPattern:e ')' -> e,
term = <char:c+>,
binop = '&' | '|' // and and or operators

However, I have a large number of different kinds of terms which should only be combined with like terms, so I'd like to parameterize logicPattern such that it will match only when both arguments are the same kind of term.

E.g.,

logicPattern :term = logicPattern(term):lhs binop:op logicExpr(term):rhs -> makeOp(op, lhs, rhs)
    | logicExpr(term)
logicExpr :term = apply(term) | '(' logicPattern(rule):e ')' -> e
binop = '&' | '|' // and and or operators

And then be able to apply logic for multiple different types of terms as follows:

term1 = <char:c>+,
term2 = <digit:d>+,
//etc
anyLogic = logicPattern(#term1) | logicPattern(#term2) | . etc,

When I try this, however, the stack blows up. Is there a way to enable left-recursion for parameterized rules like this?

Or is there a better approach?


I noticed that in `lib/ometajs/legacy/core.js:276`, there's a comment:

```javascript
// if you want your grammar (and its subgrammars) to memoize parameterized rules, invoke this method on it:
 ```

I assume this enables left-recursion for parameterized rules. Is there a way to do this with the newer code?
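
A possible alternative that avoids left recursion altogether is to rewrite the rule as `logicExpr(term) (binop logicExpr(term))*` and fold the results to the left. The sketch below shows this in plain JavaScript, not in OMeta syntax; the parser shape and the helper `run` are hypothetical, and the tree is built as `[op, lhs, rhs]` in place of the `makeOp` call from the question.

```javascript
// A parser is a function (input, pos) -> { pos, value } or null on failure.

// One or more characters accepted by `test`, returned as a string.
function run(test) {
  return function (input, pos) {
    var start = pos;
    while (pos < input.length && test(input[pos])) pos++;
    return pos > start ? { pos: pos, value: input.slice(start, pos) } : null;
  };
}

var term1 = run(function (c) { return /[a-z]/i.test(c); });
var term2 = run(function (c) { return /[0-9]/.test(c); });

// logicPattern(term) = logicExpr(term) (binop logicExpr(term))*
// The loop builds the same left-nested tree a left-recursive rule would.
function logicPattern(term) {
  function logicExpr(input, pos) {
    if (input[pos] === '(') {
      var inner = pattern(input, pos + 1);
      if (inner !== null && input[inner.pos] === ')') {
        return { pos: inner.pos + 1, value: inner.value };
      }
      return null;
    }
    return term(input, pos);
  }
  function pattern(input, pos) {
    var lhs = logicExpr(input, pos);
    if (lhs === null) return null;
    for (;;) {
      var op = input[lhs.pos];
      if (op !== '&' && op !== '|') return lhs;
      var rhs = logicExpr(input, lhs.pos + 1);
      if (rhs === null) return lhs; // operand of another kind: stop here
      lhs = { pos: rhs.pos, value: [op, lhs.value, rhs.value] };
    }
  }
  return pattern;
}

var letters = logicPattern(term1);
var digits = logicPattern(term2);
console.log(JSON.stringify(letters('a&b|c', 0).value));
console.log(JSON.stringify(digits('1&(2|3)', 0).value));
```

Because each `logicPattern(term)` closes over one term parser, operands of different kinds cannot be combined: `letters('a&1', 0)` stops after `a`.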

Using ometa-js for syntax highlighting with CodeMirror

I'd like to use ometa-js with codemirror.net to provide syntax highlighting for any ometa grammar.

Ideally I'd like to parse any grammar and produce a list of rule-name matches and their positions in the input. Is this currently possible, and if not, what's the best approach to doing so? I'm willing to try adding this and submit a pull request.

Throw on "rule not found" instead of writing with console.error

This bit me hard. I was catching the case of grammar failing to match when the rule is missing:

function check(ast) {
  try {
    K.match(ast, 'topLevel');
  } catch (e) {
    return false;
  }
  return true;
}

ometa K <: KIdentity {
  noTopLevelRule
}

but it kept throwing up and tracing. Hours later I discover this delightful thing:
https://github.com/veged/ometa-js/blob/nodejs/lib/ometajs/core/grammar.js#L168-L171

@indutny is there a good reason not to throw here? I'd rather we didn't write to stderr which results in much weirdness when ometa apps are run in the console, which is always. Thx

"\u2028" or "\u2029" throws error

Example:

ometa Test {
eol = '\n' | '\r' | '\u2028' | '\u2029'
}

Error info:

SyntaxError: Unexpected token ILLEGAL
at Module._compile (module.js:437:25)
at Object.loadExtension [as .ometajs] (C:\Users\Administrator\AppData\Roaming\npm\node_modules\ometajs\lib\ometajs\api.js:30:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:362:17)
at require (module.js:378:17)
at Object.<anonymous> (test-ometa.js:2:14)
at Module._compile (module.js:449:26)
at Object.Module._extensions..js (module.js:467:10)
at Module.load (module.js:356:32)
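
Before ES2019, U+2028 and U+2029 counted as line terminators inside string literals, so a generator that copies them raw into its output produces source that older engines reject. Assuming that is what happens here (not verified against the ometajs code), escaping the two characters when emitting a literal avoids the problem. `toSourceLiteral` below is a hypothetical helper.

```javascript
// Hypothetical helper for a code generator: produce a string literal that is
// safe to embed in generated JavaScript source.
function toSourceLiteral(str) {
  return JSON.stringify(str)
    // JSON.stringify leaves these two characters raw; escape them explicitly.
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

var literal = toSourceLiteral('\u2028');
// The literal now contains a backslash escape instead of the raw character.
console.log(literal);
// Evaluating the generated literal gives back the original character.
console.log(new Function('return ' + literal)() === '\u2028');
```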

Error when one parser inherits from other

TypeError: object is not a function
    at Function.matchAll (/Users/arikon/projects/bem/borschik/node_modules/cssp/node_modules/ometajs/lib/ometajs/core/grammar.js:43:14)
    at Function.match (/Users/arikon/projects/bem/borschik/node_modules/cssp/node_modules/ometajs/lib/ometajs/core/grammar.js:30:15)
    at Function.BorschikCSSChilds.childs (/Users/arikon/projects/bem/borschik/lib/techs/css.ometajs.js:14:17)
    at exports.Tech.INHERIT.File.INHERIT.parseInclude (/Users/arikon/projects/bem/borschik/lib/techs/css.js:17:33)
    at exports.Tech.INHERIT.File.exports.File.INHERIT.parse (/Users/arikon/projects/bem/borschik/lib/tech.js:36:26)
    at exports.Tech.INHERIT.File.exports.File.INHERIT.read (/Users/arikon/projects/bem/borschik/lib/tech.js:26:37)
    at Cmd.<anonymous> (/Users/arikon/projects/bem/borschik/lib/borschik.js:40:18)
    at exports.Cmd.Cmd._do (/Users/arikon/projects/bem/borschik/node_modules/coa/lib/cmd.js:427:22)
    at _fulfilled (/Users/arikon/projects/bem/borschik/node_modules/coa/node_modules/q/q.js:934:32)
    at resolvedValue.promiseSend.done (/Users/arikon/projects/bem/borschik/node_modules/coa/node_modules/q/q.js:956:30)

stack overflow error

I wrote a parser that tries to parse something like an interpolation expression, and it reports a stack overflow! But when I tried the original version, it worked.

It seems that the rewritten version is not as good as the old one. I also found that foreign rules are not supported in the new version, and they are a very useful feature. Some of the issues I submitted here also occur only in the new version. So I'm just curious: why rewrite it? What's the benefit of the new version?

Failed on parsing jquery (not minimized) - "parsererror"

It seems the parser doesn't like the "parsererror" token in the source.

Here is what I've got:

$:> node test/benchmark/bsjs.js
Source length: 252881

/Users/avakarev/Development/oss/ometa-js/lib/ometajs/core/grammar.js:46
    if (!errback) errback = function (err) { throw err; };
                                                   ^
Error: str rule failed at: 596:88
        if ( !xml || !xml.documentElement || xml.getElementsByTagName( "parsererror" ).length ) {
                                                                                        ^
    at Function.matchAll (/Users/avakarev/Development/oss/ometa-js/lib/ometajs/core/grammar.js:53:39)
    at Object.<anonymous> (/Users/avakarev/Development/oss/ometa-js/test/benchmark/bsjs.js:12:8)
    at Module._compile (module.js:446:26)
    at Object..js (module.js:464:10)
    at Module.load (module.js:353:31)
    at Function._load (module.js:311:12)
    at Array.0 (module.js:484:10)
    at EventEmitter._tickCallback (node.js:190:38)

Just curious: why is the default branch nodejs instead of master? :-)

FYI: everything I've tested is in a branch in my fork: https://github.com/avakarev/ometa-js/tree/testing-to-parse-popular-libs

ometajs2js should have option to compile code without dependencies

Code generated by ometajs2js should be vanilla JS, but currently it depends on ometajs.
There is no real reason to install ometajs with npm install in every project and include it as a dependency.
The generated parser could easily bundle the parts of ometajs it needs, so that it can be included as is without any further steps. Profit.
How can this be resolved? Or how can I build code without dependencies?

"Unexpected token: punc (:)" error when returning object literal

When I try to use a grammar like the following (that returns an object literal)

ometa Lisp {
  // Lexer
  identifier = <letter+>:id      -> { type: "Id", value: id },
  number     = <digit+>:num      -> { type: "Number", value: parseInt(num) },
  punctuator = '(' | ')' |'.' | ',',

  token :tt  = spaces ( punctuator:t             ?(t == tt)      -> t
                      | (identifier | number):t  ?(t.type == tt) -> t
                      ),

  // Parser
  list       = token("(") (atom | list)+:cs token(")") -> { type: "List", content: cs },
  atom       = token("Id") | token("Number")
}

I get the error:

Unexpected token: punc (:) (line: 2, col: 9, pos: 22)

Error
at new JS_Parse_Error (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:263:18)
at js_error (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:271:11)
at croak (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:733:9)
at token_error (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:740:9)
at unexpected (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:746:9)
at Object.semicolon [as 1] (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:766:43)
at prog1 (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:1314:21)
at simple_statement (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:906:27)
at C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:801:47
at block_ (C:\Source\Tests\OMeta\node_modules\ometajs\node_modules\uglify-js\lib\parse-js.js:1003:20)

However if I call a function with that literal, then it works so I'm hacking around it for now like this:

function obj(o) { return o; }

ometa Lisp {
  // Lexer
  identifier = <letter+>:id      -> obj({ type: "Id", value: id }),
  number     = <digit+>:num      -> obj({ type: "Number", value: parseInt(num) }),
  punctuator = '(' | ')' |'.' | ',',

  token :tt  = spaces ( punctuator:t             ?(t == tt)      -> t
                      | (identifier | number):t  ?(t.type == tt) -> t
                      ),

  // Parser
  list       = token("(") (atom | list)+:cs token(")") -> obj({ type: "List", content: cs }),
  atom       = token("Id") | token("Number")
}

On a Windows box btw
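
One plausible reading of the error, not confirmed against the ometajs source: the action body reaches the JavaScript parser at statement position, where `{` opens a block rather than an object literal. The plain JavaScript sketch below demonstrates that behaviour; `parses` is a hypothetical helper. If this is the cause, wrapping the literal in parentheses, `-> ({ type: "Id", value: id })`, might be a lighter workaround than the `obj` function.

```javascript
// Returns true when `body` parses as the body of a JavaScript function.
function parses(body) {
  try {
    new Function(body); // compiles the body without running it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// At statement position '{' opens a block, so `type:` is read as a label
// and the second ':' is a syntax error.
console.log(parses('{ type: "Id", value: id }'));   // false
// Wrapped in parentheses the same text is an object literal expression.
console.log(parses('({ type: "Id", value: id })')); // true
```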

the 'token' rule appears to be broken

I am just looking into OMeta, and ometa-js... I don't have enough experience to debug the issue, but I've made a pretty simple test case:

ometa Simple {
  top = cat:c mouse*:ms end -> [#top, c, ms],

  identifier = letter*:chars -> chars.join(''),

  cat = "cat" space* identifier:id -> [#cat, id],
  mouse = "mouse" space* identifier:id -> [#mouse, id]
}
console.log(Simple.matchAll('cat fluffy mouse mickey', 'top'));

The parse fails here on the space after 'fluffy'.
I would have expected it to succeed, based on the description of the 'token' rule (invoked with the double-quotes around 'cat', 'mouse'): eat spaces, and then match the text.

so, after matching the rule cat (consuming text "cat fluffy"), the next rule to match should be mouse, whose first rule is a 'token' rule, so the whitespace after 'fluffy' should be eaten, and then the text 'mouse' should be consumed.

The code works on @alexwarth playground: http://tinlizzie.org/ometa-js/#token_test

If I put space* in front of "mouse" (ie 'manually' making the token rule), then it works here in ometa-js.
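
The behaviour expected from the description (eat spaces, then match the text) can be sketched in plain JavaScript as follows. This illustrates the expected semantics only; it is not the ometajs implementation, and `token` here is a hypothetical function.

```javascript
// Hypothetical sketch of the expected `token` behaviour:
// skip leading whitespace, then match `text` exactly.
function token(text) {
  return function (input, pos) {
    while (pos < input.length && /\s/.test(input[pos])) pos++;
    return input.startsWith(text, pos)
      ? { pos: pos + text.length, value: text }
      : null;
  };
}

var source = 'cat fluffy mouse mickey';
var afterCat = token('cat')(source, 0);
// 10 is the index of the space that follows 'fluffy'.
var mouse = token('mouse')(source, 10);
console.log(afterCat.pos, mouse.pos);
```

With these semantics the space after 'fluffy' is skipped by the token itself, so no explicit `space*` is needed in front of "mouse".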
