antlr4-native-rb's People

Contributors

alexeymorozov, camertron, dsisnero, maxirmx, zakjan

antlr4-native-rb's Issues

antlr upgrade to 4.10.1

Hello,
If I need to upgrade ANTLR to 4.10.1, what would be the correct procedure?
4.10+ is not compatible with 4.9 and earlier, so I need the 4.10 generator.
I see that antlr4-4.8-1-complete.jar is used, so I suppose that I will need something like antlr4-4.10-1-complete.jar. How should I build it?

NULL as a keyword in the target language breaks compilation

So... C++, right? It has a macro somewhere called NULL, and when tokens are generated in the parser's source code, each token's own name is used as the identifier. SQL has a NULL keyword, so trying to generate a parser from a grammar for any flavor of it fails because NULL is already defined.

For instance, take Trino's (formerly PrestoSQL) grammar definition. The generated .h/.cpp files will fail to compile.

Expected behaviour:
It compiles normally.
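
The collision can be shown without ANTLR at all. The sketch below is only an illustration: SqlTokensSketch and token_name_sketch are hypothetical names, not generated code, and renaming the lexer rule (for example to NULL_LITERAL) is one possible workaround rather than a confirmed fix in this project.

```cpp
#include <cstddef>  // pulls in the NULL macro, as most C++ translation units do
#include <string>

// Hypothetical stand-in for the token constants of a generated lexer.
struct SqlTokensSketch {
  enum : int {
    SELECT = 1,
    // NULL = 2,       // would not compile: the preprocessor rewrites NULL first
    NULL_LITERAL = 2,  // renamed lexer rule, safe as a C++ identifier
  };
};

// Maps a token type back to its name, as generated vocabularies usually do.
std::string token_name_sketch(int type) {
  switch (type) {
    case SqlTokensSketch::SELECT: return "SELECT";
    case SqlTokensSketch::NULL_LITERAL: return "NULL_LITERAL";
    default: return "<unknown>";
  }
}
```

Because the preprocessor runs before the compiler sees the enum, the token name never reaches the parser as an identifier, which is why the error messages tend to look unrelated to the grammar.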

Unable to compile MySQL

Trying to use the MySQL grammar: https://github.com/antlr/grammars-v4/tree/master/sql/mysql/Positive-Technologies

Using rake task and the extconf.rb from https://gist.github.com/lavoiesl/efed2ed8886b32d778c5fd30bdb16390

$ bin/rake antlr:runtime:clone # ok
$ bin/rake antlr:mysql:generate # ok


$ bin/rake antlr:mysql:compile
Running via Spring preloader in process 97444
checking for rice/rice.hpp in /Users/seb/.gem/ruby/3.2.2/gems/rice-4.3.1/include... yes
checking for -lc++... yes
checking for -lstdc++... yes
creating Makefile
compiling my_sql_parser.cpp
compiling ../lib/antlr4-cpp-runtime/Vocabulary.cpp
compiling ../lib/antlr4-cpp-runtime/WritableToken.cpp
...
compiling ../lib/antlr4-cpp-runtime/atn/LexerCustomAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerIndexedCustomAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerModeAction.cpp
my_sql_parser.cpp:41070:12: error: expected ')'
    return Qnil;
           ^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil            RUBY_Qnil              /**< @old{RUBY_Qnil} */
                        ^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:40: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   RBIMPL_CAST((VALUE)RUBY_Qnil)
                                       ^
my_sql_parser.cpp:41070:12: note: to match this '('
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil            RUBY_Qnil              /**< @old{RUBY_Qnil} */
                        ^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:21: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   RBIMPL_CAST((VALUE)RUBY_Qnil)
                    ^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/cast.h:43:5: note: expanded from macro 'RBIMPL_CAST'
    (expr)                                  \
    ^
my_sql_parser.cpp:41070:12: error: reference to non-static member function must be called; did you mean to call it with no arguments?
    return Qnil;
           ^~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil            RUBY_Qnil              /**< @old{RUBY_Qnil} */
                        ^~~~~~~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:33: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   RBIMPL_CAST((VALUE)RUBY_Qnil)
                                ^~~~~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/cast.h:43:6: note: expanded from macro 'RBIMPL_CAST'
    (expr)                                  \
     ^~~~
my_sql_parser.cpp:41076:12: error: expected ')'
    return Qnil;
           ^

(Many errors like that)

fatal error: too many errors emitted, stopping now [-ferror-limit=]
compiling ../lib/antlr4-cpp-runtime/atn/LexerMoreAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerPopModeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerPushModeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerSkipAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerTypeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LookaheadEventInfo.cpp
compiling ../lib/antlr4-cpp-runtime/atn/NotSetTransition.cpp
compiling ../lib/antlr4-cpp-runtime/atn/OrderedATNConfigSet.cpp
compiling ../lib/antlr4-cpp-runtime/atn/ParseInfo.cpp
compiling ../lib/antlr4-cpp-runtime/atn/ParserATNSimulator.cpp
20 errors generated.
make: *** [my_sql_parser.o] Error 1
make: *** Waiting for unfinished jobs....

You can find the full output in the Gist above.


I tested similar steps with the lua parser used in the spec of this repo and it compiles just fine.

I suspect that the MySQL grammar is generating invalid C++, but the file is enormous and I don't really speak C++.


I found #15 and antlr/grammars-v4#1905, which seem related, but the MySQL grammar already uses NULL_LITERAL and I tried renaming (TRUE|FALSE) to *_LITERAL in the lexer and parser, but it didn't change anything.
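
One possible explanation, not confirmed anywhere in this thread: the log shows the compiler rejecting the (VALUE) cast inside Ruby's Qnil expansion with "reference to non-static member function must be called". That message is what a compiler emits when VALUE resolves to a member function instead of a type. If the grammar defines a token named VALUE (an assumption, not checked here), a generated class could get an accessor called VALUE() that hides Ruby's VALUE typedef. The sketch below reproduces the effect with stand-in names only; ContextProxySketch, SKETCH_NIL and SKETCH_NIL_QUALIFIED are hypothetical.

```cpp
#include <cstdint>

typedef std::uintptr_t VALUE;              // stand-in for Ruby's VALUE typedef
#define SKETCH_NIL ((VALUE)0)              // unqualified cast, like the Qnil expansion in the log
#define SKETCH_NIL_QUALIFIED ((::VALUE)0)  // same cast, but naming the global typedef explicitly

// Hypothetical generated class with an accessor named after a grammar token.
struct ContextProxySketch {
  int VALUE() { return 42; }  // inside this class, VALUE now names this function

  // std::uintptr_t broken() { return SKETCH_NIL; }  // error: VALUE is a member function here
  std::uintptr_t fixed() { return SKETCH_NIL_QUALIFIED; }  // ::VALUE still finds the typedef
};
```

If this is indeed the cause, renaming the token in the grammar would avoid the clash in the same way that NULL_LITERAL avoids the NULL macro.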

A probable root cause of segfaults (1)

The method ContextProxy::getChildren returns a Ruby array of Ruby objects:

class ContextProxy {
...
  Array getChildren() {
    if (children == nullptr) {
      children = new Array();

      if (orig != nullptr) {
        for (auto it = orig -> children.begin(); it != orig -> children.end(); it ++) {
          Object parseTree = ContextProxy::wrapParseTree(*it);

          if (parseTree != Nil) {
            children -> push(parseTree);
          }
        }
      }
    }

    return *children;
  }
...
}

C++ actually creates a copy of *children that is returned to the caller. Objects in the array are not copied, of course.
When the Ruby garbage collector frees the array, it also unmarks and frees all objects contained in the array. However, the code above potentially reuses these objects without marking them. It looks like the root cause of at least some of the crashes discussed at https://github.com/camertron/antlr4-native-rb#caveats

I have indirect validation of this idea. In the recent version of https://github.com/lutaml/expressir we had reproducible segfaults. Resolving the issue that I have described also resolved segfaults.
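
The failure mode described here can be modeled in a few lines. This is a toy collector, not Ruby's real GC and not Rice's API: ToyHeap and CachingProxySketch are hypothetical names, and integer ids stand in for Ruby objects. It only shows why a cache that the collector cannot see ends up holding freed objects.

```cpp
#include <unordered_set>
#include <vector>

// Toy heap: object ids stand in for Ruby objects.
class ToyHeap {
 public:
  int allocate() {
    live_.insert(next_id_);
    return next_id_++;
  }

  // Keep only the ids the root reports; everything else is swept.
  template <typename Root>
  void collect(const Root& root) {
    std::unordered_set<int> marked;
    root.mark(marked);
    live_ = marked;
  }

  bool alive(int id) const { return live_.count(id) > 0; }

 private:
  std::unordered_set<int> live_;
  int next_id_ = 0;
};

// Hypothetical proxy that caches child ids, like the children member above.
struct CachingProxySketch {
  std::vector<int> cached_children;
  bool marks_children;

  void mark(std::unordered_set<int>& marked) const {
    if (!marks_children) return;  // the cache is invisible to the collector
    for (int id : cached_children) marked.insert(id);
  }
};
```

With marks_children set to false, the cached id is swept on the next collection while the proxy still holds it, which corresponds to reusing a freed Ruby object. With it set to true, the cached id survives.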

Return value from ParserProxy.visit method

Currently, the ParserProxy.visit method doesn't return any value.

  VALUE visit(VisitorProxy* visitor) {
    visitor -> visit(this -> parser -> syntax());
    ...
    return Qnil;
  }

My Ruby visitor returns objects from visit_* methods. How can I propagate them through ParserProxy.visit? I tried to update the method code in various ways and recompile, but haven't succeeded yet. Usually I received RuntimeError: std::bad_cast or ArgumentError: Unable to convert antlrcpp::Any* at runtime.
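
A std::bad_cast from a type-erased container usually means the value was read back as a different type than the one stored. The sketch below shows that behavior with std::any (C++17) as a stand-in for antlrcpp::Any; whether this is the actual cause here is an assumption, and RubyObjectHandleSketch and the function names are hypothetical.

```cpp
#include <any>
#include <string>

// Hypothetical stand-in for whatever wraps the Ruby object returned by a visit_* method.
struct RubyObjectHandleSketch {
  std::string description;
};

// Pretend visitor result: the handle is stored in a type-erased container.
std::any visit_result_sketch() {
  return RubyObjectHandleSketch{"result of visit_syntax"};
}

// Reading the value back as any other type throws (std::bad_any_cast derives from std::bad_cast).
bool can_unwrap_as_int() {
  try {
    (void)std::any_cast<int>(visit_result_sketch());
    return true;
  } catch (const std::bad_any_cast&) {
    return false;
  }
}

// Reading it back as exactly the stored type works.
std::string unwrap_as_handle() {
  return std::any_cast<RubyObjectHandleSketch>(visit_result_sketch()).description;
}
```

Under this reading, ParserProxy.visit would have to extract exactly the type that the visitor proxy stored before converting it to a Ruby value.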

A probable root cause of segfaults (2)

The ParserProxy class has members that are pointers to the parser, tokens, lexer and input ANTLR objects.
These objects are deleted in the ParserProxy destructor.

This can cause an issue if Ruby code has obtained a reference to one of these objects and the ParserProxy is then garbage collected.
This is the case described in the Rice documentation at https://jasonroelofs.com/rice/4.x/advanced/functions.html#keep-alive (the second example).

To resolve this issue, Return().keepAlive() should be applied to the definitions of the ParserProxy methods:

  rb_cParser = define_class_under<ParserProxy>(rb_mExpressParser, "Parser")
    .define_singleton_function("parse", &ParserProxy::parse)
    .define_singleton_function("parse_file", &ParserProxy::parseFile)
    .define_method("syntax", &ParserProxy::syntax, Return().keepAlive())
    .define_method("visit", &ParserProxy::visit, Return().keepAlive());
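
The lifetime rule that keepAlive enforces (a returned object keeps its owner alive) has a plain C++ analogue in the aliasing constructor of std::shared_ptr. The sketch below is only that analogue, not how Rice implements it; ParserProxySketch, SyntaxTreeSketch and tree_keeping_proxy_alive are hypothetical names.

```cpp
#include <memory>

struct SyntaxTreeSketch {
  int node_count = 3;
};

// Hypothetical owner: the tree is destroyed together with the proxy.
struct ParserProxySketch {
  SyntaxTreeSketch tree;
};

// Aliasing shared_ptr: it points at the tree but shares ownership of the
// whole proxy, so the proxy cannot be destroyed while the tree is in use.
std::shared_ptr<SyntaxTreeSketch> tree_keeping_proxy_alive(
    const std::shared_ptr<ParserProxySketch>& proxy) {
  return std::shared_ptr<SyntaxTreeSketch>(proxy, &proxy->tree);
}
```

Dropping the last direct reference to the proxy leaves the tree valid for as long as the returned pointer exists. A raw pointer to proxy->tree would give no such guarantee, which matches the crash scenario described above.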

Dasherized output dir is different from ruby dir name conventions

I understand that this library is primarily meant to create a standalone gem with gemerator. However, it can be used to generate part of a larger gem as well, where the user can require the bundle where needed.

Would you mind changing dasherize to underscore in gem_name to match Ruby directory naming conventions, or, maybe better, removing gem_name from interop_file and antlrgen_dir completely and giving full control to the caller via the output_dir param?

I stumbled upon this when using rake-compiler, whose default config assumes that the output bundle name is the same as the ext dir name. This produces a dasherized bundle name.
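
To make the difference concrete, here is a simplified sketch of the two naming schemes. It is not the library's code: join_words_sketch, dasherize_sketch and underscore_sketch are hypothetical helpers, and runs of capitals (acronyms) are not handled the way a real inflector might handle them.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Lowercases a CamelCase name and puts sep before every capital except the first.
std::string join_words_sketch(const std::string& camel, char sep) {
  std::string out;
  for (std::size_t i = 0; i < camel.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(camel[i]);
    if (std::isupper(c) && i > 0) out += sep;
    out += static_cast<char>(std::tolower(c));
  }
  return out;
}

// "ExpressParser" becomes "express-parser"
std::string dasherize_sketch(const std::string& name) {
  return join_words_sketch(name, '-');
}

// "ExpressParser" becomes "express_parser"
std::string underscore_sketch(const std::string& name) {
  return join_words_sketch(name, '_');
}
```

A tool that expects the bundle name to equal the ext directory name will only find the bundle if both sides use the same separator.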
