camertron / antlr4-native-rb
Create native Ruby extensions from (almost) any ANTLR4 grammar.
License: MIT License
Hello,
If I need to upgrade ANTLR to 4.10.1, what would be the correct procedure?
4.10+ is not compatible with 4.9 and earlier, so I need a 4.10 generator.
I see that antlr4-4.8-1-complete.jar is used, so I suppose that I will need something like antlr4-4.10-1-complete.jar. How shall I build it?
So... C++, right? It has a macro somewhere called NULL. And when tokens are generated in the source code of the parser, they use the token itself as the name. Now, SQL has a NULL keyword, so trying to generate a parser from a grammar for any flavor of it fails because NULL is already defined.
For instance, take Trino's (formerly PrestoSQL) grammar definition. The generated .h/.cpp files will fail to compile.
Expected behaviour:
It compiles normally.
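The collision described above can be reproduced without ANTLR. What follows is a minimal sketch, not the code the generator emits: SketchLexer and null_token_type are hypothetical names, and the push_macro/pop_macro pragmas are a compiler extension (GCC, Clang and MSVC accept them) used here only to illustrate one possible workaround. Renaming the token in the grammar, for example to NULL_LITERAL, avoids the problem without touching generated code.

```cpp
#include <cstddef>  // one of the standard headers that defines the NULL macro

// With the macro active, a line such as `NULL = 2` inside an enum is rewritten
// by the preprocessor before the compiler sees it, so it cannot compile.
// Temporarily hiding the macro lets the identifier through.
#pragma push_macro("NULL")
#undef NULL

// Hypothetical stand-in for a generated lexer whose token names mirror the grammar.
struct SketchLexer {
  enum : std::size_t { SELECT = 1, NULL = 2 };
};

// Hypothetical helper; it has to live inside the region where the macro is hidden.
inline std::size_t null_token_type() { return SketchLexer::NULL; }

#pragma pop_macro("NULL")  // restore NULL for the rest of the translation unit
```

Code outside the push/pop region can still reach the token through a helper such as null_token_type(), but it cannot spell SketchLexer::NULL directly, which is why renaming the token is usually the less fragile option.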
Trying to use the MySQL grammar: https://github.com/antlr/grammars-v4/tree/master/sql/mysql/Positive-Technologies
Using the rake task and the extconf.rb from https://gist.github.com/lavoiesl/efed2ed8886b32d778c5fd30bdb16390
$ bin/rake antlr:runtime:clone # ok
$ bin/rake antlr:mysql:generate # ok
$ bin/rake antlr:mysql:compile
Running via Spring preloader in process 97444
checking for rice/rice.hpp in /Users/seb/.gem/ruby/3.2.2/gems/rice-4.3.1/include... yes
checking for -lc++... yes
checking for -lstdc++... yes
creating Makefile
compiling my_sql_parser.cpp
compiling ../lib/antlr4-cpp-runtime/Vocabulary.cpp
compiling ../lib/antlr4-cpp-runtime/WritableToken.cpp
...
compiling ../lib/antlr4-cpp-runtime/atn/LexerCustomAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerIndexedCustomAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerModeAction.cpp
my_sql_parser.cpp:41070:12: error: expected ')'
return Qnil;
^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil RUBY_Qnil /**< @old{RUBY_Qnil} */
^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:40: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil RBIMPL_CAST((VALUE)RUBY_Qnil)
^
my_sql_parser.cpp:41070:12: note: to match this '('
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil RUBY_Qnil /**< @old{RUBY_Qnil} */
^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:21: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil RBIMPL_CAST((VALUE)RUBY_Qnil)
^
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/cast.h:43:5: note: expanded from macro 'RBIMPL_CAST'
(expr) \
^
my_sql_parser.cpp:41070:12: error: reference to non-static member function must be called; did you mean to call it with no arguments?
return Qnil;
^~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:60:25: note: expanded from macro 'Qnil'
#define Qnil RUBY_Qnil /**< @old{RUBY_Qnil} */
^~~~~~~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/special_consts.h:358:33: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil RBIMPL_CAST((VALUE)RUBY_Qnil)
^~~~~~~
/opt/rubies/3.2.2/include/ruby-3.2.0/ruby/internal/cast.h:43:6: note: expanded from macro 'RBIMPL_CAST'
(expr) \
^~~~
my_sql_parser.cpp:41076:12: error: expected ')'
return Qnil;
^
(Many errors like that)
fatal error: too many errors emitted, stopping now [-ferror-limit=]
compiling ../lib/antlr4-cpp-runtime/atn/LexerMoreAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerPopModeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerPushModeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerSkipAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LexerTypeAction.cpp
compiling ../lib/antlr4-cpp-runtime/atn/LookaheadEventInfo.cpp
compiling ../lib/antlr4-cpp-runtime/atn/NotSetTransition.cpp
compiling ../lib/antlr4-cpp-runtime/atn/OrderedATNConfigSet.cpp
compiling ../lib/antlr4-cpp-runtime/atn/ParseInfo.cpp
compiling ../lib/antlr4-cpp-runtime/atn/ParserATNSimulator.cpp
20 errors generated.
make: *** [my_sql_parser.o] Error 1
make: *** Waiting for unfinished jobs....
You can find the full output in the Gist above.
I tested similar steps with the lua parser used in the spec of this repo and it compiles just fine.
I suspect that the MySQL grammar is generating invalid C++, but the file is enormous and I don't really speak C++.
I found #15 and antlr/grammars-v4#1905, which seem related, but the MySQL grammar already uses NULL_LITERAL, and I tried renaming (TRUE|FALSE) to *_LITERAL in the lexer and parser, but it didn't change anything.
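One possible reading of the diagnostics above, offered as an assumption and not as a confirmed diagnosis: the error at column 33 of the RUBY_Qnil definition points at VALUE, and "reference to non-static member function must be called" suggests that VALUE resolves to a method inside the generated class. If the grammar defines a token named VALUE (the MySQL lexer appears to), a generated accessor called VALUE() would hide Ruby's VALUE typedef inside that class. The sketch below reproduces the effect without Ruby or ANTLR; the typedef, the macro, the class and both methods are hypothetical stand-ins.

```cpp
// Stand-in for Ruby's VALUE typedef; the real one comes from ruby.h.
typedef unsigned long VALUE;

// Stand-in for the cast that the Qnil macro expands to. The constant 4 is
// arbitrary. The leading `::` is the part that Ruby's own macro does not have.
#define SKETCH_QNIL ((::VALUE)4)

// Hypothetical context proxy. An accessor named after a grammar token called
// VALUE hides the global typedef inside the class scope.
struct SketchContextProxy {
  int VALUE() { return 1; }

  // Writing `(VALUE)4` here would name the member function above and trigger
  // the "must be called" error. Qualifying with `::` reaches the typedef.
  unsigned long nil_value() { return SKETCH_QNIL; }
};
```

Ruby's headers cannot be edited this way, so if this reading is right, the practical options would be renaming the VALUE token in the grammar (as was done with NULL_LITERAL) or having the generator emit a differently named accessor. The second option is a suggestion, not an existing feature of the library.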
The method ContextProxy::getChildren returns a Ruby array of Ruby objects:
class ContextProxy {
  ...
  Array getChildren() {
    if (children == nullptr) {
      children = new Array();
      if (orig != nullptr) {
        for (auto it = orig -> children.begin(); it != orig -> children.end(); it ++) {
          Object parseTree = ContextProxy::wrapParseTree(*it);
          if (parseTree != Nil) {
            children -> push(parseTree);
          }
        }
      }
    }
    return *children;
  }
  ...
}
C++ actually creates a copy of *children that is returned to the caller. Objects in the array are not copied, of course.
When the Ruby garbage collector frees the array, it also unmarks and frees all objects contained in the array. However, the code above potentially reuses these objects without marking them. This looks like the root cause of at least some of the crashes discussed in the caveats section of the README (https://github.com/camertron/antlr4-native-rb#caveats).
I have indirect validation of this idea. In a recent version of https://github.com/lutaml/expressir we had reproducible segfaults. Resolving the issue that I have described also resolved the segfaults.
Currently, the ParserProxy.visit method doesn't return any value:
VALUE visit(VisitorProxy* visitor) {
  visitor -> visit(this -> parser -> syntax());
  ...
  return Qnil;
}
My Ruby visitor returns objects from its visit_* methods. How can I propagate them through ParserProxy.visit? I tried updating the method code in various ways and recompiling, but haven't succeeded yet. Usually I got RuntimeError: std::bad_cast or ArgumentError: Unable to convert antlrcpp::Any* at runtime.
The ParserProxy class has members that are pointers to the parser, tokens, lexer and input ANTLR objects.
These objects are deleted in the ParserProxy destructor.
This can cause an issue if Ruby code has obtained a reference to one of these objects and the ParserProxy is garbage collected afterwards.
This is the case described in the Rice documentation at https://jasonroelofs.com/rice/4.x/advanced/functions.html#keep-alive (the second example).
To resolve this issue, Return().keepAlive() should be applied to the definitions of the ParserProxy methods:
rb_cParser = define_class_under<ParserProxy>(rb_mExpressParser, "Parser")
  .define_singleton_function("parse", &ParserProxy::parse)
  .define_singleton_function("parse_file", &ParserProxy::parseFile)
  .define_method("syntax", &ParserProxy::syntax, Return().keepAlive())
  .define_method("visit", &ParserProxy::visit, Return().keepAlive());
I understand that this library is primarily meant to create a standalone gem with the generator. However, it can also be used to generate part of a larger gem, where the user can require the bundle where needed.
Would you mind changing dasherize to underscore in gem_name to match Ruby directory naming conventions? Or, maybe better, removing gem_name from interop_file and antlrgen_dir completely and giving full control to the caller via the output_dir param?
I stumbled upon this when using rake-compiler, whose default config assumes that the output bundle name is the same as the ext dir name. This produces a dasherized bundle name.