ridencww / goldengine
Java implementation of Devin Cook's GOLD Parser engine
License: Other
According to GOLD Parser's documentation, grammar files have a header at the beginning (cgt class reads it).
However, the Parser class does not check that header. This could lead to more errors.
I'm trying to parse line comments and the following error appears:
Runaway group (no closing group terminator found). Last position line 6, column 9.
It might be related to issue #17
To work around that issue temporarily, I used the following:
String text = ltsInputString.getSource().replaceAll("\n", "\r");
When I remove the replacement of \n with \r (the workaround for issue 17), the comments work properly, but errors are reported on the wrong line (issue 17 still occurs).
I'm running on Linux.
Block comments work.
The GRM file contains:
Comment Start = '/*'
Comment End = '*/'
Comment Line = '//'
I don't think the whole grammar file will help.
If the parsed source file contains a line comment in the last non-empty code line (provided, of course, that the grammar specifies line comments), the parser fails with error.group_runaway, no matter whether the file ends with a newline or not. Even if hundreds of newlines follow, parsing fails.
With trailing block comments, the behaviour is correct: if the end symbol is missing, i.e., the group is actually unterminated, the error duly occurs; otherwise it does not.
The attachment contains a Java grammar and a Java file that pass the test in GOLDBuilder but fail in the engine with the error described above.
final_line_comment_failure.zip
I have UTF-8 symbols like this
NOT = [¬]
in my grammar.
If I start generated java code like this
java -classpath .;./goldengine-5.0.3-SNAPSHOT prenex_FOLsn ..\Algor.fol -tree
I have got
Lexical error at line 5, column 37. Read (Error)
Parse tree is not available. Did you set generateTree(true)?
exactly where ¬ appears in Algor.fol.
What should I do?
Alex
UTF-8 encoded source files with a BOM will cause a parsing error. Parser needs to handle BOM properly.
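Until the engine handles this itself, one possible workaround is to strip the BOM before handing the text to the parser. Java's UTF-8 decoder keeps a leading BOM as the character U+FEFF, so it can be removed from the decoded string. The class and method names below are hypothetical and not part of goldengine; this is only a minimal sketch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of goldengine.
final class BomStripper {
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading U+FEFF, if one is present. */
    static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of the BOM, followed by "A = 2".
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A', ' ', '=', ' ', '2'};
        String decoded = new String(raw, StandardCharsets.UTF_8);
        System.out.println(decoded.length());           // 6, the BOM is still there
        System.out.println(stripBom(decoded).length()); // 5
    }
}
```

The cleaned string would then be passed to the parser in place of the raw file contents.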
For the reverse engineering facility in Structorizer (Nassi-Shneiderman diagram generation from source code parsing) we were interested in being able to associate identified source comments to the closest tokens.
We didn't find such a possibility, though, and wrote our own workaround. Have we missed something? If not, would it be a helpful enhancement?
I attach the GOLDParser subclass we wrote for this purpose (ignore the proprietary logging mechanism, which has nothing to do with it). It simply produces a hash map Token --> String (the protected field commentMap). In the diagram generator, we then defined a further map Reduction --> Token, derived from all non-terminal entries of the token-comment map. We also defined language-specific sets of production rule IDs as stoppers for the actual association of the retrieved comment strings with meaningful syntactical units (diagram elements), because we had to prevent all comments of substructure elements from also being attached to their containing compound statements. That is of course an application-specific detail, briefly outlined in the class comment.
AuParser.zip
If you decide to integrate the proposed comment retrieval infrastructure, it might be helpful to make the commentMap field public.
When constructing an instance of the parser, an IllegalStateException is thrown while loading the CGT or EGT file. This can occur when the size of the tables exceeds 128 items.
Hi Ralph,
Sorry if this is not the right place for this; I'm new to GitHub.
Just one question: why is the Variable class not an abstract class or an interface? I'm asking because the current implementation of Variable feels limiting to me.
For example, Variable offers asDouble(), asInt() and asNumber(), when it would be possible (I think) to have one method for integers and one for doubles, or to let the user of your GoldEngine choose between int, Integer, long, Long, BigDecimal, BigInteger or any other type.
Another goal for me is to be able to choose between implementations of Variable: for example, one using only primitive types for fast execution, or one using BigDecimal and BigInteger for high precision, simply by switching the variable type.
Regards,
Stef
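The kind of swappable numeric type requested above could look roughly like the following. This is only an illustrative sketch: NumericValue, FastValue and PreciseValue are hypothetical names and do not exist in goldengine, and the real Variable class has a different API.

```java
import java.math.BigDecimal;

// Hypothetical abstraction; not part of goldengine.
interface NumericValue {
    NumericValue add(NumericValue other);
    BigDecimal toBigDecimal();
}

// Fast variant backed by a primitive double.
final class FastValue implements NumericValue {
    private final double value;

    FastValue(double value) { this.value = value; }

    public NumericValue add(NumericValue other) {
        return new FastValue(value + other.toBigDecimal().doubleValue());
    }

    public BigDecimal toBigDecimal() { return BigDecimal.valueOf(value); }
}

// Precise variant backed by BigDecimal.
final class PreciseValue implements NumericValue {
    private final BigDecimal value;

    PreciseValue(String literal) { this(new BigDecimal(literal)); }

    private PreciseValue(BigDecimal value) { this.value = value; }

    public NumericValue add(NumericValue other) {
        return new PreciseValue(value.add(other.toBigDecimal()));
    }

    public BigDecimal toBigDecimal() { return value; }
}
```

With such an interface, rule handlers would depend only on NumericValue, and the choice between speed and precision would be made where the values are created.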
When building with Maven, only the goldengine.jar was built, which was confusing because of the documentation. The project always built with Ant, but Maven support was added so the engine.jar could be posted to the Maven Central Repository.
The readme says:
The goldengine for Java is compatible with version 5.0 of the GOLD parsing engine.
http://www.goldparser.org/news/index.htm mentions some changes after 5.0 - are there any that are relevant to the java engine implementation?
I was writing a simplified version of a C grammar for my program. I tested it with a couple of lines of code in Golden Parser Generator, and it worked correctly.
But when I used it in my program and ran it with the ridencww engine, it didn't reduce some tokens correctly and produced errors.
I attached my Grammar and my Test-file so you can analyze it in your engine.
http://www.mediafire.com/download.php?sd4vs0sdcplau1a
Unfortunately I'm running out of time and need to deliver the program to my client soon, so I have to use another engine for now. But I like your engine very much and want to use it in future projects.
Thank you.
Hello dear friend,
First, I want to thank you for the handy and clean engine you have shared with us.
I also want to report a small bug:
The library on my PC (OS: Windows 7) is in the path
"D:\Science\Lessons\Compiler\Gold-Grammer\Ralph Iden Engine\JavaLib"
and the spaces in the path lead to a corrupted path here:
File: ResourceHelper.java
Method: findClassesInPackage
Relevant code:
URL resource = resources.nextElement();
jarFile = getJarFile(resource.toString()); // <-- problem here
if (jarFile != null) {
    break;
}
dirs.add(new File(resource.getFile())); // <-- and here
-->
The problem is that when you convert a URL to a file path via getFile() or toString(), special characters stay in URL-encoded form (a space, for example, remains %20), which yields an incorrect path.
I changed it as follows, and it works now:
-->
"D:\Science\Lessons\Compiler\Gold-Grammer\Ralph%20Iden%20Engine\JavaLib"
URL resource = resources.nextElement();
// Decode percent-escapes such as %20 before using the path
String path = URLDecoder.decode(resource.getFile(), "UTF-8");
jarFile = getJarFile(path);
if (jarFile != null) {
    break;
}
dirs.add(new File(path));
Hi,
I have a strange problem. At first it was with my grammar but I adapted the grammar of the sample2 to use a new line based grammar:
{WS} = {Whitespace} - {CR} - {LF}
Whitespace = {WS}+
NewLine = {CR}{LF}|{CR}
<nl> ::= NewLine <nl> !One or more
| NewLine
<Statements> ::= <Statement> <nl> <Statements>
| <Statement> <nl>
Finally, here the kind of code I used with this grammar:
assign n = 1
while n >= 1 do display n
assign n = n - 1
end
display 'Blast off!'
This code is working perfectly with Gold Parser Builder 5.2 but if I try it with the Iden Java Engine, I have this error:
2013-06-25 23:20:23 ERROR HtmlController:255 - Lexical error at line 1, column 13. Read (Error)
2013-06-25 23:20:23 ERROR HtmlController:256 - assign n = 1
while n >= 1 do display n
assign n = n - 1
end
display 'Blast off!'
I just have "(Error)" without a real reason of the error.
Is there a mistake in my grammar, or is this a problem with the Iden Java Engine?
Thank you for your help.
Regards,
Stef
Hi @ridencww ,
When I use this grammar with both the builder and the engine, a few of the reductions are missing from the tree produced by the engine parser. Could you please take a look? I'm using GOLD Parser for the first time, so please let me know if there is something wrong in my grammar or in my use of the engine.
MANUFACTURER 252,
DEVICE_TYPE 1,
DEVICE_REVISION 1,
DD_REVISION 1,
MANUFACTURER_EXT "xyz"
BLOCK DeviceBlock
{
TYPE PHYSICAL;
NUMBER 1;
}
The grammar is for EDDL(Electronic device description language). The grammar implementation is as seen below.
"Name" = 'EDDL grammar'
"Author" = 'Ashwin Jason Fernandes'
"Version" = 'The version of the grammar and/or language'
"About" = 'A short description of the grammar'
"Case Sensitive" = True
"Start Symbol" =
{Hex Digit} = {Digit} + [abcdefABCDEF]
{Oct Digit} = [01234567]
{String Ch} = {Printable} - ["]
{Id Head} = {Letter} + [_]
{Id Tail} = {Id Head} + {Digit}
DecLiteral = [123456789]{digit}*
HexLiteral = 0X | 0x{Hex Digit}+
OctLiteral = 0{Oct Digit}*
FloatLiteral = {Digit}*'.'{Digit}+
Id = {Id Head}{Id Tail}*
! ===================================================================
! Comments
! ===================================================================
Comment Start = '/'
Comment End = '/'
Comment Line = '//'
! -------------------------------------------------
! Character Sets
! -------------------------------------------------
{String Chars} = {Printable} + {HT} - ["]
! -------------------------------------------------
! Terminals
! -------------------------------------------------
!Identifier = {Letter}{AlphaNumeric}*
StringLiteral = '"' {String Chars}* '"'
! -------------------------------------------------
! Constants / Literals
! -------------------------------------------------
::= | StringLiteral
::= | FloatLiteral
::= DecLiteral | HexLiteral | OctLiteral
! -------------------------------------------------
! Rules
! -------------------------------------------------
! The grammar starts below
::= | |
! ===================================================================
! EDD Identification Declaration
! ===================================================================
::= ','
|
::=
::= MANUFACTURER
| DEVICE_TYPE
| DEVICE_REVISION
| DD_REVISION
| MANUFACTURER_EXT
::=
|
|
|
|
|
|
! ===================================================================
! Type Declaration
! ===================================================================
! ===================================================================
! Block Declaration
! ===================================================================
::= BLOCK Id '{' '}'
::= TYPE ';'
| NUMBER';'
|
::= PHYSICAL
| TRANSDUCER
| FUNCTION
! ===================================================================
! Variable Declaration
! ===================================================================
::= VARIABLE Id '{' '}'
::=
|
|
|
|
|
|
|
::= TYPE ';'
| TYPE '(' ')'';'
| TYPE '{''}'
| TYPE '(' ')''{''}'
::= CLASS ';'
::= CONTAINED
| DIAGNOSTIC
| LOCAL
::= LABEL StringLiteral';' | LABEL '['Id']'';'
::= HELP StringLiteral';' | HELP '['Id']'';'
::= CONSTANT_UNIT '['Id']'';' | CONSTANT_UNIT StringLiteral';'
::= SCALING_FACTOR ';'
| MAX_VALUE ';'
| MIN_VALUE ';'
| DEFAULT_VALUE ';'
| INITIAL_VALUE ';'
|
|
|
::= INTEGER
| UNSIGNED_INTEGER
| FLOAT
| ASCII
| ENUMERATED
| BIT_ENUMERATED
::= '{'','StringLiteral'}'','|'{'','StringLiteral'}'
::= '{'','StringLiteral','StringLiteral',''}'','
| '{'','StringLiteral','StringLiteral',''}'
::= READ_TIMEOUT Id';'
| READ_TIMEOUT ';'
::= HANDLING ';'|HANDLING '&' ';'
::= READ
| WRITE
::= | '&' |
::= HARDWARE
| SOFTWARE
| CORRECTABLE
| UNCORRECTABLE
! ===================================================================
! Array Declaration
! ===================================================================
::= ARRAY Id '{' '}'
::= | TYPE Id ';' | NUMBER_OF_ELEMENTS ';' |
! ===================================================================
! Collection Declaration
! ===================================================================
::= COLLECTION Id '{' '}' | COLLECTION OF VARIABLE Id '{' '}'
::= | MEMBERS '{''}'|
::= Id','Id';'|
! ===================================================================
! List Declaration
! ===================================================================
::= LIST Id '{' '}'
::=
| TYPE Id ';'
| CAPACITY ';'
| COUNT ';'
| COUNT Id';'
|
! ===================================================================
! Command Declaration
! ===================================================================
::= COMMAND Id '{' '}'
::= BLOCK Id';'
| INDEX ';'
| NUMBER ';'
| OPERATION ';'
| TRANSACTION '{' '}'
| RESPONSE_CODES '{' '}'
|
::= READ
| WRITE
| COMMAND
| DATA_EXCHANGE
::= REQUEST '{''}' | REPLY '{''}' |
::= '['']' |
::= Id',' | Id |
::= Id',' | Id |
::= ','','StringLiteral';' |
::= SUCCESS
| MISC_ERROR
| MISC_WARNING
| DATA_ENTRY_ERROR
| MODE_ERROR
::= ','StringLiteral |
! ===================================================================
! Component Declaration
! ===================================================================
::= COMPONENT Id '{' '}'
::=
|
| CAN_DELETE ';'
| CLASSIFICATION ';'
| DECLARATION '{' '}'
| PROTOCOL Id';'
::= TRUE | FALSE
::= NETWORK_COMPONENT
::=
|
The grammar looks a bit funny when you view it. It looks fine in the edit mode. If you have any problems, please let me know. Will mail you the grammar.
Regards,
Ashwin
When I use your engine on Linux, the row (line) counter doesn't increment for consecutive empty lines.
For example (four empty lines separate the two statements):
A = 2
wrdjsadfklja
Error message: Syntax error at line 2, column 1.
Should be
Syntax error at line 6, column 1.
I suspect the issue is here:
Parser.java (class)
private void consumeBuffer(int count) {
    if (count > 0 && count <= lookaheadBuffer.length()) {
        // Adjust position
        for (int i = 0; i < count; i++) {
            char c = lookaheadBuffer.charAt(i);
            if (c == 0x0A) {
                if (sysPosition.getColumn() > 1) {
                    // Increment row if Unix EOLN (LF)
                    sysPosition.incrementLine();
                }
            } else if (c == 0x0D) {
                sysPosition.incrementLine();
            } else {
                sysPosition.incrementColumn();
            }
        }
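The snippet above only counts an LF when the column is greater than 1, so, assuming incrementLine() also resets the column to 1, every LF after the first in a run of empty lines would be skipped. One possible alternative is to remember whether the previous character was a CR and to skip only the LF that directly follows it. The LineTracker class below is a hypothetical, standalone sketch of that idea, not the engine's actual Position class.

```java
// Hypothetical standalone tracker; not part of goldengine.
final class LineTracker {
    private int line = 1;
    private int column = 1;
    private boolean lastWasCr = false;

    /** Advances the position over the given text, treating CR, LF and CRLF as one line break each. */
    void consume(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                newLine();
                lastWasCr = true;
            } else if (c == '\n') {
                // The LF of a CRLF pair was already counted with the CR.
                if (!lastWasCr) {
                    newLine();
                }
                lastWasCr = false;
            } else {
                column++;
                lastWasCr = false;
            }
        }
    }

    private void newLine() {
        line++;
        column = 1;
    }

    int getLine() { return line; }

    int getColumn() { return column; }

    public static void main(String[] args) {
        LineTracker tracker = new LineTracker();
        tracker.consume("A = 2\n\n\n\n\n");
        System.out.println(tracker.getLine() + ":" + tracker.getColumn()); // 6:1
    }
}
```

Because lastWasCr is kept as a field, a CRLF pair that is split across two consume() calls is still counted once.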
com.creativewidgetworks.goldparser.util.ResourceHelper
In method findClassesInPackage(String packageName)
resource.getFile() strips away the "jar:" prefix.
The method getJarFile(String filePath) then cannot find "jar:file:/", so it does not recognize jar files.
In getJarFile(String filePath), replacing:
filePath = filePath.substring((filePath.indexOf("jar:file:/") + 9), filePath.indexOf('!'));
with:
filePath = filePath.substring( filePath.indexOf("/"), filePath.indexOf('!'));
Fixed it for me.
Thanks,
Kevin
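As an alternative to index arithmetic on the string, the jar location can be extracted by dropping only the "jar:" prefix and letting java.net.URI and java.io.File handle the rest, which keeps the leading slash and also decodes escapes such as %20. The JarPaths class and jarFileOf method are hypothetical names for illustration and are not the engine's getJarFile; this is a minimal sketch assuming URLs of the form "jar:file:/...!/...".

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper; not part of goldengine.
final class JarPaths {
    /**
     * Extracts the jar file from a URL string such as
     * "jar:file:/opt/app/lib.jar!/com/example".
     * Returns null if the string is not a jar:file: URL.
     */
    static File jarFileOf(String resourceUrl) {
        int bang = resourceUrl.indexOf("!/");
        if (!resourceUrl.startsWith("jar:file:") || bang < 0) {
            return null;
        }
        // Drop only "jar:" so the remaining "file:/..." URI keeps its leading slash.
        URI fileUri = URI.create(resourceUrl.substring("jar:".length(), bang));
        // File(URI) decodes percent-escapes such as %20.
        return new File(fileUri);
    }

    public static void main(String[] args) {
        File jar = jarFileOf("jar:file:/opt/my%20app/lib/engine.jar!/com/example");
        System.out.println(jar.getName()); // engine.jar
    }
}
```

URI.create throws an unchecked IllegalArgumentException for malformed input, so a caller may want to catch that where resource URLs come from an untrusted source.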
First of all thanks a lot for your work. This is a wonderful library. I am actually creating a GOLD parser engine for golang using your codebase as a standard as it has an extensive set of test cases. So while porting the codebase, I found out that one of the assert statements in ParserTest.java is commented out at Line 293. Further there was a comment saying
will be null for 1.0 and 0 for 5.0
but when I un-commented it, I could see that the value of groups.size() is 2 instead of 0 or null. So I was wondering: is there an issue or bug behind it?
Moreover, I am just curious to know, if the fixes for the issues (reported by @nimatrueway) are in the master or not?
Thanks again for your work.
I get the message
No rule handler for rule ::= Declaration Id sort .
But I have this handler (RuleHandler13.class) in my folder (\prenex_FOLsn) for handlers.
What should I do?
Due to the restrictions of LALR(1) grammars, some GOLD grammars provoke ambiguous situations for totally legal code. It's not always straightforward how to tweak the grammar for the engine to cope, it may not even be possible.
As far as I can see, the engine always stops on detecting an error. As a workaround, the code to be parsed can of course be preprocessed by trial and error, but this is very time-consuming for large source files.
In some cases, however, it might be relatively easy to intervene manually and interactively to advise the engine which way to go in order to rerail the parsing process and resume.
Might there be a chance to allow an embedding application to do this, e.g. to skip a line, assign a token type or to decide for a certain reduction among a limited choice, and to resume the parsing process from that very point?
See e.g. these issues for details: fesch/Structorizer.Desktop#470 and fesch/Structorizer.Desktop#472.
Regards, Kay
I have a diff here that shows the change that works for me (I haven't tested it on Windows).
The leading '/' is stripped off the jar file path making it relative instead of absolute... so the jar file can't be found.
I just added 9 instead of 10 to remove the leading 'jar:file:' instead of 'jar:file:/' but I'm not sure if you'll ever see the double slash in a jar URI?
I forked your project and added maven support in a branch. With a little bit of work to your Ant build script I think we could get this working with Ant and Maven. Is this something you are interested in?
The rule handlers are not executed when generateTree(true) is set.
Is this intentional for some reason? If not, it would be nice to have both the tree generated and the rule handlers executed.
I suspect the following code in GOLDParser.class is the cause:
/**
 * Base parser builds a tree of Reduction objects
 * Override to process reductions
 * @return Boolean to indicate if processing should stop (true) or continue (false).
 */
protected boolean processReduction() {
    if (!generateTree && ruleHandlers.size() > 0) {
        try {
            Reduction reduction = createInstance();
            setCurrentReduction(reduction);
        } catch (Throwable t) {
            addErrorMessage(t.getMessage());
            return true;
        }
    }
    return false;
}
I think it can be corrected by doing the following:
/**
 * Base parser builds a tree of Reduction objects
 * Override to process reductions
 * @return Boolean to indicate if processing should stop (true) or continue (false).
 */
protected boolean processReduction() {
    if (ruleHandlers.size() > 0) {
        try {
            Reduction reduction = createInstance();
            setCurrentReduction(reduction);
        } catch (Throwable t) {
            addErrorMessage(t.getMessage());
            return true;
        }
    }
    return false;
}
May I ask where to find documentation for the engine?