metthal / ifj-projekt Goto Github PK
View Code? Open in Web Editor NEWIntepret pre podjazyk jazyka PHP do predmetu IFJ (2013/2014)
Home Page: https://github.com/metthal/IFJ-Projekt/wiki
Intepret pre podjazyk jazyka PHP do predmetu IFJ (2013/2014)
Home Page: https://github.com/metthal/IFJ-Projekt/wiki
Code $x = put_string("abc \x");
ends up with scanner error.
From ifj2013.pdf: Escape sequence can't create error.
Another problem is $x = put_string("\x\n");
where \n
isn't interpreted as escape sequence because of \x
.
Last example $x = put_string("\x\$");
also create an error, altough this should be completely valid.
Provide function with KMP algorithm implemented. Marked with minor priority right now as it is not important for us at this moment.
Create homogenous vector with basic interface.
Generic type, constructor will receive size of one item that will be later used by all non-specific methods like pop, size, capacity.
Specific methods can be:
void vectorPush(Vector* vec, SpecificItem item);
SpecificItem vectorAt(Vector* vec, uint32_t index);
0 (master) metthal [~/skola/IFJ/IFJ-Projekt]$ make
gcc -Wall -Wextra -c -O2 -s -std=c99 string.c -o string.o
gcc -Wall -Wextra -c -O2 -s -std=c99 nierr.c -o nierr.o
gcc -Wall -Wextra -c -O2 -s -std=c99 convert.c -o convert.o
gcc -Wall -Wextra -c -O2 -s -std=c99 parser.c -o parser.o
In file included from parser.c:1:0:
scanner.h:75:1: error: unknown type name ‘FILE’
FILE* scannerOpenFile(const char *fileName);
^
parser.c: In function ‘parse’:
parser.c:13:5: warning: implicit declaration of function ‘prog’ [-Wimplicit-function-declaration]
prog();
^
parser.c: At top level:
parser.c:16:6: warning: conflicting types for ‘prog’ [enabled by default]
void prog()
^
parser.c:13:5: note: previous implicit declaration of ‘prog’ was here
prog();
^
parser.c: In function ‘prog’:
parser.c:27:13: warning: implicit declaration of function ‘body’ [-Wimplicit-function-declaration]
body();
^
parser.c: At top level:
parser.c:35:6: warning: conflicting types for ‘body’ [enabled by default]
void body()
^
parser.c:27:13: note: previous implicit declaration of ‘body’ was here
body();
^
parser.c: In function ‘body’:
parser.c:43:13: warning: implicit declaration of function ‘stmt’ [-Wimplicit-function-declaration]
stmt();
^
parser.c:55:21: warning: implicit declaration of function ‘func’ [-Wimplicit-function-declaration]
func();
^
parser.c: At top level:
parser.c:88:6: warning: conflicting types for ‘func’ [enabled by default]
void func()
^
parser.c:55:21: note: previous implicit declaration of ‘func’ was here
func();
^
parser.c: In function ‘func’:
parser.c:110:5: warning: implicit declaration of function ‘paramList’ [-Wimplicit-function-declaration]
paramList();
^
parser.c:130:5: warning: implicit declaration of function ‘stmtList’ [-Wimplicit-function-declaration]
stmtList();
^
parser.c: At top level:
parser.c:143:6: warning: conflicting types for ‘stmtList’ [enabled by default]
void stmtList()
^
parser.c:130:5: note: previous implicit declaration of ‘stmtList’ was here
stmtList();
^
parser.c:188:6: warning: conflicting types for ‘stmt’ [enabled by default]
void stmt()
^
parser.c:43:13: note: previous implicit declaration of ‘stmt’ was here
stmt();
^
parser.c: In function ‘stmt’:
parser.c:201:13: warning: implicit declaration of function ‘expr’ [-Wimplicit-function-declaration]
expr();
^
parser.c:294:21: warning: implicit declaration of function ‘elseifStmt’ [-Wimplicit-function-declaration]
elseifStmt();
^
parser.c:297:21: warning: implicit declaration of function ‘elseStmt’ [-Wimplicit-function-declaration]
elseStmt();
^
parser.c:359:21: warning: implicit declaration of function ‘forStmt1’ [-Wimplicit-function-declaration]
forStmt1();
^
parser.c:371:21: warning: implicit declaration of function ‘forStmt2’ [-Wimplicit-function-declaration]
forStmt2();
^
parser.c: At top level:
parser.c:427:6: warning: conflicting types for ‘elseifStmt’ [enabled by default]
void elseifStmt()
^
parser.c:294:21: note: previous implicit declaration of ‘elseifStmt’ was here
elseifStmt();
^
parser.c:505:6: warning: conflicting types for ‘elseStmt’ [enabled by default]
void elseStmt()
^
parser.c:297:21: note: previous implicit declaration of ‘elseStmt’ was here
elseStmt();
^
parser.c:562:6: warning: conflicting types for ‘paramList’ [enabled by default]
void paramList()
^
parser.c:110:5: note: previous implicit declaration of ‘paramList’ was here
paramList();
^
parser.c: In function ‘paramList’:
parser.c:574:13: warning: implicit declaration of function ‘nparamList’ [-Wimplicit-function-declaration]
nparamList();
^
parser.c: At top level:
parser.c:584:6: warning: conflicting types for ‘nparamList’ [enabled by default]
void nparamList()
^
parser.c:574:13: note: previous implicit declaration of ‘nparamList’ was here
nparamList();
^
parser.c:614:6: warning: conflicting types for ‘forStmt1’ [enabled by default]
void forStmt1()
^
parser.c:359:21: note: previous implicit declaration of ‘forStmt1’ was here
forStmt1();
^
parser.c:641:6: warning: conflicting types for ‘forStmt2’ [enabled by default]
void forStmt2()
^
parser.c:371:21: note: previous implicit declaration of ‘forStmt2’ was here
forStmt2();
^
parser.c:657:6: warning: conflicting types for ‘expr’ [enabled by default]
void expr()
^
parser.c:201:13: note: previous implicit declaration of ‘expr’ was here
expr();
^
make: *** [parser.o] Error 1
Add doxylike comments to your code. Self explanatory.
Enrich bottom-up parser with unary minus. Precedence table doesn't have to be updated, only during the reduction, if the topmost terminal is STT_Minus
, look at the previous token.
If it is NonTerminal
then it supposes to be rule E -> E - E
, otherwise it is E -> -E
As soon as program enters get_string() function, it is stuck there and nothing happens event after ENTER or CTLR + D.
Can be tested with following code
<?php
$x = get_string();
This will be the main subject of next meeting, because we this is the biggest blocker for us right now. I advice you to watch 2008 demos with date 30.10. and further.
Starting tests...
StringTests:
Starting test suite in test_string.c...
Suite tests passed: 51
Suite tests failed: 0
ScannerTests:
Starting test suite in test_scanner.c...
GetToken - double [ FAIL ]
Test at test_scanner.c:202 failed!
SHOULD_EQUAL(token->type == STT_Double)
GetToken - double - Value [ FAIL ]
Test at test_scanner.c:203 failed!
SHOULD_EQUAL(token->d == 12E-1)
GetToken - double - EOF [ FAIL ]
Test at test_scanner.c:205 failed!
SHOULD_EQUAL(token->type == STT_EOF)
Suite tests passed: 91
Suite tests failed: 3
TokenVectorTests:
Starting test suite in test_vector.c...
Suite tests passed: 30
Suite tests failed: 0
IalTests:
Starting test suite in test_ial.c...
Suite tests passed: 11
Suite tests failed: 0
Result:
Total tests passed: 183
Total tests failed: 3
We need to write lot of test cases to test all possibilities of input. Additional info will be added later.
1. Construct precedence table
2. Implement bottom-up analysis based on precedence table for operators and logical expressions
3. Provide functions to extend parser with expression evaluation
Update 17.11.2013
Terminal to non-terminal, binary operations and parentheses are done, but we still need support for functions
Instruction generation is missing
Update 18.11.2013
Just instruction generation is missing
As we now have scanner, we should also write tests for it.
If the substring
is not part of the string
, length of the string is returned instead of -1. Same applies if the searched string is longer than the original one.
<?php
$str = "acdc";
$tmp = put_string(find_string($str, "a"), "\n");
$tmp = put_string(find_string($str, "c"), "\n");
$tmp = put_string(find_string($str, "d"), "\n");
$tmp = put_string(find_string($str, "0"), "\n");
With this source code
<?php
$a = 3 * 4;
From the debugger
(gdb) bt
#0 0x0000000000409b72 in interpretationLoop (firstInstruction=0x615270, constTable=0x6142a0, addressTable=0x6142d0, stack=0x614e40) at interpreter.c:283
#1 0x000000000040a82f in interpret (firstInstruction=0x615270, constTable=0x6142a0, addressTable=0x6142d0) at interpreter.c:617
#2 0x0000000000404af9 in parse (tokenVector=0x0, testRun=0 '\000') at parser.c:127
#3 0x000000000040741e in main (argc=2, argv=0x7fffffffe058) at main.c:28
(gdb) p *resVal
Cannot access memory at address 0x0
Seems like missing allocation of Value
on the stack before storing the result.
Based on the dynamic array, stack should be implemented. Standalone PR for this one because this one is not blocker and can wait.
The point of this issue:
convert.c
12ab
valid input and can be converted to 12
?Update 16.11.2013
Committed some of the changes, but what I am still curious about, is ERR_Convert
setting in other than stringToDouble
function as it is not needed anywhere in our interpreter. The only possible error can be achieved only in stringToDouble
when there is non-empty string starting with non-numeric format.
<?php
function foo()
{
return 5;
}
$x = put_string(foo(), "\n");
After IST_Call
instruction is processed, instructionPtr
is set to the new instruction from addressTable
, however it seems to be uninitialized.
(gdb) call vectorSize(addressTable)
$4 = 1
(gdb) call *(Instruction*)vectorAt(addressTable, 0)
$5 = {code = 6378656, a = 0, b = 0, res = 0}
Based on finite state machine, scanner needs to be implemented.
Scanner has to contain:
<?php
$x = put_string(find_string("cccababaccc", "ab"), "\n")
Throws
Error at vector.c:63:vectorAt: Undocumented error.
Error at vector.c:63:vectorAt: Undocumented error.
And return with result code 99
Makefile will have these targets:
build (default target) - "release" configuration build with optimalization and with no debugging symbols
debug - build with debugging symbols
clean - removes all files made during compilation
pack - packs the project to compressed tarball ready for submission
profile - build with profiling information for gprof
If any other is needed, will be added later.
Anyone feels like making it look nice & useful?
steps to reproduce:
./ini sources/for.ifj
./ini sources/while.ifj
expectation:
working while and for statement.
If we don't want to delete parentheses support from expression, we will have to implement also logical operatoeras and
, or
, !
, &&
and ||
.
Build
scanner.c: In function ‘scannerGetToken’:
scanner.c:36:17: warning: implicit declaration of function ‘isspace’ [-Wimplicit-function-declaration]
else if ( isspace( symbol ) )
^
scanner.c:185:92: warning: multi-character character constant [-Wmultichar]
if ( ( symbol == '_' ) || ( symbol >= 'a' && symbol <= 'z') || ( symbol >= ' A' && symbol <= 'Z' ) || ( symbol >= '0' && symbol <= '9') ) {
^
scanner.c:12:22: warning: unused variable ‘token’ [-Wunused-variable]
ScannerTokenType token = STT_Empty;
^
scanner.c:298:1: warning: control reaches end of non-void function [-Wreturn-type]
}
Static analysis
Checking scanner.c...
[scanner.c:56]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:60]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:64]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:68]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:72]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:76]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:80]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:84]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:88]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:92]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
[scanner.c:12]: (style) Variable 'token' is assigned a value that is never used.
Inside intepreter.c::interpretationLoop::case IST_Jmpz
, instructionPtr
is updated and then it is used to cleanup expression, but it is already pointing at the wrong instruction.
Can be tested on
<?php
if (1)
{
$x = put_string("1");
}
else
{
$x = put_string("0");
}
Self explanatory. Marked as minor priority as it is not important for us right now.
Will benefit string variables most, but in worst case it will save at least one instruction per expression.
This one throws warning under PHP but according to the WIS board, we should just imagine scanner as automaton in which EOF is reached and we are not in finishing state, it is lex error.
<?php
/* non terminated multiline comment
Implementation of parser simulating derivation tree and generation of TAC (Three Address Code).
Operator precedence will have its own PR, it is not subject of this one.
We need dynamic array/vector/string, call it whatever you want. Without this, scanner cannot be implemented.
Additional info later.
Instruction
structure should be enriched with mode
member which would tell instructions which operands is meant to be on stack
and which in constTable
Because of this, bottom-up parser needs to be updated to properly generate instructions with correct mode
Tested with this code
<?php
$x = (7 + 8) * 5 - 4;
if ($x < 5)
{
$tmp = put_string("if\n");
}
elseif ($x < 10)
{
$tmp = put_string("elseif\n");
}
else
{
$tmp = put_string("else\n");
}
According to debugger, after $x < 10
is evaluated, there is one IST_Noop
instruction. This may help.
$ ./ini-test -f
Starting test suite StringTests in test_string.c...
Starting test suite ScannerTests in test_scanner.c...
GetToken - keyword (function) [ FAIL ]
Test at test_scanner.c:110 failed!
SHOULD_EQUAL(token->type == STT_Keyword)
GetToken - keyword (function) - keyword type [ FAIL ]
Test at test_scanner.c:111 failed!
SHOULD_EQUAL(token->keywordType == KTT_Function)
GetToken - double exponent - Value [ FAIL ]
Test at test_scanner.c:180 failed!
SHOULD_EQUAL(token->d == 12.03E+04)
GetToken - double exponent - EOF [ FAIL ]
Test at test_scanner.c:182 failed!
SHOULD_EQUAL(token->type == STT_EOF)
GetToken - <?php [ FAIL ]
Test at test_scanner.c:192 failed!
SHOULD_EQUAL(token->type == STT_Php)
GetToken - <?php - EOF [ FAIL ]
Test at test_scanner.c:194 failed!
SHOULD_EQUAL(token->type == STT_EOF)
GetToken - Colon [ FAIL ]
Test at test_scanner.c:212 failed!
SHOULD_EQUAL(token == NULL)
GetToken - multitoken - concat - operator [ FAIL ]
Test at test_scanner.c:222 failed!
SHOULD_EQUAL(token->type == STT_Dot)
GetToken - multitoken - concat - operand2 [ FAIL ]
Test at test_scanner.c:224 failed!
SHOULD_EQUAL(token->type == STT_Variable)
GetToken - keywordType - If [ FAIL ]
Test at test_scanner.c:246 failed!
SHOULD_EQUAL(token->keywordType == KTT_If)
GetToken - keywordType - Else [ FAIL ]
Test at test_scanner.c:248 failed!
SHOULD_EQUAL(token->keywordType == KTT_Else)
GetToken - keywordType - ElseIf [ FAIL ]
Test at test_scanner.c:250 failed!
SHOULD_EQUAL(token->keywordType == KTT_Elseif)
GetToken - keywordType - Continue [ FAIL ]
Test at test_scanner.c:252 failed!
SHOULD_EQUAL(token->keywordType == KTT_Continue)
GetToken - keywordType - Break [ FAIL ]
Test at test_scanner.c:254 failed!
SHOULD_EQUAL(token->keywordType == KTT_Break)
GetToken - keywordType - While [ FAIL ]
Test at test_scanner.c:256 failed!
SHOULD_EQUAL(token->keywordType == KTT_While)
GetToken - keywordType - For [ FAIL ]
Test at test_scanner.c:258 failed!
SHOULD_EQUAL(token->keywordType == KTT_For)
GetToken - keywordType - True [ FAIL ]
Test at test_scanner.c:260 failed!
SHOULD_EQUAL(token->keywordType == KTT_True)
GetToken - keywordType - False [ FAIL ]
Test at test_scanner.c:262 failed!
SHOULD_EQUAL(token->keywordType == KTT_False)
GetToken - keywordType - Return [ FAIL ]
Test at test_scanner.c:264 failed!
SHOULD_EQUAL(token->keywordType == KTT_Return)
GetToken - keywordType - Function [ FAIL ]
Test at test_scanner.c:266 failed!
SHOULD_EQUAL(token->keywordType == KTT_Function)
GetToken - keywordType - Null [ FAIL ]
Test at test_scanner.c:268 failed!
SHOULD_EQUAL(token->keywordType == KTT_Null)
Tests passed: 95
Tests failed: 21
Function vectorAt
is one of the most frequent in out code. However it can be optimized on some places
vectorAt
, there can be just one use of vectorBack
and then decrementationvectorBack
which is faster than vectorAt
and then aVal - 1
If you know more places where vectorAt
can be optimized out, just write it here.
I will do bottom-up parser and if someone wants to, he can do the interpreter part. If not, I will take care of both. (Don't forget to reference this issue according to the wiki if one issue is divided into more commits)
Provide hash table implementation for symbol table and interface to iterate, search, access etc.
Hash function will be discussed on the next meeting if needed.
Interpreter ends without syntax error, I believe this is a bug.
How to reproduce:
<?php
$a = 5;
$x = put_string("number:", $a,;
Guys please check it.
Additional info later.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.