c2nes / javalang
Pure Python Java parser and tools
License: MIT License
When parsing an enum with fields and/or methods, they are parsed correctly, but the fields and methods properties on the EnumDeclaration return empty lists.
This appears to be because the properties inherited from TypeDeclaration assume that body is a list of declarations, which does not hold for EnumBody (iterating it walks the tree and yields (path, node) tuples).
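Until the inherited properties handle EnumBody, one possible workaround is to read the declarations off the enum body directly. The sketch below assumes that EnumDeclaration.body exposes a `declarations` list; `enum_members` is a hypothetical helper name, not part of javalang.

```python
def enum_members(enum_decl, member_type):
    """Return the enum body declarations that are instances of member_type.

    Assumption: enum_decl.body.declarations is a plain list of declaration
    nodes (or None when the enum has no members).
    """
    declarations = getattr(enum_decl.body, "declarations", None) or []
    return [decl for decl in declarations if isinstance(decl, member_type)]
```

With javalang this would presumably be called as enum_members(enum_node, javalang.tree.FieldDeclaration) or with javalang.tree.MethodDeclaration.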
In tokenizer.py, these lines:
elif startswith in ("//", "/*"):
    if self.try_javadoc_comment():
should be:
elif startswith in ("//", "/*"):
    if startswith == "/*" and self.try_javadoc_comment():
Otherwise, this comment in Java code causes a parse error:
//************* <------- this will be parsed as javadoc's begin and eat "/" in the end.
void func1() {
}//
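The distinction this fix relies on can be shown in isolation. The following is a standalone sketch, not the tokenizer's actual code; `comment_kind` is a hypothetical helper that only classifies the comment opener at a given offset.

```python
def comment_kind(source, i=0):
    """Classify the comment that starts at source[i], or return None."""
    if source.startswith("//", i):
        # A line comment stays a line comment even when '*' follows it.
        return "line"
    if source.startswith("/**", i) and not source.startswith("/**/", i):
        return "javadoc"
    if source.startswith("/*", i):
        return "block"
    return None
```

The key point is that the line-comment check comes first, so a line such as //************* never reaches the Javadoc branch.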
Is there a way I can just get the full tree flattened without having to reconstruct it with the paths and nodes from each iteration?
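Iterating a node yields (path, node) pairs in walk order, so a flat list only needs the second element of each pair. A minimal sketch (`flatten` is a hypothetical helper name):

```python
def flatten(tree):
    """Return every node produced by iterating tree, as a flat list."""
    return [node for _path, node in tree]
```

For a parsed compilation unit, flatten(javalang.parse.parse(source)) should return the root followed by all of its descendants.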
Stuck with a problem in this code:
class Test {
    public static void main(String[] arg) {
        int a = 1;
        if (time && !(b || c) ) {
            a = 1;
        }
    }
}
javalang does not seem to see the logical negation in the expression '!(b || c)'. The output only shows the logical OR, '||'.
On parsing the following test case with javalang, SuperMethodInvocation is mistakenly parsed as SuperMemberReference. When at least one argument is passed to the function fun(), it is correctly recognized as a SuperMethodInvocation.
class Test
{
Test()
{
super.fun();
}
}
Output on parsing above test file : issue_output
eg: import com.sf.common.util.DateFormatUtils;;
The Node base class should include a 'position' attribute. This should be populated with position information for all nodes types in the parser. The position should be copied from the first token associated with each AST node.
While parsing a Java file, how can I get the line number and the text of that line?
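Assuming a node carries a position whose first element is the 1-based line number (it can also be None for some node types), the line text can be looked up in the original source. `source_line` is a hypothetical helper, sketched here:

```python
def source_line(source, position):
    """Return the text of the line that a (line, column) position points at."""
    line_number = position[0]  # 1-based
    # Split on "\n" only, so the numbering matches a lexer that counts
    # newline characters rather than every Unicode line separator.
    return source.split("\n")[line_number - 1]
```

Callers should check that node.position is not None before passing it in.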
Could we get a new release on pypi with the changes to add line numbers to the nodes?
For example, I want to parse a Java file.
src="source/test.java"
But when I use tree = javalang.parse.parse(src), it fails.
I am totally new to this area. Would you mind helping me? Thank you.
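The likely cause is that javalang.parse.parse expects the Java source text, not a file path, so the file has to be read first. A minimal sketch, where `read_java_source` and `parse_java_file` are hypothetical helper names and UTF-8 is an assumed encoding:

```python
from pathlib import Path


def read_java_source(path, encoding="utf-8"):
    """Read a Java file and return its contents as a string."""
    return Path(path).read_text(encoding=encoding)


def parse_java_file(path):
    """Parse the file at path into a javalang compilation unit."""
    import javalang  # third-party; imported lazily so the reader works without it

    return javalang.parse.parse(read_java_source(path))
```

With the path from the question, this would be called as tree = parse_java_file("source/test.java").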
Here is a simple example:
// Foo.java
class Foo {
    void foo(int x) {
        assert x > 42;
    }
}
Let's parse it:
from javalang.parse import parse
from javalang.tree import AssertStatement
with open('Foo.java') as f:
    tree = parse(f.read())
path, assert_stmt = next(tree.filter(AssertStatement))
print(repr(assert_stmt.position))
Expected output:
Position(line=4, column=9)
Actual output:
None
Is there any reason why an AssertStatement doesn't have a position?
On parsing the following test case with javalang, it was not able to detect the unary (prefix) operator ++.
The issue also affects other unary operators, such as +, -, ++ (postfix), and !.
class Test
{
Test()
{
x = ++y;
}
}
Output on parsing above test file : issue_output
The operator has no entry in the output ( test file is being parsed successfully ).
Apologies if this feature exists already. Given some kind of code block (like a MethodDeclaration, for example), I'd like to get the number of lines in that block. The AST gives me the start position (line, column) of the block, but not the end position.
If this feature exists, I would appreciate pointers to it. If it doesn't, I'm happy to discuss or contribute a PR.
Thanks for your work on this project!
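Without an end position, one rough approximation is to take the largest line number found among a node's descendants. The sketch below assumes that iterating a node yields (path, child) pairs and that positions are (line, column) pairs; `last_line` is a hypothetical helper. It gives a lower bound, since the line of the closing brace has no node of its own.

```python
def last_line(node):
    """Return the largest line number among node and its descendants, or None."""
    lines = [
        child.position[0]
        for _path, child in node
        if getattr(child, "position", None)  # skip nodes without a position
    ]
    return max(lines) if lines else None
```

A block's approximate length would then be last_line(method) - method.position[0] + 1, plus one more line when the closing brace sits on its own line.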
Hi. I want to count all the assignments in a source code file, which I am doing with
for path, node in tree.filter(javalang.tree.Assignment):
    assign += 1
but the final result is not the number of assignments I have in the code. The Java specification on the Oracle website doesn't make it clear which node type I should look for to catch a 'generic' assignment.
Great work with this parser.
Thanks
Take this simple Java 8 code:
public interface Test{ default void save() {} }
which works according to javac:
echo "public interface Test{ default void save() {} }" > Test.java
javac Test.java
But javalang throws a syntax error. This was run in a clean virtual environment and tested with both Python 2 and Python 3:
import javalang
tree = javalang.parse.parse("package javalang.brewtab.com; public interface Test{ default void save() {} }")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parse.py", line 53, in parse
return parser.parse()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 110, in parse
return self.parse_compilation_unit()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 296, in parse_compilation_unit
type_declaration = self.parse_type_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 341, in parse_type_declaration
return self.parse_class_or_interface_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 354, in parse_class_or_interface_declaration
type_declaration = self.parse_normal_interface_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 429, in parse_normal_interface_declaration
body = self.parse_interface_body()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 945, in parse_interface_body
declaration = self.parse_interface_body_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 960, in parse_interface_body_declaration
declaration = self.parse_interface_member_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 986, in parse_interface_member_declaration
declaration = self.parse_interface_method_or_field_declaration()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 992, in parse_interface_method_or_field_declaration
java_type = self.parse_type()
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 461, in parse_type
self.illegal("Expected type")
File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 119, in illegal
raise JavaSyntaxError(description, at)
javalang.parser.JavaSyntaxError
It seems this set should contain 'default' to support Java 8.
For these simple Java 8 snippets:
public interface Test{ default int foo() {return 0;} }
and
public interface Test{ default void foo() {} }
Javalang throws a Syntax error:
import javalang
javalang.parse.parse("public interface Test{ default int foo() {return 0;} }")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/javalang/parse.py", line 53, in parse
return parser.parse()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 110, in parse
return self.parse_compilation_unit()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 296, in parse_compilation_unit
type_declaration = self.parse_type_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 341, in parse_type_declaration
return self.parse_class_or_interface_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 354, in parse_class_or_interface_declaration
type_declaration = self.parse_normal_interface_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 429, in parse_normal_interface_declaration
body = self.parse_interface_body()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 945, in parse_interface_body
declaration = self.parse_interface_body_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 960, in parse_interface_body_declaration
declaration = self.parse_interface_member_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 986, in parse_interface_member_declaration
declaration = self.parse_interface_method_or_field_declaration()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 994, in parse_interface_method_or_field_declaration
member = self.parse_interface_method_or_field_rest()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 1011, in parse_interface_method_or_field_rest
rest = self.parse_interface_method_declarator_rest()
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 1056, in parse_interface_method_declarator_rest
self.accept(';')
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 131, in accept
self.illegal("Expected '%s'" % (accept,))
File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 119, in illegal
raise JavaSyntaxError(description, at)
javalang.parser.JavaSyntaxError
Related to #29.
Hi!
I was trying to use your parser to parse some Java code, but I noticed a problem.
Here's a sample piece of code:
public void setTest(final Valvel valve) {
((Valvel)valve).stop();
}
If you parse this code with the library, it will not recognize the stop method invocation. In fact, anything after the Cast seems to be ignored, including any arguments passed to the stop method.
Here's an output of the tree:
BODY: [StatementExpression]
CPATH: () CNODE: StatementExpression
Children: [None, Cast]
CPATH: (StatementExpression,) CNODE: Cast
Children: [ReferenceType, MemberReference]
CPATH: (StatementExpression, Cast) CNODE: ReferenceType
Children: [u'Valvel', [], None, None]
CPATH: (StatementExpression, Cast) CNODE: MemberReference
Children: [None, None, '', [], u'valve']
Do you know why this happens?
Thank you
This is quite an interesting Python package, but it lacks decent documentation (docstrings or a README) or, at least, some examples.
Javalang is unable to parse the following Java code. It fails with an error saying it could not process the token '#'.
Java code I am trying to parse:
/**
Method I am using to parse the code:
def parse_program_class(func):
    tree = javalang.parse.parse(func)
    return tree
It raises the following exception:
LexerError Traceback (most recent call last)
in
----> 1 parse_program_class(test)
in parse_program_class(func)
1 def parse_program_class(func):
----> 2 tree = javalang.parse.parse(func)
3 return tree
/usr/local/lib/python3.6/dist-packages/javalang/parse.py in parse(s)
50 def parse(s):
51 tokens = tokenize(s)
---> 52 parser = Parser(tokens)
53 return parser.parse()
/usr/local/lib/python3.6/dist-packages/javalang/parser.py in __init__(self, tokens)
93
94 def __init__(self, tokens):
---> 95 self.tokens = util.LookAheadListIterator(tokens)
96 self.tokens.set_default(EndOfInput(None))
97
/usr/local/lib/python3.6/dist-packages/javalang/util.py in __init__(self, iterable)
90 class LookAheadListIterator(object):
91 def __init__(self, iterable):
---> 92 self.list = list(iterable)
93
94 self.marker = 0
/usr/local/lib/python3.6/dist-packages/javalang/tokenizer.py in tokenize(self)
541
542 else:
--> 543 self.error('Could not process token', c)
544 self.i = self.i + 1
545 continue
/usr/local/lib/python3.6/dist-packages/javalang/tokenizer.py in error(self, message, char)
570
571 if not self.ignore_errors:
--> 572 raise error
573
574 def tokenize(code, ignore_errors=False):
LexerError: Could not process token at "#", line 35: out.println("#ifdef _MSC_VER");
Any suggestions on what I could try or change would be of great help.
Thanks in advance!
I'm a bit confused by these two lines. Shouldn't this instead read
if type(other) is not type(self):
    return False
with a not in there? Consider the following:
tree = javalang.parse.parse('')
tree.__equals__(tree) # returns False
Dear authors,
Thanks for sharing your code. This package is easy to use, but I still encounter some problems.
I want to parse some Java code snippets, for example some Java functions.
Could you tell me how I can parse a Java snippet into an AST?
Here is my code:
import javalang
tokens = javalang.tokenizer.tokenize('public String toString ( ) { return this . getClass ( ) . getName ( ) ; } /** * @return An arbitrary string. */ public String anotherString ( ) { return "An arbitrary string." ; }')
parser = javalang.parser.Parser(tokens)
tree = parser.xxx()  # I don't know which parse_xx method should go here.
I always get the 'javalang.parser.JavaSyntaxError' exception.
Any help from you will be highly appreciated.
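javalang.parse.parse expects a full compilation unit, which is probably why bare methods raise JavaSyntaxError. One workaround is to wrap the snippet in a throwaway class before parsing. This is a sketch under that assumption; `wrap_members`, `parse_members` and the class name Snippet are hypothetical.

```python
def wrap_members(snippet, class_name="Snippet"):
    """Wrap bare method or field declarations in a class declaration."""
    return "class %s {\n%s\n}\n" % (class_name, snippet)


def parse_members(snippet):
    """Parse bare member declarations and return the list of member nodes."""
    import javalang  # third-party; imported lazily

    unit = javalang.parse.parse(wrap_members(snippet))
    # The wrapper class is the only type; its body holds the parsed members.
    return unit.types[0].body
```

The tracebacks on this page also show narrower Parser entry points, such as parse_member_declaration, which may work on a token stream for a single member.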
I'd appreciate it if you could release a new version with the Java 8 support. Thanks for writing this!
When using Stream.toArray(), a JavaSyntaxError is raised because it doesn't understand how to parse the .toArray(Object[]::new) call.
example:
Person[] men = people.stream()
.filter(p -> p.getGender() == MALE)
.toArray(Person[]::new);
Hi,
Thank you for your implementation; it makes it very easy to parse code and explore the AST.
I am implementing some validation rules, for example about catch implementations.
After finding a match, I would like to render the subtree as code. Is there an API to do that?
Regards,
Philippe
I was wondering how to check if two trees are equal. There is an equals method in the Node class in ast.py but it does not check the entire hierarchy of the tree from root to leaf and does not handle cases when there are lists etc.
Also, is there any way to return the raw unparsed string for a type or a method? If a raw string can be returned, it can easily be compared.
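A recursive comparison can be written outside the library. The sketch below assumes each node class lists its field names in an `attrs` tuple, and it ignores positions because they are not expected to be part of `attrs`; `trees_equal` is a hypothetical helper.

```python
def trees_equal(a, b):
    """Structurally compare two AST values: nodes, lists, or leaf values."""
    if type(a) is not type(b):
        return False
    if hasattr(a, "attrs"):
        # Node-like object: compare every declared attribute recursively.
        return all(
            trees_equal(getattr(a, name), getattr(b, name)) for name in a.attrs
        )
    if isinstance(a, (list, tuple)):
        return len(a) == len(b) and all(
            trees_equal(x, y) for x, y in zip(a, b)
        )
    return a == b  # strings, sets of modifiers, booleans, None
```

Two parses of the same source should compare equal even when whitespace or comments differ, since only the attribute values are compared.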
In the following example, the binary operator instanceof doesn't have a line number (it is None):
@Override
public InputSource resolveEntity(String publicId, String systemId) throws IOException, SAXException {
LOGGER.log(Level.INFO, "Requested Entity: public id = {0}, system id = {1}", new Object[]{publicId, systemId});
// We only expect a few entries here so use linear search directly. If
// this changes, considering caching using HashMap<String, String>
//
InputSource source = null;
FileObject folder = FileUtil.getConfigFile("DTDs/GlassFish");
if(folder != null) {
for(FileObject fo: folder.getChildren()) {
Object attr;
if((attr = fo.getAttribute("publicId")) instanceof String && attr.equals(publicId)) {
source = new InputSource(fo.getInputStream());
break;
} else if((attr = fo.getAttribute("systemId")) instanceof String && attr.equals(systemId)) {
source = new InputSource(fo.getInputStream());
break;
}
}
}
return source;
}
In all other cases, we do get a line number.
Is there a way to convert a node (e.g. MethodDeclaration) into a string, either pretty-printed (i.e. reformatted) or as the original code?
I didn't find such methods in javalang. I have experience with Roslyn in .NET and javaparse; both support this functionality.
https://openjdk.java.net/jeps/361
If you don't have time, could you at least give some tips on how to fix it properly?
Would it be enough to change only parse_switch_block_statement_group, or does something else need to be done?
Hi,
I would like to take two java files A and B and replace a method in file A with the code of the same method in file B... and then write the result in file C. I have the impression I am able to get the trees with your library, but... how do I do the rest? Or even, can I?
BTW I think your library lacks some examples on how to find a method within a tree, how to access its code, and how to print it back to the screen or to a file. It looks very promising!
Thanks!
This is my code:
import io
import os
import javalang
fht_android_src = "E:/FHT Mobile/Android/fht/app/src/main/java/com/fht360/fht"
for (dirpath, dirname, filenames) in os.walk(fht_android_src):
    for filename in filenames:
        if filename.endswith(".java"):
            filepath = os.path.join(dirpath, filename)
            # see https://stackoverflow.com/questions/19591458/python-reading-from-a-file-and-saving-to-utf-8
            if filename == "MainActivity.java":
                with io.open(filepath, 'r', encoding="utf8") as f:
                    content = f.read()
                    tree = javalang.parse.parse(content)
                    for path, node in tree:
                        # if isinstance(node, javalang.tree.ClassDeclaration):
                        #     # print(path, node)
                        #     print(node.name)
                        if isinstance(node, javalang.tree.MethodDeclaration):
                            # print(path, node)
                            if node.name == "onEvent":
                                handlerNode = node
                                handlerPath = path
                                print(handlerNode.parameters[0].type, handlerNode.parameters[0].name)
I am analyzing a typical Android Activity class, which has a bunch of onEvent methods for EventBus. These methods look like:
public void onEvent(UnreadChatMessageEvent event) {
mNavFragment.showHomeDot(event.count);
SPUtil.saveInteger(this, Constant.UnReadChatMessage, event.count);
doTotalCount();
}
public void onEvent(FriendRequestMsgDotClearEvent event) {
mNavFragment.showFriendsDot(0);
SPUtil.saveInteger(this, Constant.UntreatedFriendRequest, 0);
doTotalCount();
}
I want to get the type of the notification as something like UnreadChatMessageEvent, but the type given by javalang is always ReferenceType.
Release a new version of javalang on pypi.
cc @c2nes
I encountered this crash while parsing random Java from GitHub. The crash occurred on PolarPixellateFilter.java from chrisbatt/AndroidFastImageProcessing.
UPDATE: I strongly believe this bug to be caused by javalang's handling of carriage returns as newlines (or lack thereof). It seems that a double-slash comment (//) has innocently commented out the entire rest of this file, despite a carriage return ending the comment well before the end of the file.
This crash occurred on both an Ubuntu machine and a macOS machine, both running Python 3.6.1.
Traceback (most recent call last):
File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/pdb.py", line 1667, in main
pdb._runscript(mainpyfile)
File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/pdb.py", line 1548, in _runscript
self.run(statement)
File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/bdb.py", line 431, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "/Users/eddieantonio/Projects/sensibility/test_fail.py", line 4, in <module>
import javalang
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parse.py", line 53, in parse
return parser.parse()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 110, in parse
return self.parse_compilation_unit()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 302, in parse_compilation_unit
type_declaration = self.parse_type_declaration()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 347, in parse_type_declaration
return self.parse_class_or_interface_declaration()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 356, in parse_class_or_interface_declaration
type_declaration = self.parse_normal_class_declaration()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 394, in parse_normal_class_declaration
body = self.parse_class_body()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 768, in parse_class_body
declaration = self.parse_class_body_declaration()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 791, in parse_class_body_declaration
return self.parse_member_declaration()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 825, in parse_member_declaration
member = self.parse_method_or_field_declaraction()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 839, in parse_method_or_field_declaraction
member = self.parse_method_or_field_rest()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 857, in parse_method_or_field_rest
return self.parse_method_declarator_rest()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 886, in parse_method_declarator_rest
body = self.parse_block()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1274, in parse_block
statement = self.parse_block_statement()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1339, in parse_block_statement
return self.parse_statement()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1465, in parse_statement
value = self.parse_expression()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1752, in parse_expression
expressionl = self.parse_expressionl()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1767, in parse_expressionl
expression_2 = self.parse_expression_2()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1796, in parse_expression_2
parts = self.parse_expression_2_rest()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1813, in parse_expression_2_rest
expression = self.parse_expression_3()
File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1855, in parse_expression_3
while token.value in '[.':
TypeError: 'in <string>' requires string as left operand, not NoneType
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py(1855)parse_expression_3()
It crashes in parser.py.
Popping this in pdb reveals that the token is an EndOfInput:
-> while token.value in '[.':
(Pdb)
(Pdb) p token
EndOfInput "None"
However, the primary that it just parsed is nowhere near the end of input
(Pdb) p primary
Literal
(Pdb) p primary.position
(1, 951)
The only weird thing about the file is that its newline character is the carriage return (yuck!), hence javalang believes it's all on one line. Otherwise, javac considers it syntactically valid Java 8 source code.
Replication package: javalang-crash.zip
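Until the lexer treats a lone carriage return as a line terminator, a possible workaround is to normalize line endings before parsing. A minimal sketch (`normalize_newlines` is a hypothetical helper name):

```python
def normalize_newlines(source):
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so it does not turn into two newlines.
    return source.replace("\r\n", "\n").replace("\r", "\n")
```

The parse call would then be javalang.parse.parse(normalize_newlines(source)). Reported line numbers refer to the normalized text, which matches how most editors number such a file.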
Currently javalang raises an error when it encounters the new lambda expression syntax that Java 8 provides; it should parse the code instead.
Here is some Java that triggers an error in javalang (taken from http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html)
public class Calculator {
interface IntegerMath {
int operation(int a, int b);
}
public int operateBinary(int a, int b, IntegerMath op) {
return op.operation(a, b);
}
public static void main(String... args) {
Calculator myApp = new Calculator();
IntegerMath addition = (a, b) -> a + b;
IntegerMath subtraction = (a, b) -> a - b;
System.out.println("40 + 2 = " +
myApp.operateBinary(40, 2, addition));
System.out.println("20 - 10 = " +
myApp.operateBinary(20, 10, subtraction));
}
}
Hi all,
First of all, thank you for this excellent tool. I would like to know if there is a way to serialize the nodes to a known format such as json or XML.
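I am not aware of a built-in serializer, but a node can be converted to plain dictionaries and lists and then dumped as JSON. The sketch below assumes node classes list their field names in an `attrs` tuple; `to_plain`, `to_json` and the "_type" key are hypothetical names chosen for this example.

```python
import json


def to_plain(value):
    """Convert a node, list, set or leaf value to JSON-compatible data."""
    if hasattr(value, "attrs"):
        data = {"_type": type(value).__name__}
        for name in value.attrs:
            data[name] = to_plain(getattr(value, name))
        return data
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    if isinstance(value, (set, frozenset)):
        # Sets (e.g. modifiers) are unordered; sort them for stable output.
        return [to_plain(item) for item in sorted(value, key=str)]
    return value


def to_json(node):
    """Serialize a node to an indented JSON string."""
    return json.dumps(to_plain(node), indent=2)
```

Positions are not included because they are not expected to appear in `attrs`; they would need to be added separately if required.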
Javalang cannot parse line comments in the last line of code, if this line is terminated by the end of file instead of a line break character. This is probably a rare issue, but I ran across it while parsing an old version of Apache POI.
The following is a minimal example:
import javalang
javalang.parse.parse('// line comment')
It raises the following exception:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-42-babb856b693c> in <module>
----> 1 javalang.parse.parse('// line comment')
.../site-packages/javalang/parse.py in parse(s)
50 def parse(s):
51 tokens = tokenize(s)
---> 52 parser = Parser(tokens)
53 return parser.parse()
.../site-packages/javalang/parser.py in __init__(self, tokens)
93
94 def __init__(self, tokens):
---> 95 self.tokens = util.LookAheadListIterator(tokens)
96 self.tokens.set_default(EndOfInput(None))
97
.../site-packages/javalang/util.py in __init__(self, iterable)
90 class LookAheadListIterator(object):
91 def __init__(self, iterable):
---> 92 self.list = list(iterable)
93
94 self.marker = 0
.../site-packages/javalang/tokenizer.py in tokenize(self)
506 elif startswith in ("//", "/*"):
507 comment = self.read_comment()
--> 508 if comment.startswith("/**"):
509 self.javadoc = comment
510 continue
AttributeError: 'NoneType' object has no attribute 'startswith'
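As a stopgap, the input can be given a trailing newline before it is handed to the parser, so that a final line comment is terminated. This is a sketch of that workaround, not a fix for the tokenizer itself; `ensure_trailing_newline` is a hypothetical helper.

```python
def ensure_trailing_newline(source):
    """Return source unchanged if it ends with a newline, else append one."""
    return source if source.endswith("\n") else source + "\n"
```

The call from the minimal example would then become javalang.parse.parse(ensure_trailing_newline('// line comment')).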
Hi.
I want to find each method in a Java file, then get the method's code.
I would appreciate it if you could help me with this.
Hi,
I would like to extract source code of user defined methods from a java file.
Suppose, we have a following code in a java file:
public class CallingMethodsInSameClass
{
public static void main(String[] args) {
printOne();
printOne();
printTwo();
}
public static void printOne() {
System.out.println("Hello World");
}
public static void printTwo() {
printOne();
printOne();
}
}
The output should be:
Method 1:
public static void main(String[] args) {
printOne();
printOne();
printTwo();
}
Method 2:
public static void printOne() {
System.out.println("Hello World");
}
Method 3:
public static void printTwo() {
printOne();
printOne();
}
Is there a way to do it by using javalang? Please let me know about it.
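As far as I know, javalang does not keep the original text of a node, so one approach is to take each method's start line from the AST and cut the source by matching braces. The sketch below is a plain-text scanner, not a javalang API; `extract_block` is a hypothetical helper. It skips braces inside string and character literals and comments, but does not handle Java text blocks.

```python
def extract_block(source, start_line):
    """Return text from the start of start_line through the matching '}'.

    start_line is 1-based. Returns None if no balanced block is found.
    """
    lines = source.split("\n")
    begin = sum(len(line) + 1 for line in lines[:start_line - 1])
    depth, i, n = 0, begin, len(source)
    while i < n:
        c, two = source[i], source[i:i + 2]
        if two == "//":                      # line comment: jump to end of line
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif two == "/*":                    # block comment: jump past "*/"
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        elif c in "\"'":                     # literal: skip to the closing quote
            i += 1
            while i < n and source[i] != c:
                i += 2 if source[i] == "\\" else 1
        elif c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                return source[begin:i + 1]
        i += 1
    return None
```

With javalang, the start line would presumably come from iterating tree.filter(javalang.tree.MethodDeclaration) and reading method.position[0], assuming the installed version populates positions on method nodes.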
When a line is preceded by a line comment, the column positions of that line's tokens are decreased by one.
...
static JavaVersion get(final String nom) { // position (xx, 5)
...
// comment
static JavaVersion get(final String nom) { // position (xx, 4)
I want to convert a node to a vector of tokens. How can I do that? Thank you.
javalang.tokenizer.tokenize correctly identifies the unary operators within the token list; however, the tree returned by javalang.parse.parse does not contain any nodes for these.
Using this example:
public class JavaTest
extends Object
{
public static void main(String[] args)
{
int counter = 0;
counter++;
boolean flag = false;
if(!flag && counter)
{
System.out.println("Stuff!");
}
}
}
with this script:
import sys
import javalang
class Example:
    def __init__(self, infile):
        self.contents = ""
        with open(infile, "r") as FIN:
            for line in FIN:
                self.contents += line
        if self.contents:
            print("Tokens:")
            tokens = list(javalang.tokenizer.tokenize(self.contents))
            for token in tokens:
                print(token)
            print("Nodes:")
            tree = javalang.parse.parse(self.contents)
            for path, node in tree:
                my_str = str(node)
                if hasattr(node, "position"):
                    if node.position:
                        my_str += " " + str(node.position)
                if hasattr(node, "value"):
                    if node.value:
                        my_str += " " + str(node.value)
                if isinstance(node, javalang.tree.IfStatement):
                    my_str += "\n--condition: " + str(node.condition)
                    if isinstance(node.condition, javalang.tree.BinaryOperation):
                        my_str += "\n----operator: " + str(node.condition.operator)
                        my_str += "\n----Left: " + str(node.condition.operandl.member)
                        my_str += "\n----Right: " + str(node.condition.operandr.member)
                if isinstance(node, javalang.tree.BinaryOperation):
                    my_str += "\n--operator: " + str(node.operator)
                print(my_str)
if __name__ == "__main__":
    if len(sys.argv) > 1:
        for name in sys.argv[1:]:
            Example(name)
If you run these examples, the tokenizer correctly prints out the postfix increment and the not operators. When breaking down the nodes though, the increment and not are left out entirely. Of special interest is the breakdown of the AND gate:
IfStatement
--condition: BinaryOperation
----operator: &&
----Left: flag
----Right: counter
Popping through the Parser code, I'm not actually seeing anything that finds unary ops in the compilation_unit logic (admittedly, I've only been looking for a bit). Am I just missing something?
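As far as I can tell, javalang has no separate node type for unary operations; operators such as !, ++ and - appear to be recorded in prefix_operators and postfix_operators lists on the operand node itself. Treat that as an assumption about the installed version. The helpers below are hypothetical and duck-typed, so they simply skip nodes without those attributes.

```python
def unary_operators(node):
    """Return (prefix, postfix) operator lists recorded on node, if any."""
    prefix = getattr(node, "prefix_operators", None) or []
    postfix = getattr(node, "postfix_operators", None) or []
    return list(prefix), list(postfix)


def nodes_with_unary_operators(tree):
    """Yield (node, prefix, postfix) for every node carrying unary operators."""
    for _path, node in tree:
        prefix, postfix = unary_operators(node)
        if prefix or postfix:
            yield node, prefix, postfix
```

For the example above, this would be expected to report the counter reference with a postfix ++ and the flag reference with a prefix !, if the assumption holds.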
I found a bug in tokenizer.py, line 470:
escape_code = int(self.data[j:j+4], 16)
should be:
escape_code = int(data[j:j+4], 16)
Thanks for your great work!
Dear all,
How can I traverse all AST nodes? Thank you very much!
I am using javalang to tokenize files which include Unicode escape sequences. These are correctly tokenized as strings, but the item.value is not handled cleanly. Consider the 2 cases below:
Case 1: builder.append(text, 0, MAX_TEXT).append('\u2026');
Case 2: builder.append(text, 0, MAX_TEXT).append('…');
In both cases, item.value is identical and I get an exception if I try to write the item.value to a file. I can catch the error and successfully print using python like this:
if (token_type == 'String'):
    try:
        outfile.write(item.value)
    except UnicodeEncodeError:
        outfile.write(item.value.encode('unicode-escape').decode('utf-8'))
but the python code above prints the same value for Case 1 and 2. I suspect the proper fix is to use raw strings for String token values internal to javalang. Below is an example of raw strings solving the problem.
>>> str1 = '…'
>>> str2 = '\u2026'
>>> print("str1: ",str1," str2:",str2)
str1: … str2: …
>>> str1 == str2
True
>>> str1 = r'…'
>>> str2 = r'\u2026'
>>> print("str1: ",str1," str2:",str2)
str1: … str2: \u2026
>>> str1 == str2
False
Hi everyone,
This project sounds really good.
I have some questions that perhaps could be included in the ReadMe:
Thanks,
Luis
An empty comment with multiple blank lines will cause an IndexError in _left_justify in javadoc.py.
The code below is throwing a JavaSyntaxError because of the brackets in "Long[]::new", and I can't figure out why. Is it invalid Java?
public class StatementStepDefinition {
public void foobar() {
final Long[] secondaryTransactionsIds = getSecondaryTransactions().stream().toArray(Long[]::new);
}
}
Can anyone tell me the meaning of the 'label' attribute of Statement? It is always None in my case.
I'm currently using the Java parser Javalang because I need to iterate through an AST from a java source code, but I need to do it from a python file.
It appears to be quite useful, but I have a problem when parsing arguments in a method signature. For example, in a classic
public void main(String[] args){}
the String[] args is parsed as a FormalParameter, which is a subclass of Declaration, itself a subclass of Node. In this particular example, the type field of FormalParameter is a ReferenceType, which has 4 fields: name, declarations, arguments, sub_type. The name field returns only String, and sub_type returns None. There is no indication of "args" being an array. How can I get that back?
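My understanding is that type nodes carry a `dimensions` list with one entry per pair of brackets, so an array would show up there rather than in name or sub_type. That is an assumption about the installed version; `type_to_string` is a hypothetical helper that renders a type under it.

```python
def type_to_string(type_node):
    """Render a type node as its name followed by one '[]' per dimension."""
    # Assumption: `dimensions` has one entry (often None) per array dimension.
    dimensions = getattr(type_node, "dimensions", None) or []
    return type_node.name + "[]" * len(dimensions)
```

For the parameter in the example, type_to_string(parameter.type) would be expected to give String[] if the assumption holds.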