javalang's People

Contributors

amar1729, atheriel, c2nes, cassianomonteiro, dbaxa, gargarensis, johnhawes, sandeshc, shoaniki, tonyroberts

javalang's Issues

javalang.tree.EnumDeclaration fields and methods properties not working

When parsing an enum with fields and/or methods they get parsed correctly, but the fields and methods properties on the EnumDeclaration return empty lists.

It looks like it's just because the properties inherited from TypeDeclaration assume that body is a list of declarations, but that doesn't work for EnumBody (it ends up walking the tree, which yields tuples of path,node).
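
Until the properties are fixed, one possible workaround is to read the members from the enum body directly. This is a minimal sketch, not the library's API: it assumes the enum body keeps its members in a `declarations` list, and the helper name `enum_members` and the stand-in classes are hypothetical. It matches on class names so it runs without javalang installed.

```python
from types import SimpleNamespace


def enum_members(enum_declaration, kind_name):
    """Return the enum body's declarations whose class name equals kind_name.

    Assumption: enum_declaration.body has a `declarations` list.
    """
    declarations = getattr(enum_declaration.body, "declarations", None) or []
    return [decl for decl in declarations if type(decl).__name__ == kind_name]


# Stand-in classes so the sketch runs on its own; with a real parsed tree
# you would pass the EnumDeclaration node instead of fake_enum.
class FieldDeclaration(object):
    pass


class MethodDeclaration(object):
    pass


fake_enum = SimpleNamespace(
    body=SimpleNamespace(
        constants=[],
        declarations=[FieldDeclaration(), MethodDeclaration(), MethodDeclaration()],
    )
)

fields = enum_members(fake_enum, "FieldDeclaration")
methods = enum_members(fake_enum, "MethodDeclaration")
```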

another bug

In the file "tokenizer.py", these lines:
elif startswith in ("//", "/*"):
    if self.try_javadoc_comment():
should be:
elif startswith in ("//", "/*"):
    if startswith == "/*" and self.try_javadoc_comment():
Otherwise, this comment in Java code will cause a parse error:
//************* <------- this will be parsed as javadoc's begin and eat "*/" in the end.
void func1() {
}/*
*/

Full and flattened tree

Is there a way I can just get the full tree flattened without having to reconstruct it with the paths and nodes from each iteration?
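
A minimal sketch of one way to flatten the walk, assuming (as the other issues on this page show with `for path, node in tree`) that iterating a parsed tree yields `(path, node)` pairs. The helper name `flatten_nodes` is hypothetical, and the list below is only a stand-in for a real tree.

```python
def flatten_nodes(tree):
    """Collect every node from an iterable of (path, node) pairs, in walk order."""
    return [node for _path, node in tree]


# Stand-in for a parsed tree: any iterable of (path, node) pairs works.
fake_walk = [
    ((), "CompilationUnit"),
    (("CompilationUnit",), "ClassDeclaration"),
    (("CompilationUnit", "ClassDeclaration"), "MethodDeclaration"),
]

flat = flatten_nodes(fake_walk)
```

With a real tree the call would be `flatten_nodes(tree)`; the paths are simply discarded.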

Can't recognize binary logical operator '!'

I'm stuck on a problem with this code:

class Test {
    public static void main(String[] arg) {
        int a = 1;
        if (time && !(b || c)) {
            a = 1;
        }
    }
}

javalang doesn't see the logical negation in the expression '!(b || c)'. The tree only shows the logical or, '||'.

Include position information in all AST nodes

The Node base class should include a 'position' attribute. This should be populated with position information for all node types in the parser. The position should be copied from the first token associated with each AST node.

New Release?

Could we get a new release on pypi with the changes to add line numbers to the nodes?

How to parse a java file?

For example, I want to parse a Java file:
src = "source/test.java"
But when I call tree = javalang.parse.parse(src), it fails.
I am totally new to this area. Would you mind helping me? Thank you.
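
The likely cause is that `javalang.parse.parse` is being handed the path string rather than the file's contents; the AssertStatement issue further down reads the file first and passes `f.read()`. A minimal sketch of that pattern follows. The helper names are hypothetical, and javalang is imported lazily so the reading part runs on its own.

```python
import io
import os
import tempfile


def read_java_source(path, encoding="utf-8"):
    """Read a Java source file and return its contents as text."""
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def parse_java_file(path, encoding="utf-8"):
    """Parse the *contents* of a file, not the path string itself."""
    import javalang  # third-party package, imported only when needed
    return javalang.parse.parse(read_java_source(path, encoding))


# Demonstrate the reading step with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".java", delete=False) as tmp:
    tmp.write("class Test {}\n")
demo_path = tmp.name
demo_text = read_java_source(demo_path)
os.remove(demo_path)
```

With javalang installed, `tree = parse_java_file("source/test.java")` would replace the failing call.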

`AssertStatement` doesn't have a position

Here is a simple example:

// Foo.java
class Foo {
    void foo(int x) {
        assert x > 42;
    }
}

Let's parse it:

from javalang.parse import parse
from javalang.tree import AssertStatement

with open('Foo.java') as f:
    tree = parse(f.read())

path, assert_stmt = next(tree.filter(AssertStatement))

print(repr(assert_stmt.position))

Expected output:

Position(line=4, column=9)

Actual output:

None

Is there any reason why an AssertStatement doesn't have a position?

Feature request: get number of lines in a block?

Apologies if this feature exists already. Given some kind of code block (like a MethodDeclaration for example), I'd like to get the number of lines in that block. The AST gives me the start position (line, column) of the block, but not the end position.

If this feature exists, I would appreciate pointers to it. If it doesn't I'm happy to discuss or contribute with a PR.

Thanks for your work on this project!

Catching all assignments

Hi. I want to capture all the assignments in a source code file, which I am doing with
for path, node in tree.filter(javalang.tree.Assignment):
    assign += 1
but the final result is not the number of assignments I have in the code. Looking at the Java specification on the Oracle website doesn't make it clear which Node type I should try to catch for a 'generic' assignment.

Great work with this parser.
Thanks

Javalang doesn't support Java8 `default` keyword.

Take this simple Java 8 code:

public interface Test{ default void save() {} }

which works according to javac:

echo "public interface Test{ default void save() {} }" > Test.java
javac Test.java

But javalang throws a syntax error. This was run in a clean virtual env and tested with both Py2 and Py3:

import javalang
tree = javalang.parse.parse("package javalang.brewtab.com; public interface Test{ default void save() {} }")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parse.py", line 53, in parse
    return parser.parse()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 110, in parse
    return self.parse_compilation_unit()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 296, in parse_compilation_unit
    type_declaration = self.parse_type_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 341, in parse_type_declaration
    return self.parse_class_or_interface_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 354, in parse_class_or_interface_declaration
    type_declaration = self.parse_normal_interface_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 429, in parse_normal_interface_declaration
    body = self.parse_interface_body()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 945, in parse_interface_body
    declaration = self.parse_interface_body_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 960, in parse_interface_body_declaration
    declaration = self.parse_interface_member_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 986, in parse_interface_member_declaration
    declaration = self.parse_interface_method_or_field_declaration()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 992, in parse_interface_method_or_field_declaration
    java_type = self.parse_type()
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 461, in parse_type
    self.illegal("Expected type")
  File "/home/roipoussiere/.local/lib/python2.7/site-packages/javalang/parser.py", line 119, in illegal
    raise JavaSyntaxError(description, at)
javalang.parser.JavaSyntaxError

It seems this set should contain 'default' to support Java 8.

Javalang can not parse interfaces with a body.

For these simple Java 8 snippets:

    public interface Test{ default int foo() {return 0;} }

and

    public interface Test{ default void foo() {} }

Javalang throws a Syntax error:

import javalang
javalang.parse.parse("public interface Test{ default int foo() {return 0;} }")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/javalang/parse.py", line 53, in parse
    return parser.parse()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 110, in parse
    return self.parse_compilation_unit()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 296, in parse_compilation_unit
    type_declaration = self.parse_type_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 341, in parse_type_declaration
    return self.parse_class_or_interface_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 354, in parse_class_or_interface_declaration
    type_declaration = self.parse_normal_interface_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 429, in parse_normal_interface_declaration
    body = self.parse_interface_body()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 945, in parse_interface_body
    declaration = self.parse_interface_body_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 960, in parse_interface_body_declaration
    declaration = self.parse_interface_member_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 986, in parse_interface_member_declaration
    declaration = self.parse_interface_method_or_field_declaration()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 994, in parse_interface_method_or_field_declaration
    member = self.parse_interface_method_or_field_rest()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 1011, in parse_interface_method_or_field_rest
    rest = self.parse_interface_method_declarator_rest()
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 1056, in parse_interface_method_declarator_rest
    self.accept(';')
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 131, in accept
    self.illegal("Expected '%s'" % (accept,))
  File "/usr/local/lib/python3.5/dist-packages/javalang/parser.py", line 119, in illegal
    raise JavaSyntaxError(description, at)
javalang.parser.JavaSyntaxError

Related to #29.

Method of a casted object not parsed

Hi!

I was trying to use your parser to parse some Java code, but I noticed a problem.
Here's a sample piece of code:

    public void setTest(final Valvel valve) {
        ((Valvel)valve).stop();     
    }

If you parse this code with the library, it will not recognize the stop method invocation. In fact anything after the Cast seems to be ignored, including any arguments passed to the stop method.
Here's an output of the tree:

BODY: [StatementExpression]
CPATH: () CNODE: StatementExpression
Children: [None, Cast]
CPATH: (StatementExpression,) CNODE: Cast
Children: [ReferenceType, MemberReference]
CPATH: (StatementExpression, Cast) CNODE: ReferenceType
Children: [u'Valvel', [], None, None]
CPATH: (StatementExpression, Cast) CNODE: MemberReference
Children: [None, None, '', [], u'valve']

Do you know why this happens?

Thank you

Unable to parse '#' included in a string

Javalang is unable to parse the following Java code. It throws an error saying it could not process the token '#'.

Java code I am trying to parse:
    /**
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     * http://www.apache.org/licenses/LICENSE-2.0
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package org.apache.activemq.openwire.tool;
    import java.io.PrintWriter;
    import java.util.Iterator;
    import java.util.List;
    import org.codehaus.jam.JClass;
    import org.codehaus.jam.JProperty;
    /**
     *
     * @Version $Revision: 379734 $
     */
    public class CppHeadersGenerator extends CppClassesGenerator {
    protected String getFilePostFix() {
    return ".hpp";
    }
    protected void generateFile(PrintWriter out) {
    generateLicence(out);
    out.println("#ifndef ActiveMQ_" + className + "hpp");
    out.println("#define ActiveMQ_" + className + "hpp");
    out.println("");
    out.println("
    out.println("#ifdef _MSC_VER");
    out.println("#pragma warning( disable : 4290 )");
    out.println("#endif");
    out.println("");
    out.println("#include ");
    out.println("#include "activemq/command/" + baseClass + ".hpp"");
    List properties = getProperties();
    for (Iterator iter = properties.iterator(); iter.hasNext();) {
    JProperty property = (JProperty)iter.next();
    if (!property.getType().isPrimitiveType() && !property.getType().getSimpleName().equals("String") && !property.getType().getSimpleName().equals("ByteSequence")) {
    String includeName = toCppType(property.getType());
    if (property.getType().isArrayType()) {
    JClass arrayType = property.getType().getArrayComponentType();
    if (arrayType.isPrimitiveType()) {
    continue;
    }
    }
    if (includeName.startsWith("array<")) {
    includeName = includeName.substring(6, includeName.length() - 1);
    } else if (includeName.startsWith("p<")) {
    includeName = includeName.substring(2, includeName.length() - 1);
    }
    if (includeName.equals("IDataStructure")) {
    out.println("#include "activemq/" + includeName + ".hpp"");
    } else {
    out.println("#include "activemq/command/" + includeName + ".hpp"");
    }
    }
    }
    out.println("");
    out.println("#include "activemq/protocol/IMarshaller.hpp"");
    out.println("#include "ppr/io/IOutputStream.hpp"");
    out.println("#include "ppr/io/IInputStream.hpp"");
    out.println("#include "ppr/io/IOException.hpp"");
    out.println("#include "ppr/util/ifr/array"");
    out.println("#include "ppr/util/ifr/p"");
    out.println("");
    out.println("namespace apache");
    out.println("{");
    out.println(" namespace activemq");
    out.println(" {");
    out.println(" namespace command");
    out.println(" {");
    out.println(" using namespace ifr;");
    out.println(" using namespace std;");
    out.println(" using namespace apache::activemq;");
    out.println(" using namespace apache::activemq::protocol;");
    out.println(" using namespace apache::ppr::io;");
    out.println("");
    out.println("/
    ");
    out.println(" *");
    out.println(" * Command and marshalling code for OpenWire format for " + className + "");
    out.println(" *");
    out.println(" *");
    out.println(" * NOTE!: This file is autogenerated - do not modify!");
    out.println(" * if you need to make a change, please see the Groovy scripts in the");
    out.println(" * activemq-core module");
    out.println(" *");
    out.println(" /");
    out.println("class " + className + " : public " + baseClass + "");
    out.println("{");
    out.println("protected:");
    for (Iterator iter = properties.iterator(); iter.hasNext();) {
    JProperty property = (JProperty)iter.next();
    String type = toCppType(property.getType());
    String name = decapitalize(property.getSimpleName());
    out.println(" " + type + " " + name + " ;");
    }
    out.println("");
    out.println("public:");
    out.println(" const static unsigned char TYPE = " + getOpenWireOpCode(jclass) + ";");
    out.println("");
    out.println("public:");
    out.println(" " + className + "() ;");
    out.println(" virtual ~" + className + "() ;");
    out.println("");
    out.println(" virtual unsigned char getDataStructureType() ;");
    for (Iterator iter = properties.iterator(); iter.hasNext();) {
    JProperty property = (JProperty)iter.next();
    String type = toCppType(property.getType());
    String propertyName = property.getSimpleName();
    String parameterName = decapitalize(propertyName);
    out.println("");
    out.println(" virtual " + type + " get" + propertyName + "() ;");
    out.println(" virtual void set" + propertyName + "(" + type + " " + parameterName + ") ;");
    }
    out.println("");
    out.println(" virtual int marshal(p marshaller, int mode, p ostream) throw (IOException) ;");
    out.println(" virtual void unmarshal(p marshaller, int mode, p istream) throw (IOException) ;");
    out.println("} ;");
    out.println("");
    out.println("/
    namespace */");
    out.println(" }");
    out.println(" }");
    out.println("}");
    out.println("");
    out.println("#endif /ActiveMQ_" + className + "hpp/");
    }
    }

Method I am using to parse the code:
def parse_program_class(func):
    tree = javalang.parse.parse(func)
    return tree

It raises the following exception:
LexerError Traceback (most recent call last)
in
----> 1 parse_program_class(test)

in parse_program_class(func)
1 def parse_program_class(func):
----> 2 tree = javalang.parse.parse(func)
3 return tree

/usr/local/lib/python3.6/dist-packages/javalang/parse.py in parse(s)
50 def parse(s):
51 tokens = tokenize(s)
---> 52 parser = Parser(tokens)
53 return parser.parse()

/usr/local/lib/python3.6/dist-packages/javalang/parser.py in __init__(self, tokens)
93
94 def __init__(self, tokens):
---> 95 self.tokens = util.LookAheadListIterator(tokens)
96 self.tokens.set_default(EndOfInput(None))
97

/usr/local/lib/python3.6/dist-packages/javalang/util.py in __init__(self, iterable)
90 class LookAheadListIterator(object):
91 def __init__(self, iterable):
---> 92 self.list = list(iterable)
93
94 self.marker = 0

/usr/local/lib/python3.6/dist-packages/javalang/tokenizer.py in tokenize(self)
541
542 else:
--> 543 self.error('Could not process token', c)
544 self.i = self.i + 1
545 continue

/usr/local/lib/python3.6/dist-packages/javalang/tokenizer.py in error(self, message, char)
570
571 if not self.ignore_errors:
--> 572 raise error
573
574 def tokenize(code, ignore_errors=False):

LexerError: Could not process token at "#", line 35: out.println("#ifdef _MSC_VER");

Any suggestions on what I could try or change would be of great help.

Thanks in advance!

Node __equals__ not reflexive

I'm a bit confused by these two lines. Shouldn't this instead read

if type(other) is not type(self):
    return False

with a not in there? Consider the following:

tree = javalang.parse.parse('')
tree.__equals__(tree) # returns False

how to parse java code snippet to AST?

Dear authors,

Thanks for sharing your code. This package is easy to use, but I still ran into a problem.
I want to parse a Java code snippet, for example a few Java methods.
Could you tell me how I can parse such a snippet into an AST?

here is my code:

import javalang
tokens = javalang.tokenizer.tokenize('public String toString ( ) { return this . getClass ( ) . getName ( ) ; } /** * @return An arbitrary string. */ public String anotherString ( ) { return "An arbitrary string." ; }')
parser = javalang.parser.Parser(tokens)
tree = parser.xxx()  # I don't know which parse_xx method should go here.

I always get the 'javalang.parser.JavaSyntaxError' exception.

Any help from you will be highly appreciated.

Release new version?

I'd appreciate it if you could release a new version with the Java 8 support. Thanks for writing this!

Question : how to render a AST node

Hi,

Thank you for your implementation; it makes it very easy to parse code and explore the AST.
I am implementing some validation rules, for example about catch blocks.

After finding a match, I would like to render the subtree as code. Is there an API to do that?

Regards,
Philippe

Checking if two trees are equal

I was wondering how to check if two trees are equal. There is an equals method in the Node class in ast.py but it does not check the entire hierarchy of the tree from root to leaf and does not handle cases when there are lists etc.
Also, is there any way to return the raw unparsed string for a type or a method? If a raw string can be returned, it can easily be compared.
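
A recursive structural comparison could look roughly like the sketch below. It is not javalang's own method: it assumes each node class lists its child field names in an `attrs` tuple, and `nodes_equal` and `ToyNode` are hypothetical names used so the example runs without the library.

```python
def nodes_equal(a, b):
    """Compare two AST-like values from root to leaf."""
    if type(a) is not type(b):
        return False
    if isinstance(a, (list, tuple)):
        # Lists of children must match in length and element by element.
        return len(a) == len(b) and all(nodes_equal(x, y) for x, y in zip(a, b))
    attrs = getattr(a, "attrs", None)
    if attrs is not None:
        # Node-like object: compare every declared field recursively.
        return all(nodes_equal(getattr(a, name), getattr(b, name)) for name in attrs)
    # Plain leaf value (string, number, set, None, ...).
    return a == b


class ToyNode(object):
    """Stand-in for a tree node; `attrs` names the fields to compare."""
    attrs = ("name", "children")

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


left = ToyNode("root", [ToyNode("a"), ToyNode("b")])
same = ToyNode("root", [ToyNode("a"), ToyNode("b")])
different = ToyNode("root", [ToyNode("a"), ToyNode("c")])
```

Note that the type check uses `is not`, so a node compared with itself is equal, which is the behaviour the "not reflexive" issue above asks for.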

InstanceOf Binary Operator doesn't have code line

In an example like the one below, the instanceof binary operator has no code line number (it is None):

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException, SAXException {
        LOGGER.log(Level.INFO, "Requested Entity: public id = {0}, system id = {1}", new Object[]{publicId, systemId});

        // We only expect a few entries here so use linear search directly.  If
        // this changes, considering caching using HashMap<String, String>
        //
        InputSource source = null;
        FileObject folder = FileUtil.getConfigFile("DTDs/GlassFish");
        if(folder != null) {
            for(FileObject fo: folder.getChildren()) {
                Object attr;
                if((attr = fo.getAttribute("publicId")) instanceof String && attr.equals(publicId)) {
                    source = new InputSource(fo.getInputStream());
                    break;
                } else if((attr = fo.getAttribute("systemId")) instanceof String && attr.equals(systemId)) {
                    source = new InputSource(fo.getInputStream());
                    break;
                }
            }
        }

        return source;
    }

In all other cases, the code line number is present.

Is there a way to print a node to string?

Is there a way to convert a node (e.g. MethodDeclaration) into a string, either pretty-print (aka, reformatting) or print the original code?

I didn't find such methods in javalang. I have experience with Roslyn in .NET and javaparse; both support this functionality.

replace method with another

Hi,
I would like to take two java files A and B and replace a method in file A with the code of the same method in file B... and then write the result in file C. I have the impression I am able to get the trees with your library, but... how do I do the rest? Or even, can I?
BTW I think your library lacks some examples on how to find a method within a tree, how to access its code, and how to print it back to the screen or to a file. It looks very promising!
Thanks!

How do I get a method's parameter's type?

This is my code:

fht_android_src = "E:/FHT Mobile/Android/fht/app/src/main/java/com/fht360/fht"

for (dirpath, dirname, filenames) in os.walk(fht_android_src):
    for filename in filenames:
        if filename.endswith(".java"):
            filepath = os.path.join(dirpath, filename)
            # see https://stackoverflow.com/questions/19591458/python-reading-from-a-file-and-saving-to-utf-8
            if filename == "MainActivity.java":
                with io.open(filepath, 'r', encoding="utf8") as f:
                    content = f.read()
                    tree = javalang.parse.parse(content)
                    for path, node in tree:
                        # if isinstance(node, javalang.tree.ClassDeclaration):
                        #     # print(path, node)
                        #     print(node.name)
                        if isinstance(node, javalang.tree.MethodDeclaration):
                            # print(path, node)
                            if node.name == "onEvent":
                                handlerNode = node
                                handlerPath = path
                                print(handlerNode.parameters[0].type, handlerNode.parameters[0].name)

I am analyzing a typical Android Activity class, which has a bunch of onEvent methods for EventBus. These methods look like:

public void onEvent(UnreadChatMessageEvent event) {
    mNavFragment.showHomeDot(event.count);

    SPUtil.saveInteger(this, Constant.UnReadChatMessage, event.count);
    doTotalCount();
}

public void onEvent(FriendRequestMsgDotClearEvent event) {
    mNavFragment.showFriendsDot(0);

    SPUtil.saveInteger(this, Constant.UntreatedFriendRequest, 0);
    doTotalCount();
}

I want to get the type of the notification as something like UnreadChatMessageEvent, but the type given by javalang is always ReferenceType.
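
`ReferenceType` here is the class of the node that wraps the type, so printing it shows the node rather than the Java type name. A minimal sketch of reading the name instead, assuming the type node carries a `name` attribute; `parameter_signature` is a hypothetical helper and the `SimpleNamespace` objects stand in for real nodes.

```python
from types import SimpleNamespace


def parameter_signature(method):
    """Return (type name, parameter name) pairs for a method-like node.

    Assumption: each parameter has `.type.name` and `.name`.
    """
    return [(param.type.name, param.name) for param in method.parameters]


# Stand-in for a MethodDeclaration node found while walking the tree.
fake_method = SimpleNamespace(
    name="onEvent",
    parameters=[
        SimpleNamespace(
            type=SimpleNamespace(name="UnreadChatMessageEvent"),
            name="event",
        )
    ],
)

signature = parameter_signature(fake_method)
```

In the loop from the issue, that would correspond to printing `handlerNode.parameters[0].type.name`.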

Uncaught TypeError: 'in <string>' requires string as left operand, not NoneType–Due to CR handling in lexer?

I encountered this crash while parsing random Java from GitHub. It occurred on PolarPixellateFilter.java from chrisbatt/AndroidFastImageProcessing.

UPDATE: I strongly believe this bug to be caused by javalang's handling of carriage returns as newlines (or lack thereof). It seems that a double-slash comment // has innocently commented out the entire rest of this file despite a carriage return ending the comment well before the end of the file.

This crash occurred on both an Ubuntu machine and a macOS machine, both running Python 3.6.1.

Traceback (most recent call last):
  File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/pdb.py", line 1667, in main
    pdb._runscript(mainpyfile)
  File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/pdb.py", line 1548, in _runscript
    self.run(statement)
  File "/Users/eddieantonio/.pyenv/versions/3.6.0/lib/python3.6/bdb.py", line 431, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/Users/eddieantonio/Projects/sensibility/test_fail.py", line 4, in <module>
    import javalang
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parse.py", line 53, in parse
    return parser.parse()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 110, in parse
    return self.parse_compilation_unit()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 302, in parse_compilation_unit
    type_declaration = self.parse_type_declaration()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 347, in parse_type_declaration
    return self.parse_class_or_interface_declaration()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 356, in parse_class_or_interface_declaration
    type_declaration = self.parse_normal_class_declaration()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 394, in parse_normal_class_declaration
    body = self.parse_class_body()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 768, in parse_class_body
    declaration = self.parse_class_body_declaration()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 791, in parse_class_body_declaration
    return self.parse_member_declaration()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 825, in parse_member_declaration
    member = self.parse_method_or_field_declaraction()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 839, in parse_method_or_field_declaraction
    member = self.parse_method_or_field_rest()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 857, in parse_method_or_field_rest
    return self.parse_method_declarator_rest()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 886, in parse_method_declarator_rest
    body = self.parse_block()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1274, in parse_block
    statement = self.parse_block_statement()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1339, in parse_block_statement
    return self.parse_statement()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1465, in parse_statement
    value = self.parse_expression()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1752, in parse_expression
    expressionl = self.parse_expressionl()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1767, in parse_expressionl
    expression_2 = self.parse_expression_2()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1796, in parse_expression_2
    parts = self.parse_expression_2_rest()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1813, in parse_expression_2_rest
    expression = self.parse_expression_3()
  File "/Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py", line 1855, in parse_expression_3
    while token.value in '[.':
TypeError: 'in <string>' requires string as left operand, not NoneType
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /Users/eddieantonio/.pyenv/versions/sensibility/lib/python3.6/site-packages/javalang/parser.py(1855)parse_expression_3()

It crashes in parser.py.

Popping this in pdb reveals that the token is an EndOfInput:

-> while token.value in '[.':
(Pdb)
(Pdb) p token
EndOfInput "None"

However, the primary that it just parsed is nowhere near the end of input

(Pdb) p primary
Literal
(Pdb) p primary.position
(1, 951)

The only weird thing about the file is that its newline character is the carriage return (yuck!), hence javalang believes it's all on one line. Otherwise, javac considers it syntactically-valid Java 8 source code.

Replication package: javalang-crash.zip
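
If carriage-return line endings are indeed the cause, a possible workaround on the caller's side is to normalize line endings before handing the text to the parser. This is a sketch under that assumption, with a hypothetical helper name; it does not change javalang itself, and positions reported afterwards refer to the normalized text.

```python
def normalize_newlines(source):
    """Convert Windows (CRLF) and old Mac (CR) line endings to LF."""
    # Replace CRLF first so a CRLF pair does not become two newlines.
    return source.replace("\r\n", "\n").replace("\r", "\n")


cr_only = "int a; // trailing comment\rint b;\rint c;"
mixed = "int a;\r\nint b;\rint c;\n"

normalized_cr = normalize_newlines(cr_only)
normalized_mixed = normalize_newlines(mixed)
```

The normalized string would then be passed to `javalang.parse.parse` in place of the raw file contents.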

Support java 8 syntax, such as lambda expressions

Currently javalang raises an error when it encounters the lambda expression syntax introduced in Java 8; it should parse the code instead.

Here is some Java that triggers an error in javalang (taken from http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html):

public class Calculator {

    interface IntegerMath {
    int operation(int a, int b);   
    }

    public int operateBinary(int a, int b, IntegerMath op) {
    return op.operation(a, b);
    }

    public static void main(String... args) {

    Calculator myApp = new Calculator();
    IntegerMath addition = (a, b) -> a + b;
    IntegerMath subtraction = (a, b) -> a - b;
    System.out.println("40 + 2 = " +
        myApp.operateBinary(40, 2, addition));
    System.out.println("20 - 10 = " +
        myApp.operateBinary(20, 10, subtraction));    
    }
}

Serialize output

Hi all,
First of all, thank you for this excellent tool. I would like to know if there is a way to serialize the nodes to a known format such as json or XML.
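
One way to approach this is to convert the tree into plain dictionaries and lists first, then hand that to the standard `json` module. The sketch below assumes each node class names its fields in an `attrs` tuple; `node_to_data` and `ToyNode` are hypothetical, and the set handling assumes sets hold simple sortable values such as modifier strings.

```python
import json


def node_to_data(value):
    """Turn an AST-like value into JSON-friendly dicts, lists and scalars."""
    attrs = getattr(value, "attrs", None)
    if attrs is not None:
        data = {"node_type": type(value).__name__}
        for name in attrs:
            data[name] = node_to_data(getattr(value, name))
        return data
    if isinstance(value, (list, tuple)):
        return [node_to_data(item) for item in value]
    if isinstance(value, (set, frozenset)):
        # JSON has no set type; emit a sorted list for stable output.
        return sorted(node_to_data(item) for item in value)
    return value


class ToyNode(object):
    """Stand-in for a tree node with a set-valued field."""
    attrs = ("name", "modifiers", "body")

    def __init__(self, name, modifiers=None, body=None):
        self.name = name
        self.modifiers = modifiers or set()
        self.body = body or []


toy = ToyNode("save", {"public", "default"}, [ToyNode("inner")])
data = node_to_data(toy)
encoded = json.dumps(data, sort_keys=True)
```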

Error parsing line comment in last line with no final line break

Javalang cannot parse line comments in the last line of code, if this line is terminated by the end of file instead of a line break character. This is probably a rare issue, but I ran across it while parsing an old version of Apache POI.

The following is a minimal example:

import javalang
javalang.parse.parse('// line comment')

It raises the following exception:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-42-babb856b693c> in <module>
----> 1 javalang.parse.parse('// line comment')

.../site-packages/javalang/parse.py in parse(s)
     50 def parse(s):
     51     tokens = tokenize(s)
---> 52     parser = Parser(tokens)
     53     return parser.parse()

.../site-packages/javalang/parser.py in __init__(self, tokens)
     93
     94     def __init__(self, tokens):
---> 95         self.tokens = util.LookAheadListIterator(tokens)
     96         self.tokens.set_default(EndOfInput(None))
     97

.../site-packages/javalang/util.py in __init__(self, iterable)
     90 class LookAheadListIterator(object):
     91     def __init__(self, iterable):
---> 92         self.list = list(iterable)
     93
     94         self.marker = 0

.../site-packages/javalang/tokenizer.py in tokenize(self)
    506             elif startswith in ("//", "/*"):
    507                 comment = self.read_comment()
--> 508                 if comment.startswith("/**"):
    509                     self.javadoc = comment
    510                 continue

AttributeError: 'NoneType' object has no attribute 'startswith'
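
Until this is fixed in the tokenizer, a possible workaround is to make sure the source ends with a line break before parsing. This is only a sketch, assuming the missing final newline is the sole trigger as described above; `ensure_trailing_newline` is a hypothetical helper, not part of javalang.

```python
def ensure_trailing_newline(source):
    """Return source unchanged if it ends with a newline, else append one."""
    if source.endswith("\n"):
        return source
    return source + "\n"


patched = ensure_trailing_newline("// line comment")
print(repr(patched))

# Intended use (requires javalang, not executed here):
#     javalang.parse.parse(ensure_trailing_newline(source))
```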

Extracting source code of user defined methods from java file

Hi,

I would like to extract source code of user defined methods from a java file.

Suppose we have the following code in a Java file:

public class CallingMethodsInSameClass
{
    public static void main(String[] args) {
        printOne();
        printOne();
        printTwo();
    }

    public static void printOne() {
        System.out.println("Hello World");
    }

    public static void printTwo() {
        printOne();
        printOne();
    }
}

The output should be:

Method 1:

public static void main(String[] args) {
        printOne();
        printOne();
        printTwo();
 }

Method 2:

public static void printOne() {
    System.out.println("Hello World");
 }

Method 3:

public static void printTwo() {
    printOne();
    printOne();
} 

Is there a way to do it by using javalang? Please let me know about it.
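
As far as I know javalang does not return source text directly, but nodes carry a position, so one option is to take the start line of each method and copy lines until the braces balance. The sketch below shows only that second step. `extract_block` is a hypothetical helper, and the brace counting is naive: braces inside string literals or comments would be miscounted.

```python
def extract_block(source, start_line):
    """Return the text from start_line (1-based) through the matching '}'."""
    depth = 0
    seen_open = False
    collected = []
    for line in source.splitlines()[start_line - 1:]:
        collected.append(line)
        for ch in line:
            if ch == "{":
                depth += 1
                seen_open = True
            elif ch == "}":
                depth -= 1
        # Stop at the end of the line on which the braces balance again.
        if seen_open and depth == 0:
            return "\n".join(collected)
    return None  # braces never balanced


JAVA = """public class CallingMethodsInSameClass
{
    public static void main(String[] args) {
        printOne();
        printTwo();
    }

    public static void printOne() {
        System.out.println("Hello World");
    }
}
"""

method_text = extract_block(JAVA, 8)
print(method_text)

# Intended use (requires javalang, not executed here). It assumes that
# node.position is set and that its first element is the line number:
#     tree = javalang.parse.parse(JAVA)
#     for _, m in tree.filter(javalang.tree.MethodDeclaration):
#         print(extract_block(JAVA, m.position[0]))
```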

Error of token position of lines preceded by inline comments

When a line is preceded by an inline comment, the positions of that line's tokens are decreased by one.

   ...
   static JavaVersion get(final String nom) { // position (xx, 5)
   ...
   // comment
   static JavaVersion get(final String nom) { // position (xx, 4)

Unary Operators?

javalang.tokenizer.tokenize correctly identifies the unary operators within the token list; however, javalang.parse.parse does not produce any nodes for them.

Using this example:

public class JavaTest
  extends Object
{
  public static void main(String[] args)
  {
    int counter = 0;
    counter++;
    boolean flag = false;
    if(!flag && counter)
    {
      System.out.println("Stuff!");
    }
  }
}

with this script:

import sys
import javalang

class Example:
  def __init__(self, infile):
    self.contents = ""
    with open(infile, "r") as FIN:
      for line in FIN:
        self.contents += line
    if self.contents:
      print("Tokens:")
      tokens = list(javalang.tokenizer.tokenize(self.contents))
      for token in tokens:
        print(token)
      print("Nodes:")
      tree = javalang.parse.parse(self.contents)
      for path, node in tree:
        my_str = str(node)
        if hasattr(node, "position"):
          if node.position:
            my_str += " " + str(node.position)
        if hasattr(node, "value"):
          if node.value:
            my_str += " " + str(node.value)
        if isinstance(node, javalang.tree.IfStatement):
          my_str += "\n--condition: " + str(node.condition)
          if isinstance(node.condition, javalang.tree.BinaryOperation):
            my_str += "\n----operator: " + str(node.condition.operator)
            my_str += "\n----Left:  " + str(node.condition.operandl.member)
            my_str += "\n----Right: " + str(node.condition.operandr.member)
        if isinstance(node, javalang.tree.BinaryOperation):
          my_str += "\n--operator: " + str(node.operator)
        print(my_str)

if __name__ == "__main__":
  if len(sys.argv) > 1:
    for name in sys.argv[1:]:
      Example(name)

If you run these examples, the tokenizer correctly prints out the postfix increment and the not operators. When breaking down the nodes though, the increment and not are left out entirely. Of special interest is the breakdown of the AND gate:

IfStatement
--condition: BinaryOperation
----operator: &&
----Left:  flag
----Right: counter

Popping through the Parser code, I'm not actually seeing anything that finds unary ops in the compilation_unit logic (admittedly, I've only been looking for a bit). Am I just missing something?
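
A note that may help here: in the javalang versions I have looked at, unary operators do not become nodes of their own. They seem to be stored on the operand node in the `prefix_operators` and `postfix_operators` attributes. The sketch below reads those attributes defensively. `unary_operators` and `FakeMemberReference` are hypothetical names used only so the example runs without javalang installed.

```python
def unary_operators(node):
    """Return (prefix, postfix) operator lists for an expression node."""
    # Assumption: operators are stored in `prefix_operators` and
    # `postfix_operators`, which may be None or missing on some nodes.
    prefix = list(getattr(node, "prefix_operators", None) or [])
    postfix = list(getattr(node, "postfix_operators", None) or [])
    return prefix, postfix


class FakeMemberReference:
    """Hypothetical stand-in for javalang.tree.MemberReference."""

    def __init__(self, member, prefix_operators=None, postfix_operators=None):
        self.member = member
        self.prefix_operators = prefix_operators
        self.postfix_operators = postfix_operators


flag = FakeMemberReference("flag", prefix_operators=["!"])
counter = FakeMemberReference("counter", postfix_operators=["++"])
flag_ops = unary_operators(flag)
counter_ops = unary_operators(counter)
print(flag.member, flag_ops)
print(counter.member, counter_ops)
```

In the script above, the same check could be applied to node.condition.operandl to see whether the `!` in front of flag was recorded.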

found a bug

I found a bug,
in file : tokenizer.py
line 470 :
escape_code = int(self.data[j:j+4], 16)
should be: escape_code = int(data[j:j+4], 16)

Thanks for your great job!

String values don't properly handle unicode escapes

I am using javalang to tokenize files which include Unicode escape sequences. These are correctly tokenized as strings, but the item.value is not handled cleanly. Consider the 2 cases below:
Case 1: builder.append(text, 0, MAX_TEXT).append('\u2026');
Case 2: builder.append(text, 0, MAX_TEXT).append('…');

In both cases, item.value is identical, and I get an exception if I try to write item.value to a file. I can catch the error and successfully print using Python like this:

      if (token_type == 'String'):
          try:
              outfile.write(item.value)
          except UnicodeEncodeError:
              outfile.write(item.value.encode('unicode-escape').decode('utf-8'))

but the Python code above prints the same value for Case 1 and Case 2. I suspect the proper fix is to use raw strings for String token values internal to javalang. Below is an example of raw strings solving the problem.

>>> str1 = '…'
>>> str2 = '\u2026'
>>> print("str1: ",str1," str2:",str2)
str1:  …  str2: …
>>> str1 == str2
True
>>> str1 = r'…'
>>> str2 = r'\u2026'
>>> print("str1: ",str1," str2:",str2)
str1:  …  str2: \u2026
>>> str1 == str2
False
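
For the write error specifically, a UnicodeEncodeError usually means the output file's encoding cannot represent the character. A minimal sketch, assuming that is the cause here: open the file with an explicit UTF-8 encoding. This does not make Case 1 and Case 2 distinguishable; it only avoids the exception. The file name is a placeholder.

```python
import os
import tempfile

value = "\u2026"  # the same character as a literal '…'

# Write with an explicit encoding instead of the platform default.
path = os.path.join(tempfile.mkdtemp(), "tokens.txt")
with open(path, "w", encoding="utf-8") as outfile:
    outfile.write(value)

with open(path, encoding="utf-8") as infile:
    round_tripped = infile.read()

# If an ASCII-only file is required, the escaped form can be written instead.
escaped = value.encode("unicode-escape").decode("ascii")
print(round_tripped == value, escaped)
```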

Questions about the library

Hi everyone,

This project sounds really good.
I have some questions that perhaps could be included in the ReadMe:

  • Is it Python3 compatible?
  • What are the limitations of the library?
  • How can I make a traversal in the AST? Do we have visitors?
  • Do you think it is suitable for a refactoring tool?

Thanks,
Luis

JavaSyntaxError parsing method reference

The code below is throwing a JavaSyntaxError because of the brackets in "Long[]::new", and I can't figure out why. Is it invalid Java?

public class StatementStepDefinition {
    public void foobar() {
        final Long[] secondaryTransactionsIds = getSecondaryTransactions().stream().toArray(Long[]::new);
    }
}

Array argument

I'm currently using the Java parser javalang because I need to iterate through the AST of some Java source code, and I need to do it from a Python file.

It appears to be quite useful, but I have a problem when parsing arguments in a method signature. For example, in a classic

public void main(String[] args){}

The String[] args is parsed as a FormalParameter, which is a subclass of Declaration, itself a subclass of Node. In this particular example, the type field of the FormalParameter is a ReferenceType, which has 4 fields: name, declarations, arguments, sub_type. The name field returns only String, and sub_type returns None. There is no indication that "args" is an array. How can I get that information back?
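
One thing worth checking, offered as an assumption rather than a confirmed answer: in the javalang versions I am aware of, type nodes carry a `dimensions` attribute, a list with one entry per pair of brackets. The sketch below counts those entries. `array_depth` and `FakeReferenceType` are hypothetical names so the example runs without javalang installed.

```python
def array_depth(type_node):
    """Return the number of array dimensions recorded on a type node."""
    # Assumption: `dimensions` is a list with one entry per "[]",
    # and is empty, None or missing for non-array types.
    return len(getattr(type_node, "dimensions", None) or [])


class FakeReferenceType:
    """Hypothetical stand-in for javalang.tree.ReferenceType."""

    def __init__(self, name, dimensions=None):
        self.name = name
        self.dimensions = dimensions


string_array = FakeReferenceType("String", dimensions=[None])
plain_string = FakeReferenceType("String", dimensions=[])
depth_array = array_depth(string_array)
depth_plain = array_depth(plain_string)
print(depth_array, depth_plain)
```

With a real tree, the call would be array_depth(parameter.type) for each FormalParameter; varargs parameters such as String... may be flagged separately.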
