
Comments (8)

dsagal commented on May 27, 2024

Is the linked pull request what you need?

from asttokens.

abulka commented on May 27, 2024

Looks promising!


dsagal commented on May 27, 2024

I actually changed it and committed to master, with some tests. It turns out there is no need for an include_extra parameter; it should just always be True. So if you are looking for tokenize.COMMENT, find_token will now find it, and if you are looking for a regular token, it works the same as before. The interface is unchanged, but your use case is fixed.

I'm closing, but let me know if you still have any issues with this.


abulka commented on May 27, 2024

Thanks - any idea when the new version will be available via pip?


dsagal commented on May 27, 2024

Just published.


abulka commented on May 27, 2024

Might have found a problem - or maybe it's the way I'm using the library. When I scan for comments on a node, find_token returns the next comment in the entire source, regardless of how far away it is. I need to find comments only on the line that the node is part of.

Here is the repro of the weird behaviour:

import ast
import asttokens
import tokenize
from textwrap import dedent

src = dedent("""
    def hello():
        x = 5
        there()
        
    def there():
        return 999  # my silly comment
    
    hello()  # call it
    there()        
""")

class RecursiveVisitor(ast.NodeVisitor):
    """ example recursive visitor """

    def recursive(func):
        """ decorator to make visitor work recursive """
        def wrapper(self,node):
            self.dump_line_and_comment(node)
            func(self,node)
            for child in ast.iter_child_nodes(node):
                self.visit(child)
        return wrapper

    def dump_line_and_comment(self, node):
        comment = atok.find_token(node.first_token, tokenize.COMMENT)
        print(f'On line "{node.first_token.line.strip():20s}" find_token found "{comment}"')

    @recursive
    def visit_Assign(self,node):
        """ visit a Assign node and visits it recursively"""

    @recursive
    def visit_BinOp(self, node):
        """ visit a BinOp node and visits it recursively"""

    @recursive
    def visit_Call(self,node):
        """ visit a Call node and visits it recursively"""

    @recursive
    def visit_Lambda(self,node):
        """ visit a Function node """

    @recursive
    def visit_FunctionDef(self,node):
        """ visit a Function node and visits it recursively"""


atok = asttokens.ASTTokens(src, parse=True)
tree = atok.tree
visitor = RecursiveVisitor()
visitor.visit(tree)

Gives me:

On line "def hello():        " find_token found "COMMENT:'# my silly comment'"
On line "x = 5               " find_token found "COMMENT:'# my silly comment'"
On line "there()             " find_token found "COMMENT:'# my silly comment'"
On line "def there():        " find_token found "COMMENT:'# my silly comment'"
On line "hello()  # call it  " find_token found "COMMENT:'# call it'"
On line "there()             " find_token found "ENDMARKER:''"


dsagal commented on May 27, 2024

That's not a problem with this module, it's just not a feature of it: find_token finds the next matching token regardless of the line. But line breaks themselves introduce tokens, so you can write a helper to find the next comment on the same line as a given token, like so:

def find_line_comment(atok, start_token):
    t = start_token
    while t.type not in (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER):
        t = atok.next_token(t, include_extra=True)
    return t if t.type == tokenize.COMMENT else None


abulka commented on May 27, 2024

Thanks - that helper routine works great. A slight tweak I made is to return either the comment string or an empty string:

def find_line_comment(atok, start_token):
    t = start_token
    while t.type not in (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER):
        t = atok.next_token(t, include_extra=True)
    return t.string if t.type == tokenize.COMMENT else ''

comment = find_line_comment(atok, node.first_token)

P.S. My old hack approach was not very 'token' based and, for the curious, was simply:

line = node.first_token.line
comment_i = line.find('#')
comment = line[comment_i:].strip() if comment_i != -1 else ''
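
One reason the token-based helper is worth it: a plain string search has no notion of context, so a '#' inside a string literal gets mistaken for the start of a comment (hypothetical input, for illustration):

```python
# the naive line.find('#') hack picks up the '#' inside the string literal
line = "s = 'issue #42'  # the real comment"
comment_i = line.find('#')
comment = line[comment_i:].strip() if comment_i != -1 else ''
print(comment)  # "#42'  # the real comment" - not the actual comment
```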

