This is a requirement for implementing something like rustfmt against syn.


Implement spans about syn HOT 7 CLOSED

dtolnay commented on September 26, 2024

Implement spans

from syn.

Comments (7)

mystor commented on September 26, 2024

How would you imagine these spans being implemented? My immediate thought was that they could be implemented by adding a

span: &'a str

field to each of the AST structs, which holds a string slice into the input string. This wouldn't include line and column number info, which would then be at minimum O(n) time to compute, but would act as a sort-of-byte-index tracker.
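To illustrate the O(n) cost mentioned above, here is a minimal sketch of recovering line and column from a byte offset by scanning the source from the start. The function name `line_col` is hypothetical and not part of syn; columns are counted in characters, which is an assumption about what a caller would want.

```rust
/// Compute a 1-based (line, column) pair for a byte offset by scanning the
/// source from the beginning, which is linear in the offset.
/// Columns are counted in characters, not bytes.
/// Returns None if the offset is out of range or not on a char boundary.
fn line_col(source: &str, offset: usize) -> Option<(usize, usize)> {
    // `is_char_boundary` is false for any offset past the end of the string.
    if !source.is_char_boundary(offset) {
        return None;
    }
    let mut line = 1;
    let mut col = 1;
    for ch in source[..offset].chars() {
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    Some((line, col))
}
```

A caller that needs many lookups could instead precompute the offsets of all newlines once and binary search them, at the cost of keeping that table around.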

The other simple option would be to do literally that, which is record the starting and ending byte indexes of each of the AST nodes in the source, like:

span: (usize, usize)

Actually tracking the occurrence of newline characters when parsing seems unpleasant, and it would likely require significant changes to nom, which is especially problematic if we are going to stabilize and expose our internal nom module, as we're talking about doing in #81. In addition, even simple changes like adding byte offsets seem like they would be easier if we tweak how our nom fork works internally so that it tracks the original input string in addition to the current working substring (in order to be able to calculate byte offsets).
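As a rough sketch of the byte offset idea: if the parser keeps the original input alongside the current working substring, the offset falls out of pointer arithmetic. The names `offset_in` and `ident_span` are hypothetical, and the identifier rule here (ASCII alphanumerics and underscore) is a simplification rather than how syn or nom actually lex.

```rust
/// Byte offset of `rest` within `original`, assuming `rest` is a subslice of
/// `original` (as a parser's remaining input would be).
/// Returns None if `rest` does not lie inside `original`.
fn offset_in(original: &str, rest: &str) -> Option<usize> {
    let start = original.as_ptr() as usize;
    let here = rest.as_ptr() as usize;
    if here < start || here + rest.len() > start + original.len() {
        return None;
    }
    Some(here - start)
}

/// Hypothetical parser step: consume a leading run of identifier characters
/// and return its (start, end) byte span in `original` plus the remaining input.
fn ident_span<'a>(original: &str, input: &'a str) -> Option<((usize, usize), &'a str)> {
    let len = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    if len == 0 {
        return None;
    }
    let start = offset_in(original, input)?;
    Some(((start, start + len), &input[len..]))
}
```

With this shape, the only extra state a parser has to thread through is a reference to the original string; no newline bookkeeping is needed during parsing.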

mystor commented on September 26, 2024

I should also add that one of the nice things about byte offsets is that they make it very easy to retrieve the original source text for an AST node from the source string, which is nice for error reporting when using the full feature, for example.
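For instance, with a `(usize, usize)` span the lookup is a single slice of the source string. The `Node` type and `source_text` function below are hypothetical stand-ins, not syn's actual AST types.

```rust
/// Hypothetical AST node carrying a byte span, as in `span: (usize, usize)`.
struct Node {
    span: (usize, usize),
}

/// Return the source text covered by a node.
/// Returns None if the span is out of bounds, reversed, or does not fall on
/// UTF-8 character boundaries.
fn source_text<'a>(source: &'a str, node: &Node) -> Option<&'a str> {
    let (start, end) = node.span;
    source.get(start..end)
}
```

Using `str::get` rather than indexing avoids a panic if a span ever gets out of sync with the source it is applied to.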

dtolnay commented on September 26, 2024

Once we get procedural macros I would like to take advantage of the spans contained in those rather than implementing our own separate system. The parser will be able to parse a TokenStream rather than a string and it can keep track of the span of each syntax tree node. Then the user's procedural macro logic will be able to trigger errors on particular syntax tree nodes that rustc is able to display in the right place.

I haven't been keeping track of how far we are from a usable API for iterating through a TokenStream but once we have that, we can implement TokenStream parsing behind a cfg in syn.

mystor commented on September 26, 2024

I would also like spans for string inputs, for situations where I am parsing full .rs files with syn. Do you think whatever solution we end up using will support both?

dtolnay commented on September 26, 2024

I think nom handles this with their InputIter trait which abstracts the difference between &[u8] and &str so that most parsers work with either one. We could do a similar abstraction over &str vs TokenStream and treat spans differently in the two cases. For now, (usize, usize) for the string case seems good to me.
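One way such an abstraction might look, purely as a sketch: a trait with an associated span type, implemented once for string input (byte offsets) and once for token input. Every name here (`Input`, `StrInput`, `TokenInput`, `FakeSpan`, `FakeToken`, `consumed_span`) is hypothetical; the token types are stand-ins and do not use the real proc macro API, and this is not nom's InputIter.

```rust
/// Hypothetical abstraction over parser inputs that can report what they consumed.
trait Input {
    type Span;
    /// Span of the input consumed between `self` and the later position `rest`.
    fn span_until(&self, rest: &Self) -> Self::Span;
}

/// String input: a cursor into the original source, tracked by byte offset.
#[derive(Clone, Copy)]
struct StrInput<'a> {
    source: &'a str,
    pos: usize,
}

impl<'a> Input for StrInput<'a> {
    type Span = (usize, usize);
    fn span_until(&self, rest: &Self) -> (usize, usize) {
        // Clamp the end so a span never points past the source.
        (self.pos, rest.pos.min(self.source.len()))
    }
}

/// Stand-in for a compiler-provided span.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FakeSpan(u32);

/// Stand-in for a token that carries its own span.
struct FakeToken {
    span: FakeSpan,
}

/// Token input: a cursor into a slice of tokens.
#[derive(Clone, Copy)]
struct TokenInput<'a> {
    tokens: &'a [FakeToken],
    pos: usize,
}

impl<'a> Input for TokenInput<'a> {
    // For simplicity this reports the span of the first consumed token, if any.
    type Span = Option<FakeSpan>;
    fn span_until(&self, rest: &Self) -> Option<FakeSpan> {
        if rest.pos > self.pos {
            self.tokens.get(self.pos).map(|t| t.span)
        } else {
            None
        }
    }
}

/// Generic code can record spans without knowing which input kind it has.
fn consumed_span<I: Input>(before: &I, after: &I) -> I::Span {
    before.span_until(after)
}
```

The associated type is what lets the two cases "treat spans differently": string parsing yields `(usize, usize)`, while token parsing can hand back whatever the compiler provides.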

dtolnay commented on September 26, 2024

@mystor have you looked into possibly using strata_rs for your use case? It looks like they have spans (which they call Extent) already.

In general it looks like that library is designed for more advanced use cases and it may serve us better to direct people who need spans to use strata_rs and keep syn focused on the proc macro use case.

dtolnay commented on September 26, 2024

This is superseded by #142 which will use real spans from the compiler.
