Comments (7)
How would you imagine these spans being implemented? My immediate thought was that they could be implemented by adding a `span: &'a str` field to each of the AST structs, which holds a string slice into the input string. This wouldn't include line and column number info, which would then take at least O(n) time to compute, but it would act as a sort-of byte-index tracker.
The other simple option would be to do literally that, which is record the starting and ending byte indexes of each of the AST nodes in the source, like:
span: (usize, usize)
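To make the two options concrete, here is a minimal sketch of how a slice-style span could be converted into a byte range, and how line and column could be recovered with an O(n) scan. The function names `slice_to_range` and `line_col` are hypothetical and not part of syn or nom; the sketch assumes the span really is a subslice of the same source string.

```rust
/// Converts a subslice of `source` into a half-open byte range.
/// Returns `None` if `span` does not lie inside `source`'s memory.
/// Hypothetical helper, assuming `span` was produced by slicing `source`.
fn slice_to_range(source: &str, span: &str) -> Option<(usize, usize)> {
    let base = source.as_ptr() as usize;
    let start = span.as_ptr() as usize;
    let end = start + span.len();
    if start < base || end > base + source.len() {
        return None;
    }
    Some((start - base, end - base))
}

/// Computes a 1-based (line, column) pair for a byte offset by scanning
/// everything before it, which is the O(n) cost mentioned above.
/// Columns are counted in chars. Panics if `offset` is out of range or
/// not on a char boundary.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let prefix = &source[..offset];
    let line = prefix.matches('\n').count() + 1;
    let col = match prefix.rfind('\n') {
        Some(newline) => prefix[newline + 1..].chars().count() + 1,
        None => prefix.chars().count() + 1,
    };
    (line, col)
}
```

Either representation can be turned into the other as long as the original source string is still available, so the choice mostly affects lifetimes on the AST types.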
Actually tracking the occurrence of newline characters when parsing seems unpleasant, and like it would require non-trivial changes to nom, which is especially problematic if we are going to stabilize and expose our internal nom module as we're talking about doing in #81. In addition, even simple changes like adding the byte offsets seem like they would be easier if we tweak how our nom fork works internally so that it tracks the original input string in addition to the current working substring (in order to be able to calculate byte offsets).
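As a rough illustration of tracking the original input alongside the working substring, the sketch below uses a hypothetical `Input` wrapper and a hypothetical `tag` parser. This is not how nom or syn's fork is actually structured; it only assumes that the remaining input is always a suffix of the original.

```rust
/// Hypothetical parser input: the full source plus the unparsed remainder.
/// Invariant assumed here: `rest` is always a suffix of `original`.
#[derive(Clone, Copy)]
struct Input<'a> {
    original: &'a str,
    rest: &'a str,
}

impl<'a> Input<'a> {
    fn new(source: &'a str) -> Self {
        Input { original: source, rest: source }
    }

    /// Byte offset of the current position within the original source.
    fn offset(&self) -> usize {
        self.original.len() - self.rest.len()
    }

    /// Consumes `n` bytes; panics if `n` is not on a char boundary.
    fn advance(self, n: usize) -> Input<'a> {
        Input { original: self.original, rest: &self.rest[n..] }
    }
}

/// Matches a literal prefix and returns the remaining input together
/// with the (start, end) byte span of what was consumed.
fn tag<'a>(input: Input<'a>, literal: &str) -> Option<(Input<'a>, (usize, usize))> {
    if input.rest.starts_with(literal) {
        let start = input.offset();
        Some((input.advance(literal.len()), (start, start + literal.len())))
    } else {
        None
    }
}
```

Because the offset is derived from two lengths, no newline bookkeeping is needed during parsing.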
from syn.
I should also add that one of the nice things about doing byte offsets for this is that it makes it very easy to retrieve the original source text for an AST node from the source string, which is nice for error reporting when using `full`, for example.
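A small sketch of that lookup, assuming spans are stored as half-open `(usize, usize)` byte ranges. The helpers `source_text` and `report` are hypothetical names used only for illustration.

```rust
/// Returns the source text covered by a half-open byte span.
/// Yields `None` if the span is out of range or splits a UTF-8 character.
fn source_text(source: &str, span: (usize, usize)) -> Option<&str> {
    source.get(span.0..span.1)
}

/// Formats a one-line diagnostic that quotes the offending source text.
fn report(source: &str, span: (usize, usize), message: &str) -> String {
    match source_text(source, span) {
        Some(text) => format!(
            "error at bytes {}..{}: {}: `{}`",
            span.0, span.1, message, text
        ),
        None => format!(
            "error at bytes {}..{}: {} (span out of range)",
            span.0, span.1, message
        ),
    }
}
```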
Once we get procedural macros I would like to take advantage of the spans contained in those rather than implementing our own separate system. The parser will be able to parse a TokenStream rather than a string and it can keep track of the span of each syntax tree node. Then the user's procedural macro logic will be able to trigger errors on particular syntax tree nodes that rustc is able to display in the right place.
I haven't been keeping track of how far we are from a usable API for iterating through a TokenStream but once we have that, we can implement TokenStream parsing behind a cfg in syn.
I would also like spans for string inputs, for situations where I am parsing full .rs files with syn. Do you think whatever solution we end up using will support both?
I think nom handles this with their `InputIter` trait, which abstracts the difference between &[u8] and &str so that most parsers work with either one. We could do a similar abstraction over &str vs TokenStream and treat spans differently in the two cases. For now, `(usize, usize)` for the string case seems good to me.
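One possible shape for such an abstraction is a trait with an associated span type, sketched below. Everything here is hypothetical: `SpannedInput`, `StrInput`, `TokenInput`, and `MockSpan` are illustrative names, and `MockSpan` merely stands in for whatever span type a real TokenStream would provide.

```rust
use std::fmt::Debug;

/// Hypothetical abstraction: each kind of input picks its own span type.
trait SpannedInput {
    type Span: Debug;
    /// Span of the next unit of input (a char or a token), if any.
    fn next_span(&self) -> Option<Self::Span>;
}

/// String input: the unparsed remainder plus its byte offset in the source.
struct StrInput<'a> {
    rest: &'a str,
    offset: usize,
}

impl<'a> SpannedInput for StrInput<'a> {
    type Span = (usize, usize);
    fn next_span(&self) -> Option<(usize, usize)> {
        let c = self.rest.chars().next()?;
        Some((self.offset, self.offset + c.len_utf8()))
    }
}

/// Stand-in for a compiler-provided span (an assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
struct MockSpan(u32);

/// Token input: here reduced to just the spans of the remaining tokens.
struct TokenInput<'a> {
    spans: &'a [MockSpan],
}

impl<'a> SpannedInput for TokenInput<'a> {
    type Span = MockSpan;
    fn next_span(&self) -> Option<MockSpan> {
        self.spans.first().copied()
    }
}

/// Generic code can work with either input without knowing the span type.
fn describe_next<I: SpannedInput>(input: &I) -> String {
    match input.next_span() {
        Some(span) => format!("next item at {:?}", span),
        None => "end of input".to_string(),
    }
}
```

Parsers written against the trait would not need to know which span representation is in use, which is the property the two cases need to share.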
@mystor have you looked into possibly using strata_rs for your use case? It looks like they have spans (which they call Extent) already.
In general it looks like that library is designed for more advanced use cases and it may serve us better to direct people who need spans to use strata_rs and keep syn focused on the proc macro use case.
This is superseded by #142 which will use real spans from the compiler.