Comments (7)

mystor commented on September 26, 2024

How would you imagine these spans being implemented? My immediate thought was that they could be implemented by adding a

span: &'a str

field to each of the AST structs, holding a string slice into the input string. This wouldn't include line and column information, which would then take at least O(n) time to compute, but the slice would act as a sort of byte-index tracker.
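To make the trade-off concrete, here is a rough sketch (not syn's actual code) of how a slice-valued span could be turned back into a byte offset, and why line and column lookup is a linear scan. The helper names `byte_offset` and `line_col` are made up for illustration.

```rust
/// Byte offset of `span` inside `source`, or `None` if `span` does not
/// point into `source`. Hypothetical helper, relies on pointer comparison.
fn byte_offset(source: &str, span: &str) -> Option<usize> {
    let base = source.as_ptr() as usize;
    let ptr = span.as_ptr() as usize;
    if ptr >= base && ptr + span.len() <= base + source.len() {
        Some(ptr - base)
    } else {
        None
    }
}

/// 1-based line and 0-based byte column for a byte offset.
/// This scans everything before `offset`, so it is O(n) per lookup.
/// Assumes `offset` lies on a char boundary of `source`.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    let line = before.bytes().filter(|&b| b == b'\n').count() + 1;
    let col = match before.rfind('\n') {
        Some(newline) => offset - newline - 1,
        None => offset,
    };
    (line, col)
}
```

For example, with the source "fn a() {}\nfn b() {}", the slice covering the second function sits at byte offset 10, which `line_col` maps to line 2, column 0.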

The other simple option would be to do literally that: record the starting and ending byte indexes of each AST node in the source, like:

span: (usize, usize)

Actually tracking the occurrence of newline characters when parsing seems unpleasant, and like it would require non-trivial changes to nom, which is especially problematic if we are going to stabilize and expose our internal nom module as we're talking about doing in #81. In addition, even simple changes like adding the byte offsets seem like they would be easier if we tweak parts of how our nom fork works internally to track the original input string in addition to the current working substring (in order to be able to calculate byte offsets).
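As a rough illustration of tracking the original input next to the working substring, something like the following could work. `Cursor` and its methods are hypothetical and are not part of nom or syn; the identifier rule is deliberately simplified.

```rust
/// Hypothetical parser input: the full source plus the unparsed remainder.
#[derive(Clone, Copy)]
struct Cursor<'a> {
    original: &'a str,
    rest: &'a str,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { original: input, rest: input }
    }

    /// Byte offset of the remainder. This works because `rest` is always
    /// a suffix of `original`.
    fn offset(&self) -> usize {
        self.original.len() - self.rest.len()
    }

    fn skip_whitespace(self) -> Self {
        Cursor { rest: self.rest.trim_start(), ..self }
    }

    /// Parse a run of ASCII alphanumerics or underscores (a simplified
    /// identifier), returning its text, its (start, end) byte span, and
    /// the advanced cursor.
    fn ident(self) -> Option<(&'a str, (usize, usize), Cursor<'a>)> {
        let start = self.offset();
        let len = self
            .rest
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(self.rest.len());
        if len == 0 {
            return None;
        }
        let (word, rest) = self.rest.split_at(len);
        let next = Cursor { rest, ..self };
        Some((word, (start, next.offset()), next))
    }
}
```

Parsing "  foo bar" this way would give `foo` the span (2, 5) and `bar` the span (6, 9), without the parser ever counting newlines.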

from syn.

mystor commented on September 26, 2024

I should also add that one of the nice things about byte offsets is that they make it very easy to retrieve the original source text for an AST node from the source string, which helps with error reporting when using full, for example.
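A minimal sketch of what that could look like, assuming a half-open `(start, end)` byte range. `Span`, `source_text` and `render_error` are illustrative names rather than anything syn provides, and the caret rendering assumes the span sits on one line of ASCII text.

```rust
/// Hypothetical span type: half-open byte range into the source string.
type Span = (usize, usize);

/// Recover the original text of a node directly from its span.
fn source_text(source: &str, span: Span) -> &str {
    &source[span.0..span.1]
}

/// Render a small diagnostic: the message, the offending line, and
/// carets under the spanned bytes.
fn render_error(source: &str, span: Span, message: &str) -> String {
    // Start of the line containing the span.
    let line_start = source[..span.0].rfind('\n').map_or(0, |i| i + 1);
    // End of that line (exclusive), or end of input.
    let line_end = source[span.0..]
        .find('\n')
        .map_or(source.len(), |i| span.0 + i);
    let line = &source[line_start..line_end];
    let pad = " ".repeat(span.0 - line_start);
    let carets = "^".repeat((span.1 - span.0).max(1));
    format!("error: {}\n{}\n{}{}", message, line, pad, carets)
}
```

With the source "struct A;\nenum B {}" and the span (10, 14), `source_text` returns "enum" and `render_error` prints the second line with four carets under `enum`.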

dtolnay commented on September 26, 2024

Once we get procedural macros I would like to take advantage of the spans contained in those rather than implementing our own separate system. The parser will be able to parse a TokenStream rather than a string and it can keep track of the span of each syntax tree node. Then the user's procedural macro logic will be able to trigger errors on particular syntax tree nodes that rustc is able to display in the right place.

I haven't been keeping track of how far we are from a usable API for iterating through a TokenStream but once we have that, we can implement TokenStream parsing behind a cfg in syn.

mystor commented on September 26, 2024

I would also like spans for string inputs, for situations where I am parsing full .rs files with syn. Do you think whatever solution we end up using will support both?

dtolnay commented on September 26, 2024

I think nom handles this with its InputIter trait, which abstracts the difference between &[u8] and &str so that most parsers work with either one. We could do a similar abstraction over &str vs TokenStream and treat spans differently in the two cases. For now, (usize, usize) for the string case seems good to me.
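One possible shape for such an abstraction, purely as a sketch: a trait with an associated span type, implemented once for string input and once for token input. Everything here (`SpannedInput`, `StrInput`, `TokenInput`, `TokenSpan`) is hypothetical. `TokenSpan` merely stands in for a compiler-provided span, and none of this is nom's or proc_macro's real API.

```rust
/// Hypothetical trait: an input that can report the span of its next item.
trait SpannedInput {
    type Span;
    fn next_span(&self) -> Option<Self::Span>;
}

/// String input: spans are (start, end) byte offsets into `original`.
struct StrInput<'a> {
    original: &'a str,
    rest: &'a str,
}

impl<'a> SpannedInput for StrInput<'a> {
    type Span = (usize, usize);

    fn next_span(&self) -> Option<(usize, usize)> {
        let start = self.original.len() - self.rest.len();
        let c = self.rest.chars().next()?;
        Some((start, start + c.len_utf8()))
    }
}

/// Stand-in for an opaque span handed out by the compiler.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TokenSpan(u32);

/// Stand-in token; other token data is omitted.
struct Token {
    span: TokenSpan,
}

/// Token input: spans are whatever the tokens already carry.
struct TokenInput<'a> {
    rest: &'a [Token],
}

impl<'a> SpannedInput for TokenInput<'a> {
    type Span = TokenSpan;

    fn next_span(&self) -> Option<TokenSpan> {
        self.rest.first().map(|t| t.span)
    }
}

/// Parser code written against the trait works with either input.
fn peek_span<I: SpannedInput>(input: &I) -> Option<I::Span> {
    input.next_span()
}
```

The associated type lets the string case keep cheap `(usize, usize)` offsets while the token case passes compiler spans through untouched.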

dtolnay commented on September 26, 2024

@mystor have you looked into possibly using strata_rs for your use case? It looks like they have spans (which they call Extent) already.

In general it looks like that library is designed for more advanced use cases and it may serve us better to direct people who need spans to use strata_rs and keep syn focused on the proc macro use case.

dtolnay commented on September 26, 2024

This is superseded by #142 which will use real spans from the compiler.
