Comments (8)
I'm curious. For my needs combine works just fine with indentation-aware syntax, by providing a tokenizer which emits `indent` and `unindent` tokens. As far as I know, Python handles indentation at the tokenizer level, as do most implementations of YAML.
So the question is: how often do you actually work on raw bytes without any tokenizer? I have always thought a tokenizer is required for any serious work (i.e. for anything more complex than parsing `1+2/3`). It's especially important because combine is too slow (to compile), and trying to use it on raw chars will probably make compilation times even slower.
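A minimal sketch of the tokenizer approach described above, assuming space-only indentation and treating each non-blank line as a single token. The names `Token`, `Line`, and `tokenize` are hypothetical and use only the Rust standard library; this is not combine's API or any project's actual lexer.

```rust
/// Tokens produced by this sketch lexer. `Indent` and `Unindent` are the
/// synthetic tokens; `Line` stands in for the real tokens of a language.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Indent,
    Unindent,
    Line(String),
}

/// Turns leading-space indentation into explicit Indent/Unindent tokens.
/// Assumption: indentation uses spaces only; blank lines are skipped.
pub fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    // Stack of the indentation widths of all currently open blocks.
    let mut stack = vec![0usize];
    let mut out = Vec::new();

    for (n, line) in src.lines().enumerate() {
        let text = line.trim_start_matches(' ');
        if text.trim().is_empty() {
            continue; // blank lines never change the block structure
        }
        let width = line.len() - text.len();

        if width > *stack.last().unwrap() {
            // Deeper than the enclosing block: open a new one.
            stack.push(width);
            out.push(Token::Indent);
        } else {
            // Shallower: close blocks until we are back at a known width.
            while width < *stack.last().unwrap() {
                stack.pop();
                out.push(Token::Unindent);
            }
            if width != *stack.last().unwrap() {
                return Err(format!("line {}: inconsistent unindent", n + 1));
            }
        }
        out.push(Token::Line(text.trim_end().to_string()));
    }

    // Close any blocks still open at end of input.
    while stack.len() > 1 {
        stack.pop();
        out.push(Token::Unindent);
    }
    Ok(out)
}
```

Calling `tokenize` on a nested snippet yields a flat token stream in which every `Indent` is balanced by an `Unindent`, so a parser built on top of it never has to look at whitespace.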
from combine.
I guess this may just be me throwing the idea out there, as I am thinking about moving to an indentation-based syntax for https://github.com/Marwes/embed_lang. I am aware you can handle indentation in the lexer, as I have done it that way for Haskell: https://github.com/Marwes/haskell-compiler/blob/master/src/lexer.rs#L384-L477. It is not really trivial though, so it would be nice if there was a ready way to do it through a library.
I have only done parsers which work directly with char, and while it's not as efficient as using a separate lexer, it does work and I find it makes the parser + lexer simple and easy to modify. If or when I actually need a separate lexer, I think it should be easy to move over to that as well. (https://github.com/Marwes/embed_lang/blob/master/parser/src/lib.rs)
It's nice to see someone who has a dedicated lexer though, got any link to that? I am hoping that #37 will make it a bit easier to add a lexer; it would be nice to see an example of a working one.
This would definitely be a nice feature to have. I keep meaning to add support for I-expressions to my Scheme parser, and built-in support for indentation-sensitive syntax would make that a lot less work.
> It's nice to see someone who has a dedicated lexer though, got any link to that?
https://github.com/tailhook/marafet/tree/master/marafet_parser/src
It's a little bit shitty, because I've tried to quickly port it to new features (in particular `Positioner` and `Range`) without getting a real understanding of how they are supposed to work.
@tailhook Nice, just open an issue if you have trouble understanding `Positioner` and `Range`. I should probably add some better docs for those. Anyway, for `Range` you don't need to invent a dummy type, just use the same type you have for `Item`. `Range` is only meant for `RangeStream` to have a way of storing errors.
Out of scope.
I came across this issue because I've been writing a parser with combine (and really enjoying it!) and was wondering about the best approach for making it indentation-sensitive.
I totally get that first-class support for this is out of scope, but I'm wondering if there's a recommended approach?
Thanks for a lovely library!
@rtfeldman gluon is indentation sensitive, but it is quite a mess, really: https://github.com/gluon-lang/gluon/blob/master/parser/src/layout.rs
The basic idea is that as you scan the input text you emit `block open`/`block close` tokens in between the normal, visible tokens. Then the parser is written to match on those tokens.
Other than that, just google around, I think; I don't have any good resources for it, unfortunately.
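To illustrate the parser side of that idea, here is a rough sketch of a hand-written recursive-descent parser that matches on block open/close tokens. It assumes the scanner has already produced the token stream; `Token`, `Node`, `parse_items`, and `parse` are hypothetical names, and nothing here is taken from gluon's `layout.rs` or from combine's API.

```rust
/// Token stream as a layout-aware scanner might emit it (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    BlockOpen,
    BlockClose,
    Ident(String),
}

/// Tree built by the parser: leaves and nested blocks.
#[derive(Debug, PartialEq)]
pub enum Node {
    Leaf(String),
    Block(Vec<Node>),
}

/// Parses items until the input ends or a `BlockClose` is reached.
/// The `BlockClose` itself is left for the caller to consume.
fn parse_items(tokens: &[Token], pos: &mut usize) -> Result<Vec<Node>, String> {
    let mut items = Vec::new();
    while let Some(tok) = tokens.get(*pos) {
        match tok {
            Token::Ident(name) => {
                *pos += 1;
                items.push(Node::Leaf(name.clone()));
            }
            Token::BlockOpen => {
                *pos += 1;
                // Recurse for the block body, then require the matching close.
                let body = parse_items(tokens, pos)?;
                match tokens.get(*pos) {
                    Some(Token::BlockClose) => *pos += 1,
                    _ => return Err("unclosed block".to_string()),
                }
                items.push(Node::Block(body));
            }
            Token::BlockClose => break,
        }
    }
    Ok(items)
}

/// Entry point: the whole token stream must be consumed.
pub fn parse(tokens: &[Token]) -> Result<Vec<Node>, String> {
    let mut pos = 0;
    let items = parse_items(tokens, &mut pos)?;
    if pos != tokens.len() {
        return Err(format!("unexpected block close at token {}", pos));
    }
    Ok(items)
}
```

Because the block structure is already explicit in the tokens, the parser itself contains no indentation logic at all; the same shape should carry over to a combinator-based parser that takes tokens rather than chars as input.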