Comments (4)

nitz commented on June 30, 2024

Also, well, you got me there!

A slight tweak to one of my latest Node parsers and a few removals of Try() where I shouldn't have been using it, and I'm off to the races solving problems with my text parsers!

Here's what cleared up the zero-length parses, if I'm understanding right: requiring at least one argument if there's no identifier (meaning there should be no case where a node gets returned with no name and no arguments!)

		private static TokenListParser<Token, Element> Node { get; } =
			(from element in Identifier
				.Then(name =>
					Argument.Many().OptionalOrDefault(Array.Empty<Element>())
					.Named("arguments")
					.Select(x => new Node(name, x) as Element)
				)
				.Named("node name")
				.Or(
					Argument.AtLeastOnce()
					.Named("arguments")
					.Select(x => new Node(string.Empty, x) as Element)
				)
			 select element)
			.Named(nameof(Node));

You're fine to close this one if you want, I'm only leaving it open at the moment just in case you have anything to add about another way to force a failure. 🙂

from superpower.

nblumhardt commented on June 30, 2024

Hi! Thanks for dropping by. SDLang looks like a nice little DSL, hopefully writing the parser will turn out to be fun :-)

I'm on pretty limited time to reply - have a couple of points that may help (below), but I'll circle back when I can, so if you end up with more questions feel free to add them here (though replies could be slow, sorry).

Try and backtracking - it's a bit long in the tooth, now, but reading this short series of blog posts by Brian McNamara should get you all the info you need; Superpower uses a slightly different mechanism under the hood but the Try() combinator is essentially the one Brian describes.
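A minimal sketch of where Try() matters, assuming the Superpower NuGet package is referenced. The parser names (LineComment, SlashDash) are made up for illustration and are not taken from the SDLang parser discussed in this thread. Both alternatives start with the same character, so the first one can fail after already consuming input.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

static class TryDemo
{
    // Two hypothetical alternatives that share the leading '/' character.
    static readonly TextParser<string> LineComment =
        Character.EqualTo('/')
            .Then(_ => Character.EqualTo('/'))
            .Select(_ => "line comment");

    static readonly TextParser<string> SlashDash =
        Character.EqualTo('/')
            .Then(_ => Character.EqualTo('-'))
            .Select(_ => "slash-dash");

    // Without Try(): on "/-", LineComment consumes '/' before failing on '-',
    // so Or() treats it as a partial match and does not attempt SlashDash.
    static readonly TextParser<string> WithoutTry = LineComment.Or(SlashDash);

    // With Try(): the partial failure is reported as if no input had been consumed,
    // which lets Or() fall through to SlashDash.
    static readonly TextParser<string> WithTry = LineComment.Try().Or(SlashDash);

    static void Main()
    {
        // Print whether each combined parser accepts the input "/-".
        Console.WriteLine(WithoutTry.TryParse("/-").HasValue);
        Console.WriteLine(WithTry.TryParse("/-").HasValue);
    }
}
```

Try() is only needed on the alternative that can fail part-way through. Putting it on every parser tends to hide where the real failure happened, which makes error messages less precise.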

Zero-length matches - imagine you have a parser A that parses strings like aaaaa and returns the number of a characters in them. A.Parse("aaa") would be 3, while A.Parse("") would be 0. Now imagine applying the Many() combinator to it: A.Many(). The A parser will succeed on "", but it won't consume any input, so when Many() tries A again it will succeed again - and so on, indefinitely. Therein lies the road to madness... so to prevent this sticky situation (which can otherwise get very hard to detect), Many() disallows zero-length matches, and most Superpower parsers shouldn't allow them. What it means if you're hitting this: you have a parser that succeeds without consuming input - somewhere. Best to rewrite so that the parser fails on no match, and then use OptionalOrDefault() to handle the empty case.
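The parser A described above could be sketched roughly like this, assuming the Superpower NuGet package is referenced. The names CountPermissive, CountStrict and CountOrZero are invented for the example. The last call wraps the permissive parser in Many() to show the situation being warned about. The broad catch is deliberate, since the exact exception type is an assumption here.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

static class ZeroLengthDemo
{
    // Permissive version of "A": Many() accepts zero items,
    // so this parser succeeds on "" without consuming any input.
    static readonly TextParser<int> CountPermissive =
        Character.EqualTo('a').Many().Select(chars => chars.Length);

    // Strict version: fails unless at least one 'a' is present.
    static readonly TextParser<int> CountStrict =
        Character.EqualTo('a').AtLeastOnce().Select(chars => chars.Length);

    // The empty case is handled explicitly, outside the strict parser.
    static readonly TextParser<int> CountOrZero = CountStrict.OptionalOrDefault(0);

    static void Main()
    {
        Console.WriteLine(CountPermissive.Parse("aaa"));
        Console.WriteLine(CountOrZero.Parse(""));

        // The strict parser reports a failure on empty input instead of a zero-length success.
        Console.WriteLine(CountStrict.TryParse("").HasValue);

        try
        {
            // Repeating a parser that can succeed without consuming input:
            // this is the combination Many() is described as disallowing.
            CountPermissive.Many().Parse("");
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The strict parser is the one that is safe to repeat or combine further. The default value is supplied once, at the point where an empty result is actually acceptable.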

How to decide on the boundaries of tokens - in general, tokens should be as inclusive as the grammar allows. If //, when it appears in the grammar, can only mark the start of a comment, and you never "look inside" the comment, then the token should begin at // and continue to the end of the comment. If // can appear in situations where it's not marking a comment, or if your grammar needs to split the comment contents out from the starting delimiter, then treating // as one token and the comment body as another will be needed. There are really no hard-and-fast rules, though.
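One way this guideline might translate into a tokenizer, sketched with Superpower's TokenizerBuilder and assuming the Superpower NuGet package is referenced. The SdlToken enum and its member names are hypothetical, and the grammar is cut down to three token kinds. A // comment becomes a single token running to the end of the line, while a delimiter such as /- is its own token so that whatever follows is tokenized normally.

```csharp
using System;
using Superpower;
using Superpower.Parsers;
using Superpower.Tokenizers;

// Hypothetical token kinds for illustration only.
enum SdlToken { LineComment, SlashDash, Name }

static class TokenBoundaryDemo
{
    static void Main()
    {
        var tokenizer = new TokenizerBuilder<SdlToken>()
            .Ignore(Span.WhiteSpace)
            // "//" to end of line is one inclusive token: nothing looks inside it.
            .Match(Comment.CPlusPlusStyle, SdlToken.LineComment)
            // "/-" is only the delimiter; the content after it gets its own tokens.
            .Match(Span.EqualTo("/-"), SdlToken.SlashDash)
            .Match(Identifier.CStyle, SdlToken.Name)
            .Build();

        // Print each token's kind and the text it covers.
        foreach (var token in tokenizer.Tokenize("/- hidden // trailing note"))
            Console.WriteLine($"{token.Kind}: {token.ToStringValue()}");
    }
}
```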

Hope this helps,
Nick

nitz commented on June 30, 2024

Nick,

Wow, thanks for the response and so quickly — this certainly does clear a lot up!

Especially on the token boundaries. The "look inside" guideline seems to be something I was close to, but wasn't quite getting. Seeing your explanation helps a ton with that. For instance, I'm planning on the traditional C++ style // comments, which I'd never need to look inside and would treat as a single chunk. I'm also planning on a /- comment style (one of the features I want to borrow from KDL, a fork of SDL). I still want to process all the text after that comment, but to deserialize it in a specific way that says "this content exists, but isn't part of the regular data stream", so it makes sense for /- to be a token.

On zero-length matches, your response makes perfect sense, and I think it is definitely what I'm seeing happen given my input. I think my goal is exactly as you say there: write it so it fails on no match. I'm just a bit confused on how to fail. I have definitely gotten too "permissive" in my approach, but I think that comes from not being exactly sure what will trigger a failure, and assuming each of my rules will run its course even on bad input.

I tried selecting a MyTokenListParserResult.Empty<...>() at one point, but found it wanted the remainder to return with, and couldn't figure out how to give it that. Is that the usual method for saying "I didn't parse anything"? Or should my rules be trying to consume what they expect and that will properly trigger the fail cases?

Thanks again!

Chris

nblumhardt commented on June 30, 2024

Hope it's going well! Will close this to keep the clutter under control in the issue tracker :-)
