This is a port of agdarsec to Idris.
gallais / idris-tparsec Goto Github PK
View Code? Open in Web Editor NEWTParsec - Total Parser Combinators in Idris
Home Page: https://gallais.github.io/idris-tparsec/
License: GNU General Public License v3.0
TParsec - Total Parser Combinators in Idris
Home Page: https://gallais.github.io/idris-tparsec/
License: GNU General Public License v3.0
This is a port of agdarsec to Idris.
I have written a small generic lexer as doc for the agda standard library's Trie.
I think it would be nice to have a (better, position-aware) version of this for
the total parser libraries. I'm planning to port it to agdarsec.
https://github.com/agda/agda-stdlib/blob/91f9c5ece802554a87f2106daa97902fefd17218/README/Trie.agda
I've pushed an initial attempt to the idris2 branch, the core library itself seems to compile (at least with the latest master
version), but the examples still need some work.
In addition to knowing the starting position of a parsed String, it would be very useful to know the ending position, so I would like to suggest generalising Position
to Span
.
It might be interesting to go through this paper to see if any of their patterns are applicable:
https://dl.acm.org/doi/pdf/10.1145/3471874.3472984 Willis, Wu, [2021] "Design Patterns for Parser Combinators"
https://github.com/j-mie6/design-patterns-for-parser-combinators
Using combinators on rec
appears to be fairly common, and it seems to always require defining helpers to make types align, e.g.:
andbox : All (Box (Parser' s) :-> Box (Parser' t) :-> Box (Parser' (s, t)))
andbox {s} {t} p q = map2 {a=Parser' s} {b=Parser' t} (\p, q => and p q) p q
nelistbox : All (Box (Parser' s) :-> Box (Parser' (NEList s)))
nelistbox {s} p = map {a=Parser' s} nelist p
Should we maybe provide a separate module for these out of the box :), or there's a more elegant solution?
I am very excited about this library. A couple years ago I was all into parser combinators and tried using Lightyear. It made all my functions partial and my program crashed and I had no idea why. I searched around until I found the Agda paper (which I guess you wrote) but was out of energy and gave up. It was beyond me at the time; I was only beginning to put effort toward reading papers. I developed a style for writing total parsers in Idris without combinators. It was heavy on boilerplate but I was that committed to totality. I would write that way even in F#. In many ways I would say the purpose of dependent types is to enable totality.
So I was very excited to see that you had ported your stuff to Idris. I read your Arithmetic example and it was very helpful. I am going to try to read the paper now.
If you wouldn't mind, I would love to see the STLC example documented in the style of the Arithmetic example docs. I would really like to grok and internalize what you've done here but my CS degree didn't prepare me at all for understanding academic computer science so I've been working to understand it on my own time because I have a great hunger for it.
Thank you for making this library.
While working on https://github.com/gallais/agdarky I've noticed that the parser for STLC I had did
not work on examples of the form \f.\a.\b. f a b
which would get parsed as
\f.\a.\b. f (a b)
rather than \f.\a.\b. (f a) b
.
I have pushed a fix on agdarky but haven't had the time to look back at the example
we have here (and in the agdarsec repo).
The idea behind the fix amounts to:
As far I see, the DecEq
constraint on tokens is only used in Combinators.exact
, where it is immediately converted to a boolean. Seems like a weaker Eq
constraint could be used here, sparing the user of writing things like counterproofs. Is this an artifact of an Agda port, or did you intend to have this constraint?
This currently seems to be the biggest feature missing from TParsec. Do you have any suggestions on how to implement this better? Maybe by changing the Success
structure to be a variant type Result = Error | Success
?
Hi, I stumbled upon the following problem when compiling (@clayrat):
erik@munin:~/dev/3rdparty/idris-tparsec (master) $ idris --build TParsec.ipkg
Entering directory `./src'
Type checking ./Relation/Indexed.idr
Type checking ./Relation/Subset.idr
Type checking ./Induction/Nat.idr
Type checking ./Data/SizedDict.idr
Type checking ./Data/Inspect.idr
Type checking ./Data/DList.idr
Type checking ./Data/NEList.idr
Type checking ./TParsec/Result.idr
Type checking ./TParsec/Success.idr
Type checking ./TParsec/Types.idr
./TParsec/Types.idr:109:16-56:
|
109 | recordChar c = MkTPT $ ignore (modify (mapFst (next c)))
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When checking right hand side of recordChar with expected type
TParsecT e a m ()
When checking argument runTPT to constructor TParsec.Types.MkTPT:
No such variable ignore
Fresh list is data structure defined by induction recursion for element types with a decidable equality.
Here's a PR to the Agda standard library with an implementation of fresh lists together with documentation & an updated version of the lexer example using fresh lists: agda/agda-stdlib#878
Please add a .ipkg file for ease of installation an building. It would make using your library more convenient.
I can't find good names for these in Idris: the m
is already taken by the Maybe
-tainted
combinators.
At the moment decimalDigit
in TParsec.Combinators.Numbers
returns a Nat
when it should return a Nat
together with the proof it is less than 10
.
I don't think it's worth using Fin
here as we do want natural numbers in the
other parsers using decimalDigit
.
The one function in TParsec.Running
is only ever useful to write down
examples but not to actually use it. It'd be nice to have specialised runners
for input String
vs. Text
and output Maybe a
or List a
.
I am not sure how much of the time spent parsing examples is actually spent
computing the whole Parser mn p a n
for the concrete size n
of the input
string. If this is indeed happening then we ought to be able to save a lot of
time by adding Lazy
annotations for the second arguments of and
, alt
, etc.
It's so ubiquitous it probably makes sense to have an implementation out of the box.
There's this new package manager for Idris: https://github.com/elba/elba/, might be useful to have TParsec indexed by it?
Seeing how #17 fixes a lot of these, it would be nice to always keep the files clean.
Cf. https://github.com/agda/agda/tree/master/src/fix-agda-whitespace for how it's done in the agda repo.
While I was porting the JSON parser to Agda yesterday I think I have
found a bug in the string literal parser:
idris-tparsec/src/TParsec/Combinators/Chars.idr
Lines 154 to 156 in 75b2887
After having read a \
we will only accept a u
followed by 4 hexadecimal
digits aka. a unicode character. Shouldn't we also accept:
\
"
r
, n
, or t
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.