Giter VIP home page Giter VIP logo

ometa-racket's Introduction

Install | Play | Test | References

Warning: This implementation is experimental and incomplete. Everything is subject to change without warning.

ometa-racket


is an experimental implementation of OMeta with Racket as a host. OMeta is a beautiful instrument for experimenting with designs for programming languages - a convenient way to implement lexers, parsers, tree-transformers and extensions thereof.

From original 2007 paper by Alessandro Warth and Ian Piumarta:

OMeta, a new object-oriented language for pattern matching. OMeta is based on a variant of Parsing Expression Grammars (PEGs) - a recognition-based foundation for describing syntax - which we have extended to handle arbitrary kinds of data.

This here is a recursive interpreter that closely follows operational semantics described in Alessandro Warth's dissertation. It dispatches on tagged lists that represent core OMeta grammar - a deliberately simpleminded approach - think exploratory programming to help one grok and intern ideas. Toy or not you'll find most features you'd expect having read original papers:

  • matching on strings
  • matching on structured data (lists)
  • semantic actions and predicates
  • parameterized rules
  • inheritance
  • foreign rule invocation
  • directly left-recursive rules (with nesting)
  • higher-order apply
  • pattern-matching on rule parameters
  • indirectly left-recursive rules

Install


given the experimental nature of this repo it is unlikely to make appearance in Racket packages or collections. Simply clone it and require ometa.rkt from your source files. Assuming your ometa-racket files and your source file are in the same directory:

#lang racket
(require "ometa.rkt")

Play


Syntax

ometa | define-ometa | omatch | define-ometa-namespace

Technically ometa-racket has enough meat to write a lexer+parser combo for OMeta proper, which uses C-like syntax. I'm working on it. Until then we'll stick with s-expressions. Some of you may even notice just how much it resembles Olin Shivers' SRE regular-expression notation. Such is the power of truly beautiful ideas - we keep reinventing them. Current syntax is verbose because I'm only using a low-level core language.

Core grammar for OMeta parsing expressions (e):

e ==   (empty)
       (atom a)
       (apply A)
       (seq e1 e2)
       (alt e1 e2)
       (many e)
       (~ e)
       (bind x e)
       (-> t)
       (list e)

Extended grammar:

(apply A)   => (apply A arg ...) ; where non-terminal A == (^ RULE [PARENT])
(seq e1 e2) => (seq* e ...)      ; sequence of arbitrary length
(alt e1 e2) => (alt* e ...)      ; arbitrary number of alternatives
(many e)    => (many+ e)         ; one or more
ometa

OMeta code is a first-class value, which you can bind to variables, pass around and close over in data:

(define test-program
        (ometa
          (rule-name e)
          ...))
define-ometa

Or use a shorthand notation:

(define-ometa test-program
              (rule-name e)
              ...)
omatch

Match against a stream. Stream can be a string or a list. For now these are not lazy sequences and will be fully consumed before matching. Namespace argument is optional:

(omatch test-program
        rule-name
        stream
        namespace)
define-ometa-namespace

Extend OMeta namespace if you want to refer to your top-level bindings. Don't forget to pass it to omatch. You definitely want this when inheriting from other parsers or invoking foreign rules:

(define-ometa-namespace ns)

Examples

std | integers | words and decimals | flatten a list | scanners |

std

Without a standard library we'd have to tediously copy/paste the most basic rules to every OMeta project. Let's not do that! Why else have inheritance and access to foreign rules:

#lang racket
(require "ometa.rkt")

;; a reflective hook so that you have access
;; to your top-level bindings from ometa code
(define-ometa-namespace ns)

(define-ometa std
  (char (seq* (bind c (apply anything))
              (->? (char? c))
              (-> c)))
  ;; why bake character-classes into a language when
  ;; its so expressive that adding them is trivial
  (char-range x y
              (seq* (bind c (apply anything))
                    (->? (and (char? c)
                              (char<=? x c y)))
                    (-> c)))
  (letter (alt* (apply char-range #\a #\z)
                (apply char-range #\A #\Z)))
  (digit (apply char-range #\0 #\9))
  (number (many+ (apply digit)))
  (spaces (many+ (atom #\space))))
integers

Let's write a left-recursive rule for parsing integers:

(define char->number (compose string->number list->string list))

(define-ometa integers (<< std)                         ;inherit from std
  (int (alt* (seq* (bind n (apply int))
                   (bind d (apply (^  digit)))          ;invoke a `digit' parent rule
                   (-> (+ (* n 10) (char->number d))))
             (seq* (bind d (apply (^  digit)))
                   (-> (char->number d))))))

(car (omatch integers int "567" ns)) ;;=> 567
words and decimals

Let's extend rules in std. We should be able to parse decimals and identifiers with _:

(define-ometa token (<< std)
  (letter (alt* (atom #\_)              ; accept underscore
                (apply (^ letter))))    ; invoke parent rule
  (id (many+ (apply letter)))
  (number (alt* (seq* (bind pre (apply (^ number)))
                      (atom #\.)
                      (bind post (apply (^ number)))
                      (-> `(,@pre #\. ,@post)))
                (apply (^ number)))))

(car (omatch token id "hello_Id" ns))   ;;=> "hello_Id"
(car (omatch token number "57.877" ns)) ;;=> "57.877"
flatten a list

Let's pattern-match on structured data:

(define input '(1 (2 (3 4)) (((5)) 6)))

(define-ometa flat
  (flatten (seq*
            (list (bind xs (apply inside)))
            (-> xs)))
  (inside  (alt*
            (seq* (list (bind xs (apply inside)))
                  (bind ys (apply inside))
                  (-> (append xs ys)))
            (seq* (bind x (apply anything))
                  (bind xs (apply inside))
                  (-> (cons x xs)))
            (seq* (apply end)
                  (-> null))))
  (end  (seq*
         (~ (apply anything))
         (-> null))))

(car (omatch flat flatten input)) ;;=> '(1 2 3 4 5 6)
scanners

Let's go mental and combine scannerless and scannerful parsing to parse assignments:

(define-ometa toks (<< std)
  (eq  (seq* (atom #\=)
             (-> (make-immutable-hash `((kind . =) (value . "="))))))
  (num (seq* (bind n (apply (^ number)))
             (-> (make-immutable-hash `((kind . num) (value . ,(list->string n)))))))
  ;; not just rules inherited from `std'
  ;; let's invoke one from `token'
  (id  (seq* (bind ls (apply (^ id token)))
             (-> (make-immutable-hash `((kind . id) (value . ,(list->string ls)))))))
  (scanner (seq* (apply (^ spaces))
                 (alt* (apply eq)
                       (apply num)
                       (apply id)))))

(define-ometa assignments (<< toks)
  (token k (seq* (bind t (apply (^ scanner)))
                 (->? (equal? (hash-ref t 'kind #f) k))
                 (-> (hash-ref t 'value))))
  (assign (seq* (bind a (apply token 'id))
                (bind b (apply token '=))
                (bind c (apply token 'num))
                (-> (string-append a b c)))))

(car (omatch assignments assign " my_var    = 56" ns)) ;;=> "myvar=56"

Test


>$ racket ometa-tests.rkt

will run rackunit test-suites.

References


ometa-racket's People

Contributors

lkuper avatar vkz avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.