
hackett's Introduction

Hackett

Hackett is an attempt to implement a Haskell-like language with support for Racket’s macro system, built using the techniques described in the paper Type Systems as Macros. It is currently extremely work-in-progress.

Here are some of the features that Hackett supports right now:

  • Bidirectional type inference
  • Algebraic datatypes (ADTs)
  • Pattern matching
  • Exhaustiveness checking
  • Typeclasses (including multi-parameter typeclasses)
  • Higher-kinded types
  • Higher-rank polymorphism
  • Type-aware/type-directed macros
  • Laziness
  • Syntax for infix operators
  • Scoped type variables

Here are some of the features that still need to be implemented for a minimal release:

  • Orphan/overlapping instance detection/prevention
  • Strictness analysis
  • Kindchecking

And finally, here is a (non-exhaustive) collection of features I would like to eventually support:

  • Functional dependencies
  • Row types
  • GADTs
  • Type families

Due to the way Hackett is implemented, many things that are language features in Haskell can be derived concepts in Hackett. In fact, Hackett’s ADTs are not primitives; they are implemented as a library via the data and case macros in hackett/private/adt. Other things, like newtype deriving and generics, should be possible to implement as derived concepts as well.
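To illustrate, consuming such a library-defined ADT uses the same case-style pattern matching shown throughout the README (a small sketch in the README’s own syntax; from-maybe is a name invented here, not part of the library):

```
(data (Maybe a)
  Nothing
  (Just a))

; extract the contained value, falling back to a default
(defn from-maybe : (∀ [a] {a -> (Maybe a) -> a})
  [[v Nothing ] v]
  [[_ (Just x)] x])
```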

Here’s some sample Hackett code that demonstrates some of Hackett’s features:

#lang hackett

(data (Maybe a)
  Nothing
  (Just a))

(def x : Integer
  (let ([y 3]
        [z 7])
    {y + z}))

(class (Show a)
  [show : {a -> String}])

(instance (forall [a] (Show a) => (Show (Maybe a)))
  [show (λ* [[(Just x)] {"(Just " ++ (show x) ++ ")"}]
            [[Nothing ] "Nothing"])])

For a much more in-depth look at Hackett, see the documentation.

Trying Hackett

To reiterate: Hackett is extremely experimental right now. Things are not guaranteed to work correctly (or work at all), and things are likely to change dramatically. If you really want to install Hackett to play around with it, though, you can.

You will need to have Racket installed to use Hackett. Using raco, you can install Hackett as a package:

$ raco pkg install hackett

Now you can use Hackett by writing #lang hackett at the top of a file.

hackett's People

Contributors

alexknauth, bennn, david-christiansen, gelisam, iitalics, jgrosso, lexi-lambda, shamrock-frost, waldyrious


hackett's Issues

Detection/prevention of orphan instances

I am okay with permitting orphan instances. However, I think they should be an error by default, so users will have to explicitly opt-in to them. The methods outlined in this blog post are probably relevant to this.

Declaring an orphan instance can probably be done via keyword in an instance declaration, like this:

(instance #:orphan (Foo Bar) ...)

To avoid redundant annotations, it should be an error to use #:orphan when the instance isn’t actually an orphan instance. If this ever becomes a problem, we can probably weaken it to a warning.

Full-program doc examples?

The guide and reference describe Hackett's features in isolation at a REPL, but there aren't any examples of a full Hackett module in the docs. The example code in the README is the sort of thing I'm thinking of. A small (<= 10 lines) complete module might be especially helpful on the very front page of the docs so users can quickly get a sense of what Hackett code looks like.

Ambiguous types should be an error

The following typechecks even though it should not:

(def main : String
  (show (right "hello")))

The above code is ambiguous because it isn’t possible to determine what the first type parameter to Either should be.
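One way the ambiguity could be resolved by hand is with an explicit annotation using the : expression form (a sketch; it assumes the Either type, its right constructor, and a suitable Show instance are all in scope):

```
(def main : String
  (show (: (right "hello") (Either Unit String))))
```

With the annotation, the first type parameter is pinned to Unit and instance resolution can proceed unambiguously.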

Precedence for infix operators

Is it possible to implement Haskell-style declarations for infix operator precedence and associativity?

Example:

(infixr 4 +)

(1 + 2)

In this sense (infixr 4 +) is just a shorthand macro definition that gets globally applied to all expressions.

issue with polymorphic values?

I'm having difficulty getting a bit of Hackett code to typecheck; it seems it might be an issue related to instantiating mempty in the following code with the right type:

(defn foldMap : (∀ [A B] (Monoid B) => {{A -> B} -> (List A) -> B})
  [[_ Nil] mempty]
  [[f {x :: xs}] {(f x) ++ (foldMap f xs)}])
typechecker: type mismatch
  between: B1619
      and: (=> ((Monoid a21641^)) a21641^) in: ((f x) ++ (foldMap f xs))

I think this should correspond directly to this bit of Haskell code which type checks the way I would expect:

foldMap' _ [] = mempty
foldMap' f (x:xs) = f x <> foldMap' f xs
> :t foldMap'
foldMap' :: Monoid a => (t -> a) -> [t] -> a

This could be a bug in my code, Hackett, or maybe my understanding of Hackett's type system?

RankNTypes issue

Ran into an issue with the following (contrived) example:

(data Nop
  (nop (∀ [a] {a -> a})))
(defn ->nop : (∀ [a] {a -> Nop -> a})
  [[x (nop f)] (f x)])

The equivalent Haskell is:

data Nop = Nop (forall a. a -> a)
toNop :: a -> Nop -> a
toNop x (Nop f) = f x

Hackett complains with:

; a38218: skolem escaped its scope
;   in: a38218
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:113:4 simplify/elaborate
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:98:2 τs⇔/λ!
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:214:2 τ⇔/λ!
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:229:2 τ⇔!
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:238:2 τ⇐!
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:243:2
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:368:0
;  /home/milo/Git/hackett/hackett-lib/hackett/private/base.rkt:200:8 for-loop

Removing the outer quantification throws the same error:

(defn int-nop : {Nop -> Integer}
  [[(nop f)] (f 3)])

For some reason, Hackett really doesn't like using quantified types stored within data. The following does work:

(defn int-nop/fn : {(∀ [a] {a -> a}) -> Integer}
  [[f] (f 3)])

I'm going to guess this has something to do with unordered contexts, or maybe it's just a small bug in the implementation. I'll take a look when I have some time.

Leading upper case.

While going through the examples, I noticed that constructors do not start with an upper-case letter.

Could you give an explanation why?

Can type names also start with lowercase?

Multi-value case expressions

Currently, case can only match against one value at a time. It would be nice if there were a case* that could match upon any number of values at once, like this:

(case* [x y]
  [[(just a) (just b)] (just (tuple a b))]
  [[nothing  nothing]  nothing])

Since there can only ever be one expression on the right-hand side of a case* clause, we could also drop the additional grouping brackets. However, that might be somewhat hard to read. To alleviate that, we could introduce a literal to make the syntax easier to visually parse:

(case* [x y]
  [(just a) (just b) -> (just (tuple a b))]
  [nothing  nothing  -> nothing])

I personally like this syntax, but some people might find it un-lispy. We’d also have to decide if case should also use the literal or not.

regression: typeclasses are broken in the repl

Observed behaviour:

$ racket -I hackett
> (++ "foo" "bar")
; @%dictionary-placeholder: should never appear at runtime [,bt for context]
> (show 42)
; @%dictionary-placeholder: should never appear at runtime [,bt for context]

Previous, expected behaviour:

$ racket -I hackett
> (++ "foo" "bar")
: String
"foobar"
> (show 42)
: String
"42"

Expression type annotation vs. Declaration type annotation

Similar to how : declarations work in Typed Racket or how id :: type declarations work in Haskell, I want to be able to separate definitions from their type annotations like this:

(: f {Integer -> Integer})
(def f (λ [x] x))

I have already implemented most of this (same way as Stephen and I did this in Typed Rosette), but there are some design questions left over.

  1. The expression form of type annotation is already called :, and in places where both definitions and expressions are allowed, calling them both : would be ambiguous. The expression version and the declaration version should probably have different names. In Typed Racket the declaration version is : and the expression version is ann, but Hackett might need something different since : is already the expression version.

  2. My preliminary implementation of this doesn't include Scoped Type Variables. I have had trouble "communicating" the scope between the type declaration form and the definition form where the scope is needed. We're working on this now. (Edit: now solved.)

Basic documentation

Without it, I don't know how to use it.
Please use scribble/manual to add documentation to Hackett.

Rigid vars allowed to escape their scope

Rigid variables can escape their scope:

#lang hackett
(def weird
  (λ (x)
    (: (λ (y) (if True x y))
       (∀ (A) (-> A A)))))
;;;
> (#:type weird)
: (forall (A1618) {A9 -> A1618 -> A1618})

Weird scenarios can arise when rigid variables escape their scope. Here, A9 refers to an out-of-scope rigid variable. It will fail to unify with anything else, yet it is clearly not bound by the final forall.

The algorithm in Complete and Easy Bidirectional Typechecking prevents this by keeping a close eye on the scopes of rigid and solver variables.

Simple way to help?

I'm wondering, is there anything I can do to help with this project? It is rare that I am excited by something as much as I have been by this.

I don't know enough about Haskell or Racket to implement this myself, but there may be other things I can help with.

Built-in n-tuple type

Currently, Hackett provides a Tuple type defined as an ordinary ADT, which is a 2-tuple. It would be nice to transparently support arbitrarily large tuples, like Haskell does.
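In the meantime, larger tuples can be written as ordinary ADTs, just like the built-in 2-tuple (a sketch; Tuple3 and fst3 are hypothetical names invented here):

```
(data (Tuple3 a b c)
  (Tuple3 a b c))

; project out the first component
(defn fst3 : (∀ [a b c] {(Tuple3 a b c) -> a})
  [[(Tuple3 x _ _)] x])
```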

Multi-requirement instances issue???

When I try to Show a tuple, I get this error:

..\AppData\Roaming\Racket\6.10\pkgs\hackett-lib\hackett\private\base.rkt:305:11: Show162: arity mismatch;
the expected number of arguments does not match the given number
expected: 1
given: 2
arguments...:
#
#

So I expect that there's an issue with an instance being dependent on multiple other type classes.
It's not just tuples, either. I tried with my own multi-dependency typeclass, and it failed, giving a similar error (different mostly in that the data type was stored in a different file, I think). Could just be an issue with Show, but I somehow doubt it.

Case pattern expanders?

For the module language @iitalics and I are working on, we would like to define macros that can expand in case-pattern positions, similar to how match expanders let us define macros that can expand in match-pattern positions in racket/match.

The specific use case we have is a case-pattern-expander for the #%dot form, so that for example (case x [(M.Constructor1 ....) ....] [(M.Constructor2 ....) ....]) can expand to something like (case x [(internal-Constructor1 ....) ....] [(internal-Constructor2 ....) ....]) when M is a module that defines a datatype with those constructors.

GHCI-esque toplevel

Right now, doing IO in the Hackett REPL is kind of gross; as far as I know, your only option is to wrap your IO () actions in an unsafe-run-io! call. Having e.g. #:run and #:<- commands in the toplevel to run and bind out of IO values would be much nicer.
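For comparison, this is roughly the current workaround next to the shape of the proposal (a sketch; println is assumed to have type {String -> (IO Unit)}, and #:run, #:<-, and get-line are hypothetical forms that only illustrate the idea):

```
; today: explicitly unwrap the IO action
> (unsafe-run-io! (println "hello"))

; proposed: run an action directly...
> #:run (println "hello")
; ...or bind its result to a toplevel name
> #:<- line (get-line)
```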

Functor Instance Misbehaving

I've stared really hard at these few lines and can't think of any explanation for why the instance doesn't typecheck while tl-map does:

(data (TwoLists A)
      (TwoLists (List A) (List A)))

(defn tl-map : (∀ [A B] {{A -> B} -> (TwoLists A) -> (TwoLists B)})
  [[f (TwoLists l1 l2)]
   (TwoLists (map f l1) (map f l2))])

(instance (∀ [A] (Functor (TwoLists A)))
  [map tl-map])

typechecker: type mismatch
  between: (TwoLists A22)
      and: TwoLists in: tl-map
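The error message hints at a kind mismatch in the instance head: Functor expects an unapplied type constructor, but (∀ [A] (Functor (TwoLists A))) applies TwoLists to A first. Presumably the instance should be declared over the bare constructor instead (a sketch, not verified against this version of Hackett):

```
(instance (Functor TwoLists)
  [map tl-map])
```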

Proposal: a macro DSL based on bidirectional type checking

Hackett's goal has always been for "types and macros to be able to cooperate and communicate with each other". Today, however, Hackett macros are written using syntax-parse's DSL, which makes it easy to write a macro which alters its output depending on the syntactic categories of its inputs, not their types. This proposal introduces a monadic DSL which makes it easy for the macro to depend on the type of its inputs, without unnecessarily forcing the type-checker to commit to a type for all of the inputs before expanding the macro.

Design goals

  1. It should be possible for a macro to ask the type-checker for the inferred type of an input and/or its output.
  2. It should be possible for a macro to tell the type-checker about the expected type of an input and/or its output.
  3. The implementation should not use backtracking.

Unification API

When implementing type inference using unification, the important primitive operations are:

  1. newSVar : (Solver SVar): create a new solver variable about which nothing is known yet.
  2. unify : {TypeExp -> TypeExp -> (Solver Unit)}: unify two type expressions, each of which may contain some solver variables.
  3. readSVar : {SVar -> (Solver TypeExp)}: look up a solver variable to obtain a type expression which may contain some solver variables; in the worst case, the entire expression may itself be a solver variable, in which case we learn nothing.

The core idea of this proposal's DSL is to expose those primitives to the macro implementation, so that it can use readSVar to inspect types, and unify to tell the type-checker about expected types. However, exposing readSVar directly (or equivalently, allowing the macro to call the type checker on a term) is problematic, because how much information we get from readSVar depends on which other expressions the type checker has already visited. Moreover, what should the macro implementation do in the worst case in which readSVar doesn't reveal any new information?

Stuck macros

Here's how I propose to solve both problems at once: instead of giving back an entire type expression containing everything we currently know about the type variable, readSVar should always return a type expression in which only the outermost type constructor is known, and all its arguments (if any) are solver variables. Let's call that variant readSVar'. If the outermost type constructor isn't yet known, readSVar' cannot return yet, so the macro expansion remains stuck until more information is obtained, by visiting and type checking other expressions (which can involve expanding other macros, which might also get stuck). If some macros are still stuck after everything is visited, then type checking fails with an "ambiguous type" error for that solver variable.

This way, the order in which the type checker visits the expressions doesn't affect the output of readSVar': either the outermost type constructor is uniquely determined, in which case the macro will receive that value (perhaps after being stuck and resumed), or two different unification calls give conflicting information for that type constructor, in which case type checking is going to fail with a type error anyway. This is similar to the way in which concurrent programs which only communicate via IVars are guaranteed to give a deterministic answer even though the order in which the threads are scheduled isn't deterministic.

Pattern matching

Repeatedly calling readSVar' in order to expand the outermost type constructor, then second outermost type constructor, etc., can be quite tedious. Let's provide a nicer API based on pattern-matching: type-pattern matching.

In the same way that ordinary pattern-matching expands to a tree of nested case statements, type-pattern matching can be desugared to a tree in which each node calls readSVar' and then examines the resulting type constructor using a case statement. The subtle relationship between the order of the type-patterns and the order in which the readSVar' calls are made (which determine the behaviour of the type checker) is the same as the relationship between the order of ordinary patterns and the order in which inputs are forced by a pure function.

One important difference between ordinary pattern-matching and type-pattern matching is the case in which a case statement with a single alternative is generated: for ordinary pattern-matching, the input is forced and then an exception is raised if the result doesn't match the pattern. For type-pattern matching, we could similarly call readSVar' and then fail with a type error if the outermost constructor doesn't match the pattern. But we can do better: we can unify the input (a solver variable) with the pattern!

Unifying SVars in context

The macro implementation should definitely receive an SVar representing the type of its output, so it can give the type checker some information (via unify) about the type of the expression it plans to produce before the macro is done producing that expression. In particular, providing this information before the macro becomes stuck might help type inference succeed more often.

Which other SVars should we provide to the macro implementation? When type-checking an expression such as (f x), we create one solver variable for f, one solver variable for x, and one solver variable for (f x). So when expanding the macro application (my-macro (f x)), we might naively think of providing the macro implementation with one solver variable for f, one solver variable for x, one solver variable for (f x), and one solver variable for (my-macro (f x)). That would be a mistake, because the macro is free to rearrange and duplicate those components arbitrarily, and it's this rearranged term which needs one solver variable per sub-term, not the input term.

Another difficulty is that an API in which macros can ask about the type of arbitrary expressions encourages a counter-productive style in which the macro asks a lot of questions about expressions without providing much information in return, thereby accidentally hurting type inference.

My proposed solution to both problems is an alternate API in which the only way to ask about the type of a target expression is to put that expression in a bigger expression which the macro implementation promises has the same type as the expression the macro will output. This is done via a preliminary-output : {Syntax -> (Solver Unit)} command, where the syntax object may contain type annotations whose type expressions contain SVars. The command adds the syntax object to the type checker's list of expressions to visit, so that when readSVar' causes control to return to the type checker, there is some hope of inferring information about the SVars in those annotations.

Some macros may still wish to delay choosing their output type until they have learned the type of a few target expressions. This can still be done using this API, by wrapping their target expression in unsafe-coerce! or undefined! to ensure that the output type remains unconstrained. However, due to the stigma around those partial functions, macro implementers will be encouraged to do that only when it is really what they want, so in the common case the macro implementer will write an example output containing the target expression, thereby helping type inference.

Summary

A macro is implemented as a function of type {Syntax -> (Solver Syntax)}. The usual syntax-parse mechanism can be used to determine what to do with the input syntax object. If at some point the macro wants to get some information from the type checker, it must first create a solver variable, then it must use that solver variable in a type annotation inside an example output expression, then it must pattern-match on the solver variable in order to force type inference to find some information about the solver variable, or fail trying. Each time an example output is given, its type is unified with the type of the macro's call site (and thus with all the previous example outputs as well), thereby giving some important information for type inference. Eventually, the macro returns its final output, whose type is again unified with the earlier preliminary outputs.

type safe variadic functions

Maybe it would be nice for functions with types (or kinds) of the shape a -> a -> a to become variadic automatically?
For example, (:: 1 2 3 4 nil) would be a list, (>>= applyfirst applysecond applythird) would chain monadic computations, and (-> (Maybe a) a a) would be a function of type {(Maybe a) -> a -> a}. I was reading Hackett's guide and wondered: is there a reason why it isn't so?

Let and letrec expressions

Currently, Hackett does not actually support any local binding constructs. This is obviously bad, so it should probably support both let and letrec.

Add CONTRIBUTING information

This repository should include a CONTRIBUTING file that outlines the general contribution process as well as references the code of conduct.

Make parentheses optional.

I have not much experience with Lispy languages, but there was always something that bugged me about its syntax: Why do I need to write parentheses when they can be inferred?

Example:

(def foo (a b)
	(doSomething a b))

The indentation makes it obvious that doSomething a b belongs to (def foo .... Let's take a look at what happens when we hide all parentheses that can be inferred at compile time.

def foo (a b)
	doSomething a b

That is much more pleasant to look at.

In other words, why not make the syntax indentation sensitive? Can it be done in racket?

There is an interesting project Oczor that does exactly that. Except it uses records instead of lists by default.

Instance chaining seems to fail

The following program fails to typecheck:

#lang hackett

(class (Base a)
  [base : {a -> a -> a -> a}])

(class (Middle a)
  [middle : {a -> a -> a}])

(class (Top a)
  [top : {a -> a}])

(instance (∀ [a] (Middle a) => (Base a))
          [base (λ [_] middle)])

(instance (∀ [a] (Top a) => (Middle a))
          [middle (λ [_] top)])

The type checker throws the error typechecker: could not deduce (Top a18) in: (case* (_27) ((_) middle)) with (λ [_] middle) as the error's source location. I think this is supposed to work, since this equivalent Haskell program typechecks successfully:

class Base a where
  base :: a -> a -> a -> a

class Middle a where
  middle :: a -> a -> a

class Top a where
  top :: a -> a

instance Middle a => Base a where
  base _ = middle

instance Top a => Middle a where
  middle _ = top

Strict Evaluation

Hello,

Thanks for sharing Hackett. It seems to bring out the best of haskell and scheme.

Just want to check if there is any way to turn off lazy evaluation. I noticed that lazy evaluation is more of a hindrance after the prototype stage (from my Haskell experience).

If I want to patch Hackett to use strict evaluation for my purposes, is that a big effort?

Thanks again

Feature: implied parentheses

Since this is the 21st century, we probably can have many (most?) parentheses implied, in a way that is not dissimilar to how Haskell does implicit { ... } blocks (which people mostly don't use, for a couple of reasons that are not directly relevant here, since the tradeoffs are different; Haskell has no macros).

To be precise: the idea is to allow BOTH options (as in Haskell), since:

  1. We still need to allow the fully-parenthesised form to be emitted in macros, where you really really want to be specific about the parentheses.
  • The long, multi-line expression cases still stand.

To consider the difference of presentation -- using the example from the README:

#lang hackett

data Maybe a
  nothing
  just a

def x : Int
  let y 3
      z 7
    y + z

class Show a
  show : Int -> a -> String

instance forall [a] (Show a) => Show (Maybe a)
  def+ show
    nothing -> "nothing"
    just x  -> "just " <> show x

..versus:

#lang hackett

(data (Maybe a)
  nothing
  (just a))

(def x : Int
  (let ([y 3]
        [z 7])
    {y + z}))

(class (Show a)
  [show : {Int -> a -> String}])

(instance (forall [a] (Show a) => (Show (Maybe a)))
  (def+ show
    [nothing  -> "nothing"]
    [(just x) -> {"just " <> (show x)}]))

Scoped type variables

Hackett should support lexically scoped type variables, a la GHC’s ScopedTypeVariables extension. This should be relatively easy to support, due to the way type variables are effectively already lexically scoped. One thing that will probably be weird is that type variables and ordinary variables currently share a single namespace, so they could shadow ordinary variables.

A simple implementation is probably reasonable to start, and we can see if sharing the same namespace causes any problems.
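For illustration, scoped type variables would let an annotation inside a definition's body refer to a variable bound by the outer ∀ (a sketch using the : expression-annotation form; my-id is a name invented here):

```
(defn my-id : (∀ [a] {a -> a})
  ; with scoped type variables, this inner `a` refers to the outer ∀-bound `a`
  [[x] (: x a)])
```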

What is the type signature of `apply`?

Scheme (and possibly Racket) has functions called apply and list. Will these functions also be available in Hackett? What about their type signatures?

Note:

Even though apply takes a list as an argument in Scheme, it would make much more sense for it to take a tuple as an argument. There should also be a tuple variant of list. Also, foo and apply foo . tuple should be equivalent.
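A fully variadic apply is hard to give a Hackett type, but an uncurrying, tuple-taking variant can be typed directly (a sketch; apply2 is a hypothetical name, and Tuple is the built-in 2-tuple constructor):

```
(defn apply2 : (∀ [a b c] {{a -> b -> c} -> (Tuple a b) -> c})
  [[f (Tuple x y)] (f x y)])
```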

Slow typechecking due to linear search in type context

Sam Caldwell built a ~1000 line program in Hackett that takes 25 seconds to expand. I spent some time profiling the expansion process and thought I'd share the results, though I'm not familiar enough with Hackett's implementation to suggest a fix.

Code: https://gist.github.com/michaelballantyne/64c18ec55abd7f1acdc4a41655239b54
Profile: https://gist.github.com/michaelballantyne/05488da82d07c5db9d03866b3c024334

It looks like most of the time (~70%) is spent on this line: https://github.com/lexi-lambda/hackett/blob/master/hackett-lib/hackett/private/typecheck.rkt#L222
when called by apply-subst.

The size of the type context for this program reaches about 2500 entries, so the linear search by findf gets pricey, especially with free-identifier=? involved in the predicate. Perhaps the context could use a free identifier table instead?
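For reference, Racket's syntax/id-table library provides exactly this kind of table, keyed by free-identifier=?; a minimal sketch of the idea (not the actual typecheck.rkt code):

```racket
#lang racket
(require syntax/id-table)

(define ctx (make-free-id-table))      ; mutable table keyed by free-identifier=?
(free-id-table-set! ctx #'x 'Integer)  ; record an entry for identifier x
(free-id-table-ref ctx #'x #f)         ; lookup avoids a linear findf scan
```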

I also noticed that τ<:/full! applies apply-current-subst at every type it examines, and that it will then reapply the same substitution when it recurs, resulting in a quadratic number of apply-subst operations being applied to leaves.

Racket/Hackett FFI + contract generation

Like Typed Racket, Hackett should support safe interop with Racket, with boundaries protected with contracts. The interoperation story is more complicated with Hackett because it’s semantically more distant from Racket than Typed Racket is. For example:

  • Hackett is a lazy language, so values need to be forced when crossing from Hackett to Racket. In fact, they need to be recursively forced, so values that cross the boundary probably need to, at a bare minimum, implement a deepseq-style NFData typeclass.

  • Hackett functions are curried, which is unpleasant and unidiomatic to use from Racket.

  • Hackett has typeclasses, and typeclass resolution often can’t be done based on runtime values (for example, it can be based on the return type or even a phantom type with no runtime evidence), so it’s probably not possible to call typeclass-constrained functions from Racket.

  • Hackett is pure, so Racket code could potentially break invariants that Hackett code assumes. This is probably unavoidable, though, and Hackett users can technically use unsafe-run-io! themselves if they want, anyway.

It would be possible to create To-Racket and From-Racket typeclasses, or something like that, which have type signatures that include an entirely opaque Racket datatype, with some built-in instances for primitive datatypes. This way, conversions could only be defined in terms of the built-in conversions, and the typeclasses could be used to guide contract generation. The obvious downside of this is that crossing a boundary would be enormously expensive, so maybe there’s a way to do something similar using a typed representation of contracts (type-indexed contracts?) that would be more efficient.
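A sketch of what those classes might look like (To-Racket, From-Racket, and the opaque Racket-Value type are all hypothetical names):

```
; an entirely opaque datatype standing for arbitrary Racket values;
; values of this type would be created only by the runtime
(data Racket-Value)

(class (To-Racket a)
  [to-racket : {a -> Racket-Value}])

(class (From-Racket a)
  [from-racket : {Racket-Value -> (Maybe a)}])
```

Returning (Maybe a) from from-racket reflects that not every Racket value can be decoded at a given type.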

Proposal: macro expressions instead of macro bindings

Acknowledgement

This proposal is inspired by a presentation on Typer I recently saw at McGill University's 2017 Programming Languages Workshop. There doesn't seem to be any information on the internet about this academic programming language yet. Typer's goals seem very similar to Hackett's, as it also wants to interleave macro-expansion and type-checking. The main difference is that Typer is a dependently-typed programming language, which aims to combine Lisp+OCaml+Coq instead of Haskell+Racket.

Proposal

Hackett macros currently have to be defined at the top-level, using the special form define-syntax-parser:

(define-syntax-parser list
  [(list) #'Nil]
  [(list x xs ...) #'{x :: (list xs ...)}])

After that definition, the identifier list is known to be a macro, so later occurrences of (list ...) are macro-expanded instead of being treated as a function application:

; {1 :: 2 :: 3 :: Nil}
(list 1 2 3)

The proposal is: instead of a special form, let's have a special type. This type is an otherwise ordinary wrapper around a function which manipulates syntax objects:

(data Macro
  (DefMacro {Syntax -> Syntax}))

The type needs to be treated specially: when type-inference detects that an expression of type Macro is applied to an argument, it knows that this is not a function call which will be evaluated at runtime, but a macro which must be expanded at compile-time. For example:

; {1 :: 2 :: 3 :: Nil}
((DefMacro (syntax-parser
             [(list) #'Nil]
             [(list x xs ...) #'{x :: (list xs ...)}]))
 1 2 3)

Such expressions may refer to locally-bound variables:

; {1 :: 2 :: 3 :: Nil}
(let ([list (DefMacro (syntax-parser
                        [(list) #'Nil]
                        [(list x xs ...) #'{x :: (list xs ...)}]))])
  (list 1 2 3))

In that case, the expression to which the locally-bound variable refers must be evaluated at compile-time. If that expression refers to other variables, then those must be evaluated as well. Obviously, this is only possible for constant expressions, for which all the information is available at compile time; if the expression depends on a lambda-bound variable (such as one bound by the do macro), this results in a compile-time error. Typer calls it an "elaboration error".

; elaboration error
(defn one-two-three : {Macro -> (List Integer)}
  [[list] (list 1 2 3)])

Type inference

Worryingly, since the application (f x) could either be a macro application or a function application, we can no longer deduce that there are some types a and b for which f has type {a -> b} and x has type a. We can't even deduce that x has a type, because it could be some keyword recognized by the macro! That sounds pretty bad.

Fortunately, since the only valid macro expressions are constant expressions, intuitively we should always be able to determine from the surrounding context whether a given expression is a macro. More formally, the bidirectional type-checking algorithm first infers the type of f, and if that type has the form {a -> b}, it checks that x has type a. Changing the algorithm to skip checking x when f turns out to have type Macro therefore doesn't affect type inference at all.
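Concretely, the modified application rule could be sketched as follows (pseudocode; the names infer, check, and expand are illustrative, not Hackett's actual implementation):

```
infer (f x):
  t_f ← infer f
  case t_f of
    Macro      → evaluate f at compile-time (it must be a constant expression),
                 expand the resulting macro on the syntax (f x),
                 then infer the type of the expansion
    {a -> b}   → check (x : a); the application has type b
    otherwise  → type error: f is neither a function nor a Macro
```

Note that in the Macro case the argument x is never assigned a type of its own, which is exactly what we want: it may be a keyword or other raw syntax meaningful only to the macro.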

Motivating examples

The example given at the top was chosen for its simplicity, but it didn't really demonstrate the advantages of the proposal over define-syntax-parser. Here are some better examples.

Local macros

One advantage is that macros can now be defined in a small local scope. For small syntactic improvements which aren't reusable outside of a narrow scope, it makes more sense to define macros locally than at the top level. For example, in the cross-product implementation below, there is a pattern that is repeated three times with small variations:

(data Vec
      (Vec Integer Integer Integer))

(defn cross-product : {Vec -> Vec -> Vec}
  [[(Vec x1 y1 z1) (Vec x2 y2 z2)]
   (Vec {{y1 * z2} - {z1 * y2}}
        {{z1 * x2} - {x1 * z2}}
        {{x1 * y2} - {y1 * x2}})])

That particular pattern isn't likely to be repeated anywhere outside of that function, so it would be nice to define a local macro eliminating the duplication:

(defn cross-product : {Vec -> Vec -> Vec}
  [[(Vec x1 y1 z1) (Vec x2 y2 z2)]
   (let [[submul (DefMacro
                  (syntax-parser
                    [(submul u:id v:id)
                     #`(- (* #,(suffix-id #'u "1") #,(suffix-id #'v "2"))
                          (* #,(suffix-id #'v "1") #,(suffix-id #'u "2")))]))]]
     (Vec (submul y z)
          (submul z x)
          (submul x y)))])

Of course, the fact that we can't currently define local macros in Hackett isn't very limiting, as we can simply define a top-level macro whose name indicates that it is only intended for use in one place:

(define-syntax-parser cross-product-helper
  [(_ u:id v:id)
   #`(- (* #,(suffix-id #'u "1") #,(suffix-id #'v "2"))
        (* #,(suffix-id #'v "1") #,(suffix-id #'u "2")))])

(defn cross-product : {Vec -> Vec -> Vec}
  [[(Vec x1 y1 z1) (Vec x2 y2 z2)]
   (Vec (cross-product-helper y z)
        (cross-product-helper z x)
        (cross-product-helper x y))])

This example also demonstrates one way to implement DefMacro: by rewriting the program into a form in which all the macros are at the top-level.

Macro combinator libraries

Libraries like turnstile and syntax/parse demonstrate that it is possible to define "meta-macros", that is, macros which define other macros. The idea is to define a DSL for describing macros, and to write a meta-macro which interprets this DSL. Well, in Haskell, our DSLs are combinator libraries, made out of precise types and reusable abstractions like Applicative and Monad, so if we're going to have a DSL for describing macros, it would be nice if that DSL could be a combinator library. Now that a macro is a normal value, of type Macro, which can be constructed using normal expressions, we can do precisely that.

; The intention is to construct a Syntax object out of two
; Syntax variables, e.g. {{x1 * y2} - {y1 * x2}} from x and y
(data (Template a)
      (Template {Syntax -> Syntax -> a}))

(defn runTemplate : {(Template Syntax) -> Macro}
  [[(Template f)]
   (DefMacro (syntax-parser
               [(_ u v) (f #'u #'v)]))])

(instance (Functor Template) ...)
(instance (Applicative Template) ...)  ; for liftA2 below

; ((runTemplate (tU "1")) u v) ;=> #'u1
(defn tU : {String -> (Template Syntax)}
  [[suffix]
   (Template (λ [u v] (suffix-id u suffix)))])

; ((runTemplate (tV "2")) u v) ;=> #'v2
(defn tV : {String -> (Template Syntax)}
  [[suffix]
   (Template (λ [u v] (suffix-id v suffix)))])

; ((runTemplate (op #'+ (tU "1") (tV "2"))) u v) ;=> #'{u1 + v2}
(defn op : {Syntax -> (Template Syntax) -> (Template Syntax) -> (Template Syntax)}
  [[o] (liftA2 (λ [x y] #`{#,x #,o #,y}))])

Now that we have a library for creating macros from a Template, we can rewrite our previous example in a much more readable way, by defining a local Template and using runTemplate to turn it into a local macro:

(defn cross-product : {Vec -> Vec -> Vec}
  [[(Vec x1 y1 z1) (Vec x2 y2 z2)]
   (let [[t- (op #'-)]
         [t* (op #'*)]
         [t (t- {(tU "1") t* (tV "2")}
                {(tV "1") t* (tU "2")})]
         [submul (runTemplate t)]]
     (Vec (submul y z)
          (submul z x)
          (submul x y)))])

This time, the version where all the macros are at the top-level has to run some Hackett code at compile-time, and has to be careful to define the macros in the right order and in the right phase, so having the compiler do that for us is a lot more valuable.

Global dictionaries may sometimes trigger use-before-initialization errors

Sometimes, dictionaries may be inserted in such a way that they get evaluated before they have been initialized. One common case that triggers this error is defining <*> as ap, which will attempt to insert the relevant Monad dictionary. However, Monad instances are usually written below Applicative instances in a module, so the dictionary ends up being referred to before it is initialized, triggering a runtime error. A workaround is to eta-expand the use, as is done here, for example, but this is obviously not an acceptable solution in general.
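As a sketch of the failure mode, consider a hypothetical Identity type (the type and instance bodies here are illustrative assumptions, not code from an actual report):

```
(data (Identity a) (Identity a))

(instance (Functor Identity)
  [map (λ* [[f (Identity x)] (Identity (f x))])])

(instance (Applicative Identity)
  [pure Identity]
  ; ap elaborates to a reference to the (Monad Identity) dictionary,
  ; which is defined below and therefore not yet initialized when this
  ; dictionary is evaluated:
  [<*> ap])

(instance (Monad Identity)
  [join (λ* [[(Identity x)] x])])
```

The eta-expanded workaround replaces [<*> ap] with [<*> (λ [f x] (ap f x))], so the Monad dictionary is only dereferenced when <*> is actually applied, by which point both dictionaries have been initialized.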

There are two ways to solve this problem. One way would be to somehow make the dictionary reference lazy, but this might be somewhat difficult to do transparently without adding unnecessary complexity and possible runtime overhead. A nicer way would be to lift definitions for global dictionaries so that they all get initialized before they are referenced.
