epfldata / squid

196.0 17.0 14.0 4.43 MB

Squid – type-safe metaprogramming and compilation framework for Scala

Home Page: http://epfldata.github.io/squid/

License: Apache License 2.0

Scala 99.93% Shell 0.03% HTML 0.04%

scala metaprogramming type-safety staging optimization

squid's People

fschueler aalexandrov gitter-badger semanticbeeng barambani dispalt dennisvdb mindis akshay-sri123 hamidreza-ramezani gbasler zeta1999 zilutian lptk

squid's Issues

Move away from the current `rewrite` macro implementation

The rewrite macros comprise top-level rewrite calls from the RuleBasedTransformer class, as well as term-level t.rewrite calls, which desugar into the same.

These are currently expanded into calls to RuleBasedTransformer#registerRule – conceptually, an invocation such as rewrite { case code"t" if c => rhs } expands into:

registerRule(
  expression representation of code pattern t, 
  extractArtifact => {
    // ... extract the different trees obtained from extractArtifact
    // ... and pass them through the sub-patterns
    //       (for example, see `SubPattern` in `case code"${SubPattern(x,y,z)}:Int" => ...`)
    if (the guard/condition c of the pattern matching case is respected)
      Some(adapted version of rhs)
    else None
  }
)

This is generated after some crazy macro hackery during which untypecheck is used to get around owner chain corruptions, which incurs limitations on what can go inside a rewrite rule right-hand side. It incurs problems with path-dependent types, some of which are fixed in a very ad-hoc way.

A saner approach would likely be to perform the main code transformation before type checking (using a macro annotation), since the translation is mostly syntactic anyway. But we would have to split from that code the functionality that inspects the types of the pattern and right-hand side in order to refine the type of the rewritten term and to check that the transformation is type-preserving. Indeed, that functionality needs to be expressed in a macro def, as it requires type checking information – but the good news is that it does not change the shape of the tree and should not result in owner chain problems.
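
To make the shape of that expansion concrete, here is a minimal sketch in plain Scala. It is not Squid's implementation: `Expr`, `Transformer`, and the signature of `registerRule` are hypothetical stand-ins, and the rule is written directly as a function returning `Some(rhs)` when the pattern and guard succeed and `None` otherwise.

```scala
object RewriteSketch {
  // A tiny expression language standing in for Squid's code representation (hypothetical).
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(lhs: Expr, rhs: Expr) extends Expr

  // Hypothetical stand-in for RuleBasedTransformer: each rule returns Some(result) or None.
  final class Transformer {
    private var rules: List[Expr => Option[Expr]] = Nil

    def registerRule(rule: Expr => Option[Expr]): Unit =
      rules = rules :+ rule

    // Rewrites sub-terms first, then tries each registered rule at the root.
    def transform(e: Expr): Expr = {
      val rebuilt = e match {
        case Add(l, r) => Add(transform(l), transform(r))
        case other     => other
      }
      rules.iterator.map(rule => rule(rebuilt)).collectFirst { case Some(res) => res }.getOrElse(rebuilt)
    }
  }

  // What `rewrite { case <x + 0> => x }` conceptually turns into: one registerRule call.
  val simplify = new Transformer
  simplify.registerRule {
    case Add(x, Lit(0)) => Some(x) // pattern (and guard) succeeded: return the adapted rhs
    case _              => None    // pattern or guard failed
  }
}
```

Because the rule is an ordinary function value here, its right-hand side can call arbitrary helpers, which is the property the current macro expansion makes hard to preserve.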

The context is inferred `Any` sometimes instead of `AnyRef`

You can see this issue in the following code (extracted from DBLAB):

List(1, 2, 3).foldLeft(ir"0") { (acc, exp) =>
  ir"if($acc != 0) $acc else $exp"
}

Running some operations on `Int` fails

scala> ir"123.toString".run
scala.ScalaReflectionException: expected a member of class Int, you provided method java.lang.Integer.toString

Make a release for Pilatus

Investigate and fix remaining Squid issues raised during Emma experimentation

See: emmalanguage/emma#369

Support Java varargs

It turns out they are encoded differently than Scala varargs. See report here.

Var gets reordered with respect to statements that it depends on when using SC as backend

ir { val ab = new ArrayBuffer[Int]; ab.append(1); var s = ab.size; s }

causes the size computation to occur before the append

Warn about potentially-confusing uses of function applications in patterns

Direct function application syntax on a pattern hole is currently interpreted as higher-order matching. To extract a function, one should ascribe the hole with a function type. This can lead to confusing errors, such as the following:

[error] Embedding Error: Unexpected expression in higher-order pattern variable argument: 42
[error]     case ir"$f(42)" => println(s"$f(42)")
[error]          ^

In this case, the error message should inform and guide users who just wanted to extract a function...

`asInstanceOf` is removed when it should not be

For example in:

val pgrm = ir{
  val ab = ArrayBuffer[Any]()
  List[Any]().foreach { e =>
    ab += e.asInstanceOf[(Int,String,Double)]._2  // asInstanceOf disappears.. why?
  }
}

The result is we may generate code that won't type-check.

Add license boilerplate to all source files

[PardisIR] Variable substitution does not always work

ir{
  val x = 5.0
  val x_3 = ${power(3, ir"$$x : Double")}
}

does not substitute the hole properly even though x is captured and type checked against the hole.

Generic arguments of inner classes disappear

If I have a generic class that is also an inner class, then the generic arguments fall off. For example, if I have

class Foo {
  class Foo3[A]
}

And then I do

val y: Foo#Foo3[Int] = ???

Then this turns into

  val y_0 = ((scala.Predef.???): (test.Foo)#Foo3);
  ()

which fails to compile with

Error:(37, 23) class Foo3 takes type parameters

Here is a commit in the squid-examples repo demonstrating the issue:
ggevay/squid-examples@01ed744

Edit: I pasted the wrong commit

Migrate Free Variable syntax from `$$x` to `x?` to `?x`

Syntax is already available; just need to remove uses of the old one, and possibly repurpose it.

Repo Cleanup

Remove outdated stuff (for example the /benchmark folder, as well as some things in /example).

Fix SchedulingANF

The SchedulingANF IR is useful (and even crucial) to avoid code duplication and recomputation in generated code. Yet, the current implementation is broken and should be reworked.

Properly Implement the Simplified Syntax

Have a proper (robust and easy to use) implementation of the simplified syntax as presented in the SPLASH'17 papers. (Currently, one has to use the SimplePredef hacky interface and there is no direct HOPV syntax.)

Code[T,C] type in Predef
code"t" quasiquote interpolator
Higher-order pattern variables
Remove the SimpleReps trait
Add option to import legacy names (IR[T,C], IRCode[T])

Support cross-quotation references in quasicode

Cross-quotation references (CQR) are variables at a given stage that refer to binders at the same stage but across quotation boundaries (not to be confused with cross-stage references).

For example:

code"(x: Int) => ${ foo(code"x+1") }" // occurrence of x refers to binding in outside quote

Because of fundamental limitations, cross-quotation references are impossible to support in quasiquotes. However, it should be possible to support them in quasicode, as in:

code{(x: Int) => ${ foo(code{x+1}) }} // occurrence of x refers to binding in outside quote

What needs to be done:

disable implicit cross-stage reference support in quasicode (so doing CSP in quasicode requires using explicit CrossStage(...) expressions), as they would be locally ambiguous with CQRs;
when QuasiEmbedder encounters a reference to a variable not defined within the quote, it uses that variable's symbol to look up its associated v:Variable[T] symbol in an external macro cache, and creates one if none exists – this symbol is used to add v.Ctx to the current context requirement; then, in place of the CQR, it generates a special tree that should not compile (containing a @compileTimeOnly) to be substituted by the outer quote (if any);
when QuasiEmbedder finds a binding of some variable v, it makes sure to check the macro cache to see if it's a cross-quotation binding, in which case it accounts for its v.Ctx context being bound there; then in each further inserted tree of the quote, it looks for the corresponding @compileTimeOnly trees and substitutes them with a reference to the bound variable.

Project Status?

Is this project still active? It's concerning how long it seems to have gone without activity.
If so, is there a point at which we can expect Scala 2.13 support?

Thanks!

Distinguish Extraction Holes from Free Variables

These are two distinct concepts, the conflation of which has only led to unnecessary complexity.
For example, spliced holes make sense but not spliced FVs.

Known problems in particular:

Reinserting an extracted binder will not capture a new binder of the same name. Solution: extracted binder should extract a FV with memory of the original binding (instead of extracting a BoundVal and having corresponding holes keep memory of it!). In fact, ideally extracted binders should behave the same as simple holes, except for the fact one can retrieve symbol info from them with the Sym xtor, and one can reinsert them in binder position to propagate annotations and other meta-info (todo).
It is currently not possible to directly pattern-match on free variables, but that can be a useful pattern. For example consider something like case ir"(acc,a) => ${Addition(ir"?acc:String",rest)}" => ... (currently, the inner pattern will extract any string and associate it with acc, and crash because there is no corresponding $ extraction).

Make `Const` safer with a type class

I had completely forgotten this, but Const is currently very unsafe:

scala> Const(1)
res0: IR.ClosedCode[Int] = code"1"

scala> Const(new{})
java.lang.Error: bad constant value: $anon$1@71be27f7 of class class $anon$1

The Const.apply function should require a type class instance making sure the type can actually be a constant.
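
A minimal sketch of what such a type class could look like, in plain Scala. The names `Constant` and `ConstCode` are assumptions made for illustration and are not part of Squid; the real `Const` would return a code value rather than this simple wrapper.

```scala
object ConstSketch {
  // Hypothetical type class: evidence that values of T can be embedded as constants.
  sealed trait Constant[T]
  object Constant {
    private def instance[T]: Constant[T] = new Constant[T] {}
    implicit val intIsConstant: Constant[Int]       = instance
    implicit val doubleIsConstant: Constant[Double] = instance
    implicit val stringIsConstant: Constant[String] = instance
    implicit val boolIsConstant: Constant[Boolean]  = instance
  }

  // Simplified stand-in for the code value that Const would produce.
  final class ConstCode[T](val value: T)

  // The context bound rejects unsupported types at compile time instead of at runtime.
  def Const[T: Constant](value: T): ConstCode[T] = new ConstCode(value)

  val ok = Const(1)
  // val bad = Const(new {})  // would not compile: no Constant instance for the anonymous type
}
```

Since `Constant` is sealed and its instances live in the companion object, users cannot add instances for arbitrary types, so the runtime "bad constant value" error turns into a missing-implicit error at the call site.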

Implicit modifiers fall off from ValDefs

If I add the following in https://github.com/LPTK/squid-examples/blob/master/embedding-from-macros/src/test/scala/test/TestTest.scala

  val r2 = Test2.test {
    implicit val x = 42
  }

then it prints

[At compile time:]
	Original program: code"""{
  val x_0 = 42;
  ()
}"""
	Transformed program: code"""{
  val x_0 = 42;
  ()
}"""

i.e., x is not implicit anymore.

Implement New Type and Symbol Systems

To make the new IR faster, and compatible with ScalaJS/ScalaNative (and also to lower its startup time), we would like not to rely on Scala Reflection at runtime.

This means we implement our own representation of types, type symbols and method symbols.

The downside is that we won't be able to inspect at runtime the annotations (such as effects annotations) defined on method symbols. To work around this, we:

will provide nice tools to provide effects annotations manually for external libraries (in the same vein as this);
will extend the functionalities of the @embed macro annotation to look at the annotations placed on methods and register them in the IR;
also, we have to remove the runtime dependencies on Scala Reflection that currently exist in the code that @embed generates.
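
As a rough illustration of the direction, a reflection-free representation could be as small as the following sketch. Every name in it (`TypeSymbol`, `TypeRep`, `MethodSymbol`, `EffectRegistry`) is hypothetical and chosen for the example; none of them is taken from Squid.

```scala
object TypeRepSketch {
  // Hypothetical reflection-free representations of type symbols, types and method symbols.
  final case class TypeSymbol(fullName: String)

  final case class TypeRep(symbol: TypeSymbol, args: List[TypeRep] = Nil) {
    def show: String =
      if (args.isEmpty) symbol.fullName
      else args.map(_.show).mkString(symbol.fullName + "[", ", ", "]")
  }

  final case class MethodSymbol(owner: TypeSymbol, name: String)

  // Effect information is registered explicitly (manually, or by generated code),
  // since annotations can no longer be read through Scala Reflection at runtime.
  final class EffectRegistry {
    private val pureMethods = scala.collection.mutable.Set.empty[MethodSymbol]
    def registerPure(m: MethodSymbol): Unit = pureMethods += m
    def isPure(m: MethodSymbol): Boolean = pureMethods.contains(m)
  }
}
```

Plain case classes like these have no dependency on the reflection universe, which is what would make them usable on ScalaJS/ScalaNative.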

Don't silently introduce free variables when referencing local class members from within QQs

It can be confusing to unaware programmers that quasiquotes referring to the members of a class from within the class will be compiled as open terms containing a free variable for that class' this instance.

For example, in:

class A { def foo = 27; def bar = ir"foo + 1" }

Field bar will have type IR[Int, {val `A.this`: A}].

Instead, we could just make it an error. Not sure this behavior is actually ever useful.

Path-dependent types turn into type projections

I'm still struggling with getting TypeTag to work. Now the problem is that it is path-dependent on the universe.

Simplified example: Let's say that I have these types:

class Outer {
  class Inner
}

Then

  println(dbg_code"""
              val x = new Outer
              val y: x.Inner = ???""")

prints

code"""{
  val x_0 = new test.Outer();
  val y_1 = ((scala.Predef.???): (test.Outer)#Inner);
  ()
}"""

where x.Inner turned into Outer#Inner.

This is a problem in Emma, because after some code fragment runs through Squid, I'm getting errors such as

found   : scala.reflect.api.JavaUniverse#TypeTag[String]
 required: reflect.runtime.universe.TypeTag[String]

Improve the Base interface

This is to prepare the ground for the new IR implementations that will replace the legacy AST/SimpleAST/SimpleANF IRs, which are bloated and suboptimal.

We need to:

add a Context abstract type member and an implicit parameter to reification methods, so as to avoid needless global state in the reification process;
abstract over the representation of argument lists List[ArgList], so as to allow alternative and possibly more efficient representations;
abstract over the Extract data type, for the same reason;
perhaps take the precautions needed to make the IRs threadsafe, if it does not incur significant overhead (otherwise make it opt-in) – IIRC the main place where threadsafety would be a problem is with the pattern caching mechanism, and with variable-id generation.

Support pattern matching in quasiquotes

Currently, pattern matching and partial function syntaxes (and derived) are not supported. Implement a virtualization scheme to support them.

Implement `runUnsafe` and `compileUnsafe` for `Code[T]`

Currently Code[T] has access to the enrichments that provide .run and .compile to IR[T,Any] (via InspectableCodeOps). This should probably not be the case.

Vararg support in embedded Defs

Varargs in embedded defs are currently not supported.

@embed object Foo {
    @phase('MyPhase) def foo(x: Int, xs: Int*): Int = x * xs.sum
}

A call to Foo.foo(100, 1, 2, 3) should be correctly inlined.

Support for default values in class's methods

Default values of method parameters are currently missing in Squid.

`Abort()` should be callable from a function called by the rewrite rule

Currently the rewrite macro rewrites calls to Abort() as calls to `internal abort`, which is unnecessary and makes writing Abort() from a function outside the body of the rewrite rule impossible.

Vararg unquote with non-identifier fails to compile

See the example:

code"Seq(1,2,3)" match {
  case code"Seq[Int]($xs*)" =>
    val xs2 = c"0" +: xs
    code"Seq[Int](${xs2}*)" // works fine
    // a current limitation/bug prevents the following syntax:
    // code"Seq[Int](${c"0" +: xs}*)"
    //   Embedding Error: Quoted expression does not type check: type mismatch;
    //     found: Any; required: Seq[Embedding.IR[Int,?]])
}

Check type hole bounds during extraction

The ability to add bounds on type holes was added to accommodate patterns where these bounds are required (https://github.com/epfldata/sc/commit/9ed047c7a28eaabdd7433161fc931419b78772d3), but the bounds are not yet checked during extraction.
For example, the following matches:
ir"List(0)".erase match { case ir"$ls:List[$t where (t <:< String)]" => t }.

In order for type hole bounds to be checked, we need to add bound parameters to the typeHole function of QuasiBase.

Cannot splice `Var` in quotation

val v = __newVar[Int](unit(1))
ir"println($v)"

Change StaticOptimizer's `DumpFolder` mechanism to something more elegant

Currently in order to dump the code generated by optimizeAs('Name), one has to write:

implicit val `/tmp` = DumpFolder

It would be less cryptic and more general to implement a static macro so we can do:

implicit val dumpFolder = static{DumpFolderPath("/tmp")}

We can use annotations to store the intermediate AST. We can even execute the code at compile-time and use a type-class to parse the result back to an AST.

Solve the annoying `C with AnyRef` problem with contexts

Sometimes one still encounters annoying errors saying a context should be C with AnyRef instead of just C.
This basically happens when using constructs that return IR[T,{}], such as Const(123), because Const is defined as [T: IRType](v: T): IR[T,{}]. For example:

def foo[C](x: IR[Int, C]): IR[Int, C] = ir"$x + ${Const(123)}"

This code fails to compile with:

Error:(37, 56) type mismatch;
 found   : squid.TestDSL.IR[Int,AnyRef with C]
 required: squid.TestDSL.Predef.IR[Int,C]
    (which expands to)  squid.TestDSL.IR[Int,C]
    def foo[C](x: IR[Int, C]): IR[Int, C] = ir"$x + ${Const(123)}"

Name collisions of free and bound variables

We can name free variables in a way that collides with the names of bound variables.

> ir"(x: Int) => 2*x + ($$x_0:Int)"
res0: IR[Int => Int,Any{val x_0: Int}] = ir"((x_0: scala.Int) => (2).*(x_0).+(x_0))"
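
One possible way to avoid the clash, sketched in plain Scala: when a name is generated for a bound variable, skip any candidate already used by a free variable in scope. The helper `freshName` is hypothetical and only illustrates the idea; it is not how Squid currently names variables.

```scala
object FreshNameSketch {
  // Picks base_0, base_1, ... and skips every candidate that is already taken,
  // for example by a free variable of the enclosing term.
  def freshName(base: String, taken: Set[String]): String =
    Iterator.from(0).map(i => s"${base}_$i").find(name => !taken(name)).get
}
```

With the free variable `x_0` in scope, the bound `x` from the example above would be printed as `x_1`, so the two no longer collide.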

Virtualize `def` and `lazy val` in embedded code

It would be nice to be able to use simple defs (without type parameters) in embedded code, as well as lazy values. For example, it would avoid a workaround mentioned in emmalanguage/emma#369.

This should be doable in the same way it's done with things like if/else, i.e.:

code"if ($cnd) $thn else $els"
// really is equivalent to:
code"squid.lib.IfThenElse($cnd, $thn, $els)"

In the same way, we could have:

code"lazy val $x = $v; $b"
// virtualized as:
code"val $y = squid.utils.Lazy($v); ${b.subs(x) ~> code"$y.value"}"

and:

code"def f(...) = ... ; ..."
// if f is not recursive, as:
code"val f = (...) => ... ; ..."

and for recursive definitions, using a Y combinator with by-name parameters, for example for a recursive definition:

code"def f(...) = ... ; ..."
// if f is recursive, as:
code"val f = Y(f => ...) ; ..."

Finally, it would use an annotation (similar to the one used for implicit bindings) in order to mark virtualized bindings so they can be generated back to their original form when emitting Scala trees.
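
For reference, the two runtime helpers that this encoding relies on can be sketched in plain Scala as follows. This is an assumption-laden sketch: the `Lazy` class here is written from scratch for the example (it is not a copy of squid.utils.Lazy), and `Y` is one possible fixed-point combinator whose recursive reference is passed by name.

```scala
object VirtualizationSketch {
  // A lazy cell: the by-name argument is evaluated at most once, on first access to `value`.
  final class Lazy[A](thunk: => A) {
    lazy val value: A = thunk
  }
  object Lazy {
    def apply[A](thunk: => A): Lazy[A] = new Lazy(thunk)
  }

  // Fixed-point combinator. The recursive reference is by-name, so building the
  // function does not force it; it is only evaluated when a recursive call happens.
  def Y[A, B](f: (=> (A => B)) => (A => B)): A => B = {
    lazy val self: A => B = f(self)
    self
  }

  // `def factorial(n: Int): Int = if (n <= 1) 1 else n * factorial(n - 1)` encoded as a val:
  val factorial: Int => Int =
    Y[Int, Int](self => n => if (n <= 1) 1 else n * self(n - 1))
}
```

Usage: `VirtualizationSketch.factorial(5)` evaluates the recursive definition through `Y`, and `Lazy { expensive() }.value` runs `expensive()` only on first access.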

Reference to local value does not work in quotation

Refer to here:
https://github.com/epfldata/DDBToaster/blob/scgen/ddbtoaster/pardis/tpcc/TpccXactGenerator_SC.scala#L130
and change dsl to ir.

Compositionality of .compile

It would be nice if this worked,

> (ir"(x: Int) => x".compile)(2)
error: type mismatch;
 found   : Int(2)
 required: <:<[AnyRef,Any]
       (ir"(x: Int) => x".compile)(2)
                                   ^

just like this:

> ((x: Int) => x)(2)
res0: Int = 2
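
The error message suggests that compile takes an implicit evidence parameter, so the argument 2 is parsed as an explicit value for that parameter. The plain-Scala sketch below reproduces the situation under that assumption; the `IR` class in it is a hypothetical stand-in and not Squid's actual definition.

```scala
object CompileSketch {
  // Hypothetical stand-in for IR[T, C]: `compile` requires evidence that the context is closed.
  final class IR[T, C](value: T) {
    def compile(implicit ev: AnyRef <:< C): T = value
  }

  val idCode = new IR[Int => Int, Any](x => x)

  // idCode.compile(2) and (idCode.compile)(2) do not compile:
  // the argument 2 is taken as the implicit evidence, not as the function argument.

  // Workaround 1: call apply explicitly, so the implicit is resolved first.
  val viaApply: Int = idCode.compile.apply(2)

  // Workaround 2: bind the compiled function to a value before applying it.
  val viaVal: Int = {
    val f = idCode.compile
    f(2)
  }
}
```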

Port to Scala 2.12

Code pattern matching should ignore difference between overriding and overridden symbols

The following currently evaluates to false:

trait Foo { def foo: Int }
class Bar extends Foo { override def foo: Int = 0 }

val a = code"(new Bar).foo"  // uses Bar#foo symbol
val bar: ClosedCode[Foo] = code"new Bar"  // upcast from ClosedCode[Bar]
val b = code"$bar.foo"  // uses Foo#foo symbol

println( a =~= b )  // prints false

This is because term equivalence is based on matching semantics, and this does not match:

a match { case code"($f:Bar).foo" => }

Note that, on the other hand, this does match:

a match { case code"($f:Foo).foo" => }

This is a pretty annoying limitation in many contexts. As an example, this rule won't be able to match uses of the equals method statically applied on a type that overrides it, whereas it should. (This is not to be confused with the unrelated problems introduced by == overloading, though.)

It seems type safety would not be violated by saying that all overriding and overridden symbols should be treated as equal when performing pattern matching. This is because the self object on which any of these methods are called will also be checked during matching at one point or another, so for example we cannot match a call to an overriding method with refined return type if the self object is not actually a subtype of the type where the overriding method is defined.

Warn when type parameters in patterns are inferred as `Nothing`

The following code does not currently raise a warning, but the pattern is too restrictive and does not match:

val q = ir"ArrayBuffer[Int]() foreach println" rewrite {
  case ir"($ab:ArrayBuffer[$t0]).foreach($f)" => ir"???"
}

(The correct pattern is case ir"($ab:ArrayBuffer[$t0]).foreach[$t1]($f)" =>.)

Make syntax of variable function insertion more regular

See: #53 (comment)

Make non-contextual Code the default

It is entirely possible to use Squid without statically checking scopes, similar to MetaOCaml or Dotty's quotes (but of course with more features, like pattern matching).

This simpler mode of usage is less type-safe, but makes many things much easier and less intimidating for beginners. One has to use the OpenCode[T] type (which is a super type of the contextual Code[T,C]), and use .unsafe_asClosedCode whenever one wants to run or compile the piece of code.

However, currently we emphasize the contextual mode of Squid as the default/normal mode, which is intimidating and may scare beginners away. Instead, I think we should promote the simpler interface as the default interface, and reserve the contextual interface for advanced users. This would mean:

Changing the type synonyms: OpenCode[T] is long-winded and a bit ugly; we'd like to use Code[T] instead. The new scheme would be: Code[T] for when you do not know the context, ClosedCode[T] for code that is assuredly closed and can be run; OpenCode[T,C] for open code for which we statically track the scope using contextual type C (for advanced users).
Alternatively, as I previously proposed in LPTK/squid-type-experiments, we could add a type synonym `in` to enable the nice Code[T] in C syntax (meaning Code[T]{ type Ctx >: C }), though that would likely require changes in Squid to look at the Ctx member of code types instead of their C type argument, which may cause weird typing problems down the line.
Changing the code types inferred by default: we should have a SimplePredef which is configured to make quasiquotes infer Code[T] types by default, instead of Code[T,C] types; even though OpenCode[T,C] <: Code[T], this still does help with type inference, such as when doing var c = code"0"; ... ; c = code"<open term>".
Splitting the basic tutorial in two: there should be a basic tutorial using simple Code[T] types and cross-quotation references, which are now supported, and a tutorial for advanced users that demonstrates the OpenCode[T,C] type and its interaction with path-dependent Variable[T]#Ctx types.

Failed to generate semanticdb: repeated argument not allowed

Hello,

I am having trouble setting up code navigation for Squid in VSCode. When I import the build, I keep getting error messages complaining about "repeated argument not allowed" when generating semanticdb. All tests passed when I ran bin/testAll.sh. I am using the master branch and the default build.sbt. Please let me know what I missed. Thanks!

failed to generate semanticdb for /*local-squid-path*/squid/src/main/scala/squid/OptimTestDSL.scala:
/*local-squid-path*/squid/src/main/scala/squid/OptimTestDSL.scala:58: error: repeated argument not allowed here
      code"List(${ xs map (r => code"$f($r)"): _* })"
                                               ^
	at scala.meta.internal.parsers.Reporter.syntaxError(Reporter.scala:16)
       ...

And the same error message repeated for others as well:

failed to generate semanticdb for /*local-squid-path*/squid/src/test/scala/squid/feature/Antiquotation.scala:
/*local-squid-path*/squid/src/test/scala/squid/feature/Antiquotation.scala:106: error: repeated argument not allowed here
    eqt(code{List(${Seq(x,y):_*})}, code"List(1,2)")

failed to generate semanticdb for /*local-squid-path*/squid/src/test/scala/squid/feature/Quasicode.scala:
/Users/ztian/Desktop/squid/src/test/scala/squid/feature/Quasicode.scala:44: error: repeated argument not allowed here
    eqt(ls, code"List(${seq: _*})")

...

Thanks,
Zilu

Implement pretty-printer

The legacy IRs implemented pretty-printing by converting the code into a Scala AST and then using the Scala pretty-printer!
This is not ideal because it's slow, and because Scala's pretty-printer can be glitchy.
The new pretty-printer should:

break lines only if necessary (can take inspiration from this);
print variables as original name + '_' + counter starting from 0 (to avoid clashes);
ideally, have a notion of precedence and associativity to avoid some parentheses;
ideally, be reusable (implemented as a Base), but with hooks to help print virtualized constructs such as if then else correctly (instead of as squid.lib.IfThenElse).
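
The naming and precedence points can be sketched in a few lines of plain Scala. The `Expr` type and the precedence table below are assumptions made for the example (a real implementation would work on the IR through a Base), and all operators are treated as left-associative.

```scala
object PrettyPrintSketch {
  // Minimal expression type for the example (hypothetical, not Squid's IR).
  sealed trait Expr
  final case class Num(n: Int) extends Expr
  final case class Ref(name: String, id: Int) extends Expr
  final case class BinOp(op: String, lhs: Expr, rhs: Expr) extends Expr

  // Higher number binds tighter.
  private val precedence = Map("+" -> 1, "-" -> 1, "*" -> 2, "/" -> 2)

  // `outer` is the precedence of the enclosing operator (0 at top level);
  // `rightOperand` tells whether this term is the right operand of that operator.
  def show(e: Expr, outer: Int = 0, rightOperand: Boolean = false): String = e match {
    case Num(n)        => n.toString
    case Ref(name, id) => s"${name}_$id" // original name + '_' + counter
    case BinOp(op, l, r) =>
      val p = precedence(op)
      val text = s"${show(l, p)} $op ${show(r, p, rightOperand = true)}"
      // Parenthesize only when needed: looser binding than the context, or equal
      // precedence on the right-hand side of a left-associative operator.
      if (p < outer || (p == outer && rightOperand)) s"($text)" else text
  }
}
```

With this scheme, `(1 + x_0) * 2` keeps its parentheses while `1 + 2 + 3` is printed without any.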

Some rewritings diverge instead of stopping on a recursion limit

See:

scala> ir"val x = 42; println(x.toDouble)" fix_rewrite { case ir"($n:Int).toDouble" => ir"$n.toDouble+1" }
java.lang.StackOverflowError

On the other hand, this one does not exhibit the problem:

scala> ir"val x = 42; println(x.toDouble)" fix_rewrite { case ir"($n:Int).toDouble" => ir"$n.toDouble" }
Rewrite rules did not converge after 8 iterations.
ir"""{
  val x_0 = 42;
  scala.Predef.println(x_0.toDouble)
}"""
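
For comparison, a bounded fixed-point driver can be sketched as below, in plain Scala. This is not Squid's fix_rewrite: `fixRewrite` and its parameters are hypothetical, and the sketch only bounds the number of whole passes. The first example above presumably overflows inside a single pass (each result contains a new redex), so rule application within a pass would need a similar bound.

```scala
object FixpointSketch {
  // Applies `step` repeatedly until the term stops changing or the iteration
  // budget is exhausted. Returns the last term and whether a fixed point was reached.
  @annotation.tailrec
  def fixRewrite[T](term: T, remaining: Int)(step: T => T): (T, Boolean) =
    if (remaining <= 0) (term, false)
    else {
      val next = step(term)
      if (next == term) (term, true)
      else fixRewrite(next, remaining - 1)(step)
    }
}
```

A diverging step such as `_ + 1` on integers then stops after the given number of iterations and reports non-convergence instead of crashing.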

Remove special handling of AnyCode

There is no reason to handle AnyCode specially as it is done now (a remnant of the experimental separation between simple and contextual QQs). It is in fact a soundness hole (see: val x: AnyCode[Int] = code"(?str:String).length"; code"$x".run).

We should direct users to use OpenCode in all cases where AnyCode is used today, and remove special handling in QuasiMacros, QuasiEmbedder, and related $Code functions in QuasiBase.

deprecate insertion of AnyCode expressions, which is unsafe
change doc to mention OpenCode instead of AnyCode
remove special support from the implementation

Support free variables in quasicodes

This doesn't work (but should (?))

ir{($$x: Int)}

The error message is "not found: value $$x"