epfldata / squid
Squid – type-safe metaprogramming and compilation framework for Scala
Home Page: http://epfldata.github.io/squid/
License: Apache License 2.0
The rewrite macros comprise top-level rewrite calls from the RuleBasedTransformer class, as well as term-level t.rewrite calls, which desugar into the same. These are currently expanded into calls to RuleBasedTransformer#registerRule. Conceptually, an invocation such as rewrite { case code"t" if c => rhs } expands into:
registerRule(
expression representation of code pattern t,
extractArtifact => {
// ... extract the different trees obtained from extractArtifact
// ... and pass them through the sub-patterns
// (for example, see `SubPattern` in `case code"${SubPattern(x,y,z)}:Int" => ...`)
if (the guard/condition c of the pattern matching case is respected)
Some(adapted version of rhs)
else None
}
)
This is generated after some crazy macro hackery during which untypecheck is used to get around owner chain corruptions. That imposes limitations on what can go inside a rewrite rule's right-hand side, and it causes problems with path-dependent types, some of which are fixed in a very ad hoc way.
A saner approach would likely be to perform the main code transformation before type checking (using a macro annotation), since the translation is mostly syntactic anyway. But we would have to split from that code the functionality that inspects the types of the pattern and right-hand side in order to refine the type of the rewritten term and to check that the transformation is type-preserving. Indeed, that functionality needs to be expressed in a macro def, as it requires type checking information – but the good news is that it does not change the shape of the tree and should not result in owner chain problems.
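The conceptual expansion into registerRule described above can be illustrated with a toy model in plain Scala. This is only a sketch under stated assumptions: Expr, Lit, Add, Transformer and the Map-based extraction result are hypothetical stand-ins invented for illustration and are not Squid's actual types or signatures.

```scala
object RewriteSketch {
  // Toy expression type standing in for a code representation (hypothetical).
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(lhs: Expr, rhs: Expr) extends Expr

  // A rule pairs a pattern (extractor) with a function that may decline to rewrite.
  final case class Rule(extract: Expr => Option[Map[String, Expr]],
                        rewrite: Map[String, Expr] => Option[Expr])

  final class Transformer {
    private var rules = List.empty[Rule]

    def registerRule(extract: Expr => Option[Map[String, Expr]])
                    (rewrite: Map[String, Expr] => Option[Expr]): Unit =
      rules ::= Rule(extract, rewrite)

    // Bottom-up traversal; at each node, use the first rule that matches and whose guard holds.
    def transform(e: Expr): Expr = {
      val rebuilt = e match {
        case Add(l, r) => Add(transform(l), transform(r))
        case other     => other
      }
      rules.foldLeft(Option.empty[Expr]) { (acc, r) =>
        acc.orElse(r.extract(rebuilt).flatMap(r.rewrite))
      }.getOrElse(rebuilt)
    }
  }

  val transformer = new Transformer

  // Plays the role of: rewrite { case code"$x + $y" if y is the literal 0 => x }
  transformer.registerRule {
    case Add(x, y) => Some(Map("x" -> x, "y" -> y)) // pattern with two holes
    case _         => None
  } { holes =>
    if (holes("y") == Lit(0)) Some(holes("x")) // guard respected: produce the rhs
    else None                                  // guard failed: rule does not apply
  }

  val simplified: Expr = transformer.transform(Add(Add(Lit(1), Lit(0)), Lit(0)))
}
```

The guard and the right-hand side live in the same function, which mirrors the Some/None shape of the expansion shown above.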
You can see this issue in the following code (extracted from DBLAB):
List(1, 2, 3).foldLeft(ir"0") { (acc, exp) =>
ir"if($acc != 0) $acc else $exp"
}
scala> ir"123.toString".run
scala.ScalaReflectionException: expected a member of class Int, you provided method java.lang.Integer.toString
It turns out they are encoded differently than Scala varargs. See report here.
ir { val ab = new ArrayBuffer[Int]; ab.append(1); var s = ab.size; s }
causes the size computation to occur before the append
Direct function application syntax on a pattern hole is currently interpreted as higher-order matching. To extract a function, one should ascribe the hole with a function type. This can lead to confusing errors, such as the following:
[error] Embedding Error: Unexpected expression in higher-order pattern variable argument: 42
[error] case ir"$f(42)" => println(s"$f(42)")
[error] ^
In this case, the error message should inform and guide users who just wanted to extract a function...
For example in:
val pgrm = ir{
val ab = ArrayBuffer[Any]()
List[Any]().foreach { e =>
ab += e.asInstanceOf[(Int,String,Double)]._2 // asInstanceOf disappears.. why?
}
}
The result is we may generate code that won't type-check.
ir{
val x = 5.0
val x_3 = ${power(3, ir"$$x : Double")}
}
does not substitute the hole properly even though x is captured and type checked against the hole.
If I have a generic class that is also an inner class, then the generic arguments fall off. For example, if I have
class Foo {
class Foo3[A]
}
And then I do
val y: Foo#Foo3[Int] = ???
Then this turns into
val y_0 = ((scala.Predef.???): (test.Foo)#Foo3);
()
which fails to compile with
Error:(37, 23) class Foo3 takes type parameters
Here is a commit in the squid-examples repo demonstrating the issue:
ggevay/squid-examples@01ed744
Edit: I pasted the wrong commit
Syntax is already available; just need to remove uses of the old one, and possibly repurpose it.
Remove outdated stuff (for example the /benchmark folder, as well as some things in /example).
The SchedulingANF IR is useful (and even crucial) to avoid code duplication and recomputation in generated code. Yet the current implementation is broken and should be reworked.
Have a proper (robust and easy-to-use) implementation of the simplified syntax as presented in the SPLASH'17 papers. (Currently, one has to use the hacky SimplePredef interface and there is no direct HOPV syntax.)
Code[T,C] type in Predef
code"t" quasiquote interpolator
SimpleReps trait
IR[T,C], IRCode[T])
Cross-quotation references (CQR) are variables at a given stage that refer to binders at the same stage but across quotation boundaries (not to be confused with cross-stage references).
For example:
code"(x: Int) => ${ foo(code"x+1") }" // occurrence of x refers to binding in outside quote
Because of fundamental limitations, cross-quotation references are impossible to support in quasiquotes. However, it should be possible to support them in quasicode, as in:
code{(x: Int) => ${ foo(code{x+1}) }} // occurrence of x refers to binding in outside quote
What needs to be done:
disable implicit cross-stage reference support in quasicode (so doing CSP in quasicode requires using explicit CrossStage(...) expressions), as they would be locally ambiguous with CQRs;
when QuasiEmbedder encounters a reference to a variable not defined within the quote, it uses that variable's symbol to look up its associated v:Variable[T] symbol in an external macro cache, and creates one if none exists; this symbol is used to add v.Ctx to the current context requirement; then, in place of the CQR, it generates a special tree that should not compile (containing a @compileTimeOnly) to be substituted by the outer quote (if any);
when QuasiEmbedder finds a binding of some variable v, it makes sure to check the macro cache to see if it's a cross-quotation binding, in which case it accounts for its v.Ctx context being bound there; then in each further inserted tree of the quote, it looks for the corresponding @compileTimeOnly trees and substitutes them with a reference to the bound variable.
Thanks!
These are two distinct concepts, the conflation of which has only led to unnecessary complexity.
For example, spliced holes make sense but not spliced FVs.
Known problems in particular:
Sym xtor, and one can reinsert them in binder position to propagate annotations and other meta-info (todo).
case ir"(acc,a) => ${Addition(ir"?acc:String",rest)}" => ... (currently, the inner pattern will extract any string and associate it with acc, and crash because there is no corresponding $ extraction).
I had completely forgotten this, but Const is currently very unsafe:
scala> Const(1)
res0: IR.ClosedCode[Int] = code"1"
scala> Const(new{})
java.lang.Error: bad constant value: $anon$1@71be27f7 of class class $anon$1
The Const.apply function should require a type class instance making sure the type can actually be a constant.
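One way such a type class constraint could look is sketched below in plain Scala. The names Constable and ConstCode are hypothetical and chosen only for illustration; the set of admissible constant types shown here is an assumption, not what Squid actually supports.

```scala
object ConstSketch {
  // Hypothetical type class: evidence that values of type T may be embedded as constants.
  sealed trait Constable[T]
  object Constable {
    private def instance[T]: Constable[T] = new Constable[T] {}
    implicit val intIsConstable: Constable[Int]         = instance
    implicit val doubleIsConstable: Constable[Double]   = instance
    implicit val booleanIsConstable: Constable[Boolean] = instance
    implicit val stringIsConstable: Constable[String]   = instance
  }

  // Stand-in for a constant code value; it can only be built through the checked apply.
  final class ConstCode[T] private (val value: T)
  object ConstCode {
    def apply[T: Constable](value: T): ConstCode[T] = new ConstCode(value)
  }

  val one   = ConstCode(1)       // compiles: Constable[Int] is available
  val hello = ConstCode("hello") // compiles: Constable[String] is available
  // ConstCode(new {})           // would be rejected at compile time: no Constable instance
}
```

With this shape, the "bad constant value" failure would surface as a missing implicit at compile time instead of a java.lang.Error at runtime.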
If I add the following in https://github.com/LPTK/squid-examples/blob/master/embedding-from-macros/src/test/scala/test/TestTest.scala
val r2 = Test2.test {
implicit val x = 42
}
then it prints
[At compile time:]
Original program: code"""{
val x_0 = 42;
()
}"""
Transformed program: code"""{
val x_0 = 42;
()
}"""
i.e., x is not implicit anymore.
To make the new IR faster, and compatible with ScalaJS/ScalaNative (and also to lower its startup time), we would like not to rely on Scala Reflection at runtime.
This means we implement our own representation of types, type symbols and method symbols.
The downside is that we won't be able to inspect at runtime the annotations (such as effects annotations) defined on method symbols. To work around this, we:
@embed macro annotation to look at the annotations placed on methods and register them in the IR;
@embed generates.
It can be confusing to unaware programmers that quasiquotes referring to the members of a class from within the class will be compiled as open terms containing a free variable for that class' this instance.
For example, in:
class A { def foo = 27; def bar = ir"foo + 1" }
Field bar will have type IR[Int, {val `A.this`: A}].
Instead, we could just make it an error. Not sure this behavior is actually ever useful.
I'm still struggling with getting TypeTag to work. Now the problem is that it is path-dependent on the universe.
Simplified example: Let's say that I have these types:
class Outer {
class Inner
}
Then
println(dbg_code"""
val x = new Outer
val y: x.Inner = ???""")
prints
code"""{
val x_0 = new test.Outer();
val y_1 = ((scala.Predef.???): (test.Outer)#Inner);
()
}"""
where x.Inner turned into Outer#Inner.
This is a problem in Emma, because after some code fragment runs through Squid, I'm getting errors such as
found : scala.reflect.api.JavaUniverse#TypeTag[String]
required: reflect.runtime.universe.TypeTag[String]
This is to prepare the ground for the new IR implementations that will replace the legacy AST/SimpleAST/SimpleANF IRs, which are bloated and suboptimal.
We need to:
add a Context abstract type member and an implicit parameter to reification methods, so as to avoid needless global state in the reification process;
abstract over the representation of argument lists List[ArgList], so as to allow alternative and possibly more efficient representations;
abstract over the Extract data type, for the same reason;
perhaps take the precautions needed to make the IRs thread-safe, if it does not incur significant overhead (otherwise make it opt-in); IIRC the main places where thread safety would be a problem are the pattern caching mechanism and variable-id generation.
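For the variable-id generation point in the list above, a lock-free counter is one low-overhead option. The sketch below uses java.util.concurrent.atomic.AtomicInteger; IdGenerator and stress are hypothetical names, and nothing here reflects how Squid currently generates ids.

```scala
import java.util.concurrent.atomic.AtomicInteger

object VarIdSketch {
  // Hypothetical id source: each call returns a distinct id, even under contention.
  final class IdGenerator {
    private val counter = new AtomicInteger(0)
    def fresh(): Int = counter.getAndIncrement()
  }

  // Draws `perThread` ids from each of `threads` threads, then returns the next id,
  // which equals the total number of ids already handed out if none was lost.
  def stress(threads: Int, perThread: Int): Int = {
    val gen = new IdGenerator
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to perThread).foreach(_ => gen.fresh())
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    gen.fresh()
  }
}
```

A plain var incremented from several threads could lose updates and hand out duplicate ids; the atomic increment avoids that without explicit locking.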
Currently, pattern matching and partial function syntaxes (and derived) are not supported. Implement a virtualization scheme to support them.
Currently Code[T] has access to the enrichments that provide .run and .compile to IR[T,Any] (via InspectableCodeOps). This should probably not be the case.
Varargs in embedded defs are currently not supported.
@embed object Foo {
@phase('MyPhase) def foo(x: Int, xs: Int*): Int = x * xs.sum
}
A call to Foo.foo(100, 1, 2, 3)
should be correctly inlined.
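For reference, this is the plain-Scala behavior that an inlined call would need to preserve. The sketch deliberately leaves out the @embed and @phase annotations so that it runs without Squid; only the varargs semantics are shown.

```scala
object VarargsSketch {
  // Same signature and body as Foo.foo above, without the Squid annotations.
  def foo(x: Int, xs: Int*): Int = x * xs.sum

  val direct = foo(100, 1, 2, 3)          // arguments passed individually
  val viaSeq = foo(100, Seq(1, 2, 3): _*) // an existing sequence spliced as varargs
  val manual = 100 * Seq(1, 2, 3).sum     // what the call should reduce to once inlined
}
```

All three values are equal (600), so an inlining of Foo.foo(100, 1, 2, 3) that bound xs to anything other than the sequence 1, 2, 3 would change the result.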
Default parameter values of methods are missing in Squid.
Currently the rewrite macro rewrites calls to Abort() as calls to `internal abort`, which is unnecessary and makes it impossible to call Abort() from a function outside the body of the rewrite rule.
See the example:
code"Seq(1,2,3)" match {
case code"Seq[Int]($xs*)" =>
val xs2 = c"0" +: xs
code"Seq[Int](${xs2}*)" // works fine
// a current limitation/bug prevents the following syntax:
// code"Seq[Int](${c"0" +: xs}*)"
// Embedding Error: Quoted expression does not type check: type mismatch;
// found: Any; required: Seq[Embedding.IR[Int,?]])
}
The ability to add bounds on type holes was added to accommodate patterns where these bounds are required (https://github.com/epfldata/sc/commit/9ed047c7a28eaabdd7433161fc931419b78772d3), but the bounds are not yet checked during extraction.
For example, the following matches:
ir"List(0)".erase match { case ir"$ls:List[$t where (t <:< String)]" => t }
In order for type hole bounds to be checked, we need to add bound parameters to the typeHole function of QuasiBase.
val v = __newVar[Int](unit(1))
ir"println($v)"
Currently, in order to dump the code generated by optimizeAs('Name), one has to write:
implicit val `/tmp` = DumpFolder
It would be less cryptic and more general to implement a static macro so we can do:
implicit val dumpFolder = static{DumpFolderPath("/tmp")}
Can use annotations to store the intermediate AST. Can even execute the code at compile-time and use a type-class to parse the result back to an AST.
Sometimes one still encounters annoying errors saying a context should be C with AnyRef instead of just C.
This basically happens when using constructs that return IR[T,{}], such as Const(123), because Const is defined as [T: IRType](v: T): IR[T,{}]. For example:
def foo[C](x: IR[Int, C]): IR[Int, C] = ir"$x + ${Const(123)}"
This code fails to compile with:
Error:(37, 56) type mismatch;
found : squid.TestDSL.IR[Int,AnyRef with C]
required: squid.TestDSL.Predef.IR[Int,C]
(which expands to) squid.TestDSL.IR[Int,C]
def foo[C](x: IR[Int, C]): IR[Int, C] = ir"$x + ${Const(123)}"
We can name free variables in a way to collide with the names of bound variables.
> ir"(x: Int) => 2*x + ($$x_0:Int)"
res0: IR[Int => Int,Any{val x_0: Int}] = ir"((x_0: scala.Int) => (2).*(x_0).+(x_0))"
It would be nice to be able to use simple defs (without type parameters) in embedded code, as well as lazy values. For example, it would avoid a workaround mentioned in emmalanguage/emma#369.
This should be doable in the same way it's done with things like if/else, i.e.:
code"if ($cnd) $thn else $els"
// really is equivalent to:
code"squid.lib.IfThenElse($cnd, $thn, $els)"
In the same way, we could have:
code"lazy val $x = $v; $b"
// virtualized as:
code"val $y = squid.utils.Lazy($v); ${b.subs(x) ~> code"$y.value"}"
and:
code"def f(...) = ... ; ..."
// if f is not recursive, as:
code"val f = (...) => ... ; ..."
and for recursive definitions, using a Y combinator with by-name parameters, for example for a recursive definition:
code"def f(...) = ... ; ..."
// if f is recursive, as:
code"val f = Y(f => ...) ; ..."
Finally, it would use an annotation (similar to the one used for implicit bindings) in order to mark virtualized bindings so they can be generated back to their original form when emitting Scala trees.
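The two encodings proposed above (a lazy cell for lazy val, and a Y combinator with a by-name parameter for recursive defs) can be tried out in plain Scala. In this sketch, Lazy is a local stand-in written for illustration rather than squid.utils.Lazy, and the signature of Y is an assumption about what such a combinator could look like.

```scala
object VirtualizationSketch {
  // Local stand-in for a lazy cell: `init` is evaluated at most once, on first access.
  final class Lazy[A](init: => A) { lazy val value: A = init }
  object Lazy { def apply[A](init: => A): Lazy[A] = new Lazy(init) }

  // Fixed-point combinator; the self-reference is passed by name so that
  // building the function does not force it prematurely.
  def Y[A, B](f: (=> (A => B)) => (A => B)): A => B = {
    lazy val self: A => B = f(self)
    self
  }

  // `lazy val x = { evaluations += 1; 42 }; x + x` virtualized with an explicit cell:
  var evaluations = 0
  val cell = Lazy { evaluations += 1; 42 }
  val sum = cell.value + cell.value

  // `def fact(n: Int): Int = if (n <= 1) 1 else n * fact(n - 1)` virtualized with Y:
  val fact: Int => Int = Y[Int, Int](self => n => if (n <= 1) 1 else n * self(n - 1))
}
```

Here sum is 84 with the initializer run only once, and fact(5) is 120, which matches what the original lazy val and def forms would compute.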
Refer to here:
https://github.com/epfldata/DDBToaster/blob/scgen/ddbtoaster/pardis/tpcc/TpccXactGenerator_SC.scala#L130
and change dsl to ir.
It would be nice if this worked,
> (ir"(x: Int) => x".compile)(2)
error: type mismatch;
found : Int(2)
required: <:<[AnyRef,Any]
(ir"(x: Int) => x".compile)(2)
^
just like this:
> ((x: Int) => x)(2)
res0: Int = 2
The following currently evaluates to false:
trait Foo { def foo: Int }
class Bar extends Foo { override def foo: Int = 0 }
val a = code"(new Bar).foo" // uses Bar#foo symbol
val bar: ClosedCode[Foo] = code"new Bar" // upcast from ClosedCode[Bar]
val b = code"$bar.foo" // uses Foo#foo symbol
println( a =~= b ) // prints false
This is because term equivalence is based on matching semantics, and this does not match:
a match { case code"($f:Bar).foo" => }
n.b. while this matches:
a match { case code"($f:Foo).foo" => }
This is a pretty annoying limitation in many contexts. As an example, this rule won't be able to match uses of the equals method statically applied on a type that overrides it, whereas it should. (This is not to be confused with the unrelated problems introduced by == overloading, though.)
It seems type safety would not be violated by saying that all overriding and overridden symbols should be treated as equal when performing pattern matching. This is because the self object on which any of these methods are called will also be checked during matching at one point or another, so for example we cannot match a call to an overriding method with refined return type if the self object is not actually a subtype of the type where the overriding method is defined.
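The proposal to treat overriding and overridden symbols as equal can be pictured with a toy symbol model. MethodSym, root and sameFamily are hypothetical names; the sketch assumes each method overrides at most one symbol, which is a simplification compared to real Scala symbols (a method can override several inherited declarations).

```scala
object OverrideEquivalenceSketch {
  // Toy method symbol: `overridden` points at the symbol this one overrides, if any.
  final case class MethodSym(owner: String, name: String, overridden: Option[MethodSym] = None)

  // Follow the override chain up to the original declaration.
  @annotation.tailrec
  def root(m: MethodSym): MethodSym = m.overridden match {
    case Some(parent) => root(parent)
    case None         => m
  }

  // Two symbols are treated as equal for matching when they share the same root declaration.
  def sameFamily(a: MethodSym, b: MethodSym): Boolean = root(a) == root(b)

  val fooInFoo = MethodSym("Foo", "foo")
  val fooInBar = MethodSym("Bar", "foo", Some(fooInFoo))
}
```

Strict symbol equality distinguishes Bar#foo from Foo#foo, while sameFamily relates them; the check on the self object's type described above would remain a separate step.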
The following code does not currently raise a warning, but the pattern is too restrictive and does not match:
val q = ir"ArrayBuffer[Int]() foreach println" rewrite {
case ir"($ab:ArrayBuffer[$t0]).foreach($f)" => ir"???"
}
(The correct pattern is case ir"($ab:ArrayBuffer[$t0]).foreach[$t1]($f)" =>.)
See: #53 (comment)
It is entirely possible to use Squid without statically checking scopes, similar to MetaOCaml or Dotty's quotes (but of course with more features, like pattern matching).
This simpler mode of usage is less type-safe, but makes many things much easier and less intimidating for beginners. One has to use the OpenCode[T] type (which is a supertype of the contextual Code[T,C]), and use .unsafe_asClosedCode whenever one wants to run or compile the piece of code.
However, currently we emphasize the contextual mode of Squid as the default/normal mode, which is intimidating and may scare beginners away. Instead, I think we should promote the simpler interface as the default interface, and reserve the contextual interface for advanced users. This would mean:
Changing the type synonyms: OpenCode[T] is long-winded and a bit ugly; we'd like to use Code[T] instead. The new scheme would be: Code[T] for when you do not know the context; ClosedCode[T] for code that is assuredly closed and can be run; OpenCode[T,C] for open code for which we statically track the scope using contextual type C (for advanced users).
Alternatively, as I previously proposed in LPTK/squid-type-experiments, we could have a type synonym in to enable the nice Code[T] in C syntax (meaning Code[T]{ type Ctx >: C }), though that would likely require changes in Squid to look at the Ctx member of code types instead of their C type argument, which may cause weird typing problems down the line.
Changing the code types inferred by default: we should have a SimplePredef which is configured to make quasiquotes infer Code[T] types by default, instead of Code[T,C] types; even though OpenCode[T,C] <: Code[T], this still does help with type inference, such as when doing var c = code"0"; ... ; c = code"<open term>".
Splitting the basic tutorial in two: there should be a basic tutorial using simple Code[T] types and cross-quotation references, which are now supported, and a tutorial for advanced users that demonstrates the OpenCode[T,C] type and its interaction with path-dependent Variable[T]#Ctx types.
Hello,
I am having trouble setting up code navigation for Squid in VSCode. When I import the build, I keep getting error messages complaining about "repeated argument not allowed" when generating semanticdb. All tests passed when I ran bin/testAll.sh. I am using the master branch and the default build.sbt. Please let me know what I missed. Thanks!
failed to generate semanticdb for /*local-squid-path*/squid/src/main/scala/squid/OptimTestDSL.scala:
/*local-squid-path*/squid/src/main/scala/squid/OptimTestDSL.scala:58: error: repeated argument not allowed here
code"List(${ xs map (r => code"$f($r)"): _* })"
^
at scala.meta.internal.parsers.Reporter.syntaxError(Reporter.scala:16)
...
And the same error message repeated for others as well:
failed to generate semanticdb for /*local-squid-path*/squid/src/test/scala/squid/feature/Antiquotation.scala:
/*local-squid-path*/squid/src/test/scala/squid/feature/Antiquotation.scala:106: error: repeated argument not allowed here
eqt(code{List(${Seq(x,y):_*})}, code"List(1,2)")
failed to generate semanticdb for /*local-squid-path*/squid/src/test/scala/squid/feature/Quasicode.scala:
/Users/ztian/Desktop/squid/src/test/scala/squid/feature/Quasicode.scala:44: error: repeated argument not allowed here
eqt(ls, code"List(${seq: _*})")
...
Thanks,
Zilu
The legacy IRs implemented pretty-printing by converting the code into a Scala AST and then using the Scala pretty-printer!
Not ideal because it's slow, and because Scala's pretty-printer can be glitchy.
The new pretty-printer should:
'_' + counter starting from 0 (to avoid clashes);
Base), but with hooks to help print virtualized constructs such as if then else correctly (instead of as squid.lib.IfThenElse).
See:
scala> ir"val x = 42; println(x.toDouble)" fix_rewrite { case ir"($n:Int).toDouble" => ir"$n.toDouble+1" }
java.lang.StackOverflowError
on the other hand, this one does not exhibit the problem:
scala> ir"val x = 42; println(x.toDouble)" fix_rewrite { case ir"($n:Int).toDouble" => ir"$n.toDouble" }
Rewrite rules did not converge after 8 iterations.
ir"""{
val x_0 = 42;
scala.Predef.println(x_0.toDouble)
}"""
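A bounded fixed-point loop is one way to make a non-terminating rule report non-convergence instead of overflowing the stack. The sketch below is plain Scala and independent of Squid: fixRewrite and its Either-based result are hypothetical, and it detects convergence by equality of successive terms, which may differ from what fix_rewrite actually does.

```scala
object FixPointSketch {
  // Re-applies `step` until the term stops changing or the iteration budget runs out.
  // Right(term): converged. Left(term): budget exhausted, last term reached.
  @annotation.tailrec
  def fixRewrite[A](term: A, maxIterations: Int)(step: A => A): Either[A, A] =
    if (maxIterations <= 0) Left(term)
    else {
      val next = step(term)
      if (next == term) Right(term)
      else fixRewrite(next, maxIterations - 1)(step)
    }

  // A rule that eventually stops changing its input converges:
  val converged = fixRewrite(3, 8)(n => if (n > 0) n - 1 else n)

  // A rule that always produces something new (like adding "+1" on every pass) does not:
  val diverged = fixRewrite(0, 8)(_ + 1)
}
```

Because the loop is tail-recursive and bounded, a rule that keeps reintroducing its own pattern ends in a Left result after the budget is spent.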
There is no reason to handle AnyCode specially as is done now (a remnant of the experimental separation between simple and contextual QQs). It is in fact a soundness hole (see: val x: AnyCode[Int] = code"(?str:String).length"; code"$x".run).
We should direct users to use OpenCode in all cases where AnyCode is used today, and remove special handling in QuasiMacros, QuasiEmbedder, and related $Code functions in QuasiBase.
AnyCode expressions, which is unsafe
OpenCode instead of AnyCode
This doesn't work (but should (?))
ir{($$x: Int)}
The error message is "not found: value $$x"