jasmin-lang / jasmin Goto Github PK
View Code? Open in Web Editor NEWLanguage for high-assurance and high-speed cryptography
License: MIT License
Language for high-assurance and high-speed cryptography
License: MIT License
When copying a reg array to a bigger reg array, we get an assert false. I would expect a proper error message.
fn f () -> reg u64 {
reg u64 res;
reg u64[1] a;
reg u64[2] b;
b = a;
res = a[0];
return res;
}
Even stranger, the assert false is the same with reg ptr.
fn f () -> reg u64 {
reg u64 res;
reg ptr u64[1] a;
reg ptr u64[2] b;
b = a;
res = a[0];
return res;
}
The program is expected to be rejected, but the error message could be clearer.
export fn main (reg u64 x) -> reg u64 {
reg u64[10] t;
stack u64[5] a;
reg u64 res;
a = t[0:5];
res = a[0];
return res;
}
For loops whose bounds are not constant are not inlined. Consider, for instance:
export fn main (reg u64 io) -> reg u64 {
inline int i;
reg u64 res;
for i = 0 to (int)io {
res = i;
}
return res;
}
The problem is that some later passes assume that all for loops have been inlined. This program, for instance, is rejected with error:
Fatal error: exception File "src/liveness.ml", line 58, characters 14-20: Assertion failed
I think we should either support this kind of for loops or reject them directly during unrolling.
In the following code
fn f () -> reg u64 {
reg u64 r;
r = 0;
return r;
}
export fn main (reg u64 x) -> reg u64 {
reg u64 r;
f ();
r = x;
return r;
}
we get the warning warning: at line -1, introduce 1 _ lvalues
(instead of line 9). This seems to be due to the absence of assignment. Indeed, in this other example
fn f () -> reg u64, reg u64 {
reg u64 r;
r = 0;
return r, r;
}
export fn main (reg u64 x) -> reg u64 {
reg u64 r;
r = f ();
return r;
}
we correctly get a warning mentioning line 9.
A code like the following
param int N = 2;
u64[N] a = {1,2};
is not accepted by jasmin. Is it on purpose? I realize while writing this issue that we do not have a way to write initialization lists of parametrized size so it is probably of little interest anyway.
Currently, during lowering, we transform some instructions (in particular additions) if the immediate they take as an argument is too large to be taken directly. We then introduce a move of this immediate to a register, and replace the immediate by this register in the original instruction. This approach has several drawbacks:
When compiling the following program in the glob_array3
branch
inline fn sum (stack u64[2] a) -> reg u64 {
reg u64 r;
r = a[0];
r += a[1];
return r;
}
export fn main (reg u64 x) -> reg u64 {
reg u8 r;
stack u64[4] a;
reg u64 res;
inline int i;
r = 0;
for i = 0 to 4 {
a[i] = 3;
}
res = sum(a);
return res;
}
we first get a warning can not ensure that the type U64[4] is compatible with U64[2]
that is not fatal, and then Fatal error: exception File "src/alias.ml", line 122, characters 2-8: Assertion failed
. The error is expected but could be clearer.
Note that if we try a more naive error:
stack u64[4] a;
stack u64[2] b;
a = b; // or b = a;
we get the same warning, but then a typing error. Maybe we should get the same typing error in the first case.
I must admit that the non-fatal warning, followed in both cases by a fatal error due to the same problem, is rather confusing. Is the warning redundant? Or are there cases where the warning is triggered and the compilation still possible?
In the following code
export fn main (reg u64 x) -> reg u64 {
stack u64[3] a;
stack u64[2] b;
b = a[2:2];
x = b[0];
return x;
}
we try to get a too large slice of an array. This is rejected as expected, but the message is a bit terse: At line 4: range_in_slice: range not included
. This should probably mention at least the source array.
When trying to print some iinfo
, I get this warning. Maybe a variable introduced by the compiler not the right way. I will try to give a minimal example later.
For now we use AT&T syntax, it could be interesting to be able to generate Intel syntax.
The program is admittedly ugly, but it still produces an error.
fn f(reg ptr u64[10] x) -> reg ptr u64[8] {
return x;
}
export fn main (reg u64 x) -> reg u64 {
stack u64[15] a;
reg u64 res;
a[0:8] = f (a[0:10]);
res = a[0];
return res;
}
When trying to define (in glob_array3
branch) a non-inlined function taking as an argument a reg array, for instance:
fn sum (reg u64[2] a) -> reg u64 {
reg u64 r;
r = a[0];
r += a[1];
return r;
}
export fn f (reg u64 x, reg u64 y) -> reg u64 {
reg u64[2] a;
reg u64 res;
a[0] = x;
a[1] = y;
res = sum (a);
return res;-
}
we get the following imprecise error: : line -1 (-1): variable a.143 is an array (function argument)
. The location should be improved.
It would be nice to have emacs/vim/sublime... support. Is there already something lying around somewhere? What kind of features can we have without doing something too complex?
With the following code
fn f (reg ptr u64[3] r) -> reg u64 {
reg u64 res;
stack u64[3] s;
s = r;
res = s[0];
return res;
}
we get the error cannot put a reg ptr argument into the local stack
but with no location, so it can be difficult to know what is the source of the error.
This is not trivial to solve since the code that emits this error does not have the locations at hand.
In the following code
inline fn f (reg mut ptr u64[3] x) {
x[0]=2;
}
export fn main (reg u64 io) -> reg u64 {
return io;
}
we get an error at the regalloc pass, which is sensible, but located at line 3 (the end of function f
), which is not. The right line should probably be main
's declaration, i.e. line 5.
If we try to call an export function, we do not get a clear message "cannot call an export function". Instead, we can get:
regalloc.ml
export fn f(reg u64 arg) -> reg u64 {
reg u64 res;
res = arg;
return res;
}
export fn main () -> reg u64 {
reg u64 res x;
x = 2;
res = f (res);
res += x;
return res;
}
linearisation: (one_varmap) nowhere to store the return address
export fn f(reg u64 arg) -> reg u64 {
reg u64 res;
res = arg;
return res;
}
export fn main () -> reg u64 {
reg u64 res;
res = f (res);
return res;
}
Is this something useful? Are the options complex enough so that it is worth spending some time on that?
With uninitialized stack variables, we get an assert false in src/varalloc.ml
at line 199.
Example:
fn f () -> reg u64 {
stack u64 s1;
reg u64 res;
res = s1;
return res;
}
The following program returns zero:
export
fn test() -> reg u8 {
reg u8 a;
a = 256;
return a;
}
On the following example
fn f (reg ptr u64[2] r) -> reg u32 {
reg u32 res;
stack u64[2] a;
stack u32[10] b;
a[0]=0;
b[1:2] = a[u32 0:2];
res = b[1];
return res;
}
jasmin returns an error about a bad alignment. This is because the stack alloc oracle (in OCaml) decides to align one slot to contain both a
and b
aligned on 64 bits and with b
at the beginning of the slot. Due to the assignment b[1:2] = a[u32 0:2];
, a
is aligned on 32 bits but not 64, thus it cannot perform a[0]=0
.
The allocator could rather decide to add some padding, either before or inside the slot. In that second option, it could allocate a larger slot and put b
32 bits after the start of the slot. Then a
would be aligned on 64 bits and the assignment a[0] = 0
would succeed.
This is really low-priority. This is an artificial example that I created to trigger the "bad alignment" error.
In the following example
export fn main () -> reg u64 {
reg u64 res;
stack u64[3] a, b;
a[0:2] = b[1:2];
res = a[0];
return res;
}
the OCaml oracle associates a partly negative range to variable b
. Then the Coq checker fails due to the negative starting index.
I am not sure what to do here. Should the OCaml oracle be patched to fail early? Or do we consider that this is normal, but just need a bettor error message from the Coq checker?
With cwd in compiler/
the following opam pin add -n jasmin .
command fails with this error:
Package jasmin does not exist, create as a NEW package? [Y/n] y
[jasmin.~dev] synchronised from file:///home/daan/jasmin/compiler
[WARNING] Failed checks on jasmin package definition from source at
file:///home/daan/jasmin/compiler:
error 57: Synopsis and description must not be both empty
jasmin is now pinned to file:///home/daan/jasmin/compiler (version 0.1)
cc @ildyria
Easy to reproduce on glob_array3
:
u64[4] g = {1};
Should instead trigger a nice error.
When compiling the following program
export fn main () -> reg u256 {
reg u256 xmm;
reg u64 x;
x = 2;
xmm = #VMOV_256(x);
return xmm;
}
we get a typing error because xmm
is of type u256
instead of u128
. Indeed, the annotation _256
is silently dropped. It would be better to have an error in this case. Maybe it was done like this to avoid another constructor PrimXXX
?
It was decided to release two versions in a short period of time. The first one, called "jasmin 0" (finally named 21.0) will contain what is currently in master. It is needed since we are about to merge glob_array3 in master and some people may want to keep working with the older version. It will also serve as a rehearsal for the second release. The second one, probably called "jasmin 2022.01" (finally called 22.0) (we need to synchronize with EasyCrypt) will contain what is currently in glob_array3.
This issue is an attempt to list the things that we want to do before the releases.
The recent progresses about randomness and the support of multiple architectures were considered not mature enough to be in the release (this needs to be 100% confirmed for randomness, since the developers of libjade were really requesting this feature). They will be released in a next release in a few months (since we don't have a rigid agenda, and these are rather major features, nothing prevents us from making a new release as soon as these features are added in master).
The file tests/fails/spill.jazz
reveals a bug in the register allocation. The allocator tries to use RAX
for the return address of the called function gosub
while it is already used in the callee main
. Apparently it does not take into account all the registers that are live in the callee.
Consider this example:
fn f (reg ptr u64[2] io) -> reg ptr u64[2], reg ptr u64[2] {
return io, io;
}
export fn main (reg u64 io) -> reg u64 {
inline int i;
reg u64 res;
stack u64[2] a;
a[0] = 1;
a[1] = 2;
a, a = f (a);
res = a[0];
return res;
}
Function f
returns its input both as result 1 and result 2. The pass makeReferenceArguments introduces two different reg ptr
for the two results (plus one for the arg), giving:
rp1, rp2 = f (rp0);
a = rp1;
a = rp2;
Then stack alloc rejects the program due to a = rp1
because rp1
is not valid (because of rp2
) and a copy of an array is allowed only if that array is fully valid (that could be changed, but this is how it works now).
Was the transformation of makeReferenceArguments the right one in this case? It could have used the same reg ptr
in the results of f
and then only one assignment a = rp
. Or we could just reject this kind of programs, because it can be a source of headache while bringing no real feature (since the second result shadows the first one, why would we want to do that?).
When accessing a reg array at an index that is not known at compile time, the compiler fails on an assert false
. The program is expected to be rejected, but the error message could be clearer.
export fn main (reg u64 x) -> reg u64 {
reg u64[10] t;
reg u64 res;
inline int i;
res = 3;
t[(int)res] = 0;
res = t[0];
return res;
}
The following code does not type check (typing.ml) because pretyping does not insert the good "type annotation" in the assignment.
inline fn test(reg u64 buffer) -> inline int
{
inline int i j;
i = 0;
j = 1 << i;
return j;
}
Using nix-shell to setup all dependencies and build the compiler (in the compiler folder) does not work.
As indicated by the following output it is missing a dependency (yojson):
ocamlbuild -use-ocamlfind -j 2 "entry/jasminc.native"
Finished, 0 targets (0 cached) in 00:00:00.
+ ocamlfind ocamlc -rectypes -c -g -bin-annot -w Y -w Z -w -3 -w -23 -w +28 -w +33 -w -40 -w -58 -package 'batteries, menhirLib, zarith, apron.octMPQ, apron.ppl, apron.boxMPQ, yojson' -I CIL -I src -I src/safety -I src/safety/domains -o src/utils.cmi src/utils.mli
ocamlfind: Package 'yojson' not found
Command exited with code 2.
Compilation unsuccessful after building 4 targets (3 cached) in 00:00:00.
make: *** [Makefile:55: native] Error 10
I have created a pull request #10 to add the missing dependency (yojson).
Benjamin thinks that the Loop.nb
could be axiomatized on the Coq side and set on the OCaml side on a higher constant (500, 1000?) that could be customized by the user using a command line argument. The error should then be adapted to tell the user that the command line option exists.
When lowering does not know what to do for a given instruction, it returns it without touching it. The following passes in the compiler then receive a sometimes nonsensical instruction. It would be cleaner if lowering failed in these cases.
Hi, I'm trying to run a simple example from this repo but I keep getting the error below.
I think I had it running properly before but now I can't figure out what I have done differently before.
> /app/easycrypt/ec.native compile -I Jasmin:/app/jasmin/eclib/ addNv3_ct_proof.ec
[critical] [: line 1 (0-24)] In external theory AddNv3_ct [./addNv3_ct.ec: line 2 (0-33)]:
In external theory JModel [: line 1 (0) to line 4 (58)]:
In external theory JWord [<unknown> location]:
unknown type `t'
[|] [0002] 4.6 % (-1.0B / [frag -1.0B])
Would be cool to get some hints on what I am possibly doing wrong.
Thanks
I was playing with a toy example (on the glob_array3
branch) and get the error in the title. Probably, the example makes no sense, but I was asked to report it, so here it is.
export fn main (reg u64 x) -> reg u64 {
stack u64[10] t;
reg u64 res;
inline int i;
inline int j;
for i = 0 to 10 {
t[i] = i;
res = 2 * i;
j =(int) res;
t[j] += x;
}
res = t[0];
return res;
}
After discussing with Benjamin, it seems the behaviour of the Jasmin compiler regarding the #SETcc
instruction/intrinsic is not intuitive at times.
Consider the following block of code:
export fn test(reg u64 p p1 p2) -> reg u8
{
reg u8 s t8;
reg bool cf sf of zf pf;
of, cf, sf, pf, zf = #CMP(p, p);
s = #SETcc(cf || zf);
return s;
}
which compiles correctly and checks if p <= p
using the SETBE
instruction.
If the SETcc
instruction is called as follows (swapping the cf
and zf
flags):
s = #SETcc(zf || cf);
the compiler returns the following error:
compilation error in functions test and test at position line 7: assembler error Invalid condition (BE)
although cf || zf
and zf || cf
are logically equivalent.
Currently, the compilation of jasminc
depends on proof files (psem.v, asmgen.v, linear_proof.v, etc.). This is due to the algorithmic and proof parts being mixed in such files. Extracting these parts and moving them, or splitting the files, should help clean the architecture and reduce the compilation time for those who do not need the proofs.
One difficult point is allocation.v, since it plays with functors. This seems doable, but the fully satisfying solution has not been found yet.
Dead code is aggressive by default. This is not necessarily a bad thing, but flag AT_keep
should prevent that. Is it working properly?
The program
export fn main () -> reg u64 {
reg u64 res;
return res;
}
fails to compile with the error message
"", line 1 (0) to line 4 (1):
internal compilation error in function main:
one-varmap checker: not able to ensure equality of the result
Please report at https://github.com/jasmin-lang/jasmin/issues
The compiler should probably fail more gracefully.
Consider the following code, where function f1_while_scale
adds all elements from global array c
. The safety checker does not find any violation for this example.
u16[16] c =
{ 1,1,1,1,
2,2,2,2,
3,3,3,3,
4,4,4,4
};
inline fn f1_while_scale() -> reg u16
{
reg u64 i;
reg u16 r;
reg ptr u16[16] cp;
i = 0;
r = 0;
cp = c;
while(i < 16)
{ r += cp[(int) i];
i += 1;
}
return r;
}
export fn e1_while_scale() -> reg u16
{
reg u16 r;
r = f1_while_scale();
return r;
}
Now consider a different implementation of the previous program where cp[(int) i]
is replaced by cp.[(int) i]
(no scale factor) and variable i
is incremented by 2. The loop condition is also changed accordingly.
u16[16] c =
{ 1,1,1,1,
2,2,2,2,
3,3,3,3,
4,4,4,4
};
inline fn f2_while_no_scale() -> reg u16
{
reg u64 i;
reg u16 r;
reg ptr u16[16] cp;
i = 0;
r = 0;
cp = c;
// i < 32 : *** Possible Safety Violation(s): line 23: in_bound: cp.[U16 ((int of u64) i) ] (length 32 U8)
// i < 31 : *** No Safety Violation
// i < (32-1) : *** No Safety Violation
// i < ((16*2) - 1) : *** Possible Safety Violation(s): line 23: in_bound: cp.[U16 ((int of u64) i) ] (length 32 U8)
while(i < 32 )
{ r += cp.[(int) i];
i += 2;
}
return r;
}
export fn e2_while_no_scale() -> reg u16
{
reg u16 r;
r = f2_while_no_scale();
return r;
}
In this last example, when the condition of the loop is i < 32
, a "possible safety violation" warning is printed, which makes sense if we assume that, and in this case, only the loop condition taken into consideration during the analysis (i < 32
) and thus, using just that information, i
could actually be 31
. If i
is 31
, the u16
read from the expression cp.[(int) i] is invalid.
If we change the condition to i < 31
(which does not change the computation since i
is pair) the warning disappears. It would be interesting if the safety analysis could take into account how loop variables get incremented/decremented.
I pushed some files here: https://github.com/tfaoliveira/jasmin/tree/master/safetycheck-u16
safetycheck-u16
folder, there is a sequence of examples (1..7.jazz) that can be used to test the safety checker for the described issue. Examples that currently fail are: 2.jazz (shown here); and 5.jazz, 6.jazz and 7.jazz (unrolled versions of the previous examples that also include an external pointer in the computation).make
if jasminc
is in the PATH.Benjamin finds that the inductive type used until now is cumbersome. He suggests using boxes à la Format
module of OCaml, that allows to print as much information as we want.
It is explored in branch pp_error
.
If we have time, we can try to add colours...
In the following code
fn f () -> reg u64 {
reg u64 r;
return r;
}
export fn main (reg u64 x) -> reg u64 {
reg u64 r;
r = f ();
return r;
}
the regalloc pass fails with the message Regalloc: unknown variable r.128
. If we initialize r
in function f
, the error disappears. Maybe the error should mention an uninitialized variable instead of an unknown one?
If the size of the return type of a function does not match the number of objects returned, such as in the following case
fn sum (reg u64[4] a) {
reg u64 r;
r = a[0];
return r;
}
the error mentions an invalid number of arguments
(in the example 0 provided instead of 1
). It seems that arguments and results are treated in the same way. This is perhaps interesting from the developer point of view, but confusing from the user one.
The compiler build instructions in compiler/README.md seem a little out of date.
This step:
opam repo add coq-extra-dev https://coq.inria.fr/opam/extra-dev
requires Opam 2.0. After scratching my entire install and re-installing everything with Opam 2.0, I get through the dependency setup, but the build fails like so:
cwinter@MSRC-4079444:/mnt/f/dev/jasmin/compiler$ make CIL build
rm -f CIL/*.ml CIL/*.mli ../proofs/extraction.vo
make -C ../proofs extraction
make[1]: Entering directory '/mnt/f/dev/jasmin/proofs'
rm -f lang/ocaml/*.ml rm -f lang/ocaml/*.mli
touch lang/extraction.v && make -f Makefile.coq lang/extraction.vo
make[2]: Entering directory '/mnt/f/dev/jasmin/proofs'
COQDEP lang/extraction.v
COQC lang/gen_map.v
File "./lang/gen_map.v", line 496, characters 28-32:
Error:
Cannot find a physical path bound to logical path matching suffix
<> and prefix CoqWord.
Makefile.coq:365: recipe for target 'lang/gen_map.vo' failed
make[2]: *** [lang/gen_map.vo] Error 1
make[2]: Leaving directory '/mnt/f/dev/jasmin/proofs'
Makefile:15: recipe for target 'extraction' failed
make[1]: *** [extraction] Error 2
make[1]: Leaving directory '/mnt/f/dev/jasmin/proofs'
Makefile:59: recipe for target 'CIL' failed
make: *** [CIL] Error 2
So, I guess, whatever library contains CoqWord
is missing?
If we compile the following function in the glob_array3
branch
inline fn sum (stack u64[2] a) -> stack u64 {
reg u64 r;
r = a[0];
r += a[1];
return r;
}
the stack
storage in the type is simply ignored and replace by reg
. If we replace stack
with either reg ptr
or stack ptr
, we get the same result. This is a bit confusing.
Maybe this is done on purpose to have a uniform syntax, and it is known that the storage modifier in the return type of a function is ignored and instead inferred from the body of the function, in which case this issue can be marked as invalid, but I prefer to ask.
The following program produces an obscure error message:
inline
fn f(reg u8 a b) {
a = #ADD_u8(a, b);
}
Fatal error: exception End_of_file
If we copy a stack array into a reg array, we can easily get errors. It is enough to have the size of the target array (in bytes) not a multiple of the cell size of the source array.
If the target array is so small that is cannot store even one element of the source array, we have an internal compilation error.
fn f () -> reg u64 {
reg u64 res;
stack u16[1] a;
reg u256[10] b;
a = b;
res = (64u)a[0];
return res;
}
// internal compilation error in function f:
// array copy: bad type for copy
If the target array can store at least one element of the source array, we have a type error.
fn f () -> reg u64 {
reg u64 res;
stack u16[33] a;
reg u256[10] b;
a = b;
res = (64u)a[0];
return res;
}
// typing error: the left value a has type u16[33] while u8[64] is expected
The error message is not perfect, because it suggests a good size, but there are other solutions (e.g. u16[16]
, and even u16[48]
if we consider also larger arrays).
The following program
u64 foo = 4;
export fn main () -> reg u64 {
stack u64 y;
reg u64 ret;
y = 42;
ret = y + foo;
return ret;
}
fails with
line 7 (4-18):
internal compilation error in function main:
asmgen: (compile_arg) not compatible asm_arg
Please report at https://github.com/jasmin-lang/jasmin/issues
Presumably this is because y
is stack allocated and the program should be rejected more gracefully.
Hi Jasmin team
Im currently writing a fuzzer for the Jasmin compiler I haven't gotten it to crash yet but I've got this error message (and more like them):
compilation error in functions f0 and f0 at position line 14: assembler error Invalid pexpr for word
PLEASE REPORT
The source Jasmin code is auto generated. And line 14 is:
v6 +=(32123*((-(-v6))*32123));
I haven't been able to figure out if this way of assign a variable to itself is illegal in Jasmin or what?
Best regards Thor
Consider this example
export fn f (reg u128 x y) -> reg u128 {
reg u128 z;
z = (32u)#VPAND(x, y);
return z;
}
This example compiles fine. The cast (32u)
is silently ignored. I think this is done on purpose, but really confusing for the user that has no feedback on what happens. An error would be more user-friendly. Or a warning explaining that this is ignored.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.