asm-js / validator
A reference validator for asm.js.
License: Apache License 2.0
Hi,
The second question of the FAQ mentions "C programs compiled to asm.js", but that doesn’t make sense. Is that sentence missing something like "with emscripten"?
Allow empty for-loop clauses in the spec.
Allow empty statements in the spec.
The switch rules should be able to tell the switch block always returns if the last case always returns, even if it's not a default.
In the README example, for the code `((((x+1)|0)>>>0)/(x|0))>>>0;`, the first fragment `A == (((x+1)|0)>>>0)` returns an unsigned and the second, `B == (x|0)`, returns a signed. This causes the divide `A/B` to have a sign mismatch. What's the desired behavior of this expression? Should it fail? If so, what should it be changed to?
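To make the mismatch concrete, here is a minimal sketch in plain JavaScript (not validated asm.js, and not a claim about what the validator should accept). It evaluates the README expression next to two variants whose operands agree in signedness. The function names are hypothetical.

```javascript
// Plain-JavaScript evaluation of the README expression and two
// sign-consistent variants. Function names are hypothetical.
function mixedDiv(x) {
  // unsigned numerator, signed denominator (the README form)
  return ((((x + 1) | 0) >>> 0) / (x | 0)) >>> 0;
}
function unsignedDiv(x) {
  // both operands reinterpreted as unsigned
  return ((((x + 1) | 0) >>> 0) / ((x | 0) >>> 0)) >>> 0;
}
function signedDiv(x) {
  // both operands kept signed
  return (((x + 1) | 0) / (x | 0)) | 0;
}
console.log(mixedDiv(-2), unsignedDiv(-2), signedDiv(-2));
```

For `x = -2` the three forms give 2147483649, 1 and 0 respectively, so the choice of signedness changes the result.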
The code:
function mymodule(global, foreign, buffer) {
  "use asm";
  var g_i = 0;
  return { g_export: g_i };
}
fails to validate with the error: expected import binding or heap view declaration, got Literal node
This seems to be simply because the validator does not check for this case: depending on the location, it tries to verify the declaration as either an import or an export.
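As far as I can tell from the spec, a module may only export functions, so exporting the variable itself would be rejected either way. A minimal sketch of a workaround, assuming an accessor function is acceptable (`get_g` is a hypothetical name):

```javascript
// Sketch: expose the module-level variable through an accessor function
// instead of exporting it directly. get_g is a hypothetical name.
function mymodule(global, foreign, buffer) {
  "use asm";
  var g_i = 0;
  function get_g() {
    return g_i | 0; // signed return annotation
  }
  return { g_export: get_g };
}
var m = mymodule(globalThis, {}, new ArrayBuffer(0x10000));
console.log(m.g_export());
```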
Why is it that the `return` statement is not valid if using the unsigned type annotation (`>>> 0`)?
http://asmjs.org/spec/latest/#return-type-annotations
This means that we can't return unsigned integers.
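A possible workaround, sketched under the assumption that the caller lives outside the module: return the same 32 bits with a signed annotation and reinterpret them as unsigned on the JavaScript side. The module and function names are hypothetical.

```javascript
// Sketch: return the bit pattern with a signed annotation and let the
// caller reinterpret it as unsigned. Names are hypothetical.
function unsignedModule(global, foreign, buffer) {
  "use asm";
  function allOnes() {
    var x = 0;
    x = -1;
    return x | 0; // signed return annotation
  }
  return { allOnes: allOnes };
}
var u = unsignedModule(globalThis, {}, new ArrayBuffer(0x10000));
var asUnsigned = u.allOnes() >>> 0; // reinterpret outside the module
console.log(asUnsigned);
```

No information is lost: the signed value -1 and the unsigned value 4294967295 share the same bit pattern.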
I might be missing something, but I would expect to see 4294967295 and not -1 in the following test case. It appears that the unsigned conversion (-1 >>> 0) is lost if it's passed directly to an external function. It works fine if you assign it to a variable first.
function asmModule(global, env, buffer) {
  "use asm";
  var U4 = new global.Uint32Array(buffer);
  var print = env.print;
  function foo() {
    // outputs -1, but if we assign this to a variable and print it we get the expected 4294967295
    print(-1 >>> 0);
  }
  return { foo: foo };
}
var fast = asmModule(
  this,
  { print: function(x) { print(x); } },
  new ArrayBuffer(1024*8)
);
fast.foo();
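One way to sidestep the question is to convert the unsigned value to a double before it leaves the module. This is a sketch based on my reading of the spec (only signed and double values may be passed to external functions). It has been reasoned through for plain JavaScript only, not checked against the engine from the report. The `seen` array is a hypothetical stand-in for `print`.

```javascript
// Sketch: coerce the unsigned value to a double with unary + before
// passing it to the external function.
function asmModule(global, env, buffer) {
  "use asm";
  var print = env.print;
  function foo() {
    var x = 0;
    x = -1;
    print(+(x >>> 0)); // unsigned -> double before crossing the FFI
  }
  return { foo: foo };
}
var seen = []; // hypothetical stand-in for a print function
var fast = asmModule(
  globalThis,
  { print: function (v) { seen.push(v); } },
  new ArrayBuffer(1024 * 8)
);
fast.foo();
console.log(seen[0]);
```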
Known library functions like Math.sin should have function types so their result values can be used.
C allows floats to be used as conditionals. Decide whether we should allow that too, in which case the conditionals in the type checker should check for \number rather than \int.
The division operator is the trickiest part of the unification algorithm. Work this one out in the validator implementation first. This will help drive out the spec details.
Constants should almost never need an annotation, unless they are large constants that could either be an int32 or uint32 and they are used with an overloaded operator.
It would be great if DataViews could be used to access the memory buffer. Any reason why they're not included?
We can replace 64-bit integers with a simple FFI that allocates them in a separate external ArrayBuffer and returns indices. It'll be more expensive but it's simpler for now. We can re-add them later if necessary. But they'd be better with ES7 value types anyway.
Do ranking and path compression for tvar unification (see Issue #27) to keep resolve chains down to inverse-Ackermann.
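For reference, a minimal union-find sketch with union by rank and path compression. The representation (integer ids for type variables, `parent` and `rank` arrays) is an assumption for illustration, not the validator's actual data structure.

```javascript
// Minimal union-find over type variables with union by rank and path
// compression. Integer ids and array storage are illustrative assumptions.
function makeUnifier(n) {
  var parent = [], rank = [];
  for (var i = 0; i < n; i++) { parent.push(i); rank.push(0); }

  function find(v) {
    var root = v;
    while (parent[root] !== root) root = parent[root];
    // Path compression: point every node on the path at the root.
    while (parent[v] !== root) {
      var next = parent[v];
      parent[v] = root;
      v = next;
    }
    return root;
  }

  function union(a, b) {
    var ra = find(a), rb = find(b);
    if (ra === rb) return ra;
    // Union by rank: attach the shallower tree under the deeper one.
    if (rank[ra] < rank[rb]) { var t = ra; ra = rb; rb = t; }
    parent[rb] = ra;
    if (rank[ra] === rank[rb]) rank[ra]++;
    return ra;
  }

  return { find: find, union: union };
}
```

A real unifier would also store the resolved type at each root; that part is left out here.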
asm.js doesn't know about destructuring; make sure everything that could be a Pattern is just an Identifier.
Add support for function pointers.
Define a surface syntax that looks more like a traditional assembly or bytecode syntax.
Add support for uint8, uint16.
Many operations, like the binary operators, provide a context for their arguments that already performs a coercion. So we should eliminate the need for annotations in those cases.
Add support for binary data types and objects.
Could we consider the inclusion of string types?
It may be an immutable subtype of `extern`.
String type marker:
arg = ""+arg;
It would have the following expressions:
`str.length`: size of the string (a constant). Type: `unsigned`.
`str.charCodeAt(index)`: character code, a nonnegative integer less than 2^16. Type: `(unsigned) → fixnum`.
This is motivated by the idea that converting an external string into a typed array before passing it in an asm.js exported function is probably not the most efficient approach. Yet, in my experience, string parsing in JS can be boosted a lot.
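For context, this is roughly the copy step the proposal wants to avoid: writing a string's UTF-16 code units into a typed-array view of the heap before calling into the module. It is a sketch; `copyStringToHeap` and its parameters are hypothetical names.

```javascript
// Sketch of the conversion step: copy a string's UTF-16 code units into a
// Uint16Array view of the heap. byteOffset must be a multiple of 2.
// copyStringToHeap is a hypothetical helper.
function copyStringToHeap(str, buffer, byteOffset) {
  var view = new Uint16Array(buffer, byteOffset, str.length);
  for (var i = 0; i < str.length; i++) {
    view[i] = str.charCodeAt(i);
  }
  return str.length; // number of code units written
}
var heap = new ArrayBuffer(0x10000);
var n = copyStringToHeap("asm", heap, 0);
console.log(n, new Uint16Array(heap, 0, n)[0]);
```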
Right now the validator is forced to perform inference to compute the result types of the functions. Come up with a policy that makes it possible to compute the function types eagerly.
Create a front end for the syntax defined by Issue #13 that translates the surface syntax into asm.js.
It would be great if typed arrays could be used as parameters for functions written in asm.js.
Why?
asm.js would be an ideal candidate for speeding up image processing algorithms. To accomplish that, you need some way to pass the image data to the asm.js code. The image data is usually provided by .getImageData(), which returns an object containing the image in a Uint8ClampedArray.
It would be great if we could pass this data directly to asm.js.
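Until something like that exists, the pixel data has to be copied into the module's heap first. A minimal sketch of that copy, where a hand-built `Uint8ClampedArray` stands in for the `data` property returned by `getImageData()`; `loadPixels` and the offset are hypothetical.

```javascript
// Sketch: copy pixel data into the module's heap before calling into
// asm.js code. loadPixels and the byte offset are hypothetical.
function loadPixels(pixels, buffer, byteOffset) {
  var heapU8 = new Uint8Array(buffer);
  heapU8.set(pixels, byteOffset); // bulk copy into the heap
  return pixels.length;
}
// Two RGBA pixels standing in for getImageData().data.
var pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 255]);
var heapBuf = new ArrayBuffer(0x10000);
var count = loadPixels(pixels, heapBuf, 16);
console.log(count, new Uint8Array(heapBuf)[16]);
```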
Formula: `a[(p&K)>>k]`, where `a` is a `b`-bit array type, `k` = `b/8` and `K` is a power of 2 minus 1.
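A minimal sketch of the masked access pattern for a 32-bit view. `HEAP_BYTES`, `MASK` and `readWord` are hypothetical names. The example shifts by 2, the base-2 logarithm of the 4-byte element size, which is my assumption about the intended relationship between the shift and the element width.

```javascript
// Sketch of the masked access pattern for a 32-bit view.
// HEAP_BYTES, MASK and readWord are hypothetical names.
var HEAP_BYTES = 0x10000;   // heap size, a power of two
var MASK = HEAP_BYTES - 1;  // a power of 2 minus 1
var HEAP32 = new Int32Array(new ArrayBuffer(HEAP_BYTES));
function readWord(p) {
  // Mask the byte offset into range, then shift by 2 (4 bytes per element).
  return HEAP32[(p & MASK) >> 2] | 0;
}
HEAP32[3] = 42;
console.log(readWord(12), readWord(12 + HEAP_BYTES));
```

Both calls read element 3: the mask folds the out-of-range offset back into the heap, so the index can never fall outside the view.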
Add update operators to the spec.
Implement type-variable unification and come up with a test case or 7.
Hi,
In http://asmjs.org/spec/latest/#programming-model one can read:
HEAP32[p >> 2]|0
The shift converts the byte offset to a 32-bit word offset, […]
Shouldn’t that be `p >> 4`? 4 being the byte size of a 32-bit word.
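A quick check in plain JavaScript may help here. A right shift by n divides by 2^n, so `p >> 2` divides the byte offset by 4, while `p >> 4` would divide it by 16. The buffer contents below are made up for the example.

```javascript
// Check that p >> 2 maps a byte offset to the index of the 32-bit word
// that contains it. Buffer contents are made up for the example.
var buf = new ArrayBuffer(16);
var words = new Int32Array(buf);
words[2] = 7;          // the third 32-bit word starts at byte offset 8
var p = 8;             // byte offset
console.log(p >> 2, words[p >> 2], p >> 4);
```

With `p = 8`, `p >> 2` is 2 and reads the value 7, whereas `p >> 4` is 0.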
Allow var declarations nested in blocks, as long as their definition dominates uses and they aren't used in outer blocks.
In Luke's variation of the semantics:
`bits32` is not a subtype of `float64`
`~~` (float -> int) and `+` (int -> float)
`bits32`, not `int32` or `uint32`
`bits32` or `float64`, not `int32` or `uint32`
We have the following type signatures:
(_+_)|_ : bits32 x bits32 x bits32 -> int32
(_+_) : float64 x float64 -> float64
(_|_) : bits32 x bits32 -> int32
~~_ : float64 -> int32
~_ : bits32 -> int32
+_ : bits32 -> float64
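As a rough illustration of these signatures, here is how the corresponding operators behave in plain JavaScript. The variable names are only for the example.

```javascript
// Plain-JavaScript behaviour of the coercion operators in the signatures.
var toInt = ~~3.7;                   // float64 -> int32, truncates toward zero
var toIntNeg = ~~-3.7;               // truncation, not floor
var wrapped = (2147483647 + 1) | 0;  // addition followed by |0 wraps to int32
var toDouble = +(7 | 0);             // int -> float64
var ored = 5 | 2;                    // bitwise or yields an int32
console.log(toInt, toIntNeg, wrapped, toDouble, ored);
```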
Everything that extracts the type from a \kappa_X should also check that it's a valid X.
Case expressions should only take literals.
Add support for sign extension operations.
Just a bug so that we don't forget to do this, and it is pretty obscure right now in the spec how to do double->integer conversion.
I'd recommend adding a note at the bottom in section 8 about it, even if it's not a single operator.
Implement global variables in the validator.
Implement an asm.js verifier that takes JS source and a set of types for the outer variables and checks the source to see if it is valid asm.js.
To ensure zero-bailing of array indexing, the validator should require the indices to be &-masked.
The binary operations should have custom rules for different types, rather than using the sloppy and incorrect `type` metafunction.
There should be a rule allowing coercing the result of an FFI call to a machine.js type.
As a compiler author, I'd really like to see the details of the FFI specified so I can understand what kind of interactions are possible across the FFI boundary between normal JS and asm.js. My compiler doesn't use a virtual heap like emscripten does so at present it's hard to tell whether I could use asm.js at all, even if my individual functions are expressible in asm.js's subset of the language (and would benefit from the resulting performance improvements).
A few sample questions:
Is it possible for JS objects to cross the FFI boundary? If no, what happens when they cross it? A runtime error?
Is it possible to load properties from JS objects inside of asm.js? (Guessing no)
What about property stores?
As a baseline, I could probably do everything I want with the ability for JS objects to cross the FFI boundary, even if they end up as opaque void * style data that is simply passed back out to other FFI functions (and stored in locals). For example, this would let me generate asm.js code for many methods of C# structs and classes that take a 'this' reference as an opaque JS object, and for the relatively rare cases where JS object interactions are needed, I could call getter/setter functions through the FFI, passing the this-reference in.
Being able to read/write ordinary properties of JS objects would be a huge boon too, but isn't necessary if I can bridge with the FFI. Even constrained access to properties (i.e. no prototype chain support, for example) would be great.
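One way the opaque-reference idea could be approximated today is a handle table: the objects stay on the JavaScript side and only integer handles cross the FFI. This is a sketch under that assumption, not something the spec describes, and every name in it is hypothetical.

```javascript
// Sketch: keep JS objects in a table on the JavaScript side and pass only
// integer handles across the FFI. All names here are hypothetical.
var handles = [null]; // index 0 is reserved as a "null" handle
function toHandle(obj) { handles.push(obj); return handles.length - 1; }
function getX(h) { return +handles[h].x; }  // getter called through the FFI
function setX(h, v) { handles[h].x = v; }   // setter called through the FFI

function pointModule(global, foreign, buffer) {
  "use asm";
  var getX = foreign.getX;
  var setX = foreign.setX;
  function doubleX(self) {
    self = self | 0; // the this-reference is just an int handle
    setX(self | 0, +getX(self | 0) * 2.0);
  }
  return { doubleX: doubleX };
}
var mod = pointModule(globalThis, { getX: getX, setX: setX }, new ArrayBuffer(0x10000));
var pt = { x: 21 };
mod.doubleX(toHandle(pt));
console.log(pt.x);
```

The cost is one FFI call per property access, which is the trade-off the message already anticipates.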
The switch discriminant should only allow extern types.
Add support for pure void pointers.
The validator seems to be broken. First of all, the `package.json` file says `"main": "lib/asm.js"`, but `asm.js` is in `tests/` and not `lib/`. After moving the file to `lib/` I get the following error message:
Error: Cannot find module './tables'
at Function.Module._resolveFilename (module.js:338:15)
at Function.Module._load (module.js:280:25)
at Module.require (module.js:364:17)
at require (module.js:380:17)
at Object.<anonymous> (/home/ruediger/develop/demos/node/node_modules/asm.js/lib/validate.js:8:14)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:364:17)
The require is in `validate.js`: `var tables = require('./tables');`. But there is no `tables.js` available. The problems seem to have been introduced in 182e4b9.
Hi,
It would be nice to have a link to this tracker (or whatever is the preferred way to send feedback) on http://asmjs.org/spec/latest/
I only found it because http://asmjs.org/spec/ had an error message that hinted that the project is hosted on GitHub.
Unary minus isn't safe without coercion because it can generate -0, which isn't a valid integer. But LLVM generates `0 - x` anyway, so let's just eliminate unary minus.
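The difference shows up in plain JavaScript when the operand is zero:

```javascript
// Unary minus on zero produces -0; subtracting from 0 produces +0.
var x = 0;
var negated = -x;        // -0
var subtracted = 0 - x;  // +0
console.log(Object.is(negated, -0), Object.is(subtracted, 0));
```

Both comparisons print `true`, so `0 - x` avoids the negative zero that `-x` introduces.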
Add support in the model for variables that are global to the capsule (e.g., representing globals in a C/C++ program compiled with Emscripten).
The error messages should provide source locations for better diagnostics. This could be done by tracking the current location statefully to avoid API proliferation, but we'll have to deal with `fail` being called from lots of different modules that may not have access to the state.
Hi Dave,
I am excited about this project. From Alon Zakai's brief description, I consider asm.js to be a formalization of the Emscripten ABI. And on that note, I'd like to point out a bit of weirdness.
If you've got a C++ function that returns a C++ bool, depending on the generated Emscripten code, it will either return a JavaScript boolean or an integer (probably 0 or 1). In JavaScript, booleans and integers have different types, so presumably JITs will best handle code that only uses one or the other. I would look to asm.js to specify the representation of booleans.
Please forgive me if you've already considered this.
Luke suggests eliminating booleans and just using ints for truthiness. Compilers can lower boolean types to int8.