gwsystems / awsm
WebAssembly ahead-of-time compiler and runtime. Focuses on generating fast code, simplicity, and portability.
License: BSD 3-Clause "New" or "Revised" License
Hi @geky
In my use case, the data-layout metadata is needed. Looking at the llvm-alt crate, there seems to be no way to set it through the wrapper. However, calling directly into the underlying llvm-sys crate seems to work.
if let Some(ref layout) = opt.layout {
unsafe {
let c_target = std::ffi::CString::new(layout.to_string()).unwrap();
llvm::ffi::core::LLVMSetDataLayout(llvm_module.into(), c_target.as_ptr());
}
}
WDYT? Thanks in advance
Add CI infrastructure to aWsm. Run a subset of the tests on each PR to validate the soundness of the request.
Hi, may I check whether the current release supports all MVP features? It would be nice if we could list the post-MVP features that are supported: https://webassembly.org/roadmap/
Thanks.
See title.
A recent bug turned out to be due to calls to build_result being executed without the builder being in the right place. A fix for this was to explicitly call b.position_at_end(self.bb); at the top of the function. This task is to investigate the different code paths where build_result is called to determine if there are either:
This bug is related to WASI-SDK work and thus is in the branch wasi-abstract-interface.
Currently, at least two tests trigger a fatal LLVM error during the final clang step linking the *.bc emitted from aWsm with the uvwasi runtime:
error: Invalid instruction with no BB (Producer: 'LLVM11.1.0' Reader: 'LLVM 11.1.0')
This suggests that the LLVM bitcode emitted by the compiler may be invalid.
Additionally, one test is functionally correct at one optimization level but returns a different result at another.
This suggests that the different WebAssembly opcodes emitted at different optimization levels may expose a control flow bug.
Here follow steps to recreate the bugs and a zip containing the *.wasm files of the two apps at optimization levels 0-3, with associated text files containing opcode counts for each variant.
Branch: wasi-abstract-interface. The setup scripts likely differ from what you already have, so you may need to run the setup script.
Directory: awsm/test/wasi
The first test I want you to look at is in awsm/test/wasi/c/atof.c.
This program deserializes arg 1 into a float and then serializes it back out to stdout.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
double pi = atof(argv[1]);
printf("%.2lf\n", pi);
}
You can build and run this natively with the following to get a feel for it:
sean@MANPUTER:~/projects/awsm/tests/wasi$ clang c/atof.c
sean@MANPUTER:~/projects/awsm/tests/wasi$ ./a.out 3.14
3.14
sean@MANPUTER:~/projects/awsm/tests/wasi$ rm a.out
It turns out that the behavior of this program varies with the optimization level passed to the WASI-SDK clang compiler.
At optimization level 3,
wasm/%.wasm: c/%.c
@mkdir -p wasm
${WASMCC} ${WASMLINKFLAGS} ${OPTFLAGS} -O3 $< -o $@
everything builds and runs correctly:
sean@MANPUTER:~/projects/awsm/tests/wasi$ ./vm/atof_vm 3.14
3.14
At optimization levels 1 and 2:
wasm/%.wasm: c/%.c
@mkdir -p wasm
${WASMCC} ${WASMLINKFLAGS} ${OPTFLAGS} -O2 $< -o $@
aWsm compiles correctly, but the final clang link against the runtime fails with "error: Invalid instruction with no BB (Producer: 'LLVM11.1.0' Reader: 'LLVM 11.1.0')", which suggests that the LLVM bitcode emitted by the compiler may be invalid:
sean@MANPUTER:~/projects/awsm/tests/wasi$ make clean vm/atof_vm
rm -f ./wasm/* ./bc/* ./vm/*_vm
/opt/wasi-sdk//bin/clang --sysroot=/opt/wasi-sdk//share/wasi-sysroot -Wl,--allow-undefined,-z,stack-size=32768,--threads=1 -O3 -flto -O2 c/atof.c -o wasm/atof.wasm
../../target/release/awsm wasm/atof.wasm -o bc/atof.bc
INFO [awsm] running awsm("wasm/atof.wasm", Some("bc/atof.bc"))
INFO [awsm::wasm] Globals taking up 4 bytes
INFO [awsm::wasm] Data initializers taking up 2239 bytes
INFO [awsm::wasm] Function table entries 4 (ignoring fragmentation)
INFO [awsm::codegen::memory] memory limits ResizableLimits { initial: 1, maximum: None }
INFO [awsm] awsm finished successfully
clang -pthread -ldl -lm -O3 -flto -I../../runtime/libc/wasi/include -I../../runtime/thirdparty/dist/include bc/atof.bc ../../runtime/runtime.c ../../runtime/libc/wasi/wasi_main.c ../../runtime/libc/wasi/wasi_backing.c ../../runtime/libc/wasi/wasi_impl_uvwasi.c ../../runtime/libc/env.c ../../runtime/memory/64bit_nix.c ../../runtime/thirdparty/dist/lib/libuvwasi_a.a ../../runtime/thirdparty/dist/lib/libuv_a.a -o vm/atof_vm
error: Invalid instruction with no BB (Producer: 'LLVM11.1.0' Reader: 'LLVM 11.1.0')
1 error generated.
make: *** [Makefile:24: vm/atof_vm] Error 1
rm bc/atof.bc wasm/atof.wasm
At optimization level 0:
wasm/%.wasm: c/%.c
@mkdir -p wasm
${WASMCC} ${WASMLINKFLAGS} ${OPTFLAGS} -O0 $< -o $@
Everything compiles correctly, but the program does not work as expected:
sean@MANPUTER:~/projects/awsm/tests/wasi$ ./vm/atof_vm 3.14
0.00
The Fibonacci app works a bit differently. It is only able to compile at optimization level 0:
wasm/%.wasm: c/%.c
@mkdir -p wasm
${WASMCC} ${WASMLINKFLAGS} ${OPTFLAGS} -O0 $< -o $@
sean@MANPUTER:~/projects/awsm/tests/wasi$ echo "10" | ./vm/fibonacci_vm
55
At optimization levels 1, 2, and 3, it errors out in the final linking step with the same error message as above: error: Invalid instruction with no BB (Producer: 'LLVM11.1.0' Reader: 'LLVM 11.1.0')
Here is the zip containing the wasm files and associated opcode counts. Despite the issues identified above, all of these *.wasm files execute correctly on wasmtime:
sample.wasm.zip
I've been trying to get GOCR compiling and running with silverfish, but there seems to be some weirdness around function pointers. I was able to compile it successfully.
The call here leads to an assert in the runtime: https://github.com/phanikishoreg/gocr/blob/master/src/list.c#L325
assert triggered in the runtime: https://github.com/gparmer/silverfish/blob/master/runtime/memory/64bit_nix.c#L129
I debugged to see what the assert is and why that is triggered, and this is the log:
$ ./gocr.out -f ASCII < test.pbm
// log here: "add_function_to_table" <func_index> <func type_id> <func addr>
add_function_to_table 1 3 0x40a760
add_function_to_table 2 1 0x40a7c0
add_function_to_table 3 1 0x40a9a0
add_function_to_table 4 2 0x40ac40
add_function_to_table 5 0 0x40bf10
add_function_to_table 6 2 0x40bf70
add_function_to_table 7 1 0x40bf20
add_function_to_table 8 1 0x40cd70
add_function_to_table 9 1 0x40cda0
add_function_to_table 10 4 0x40ec10
add_function_to_table 11 2 0x53a080 // this seem to be the function..
add_function_to_table 12 4 0x53a180
add_function_to_table 13 4 0x53a150
01 // program output
// log here: "get_function_from_table" <func_idx> <func typeid in indirect table> <type_id param>
get_function_from_table 11 2 5 // called with a different type id 5
get_function_from_table 11 2 5 // second call..
$
The problem is that, at function table initialization, the type_id of that function is set to 2, but on the call to get_function_from_table the requested type_id is 5. I am unsure how this happens, but this is as far as I have gotten with debugging. For the logs above, I disabled the assert I pointed to; the program otherwise works and outputs the "01" that is in the input image.
@Others : I would really appreciate if you can help me fix this. For now, I'm going to just comment out the "type check" assert in the runtime (still keeping the function pointer NULL check).
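For readers following along, here is a minimal sketch of the kind of type-checked table lookup described above. The function names mirror the log output, but the signatures, the table layout, and TABLE_CAPACITY are my assumptions for illustration, not the runtime's actual code. A mismatch returns NULL here instead of asserting.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_CAPACITY 64 /* hypothetical fixed table size */

typedef void (*generic_fn)(void);

struct table_entry {
    uint32_t   type_id; /* signature id recorded when the table is initialized */
    generic_fn func;    /* NULL means the slot is empty */
};

static struct table_entry function_table[TABLE_CAPACITY];

/* Record a function and its signature id; returns -1 if idx is out of range. */
static int add_function_to_table(uint32_t idx, uint32_t type_id, generic_fn func) {
    if (idx >= TABLE_CAPACITY) return -1;
    function_table[idx].type_id = type_id;
    function_table[idx].func = func;
    return 0;
}

/* Look up an indirect call target. Returns NULL for an out-of-range index,
 * an empty slot, or a signature id that differs from the one the call site
 * expects (the case seen in the log: stored 2, requested 5). */
static generic_fn get_function_from_table(uint32_t idx, uint32_t type_id) {
    if (idx >= TABLE_CAPACITY) return NULL;
    const struct table_entry *entry = &function_table[idx];
    if (entry->func == NULL || entry->type_id != type_id) return NULL;
    return entry->func;
}

/* Placeholder callee used only to exercise the table. */
static void sample_callee(void) {}
```

With this shape, registering sample_callee at index 11 with type id 2 and then requesting type id 5 yields NULL, which corresponds to the assert firing in the log above.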
This feature enables a WebAssembly function to return a tuple of values. When writing WebAssembly by hand, I've found this to be a helpful way to return an (i32, i32) fat pointer to a string or static array of values stored in linear memory.
I am unclear how one can do this with C via WASI-SDK. I've seen code examples that have suggested to me that rightmost arguments can be tagged to be returned as part of a multi-value return. See the C header for WASI itself as an example.
Reference: https://github.com/WebAssembly/multi-value
Currently, the awsm runtime implicitly treats a function called main as the entrypoint.
However, this differs from the WASI standard, which expects the entrypoint to be _start.
I propose that we change this to align with WASI and make it simpler for us to run WASI modules.
Currently, the abstract WASI layer uses dynamic memory allocation via VLAs. This has negative security implications for stack overflows and likely will inhibit refactoring of the cortex-m backing to use WASI. This should be addressed and improved to resolve these issues.
Add a GitHub pull request template to make sure all contributors understand the requirements.
i32.trunc_sat_f32_s
i32.trunc_sat_f32_u
i32.trunc_sat_f64_s
i32.trunc_sat_f64_u
i64.trunc_sat_f32_s
i64.trunc_sat_f32_u
i64.trunc_sat_f64_s
i64.trunc_sat_f64_u
Also need to verify that the alternative conversion instructions trap as expected
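As a reference point for the expected results, here is a minimal C sketch of two of the saturating conversions. The function names are hypothetical and this is not how aWsm generates code; it only illustrates the semantics: NaN converts to 0 and out-of-range inputs clamp to the integer limits instead of trapping.

```c
#include <stdint.h>

/* i32.trunc_sat_f32_s: truncate toward zero, saturating instead of trapping. */
static int32_t i32_trunc_sat_f32_s(float x) {
    if (x != x) return 0;                       /* NaN is the only value unequal to itself */
    if (x <= -2147483648.0f) return INT32_MIN;  /* -2^31 is exactly representable as float */
    if (x >= 2147483648.0f) return INT32_MAX;   /* 2^31 and above saturate */
    return (int32_t)x;                          /* in range, so the C conversion is defined */
}

/* i32.trunc_sat_f32_u: same idea for the unsigned result type. */
static uint32_t i32_trunc_sat_f32_u(float x) {
    if (x != x) return 0;                       /* NaN */
    if (x <= 0.0f) return 0;                    /* negatives saturate to 0 */
    if (x >= 4294967296.0f) return UINT32_MAX;  /* 2^32 and above saturate */
    return (uint32_t)x;
}
```

The explicit range checks matter because converting an out-of-range float to an integer is undefined behavior in C, so a bare cast cannot be relied on to saturate.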
Given the shift to WASI and the uvwasi backing, we can potentially remove wasmception from the runtimes in this repo.
It's possible that we might ingest WASI functions exported with names that are not valid C identifiers.
For example, here is a test in the wasm spec test suite:
(func (export "type-f64-value") (result f64)
(block (result f64) (f64.neg (br_if 0 (f64.const 4) (i32.const 1))))
)
This seems to generate valid LLVM bitcode with the same symbol:
; Function Attrs: nounwind
define double @wasmf_type-f64-value() #1 {
entry:
br label %b_0
exit: ; preds = %b_1
ret double %0
b_0: ; preds = %entry
br i1 true, label %b_1, label %b_2
b_1: ; preds = %b_2, %b_0
%0 = phi double [ 4.000000e+00, %b_0 ], [ -4.000000e+00, %b_2 ]
br label %exit
b_2: ; preds = %b_0
br label %b_1
}
However, this fails when we try to link with C runtime code:
dist/br_if_0.c:12:25: error: expected ';' after top level declarator
extern double wasmf_type-f64-value();
^
;
This is because C identifiers may only contain ASCII letters, digits, and underscores.
Given the way that we resolve exported symbols, I suspect that we need to encode export names. I suggest replacing each invalid character with its decimal ASCII code, wrapped in leading and trailing underscores.
For example, wasmf_type-f64-value would be encoded as wasmf_type_45_f64_45_value
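A minimal sketch of that encoding in C, assuming a hypothetical helper named encode_export_name (the real change would more likely live in the Rust compiler, so treat this only as an illustration of the scheme):

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies name into out, replacing every byte that is not
 * an ASCII letter, digit, or underscore with _<decimal code>_.
 * Returns 0 on success, or -1 if out (of size out_len) is too small. */
static int encode_export_name(const char *name, char *out, size_t out_len) {
    size_t used = 0;
    if (out_len == 0) return -1;
    out[0] = '\0';
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        size_t room = out_len - used;
        int written;
        if (isalnum(*p) || *p == '_') {
            written = snprintf(out + used, room, "%c", *p);
        } else {
            written = snprintf(out + used, room, "_%u_", (unsigned)*p);
        }
        /* snprintf reports the length it wanted; reject truncated output. */
        if (written < 0 || (size_t)written >= room) return -1;
        used += (size_t)written;
    }
    return 0;
}
```

One caveat with the scheme as proposed: it is not injective, since an export literally named type_45_f64 collides with the encoding of type-f64. If that matters, underscores themselves would also need escaping.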
The runtime heap base is currently set to memory_size under certain conditions, but the reasoning is undocumented and no contributor recalls why it is written this way. Given that this code was recently refactored, this should be investigated by checking out the following commit (cd9b619) and using GitLens or a similar tool to track down the origin of this code to better understand and document this logic. It is likely also useful to review how wasmg___heap_base is generated, to understand the paths under which it might be zero or non-zero.
runtime_heap_base = wasmg___heap_base;
if (runtime_heap_base == 0) {
runtime_heap_base = memory_size;
}
Follow-up context from @Others:
"I do think it maybe should always be at the end of memory (not at the heap base), since otherwise malloc might overwrite this. Honestly this is basically just a wasmception hack since wasi does APIs in a more sane way."
According to the WebAssembly specification, abnormal behavior, such as an out-of-bounds memory access, should trigger a "trap."
In the web context, this causes a JavaScript exception. However, the WebAssembly specification is being extended to support "in module" trap handlers (https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md)
Currently, we are mostly enforcing invariants via assert. We should likely replace this with a centralized entrypoint for trap handling logic called via setjmp and longjmp. I believe this would just log a descriptive message and error out for now.
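To make the proposal concrete, here is a minimal sketch of a centralized trap entrypoint built on setjmp and longjmp. All names (trap, run_guarded, checked_load_u8, the trap codes, and the demo guest functions) are hypothetical, and a real runtime would need one jump buffer per sandbox or thread instead of a single global.

```c
#include <setjmp.h>
#include <stdint.h>
#include <stdio.h>

enum trap_code { TRAP_NONE = 0, TRAP_OUT_OF_BOUNDS, TRAP_TYPE_MISMATCH };

static jmp_buf trap_env;
static volatile enum trap_code pending_trap = TRAP_NONE;

/* Centralized trap entrypoint: record the reason, then unwind to the host. */
static void trap(enum trap_code code) {
    pending_trap = code;
    longjmp(trap_env, 1);
}

/* Bounds-checked load that traps instead of asserting. */
static uint8_t checked_load_u8(const uint8_t *memory, uint32_t size, uint32_t offset) {
    if (offset >= size) trap(TRAP_OUT_OF_BOUNDS);
    return memory[offset];
}

/* Runs a guest entry function; returns TRAP_NONE or the trap that ended it. */
static enum trap_code run_guarded(void (*entry)(void)) {
    pending_trap = TRAP_NONE;
    if (setjmp(trap_env) != 0) {
        fprintf(stderr, "trap: code %d\n", (int)pending_trap);
        return pending_trap;
    }
    entry();
    return TRAP_NONE;
}

/* Demo guests standing in for compiled WebAssembly functions. */
static uint8_t demo_memory[4] = {1, 2, 3, 4};
static void guest_in_bounds(void)     { (void)checked_load_u8(demo_memory, sizeof demo_memory, 3); }
static void guest_out_of_bounds(void) { (void)checked_load_u8(demo_memory, sizeof demo_memory, 4); }
```

Calling run_guarded(guest_out_of_bounds) logs the trap and returns TRAP_OUT_OF_BOUNDS to the host instead of aborting the whole process, which is the main difference from the current assert-based approach.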
This is a higher-level post MVP feature that may not make sense for us. The idea of a reference is that a WebAssembly sandbox can be passed an opaque host reference that it can manage.
ref.null
ref.is_null
ref.func
References:
As of this comment, wasmparser is on 0.83.0. Given that a release that may contain breaking changes occurs every month, this may be a higher version when you read this.
In #77, I updated from 0.39.2 to 0.58.0. I stopped at this level because 0.59.0 contains a complete rewrite of the parsing logic. This may require substantial changes on our end. See bytecodealliance/wasm-tools@db2ef19 for specifics.
Given that breaking changes occur every 3 releases or so, I estimate that going from 0.58.0 to 0.83.0 may take up to 40 hours of engineering effort. This is a hard technical blocker for any WebAssembly instructions that 0.58.0 cannot parse.
Would this work on an nRF52832 chip (512KB Flash, 64KB RAM)?
Has the footprint of the runtime been measured?
We still don't support blocks that consume an argument or yield more than one value. We should probably assert somewhere that those aren't being generated. It would likely just lead to a crash, but weird codegen bugs are best proactively avoided.
Xiaosu noticed that printf and scanf functions seem to cause a substantial slowdown of modules running under Sledge. This might be due to LTO, it might be Sledge-backing specific, or it might be here.
The scope of this task is to compare the overhead of scanf and printf under wasmception, wasi-sdk, and native to determine if this is an issue here or downstream.
Relevant Instructions:
i32.extend8_s
i32.extend16_s
i64.extend8_s
i64.extend16_s
i64.extend32_s
Reference:
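For orientation, here is a minimal C sketch of what these operators compute: each one takes the low 8, 16, or 32 bits of the operand and sign-extends them to the full width. The function names are hypothetical; the mask-xor-subtract form is used so the result does not depend on implementation-defined narrowing casts.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of a 32-bit value. */
static int32_t i32_extend8_s(int32_t x)  { return ((x & 0xFF) ^ 0x80) - 0x80; }

/* Sign-extend the low 16 bits of a 32-bit value. */
static int32_t i32_extend16_s(int32_t x) { return ((x & 0xFFFF) ^ 0x8000) - 0x8000; }

/* Sign-extend the low 8 bits of a 64-bit value. */
static int64_t i64_extend8_s(int64_t x)  { return ((x & 0xFFLL) ^ 0x80LL) - 0x80LL; }

/* Sign-extend the low 16 bits of a 64-bit value. */
static int64_t i64_extend16_s(int64_t x) { return ((x & 0xFFFFLL) ^ 0x8000LL) - 0x8000LL; }

/* Sign-extend the low 32 bits of a 64-bit value. */
static int64_t i64_extend32_s(int64_t x) {
    return ((x & 0xFFFFFFFFLL) ^ 0x80000000LL) - 0x80000000LL;
}
```

Masking keeps only the low bits, the xor flips the sign bit of the narrow value, and the subtraction maps the result back into the signed range.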
Does aWsm support using an LLVM version other than 9? I tried building with LLVM 15 and managed to build the binary, but it encounters an immediate segmentation fault when trying to build something. It looks like the crash happens inside LLVMGetElementType in libLLVM-15.so at src/codegen/function.rs:172.
Including for completeness, but this is a very large surface area of instructions and is not expected to be beneficial to our research.
Reference: https://github.com/webassembly/simd
Hi everyone!
First, thanks a lot for the cool project and making the source code of your research public! :-)
I have a question regarding using aWsm not for executing Wasm binaries, but for program analysis. In particular, I was wondering if you have tested lifting WebAssembly binaries to LLVM IR with aWsm, and then analyzing that LLVM IR with standard analysis passes.
E.g., we tried obtaining a callgraph of a Wasm binary by first compiling with awsm binary.wasm -o bitcode.bc and then running llvm-opt bitcode.bc --dot-callgraph on aWsm's LLVM IR output. However, the produced call graph is missing edges for indirect calls. I know it is not the original use case of aWsm, but since you know Wasm and LLVM well, I wanted to ask anyway:
Thanks again and all the best!
Did you guys get a chance to compare the performance between awsm and wamr?
Implement support for f32 and f64 globals.
The branch implement-missing-globals includes a modified WAT unit test to test for these values. The test runs properly with wasmtime. Your goal is to get this to run successfully on aWsm and SLEdge.
Need to either remove llvm-alt as a dependency, or switch to using the gwsystems fork.
i64.rotl
i64.rotr
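A minimal C sketch of the expected semantics, with hypothetical function names: WebAssembly takes the rotate count modulo 64, so a count of 64 leaves the value unchanged. The masking below also avoids shifting by the full width, which is undefined in C.

```c
#include <stdint.h>

/* i64.rotl: rotate left by count mod 64. */
static uint64_t i64_rotl(uint64_t x, uint64_t count) {
    unsigned k = (unsigned)(count & 63);
    return (x << k) | (x >> ((64 - k) & 63)); /* & 63 keeps the shift below 64 when k == 0 */
}

/* i64.rotr: rotate right by count mod 64. */
static uint64_t i64_rotr(uint64_t x, uint64_t count) {
    unsigned k = (unsigned)(count & 63);
    return (x >> k) | (x << ((64 - k) & 63));
}
```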
There are some corners of the WebAssembly spec that are strict and explicit where the C and LLVM specifications leave behavior undefined. An example of this is INT32_MIN % -1, which overflows with undefined semantics in C and LLVM, but which is defined to return 0 in WebAssembly. Achieving strict compliance thus requires special shims beyond LLVM intrinsics. There may be a performance cost to this.
We can enhance our fast unsafe math compiler option to use a "close enough" LLVM intrinsic in place of a totally compliant implementation in certain cases.
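As an illustration of the kind of shim involved, here is a minimal C sketch of a strictly compliant signed remainder. The function name and the status-code convention are hypothetical; the point is that the rhs == -1 case is handled before the C % operator ever sees it.

```c
#include <stdint.h>

/* Strict i32.rem_s: writes the remainder to *result and returns 0, or
 * returns -1 when WebAssembly would trap (division by zero). */
static int i32_rem_s(int32_t lhs, int32_t rhs, int32_t *result) {
    if (rhs == 0) return -1; /* WebAssembly traps on a zero divisor */
    if (rhs == -1) {         /* x % -1 is always 0; this skips the INT32_MIN overflow */
        *result = 0;
        return 0;
    }
    *result = lhs % rhs;     /* C99 and later truncate toward zero, matching rem_s */
    return 0;
}
```

The extra branches are the performance cost mentioned above. A fast unsafe math option could drop them and emit the plain remainder, accepting undefined behavior for the INT32_MIN and -1 pair.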
Hello!
Does aWsm translate binaries whose tables are exported? I keep getting the following error when I try to compile binaries that export their libraries.
thread 'main' panicked at 'Have not implemented export kind Table', src/wasm.rs:1238:26
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Thanks!
At the moment, the dual dependencies on llvm-sys cause a strange error with LLVM:
$ ./target/debug/silverfish
: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Some googling around shows this is an issue that can happen if libLLVM gets linked twice:
https://stackoverflow.com/questions/54273632/llvm-linking-commandline-error-option-help-list-registered-more-than-once-l
The dual dependencies against llvm-sys might be the issue? I'm not sure if there is a way to tell cargo to deduplicate the linking:
$ cargo tree
llvm-alt v0.5.0 (git+https://github.com/Others/llvm-rs.git#754fdf857f83e64d371a046dcb8fa3587a176cf8)
├── cbox v0.3.0
│ └── libc v0.2.51 (*)
├── libc v0.2.51 (*)
└── llvm-sys v0.3.0 (git+https://github.com/Others/llvm-sys.rs#f4d6b7523e9c3e71a45aa8155f0ef0911689033a)
llvm-sys v90.0.0
└── libc v0.2.51 (*)
This proposal may offer some performance improvements. I think the way we codegen passive data segments as initializer functions should map to the behavior below, but would require deeper study.
Instructions:
memory.copy
table.copy
memory.fill
memory.init
table.init
data.drop
elem.drop
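To give a feel for the two simplest instructions in this list, here is a minimal C sketch of memory.copy and memory.fill over a linear memory buffer. The struct and function names are hypothetical and a -1 return stands in for a trap. As I understand the proposal, bounds are checked up front and overlapping copies must behave like memmove.

```c
#include <stdint.h>
#include <string.h>

struct linear_memory {
    uint8_t *data;
    uint32_t size; /* current size in bytes */
};

/* True when [offset, offset + len) lies inside the memory; 64-bit math avoids wraparound. */
static int in_bounds(const struct linear_memory *m, uint32_t offset, uint32_t len) {
    return (uint64_t)offset + len <= m->size;
}

/* memory.copy: overlap-safe copy within one memory; -1 stands in for a trap. */
static int memory_copy(struct linear_memory *m, uint32_t dst, uint32_t src, uint32_t len) {
    if (!in_bounds(m, dst, len) || !in_bounds(m, src, len)) return -1;
    memmove(m->data + dst, m->data + src, len);
    return 0;
}

/* memory.fill: set len bytes to value; -1 stands in for a trap. */
static int memory_fill(struct linear_memory *m, uint32_t dst, uint8_t value, uint32_t len) {
    if (!in_bounds(m, dst, len)) return -1;
    memset(m->data + dst, value, len);
    return 0;
}
```

Lowering these to memmove and memset (or the matching LLVM intrinsics) is where a performance gain over byte-by-byte loops in the guest would come from.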
The clang WebAssembly linker takes a flag to statically set the stack size of the resulting binary.
Currently our Makefiles define linker flags as follows:
WASMLINKFLAGS=-Wl,--allow-undefined,-z,stack-size=32768,--threads=1
In order to prepare for the cortex-m refactor for WASI, we need to better understand how this flag impacts the memory consumption of the resulting binary.