The wonnx from metavai

Wonnx is a GPU-accelerated ONNX inference run-time written 100% in Rust, ready for the web.

Supported Platforms (enabled by `wgpu`)

API	Windows	Linux & Android	macOS & iOS
Vulkan	✅	✅
Metal			✅
DX12	✅ (W10 only)
DX11	🚧
GLES3		🆗

✅ = First Class Support — 🆗 = Best Effort Support — 🚧 = Unsupported, but support in progress

Getting started

Install Rust
Install Vulkan, Metal, or DX12 for the GPU API.
Ensure Git LFS is installed
git clone this repo.

git clone https://github.com/webonnx/wonnx.git
git lfs install

From the command line

Ensure Git LFS is initialized and has downloaded the model files (in wonnx/examples/data/models). Then, you're all set! You can run an example:

cargo run --example squeeze --release

Or you can try the CLI (see the README for more information):

cargo run --release -- info ./data/models/opt-squeeze.mnist
cargo run --release -- infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --labels ./data/models/squeeze-labels.txt --top 3

From Python

pip install wonnx

And then:

from wonnx import PySession
session = PySession.from_path(
    "../data/models/single_relu.onnx"
)
inputs = {"x": [-1.0, 2.0]}
assert session.run(inputs) == {"y": [0.0, 2.0]}

To build the Python module for development:

cd wonnx-py
python3 -m venv .env
source .env/bin/activate
pip install maturin
maturin develop

Then run python3 with the above Python code!

Running a model from scratch

To run an onnx model, first simplify it with onnx-simplifier, with the command:

# pip install -U pip && pip install onnx-simplifier
python -m onnxsim mnist-8.onnx opt-mnist.onnx

Then you can run it following the example in the examples folder:

cargo run --example mnist --release

fn main() -> HashMap<String, Vec<f32>> {
    let mut input_data = HashMap::new();
    let image = load_squeezenet_image(); // Load image
    input_data.insert("data".to_string(), InputTensor::F32(image.as_slice().unwrap()));

    let session = pollster::block_on(wonnx::Session::from_path(
        "examples/data/models/opt-squeeze.onnx",
    ))
    .expect("session did not create");
    let result = pollster::block_on(session.run(input_data)).unwrap();
    let result = result["squeezenet0_flatten0_reshape0"];
    let mut probabilities = result.iter().enumerate().collect::<Vec<_>>();

    probabilities.sort_unstable_by(|a, b| b.1.partial_cmp(a.1).unwrap());

    assert_eq!(probabilities[0].0, 22);
}

Examples are available in the examples folder

Tested models

Squeezenet
MNIST

GPU selection

You may set the following environment variables to influence GPU selection by WGPU:

WGPU_ADAPTER_NAME with a substring of the name of the adapter you want to use (e.g. 1080 will match NVIDIA GeForce 1080ti).
WGPU_BACKEND with a comma separated list of the backends you want to use (vulkan, metal, dx12, dx11, or gl).
WGPU_POWER_PREFERENCE with the power preference to choose when a specific adapter name isn't specified (high or low)

Contribution: On implementing a new Operator

Contribution are very much welcomed even without large experience in DL, WGSL, or Rust. I hope that, this project can be a sandbox for all of us to learn more about those technologies beyond this project initial scope.

To implement an operator all you have to do is:

Add a new matching pattern in compiler.rs
Retrieve its attributes values using the get_attribute function:

    let alpha = get_attribute("alpha", Some(1.0), node);
    // or without default value
    let alpha = get_attribute::<f32>("alpha", None, node);

Add any variable you want to use in the WGSL shader using context.
Write a new WGSL template in the templates folder.

Available types are in structs.wgsl but you can also generate new ones within your templates.

Respect the binding layout that each entry is incremented by 1 starting from 0, with input first and output last. If the number of binding is above 4. Increment the binding group. You can change the input within sequencer.rs
Write the logic.

There is default variables in the context:

{{ i_lens[0] }}: the length of the input 0. This also work for output: {{ o_lens[0] }} and other input {{ i_lens[1] }}
{{ i_shape[0] }}: the array of dimensions of input 0. To get the first dimension of the array, just use: {{ i_shape[0][0] }}
{{ i_chunks[0] }}: the size of the chunks of each dimensions of input 0. By default, each variable is represented as a long array of values where to get to specific values you have to move by chunks. Those chunks are represented within this variable. To get the size of the chunks of the first dimensions use: {{ i_chunks[0][0] }}.
{{ op_type }} the op type as some op_type like activation are using the same template.

Test it using the utils function and place it in the tests folder. The test can look as follows:

#[test]
fn test_matmul_square_matrix() {
    // USER INPUT

    let n = 16;
    let mut input_data = HashMap::new();

    let data_a = ndarray::Array2::eye(n);
    let mut data_b = ndarray::Array2::<f32>::zeros((n, n));
    data_b[[0, 0]] = 0.2;
    data_b[[0, 1]] = 0.5;

    let sum = data_a.dot(&data_b);

    input_data.insert("A".to_string(), data_a.as_slice().unwrap());
    input_data.insert("B".to_string(), data_b.as_slice().unwrap());

    let n = n as i64;
    let model = model(graph(
        vec![tensor("A", &[n, n]), tensor("B", &[n, n])],
        vec![tensor("C", &[n, n])],
        vec![],
        vec![],
        vec![node(vec!["A", "B"], vec!["C"], "MatMul", "MatMul", vec![])],
    ));

    let session =
        pollster::block_on(wonnx::Session::from_model(model)).expect("Session did not create");

    let result = pollster::block_on(session.run(input_data)).unwrap();

    assert_eq!(result["C"].as_slice(), sum.as_slice().unwrap());
}

Check out tera documentation for other templating operation: https://tera.netlify.app/docs/

If at any point you want to do optimisation of several node you can do it within sequencer.rs.

Supported Operators (ref ONNX IR)

Operator	Since version	Implemented
Abs	13, 6, 1	✅
Acos	7	✅
Acosh	9
Add	14, 13, 7, 6, 1	✅
And	7, 1	✅
ArgMax	13, 12, 11, 1
ArgMin	13, 12, 11, 1
Asin	7	✅
Asinh	9
Atan	7	✅
Atanh	9
AveragePool	11, 10, 7, 1	✅
BatchNormalization	15, 14, 9, 7, 6, 1	✅
BitShift	11
Cast	13, 9, 6, 1	✅
Ceil	13, 6, 1	✅
Clip	13, 12, 11, 6, 1	✅
Compress	11, 9
Concat	13, 11, 4, 1	✅
ConcatFromSequence	11
Constant	13, 12, 11, 9, 1
ConstantOfShape	9
Conv	11, 1	✅
ConvInteger	10
ConvTranspose	11, 1
Cos	7	✅
Cosh	9	✅
CumSum	14, 11
DepthToSpace	13, 11, 1
DequantizeLinear	13, 10
Det	11
Div	14, 13, 7, 6, 1	✅
Dropout	13, 12, 10, 7, 6, 1	✅
Einsum	12
Elu	6, 1	✅
Equal	13, 11, 7, 1	✅
Erf	13, 9
Exp	13, 6, 1	✅
Expand	13, 8
EyeLike	9
Flatten	13, 11, 9, 1	✅
Floor	13, 6, 1	✅
GRU	14, 7, 3, 1
Gather	13, 11, 1	✅ (axis=0)
GatherElements	13, 11
GatherND	13, 12, 11
Gemm	13, 11, 9, 7, 6, 1	✅
GlobalAveragePool	1	✅
GlobalLpPool	2, 1
GlobalMaxPool	1
Greater	13, 9, 7, 1	✅
GridSample	16
HardSigmoid	6, 1
Hardmax	13, 11, 1
Identity	16, 14, 13, 1	✅
If	16, 13, 11, 1
InstanceNormalization	6, 1
IsInf	10
IsNaN	13, 9
LRN	13, 1
LSTM	14, 7, 1
LeakyRelu	6, 1	✅
Less	13, 9, 7, 1	✅
Log	13, 6, 1	✅
Loop	16, 13, 11, 1
LpNormalization	1
LpPool	11, 2, 1
MatMul	13, 9, 1	✅
MatMulInteger	10
Max	13, 12, 8, 6, 1
MaxPool	12, 11, 10, 8, 1	✅
MaxRoiPool	1
MaxUnpool	11, 9
Mean	13, 8, 6, 1
Min	13, 12, 8, 6, 1	✅
Mod	13, 10	✅
Mul	14, 13, 7, 6, 1	✅
Multinomial	7
Neg	13, 6, 1
NonMaxSuppression	11, 10
NonZero	13, 9
Not	1
OneHot	11, 9
Optional	15
OptionalGetElement	15
OptionalHasElement	15
Or	7, 1	✅
PRelu	9, 7, 6, 1
Pad	13, 11, 2, 1
Pow	15, 13, 12, 7, 1
QLinearConv	10
QLinearMatMul	10
QuantizeLinear	13, 10
RNN	14, 7, 1
RandomNormal	1
RandomNormalLike	1
RandomUniform	1
RandomUniformLike	1
Reciprocal	13, 6, 1	✅
ReduceL1	13, 11, 1
ReduceL2	13, 11, 1
ReduceLogSum	13, 11, 1
ReduceLogSumExp	13, 11, 1
ReduceMax	13, 12, 11, 1
ReduceMean	13, 11, 1
ReduceMin	13, 12, 11, 1
ReduceProd	13, 11, 1
ReduceSum	13, 11, 1
ReduceSumSquare	13, 11, 1
Relu	14, 13, 6, 1	✅
Reshape	14, 13, 5, 1	✅
Resize	13, 11, 10	✅
ReverseSequence	10
RoiAlign	16, 10
Round	11
Scan	11, 9, 8
Scatter (deprecated)	11, 9
ScatterElements	16, 13, 11
ScatterND	16, 13, 11
Selu	6, 1
SequenceAt	11
SequenceConstruct	11
SequenceEmpty	11
SequenceErase	11
SequenceInsert	11
SequenceLength	11
Shape	15, 13, 1
Shrink	9
Sigmoid	13, 6, 1	✅
Sign	13, 9
Sin	7	✅
Sinh	9	✅
Size	13, 1
Slice	13, 11, 10, 1
Softplus	1	✅
Softsign	1	✅
SpaceToDepth	13, 1
Split	13, 11, 2, 1
SplitToSequence	11
Sqrt	13, 6, 1	✅
Squeeze	13, 11, 1	✅
StringNormalizer	10
Sub	14, 13, 7, 6, 1	✅
Sum	13, 8, 6, 1
Tan	7	✅
Tanh	13, 6, 1	✅
TfIdfVectorizer	9
ThresholdedRelu	10
Tile	13, 6, 1
TopK	11, 10, 1
Transpose	13, 1	✅
Trilu	14
Unique	11
Unsqueeze	13, 11, 1	✅
Upsample (deprecated)	10, 9, 7
Where	16, 9
Xor	7, 1
Function	Since version
Bernoulli	15
CastLike	15
Celu	12	✅
DynamicQuantizeLinear	11
GreaterOrEqual	12	✅
HardSwish	14
LessOrEqual	12	✅
LogSoftmax	13, 11, 1
MeanVarianceNormalization	13, 9
NegativeLogLikelihoodLoss	13, 12
Range	11
Softmax	13, 11, 1	✅
SoftmaxCrossEntropyLoss	13, 12

metavai / wonnx Goto Github PK

wonnx's Introduction

Supported Platforms (enabled by `wgpu`)

Getting started

From the command line

From Python

Running a model from scratch

Tested models

GPU selection

Contribution: On implementing a new Operator

Supported Operators (ref ONNX IR)

wonnx's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

metavai / wonnx Goto Github PK

wonnx's Introduction

Supported Platforms (enabled by wgpu)

Getting started

From the command line

From Python

Running a model from scratch

Tested models

GPU selection

Contribution: On implementing a new Operator

Supported Operators (ref ONNX IR)

wonnx's People

Contributors

Recommend Projects

Recommend Topics

Recommend Org

Supported Platforms (enabled by `wgpu`)