Giter VIP home page Giter VIP logo

segment_anything_clone's Introduction

Segment Anything Simple Web demo

This front-end only React based web demo shows how to load a fixed image and corresponding .npy file of the SAM image embedding, and run the SAM ONNX model in the browser using Web Assembly with mulithreading enabled by SharedArrayBuffer, Web Worker, and SIMD128.

Run the app

Install Yarn

npm install --g yarn

Build and run:

yarn && yarn start

Navigate to http://localhost:8081/

Move your cursor around to see the mask prediction update in real time.

Export the image embedding

In the ONNX Model Example notebook upload the image of your choice and generate and save corresponding embedding.

Initialize the predictor:

checkpoint = "sam_vit_h_4b8939.pth"
model_type = "vit_h"
sam = sam_model_registry[model_type](checkpoint=checkpoint)
sam.to(device='cuda')
predictor = SamPredictor(sam)

Set the new image and export the embedding:

image = cv2.imread('src/assets/dogs.jpg')
predictor.set_image(image)
image_embedding = predictor.get_image_embedding().cpu().numpy()
np.save("dogs_embedding.npy", image_embedding)

Save the new image and embedding in src/assets/data.

Export the ONNX model

You also need to export the quantized ONNX model from the ONNX Model Example notebook.

Run the cell in the notebook which saves the sam_onnx_quantized_example.onnx file, download it and copy it to the path /model/sam_onnx_quantized_example.onnx.

Here is a snippet of the export/quantization code:

onnx_model_path = "sam_onnx_example.onnx"
onnx_model_quantized_path = "sam_onnx_quantized_example.onnx"
quantize_dynamic(
    model_input=onnx_model_path,
    model_output=onnx_model_quantized_path,
    optimize_model=True,
    per_channel=False,
    reduce_range=False,
    weight_type=QuantType.QUInt8,
)

NOTE: if you change the ONNX model by using a new checkpoint you need to also re-export the embedding.

Update the image, embedding, model in the app

Update the following file paths at the top ofApp.tsx:

const IMAGE_PATH = "/assets/data/dogs.jpg";
const IMAGE_EMBEDDING = "/assets/data/dogs_embedding.npy";
const MODEL_DIR = "/model/sam_onnx_quantized_example.onnx";

ONNX multithreading with SharedArrayBuffer

To use multithreading, the appropriate headers need to be set to create a cross origin isolation state which will enable use of SharedArrayBuffer (see this blog post for more details)

The headers below are set in configs/webpack/dev.js:

headers: {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "credentialless",
}

Structure of the app

App.tsx

  • Initializes ONNX model
  • Loads image embedding and image
  • Runs the ONNX model based on input prompts

Stage.tsx

  • Handles mouse move interaction to update the ONNX model prompt

Tool.tsx

  • Renders the image and the mask prediction

helpers/maskUtils.tsx

  • Conversion of ONNX model output from array to an HTMLImageElement

helpers/onnxModelAPI.tsx

  • Formats the inputs for the ONNX model

helpers/scaleHelper.tsx

  • Handles image scaling logic for SAM (longest size 1024)

hooks/

  • Handle shared state for the app

segment_anything_clone's People

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.