Comments (5)

joshmarinacci commented on August 17, 2024

from webxr-api.

blairmacintyre commented on August 17, 2024

The goal here is to start laying out our view of how to do "high performance vision" on the web. Some thoughts:

  • The end goal is "programmers can write/load JavaScript and/or WebAssembly code to process sensor data efficiently each frame". I think we want the possibility of leveraging multiple cores, etc., so we probably want this code to run in something like a Worker, or perhaps a "SensorWorker" that gets triggered each time there is data. Yes? The simple approach is just a synchronous callback, but that limits the ability to parallelize; however, parallelization only matters if we can efficiently share data (we can, right?).
  • For a Reality, we need a way to expose the set of sensors available to it so that it can set these up.
  • For a camera sensor, we want to "try to set" and/or query the resolution, and get the camera intrinsics (in this case, we make them up based on the FOV we're reporting to the app; on newer platforms, you can get the intrinsics from ARKit, ARCore, etc.). In this case, we want to constrain the max size, so we might want to say what the max width or height is (256? 320?).
  • Each frame, we need to get the data, and thus probably need a way to say "sync with sensor workers" at the point we need it.
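
To make the camera bullet concrete, here is a minimal sketch of constraining the requested size and making up intrinsics from a reported FOV. The helper names `constrainSize` and `intrinsicsFromFov` are hypothetical, and the sketch assumes a simple pinhole model with square pixels and a centered principal point, which a real platform may not match.

```javascript
// Hypothetical helper: scale a requested resolution down so that neither
// dimension exceeds maxDim, preserving the aspect ratio. Never scales up.
function constrainSize(width, height, maxDim) {
  const scale = Math.min(1, maxDim / Math.max(width, height));
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

// Hypothetical helper: derive pinhole intrinsics from a horizontal field of
// view given in radians. Assumes square pixels (fy === fx) and a principal
// point at the image center.
function intrinsicsFromFov(width, height, fovX) {
  const fx = (width / 2) / Math.tan(fovX / 2);
  return { fx, fy: fx, cx: width / 2, cy: height / 2 };
}

// Example: ask for 1280x720 but cap the larger dimension at 320.
const size = constrainSize(1280, 720, 320);
const intrinsics = intrinsicsFromFov(size.width, size.height, Math.PI / 2);
```

On platforms that report real intrinsics (ARKit, ARCore), those values would replace the made-up ones.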

We can demonstrate this by restructuring the JSARToolkit to work in this manner (aside: we should eventually try to build that using WebAssembly).

joshmarinacci avatar joshmarinacci commented on August 17, 2024

I believe there is something called a SharedArrayBuffer to enable efficient data sharing between the main thread and worker threads. Of course, I bet you could do a lot of this processing better on the GPU, at least if we are talking about vision algorithms.

https://webkit.org/blog/7846/concurrent-javascript-it-can-work/
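
A rough sketch of what sharing a frame through a SharedArrayBuffer could look like. The buffer layout (one Int32 frame counter followed by RGBA pixels) and the function names are assumptions made for illustration. In a real setup the buffer would be handed to a worker with `postMessage`, and browsers generally require the page to be cross-origin isolated before SharedArrayBuffer is available.

```javascript
// Assumed layout for this sketch: a 4-byte Int32 frame counter, then RGBA pixels.
const HEADER_BYTES = 4;

// Allocate the shared memory and build the producer's views over it.
function createSharedFrame(width, height) {
  const buffer = new SharedArrayBuffer(HEADER_BYTES + width * height * 4);
  return {
    buffer, // this is what would be posted to a worker
    width,
    height,
    counter: new Int32Array(buffer, 0, 1),
    pixels: new Uint8Array(buffer, HEADER_BYTES, width * height * 4),
  };
}

// Consumer side (e.g. inside a worker): build views over the same memory.
// No pixel data is copied; both sides see the same bytes.
function attachToSharedFrame(buffer, width, height) {
  return {
    counter: new Int32Array(buffer, 0, 1),
    pixels: new Uint8Array(buffer, HEADER_BYTES, width * height * 4),
  };
}

// Producer side: write the new pixels, then atomically bump the counter so
// a consumer can tell that a new frame is ready. Returns the new frame number.
function publishFrame(frame, rgba) {
  frame.pixels.set(rgba);
  return Atomics.add(frame.counter, 0, 1) + 1;
}
```

A consumer could poll the counter with `Atomics.load`, or a worker could block on it with `Atomics.wait`; which is appropriate depends on where the consumer runs.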

blairmacintyre commented on August 17, 2024

@joshmarinacci yes, that's what I was remembering: SharedArrayBuffers.

The larger question is the architecture.

For example, should we ask people to prepare a worker and give it to us? Or should they prepare a script with a certain API that could be used in a worker, which we would either fire off in a worker or call synchronously, depending on the platform and so forth? (E.g., we might not want to fire off too many threads, and a native implementation might have its own reasons to choose one approach over the other.)

I suspect that we should have them create and pass in an object with a certain API, and we could wrap it in a worker or just call it. In addition to the image buffer data, the API would need to pass in details of the camera (e.g., intrinsics), and could also pass in data similar to what is available to the render function (perspective rendering parameters, head-pose, etc).
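
Something like the following sketch, where every name (`processFrame`, `createHost`, the fields on `frame`) is hypothetical and not part of any existing API. The app supplies an object with a known method, and the platform decides how to invoke it. The `"deferred"` mode here only stands in for a worker by resolving a Promise later on the same thread; a real implementation would move the work to an actual Worker.

```javascript
// Hypothetical processor an app might hand to the platform. It counts
// pixels whose red channel is above a threshold, as a stand-in for real
// vision code.
const brightPixelCounter = {
  processFrame(frame) {
    // Assumed frame shape: { pixels, width, height, intrinsics, headPose }
    let bright = 0;
    for (let i = 0; i < frame.pixels.length; i += 4) {
      if (frame.pixels[i] > 128) bright++;
    }
    return { brightPixels: bright };
  },
};

// Hypothetical platform-side wrapper. The app does not know or care
// whether its processor is called directly or off the main thread.
function createHost(processor, mode = "sync") {
  if (typeof processor.processFrame !== "function") {
    throw new TypeError("processor must implement processFrame(frame)");
  }
  if (mode === "sync") {
    // Call the processor directly and return its result.
    return { submit: (frame) => processor.processFrame(frame) };
  }
  // Stand-in for a worker: same API, but the result arrives as a Promise.
  return {
    submit: (frame) => Promise.resolve().then(() => processor.processFrame(frame)),
  };
}

// Example frame with two RGBA pixels and made-up camera details.
const exampleFrame = {
  pixels: Uint8Array.of(200, 0, 0, 255, 10, 0, 0, 255),
  width: 2,
  height: 1,
  intrinsics: { fx: 1, fy: 1, cx: 1, cy: 0.5 },
  headPose: null,
};
const result = createHost(brightPixelCounter).submit(exampleFrame);
```

One consequence of this shape is that callers would have to tolerate both a direct result and a Promise, so a real API would probably always return a Promise.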

As for WebAssembly/Javascript vs GPU: I agree 100% we should also be thinking about GPU, but we will likely need/want to provide both, for two reasons:

  • People might have code that only works in one of the two places.
  • The platform may only provide image/sensor data in one place or the other (e.g., perhaps the image data isn't in main memory, only on the GPU).

I am unclear as to what the best API would be to set up something to process video frames that are already in texture memory. I suspect they would need to prepare a shader, for example? But, we may not want to deal with that until we get someone involved who has experience doing CV with GPU shaders in (probably) WebGL2.

blairmacintyre commented on August 17, 2024

As for how to demo it: I've already done a bit of hacking on the JSARToolkit code and could pretty easily modify it to support a "here's a frame and camera info, process the data" interface.
