Backends should be able to implement their own type of tensor objects. A good exam

[Feature request] Backend-specific tensors

jsubag commented on August 19, 2024

from glow.

Comments (6)

bertmaher commented on August 19, 2024

That's a pretty interesting use-case. I think we could avoid the camera->system copy by using the Tensor(void *data, TypeRef ty) constructor to make a tensor backed by the camera memory, and then bind that tensor to an input Variable.
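To make the idea concrete, here is a minimal sketch of a tensor that borrows external memory instead of copying it. It does not use Glow's headers: `UnownedTensor`, `Shape`, and `cameraFrameBuffer()` are hypothetical stand-ins, and the real `Tensor(void *data, TypeRef ty)` constructor may behave differently in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor type descriptor; NOT Glow's TypeRef.
struct Shape {
  std::vector<std::size_t> dims;
  std::size_t numElements() const {
    std::size_t n = 1;
    for (std::size_t d : dims) n *= d;
    return n;
  }
};

// Hypothetical tensor that wraps memory allocated elsewhere.
// It never copies and never frees the payload.
class UnownedTensor {
public:
  UnownedTensor(void *data, Shape shape)
      : data_(static_cast<std::uint8_t *>(data)), shape_(std::move(shape)) {}

  std::uint8_t &at(std::size_t i) {
    assert(i < shape_.numElements());
    return data_[i];
  }
  const void *raw() const { return data_; }
  std::size_t size() const { return shape_.numElements(); }

private:
  std::uint8_t *data_; // borrowed; the caller must keep the buffer alive
  Shape shape_;
};

// Hypothetical camera driver buffer: a fixed region the driver writes into.
inline std::vector<std::uint8_t> &cameraFrameBuffer() {
  static std::vector<std::uint8_t> frame(2 * 3, 0);
  return frame;
}
```

Constructing `UnownedTensor t(cameraFrameBuffer().data(), Shape{{2, 3}})` gives a tensor whose reads see whatever the driver last wrote, so the camera-to-system copy disappears. The trade-off is that lifetime and synchronization with the driver become the application's responsibility.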

opti-mix commented on August 19, 2024

@bertmaher I originally introduced the Tensor(void *data, TypeRef ty) constructor exactly for integrating with other 3rd party frameworks and the like, where the tensor itself is allocated and managed outside Glow.

But of course, this constructor requires that the payload is at least in the same address space and has the same memory layout, alignment, padding, etc. There could be use-cases where that is not the case: the payload of the tensor may be in a different address space (e.g. GPU memory) or have a different memory layout (e.g. padding, alignment, row- vs column-major).

Such use-cases may indeed need more than just this constructor.
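As an illustration of the layout mismatch, consider a source buffer whose rows are padded to a stride larger than the row's payload. A consumer that expects densely packed rows cannot wrap such a buffer directly. The sketch below repacks it; `packRows` and its parameters are hypothetical names chosen for this example, not Glow functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Repack a row-padded buffer into a dense one.
// rowBytes:       number of payload bytes per row
// rowStrideBytes: distance between the starts of consecutive rows in src
std::vector<unsigned char> packRows(const unsigned char *src, std::size_t rows,
                                    std::size_t rowBytes,
                                    std::size_t rowStrideBytes) {
  assert(rowStrideBytes >= rowBytes);
  std::vector<unsigned char> dense(rows * rowBytes);
  if (rowBytes == 0) {
    return dense; // nothing to copy
  }
  for (std::size_t r = 0; r < rows; ++r) {
    // Copy only the payload of each row and skip the trailing padding.
    std::memcpy(dense.data() + r * rowBytes, src + r * rowStrideBytes,
                rowBytes);
  }
  return dense;
}
```

This repacking is exactly the kind of extra copy that a pointer-wrapping constructor cannot avoid when the layouts differ.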

nadavrot commented on August 19, 2024

@jsubag Jacob, how do graphics drivers solve this problem? When I program cuda/opencl code I don't need to allocate user-space buffers with the driver. I understand that in some cases the driver would need to copy the data. Do you know what latency this copy will introduce?

jsubag commented on August 19, 2024

@nadavrot On GPUs there's usually a copy between system memory and the GPU memory, but here there are use-cases that may require an additional copy.

Consider the case of an application analyzing images taken from a camera on the same host machine. Typically, the captured images are written to a specific location designated by the camera driver. In the current Glow design this data will be copied to a Tensor used as an input (backed by system memory), and then the backend will copy it a second time to the GPU.
The first copy can be removed by exposing the right mechanism, so that the application initiates only the "second" copy, from the camera output buffer to the GPU resource.
There are other mechanisms in the GPU domain, such as sharing EGL resources (potentially saving all copies), but I think those require a tighter handshake between the components & drivers.

Additionally, some GPUs require the system memory being copied from to be pinned and aligned.
If the memory backing the tensor doesn't comply with these requirements, that can incur a third copy, from the tensor's system memory to another system memory location that does comply with the pinning/alignment requirements (this is usually the result of DMA requirements).
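A rough sketch of where that third copy comes from is shown below. It covers only the alignment part, since pinning needs a driver or OS call that is not shown here. `StagingBuffer` and `prepareForDma` are hypothetical names, and the alignment value a real device needs is an assumption that depends on the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

inline bool isAligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Host buffer whose payload starts at an aligned address.
// A real backend would also pin this memory through its driver API.
class StagingBuffer {
public:
  StagingBuffer(std::size_t size, std::size_t alignment)
      : storage_(size + alignment), size_(size) {
    // std::align requires a power-of-two alignment.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void *p = storage_.data();
    std::size_t space = storage_.size();
    data_ = std::align(alignment, size, p, space);
    assert(data_ != nullptr);
  }
  StagingBuffer(const StagingBuffer &) = delete;
  StagingBuffer &operator=(const StagingBuffer &) = delete;

  void *data() { return data_; }
  std::size_t size() const { return size_; }

private:
  std::vector<unsigned char> storage_; // over-allocated backing store
  std::size_t size_;
  void *data_ = nullptr; // aligned pointer into storage_
};

// Returns the address a DMA transfer could read from.
// Copies into `staging` only when `src` does not meet the alignment.
const void *prepareForDma(const void *src, std::size_t size,
                          StagingBuffer &staging, std::size_t alignment) {
  if (isAligned(src, alignment)) {
    return src; // already compliant, no extra copy
  }
  assert(size <= staging.size());
  std::memcpy(staging.data(), src, size); // the extra host-to-host copy
  return staging.data();
}
```

If the tensor's backing memory were allocated to the backend's requirements in the first place, `prepareForDma` would always take the early return, which is one argument for letting backends control tensor allocation.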

Latency numbers vary across different hardware, but if we're talking about copying a few MB in system memory on a modern computer, that shouldn't take more than a millisecond.
However, if your inputs/outputs are larger and your workload is latency-sensitive, this can be the difference between reaching your target frame-rate and missing it.
So the more we can save the better :)

jsubag commented on August 19, 2024

@opti-mix For the cases you mention there should be a mechanism to trigger data transfer between the backend specific address space/layout/etc. and host memory space (maybe similar to GPU-style map/unmap).
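One possible shape for such a mechanism is sketched below, assuming a map/unmap pair loosely modeled on how GPU APIs expose buffers to the host. Everything here is hypothetical: `BackendTensor`, `FakeDeviceTensor`, and `ScopedMap` are illustration-only names, and the "device" is simulated with an ordinary vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interface a backend could implement for its own tensors.
class BackendTensor {
public:
  virtual ~BackendTensor() = default;
  virtual void *map() = 0;  // make the payload visible in host memory
  virtual void unmap() = 0; // publish host-side changes back to the backend
  virtual std::size_t sizeInBytes() const = 0;
};

// Simulated backend: "device memory" is just a separate vector.
class FakeDeviceTensor : public BackendTensor {
public:
  explicit FakeDeviceTensor(std::size_t bytes) : device_(bytes, 0) {}

  void *map() override {
    assert(!mapped_);
    host_ = device_; // stands in for a device-to-host transfer
    mapped_ = true;
    return host_.data();
  }
  void unmap() override {
    assert(mapped_);
    device_ = host_; // stands in for a host-to-device transfer
    host_.clear();
    mapped_ = false;
  }
  std::size_t sizeInBytes() const override { return device_.size(); }

  // Test helper: peek at the simulated device memory.
  unsigned char deviceByte(std::size_t i) const { return device_.at(i); }

private:
  std::vector<unsigned char> device_;
  std::vector<unsigned char> host_;
  bool mapped_ = false;
};

// RAII guard so that every map() is paired with an unmap().
class ScopedMap {
public:
  explicit ScopedMap(BackendTensor &t) : tensor_(t), ptr_(t.map()) {}
  ~ScopedMap() { tensor_.unmap(); }
  ScopedMap(const ScopedMap &) = delete;
  ScopedMap &operator=(const ScopedMap &) = delete;
  void *data() const { return ptr_; }

private:
  BackendTensor &tensor_;
  void *ptr_;
};
```

With this shape, host code only touches the payload inside a `ScopedMap` scope, and the backend decides whether mapping is a real transfer, a layout conversion, or a no-op.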

nadavrot commented on August 19, 2024

@jsubag This issue is related to #1334
