Extract motion vectors about dali HOT 7 OPEN

rvandeghen commented on May 30, 2024

Extract motion vectors

Comments (7)

JanuszL commented on May 30, 2024

Hi @rvandeghen,

Thank you for reaching out. As far as I understand, NVDEC doesn't expose this information, so extracting motion vectors is not possible in DALI. What you can do instead is use the optical flow operator (fn.optical_flow).

rvandeghen commented on May 30, 2024

@JanuszL thanks for the reply.

Do you know how to return both the list of frames and the list of optical flow (OF) fields? I get an error which I guess comes from the fact that len(frames) = len(OF) + 1, so the shapes mismatch.

The code I use is the following:

from nvidia.dali import pipeline_def, fn, types
from nvidia.dali.plugin import pytorch

@pipeline_def
def create_video_reader_pipeline(sequence_length, files, crop_size, stride=1, shard_id=0, num_shards=1, seed=0):
    images = fn.readers.video(device="gpu",
                              filenames=files,
                              sequence_length=sequence_length,
                              normalized=False,
                              random_shuffle=False,
                              image_type=types.RGB,
                              dtype=types.UINT8,
                              initial_fill=16,
                              prefetch_queue_depth=2,
                              pad_last_batch=True,
                              name="Reader",
                              stride=stride,
                              enable_frame_num=False,
                              shard_id=shard_id,
                              num_shards=num_shards,
                              seed=seed,
                             )
    
    of = fn.optical_flow(images, output_grid=1)
    

    images = fn.crop_mirror_normalize(images,
                                      dtype=types.FLOAT,
                                      output_layout="FCHW",
                                      mean=[0.279*255, 0.452*255, 0.378*255],
                                      std=[0.188*255, 0.188*255, 0.171*255],
                                      mirror=False,#fn.random.coin_flip(),
                                      seed=seed
                                     )

    return images, of

class VideoDataset(pytorch.DALIGenericIterator):
    def __init__(self, *kargs, **kvargs):
        super().__init__(*kargs, **kvargs)

    def __next__(self):
        out, of = super().__next__()
        # DDP is used so only one pipeline per process
        # also we need to transform dict returned by DALIClassificationIterator to iterable
        # and squeeze the labels
        out = out[0]["data"]
        of = of[0]["data"]

        B, F, C, H, W = out.size()
        out = out.view(B*F, C, H, W)
        return out, of

device_id = 0
shard_id = 0
num_shards = 1
batch_size = 1
sequence_length = 10


crop_size=(224, 224)
stride=5

pipeline = create_video_reader_pipeline(batch_size=batch_size,
                                        sequence_length=sequence_length,
                                        num_threads=10,
                                        device_id=device_id,
                                        shard_id=shard_id,
                                        num_shards=num_shards,
                                        files=container_files,
                                        crop_size=crop_size,
                                        stride=stride,
                                        )

train_loader = VideoDataset(pipeline,
                            ["data"],
                            reader_name="Reader",
                            auto_reset=True,
                            last_batch_policy=pytorch.LastBatchPolicy.FILL
                            )

Error:

IndexError                                Traceback (most recent call last)
Cell In[40], line 22
      9 stride=5
     11 pipeline = create_video_reader_pipeline(batch_size=batch_size,
     12                                         sequence_length=sequence_length,
     13                                         num_threads=10,
   (...)
     19                                         stride=stride,
     20                                         )
---> 22 train_loader = VideoDataset(pipeline,
     23                             ["data"],
     24                             reader_name="Reader",
     25                             auto_reset=True,
     26                             last_batch_policy=pytorch.LastBatchPolicy.FILL
     27                             )

Cell In[39], line 37, in VideoDataset.__init__(self, *kargs, **kvargs)
     36 def __init__(self, *kargs, **kvargs):
---> 37     super().__init__(*kargs, **kvargs)

File ~/micromamba/envs/sn_mae/lib/python3.10/site-packages/nvidia/dali/plugin/pytorch.py:194, in DALIGenericIterator.__init__(self, pipelines, output_map, size, reader_name, auto_reset, fill_last_batch, dynamic_shape, last_batch_padded, last_batch_policy, prepare_first_batch)
    192 if self._prepare_first_batch:
    193     try:
--> 194         self._first_batch = DALIGenericIterator.__next__(self)
    195         # call to `next` sets _ever_consumed to True but if we are just calling it from
    196         # here we should set if to False again
    197         self._ever_consumed = False

File ~/micromamba/envs/sn_mae/lib/python3.10/site-packages/nvidia/dali/plugin/pytorch.py:220, in DALIGenericIterator.__next__(self)
    218 # segregate outputs into categories
    219 for j, out in enumerate(outputs[i]):
--> 220     category_outputs[self.output_map[j]] = out
    222 # Change DALI TensorLists into Tensors
    223 category_tensors = dict()

IndexError: list index out of range

JanuszL commented on May 30, 2024

Hi @rvandeghen,

I think your pipeline returns more outputs than the iterator consumes: it returns two (images and of), but output_map names only one ("data"). Can you try:

train_loader = VideoDataset(pipeline,
                            ["images", "of"],
                            reader_name="Reader",
                            auto_reset=True,
                            last_batch_policy=pytorch.LastBatchPolicy.FILL
                            )
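
The IndexError in the traceback is consistent with this explanation: the loop at pytorch.py:220 looks up one name in output_map for every pipeline output, and the pipeline returns two outputs while output_map holds a single name. A minimal pure-Python sketch of that lookup follows. It is a simplified stand-in, not DALI's code: `segregate_outputs` is a hypothetical helper, and placeholder strings stand in for the real batches.

```python
def segregate_outputs(outputs, output_map):
    """Map each pipeline output to the name at the same position in output_map."""
    category_outputs = {}
    for j, out in enumerate(outputs):
        # Raises IndexError when there are more outputs than names.
        category_outputs[output_map[j]] = out
    return category_outputs


# Placeholders for the two values returned by the pipeline (images, of).
pipeline_outputs = ["<frames batch>", "<optical flow batch>"]

# One name for two outputs: the second lookup goes out of range.
single_name_error = None
try:
    segregate_outputs(pipeline_outputs, ["data"])
except IndexError as err:
    single_name_error = err

# One name per output: every output gets its own key.
named = segregate_outputs(pipeline_outputs, ["images", "of"])
print(type(single_name_error).__name__)
print(sorted(named))
```

With two names in output_map, each batch dict returned by the iterator carries an "images" and an "of" entry, so the subclass's `__next__` would presumably read something like `batch[0]["images"]` and `batch[0]["of"]` (with `batch` being the value returned by `super().__next__()`) instead of unpacking two values and indexing "data".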

rvandeghen commented on May 30, 2024

Hi @JanuszL,

Indeed, the optical flow gives good results at almost no extra cost. However, I found the following information in the blog post, and I would like to know whether DALI exposes this buffer:

The Optical Flow API returns a buffer consisting of confidence levels (called cost) for each of the flow vectors to deal with these situations. The application can use this cost buffer to selectively accept or discard regions of the flow vector map.

Renaud

JanuszL commented on May 30, 2024

Hi @rvandeghen,

Currently, the operator doesn't ask the Optical Flow SDK to provide such values.
If you have some spare time, you could add this yourself:

- Enable enableOutputCost at https://github.com/NVIDIA/DALI/blob/main/dali/operators/sequence/optical_flow/optical_flow_impl/optical_flow_impl.cc#L321
- Create one more buffer for the cost: https://github.com/NVIDIA/DALI/blob/main/dali/operators/sequence/optical_flow/optical_flow_impl/optical_flow_impl.cc#L145
- Pass it upon invocation: https://github.com/NVIDIA/DALI/blob/main/dali/operators/sequence/optical_flow/optical_flow_impl/optical_flow_impl.cc#L349
- Extend the operator with an additional argument that defines whether the cost is returned and exposes one more output: https://github.com/NVIDIA/DALI/blob/main/dali/operators/sequence/optical_flow/optical_flow.h and https://github.com/NVIDIA/DALI/blob/main/dali/operators/sequence/optical_flow/optical_flow.cc
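
If such a cost output were available, the blog post's idea of accepting or discarding flow vectors by cost could be sketched roughly as below. This is illustrative only: DALI does not currently return a cost buffer, `mask_flow_by_cost` and `max_cost` are hypothetical names, the threshold is arbitrary, and the sketch assumes that a higher cost means lower confidence. Plain Python lists stand in for the real tensors.

```python
def mask_flow_by_cost(flow, cost, max_cost):
    """Zero out flow vectors whose cost exceeds max_cost.

    flow: H x W grid of (dx, dy) tuples.
    cost: H x W grid of numbers (assumed: higher cost = lower confidence).
    Returns (filtered_flow, keep_mask).
    """
    keep = [[c <= max_cost for c in row] for row in cost]
    filtered = [
        [vec if ok else (0.0, 0.0) for vec, ok in zip(flow_row, keep_row)]
        for flow_row, keep_row in zip(flow, keep)
    ]
    return filtered, keep


# Toy 2 x 2 flow field and a made-up cost grid of the same shape.
flow = [[(1.0, 0.5), (4.0, -3.0)],
        [(0.0, 0.0), (2.0, 2.0)]]
cost = [[3, 250],
        [1, 40]]

filtered, keep = mask_flow_by_cost(flow, cost, max_cost=100)
print(keep)
print(filtered)
```

Returning the mask alongside the filtered field lets downstream code decide whether to zero, interpolate, or ignore the rejected regions.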
