facebookresearch / pytorchvideo
A deep learning library for video understanding research.
Home Page: https://pytorchvideo.org/
License: Apache License 2.0
The tutorial should add Lambda(lambda x: x / 255.0) to the transform, right?
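For reference, a minimal sketch of where the suggested scaling would go in the tutorial-style transform pipeline (assuming the usual ApplyTransformToKey setup; NormalizeVideo from torchvision is used here so the snippet does not depend on which pytorchvideo release provides Normalize):

from torchvision.transforms import Compose, Lambda
from torchvision.transforms._transforms_video import NormalizeVideo
from pytorchvideo.transforms import ApplyTransformToKey, UniformTemporalSubsample

transform = ApplyTransformToKey(
    key="video",
    transform=Compose(
        [
            UniformTemporalSubsample(8),
            Lambda(lambda x: x / 255.0),  # scale uint8 frames to [0, 1] before normalizing
            NormalizeVideo((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
        ]
    ),
)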
File "train.py", line 349, in len
return self.dataset.num_videos()
TypeError: 'int' object is not callable
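A likely cause: in pytorchvideo, LabeledVideoDataset.num_videos is a property that already returns an int, so calling it fails. A hedged sketch of the fix for the __len__ shown above:

def __len__(self):
    # num_videos is a property, not a method, so drop the parentheses
    return self.dataset.num_videos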
I would like to use this on real-time video rather than video files. Possible?
ImportError: cannot import name 'Normalize' from 'pytorchvideo.transforms.transforms'
I want to use the default dataset code and change the model file. Are there any instructions for training and inference code to reproduce the existing model results and to develop my own model? Thanks!
While following the "Running a pre-trained PyTorchVideo classification model using Torch Hub" tutorial, I get this error when I call preds = model(inputs). I only get the error when I use "slowfast_r50" as the model name; when I change the model name to slow_r50 it works fine.
RuntimeError Traceback (most recent call last)
in
----> 1 preds = model(inputs)
~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),
~\pytorchvideo\pytorchvideo\models\net.py in forward(self, x)
41 def forward(self, x: torch.Tensor) -> torch.Tensor:
42 for idx in range(len(self.blocks)):
---> 43 x = self.blocks[idx](x)
44 return x
45
~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),
~\pytorchvideo\pytorchvideo\models\net.py in forward(self, x)
85 for pathway_idx in range(len(self.multipathway_blocks)):
86 if self.multipathway_blocks[pathway_idx] is not None:
---> 87 x_out[pathway_idx] = self.multipathway_blocks[pathway_idx](
88 x[pathway_idx]
89 )
~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),
~\pytorchvideo\pytorchvideo\models\stem.py in forward(self, x)
251
252 def forward(self, x: torch.Tensor) -> torch.Tensor:
--> 253 x = self.conv(x)
254 if self.norm is not None:
255 x = self.norm(x)
~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),
~\anaconda3\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
518 self.weight, self.bias, self.stride, _triple(0),
519 self.dilation, self.groups)
--> 520 return F.conv3d(input, self.weight, self.bias, self.stride,
521 self.padding, self.dilation, self.groups)
522
RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 3, 1, 7, 7], but got 4-dimensional input of size [1, 8, 256, 256] instead
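The SlowFast models take a list of two clip tensors (slow and fast pathways), each of shape (B, C, T, H, W), rather than the single tensor that works for slow_r50. A sketch of the PackPathway-style transform used in the Torch Hub inference tutorial (alpha=4 is the usual slow/fast frame-rate ratio for slowfast_r50; verify against your version of the tutorial):

import torch

class PackPathway(torch.nn.Module):
    """Turn a (C, T, H, W) clip into the [slow, fast] list that SlowFast expects."""
    def __init__(self, alpha: int = 4):
        super().__init__()
        self.alpha = alpha

    def forward(self, frames: torch.Tensor):
        fast_pathway = frames
        # Temporally subsample the frames for the slow pathway.
        slow_pathway = torch.index_select(
            frames,
            1,
            torch.linspace(0, frames.shape[1] - 1, frames.shape[1] // self.alpha).long(),
        )
        return [slow_pathway, fast_pathway]

Appending PackPathway() as the last video transform, and keeping a full 5-dimensional (B, C, T, H, W) input per pathway, should resolve the shape mismatch above.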
The hub contains models defined with lambda functions. Those should be avoided, as they can't be pickled.
I tried to use the tutorial provided in the documentation, "Training a PyTorchVideo classification model", with the Kinetics dataset.
In the official annotation file I noticed there are two columns, "time_start" and "time_end", which indicate when the activity occurs. I didn't see any mention of that in the LabeledVideoDataset class; you just expect the csv to contain "video_path label".
So my question is: do you assume the videos are already trimmed to the right time range? Or do you ignore the frame-level labels and just feed a random temporal crop of the video, assuming the activity is contained in it?
By the way, the official script for downloading the Kinetics dataset doesn't work, at least for me (I couldn't initialize a new conda env with the environment file they provide, and even after installing the package by hand the script doesn't do anything).
Hi!
I would like to fine-tune a pre-trained model using the AVA dataset format. How can I achieve this using pytorchvideo?
The current tutorial only shows how to run inference on already fine-tuned models.
Thank you!
As described in some of your tutorials, I typically use torchvision transformations which I compose and then feed into ApplyTransformToKey. I have tried many, and they all seem to work.
However, it seems ColorJitter (from torchvision.transforms) is the first that does not work. It confuses the time dimension with the channel dimension in the tensor, and therefore says "TypeError: Input image tensor permitted channel values are [3], but found 30".
I was wondering, is this intended behavior? If so, is there any way I can make ColorJitter work for video with pytorchvideo?
Thanks a lot in advance!
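One workaround, since ColorJitter operates on (..., C, H, W) tensors while PyTorchVideo clips are (C, T, H, W): temporarily move the time dimension in front of the channels so the frames are treated as a batch. A hedged sketch (the jitter parameters are sampled once per clip, which keeps all frames consistent):

from torchvision.transforms import ColorJitter, Compose, Lambda

video_color_jitter = Compose(
    [
        Lambda(lambda x: x.permute(1, 0, 2, 3)),  # (C, T, H, W) -> (T, C, H, W)
        ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
        Lambda(lambda x: x.permute(1, 0, 2, 3)),  # back to (C, T, H, W)
    ]
)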
I want to train my own video classification model, but I don't know how to load part of a pre-trained model. All I can find are create_resnet and create_slowfast.
I am using my own dataset for training. Previously the data was stored as mp4 files, similar to Kinetics. Now I want to convert the mp4 files into video frames, so that each folder contains the frames of one video in jpg format. How should I modify the code?
I am trying to load every model from the model zoo one by one. However, there is no list of model names to use in the torch.hub.load() function. For example, from the tutorial I know that 'slow_r50' is the name corresponding to the Slow R50 8x8 model, but I can't find how to load the Slow R50 4x16 model. More specifically, what is the string I use in the torch.hub.load() function? Another example is the I3D model: I have tried a number of different strings to load a pre-trained I3D model from Torch Hub, but none work. I see a few different models in hubconf.py, but not all of them.
Could you supply a list of strings for the torch.hub.load() function and their corresponding model zoo names?
Thanks!
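A workaround until such a list is documented: torch.hub.list prints every entry point defined in the repository's hubconf.py, and those strings are exactly what torch.hub.load accepts. A quick sketch:

import torch

# Enumerate all hub entry points (model names) exposed by the repo's hubconf.py.
print(torch.hub.list("facebookresearch/pytorchvideo"))

# Then load one of the printed names, e.g. the Slow R50 model from the tutorial.
model = torch.hub.load("facebookresearch/pytorchvideo", "slow_r50", pretrained=True)

Variants that are not listed there (for example a Slow R50 4x16 entry) are not exposed through Torch Hub in that release and would have to be built with the create_* functions plus a checkpoint from the model zoo.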
Request: Implementation for AVA Dataset for localized human actions research
We're doing some work on action recognition models with localised actions in a frame, using the other facebookresearch SlowFast repo. This implementation seems like a better way to go in terms of development, but we are currently stuck because we have our own dataset formatted like AVA.
We would like to be able to train SlowFast models using the AVA dataset! We would be willing to help with the creation of the DataLoader as well.
Hi, I'm following the tutorial Training a PyTorchVideo classification model and I believe I can't load the data correctly.
I'm using Google Colab and my Kinetics400 is in my Google Drive. I've preprocessed the Kinetics such that all the videos are rescaled to height=256 pixels.
My Dataloader is implemented in the same way as described in the tutorial:
class KineticsDataModule(pytorch_lightning.LightningDataModule):
    """
    This LightningDataModule implementation constructs a PyTorchVideo Kinetics dataset for both
    the train and val partitions. It defines each partition's augmentation and
    preprocessing transforms and configures the PyTorch DataLoaders.
    """

    # Dataset configuration
    _DATA_PATH = '/content/drive/MyDrive/Datasets/Kinetics400/'
    _CLIP_DURATION = 2  # Duration of sampled clip for each video
    _BATCH_SIZE = 8
    _NUM_WORKERS = 8  # Number of parallel processes fetching data

    def train_dataloader(self):
        """
        Create the Kinetics train partition from the list of video labels
        in {self._DATA_PATH}/train.csv. Add transform that subsamples and
        normalizes the video before applying the scale, crop and flip augmentations.
        """
        train_transform = Compose(
            [
                ApplyTransformToKey(
                    key="video",
                    transform=Compose(
                        [
                            UniformTemporalSubsample(8),
                            Normalize((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
                            RandomShortSideScale(min_size=256, max_size=320),
                            RandomCrop(244),
                            RandomHorizontalFlip(p=0.5),
                        ]
                    ),
                ),
            ]
        )
        train_dataset = pytorchvideo.data.Kinetics(
            data_path=os.path.join(self._DATA_PATH, "train.csv"),
            clip_sampler=pytorchvideo.data.make_clip_sampler("random", self._CLIP_DURATION),
            transform=train_transform
        )
        return torch.utils.data.DataLoader(
            train_dataset,
            batch_size=self._BATCH_SIZE,
            num_workers=self._NUM_WORKERS,
        )

    def val_dataloader(self):
        """
        Create the Kinetics val partition from the list of video labels
        in {self._DATA_PATH}/val.csv. Add transform that subsamples and
        normalizes the video before applying the scale.
        """
        val_transform = Compose(
            [
                ApplyTransformToKey(
                    key="video",
                    transform=Compose(
                        [
                            UniformTemporalSubsample(8),
                            Normalize((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
                        ]
                    ),
                ),
            ]
        )
        val_dataset = pytorchvideo.data.Kinetics(
            data_path=os.path.join(self._DATA_PATH, "val.csv"),
            clip_sampler=pytorchvideo.data.make_clip_sampler("uniform", self._CLIP_DURATION),
            transform=val_transform
        )
        return torch.utils.data.DataLoader(
            val_dataset,
            batch_size=self._BATCH_SIZE,
            num_workers=self._NUM_WORKERS,
        )
I built a default ResNet just like the tutorial. Following the tutorial up to the training step, I run a cell in Google Colab containing only train() to invoke the train() function.
Even though I'm randomly cropping to 224x224 in Transforms, I'm getting the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-2da0ffaf5447> in <module>()
----> 1 train()
13 frames
<ipython-input-6-cd4463cf3c91> in train()
3 data_module = KineticsDataModule()
4 trainer = pytorch_lightning.Trainer()
----> 5 trainer.fit(classification_module, data_module)
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
456 )
457
--> 458 self._run(model)
459
460 assert self.state.stopped
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in _run(self, model)
754
755 # dispatch `start_training` or `start_evaluating` or `start_predicting`
--> 756 self.dispatch()
757
758 # plugin will finalized fitting (e.g. ddp_spawn will load trained model)
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in dispatch(self)
795 self.accelerator.start_predicting(self)
796 else:
--> 797 self.accelerator.start_training(self)
798
799 def run_stage(self):
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/accelerators/accelerator.py in start_training(self, trainer)
94
95 def start_training(self, trainer: 'pl.Trainer') -> None:
---> 96 self.training_type_plugin.start_training(trainer)
97
98 def start_evaluating(self, trainer: 'pl.Trainer') -> None:
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py in start_training(self, trainer)
142 def start_training(self, trainer: 'pl.Trainer') -> None:
143 # double dispatch to initiate the training loop
--> 144 self._results = trainer.run_stage()
145
146 def start_evaluating(self, trainer: 'pl.Trainer') -> None:
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in run_stage(self)
805 if self.predicting:
806 return self.run_predict()
--> 807 return self.run_train()
808
809 def _pre_training_routine(self):
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in run_train(self)
840 self.progress_bar_callback.disable()
841
--> 842 self.run_sanity_check(self.lightning_module)
843
844 self.checkpoint_connector.has_trained = False
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in run_sanity_check(self, ref_model)
1105
1106 # run eval step
-> 1107 self.run_evaluation()
1108
1109 self.on_sanity_check_end()
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py in run_evaluation(self, on_epoch)
947 dl_max_batches = self.evaluation_loop.max_batches[dataloader_idx]
948
--> 949 for batch_idx, batch in enumerate(dataloader):
950 if batch is None:
951 continue
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in __next__(self)
515 if self._sampler_iter is None:
516 self._reset()
--> 517 data = self._next_data()
518 self._num_yielded += 1
519 if self._dataset_kind == _DatasetKind.Iterable and \
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
1197 else:
1198 del self._task_info[idx]
-> 1199 return self._process_data(data)
1200
1201 def _try_put_index(self):
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _process_data(self, data)
1223 self._try_put_index()
1224 if isinstance(data, ExceptionWrapper):
-> 1225 data.reraise()
1226 return data
1227
/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
427 # have message field
428 raise self.exc_type(message=msg)
--> 429 raise self.exc_type(msg)
430
431
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 35, in fetch
return self.collate_fn(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in <dictcomp>
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 8, 256, 454] at entry 0 and [3, 8, 256, 144] at entry 5
I was expecting something like [3, 8, 244, 244] due to the RandomCrop(244) in the transform. What am I missing? Thanks in advance for your help!
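One likely explanation: the error comes from the validation dataloader (the Trainer runs a validation sanity check first), and the val_transform above has no spatial resize or crop, so each clip keeps its original width and the default collate cannot stack them. A hedged sketch of a fixed validation transform, assuming the same imports as the tutorial (Normalize, ShortSideScale, etc. from pytorchvideo.transforms, CenterCrop from torchvision.transforms):

val_transform = Compose(
    [
        ApplyTransformToKey(
            key="video",
            transform=Compose(
                [
                    UniformTemporalSubsample(8),
                    Normalize((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
                    ShortSideScale(256),   # deterministic resize
                    CenterCrop(244),       # deterministic crop -> every clip is (3, 8, 244, 244)
                ]
            ),
        ),
    ]
)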
Hi,
I am trying to use EncodedVideoDataset with VoxCeleb2, which features over 1 million videos of 3-20 seconds each. Currently, this makes me run out of RAM and my process is killed (I checked dmesg -T to make sure, and this is indeed what is happening). I have 32 GB of RAM on my machine, which should not be that low, I suppose, but I guess this is quite a lot of data as well.
Is there any workaround for this, or am I absolutely forced to move to a machine with more RAM?
Thanks a lot in advance.
I want to work with video data with clips of different length, i.e. a different number of frames. As of now, in default_collate(batch) in .../torch/utils/data/_utils/collate.py, the elements of the batch are transformed into a tensor using torch.stack(batch, 0, out=out). Are there any plans to introduce NestedTensors (https://github.com/pytorch/nestedtensor) in the near future, to be able to work with video clips of varying length?
Thanks in advance and thank you for the library, it is a pleasure to work with.
Hi, can pytorchvideo support Python 3.6? I'm running my program in a Docker container where I cannot change the Python version, and I hope there is a way to circumvent this minor version difference.
Enable the EncodedVideoDataset class to accept an fps argument which determines the frame rate at which the videos are read.
Several action recognition methods make use of lowering the input fps as a way of reducing computational load (e.g. https://arxiv.org/abs/2103.13915).
Currently there is no way of using the Kinetics dataset with a required fps. It would also be useful to specify the desired number of frames per clip, and get a clip with uniformly sampled frames from the whole video.
detectron2 and pytorchvideo have conflicting requirements for fvcore.
An exception is thrown when I load the dataset from a directory using the Kinetics dataset in the pytorchvideo data module, even though the directory already has the following format:
dir_path/<class_name>/<video_name>.mp4
../input/kinetics400partial/valid
├── blowing_glass
│   ├── ****.mp4
│   ├── ****.mp4
│   ├── ****.mp4
│   └── .....
├── long_jump
│   ├── ****.mp4
│   ├── ****.mp4
│   └── .....
The code of the data module is as follows.
class KineticsDataModule(pl.LightningDataModule):
    def __init__(self):
        super().__init__()
        self.transform = Compose(
            [
                ApplyTransformToKey(
                    key="video",
                    transform=Compose(
                        [
                            UniformTemporalSubsample(8),
                            Normalize((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
                            RandomShortSideScale(min_size=256, max_size=320),
                            RandomCrop(244),
                            RandomHorizontalFlip(p=0.5),
                        ]
                    ),
                ),
            ]
        )

    def train_dataloader(self):
        train_dataset = pytorchvideo.data.Kinetics(
            data_path="../input/kinetics400partial/train",
            clip_sampler=pytorchvideo.data.make_clip_sampler("random", 2),
            transform=self.transform,
        )
        return torch.utils.data.DataLoader(
            train_dataset,
            batch_size=8,
            num_workers=8,
        )

    def val_dataloader(self):
        val_dataset = pytorchvideo.data.Kinetics(
            data_path="../input/kinetics400partial/valid",
            clip_sampler=pytorchvideo.data.make_clip_sampler("uniform", 2),
            transform=self.transform,
        )
        return torch.utils.data.DataLoader(
            val_dataset,
            batch_size=8,
            num_workers=8,
        )
and the full log.
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-6-a7ab4758bd42> in <module>
2 data_module = KineticsDataModule()
3 trainer = pl.Trainer()
----> 4 trainer.fit(classification_module, datamodule=data_module)
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
497
498 # dispath `start_training` or `start_testing` or `start_predicting`
--> 499 self.dispatch()
500
501 # plugin will finalized fitting (e.g. ddp_spawn will load trained model)
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py in dispatch(self)
544
545 else:
--> 546 self.accelerator.start_training(self)
547
548 def train_or_test_or_predict(self):
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py in start_training(self, trainer)
71
72 def start_training(self, trainer):
---> 73 self.training_type_plugin.start_training(trainer)
74
75 def start_testing(self, trainer):
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py in start_training(self, trainer)
112 def start_training(self, trainer: 'Trainer') -> None:
113 # double dispatch to initiate the training loop
--> 114 self._results = trainer.run_train()
115
116 def start_testing(self, trainer: 'Trainer') -> None:
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py in run_train(self)
605 self.progress_bar_callback.disable()
606
--> 607 self.run_sanity_check(self.lightning_module)
608
609 # set stage for logging
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py in run_sanity_check(self, ref_model)
844 # to make sure program won't crash during val
845 if should_sanity_check:
--> 846 self.reset_val_dataloader(ref_model)
847 self.num_sanity_val_batches = [
848 min(self.num_sanity_val_steps, val_batches) for val_batches in self.num_val_batches
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py in reset_val_dataloader(self, model)
362 has_step = is_overridden('validation_step', model)
363 if has_loader and has_step:
--> 364 self.num_val_batches, self.val_dataloaders = self._reset_eval_dataloader(model, 'val')
365
366 def reset_test_dataloader(self, model) -> None:
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py in _reset_eval_dataloader(self, model, mode)
276 # always get the loaders first so we can count how many there are
277 loader_name = f'{mode}_dataloader'
--> 278 dataloaders = self.request_dataloader(getattr(model, loader_name))
279
280 if not isinstance(dataloaders, list):
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py in request_dataloader(self, dataloader_fx)
396 The dataloader
397 """
--> 398 dataloader = dataloader_fx()
399 dataloader = self._flatten_dl_only(dataloader)
400
<ipython-input-3-e66142f2568e> in val_dataloader(self)
56 data_path="../input/kinetics400partial/valid",
57 clip_sampler=pytorchvideo.data.make_clip_sampler("uniform", 2),
---> 58 transform=self.transform,
59 )
60 return torch.utils.data.DataLoader(
/opt/conda/lib/python3.7/site-packages/pytorchvideo/data/encoded_video_dataset.py in labeled_encoded_video_dataset(data_path, clip_sampler, video_sampler, transform, video_path_prefix, decode_audio, decoder)
266 # with PyTorch DataLoader workers. To avoid this, we make sure the PathManager
267 # calls (made by LabeledVideoPaths) are wrapped in their own sandboxed process.
--> 268 labeled_video_paths = LabeledVideoPaths.from_path(data_path)
269
270 labeled_video_paths.path_prefix = video_path_prefix
/opt/conda/lib/python3.7/site-packages/pytorchvideo/data/labeled_video_paths.py in from_path(cls, data_path)
30 return LabeledVideoPaths.from_csv(data_path)
31 elif g_pathmgr.isdir(data_path):
---> 32 return LabeledVideoPaths.from_directory(data_path)
33 else:
34 raise FileNotFoundError(f"{data_path} not found.")
/opt/conda/lib/python3.7/site-packages/pytorchvideo/data/labeled_video_paths.py in from_directory(cls, dir_path)
102 assert (
103 len(video_paths_and_label) > 0
--> 104 ), f"Failed to load dataset from {dir_path}."
105 return cls(video_paths_and_label)
106
AssertionError: Failed to load dataset from ../input/kinetics400partial/valid.
Hello, I trained a model on a custom dataset for video classification and got a weights file (.ckpt), but I want to use it in an Android app (https://github.com/pytorch/android-demo-app/tree/master/TorchVideo).
How can I save a .pt file from the .ckpt?
I also want to test (or demo) a video clip using my model; your test code is not working. Please help.
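A rough sketch of exporting the Lightning checkpoint to the TorchScript file the Android demo app expects. The module name, checkpoint filename, and input shape below are placeholders for whatever your training script actually defines:

import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

from train import VideoClassificationLightningModule  # hypothetical: your LightningModule

lit_model = VideoClassificationLightningModule.load_from_checkpoint("my_model.ckpt")
lit_model.eval()

# Trace the underlying nn.Module with a dummy clip of the shape you trained on.
example_clip = torch.rand(1, 3, 8, 224, 224)  # (B, C, T, H, W), adjust to your setup
traced = torch.jit.trace(lit_model.model, example_clip)

traced.save("video_model.pt")                            # plain TorchScript file
optimized = optimize_for_mobile(traced)
optimized._save_for_lite_interpreter("video_model.ptl")  # lite-interpreter format used by the demo app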
It throws an exception when I follow the official tutorial to implement a video classification model.
https://pytorchvideo.org/docs/tutorial_classification
Environment:
platform: macOS-10.16-x86_64-i386-64bit
python version: 3.8.5
torch version: 1.8.1
torchvision version: 0.9.1
pytorch_lightning version: 1.2.8
pytorchvideo version: 0.1.0
fvcore version: 0.1.4.post20210326
The code
import os
import pytorch_lightning as pl
import pytorchvideo.data
import torch.utils.data
from pytorchvideo.transforms import (
    ApplyTransformToKey,
    RandomShortSideScale,
    RemoveKey,
    ShortSideScale,
    UniformTemporalSubsample
)
from torchvision.transforms import (
    Compose,
    Normalize,
    RandomCrop,
    RandomHorizontalFlip
)


class KineticsDataModule(pl.LightningDataModule):
    def __init__(self):
        super().__init__()
        self.transform = Compose(
            [
                ApplyTransformToKey(
                    key="video",
                    transform=Compose(
                        [
                            UniformTemporalSubsample(8),
                            Normalize((0.45, 0.45, 0.45), (0.225, 0.225, 0.225)),
                            RandomShortSideScale(min_size=256, max_size=320),
                            RandomCrop(244),
                            RandomHorizontalFlip(p=0.5),
                        ]
                    ),
                ),
            ]
        )

    def train_dataloader(self):
        train_dataset = pytorchvideo.data.Kinetics(
            data_path=VIDEO_PATH + "/train",
            clip_sampler=pytorchvideo.data.make_clip_sampler("random", 2),
            transform=self.transform,
        )
        return torch.utils.data.DataLoader(
            train_dataset,
            batch_size=8,
            num_workers=8,
        )

    def val_dataloader(self):
        val_dataset = pytorchvideo.data.Kinetics(
            data_path=VIDEO_PATH + "/valid",
            clip_sampler=pytorchvideo.data.make_clip_sampler("uniform", 2),
            transform=self.transform,
        )
        return torch.utils.data.DataLoader(
            val_dataset,
            batch_size=8,
            num_workers=8,
        )


import pytorchvideo.models.resnet
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_kinetics_resnet():
    return pytorchvideo.models.resnet.create_resnet(
        input_channel=3,
        model_depth=50,
        model_num_class=4,
        norm=nn.BatchNorm3d,
        activation=nn.ReLU,
    )


class ClassificationModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = make_kinetics_resnet()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        # The model expects a video tensor of shape (B, C, T, H, W), which is the
        # format provided by the dataset
        y_hat = self.model(batch["video"])
        # Compute cross entropy loss, loss.backwards will be called behind the scenes
        # by PyTorchLightning after being returned from this method.
        loss = F.cross_entropy(y_hat, batch["label"])
        # Log the train loss to Tensorboard
        self.log("train_loss", loss.item())
        return loss

    def validation_step(self, batch, batch_idx):
        y_hat = self.model(batch["video"])
        loss = F.cross_entropy(y_hat, batch["label"])
        self.log("val_loss", loss)
        return loss

    def configure_optimizers(self):
        """
        Setup the Adam optimizer. Note, that this function also can return a lr scheduler, which is
        usually useful for training video models.
        """
        return torch.optim.Adam(self.parameters(), lr=1e-1)


classification_module = ClassificationModule()
data_module = KineticsDataModule()
trainer = pl.Trainer()
trainer.fit(classification_module, datamodule=data_module)
The full log:
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
| Name | Type | Params
-------------------------------
0 | model | Net | 31.7 M
-------------------------------
31.7 M Trainable params
0 Non-trainable params
31.7 M Total params
126.646 Total estimated model params size (MB)
Validation sanity check: 0%
0/2 [00:00<?, ?it/s]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-a7ab4758bd42> in <module>
2 data_module = KineticsDataModule()
3 trainer = pl.Trainer()
----> 4 trainer.fit(classification_module, datamodule=data_module)
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
497
498 # dispath `start_training` or `start_testing` or `start_predicting`
--> 499 self.dispatch()
500
501 # plugin will finalized fitting (e.g. ddp_spawn will load trained model)
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in dispatch(self)
544
545 else:
--> 546 self.accelerator.start_training(self)
547
548 def train_or_test_or_predict(self):
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py in start_training(self, trainer)
71
72 def start_training(self, trainer):
---> 73 self.training_type_plugin.start_training(trainer)
74
75 def start_testing(self, trainer):
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py in start_training(self, trainer)
112 def start_training(self, trainer: 'Trainer') -> None:
113 # double dispatch to initiate the training loop
--> 114 self._results = trainer.run_train()
115
116 def start_testing(self, trainer: 'Trainer') -> None:
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in run_train(self)
605 self.progress_bar_callback.disable()
606
--> 607 self.run_sanity_check(self.lightning_module)
608
609 # set stage for logging
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in run_sanity_check(self, ref_model)
862
863 # run eval step
--> 864 _, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
865
866 self.on_sanity_check_end()
~/opt/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in run_evaluation(self, max_batches, on_epoch)
711 dl_max_batches = self.evaluation_loop.max_batches[dataloader_idx]
712
--> 713 for batch_idx, batch in enumerate(dataloader):
714 if batch is None:
715 continue
~/opt/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __iter__(self)
353 return self._iterator
354 else:
--> 355 return self._get_iterator()
356
357 @property
~/opt/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _get_iterator(self)
299 else:
300 self.check_worker_number_rationality()
--> 301 return _MultiProcessingDataLoaderIter(self)
302
303 @property
~/opt/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
912 # before it starts, and __del__ tries to join but will get:
913 # AssertionError: can only join a started process.
--> 914 w.start()
915 self._index_queues.append(index_queue)
916 self._workers.append(w)
~/opt/anaconda3/lib/python3.8/multiprocessing/process.py in start(self)
119 'daemonic processes are not allowed to have children'
120 _cleanup()
--> 121 self._popen = self._Popen(self)
122 self._sentinel = self._popen.sentinel
123 # Avoid a refcycle if the target function holds an indirect
~/opt/anaconda3/lib/python3.8/multiprocessing/context.py in _Popen(process_obj)
222 @staticmethod
223 def _Popen(process_obj):
--> 224 return _default_context.get_context().Process._Popen(process_obj)
225
226 class DefaultContext(BaseContext):
~/opt/anaconda3/lib/python3.8/multiprocessing/context.py in _Popen(process_obj)
282 def _Popen(process_obj):
283 from .popen_spawn_posix import Popen
--> 284 return Popen(process_obj)
285
286 class ForkServerProcess(process.BaseProcess):
~/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py in __init__(self, process_obj)
30 def __init__(self, process_obj):
31 self._fds = []
---> 32 super().__init__(process_obj)
33
34 def duplicate_for_child(self, fd):
~/opt/anaconda3/lib/python3.8/multiprocessing/popen_fork.py in __init__(self, process_obj)
17 self.returncode = None
18 self.finalizer = None
---> 19 self._launch(process_obj)
20
21 def duplicate_for_child(self, fd):
~/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py in _launch(self, process_obj)
45 try:
46 reduction.dump(prep_data, fp)
---> 47 reduction.dump(process_obj, fp)
48 finally:
49 set_spawning_popen(None)
~/opt/anaconda3/lib/python3.8/multiprocessing/reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #
TypeError: cannot pickle 'torch._C.Generator' object
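This usually happens on macOS (and Windows), where DataLoader workers are started with the "spawn" method and must pickle the dataset; the PyTorchVideo dataset holds a torch.Generator for its random clip/video sampling, which cannot be pickled. A simple workaround, at the cost of single-process loading, is to keep the data loading in the main process in both dataloaders, e.g.:

def val_dataloader(self):
    val_dataset = pytorchvideo.data.Kinetics(
        data_path=VIDEO_PATH + "/valid",
        clip_sampler=pytorchvideo.data.make_clip_sampler("uniform", 2),
        transform=self.transform,
    )
    return torch.utils.data.DataLoader(
        val_dataset,
        batch_size=8,
        num_workers=0,  # main-process loading: nothing has to be pickled for a worker
    )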
I am trying to run the demo in pytorchvideo/tutorials/video_detection_example/video_detection_inference_tutorial.ipynb on Google Colab; however, when I try to load the video using
encoded_vid = pytorchvideo.data.encoded_video.EncodedVideo.from_path('theatre.webm')
print('Completed loading encoded video.')
Google Colab crashes, saying there is not enough RAM. Do I need Google Colab Pro to be able to run this demo?
RuntimeError: stack expects each tensor to be equal size, but got [3, 61, 864, 1152] at entry 0 and [3, 60, 864, 1152] at entry 1
Hi everyone! I am getting the error above when using a custom dataset; does anyone know why?
I assumed it was due to the videos having different frame rates (so the time dimension T would differ from one clip sample to another), and therefore I preprocessed them to have the same frame rate. However, it still does not work.
Thanks.
Hi everyone,
I am trying to use a custom dataset with pytorchvideo following the training tutorial; however, I am getting the following error: pytorchvideo.data.labeled_video_dataset: Failed to load video with error: video/_104.avi not found.; trial 0
I created the csv files formatted as described in the data preparation tutorial, as "path_to_video label".
The data path I specified is the directory containing train.csv and val.csv; in that same directory there is a "video" folder where the videos are, which is why the path in the error says "video/_104.avi".
How can I fix this?
Thanks!
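If the csv stores relative paths such as video/_104.avi, one possible fix is to tell the dataset which directory those paths are relative to via video_path_prefix (alternatively, write absolute paths into the csv). A sketch, with the dataset root path as a placeholder:

import os
import pytorchvideo.data

data_root = "/path/to/my_dataset"  # directory containing train.csv, val.csv and the "video" folder

train_dataset = pytorchvideo.data.Kinetics(
    data_path=os.path.join(data_root, "train.csv"),
    clip_sampler=pytorchvideo.data.make_clip_sampler("random", 2),
    video_path_prefix=data_root,  # prepended to every relative path read from the csv
)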
I'm currently writing a dataloader with pytorchvideo. Everything works fine except the speed: it takes almost 1 second to load a video each time. Some of my code is shown below.
transform = ApplyTransformToKey(
    key="video",
    transform=Compose([
        UniformTemporalSubsample(64),
        Lambda(lambda x: x / 255.0),
        NormalizeVideo(mean, std),
        ShortSideScale(size=256),
        CenterCropVideo(256),
    ]),
)
My transform, put simply, samples 64 frames from a 64-second video clip.
def __getitem__(self, index):
    video_data = self.video.get_clip(start_sec=index * self.length, end_sec=(index + 1) * self.length)
    video_data = self.transform(video_data)
    imgs_data, audios_data = video_data['video'], video_data['audio']
    labels = torch.from_numpy(self.va_read(int(index + 1))).t()
    return imgs_data, audios_data, labels, self.movie_name, index
Here is part of the time consumed in getting the data:
Epoch: [0][4/24] Time_train(average) 30.307 (30.044) Data_load(average) 30.035 (29.764)
Epoch: [0][5/24] Time_train(average) 58.678 (31.728) Data_load(average) 58.383 (31.448)
Epoch: [0][6/24] Time_train(average) 49.739 (35.320) Data_load(average) 49.315 (35.036)
Epoch: [0][6/24] Time_train(average) 53.579 (38.151) Data_load(average) 53.336 (37.872)
Epoch: [0][7/24] Time_train(average) 51.027 (38.822) Data_load(average) 50.760 (38.543)
So is there anything I can do to speed up video loading? One more thing: given a fixed clip length, the length of the audio does not seem to be the same across samples, which causes an error if the batch size is more than 1. How can I set the audio sampling rate? I would be thankful for any help or suggestions.
It would be great if the models API had an easy way to instantiate a model with the purpose of feature extraction!
Meanwhile, is there any workaround?
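One workaround until a dedicated API exists: the hub models are a pytorchvideo Net whose final block is the classification head, so you can run everything except that last block and pool the result. A hedged sketch (the 2048-dimensional output is what I'd expect for slow_r50; check for your model):

import torch

model = torch.hub.load("facebookresearch/pytorchvideo", "slow_r50", pretrained=True)
model.eval()

backbone = torch.nn.Sequential(*list(model.blocks[:-1]))  # drop the classification head

with torch.no_grad():
    clip = torch.rand(1, 3, 8, 224, 224)   # (B, C, T, H, W)
    feats = backbone(clip)                 # spatio-temporal feature map
    feats = feats.mean(dim=[2, 3, 4])      # global average pool -> (1, 2048) clip embedding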
Currently, it seems that we can only get a single clip from a video at each iteration.
Hi everyone!
Can someone help me understand how to load a CSN model from Torch Hub? Is it possible?
Thanks!
Link: https://pytorchvideo.org/
My import libraries
import torch
from torchvision.transforms import Compose, Lambda
from torchvision.transforms._transforms_video import (
CenterCropVideo,
NormalizeVideo,
)
from pytorchvideo.data.encoded_video import EncodedVideo
from pytorchvideo.transforms import (
ApplyTransformToKey,
ShortSideScale,
UniformTemporalSubsample,
UniformCropVideo
)
Current version on website:
# Generate top 5 predictions
post_act = F.softmax(dim=1)
preds = post_act(preds)
pred_class_ids = preds.topk(k=5).indices
Correct version:
# Generate top 5 predictions
post_act = torch.nn.Softmax(dim=1)
preds = post_act(preds)
pred_class_ids = preds.topk(k=5).indices
print(pred_class_ids)
#Mapping predicted classes
pred_class_names = [kinetics_id_to_classname[int(i)] for i in pred_class_ids[0]]
print("Predicted labels: %s" % ", ".join(pred_class_names))
[1] tensor([[192, 104, 315, 268, 78]], device='cuda:0')
[2] Predicted labels: marching, driving tractor, sled dog racing, riding camel, crossing river
https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html
I am looking into how to load a pre-trained model (on a dataset other than Kinetics). In particular, I want to load slowfast, pre-trained on Charades. For that, I tried the following.
from pytorchvideo.models.hub.slowfast import _slowfast
import torch

root_url = "https://dl.fbaipublicfiles.com/pytorchvideo/model_zoo"
checkpoint_path = f"{root_url}/charades/SLOWFAST_8x8_R50.pyth"
slowfast = _slowfast(pretrained=True, checkpoint_path=checkpoint_path)

with torch.no_grad():
    output = slowfast([torch.rand(1, 3, 8, 224, 224), torch.rand(1, 3, 32, 224, 224)])
    # torch.Size([1, 400])
    print(output.size())
My questions are:
Since the model loading succeeds, why does the model output still correspond to the Kinetics 400 number of classes? I expected an error in case the model architecture and the loaded checkpoint don't match.
I didn't find a concrete tutorial on how to load the models.
Once the model is loaded, what is the right way to preprocess the video so that it matches the exact preprocessing used for training on the corresponding dataset?
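Not a definitive answer, but one thing to check: Charades has 157 action classes, while _slowfast builds a 400-class Kinetics head by default. If your installed version forwards extra keyword arguments from the hub builder to create_slowfast (an assumption worth verifying in the source), asking for a matching head would look like this:

from pytorchvideo.models.hub.slowfast import _slowfast

root_url = "https://dl.fbaipublicfiles.com/pytorchvideo/model_zoo"
checkpoint_path = f"{root_url}/charades/SLOWFAST_8x8_R50.pyth"

# model_num_class is assumed to be forwarded to create_slowfast as the head size.
slowfast = _slowfast(
    pretrained=True,
    checkpoint_path=checkpoint_path,
    model_num_class=157,
)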
I get a RuntimeError: No such operator video_reader::read_video_from_memory when trying to decode a video using the torchvision backend.
Run the following code:
from pytorchvideo.data.encoded_video import EncodedVideo
path_video = "/path/to/video.mp4"
video_pyav = EncodedVideo.from_path(path_video, decoder='pyav') # Runs without any problem
video_torchvision = EncodedVideo.from_path(path_video, decoder='torchvision') # Throws error
Logs:
Failed to decode video of name <video_name>.mp4. No such operator video_reader::read_video_from_memory
Traceback (most recent call last):
File "/path/to/conda/env/lib/python3.7/site-packages/pytorchvideo/data/encoded_video_torchvision.py", line 206, in _torch_vision_decode_video
raise e
File "/path/to/conda/env/lib/python3.7/site-packages/pytorchvideo/data/encoded_video_torchvision.py", line 180, in _torch_vision_decode_video
tv_result = torch.ops.video_reader.read_video_from_memory(
File "/path/to/conda/env/lib/python3.7/site-packages/torch/_ops.py", line 60, in __getattr__
op = torch._C._jit_get_operation(qualified_op_name)
RuntimeError: No such operator video_reader::read_video_from_memory
python-BaseException
The current versions of the libraries are:
pytorchvideo -> 0.1.2
torch -> 1.9.0+cu111
torchvision -> 0.10.0+cu111
Thanks!
It would be great if LabeledVideoDataset could return clip_start and clip_end, which are returned by the sampler.
This is important if you are loading video and audio separately, since you need to know the clip_start and clip_end from the sampler to use the same for the audio, which was loaded separately but must be synced.
Literally adding clip_start and clip_end to the return dict of LabeledVideoDataset. I have done this locally and it works fine. I would do a pull request but it seems like such a small detail that it's not worth it. But if you would like me to do that let me know.
When I load a pretrained network and specify the number of classes, I correctly receive a tensor having 4 dimensions. However, I then compare this with the labels as suggested in the tutorial. Do I need something that retrieves the index of the most probable class, or is it handled automatically by Lightning?
Thanks
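For what it's worth, a sketch of a training_step in the tutorial's style: F.cross_entropy takes the raw (B, num_classes) logits and integer labels directly, so no argmax is needed for the loss; you only take the argmax yourself if you want predicted class indices for a metric:

import torch
import torch.nn.functional as F

def training_step(self, batch, batch_idx):
    y_hat = self.model(batch["video"])             # (B, num_classes) logits
    loss = F.cross_entropy(y_hat, batch["label"])  # labels are integer class indices

    preds = torch.argmax(y_hat, dim=1)             # index of the most probable class
    acc = (preds == batch["label"]).float().mean()
    self.log("train_loss", loss)
    self.log("train_acc", acc)
    return loss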
Is this going to be supported on PyTorch Hub?
Hi everyone!
I have a problem arising only when I exploit a pretrained network, no matter what the network is.
In particular I am getting the following error:
Traceback (most recent call last):
File "C:/Users/Microlab/Desktop/Marco/videoClassification/Train.py", line 199, in
train()
File "C:/Users/Microlab/Desktop/Marco/videoClassification/Train.py", line 196, in train
trainer.fit(classification_module)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 458, in fit
self._run(model)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 756, in _run
self.dispatch()
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 797, in dispatch
self.accelerator.start_training(self)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 807, in run_stage
return self.run_train()
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 869, in run_train
self.train_loop.run_training_epoch()
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 566, in run_training_epoch
self.on_train_epoch_end(epoch_output)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 606, in on_train_epoch_end
training_epoch_end_output = model.training_epoch_end(processed_epoch_output)
File "C:/Users/Microlab/Desktop/Marco/videoClassification/Train.py", line 133, in training_epoch_end
self.logger.experiment.add_graph(LightVideoClassification(), sampleImg)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\utils\tensorboard\writer.py", line 723, in add_graph
self._get_file_writer().add_graph(graph(model, input_to_model, verbose))
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\utils\tensorboard_pytorch_graph.py", line 292, in graph
raise e
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\utils\tensorboard_pytorch_graph.py", line 286, in graph
trace = torch.jit.trace(model, args)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\jit_trace.py", line 742, in trace
_module_class,
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\jit_trace.py", line 940, in trace_module
_force_outplace,
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "C:/Users/Microlab/Desktop/Marco/videoClassification/Train.py", line 70, in forward
return self.model(x)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorchvideo\models\net.py", line 43, in forward
x = self.blocks[idx](x)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\pytorchvideo\models\stem.py", line 253, in forward
x = self.conv(x)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "C:\Users\Microlab\miniconda3\envs\simone\lib\site-packages\torch\nn\modules\conv.py", line 521, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 3, 3, 7, 7], but got 4-dimensional input of size [3, 60, 224, 224] instead
Epoch 1: 75%|████████ | 174/232 [03:34<01:11, 1.23s/it, loss=0.703, v_num=15]
As you can see, the first epoch is perfectly finished as well as 75% of the second one.
Can anyone help me?
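A guess based on the shapes in the traceback: the clip passed to add_graph is (C, T, H, W) = [3, 60, 224, 224] and is missing the batch dimension that the 3D conv stem expects. A minimal sketch of the fix inside training_epoch_end, where sample_clip is a placeholder for whatever tensor you are currently passing as sampleImg:

# (3, 60, 224, 224) -> (1, 3, 60, 224, 224): conv3d needs a 5-dimensional (B, C, T, H, W) input
sample_input = sample_clip.unsqueeze(0)
self.logger.experiment.add_graph(self, sample_input)

Passing the already-trained module (self) rather than a freshly constructed LightVideoClassification() also means the traced graph reflects the model you are actually training.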
I followed the tutorial to train my own video classification model, but this bug appeared when I wanted to use the saved model for inference.
Traceback (most recent call last):
  File "eval.py", line 181, in <module>
    main()
  File "eval.py", line 122, in main
    model = MyLightingModule.load_from_checkpoint(checkpoint_path)
  File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 157, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 199, in _load_model_state
    model = cls(**_cls_kwargs)
TypeError: __init__() missing 1 required positional argument: 'args'
Then I searched for information online, and it was suggested to add the code from this issue:
Lightning-AI/pytorch-lightning#2909
However, after I added this line of code and retrained, there was another bug. Can someone tell me why?
raceback (most recent call last): File "train_frame.py", line 630, in <module> main() File "train_frame.py", line 610, in main train(args) File "train_frame.py", line 617, in train trainer.fit(classification_module, data_module) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit self._run(model) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 756, in _run self.dispatch() File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 797, in dispatch self.accelerator.start_training(self) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training self.training_type_plugin.start_training(trainer) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training self._results = trainer.run_stage() File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 807, in run_stage return self.run_train() File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 869, in run_train self.train_loop.run_training_epoch() File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 584, in run_training_epoch self.trainer.run_evaluation(on_epoch=True) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1006, in run_evaluation self.evaluation_loop.on_evaluation_end() File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 102, in on_evaluation_end self.trainer.call_hook('on_validation_end', *args, **kwargs) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1223, in call_hook trainer_hook(*args, **kwargs) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/callback_hook.py", line 227, in on_validation_end callback.on_validation_end(self, self.lightning_module) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 249, in on_validation_end self.save_checkpoint(trainer) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 298, in save_checkpoint self._save_top_k_checkpoint(trainer, monitor_candidates) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 669, in _save_top_k_checkpoint self._update_best_and_save(current, trainer, monitor_candidates) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 730, in _update_best_and_save self._save_model(trainer, filepath) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 449, in _save_model self._do_save(trainer, filepath) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 460, in _do_save trainer.save_checkpoint(filepath, self.save_weights_only) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/properties.py", line 330, in save_checkpoint self.checkpoint_connector.save_checkpoint(filepath, weights_only) File 
"/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 392, in save_checkpoint self.trainer.accelerator.save_checkpoint(_checkpoint, filepath) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 516, in save_checkpoint self.training_type_plugin.save_checkpoint(checkpoint, filepath) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 256, in save_checkpoint atomic_save(checkpoint, filepath) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/pytorch_lightning/utilities/cloud_io.py", line 64, in atomic_save torch.save(checkpoint, bytesbuffer) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 379, in save _save(obj, opened_zipfile, pickle_module, pickle_protocol) File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 484, in _save pickler.dump(obj) _pickle.PicklingError: Can't pickle <function <lambda> at 0x7ff195ab3b80>: attribute lookup <lambda> on pytorchvideo.models.resnet failed Exception ignored in: <function tqdm.__del__ at 0x7ff1b151cdc0> Traceback (most recent call last): File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1145, in __del__ File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1299, in close File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1492, in display File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1148, in __str__ File "/data1/thorqian/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1450, in format_dict TypeError: cannot unpack non-iterable NoneType object
The current tutorial only supports training the slow_r50 model, not the slowfast_r50 model.
I would like pytorchvideo to support training the slowfast_r50 model by modifying the data loader in the classification tutorial example.
Hi,
I want to train the SlowFast model SLOWFAST_8x8_R50 with my own dataset (9 action classes); could you provide fine-tuning code?
Thanks a lot.
I was wondering if there was a way to do action recognition with pytorchvideo using a live webcam
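PyTorchVideo's datasets and EncodedVideo read from files, not cameras, but you can buffer webcam frames yourself (for example with OpenCV) and feed fixed-length clips to any clip model. A rough sketch, assuming a Kinetics-pretrained slow_r50 and deliberately simplified preprocessing:

import cv2
import torch

model = torch.hub.load("facebookresearch/pytorchvideo", "slow_r50", pretrained=True)
model.eval()

cap = cv2.VideoCapture(0)   # webcam
frames = []
with torch.no_grad():
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(cv2.resize(frame, (224, 224)), cv2.COLOR_BGR2RGB)
        frames.append(torch.from_numpy(frame))
        if len(frames) == 8:                                                # one 8-frame clip
            clip = torch.stack(frames).permute(3, 0, 1, 2).float() / 255.0  # (C, T, H, W)
            clip = (clip - 0.45) / 0.225                                    # rough normalization
            preds = model(clip.unsqueeze(0))                                # (1, 400) Kinetics logits
            print(preds.topk(k=1).indices)
            frames = []
cap.release()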
when using
model_name = "slowfast_r50"
SLOWFAST50_MODEL = torch.hub.load("facebookresearch/pytorchvideo", model=model_name, pretrained=True)
I came across the following bug:
/root/.cache/torch/hub/facebookresearch_pytorchvideo_master/hubconf.py in <module>()
2
3 dependencies = ["torch"]
----> 4 from pytorchvideo.models.hub import ( # noqa: F401, E402
5 efficient_x3d_s,
6 efficient_x3d_xs,
ImportError: cannot import name 'slow_r50_detection' from 'pytorchvideo.models.hub' (/usr/local/lib/python3.7/dist-packages/pytorchvideo/models/hub/__init__.py)
My code came from https://pytorchvideo.org/docs/tutorial_torchhub_inference.
My environment is Google Colab.
It seems to be a bug in the official repository. Thanks.
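What seems to be happening: torch.hub downloads hubconf.py from the repository's master branch, and that file imports names (such as slow_r50_detection) that the older pip-installed pytorchvideo 0.1.x does not provide. Two hedged workarounds: upgrade the installed package (pip install -U pytorchvideo, or install from the GitHub main branch), or load the hub entry points from a ref that matches your installed release and refresh the cached checkout:

import torch

model = torch.hub.load(
    "facebookresearch/pytorchvideo:0.1.1",  # hypothetical tag; pick a ref matching your installed version
    model="slowfast_r50",
    pretrained=True,
    force_reload=True,                      # re-download the hub checkout instead of using the stale cache
)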
Hi! I have tried using the recommended action recognition models on pre-existing videos. I would like to visualize the results as shown in this video - https://www.youtube.com/watch?v=b7-gnpqz9Qg&ab_channel=FacebookAI
Currently, the available models don't support pretrained=True together with num_classes={something_different_than_original_model}, as the checkpoint is loaded with load_state_dict using strict loading.
Dear people from PyTorchVideo,
First, congratulations on releasing this framework. It is fabulous !
We plan to integrate this framework within Lightning Flash.
We recently created a new BETA data processing API called DataPipeline. It makes Dataset obsolete and enables very thin customization and quick data augmentation experimentation.
It is built out of Preprocess and Postprocess with multiple hooks to override. It aims at bridging the skew between training / serving.
Here is the tutorial: https://lightning-flash.readthedocs.io/en/latest/custom_task.html
Here is the doc about it: https://lightning-flash.readthedocs.io/en/latest/general/data.html
Here is the Lightning Flash GitHub: https://github.com/PytorchLightning/lightning-flash
We are really keen to collaborate to make this framework integrated within Lightning Flash.
Small nits for the documentation:
PyTorch Lightning provides quantization as a Callback: https://pytorch-lightning.readthedocs.io/en/stable/advanced/pruning_quantization.html?highlight=Quantization#quantization
PyTorch Lightning help with jit export: https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.core.lightning.html?highlight=to_jit#pytorch_lightning.core.lightning.LightningModule.to_torchscript
LightningModule provides a load_from_checkpoint function which downloads automatically from a URL and doesn't require instantiating the model. It relies on save_hyperparameters internally.
Here: https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html#save-hyperparameters
This should work.
model = EfficientX3d.load_from_checkpoint('https://dl.fbaipublicfiles.com/pytorchvideo/model_zoo/kinetics/efficient_x3d_xs_original_form.pyth')
Best,
Thomas Chaton.
I'm trying to sample video clips and audio samples stored in mp4 format by creating a dataset as follows:
sampler = RandomSampler
num_frames = 25
fps = 25
dataset = labeled_encoded_video_dataset(
    data_path=os.path.join(root, "val.csv"),
    clip_sampler=make_clip_sampler("uniform", num_frames / fps),
    video_path_prefix=root,
    transform=val_transform,
    video_sampler=sampler,
)
Given that the duration passed to make_clip_sampler is 1 second (num_frames / fps), I expected 25 video frames and 16,000 audio samples to be returned (sampling rate = 16,000). But instead, each sample consists of 26 video frames and 15,360 audio samples. I was digging into the code in encoded_video_pyav.py, and I'm wondering whether an off-by-one error is being made in the below code by using a "<=" operator instead of a "<"?
video_frames = [
    f
    for f, pts in self._video
    if pts >= video_start_pts and pts <= video_end_pts
]
Also, it seems to me that audio frames, each of length 1024, are concatenated, always producing audio tensors with lengths that are multiples of 1024 (15,360 in this example), instead of lengths corresponding to the duration specified (16,000 in this example).
Is this behaviour expected? How can I get video and audio lengths that correspond to the duration passed to make_clip_sampler above?
Many thanks!
When I use the following steps to install pytorchvideo:
3. Install from a local clone
git clone https://github.com/facebookresearch/pytorchvideo.git
cd pytorchvideo
pip install -e .
# For developing and testing
pip install -e . [test,dev]
I got the following error. How can I resolve this problem?
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Obtaining file:///home/tccmedia/github/pytorchvideo
Collecting fvcore
Downloading fvcore-0.1.5.post20210630.tar.gz (49 kB)
|████████████████████████████████| 49 kB 3.2 MB/s
Collecting av
Downloading av-8.0.3-cp36-cp36m-manylinux2010_x86_64.whl (37.2 MB)
|████████████████████████████████| 37.2 MB 6.7 MB/s
Collecting parameterized
Downloading parameterized-0.8.1-py2.py3-none-any.whl (26 kB)
Collecting iopath
Downloading iopath-0.1.9-py3-none-any.whl (27 kB)
ERROR: Package 'pytorchvideo' requires a different Python: 3.6.9 not in '>=3.7'
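As the message says, pytorchvideo's packaging requires Python >= 3.7, so the install has to run under a newer interpreter. One way, assuming conda is available (extending the install steps quoted above):

conda create -n pytorchvideo python=3.8 -y
conda activate pytorchvideo
pip install -e .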
Hello,
Thank you for releasing this nice framework for the video domain.
I was wondering if it is possible to use the C2D model with this library? It is mentioned in the PyTorchVideo benchmarks, but it does not seem possible to load it with torch.hub, as it is not listed in hubconf.py.
Is there another way to use the C2D model?