
classyvision's Introduction


Classy Vision is no longer actively maintained.

The latest stable version is 0.7.0, available on pip, and has been tested to work with PyTorch 2.0.

What's New:

2020-11-20: Classy Vision v0.5 Released

New Features

  • Release Vision Transformers model implementation, with recipes (#646)
  • Implemented gradient clipping (#643)
  • Implemented gradient accumulation (#644)
  • Added support for AdamW (#636)
  • Added Precise batch norm hook (#592)
  • Added support for adaptive pooling in fully_convolutional_linear_head (#602)
  • Added support for sync batch norm group size (#534)
  • Added a CSV Hook to manually inspect model predictions
  • Added a ClassyModel tutorial (#485)
  • Migrated to Hydra 1.0 (#536)
  • Migrated off of tensorboardX (#488)

Breaking Changes

  • ClassyOptimizer API improvements
    • added OptionsView to retrieve options from the optimizer param_group
  • Removed ClassyModel.evaluation_mode (#521)
  • Removed ImageNetDataset, now a subset of ImagePathDataset (#494)
  • Renamed is_master to is_primary in distributed_util (#576)
2020-04-29: Classy Vision v0.4 Released

New Features

  • Release EfficientNet model implementation (#475)
  • Add support to convert any PyTorch model to a ClassyModel with the ability to attach heads to it (#461)
    • Added a corresponding tutorial on ClassyModel and ClassyHeads (#485)
  • Squeeze and Excitation support for ResNe(X)t and DenseNet models (#426, #427)
  • Made ClassyHooks registrable (#401) and configurable (#402)
  • Migrated to TorchElastic v0.2.0 (#464)
  • Add SyncBatchNorm support (#423)
  • Implement mixup train augmentation (#469)
  • Support LARC for SGD optimizer (#408)
  • Added convenience wrappers for Iterable datasets (#455)
  • Tensorboard improvements
    • Plot histograms of model weights to Tensorboard (#432)
    • Reduce data logged to tensorboard (#436)
  • Invalid (NaN / Inf) loss detection
  • Revamped logging (#478)
  • Add bn_weight_decay configuration option for ResNe(X)t models
  • Support specifying update_interval to Parameter Schedulers (#418)

Breaking changes

  • ClassificationTask API improvement and train_step, eval_step simplification
    • Removed local_variables from ClassificationTask (#411, #412, #413, #414, #416, #421)
    • Move use_gpu from ClassyTrainer to ClassificationTask (#468)
    • Move num_dataloader_workers out of ClassyTrainer (#477)
  • Rename lr to value in parameter schedulers (#417)
2020-03-06: Classy Vision v0.3 Released

Release notes

  • checkpoint_folder renamed to checkpoint_load_path (#379)
  • head support on DenseNet (#383)
  • Cleaner abstraction in ClassyTask/ClassyTrainer: eval_step, on_start, on_end, …
  • Speed metrics in TB (#385)
  • test_phase_period in ClassificationTask (#395)
  • support for losses with trainable parameters (#394)
  • Added presets for some typical ResNe(X)t configurations (#405)

About

Classy Vision is a new end-to-end, PyTorch-based framework for large-scale training of state-of-the-art image and video classification models. Previous computer vision (CV) libraries have been focused on providing components for users to build their own frameworks for their research. While this approach offers flexibility for researchers, in production settings it leads to duplicative efforts, and requires users to migrate research between frameworks and to relearn the minutiae of efficient distributed training and data loading. Our PyTorch-based CV framework offers a better solution for training at scale and for deploying to production. It offers several notable advantages:

  • Ease of use. The library features a modular, flexible design that allows anyone to train machine learning models on top of PyTorch using very simple abstractions. The system also has out-of-the-box integration with Amazon Web Services (AWS), facilitating research at scale and making it simple to move between research and production.
  • High performance. Researchers can use the framework to train Resnet50 on ImageNet in as little as 15 minutes, for example.
  • Demonstrated success in training at scale. We’ve used it to replicate the state-of-the-art results from the paper Exploring the Limits of Weakly Supervised Pretraining.
  • Integration with PyTorch Hub. AI researchers and engineers can download and fine-tune the best publicly available ImageNet models with just a few lines of code.
  • Elastic training. We have also added experimental integration with PyTorch Elastic, which allows distributed training jobs to adjust as the available resources in the cluster change. It also makes distributed training robust to transient hardware failures.

Classy Vision is beta software. The project is under active development and our APIs are subject to change in future releases.

Installation

Installation Requirements

Make sure you have an up-to-date installation of PyTorch (1.6), Python (3.6) and torchvision (0.7). If you want to use GPUs, then a CUDA installation (10.1) is also required.

Installing the latest stable release

To install Classy Vision via pip:

pip install classy_vision

To install Classy Vision via conda (only works on linux):

conda install -c conda-forge classy_vision

Manual install of latest commit on main

Alternatively you can do a manual install.

git clone https://github.com/facebookresearch/ClassyVision.git
cd ClassyVision
pip install .

Getting started

Classy Vision aims to support a variety of projects built and open-sourced on top of the core library. We provide utilities for setting up a project in a standard format, with some simple generated examples to get started with. To start a new project:

classy-project my-project
cd my-project

We even include a simple, synthetic training example to show how to use Classy Vision:

 ./classy_train.py --config configs/template_config.json

Voila! A few seconds later your first training run using our classification task should be done. Check out the results in the output folder:

ls output_<timestamp>/checkpoints/
checkpoint.torch model_phase-0_end.torch model_phase-1_end.torch model_phase-2_end.torch model_phase-3_end.torch

checkpoint.torch is the latest model (in this case, the same as model_phase-3_end.torch); a checkpoint is saved at the end of each phase.
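If you want to load the trained model back into Python, here is a minimal sketch that reuses the load_checkpoint and ClassyModel.from_checkpoint utilities (the same ones used in examples further down this page); the checkpoint path is the one produced by the run above, so adjust it to your timestamp:

from classy_vision.generic.util import load_checkpoint
from classy_vision.models import ClassyModel

# Load the final checkpoint written by classy_train.py and rebuild the model
checkpoint_data = load_checkpoint("output_<timestamp>/checkpoints/checkpoint.torch")
model = ClassyModel.from_checkpoint(checkpoint_data)
model.eval()  # switch to inference mode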

For more details / tutorials see the documentation section below.

Documentation

Please see our tutorials to learn how to get started on Classy Vision and customize your training runs. Full documentation is available here.

Join the Classy Vision community

See the CONTRIBUTING file for how to help out.

License

Classy Vision is MIT licensed, as found in the LICENSE file.

Citing Classy Vision

If you use Classy Vision in your work, please use the following BibTeX entry:

@misc{adcock2019classy,
  title={Classy Vision},
  author={{Adcock}, A. and {Reis}, V. and {Singh}, M. and {Yan}, Z. and {van der Maaten}, L. and {Zhang}, K. and {Motwani}, S. and {Guerin}, J. and {Goyal}, N. and {Misra}, I. and {Gustafson}, L. and {Changhan}, C. and {Goyal}, P.},
  howpublished = {\url{https://github.com/facebookresearch/ClassyVision}},
  year={2019}
}


classyvision's Issues

Video inference sample code

I trained my own video classification model with the UCF101 configuration. It seems there is a little overfitting. I want to write my own inference code, but I can't find a related tutorial; the snippet below is all I could find, and I don't know what to do next. Please help me @mannatsingh

from classy_vision.generic.util import load_checkpoint
from classy_vision.models import ClassyModel

# This is important: importing models here will register your custom models with Classy Vision
# so that it can instantiate them appropriately from the checkpoint file
# See more information at https://classyvision.ai/api/models.html#classy_vision.models.register_model
from classy_vision import models


# Update this with your actual directory:
checkpoint_dir = '/home/sucom/hdd_1T/project/video_rec_0831/ClassyVision/checkpoint_hand/model_phase-2198_end.torch'
checkpoint_data = load_checkpoint(checkpoint_dir)
model = ClassyModel.from_checkpoint(checkpoint_data)
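A hypothetical continuation (not taken from a tutorial) showing how one might run a single clip through the restored model. The tensor layout and the dict input keyed by "video" are assumptions based on the resnext3d configs that set "input_key": "video"; adjust both to however your model expects its input:

import torch

model.eval()
# Dummy clip shaped [batch, channels, frames, height, width]; replace with real, transformed frames
clip = torch.randn(1, 3, 8, 112, 112)
with torch.no_grad():
    output = model({"video": clip})
    probabilities = torch.nn.functional.softmax(output, dim=1)
print(probabilities.argmax(dim=1))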

Keeping track of the best model checkpoint

🚀 Feature

There is currently no way to keep track of the best checkpoint in Classy Vision based on a specific meter.

Motivation / Pitch

Knowing which checkpoint performs best against a validation set is common practice in ML, so it makes sense to automate this process for the user.

Alternatives

The only way I've managed to do it so far was by analyzing the contents of the saved checkpoints for a specific meter or by manually looking at the training logs. Both of these are very tedious processes.

Additional Context

I think this can be implemented by a ClassyHook that watches the values of a user-specified meter and saves the best phase with a given strategy (max/min). The downside I see is that in order to be accurate, this needs to be synced across all the workers in the distributed case.

Results can then be logged or dumped to a file in the master worker.
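A rough sketch of such a hook (not part of Classy Vision): it watches one key of the first meter at the end of every test phase and saves the model state whenever the value improves. The hook method names, their signatures, and the meter key "top_1" are assumptions that depend on your Classy Vision version and meter setup:

import torch
from classy_vision.hooks import ClassyHook


class BestCheckpointHook(ClassyHook):
    # Depending on the Classy Vision version, more hook methods may need to be
    # declared as no-ops for the class to be instantiable.
    on_start = ClassyHook._noop
    on_phase_start = ClassyHook._noop
    on_step = ClassyHook._noop
    on_end = ClassyHook._noop

    def __init__(self, save_path, meter_key="top_1", mode="max"):
        super().__init__()
        self.save_path = save_path
        self.meter_key = meter_key
        self.sign = 1.0 if mode == "max" else -1.0
        self.best = float("-inf")

    def on_phase_end(self, task):
        if task.train:  # only look at test phases
            return
        value = self.sign * task.meters[0].value[self.meter_key]
        if value > self.best:
            self.best = value
            torch.save(task.model.get_classy_state(), self.save_path)

As noted above, in the distributed case this would still need to be restricted to (or synced on) the primary worker.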

Where to find metadata_file for UCF101

❓ Questions and Help

Hi, I am looking at
https://classyvision.ai/tutorials/video_classification

and I have my own dataset.
Where can I find the structure for metadata_file?

metadata_file = "[PUT THE FILE PATH OF DATASET META DATA HERE]"

datasets = {}
datasets["train"] = build_dataset({
    "name": "ucf101",
    "split": "train",
    "batchsize_per_replica": 8,  # For training, we use 8 clips in a minibatch in each model replica
    "use_shuffle": True,         # We shuffle the clips in the training split
    "num_samples": 64,           # We train on 16 clips in one training epoch
    "clips_per_video": 1,        # For training, we randomly sample 1 clip from each video
    "frames_per_clip": 8,        # The video clip contains 8 frames
    "video_dir": video_dir,
    "splits_dir": splits_dir,
    "metadata_file": metadata_file,
    "fold": 1,
    "transforms": {
        "video": [
            {
                "name": "video_default_augment",
                "crop_size": 112,
                "size_range": [128, 160]
            }
        ]
    }
})

Early Stopping Hook

❓ Questions and Help

I'm trying to implement an early stopping hook that interacts with other custom hooks that I have implemented, but I'm having trouble finding a sensible way to send a stop signal to the ClassyTrainer.

The only way I managed to get it working was this:

    def on_loss_and_meter(
        self, task: "tasks.ClassyTask", local_variables: Dict[str, Any]
    ) -> None:
        # FIXME: Ugly hack
        task.done_training = lambda: True
        logging.info("early stopping")

Is there a cleaner way of achieving this in the current state of the project?
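One possible cleaner variant (a sketch, not an official API): keep the same hook entry point, but set an explicit flag on the task and let a small ClassificationTask subclass consult it in done_training():

from classy_vision.tasks import ClassificationTask


class EarlyStoppableTask(ClassificationTask):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.stop_requested = False  # set by the early-stopping hook: task.stop_requested = True

    def done_training(self):
        return super().done_training() or self.stop_requested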

How to create resnext3d model?

According to the API reference of ResNeXt3D, if I want to create a ResNeXt3D model the parameter num_groups should be greater than 1, but in the test case below

model_config_template = {
    "name": "resnext3d",
    "input_key": "video",
    "clip_crop_size": 112,
    "skip_transformation_type": "postactivated_shortcut",
    "frames_per_clip": 32,
    "input_planes": 3,
    "stem_name": "resnext3d_stem",
    "stem_planes": 64,
    "stem_temporal_kernel": 3,
    "stage_planes": 64,
    "num_groups": 1,
    "width_per_group": 16,

num_groups is 1. This confuses me when I refer to this snippet:

# ResNeXt3D-101
{"residual_transformation_type": pbt, "num_blocks": [3, 4, 23, 3]},

I want to create a resnext3d model for video classification on the UCF101 dataset. I found a config file at https://github.com/facebookresearch/ClassyVision/blob/master/classy_vision/configs/ucf101/r3d34.json; is it a ResNet3D-34 model or a ResNeXt3D-34 model? Thanks!

How to use TimeMetricsHook in a train task?

❓ Questions and Help

I want to use TimeMetricsHook to measure training performance, but I hit the following error:

Traceback (most recent call last):
  File "main.py", line 255, in <module>
    main()
  File "main.py", line 111, in main
    train(datasets, model, loss, optimizer, meters, args)
  File "main.py", line 145, in train
    trainer.train(task)
  File "/home/xiaobinz/anaconda3/envs/pytorch-bw/lib/python3.6/site-packages/classy_vision/trainer/classy_trainer.py", line 78, in train
    task.step(self.use_gpu)
  File "/home/xiaobinz/anaconda3/envs/pytorch-bw/lib/python3.6/site-packages/classy_vision/tasks/classy_task.py", line 178, in step
    hook.on_step(self)
  File "/home/xiaobinz/anaconda3/envs/pytorch-bw/lib/python3.6/site-packages/classy_vision/hooks/time_metrics_hook.py", line 50, in on_step
    self._log_performance_metrics(task)
  File "/home/xiaobinz/anaconda3/envs/pytorch-bw/lib/python3.6/site-packages/classy_vision/hooks/time_metrics_hook.py", line 84, in _log_performance_metrics
    get_rank(), task.perf_stats.report_str()
  File "/home/xiaobinz/anaconda3/envs/pytorch-bw/lib/python3.6/site-packages/classy_vision/generic/perf_stats.py", line 217, in report_str
    name_width = max(len(k) for k in self._host_stats.keys())
ValueError: max() arg is an empty sequence

The following code is how do I use TimeMetricsHook in my model:

def train(datasets, model, loss, optimizer, meters, args):
     task = (ClassificationTask()
             .set_num_epochs(args.num_epochs)
             .set_loss(loss)
             .set_model(model)
             .set_optimizer(optimizer)
             .set_meters(meters))
     for phase in ["train", "test"]:
         task.set_dataset(datasets[phase], phase)

     hooks = [LossLrMeterLoggingHook(log_freq=args.print_freq)]
     # show progress
     hooks.append(ProgressBarHook())
     # show time bar
     hooks.append(TimeMetricsHook(log_freq=args.print_freq))

     checkpoint_dir = f"{args.video_dir}/checkpoint/classy_checkpoint_{time.time()}"
     os.mkdir(checkpoint_dir)
     hooks.append(CheckpointHook(checkpoint_dir, input_args={}))

     task = task.set_hooks(hooks)
     trainer = LocalTrainer(use_gpu = args.cuda)
     trainer.train(task)

The LossLrMeterLoggingHook, ProgressBarHook, and CheckpointHook all work, but adding TimeMetricsHook produces the error above.

Load pretrained ImageNet models from PyTorch

When I load a pretrained ImageNet model from PyTorch using a fine-tuning task, an AssertionError: Checkpoint does not contain classy_state_dict is raised. So I want to know: how can I load an ImageNet model to initialize the backbone of a ClassyModel?
Thank you.
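Not an official answer, but one possible workaround: skip the Classy Vision checkpoint format entirely and wrap a pretrained torchvision backbone as a ClassyModel via the from_model() conversion added in v0.4, then attach heads as described in the ClassyModel/ClassyHeads tutorial:

import torchvision.models
from classy_vision.models import ClassyModel

# Wrap an ImageNet-pretrained torchvision backbone as a ClassyModel
backbone = torchvision.models.resnet50(pretrained=True)
classy_model = ClassyModel.from_model(backbone)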

Explain how the registration process works

📚 Documentation

While in our tutorials we explain how to create a new project and the registration works seamlessly, we should also document how it works so that users can also understand how to use Classy Vision without creating a project if they choose to (while still recommending creating a project to avoid these hassles). Creating this as a follow up after #312.

FineTuningTask's pretrained_checkpoint should be configurable

🚀 Feature

It seems logical to me that the set_pretrained_checkpoint argument from the fine tuning task should be present in the config.

Motivation

I'm using a custom train script, instead of the one provided in classy vision, and it seems silly having to implement a feature that belongs to a specific task in a generic training pipeline.

Effectively, this can be achieved by loading the checkpoint in the from_config method in the FineTuningTask class.

Happy to do a PR for this.

ValueError: bad value(s) in fds_to_keep

I used the UCF-101 example but got this problem:

Traceback (most recent call last):
  File "/home/sucom/hdd_1T/project/video_rec/my_video_rec/self_video_train.py", line 160, in <module>
    trainer.train(task)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/classy_vision/trainer/local_trainer.py", line 27, in train
    super().train(task)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/classy_vision/trainer/classy_trainer.py", line 45, in train
    task.on_phase_start()
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 945, in on_phase_start
    self.advance_phase()
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 847, in advance_phase
    self.create_data_iterator()
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 900, in create_data_iterator
    self.data_iterator = iter(self.dataloaders[self.phase_type])
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 719, in __init__
    w.start()
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 59, in _launch
    cmd, self._fds)
  File "/home/sucom/.conda/envs/classy_vision/lib/python3.6/multiprocessing/util.py", line 417, in spawnv_passfds
    False, False, None)
ValueError: bad value(s) in fds_to_keep

Examples for inference

❓ Questions and Help

Documenting discussion from the Classy Vision Slack channel about examples/best practices for inference.
Note: https://github.com/facebookresearch/ClassyVision/blob/master/tutorials/wsl_model_predict.ipynb shows an example of inference, but it is not referenced in the tutorials because it is unpolished.

Load model checkpoint

    task = build_task(config)

    # Load checkpoint, if available.
    if args.checkpoint_load_path is None:
        return print('NO CHECKPOINT PROVIDED')
    checkpoint = load_checkpoint(args.checkpoint_load_path)
    task.set_checkpoint(checkpoint)

    classy_interface = ClassyHubInterface.from_task(task)

Specify filepaths to images for inference

    inference_root_dir = <HARD CODE OR IMPORT FROM YOUR CONFIG>
    inference_filepaths = []
    for image in os.listdir(inference_root_dir):
        inference_filepaths.append(os.path.join(inference_root_dir, image))

Specify image transforms and create dataloader

    Normalize = transforms.Normalize(mean = [0.485, 0.456, 0.406],
                                      std = [0.229, 0.224, 0.225])
    ToTensor = transforms.ToTensor()
    transforms_composed = transforms.Compose([ToTensor, Normalize])
    
    # Transform wrapper that says which key to apply the transform to
    transforms_composed_cv = ApplyTransformToKey(
        transform=transforms_composed,
        key="input",
    )
    
    dataset = classy_interface.create_image_dataset(image_paths = inference_filepaths, 
                                                    transform=transforms_composed_cv,
                                                    shuffle=False)

OR

Specify transforms from the config. @mannatsingh I couldn't get this to work.

Example portion from the config:

        "inference": {
            "name": "<REDACTED>",
            "filepaths": "<REDACTED>",
            "transforms": [{"name": "generic_image_transform", "transforms": [
                {"name": "ToTensor"},
                {
                    "name": "Normalize",
                    "mean": [0.485, 0.456, 0.406],
                    "std": [0.229, 0.224, 0.225]
                }
            ]}]
        }

Creating the dataset object

    dataset = classy_interface.create_image_dataset(
                                                    image_paths = config['dataset']['inference']['filepaths'],
                                                    shuffle=False,
                                                    phase_type = 'inference'
                                                    )

I believe this approach doesn't work for me because my images for inference are not arranged in the following format. Naturally, the training/val set was set up in this manner, but obviously there are no classes for inference.

    root/dog/xxx.png
    root/dog/xxy.png
    root/dog/xxz.png

    root/cat/123.png
    root/cat/nsdf3.png
    root/cat/asd932_.png

Lastly, iterate through the dataset and compute class probabilities. (only doing batch size=1)

    classy_interface.eval()
    for input in dataset:
        input['input'] = input['input'].unsqueeze(0) # form the batch dimension
        output = classy_interface.predict(input)
        output = torch.nn.functional.softmax(output, dim=1) # this is obviously task dependent

Separate train / eval code in tasks

🚀 Feature

Currently the task abstraction just has a train step, but has two types of phases, train and test. So we end up having to take care of both training and evaluation in the same set of code (currently done with if-statements). The proposal is to improve the separation of the train / eval code by adding an eval_step to the tasks.

Motivation

Currently we have two types of phases used during training: test phases and train phases. In test phases we use if-statements to turn off things like autograd and the backprop step. This is cumbersome and makes it harder to write the separate evaluation logic that many tasks need.

Pitch

What would need to be done:

  1. Add an eval_step call to the task abstraction
  2. Migrate current tasks (e.g. ClassificationTask) code to use train step and eval step
  3. Update trainers to use eval step for test phases. Note, steps 2/3 are tricky because any user-created tasks will fail after 3 is done.

Alternatives

Some alternatives we've thought about:

  1. A drawback of the proposed solution is that if we want to do a new kind of evaluation, then we need to write a new task. So one solution is to have separate evaluation tasks. Then we can potentially mix and match training / eval setups. This pushes the requirement to the trainer to have a separate train and eval task. This could be a better solution and might be easier to migrate existing tasks without breaking them. cc @vreis
  2. Other solutions?

on_exception hook event

🚀 Feature

It would be nice to have the ability to process failed experiments by applying an action over an exception.

Pitch

This would be useful to trigger edge cases that might happen such as interrupted training procedures and saving the current state of the model, for example.

Another thing that comes to mind is a hook that sends you a warning of a failed training experiment, via some webhook.

There are multiple scenarios where this would be useful, for multiple reasons.

I would be happy to help, as always :)
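An interim pattern until such a hook exists (a sketch, not an official feature): wrap the training call yourself and run the recovery action on failure. notify_failure here is a hypothetical callback, e.g. an HTTP POST to a webhook:

def train_with_exception_handling(trainer, task, notify_failure):
    try:
        trainer.train(task)
    except Exception as exc:
        notify_failure(exc)  # e.g. send the exception to a webhook, or save emergency state
        raise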

Training crashing with distributed training

❓ Questions and Help

I'm training a model that I trained before with classy vision, except that now i moved into AWS with ray and the script crashes with this error:

(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/projects/facebook-challenge/dfdc/train.py", line 148, in train
(pid=3617, ip=172.31.20.185)     raise ex
(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/projects/facebook-challenge/dfdc/train.py", line 143, in train
(pid=3617, ip=172.31.20.185)     trainer.train(task)
(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/classy_vision/trainer/classy_trainer.py", line 85, in train
(pid=3617, ip=172.31.20.185)     task.train_step(self.use_gpu, local_variables)
(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 647, in train_step
(pid=3617, ip=172.31.20.185)     local_variables["sample"]["input"]
(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
(pid=3617, ip=172.31.20.185)     result = self.forward(*input, **kwargs)
(pid=3617, ip=172.31.20.185)   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 464, in forward
(pid=3617, ip=172.31.20.185)     self.reducer.prepare_for_backward([])
(pid=3617, ip=172.31.20.185) RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable). (prepare_for_backward at /opt/conda/conda-bld/pytorch_1579027003190/work/torch/csrc/distributed/c10d/reducer.cpp:514)

I'm not sure what the unused parameter might be. I had a key in the dataset dictionary that was not used in the loss calculation, but I removed it and the script still crashes in the same place.

Any idea on how to solve or debug this?

For context, I'm using an IterableDataset, so that's slightly different than usual.

EDIT: I'm using classy vision v0.2

Configuration argument explanation

❓ Questions and Help

I am new to this repo. I haven't used a configuration file to configure my task before, so I am confused about the arguments under "lr" shown below, such as "interval_scaling", and I am wondering whether the starting lr=0.1 may be too big. Please help me out, thanks!

This screenshot was taken from the Getting Started tutorial. [screenshot not included]

GenericImageTransform sample type check

🐛 Bug

The __call__() of GenericImageTransform in transforms.util does not check whether the input is a tuple or a dict.

To Reproduce

  1. classy-project my-project to initialize an example project.
  2. Change SampleType.TUPLE to SampleType.DICT on line 32 of my_data.py

Expected behavior

KeyError: Caught KeyError in DataLoader worker process 0.

No such file or directory: '/usr/local/lib/python3.6/dist-packages/classy_vision/templates/synthetic'

🐛 Bug

When installing under a virtualenv, setting up a new project does not work because of a FileNotFoundError.

To Reproduce

Steps to reproduce the behavior:

  1. mkvirtualenv myenv --python=python3.6; workon myenv
  2. classy-project my_project

Expected behavior

Environment

  • What commands did you use to install Classy Vision (conda/pip/build from source)? pip install classy_vision
  • What does classy_vision.__version__ print? (If applicable)
>>> import classy_vision; print(classy_vision.__version__)
0.1.0

PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: 10.1.243

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: TITAN RTX
GPU 1: TITAN RTX

Nvidia driver version: 418.87.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.3
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries:
[pip3] numpy==1.18.0
[pip3] torch==1.3.1
[pip3] torchvision==0.4.2
[conda] Could not collect

How to train on my own videos?

❓ Questions and Help

Before creating an issue, please go over our Tutorials and API Reference. If you cannot find the information you are looking for, please enquire in Classy Vision's #help slack channel before creating an issue.

I want to train my own video recognition model, with classes such as screwing a bottle cap to the right and screwing it to the left. Is this possible? How many videos need to be taken to ensure the training effect? Can I directly use the UCF101 configuration for training?

please help me @mannatsingh

Training with Soft Labels

I was looking into using soft labels (e.g. [0.2, 0.3, 0.5] instead of one-hot encoding on the last class). However, I wanted to ask if there would be any major blockers for training in this manner.

I know that I'll need to modify the loss and dataloader to handle a list of scores instead of a single class label. I'm not sure if that'll break other functions that expect the "[input, target]" default output from the dataloader (in the "_get_typed_sample" function).

I'm much less clear if I'll need to modify the training hooks or if other wrapper functions will cause issues.

Appreciate any advice and thanks!
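For the loss side, a rough sketch (an assumption, not an official recipe) of a registered ClassyLoss that accepts per-class soft scores instead of class indices; the dataloader would need to emit matching float targets of shape [batch, num_classes]:

import torch.nn.functional as F
from classy_vision.losses import ClassyLoss, register_loss


@register_loss("soft_target_cross_entropy_example")
class SoftTargetCrossEntropyExample(ClassyLoss):
    @classmethod
    def from_config(cls, config):
        return cls()

    def forward(self, output, target):
        # target holds soft scores per class, e.g. [0.2, 0.3, 0.5]
        log_probs = F.log_softmax(output, dim=1)
        return -(target * log_probs).sum(dim=1).mean()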

add model zoo

🚀 Feature

model zoo just like detectron2

Motivation

So we can quickly see the accuracy, training time, and inference time of each model.

Pitch

just like this, detectron2 model zoo

Alternatives

N/A

Additional context

N/A

Support YAML configs

🚀 Feature

Make classy_train.py work with YAML config files in addition to JSON.

Motivation

YAML is better than JSON, period.

Pitch

We already have experimental support for YAML by using the Hydra library (https://hydra.cc). We need to use it more, make sure it's stable and document it.

Alternatives

Support YAML without Hydra. There are lots of benefits to using Hydra, though.

Additional context

https://hydra.cc
https://github.com/facebookresearch/hydra

Auto scale learning rate based on batch size

🚀 Feature

Auto scale learning rate based on batch size

Motivation

Changing the number of workers in distributed training requires adjusting hyperparameters. https://arxiv.org/abs/1706.02677 proposed a linear scaling rule to adjust the learning rate based on the batch size.

Pitch

ClassificationTask should have a flag (default True), that would rescale the learning rate based on the batch size. The task is a natural place to put this since we don't want all parameter schedulers to reimplement the same logic. We could consider having the same in the optimizer instead, but I have a sense it'll require more boilerplate.

Alternatives

Hydra (http://hydra.cc) would enable a different solution for this problem: the config file could have a "rescale" parameter for the learning rate, and we could use the "interpolation" feature to rescale by "1/{batch_size}", where batch_size is defined elsewhere in the config.
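For reference, the linear scaling rule from the paper amounts to a one-line computation (the reference batch size of 256 and base LR of 0.1 are the paper's conventions, not Classy Vision settings):

batchsize_per_replica = 32
num_replicas = 8                 # e.g. 8 GPUs
reference_batch_size = 256       # batch size the base LR was tuned for
reference_lr = 0.1
scaled_lr = reference_lr * (batchsize_per_replica * num_replicas) / reference_batch_size  # -> 0.1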

Conda install instructions in /README.md fail

📚 Documentation

The main README of the ClassyVision project on GitHub prescribes the following command line for installing the framework:

conda install -c conda-forge classy_vision

This gives the result:

--- ~ » conda install -c conda-forge classy_vision                                                                                                                     
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - classy_vision

Current channels:

  - https://conda.anaconda.org/conda-forge/osx-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/osx-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/osx-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

Using -c pytorch and using no channel flag (i.e., conda's default channels) produce similar results.

Prepare the dataset, EOFError: Ran out of input

📚 Documentation

I am following the video classification tutorial from https://classyvision.ai/tutorials/video_classification.
In the first step, "Prepare the dataset", it asks to download the dataset from a link. I have downloaded it, and all the files and structures look fine.

The code snippet given as an example to build the dataset is not working:

from classy_vision.dataset import build_dataset

# set it to the folder where video files are saved
video_dir = "UCF-101"
# set it to the folder where dataset splitting files are saved
splits_dir = "ucfTrainTestlist"
# set it to the file path for saving the metadata
metadata_file = "metadata.txt"

datasets = {}
datasets["train"] = build_dataset({
    "name": "ucf101",
    "split": "train",
    "batchsize_per_replica": 8,  # For training, we use 8 clips in a minibatch in each model replica
    "use_shuffle": True,         # We shuffle the clips in the training split
    "num_samples": 64,           # We train on 16 clips in one training epoch
    "clips_per_video": 1,        # For training, we randomly sample 1 clip from each video
    "frames_per_clip": 8,        # The video clip contains 8 frames
    "video_dir": video_dir,
    "splits_dir": splits_dir,
    "metadata_file": metadata_file,
    "fold": 1,
    "transforms": {
        "video": [
            {
                "name": "video_default_augment",
                "crop_size": 112,
                "size_range": [128, 160]
            }
        ]
    }
})

datasets["test"] = build_dataset({
    "name": "ucf101",
    "split": "test",
    "batchsize_per_replica": 10,  # For testing, we will take 1 video once a time, and sample 10 clips per video
    "use_shuffle": False,         # We do not shuffle clips in the testing split
    "num_samples": 80,            # We test on 80 clips in one testing epoch
    "clips_per_video": 10,        # We sample 10 clips per video
    "frames_per_clip": 8,
    "video_dir": video_dir,
    "splits_dir": splits_dir,
    "metadata_file": metadata_file,
    "fold": 1,
    "transforms": {
        "video": [
            {
                "name": "video_default_no_augment",
                "size": 128
            }
        ]
    }    
})

When I execute the above code, it says:

---------------------------------------------------------------------------
EOFError                                  Traceback (most recent call last)
<ipython-input-10-de8cc992b85c> in <module>
     26                 "name": "video_default_augment",
     27                 "crop_size": 112,
---> 28                 "size_range": [128, 160]
     29             }
     30         ]

/opt/conda/lib/python3.6/site-packages/classy_vision/dataset/__init__.py in build_dataset(config, *args, **kwargs)
     25     "folder": "/data"}` will find a class that was registered as "my_dataset"
     26     (see :func:`register_dataset`) and call .from_config on it."""
---> 27     return DATASET_REGISTRY[config["name"]].from_config(config, *args, **kwargs)
     28 
     29 

/opt/conda/lib/python3.6/site-packages/classy_vision/dataset/classy_ucf101.py in from_config(cls, config)
    166             if "fold" in config
    167             else 1,  # UCF101 has 3 folds. Use fold 1 by default
--> 168             config["metadata_file"],
    169         )

/opt/conda/lib/python3.6/site-packages/classy_vision/dataset/classy_ucf101.py in __init__(self, split, batchsize_per_replica, shuffle, transform, num_samples, frames_per_clip, video_width, video_height, video_min_dimension, audio_samples, step_between_clips, frame_rate, clips_per_video, video_dir, splits_dir, fold, metadata_filepath)
     82         if os.path.exists(metadata_filepath):
     83             metadata = UCF101Dataset.load_metadata(
---> 84                 metadata_filepath, video_dir=video_dir, update_file_path=True
     85             )
     86 

/opt/conda/lib/python3.6/site-packages/classy_vision/dataset/classy_video_dataset.py in load_metadata(cls, filepath, video_dir, update_file_path)
    189                 with the full video file path saved in the meta data.
    190         """
--> 191         metadata = torch.load(filepath)
    192         if video_dir is not None and update_file_path:
    193             # video path in meta data can be computed in a different root video folder

/opt/conda/lib/python3.6/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    590                     return torch.jit.load(f)
    591                 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 592         return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
    593 
    594 

/opt/conda/lib/python3.6/site-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    760             "functionality.".format(type(f)))
    761 
--> 762     magic_number = pickle_module.load(f, **pickle_load_args)
    763     if magic_number != MAGIC_NUMBER:
    764         raise RuntimeError("Invalid magic number; corrupt file?")

EOFError: Ran out of input

Could you please help me out with this error? I don't understand where the problem is.
These are all in my current directory:

UCF-101     UCF101TrainTestSplits-RecognitionTask.zip  metadata.txt
UCF101.rar  classy-vision-notebook.ipynb	       ucfTrainTestlist

and the UCF-101 directory contains a directory for each class with its related videos:

ApplyEyeMakeup	   Drumming	       MilitaryParade	   Shotput
ApplyLipstick	   Fencing	       Mixing		   SkateBoarding
Archery		   FieldHockeyPenalty  MoppingFloor	   Skiing
BabyCrawling	   FloorGymnastics     Nunchucks	   Skijet
BalanceBeam	   FrisbeeCatch        ParallelBars	   SkyDiving
BandMarching	   FrontCrawl	       PizzaTossing	   SoccerJuggling
BaseballPitch	   GolfSwing	       PlayingCello	   SoccerPenalty
Basketball	   Haircut	       PlayingDaf	   StillRings
BasketballDunk	   HammerThrow	       PlayingDhol	   SumoWrestling
BenchPress	   Hammering	       PlayingFlute	   Surfing
Biking		   HandstandPushups    PlayingGuitar	   Swing
Billiards	   HandstandWalking    PlayingPiano	   TableTennisShot
BlowDryHair	   HeadMassage	       PlayingSitar	   TaiChi
BlowingCandles	   HighJump	       PlayingTabla	   TennisSwing
BodyWeightSquats   HorseRace	       PlayingViolin	   ThrowDiscus
Bowling		   HorseRiding	       PoleVault	   TrampolineJumping
BoxingPunchingBag  HulaHoop	       PommelHorse	   Typing
BoxingSpeedBag	   IceDancing	       PullUps		   UnevenBars
BreastStroke	   JavelinThrow        Punch		   VolleyballSpiking
BrushingTeeth	   JugglingBalls       PushUps		   WalkingWithDog
CleanAndJerk	   JumpRope	       Rafting		   WallPushups
CliffDiving	   JumpingJack	       RockClimbingIndoor  WritingOnBoard
CricketBowling	   Kayaking	       RopeClimbing	   YoYo
CricketShot	   Knitting	       Rowing
CuttingInKitchen   LongJump	       SalsaSpin
Diving		   Lunges	       ShavingBeard

Documentation on folder structure for video/image datasets

Please add more documentation on how to create a new custom dataset that works with the data loading utility functions.

It would be much better to give an example of how these directories (video_dir, splits_dir) are organized.

from classy_vision.dataset import build_dataset

# set it to the folder where video files are saved
video_dir = "[PUT YOUR VIDEO FOLDER HERE]"
# set it to the folder where dataset splitting files are saved
splits_dir = "[PUT THE FOLDER WHICH CONTAINS SPLITTING FILES HERE]"
# set it to the file path for saving the metadata
metadata_file = "[PUT THE FILE PATH OF DATASET META DATA HERE]"

ImagePathDataset DATASET_REGISTRY error

❓ Questions and Help

config:
{
    "name": "classification_task",
    "use_gpu": true,
    "loss": {
        "name": "CrossEntropyLoss"
    },
    "dataset": {
        "num_workers": 40,
        "train": {
            "name": "image_path",
            "batchsize_per_replica": 32,
            "num_samples": null,
            "use_shuffle": true,
            "transforms": [
                {
                    "name": "apply_transform_to_key",
                    "transforms": [
                        {"name": "RandomResizedCrop", "size": 224},
                        {"name": "RandomHorizontalFlip"},
                        {"name": "ToTensor"},
                        {
                            "name": "Normalize",
                            "mean": [0.485, 0.456, 0.406],
                            "std": [0.229, 0.224, 0.225]
                        }
                    ],
                    "key": "input"
                }
            ],
            "image_folder": "/public/home/cxj/Tutorial/classy_vision/my-project/data/train"
        },
        "test": {
            "name": "image_path",
            "batchsize_per_replica": 32,
            "num_samples": null,
            "use_shuffle": false,
            "transforms": [
                {
                    "name": "apply_transform_to_key",
                    "transforms": [
                        {"name": "Resize", "size": 256},
                        {"name": "CenterCrop", "size": 224},
                        {"name": "ToTensor"},
                        {
                            "name": "Normalize",
                            "mean": [0.485, 0.456, 0.406],
                            "std": [0.229, 0.224, 0.225]
                        }
                    ],
                    "key": "input"
                }
            ],
            "image_folder": "/public/home/cxj/Tutorial/classy_vision/my-project/data/test"
        }
    },
    "meters": {
        "accuracy": {
            "topk": [1, 5]
        }
    },
    "model": {
        "name": "resnet",
        "num_blocks": [3, 4, 6, 3],
        "small_input": false,
        "zero_init_bn_residuals": true,
        "heads": [
            {
                "name": "fully_connected",
                "unique_id": "default_head",
                "num_classes": 4,
                "fork_block": "block3-2",
                "in_plane": 2048
            }
        ]
    },
    "optimizer": {
        "name": "sgd",
        "param_schedulers": {
            "lr": {
                "name": "step",
                "values": [0.1, 0.01, 0.001]
            }
        },
        "num_epochs": 50,
        "weight_decay": 1e-4,
        "momentum": 0.9
    }
}

...
for split in ["train", "test"]:
    dataset = build_dataset(config["dataset"][split])
    task.set_dataset(dataset, split)

error:
File "Pretrained_Checkpoint_Model_Zoo.py", line 83, in
dataset = build_dataset(config["dataset"][split])
File "/public/home/cxj/anaconda3/envs/cu100/lib/python3.7/site-packages/classy_vision/dataset/init.py", line 27, in build_dataset
dataset = DATASET_REGISTRY[config["name"]].from_config(config, *args, **kwargs)
KeyError: 'image_path'

Support for IterableDatasets

🚀 Feature

PyTorch introduced the IterableDataset in v1.2 to allow users to process streams of information. Classy Vision currently only supports map-style datasets; it would be nice to extend support to IterableDatasets, given that they are especially useful for processing video streams.

Motivation / Pitch

Map style datasets assume each sample can be read completely independently from each other. In some situations, such as processing video streams, it is extremely expensive to open and close a video stream N times to read N frames. IterableDatasets allow streams to be open once and then results are yielded as requested by the data loader, which is substantially more efficient.

Additional context

There are a few important differences between map-style datasets and iterable datasets that break the current Classy Vision dataset paradigm:

  1. In an iterable dataset there is no __getitem__ method; it is replaced by the __iter__ method
  2. The __len__ method is optional in an iterable dataset
  3. Samplers do not work with IterableDatasets; sampling and shuffling have to be handled in the dataset

I've come up with a template dataset called ChunkDataset that hides some of this complexity away, which might be nice to help beginner users get started. Nevertheless, in order to get this working with my code, I had to subclass ClassificationTask to modify it and had to create an entirely new base class for this style of dataset (ClassyDataset is not compatible).

chunk.py

class ChunkDataset(IterableDataset):
    def __init__(self, indices: List, process_fn: Callable):
        """
        Subset of IterableDataset to serve as a base class to process streams of data,
        such as audio, video or text.

        Args:
            indices (list): list of arguments to provide to process_fn
            process_fn (callable): function that processes the indices and returns an iterator
        """
        self.idxs = indices
        self._process_fn = process_fn
        self.epoch = 0
        self.shuffle = False

        # replacement for distributed sampler
        distributed = dist.is_available() and dist.is_initialized()
        if distributed:
            num_replicas = dist.get_world_size()
            rank = dist.get_rank()
            self.idxs = self.idxs[rank::num_replicas]

    def __iter__(self):
        # deterministically shuffle based on epoch
        g = torch.Generator()
        g.manual_seed(self.epoch)
        if self.shuffle:
            indices = torch.randperm(len(self.idxs), generator=g).tolist()
            idxs = [self.idxs[i] for i in indices]
        else:
            idxs = self.idxs

        return self._process_fn(idxs)

    @staticmethod
    def worker_init_fn(worker_id):
        worker_info = torch.utils.data.get_worker_info()
        dataset = worker_info.dataset  # the dataset copy in this worker process
        n_workers = worker_info.num_workers
        dataset.idxs = dataset.idxs[worker_id::n_workers]

    def set_epoch(self, epoch):
        self.epoch = epoch
        return self

    def set_shuffle(self, shuffle: bool = True):
        self.shuffle = shuffle
        return self

    def __len__(self):
        raise NotImplementedError


class ClassyChunkDataset(IterableDataset):
    """
    Class representing a dataset abstraction to wrap a ChunkDataset.

    This class wraps a :class:`ChunkDataset` via the `dataset` attribute
    and configures the dataloaders needed to access the datasets.
    Transforms which need to be applied to the data should be specified in this class.
    ClassyChunkDataset can be instantiated from a configuration file as well.
    """

    def __init__(
        self,
        dataset: ChunkDataset,
        batchsize_per_replica: int,
        shuffle: bool,
        transform: Optional[Union[ClassyTransform, Callable]],
    ) -> None:
        """
        Constructor for a ClassyDataset.

        Args:
            batchsize_per_replica: Positive integer indicating batch size for each
                replica
            shuffle: Whether to shuffle between epochs
            transform: When set, transform to be applied to each sample
            num_samples: When set, this restricts the number of samples provided by
                the dataset
        """
        # Asserts:
        assert is_pos_int(
            batchsize_per_replica
        ), "batchsize_per_replica must be a positive int"
        assert isinstance(shuffle, bool), "shuffle must be a boolean"

        # Assignments:
        self.batchsize_per_replica = batchsize_per_replica
        self.shuffle = shuffle
        self.transform = transform
        self.dataset = dataset

        if self.shuffle:
            self.dataset = self.dataset.set_shuffle()

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ClassyDataset":
        """Instantiates a ClassyDataset from a configuration.

        Args:
            config: A configuration for the ClassyDataset.

        Returns:
            A ClassyDataset instance.
        """
        raise NotImplementedError

    @classmethod
    def parse_config(cls, config: Dict[str, Any]):
        """
        This function parses out common config options.

        Args:
            config: A dict with the following string keys -

                | *batchsize_per_replica* (int): Must be a positive int, batch size
                |    for each replica
                | *use_shuffle* (bool): Whether to enable shuffling for the dataset
                | *num_samples* (int, optional): When set, restricts the number of
                     samples in a dataset
                | *transforms*: list of tranform configurations to be applied in order

        Returns:
            A tuple containing the following variables -
                | *transform_config*: Config for the dataset transform. Can be passed to
                |    :func:`transforms.build_transform`
                | *batchsize_per_replica*: Batch size per replica
                | *shuffle*: Whether we should shuffle between epochs
                | *num_samples*: When set, restricts the number of samples in a dataset
        """
        batchsize_per_replica = config.get("batchsize_per_replica")
        shuffle = config.get("use_shuffle")
        num_samples = config.get("num_samples")
        transform_config = config.get("transforms")
        return transform_config, batchsize_per_replica, shuffle, num_samples

    def __iter__(self):
        for sample in self.dataset:
            if self.transform is not None:
                sample = self.transform(sample)
            yield sample

    def __len__(self):
        return len(self.dataset)

    def iterator(self, *args, **kwargs):
        """
        Returns an iterable which can be used to iterate over the data.

        Args:
            shuffle_seed (int, optional): Seed for the shuffle
            current_phase_id (int, optional): The epoch being fetched. Needed so that
                each epoch has a different shuffle order
        Returns:
            An iterable over the data
        """
        # TODO: Fix naming to be consistent (i.e. everyone uses epoch)
        epoch = kwargs.get("current_phase_id", 0)
        assert isinstance(epoch, int), "Epoch must be an int"

        self.dataset = self.dataset.set_epoch(epoch)

        return DataLoader(
            self,
            batch_size=self.batchsize_per_replica,
            num_workers=kwargs.get("num_workers", 0),
            pin_memory=kwargs.get("pin_memory", False),
            multiprocessing_context=kwargs.get("multiprocessing_context", None),
            worker_init_fn=self.worker_init_fn
        )

    def get_batchsize_per_replica(self):
        """
        Get the batch size per replica.

        Returns:
            The batch size for each replica.
        """
        return self.batchsize_per_replica

    def get_global_batchsize(self):
        """
        Get the global batch size, combined over all the replicas.

        Returns:
            The overall batch size of the dataset.
        """
        return self.get_batchsize_per_replica() * get_world_size()

    @staticmethod
    def worker_init_fn(worker_id):
        worker_info = torch.utils.data.get_worker_info()
        dataset = worker_info.dataset.dataset  # the dataset copy in this worker process
        n_workers = worker_info.num_workers
        dataset.idxs = dataset.idxs[worker_id::n_workers]

Discriminative learning rates for FineTuningTask

🚀 Feature

Right now we only support fine tuning by freezing the trunk weights, or training all weights together. Discriminative learning rates means we can apply different learning rates for different parts of the model, which usually leads to better performance.

Motivation

https://arxiv.org/pdf/1801.06146.pdf introduced discriminative fine-tuning in NLP. Since then it's been found to be useful in computer vision as well.

Pitch

This could be implemented in either FineTuningTask or ClassyModel. I'd rather keep ClassyModel as simple as possible and move this type of logic to the task level.

Alternatives

N/A

Additional context

N/A
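Outside of Classy Vision, discriminative learning rates are usually expressed as per-parameter-group options in the optimizer. A sketch of the idea in plain PyTorch, with an illustrative model split into a trunk and a head:

import torch
import torch.nn as nn

# Illustrative model: a pretrained-style trunk and a freshly initialized head
model = nn.Module()
model.trunk = nn.Linear(512, 512)
model.head = nn.Linear(512, 10)

optimizer = torch.optim.SGD(
    [
        {"params": model.trunk.parameters(), "lr": 0.001},  # lower LR for the trunk
        {"params": model.head.parameters(), "lr": 0.01},    # higher LR for the head
    ],
    momentum=0.9,
)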

Issues with Dataloader in getting started tutorial

🐛 Bug

I get the following error when running part (6) of the Getting Started tutorial.

To Reproduce

Steps to reproduce the behavior:
Running part (6) of the getting started tutorial.

Here is the error stack that I got:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 114, in _main
    prepare(preparation_data)
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/opt/conda/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/opt/conda/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/work/resnext3d/pt-version/training-scripts/classy-vison-project/custom_train.py", line 75, in <module>
    trainer.train(task)
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/trainer/local_trainer.py", line 27, in train
    super().train(task)
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/trainer/classy_trainer.py", line 45, in train
    task.on_phase_start()
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 1106, in on_phase_start
    self.advance_phase()
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 1008, in advance_phase
    self.create_data_iterator()
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 1061, in create_data_iterator
    self.data_iterator = iter(self.dataloaders[self.phase_type])
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 325, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 777, in __init__
    w.start()
  File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/opt/conda/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/conda/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/conda/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/conda/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "/opt/conda/lib/python3.6/multiprocessing/spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 822, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/opt/conda/lib/python3.6/multiprocessing/queues.py", line 104, in get
    if not self._poll(timeout):
  File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 414, in _poll
    r = wait([self], timeout)
  File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 911, in wait
    ready = selector.select(timeout)
  File "/opt/conda/lib/python3.6/selectors.py", line 376, in select
    fd_event_list = self._poll.poll(timeout)
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 3175) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "custom_train.py", line 75, in <module>
    trainer.train(task)
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/trainer/local_trainer.py", line 27, in train
    super().train(task)
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/trainer/classy_trainer.py", line 48, in train
    task.step()
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/tasks/classy_task.py", line 160, in step
    self.train_step()
  File "/opt/conda/lib/python3.6/site-packages/classy_vision/tasks/classification_task.py", line 913, in train_step
    sample = next(self.get_data_iterator())
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 400, in __next__
    data = self._next_data()
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1017, in _next_data
    idx, data = self._get_data()
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 984, in _get_data
    success, data = self._try_get_data()
  File "/root/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 835, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 3175) exited unexpectedly

I tried setting num_workers=0, but then I get an error telling me to use num_workers>0.
Please let us know a way forward for this issue.

cc: @sunway513
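
For what it's worth, the error message above already points at the standard fix: move the training call under an if __name__ == "__main__": guard so that DataLoader workers started with the "spawn" method can re-import the script safely. A minimal sketch of what custom_train.py could look like (the task construction is elided, since it is specific to that script):

from classy_vision.trainer import LocalTrainer

def main():
    # Build the ClassificationTask exactly as custom_train.py already does;
    # the construction is omitted here and only the entry-point guard matters.
    task = ...  # placeholder for the task built earlier in the script
    trainer = LocalTrainer()
    trainer.train(task)

if __name__ == "__main__":
    # Worker processes started with "spawn" re-import this module; the guard
    # prevents them from re-running the training entry point recursively.
    main()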

Can't apply transforms when loading the classy_cifar10 or classy_cifar100 dataset

Steps to reproduce the behavior:

  1. Configure loading the CIFAR dataset in a file "ds_cifar.py":

from classy_vision.dataset import build_dataset, CIFARDataset

dataset_config = {
    "name": "classy_cifar10",
    "batchsize_per_replica": 10,
    "use_shuffle": True,
    "transforms": [
        {
            "name": "apply_transform_to_key",
            "transforms": [
                {"name": "ToTensor"},
                {"name": "Normalize", "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]},
            ],
            "key": "input",
        }
    ],
    "root": "data",
    "train": True,
    "download": True,
}

ds_cifar = build_dataset(dataset_config)
assert isinstance(ds_cifar, CIFARDataset)  # passes
print(ds_cifar[0])

  2. Run: python ds_cifar.py
    ERROR:
    Files already downloaded and verified
    Traceback (most recent call last):
      File "ds_cifar.py", line 24, in <module>
        print(ds_cifar[0])
      File "/home/photonic/anaconda3/envs/classify/lib/python3.7/site-packages/classy_vision/dataset/classy_dataset.py", line 122, in __getitem__
        return self.transform(sample)
      File "/home/photonic/anaconda3/envs/classify/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
        img = t(img)
      File "/home/photonic/anaconda3/envs/classify/lib/python3.7/site-packages/classy_vision/dataset/transforms/util.py", line 76, in __call__
        sample
    TypeError: '<' not supported between instances of 'str' and 'int'

Environment

Built from source
PyTorch version: 1.6.0
classy_vision version: 0.5.0.dev

How do I fix this bug?
Thank you.
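
The traceback suggests that apply_transform_to_key receives the raw CIFAR sample as an (image, label) tuple rather than a dict keyed by "input". One possible workaround, sketched below under that assumption (it bypasses the config-driven transform pipeline entirely), is to apply the torchvision transforms directly and build the dict by hand:

from torchvision import transforms

# Assumes the raw sample is an (image, label) tuple; Classy Vision tasks
# generally expect dict samples with "input" and "target" keys.
image_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def to_classy_sample(sample):
    image, label = sample
    return {"input": image_transform(image), "target": label}

Building the dataset without the "transforms" entry and calling to_classy_sample(ds_cifar[0]) should then yield a normalized tensor plus its label; whether the registered transform config can express the same tuple-to-dict step depends on the Classy Vision version.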

Cannot launch Ray Autoscaler when running ray example

❓ Questions and Help

Hi, I'm trying to train a model with Classy Vision following the Ray distributed example, but I'm not managing to launch the cluster properly (or at least I think that is the case).

Everything works, including the ray start cluster_config.yml command.

However, when I tail the Ray logs, I see a recurring error:

2019-12-31 14:30:47,828	ERROR node_provider.py:302 -- create_instances: Max attempts (5) exceeded.
2019-12-31 14:30:47,828	ERROR autoscaler.py:349 -- Launch failed
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray/autoscaler/autoscaler.py", line 347, in run
    self._launch_node(config, count)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray/autoscaler/autoscaler.py", line 337, in _launch_node
    }, count)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray/autoscaler/aws/node_provider.py", line 230, in create_node
    self._create_node(node_config, tags, count)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray/autoscaler/aws/node_provider.py", line 303, in _create_node
    raise exc
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray/autoscaler/aws/node_provider.py", line 290, in _create_node
    created = self.ec2_fail_fast.create_instances(**conf)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 324, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 622, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VcpuLimitExceeded) when calling the RunInstances operation: You have requested more vCPU capacity than your current vCPU limit of 4 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

Do I really need to contact Amazon to raise my service limits in order to run distributed training with Classy Vision and Ray?

Thanks,
Miguel

Make onnx exports compatible with TensorRT

🚀 Feature

The current use of either reshape() or view() in the FullyConnectedHead and FullyConvolutionalLinear modules results in an ONNX export that is not compatible with TensorRT.

Motivation

TensorRT is NVIDIA's deep learning inference runtime, which improves performance when running inference workloads on NVIDIA GPUs. The current implementations of the FullyConnectedHead and FullyConvolutionalLinear modules use reshape or view in a manner that is incompatible with TensorRT, so importing a model that has been exported to ONNX fails with errors to that effect.

The torchvision repository had the same problem, as can be seen from this issue.

Pitch

Replace the use of both view() and reshape() with that of flatten().
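
As an illustration of the proposed change (a toy stand-in, not the actual Classy Vision head code), a pooled feature map can be flattened with flatten() instead of view()/reshape(), which tends to export to ONNX in a form TensorRT accepts:

import torch
import torch.nn as nn

class ToyHead(nn.Module):
    # Stand-in for a classification head; only the flatten call is the point.
    def __init__(self, in_planes: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(in_planes, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(x)
        # Before: x = x.view(x.size(0), -1) or x.reshape(x.size(0), -1)
        x = torch.flatten(x, start_dim=1)
        return self.fc(x)

out = ToyHead(512, 101)(torch.randn(2, 512, 7, 7))  # -> shape (2, 101)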

Alternatives

Additional context

Pass a task reference to ParamScheduler.__call__

🚀 Feature

We should pass a reference to the task object to the optimizer and param scheduler so that more complex parameter schedulers can be implemented.

Motivation / Pitch

Basic PyTorch learning rate schedulers such as ReduceLROnPlateau require task information, for example access to the validation loss. Currently this cannot be implemented in Classy Vision because the scheduler has no access to the task.

By adding a task reference, the user could access the local variables, or the task meters to make informed decisions on their parameter scheduling.

This would also imply giving the optimizer access to the task when it updates the scheduler.
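
A rough sketch of what this could enable (a hypothetical interface, not something Classy Vision exposes today): a plateau-style scheduler whose __call__ receives the task in addition to the training progress, with the validation-loss attribute name invented purely for illustration:

class PlateauScheduler:
    # Hypothetical task-aware scheduler; Classy Vision parameter schedulers
    # currently receive only the training progress ("where") in __call__.
    def __init__(self, start_value: float, factor: float = 0.1, patience: int = 2):
        self.value = start_value
        self.factor = factor
        self.patience = patience
        self.best_loss = float("inf")
        self.num_bad_epochs = 0

    def __call__(self, where: float, task) -> float:
        val_loss = getattr(task, "last_val_loss", None)  # attribute name is made up
        if val_loss is not None:
            if val_loss < self.best_loss:
                self.best_loss = val_loss
                self.num_bad_epochs = 0
            else:
                self.num_bad_epochs += 1
                if self.num_bad_epochs > self.patience:
                    self.value *= self.factor
                    self.num_bad_epochs = 0
        return self.value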

Alternatives

As far as I know, the only (hacky) way of achieving this is with a custom hook, but hooks are not configurable via the configuration file, and it doesn't make much sense to implement a parameter scheduler as a hook.

Let me know what you think!

Fails to create new project (FileNotFoundError)

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. pip3 install classy_vision
  2. classy-project my-project

Traceback (most recent call last):
  File "/home/nazmul/.local/bin/classy-project", line 47, in <module>
    shutil.copytree(template_path, destination_path)
  File "/usr/lib/python3.8/shutil.py", line 552, in copytree
    with os.scandir(src) as itr:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.8/dist-packages/classy_vision/templates/synthetic'

Expected behavior

This should create a new project, according to the documentation.
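
A quick way to confirm what the traceback suggests (that the synthetic project template was not shipped with the pip-installed package) is to check for the directory next to the installed module:

import os
import classy_vision

# classy-project copies this directory into the new project location; if it is
# missing from the installed package, the FileNotFoundError above follows.
template_dir = os.path.join(os.path.dirname(classy_vision.__file__), "templates", "synthetic")
print(template_dir, "exists:", os.path.isdir(template_dir))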

Environment

  • What commands did you use to install Classy Vision (conda/pip/build from source)? pip
  • If you are building from source, which commit is it?
  • What does classy_vision.__version__ print? (If applicable): 0.4.0

Please copy and paste the output from the PyTorch environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
  • PyTorch Version (e.g., 1.0): 1.5.1
  • OS (e.g., Linux): Linux (Ubuntu)
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source):
  • Python version: 3.8.2
  • CUDA/cuDNN version: 10.1.243
  • GPU models and configuration: GeForce GT 710
  • Nvidia driver version: 440.100
  • Any other relevant information:

Additional context

Versions of relevant libraries:
[pip3] numpy==1.19.0
[pip3] torch==1.5.1
[pip3] torchvision==0.6.1
[conda] Could not collect

Mixed precision training

🚀 Feature

Support training with 16-bit floating point precision.

Motivation

Modern GPUs have specialized hardware for 16-bit floating point computation. 16-bit floating point operations are faster to compute, have lower communication overhead, and decrease memory usage, leading to faster training.

Pitch

Use the Amp implementation from https://github.com/NVIDIA/apex
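
For reference, the usual Apex pattern (shown as a standalone sketch of what such an integration would have to wrap, not as Classy Vision code; it requires a CUDA GPU and the apex package) looks like this:

import torch
import torch.nn.functional as F
from apex import amp  # https://github.com/NVIDIA/apex

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# opt_level "O1" runs whitelisted ops in FP16 while keeping FP32 master weights.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(4, 10).cuda()
targets = torch.randint(0, 2, (4,)).cuda()

loss = F.cross_entropy(model(inputs), targets)
optimizer.zero_grad()
# Scale the loss so FP16 gradients do not underflow; apex unscales before the step.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()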

Alternatives

N/A

Additional context

N/A

UCF-101 trains only one epoch; the program does not stop, but it cannot enter the next training phase!

❓ Questions and Help


/home/sucom/.conda/envs/cv_py36/bin/python /home/sucom/hdd_1T/project/video_rec_0831/ClassyVision/main_train.py
INFO:root:Classy Vision's default training script.
INFO:root:AMP disabled
INFO:root:mixup disabled
INFO:root:Synchronized Batch Normalization is disabled
INFO:root:Logging outputs to ./output_2020-08-31T11:43:29.853157
INFO:root:Logging checkpoints to ./checkpoint_101
INFO:root:Starting training on rank 0 worker. World size is 1
INFO:root:Using GPU, CUDA device index: 0
INFO:root:Starting training. Task: <classy_vision.tasks.classification_task.ClassificationTask object at 0x7ff9590a1748> initialized with config:
{
    "name": "classification_task",
    "checkpoint_folder": "./classy_checkpoint_{time.time()}",
    "checkpoint_period": 1,
    "num_epochs": 3000000,
    "loss": {
        "name": "CrossEntropyLoss"
    },
    "dataset": {
        "train": {
            "name": "ucf101",
            "split": "train",
            "batchsize_per_replica": 16,
            "use_shuffle": true,
            "num_samples": null,
            "frames_per_clip": 32,
            "step_between_clips": 1,
            "clips_per_video": 1,
            "video_dir": "/home/sucom/hdd_1T/project/video_rec/UCF-101",
            "splits_dir": "/home/sucom/hdd_1T/project/video_rec/ucfTrainTestlist",
            "metadata_file": "./ucf101_metadata.pt",
            "fold": 1,
            "transforms": {
                "video": [
                    {
                        "name": "video_default_augment",
                        "crop_size": 112,
                        "size_range": [
                            128,
                            160
                        ]
                    }
                ]
            }
        },
        "test": {
            "name": "ucf101",
            "split": "test",
            "batchsize_per_replica": 10,
            "use_shuffle": false,
            "num_samples": null,
            "frames_per_clip": 32,
            "step_between_clips": 1,
            "clips_per_video": 10,
            "video_dir": "/home/sucom/hdd_1T/project/video_rec/UCF-101",
            "splits_dir": "/home/sucom/hdd_1T/project/video_rec/ucfTrainTestlist",
            "metadata_file": "./ucf101_metadata.pt",
            "fold": 1,
            "transforms": {
                "video": [
                    {
                        "name": "video_default_no_augment",
                        "size": 128
                    }
                ]
            }
        }
    },
    "meters": {
        "accuracy": {
            "topk": [
                1,
                5
            ]
        },
        "video_accuracy": {
            "topk": [
                1,
                5
            ],
            "clips_per_video_train": 1,
            "clips_per_video_test": 10
        }
    },
    "model": {
        "name": "resnext3d",
        "frames_per_clip": 32,
        "input_planes": 3,
        "clip_crop_size": 112,
        "skip_transformation_type": "postactivated_shortcut",
        "residual_transformation_type": "basic_transformation",
        "num_blocks": [
            3,
            4,
            6,
            3
        ],
        "input_key": "video",
        "stem_name": "resnext3d_stem",
        "stem_planes": 64,
        "stem_temporal_kernel": 3,
        "stem_maxpool": false,
        "stage_planes": 64,
        "stage_temporal_kernel_basis": [
            [
                3
            ],
            [
                3
            ],
            [
                3
            ],
            [
                3
            ]
        ],
        "temporal_conv_1x1": [
            false,
            false,
            false,
            false
        ],
        "stage_temporal_stride": [
            1,
            2,
            2,
            2
        ],
        "stage_spatial_stride": [
            1,
            2,
            2,
            2
        ],
        "num_groups": 1,
        "width_per_group": 64,
        "num_classes": 101,
        "heads": [
            {
                "name": "fully_convolutional_linear",
                "unique_id": "default_head",
                "pool_size": [
                    4,
                    7,
                    7
                ],
                "activation_func": "softmax",
                "num_classes": 101,
                "fork_block": "pathway0-stage4-block2",
                "in_plane": 512
            }
        ]
    },
    "optimizer": {
        "name": "sgd",
        "param_schedulers": {
            "lr": {
                "name": "composite",
                "schedulers": [
                    {
                        "name": "linear",
                        "start_value": 0.005,
                        "end_value": 0.04
                    },
                    {
                        "name": "cosine",
                        "start_value": 0.04,
                        "end_value": 4e-05
                    }
                ],
                "lengths": [
                    0.13,
                    0.87
                ],
                "update_interval": "epoch",
                "interval_scaling": [
                    "rescaled",
                    "rescaled"
                ]
            }
        },
        "weight_decay": 0.005,
        "momentum": 0.9,
        "nesterov": true,
        "num_epochs": 3000000,
        "lr": 0.1,
        "use_larc": false,
        "larc_config": {
            "clip": true,
            "eps": 1e-08,
            "trust_coefficient": 0.02
        }
    }
}
INFO:root:Number of parameters in model: 63527717
INFO:root:FLOPs for forward pass: 150950 MFLOPs
INFO:root:Number of activations in model: 65329152
INFO:root:Approximate meters: [0] train phase 0 (0.84% done), loss: 4.8705, meters: [accuracy_meter(top_1=0.012500,top_5=0.025000), video_accuracy_meter(top_1=0.012500,top_5=0.025000)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (1.68% done), loss: 4.8350, meters: [accuracy_meter(top_1=0.025000,top_5=0.037500), video_accuracy_meter(top_1=0.025000,top_5=0.037500)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (2.52% done), loss: 4.8853, meters: [accuracy_meter(top_1=0.029167,top_5=0.050000), video_accuracy_meter(top_1=0.029167,top_5=0.050000)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (3.36% done), loss: 4.9368, meters: [accuracy_meter(top_1=0.021875,top_5=0.046875), video_accuracy_meter(top_1=0.021875,top_5=0.046875)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (4.19% done), loss: 4.9501, meters: [accuracy_meter(top_1=0.020000,top_5=0.047500), video_accuracy_meter(top_1=0.020000,top_5=0.047500)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (5.03% done), loss: 4.9432, meters: [accuracy_meter(top_1=0.018750,top_5=0.047917), video_accuracy_meter(top_1=0.018750,top_5=0.047917)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (5.87% done), loss: 4.9325, meters: [accuracy_meter(top_1=0.017857,top_5=0.060714), video_accuracy_meter(top_1=0.017857,top_5=0.060714)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (6.71% done), loss: 4.9288, meters: [accuracy_meter(top_1=0.017188,top_5=0.064062), video_accuracy_meter(top_1=0.017188,top_5=0.064062)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (7.55% done), loss: 4.9075, meters: [accuracy_meter(top_1=0.018056,top_5=0.063889), video_accuracy_meter(top_1=0.018056,top_5=0.063889)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (8.39% done), loss: 4.8897, meters: [accuracy_meter(top_1=0.018750,top_5=0.063750), video_accuracy_meter(top_1=0.018750,top_5=0.063750)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (9.23% done), loss: 4.8742, meters: [accuracy_meter(top_1=0.020455,top_5=0.072727), video_accuracy_meter(top_1=0.020455,top_5=0.072727)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (10.07% done), loss: 4.8573, meters: [accuracy_meter(top_1=0.020833,top_5=0.076042), video_accuracy_meter(top_1=0.020833,top_5=0.076042)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (10.91% done), loss: 4.8543, meters: [accuracy_meter(top_1=0.019231,top_5=0.074038), video_accuracy_meter(top_1=0.019231,top_5=0.074038)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (11.74% done), loss: 4.8426, meters: [accuracy_meter(top_1=0.019643,top_5=0.073214), video_accuracy_meter(top_1=0.019643,top_5=0.073214)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (12.58% done), loss: 4.8429, meters: [accuracy_meter(top_1=0.018333,top_5=0.072500), video_accuracy_meter(top_1=0.018333,top_5=0.072500)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (13.42% done), loss: 4.8333, meters: [accuracy_meter(top_1=0.018750,top_5=0.074219), video_accuracy_meter(top_1=0.018750,top_5=0.074219)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (14.26% done), loss: 4.8194, meters: [accuracy_meter(top_1=0.019853,top_5=0.077206), video_accuracy_meter(top_1=0.019853,top_5=0.077206)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (15.10% done), loss: 4.7999, meters: [accuracy_meter(top_1=0.020139,top_5=0.079861), video_accuracy_meter(top_1=0.020139,top_5=0.079861)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (15.94% done), loss: 4.7868, meters: [accuracy_meter(top_1=0.019737,top_5=0.085526), video_accuracy_meter(top_1=0.019737,top_5=0.085526)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (16.78% done), loss: 4.7729, meters: [accuracy_meter(top_1=0.020000,top_5=0.086875), video_accuracy_meter(top_1=0.020000,top_5=0.086875)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (17.62% done), loss: 4.7737, meters: [accuracy_meter(top_1=0.020238,top_5=0.085714), video_accuracy_meter(top_1=0.020238,top_5=0.085714)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (18.46% done), loss: 4.7648, meters: [accuracy_meter(top_1=0.020455,top_5=0.087500), video_accuracy_meter(top_1=0.020455,top_5=0.087500)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (19.30% done), loss: 4.7544, meters: [accuracy_meter(top_1=0.020109,top_5=0.090761), video_accuracy_meter(top_1=0.020109,top_5=0.090761)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (20.13% done), loss: 4.7446, meters: [accuracy_meter(top_1=0.020312,top_5=0.093229), video_accuracy_meter(top_1=0.020312,top_5=0.093229)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (20.97% done), loss: 4.7368, meters: [accuracy_meter(top_1=0.022000,top_5=0.095000), video_accuracy_meter(top_1=0.022000,top_5=0.095000)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (21.81% done), loss: 4.7282, meters: [accuracy_meter(top_1=0.023558,top_5=0.095673), video_accuracy_meter(top_1=0.023558,top_5=0.095673)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (22.65% done), loss: 4.7134, meters: [accuracy_meter(top_1=0.024074,top_5=0.098148), video_accuracy_meter(top_1=0.024074,top_5=0.098148)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (23.49% done), loss: 4.7056, meters: [accuracy_meter(top_1=0.024554,top_5=0.098661), video_accuracy_meter(top_1=0.024554,top_5=0.098661)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (24.33% done), loss: 4.6991, meters: [accuracy_meter(top_1=0.024569,top_5=0.100000), video_accuracy_meter(top_1=0.024569,top_5=0.100000)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (25.17% done), loss: 4.6883, meters: [accuracy_meter(top_1=0.026250,top_5=0.101667), video_accuracy_meter(top_1=0.026250,top_5=0.101667)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (26.01% done), loss: 4.6800, meters: [accuracy_meter(top_1=0.027016,top_5=0.103226), video_accuracy_meter(top_1=0.027016,top_5=0.103226)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (26.85% done), loss: 4.6754, meters: [accuracy_meter(top_1=0.026563,top_5=0.103125), video_accuracy_meter(top_1=0.026563,top_5=0.103125)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (27.68% done), loss: 4.6670, meters: [accuracy_meter(top_1=0.028030,top_5=0.104924), video_accuracy_meter(top_1=0.028030,top_5=0.104924)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (28.52% done), loss: 4.6583, meters: [accuracy_meter(top_1=0.029044,top_5=0.105147), video_accuracy_meter(top_1=0.029044,top_5=0.105147)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (29.36% done), loss: 4.6521, meters: [accuracy_meter(top_1=0.028571,top_5=0.106071), video_accuracy_meter(top_1=0.028571,top_5=0.106071)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (30.20% done), loss: 4.6461, meters: [accuracy_meter(top_1=0.029514,top_5=0.107292), video_accuracy_meter(top_1=0.029514,top_5=0.107292)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (31.04% done), loss: 4.6366, meters: [accuracy_meter(top_1=0.030743,top_5=0.109122), video_accuracy_meter(top_1=0.030743,top_5=0.109122)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (31.88% done), loss: 4.6297, meters: [accuracy_meter(top_1=0.029934,top_5=0.109211), video_accuracy_meter(top_1=0.029934,top_5=0.109211)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (32.72% done), loss: 4.6226, meters: [accuracy_meter(top_1=0.030449,top_5=0.109936), video_accuracy_meter(top_1=0.030449,top_5=0.109936)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (33.56% done), loss: 4.6202, meters: [accuracy_meter(top_1=0.030000,top_5=0.111562), video_accuracy_meter(top_1=0.030000,top_5=0.111562)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (34.40% done), loss: 4.6139, meters: [accuracy_meter(top_1=0.029878,top_5=0.113415), video_accuracy_meter(top_1=0.029878,top_5=0.113415)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (35.23% done), loss: 4.6086, meters: [accuracy_meter(top_1=0.030060,top_5=0.113988), video_accuracy_meter(top_1=0.030060,top_5=0.113988)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (36.07% done), loss: 4.6008, meters: [accuracy_meter(top_1=0.030523,top_5=0.115407), video_accuracy_meter(top_1=0.030523,top_5=0.115407)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (36.91% done), loss: 4.5942, meters: [accuracy_meter(top_1=0.031534,top_5=0.117330), video_accuracy_meter(top_1=0.031534,top_5=0.117330)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (37.75% done), loss: 4.5920, meters: [accuracy_meter(top_1=0.031667,top_5=0.118889), video_accuracy_meter(top_1=0.031667,top_5=0.118889)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (38.59% done), loss: 4.5869, meters: [accuracy_meter(top_1=0.032065,top_5=0.119837), video_accuracy_meter(top_1=0.032065,top_5=0.119837)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (39.43% done), loss: 4.5826, meters: [accuracy_meter(top_1=0.032181,top_5=0.120213), video_accuracy_meter(top_1=0.032181,top_5=0.120213)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (40.27% done), loss: 4.5796, meters: [accuracy_meter(top_1=0.032031,top_5=0.120573), video_accuracy_meter(top_1=0.032031,top_5=0.120573)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (41.11% done), loss: 4.5736, meters: [accuracy_meter(top_1=0.032908,top_5=0.122194), video_accuracy_meter(top_1=0.032908,top_5=0.122194)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (41.95% done), loss: 4.5696, meters: [accuracy_meter(top_1=0.033000,top_5=0.122250), video_accuracy_meter(top_1=0.033000,top_5=0.122250)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (42.79% done), loss: 4.5642, meters: [accuracy_meter(top_1=0.033578,top_5=0.123284), video_accuracy_meter(top_1=0.033578,top_5=0.123284)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (43.62% done), loss: 4.5570, meters: [accuracy_meter(top_1=0.033894,top_5=0.124519), video_accuracy_meter(top_1=0.033894,top_5=0.124519)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (44.46% done), loss: 4.5494, meters: [accuracy_meter(top_1=0.034670,top_5=0.125943), video_accuracy_meter(top_1=0.034670,top_5=0.125943)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (45.30% done), loss: 4.5461, meters: [accuracy_meter(top_1=0.034722,top_5=0.126157), video_accuracy_meter(top_1=0.034722,top_5=0.126157)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (46.14% done), loss: 4.5412, meters: [accuracy_meter(top_1=0.035000,top_5=0.127500), video_accuracy_meter(top_1=0.035000,top_5=0.127500)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (46.98% done), loss: 4.5348, meters: [accuracy_meter(top_1=0.035268,top_5=0.129241), video_accuracy_meter(top_1=0.035268,top_5=0.129241)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (47.82% done), loss: 4.5319, meters: [accuracy_meter(top_1=0.035746,top_5=0.130482), video_accuracy_meter(top_1=0.035746,top_5=0.130482)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (48.66% done), loss: 4.5266, meters: [accuracy_meter(top_1=0.036207,top_5=0.131681), video_accuracy_meter(top_1=0.036207,top_5=0.131681)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (49.50% done), loss: 4.5224, meters: [accuracy_meter(top_1=0.036017,top_5=0.132203), video_accuracy_meter(top_1=0.036017,top_5=0.132203)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (50.34% done), loss: 4.5178, meters: [accuracy_meter(top_1=0.036458,top_5=0.133958), video_accuracy_meter(top_1=0.036458,top_5=0.133958)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (51.17% done), loss: 4.5145, meters: [accuracy_meter(top_1=0.036885,top_5=0.135246), video_accuracy_meter(top_1=0.036885,top_5=0.135246)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (52.01% done), loss: 4.5121, meters: [accuracy_meter(top_1=0.036895,top_5=0.135484), video_accuracy_meter(top_1=0.036895,top_5=0.135484)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (52.85% done), loss: 4.5090, meters: [accuracy_meter(top_1=0.036706,top_5=0.135714), video_accuracy_meter(top_1=0.036706,top_5=0.135714)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (53.69% done), loss: 4.5047, meters: [accuracy_meter(top_1=0.036523,top_5=0.136719), video_accuracy_meter(top_1=0.036523,top_5=0.136719)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (54.53% done), loss: 4.5009, meters: [accuracy_meter(top_1=0.037115,top_5=0.136731), video_accuracy_meter(top_1=0.037115,top_5=0.136731)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (55.37% done), loss: 4.4985, meters: [accuracy_meter(top_1=0.037311,top_5=0.136364), video_accuracy_meter(top_1=0.037311,top_5=0.136364)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (56.21% done), loss: 4.5002, meters: [accuracy_meter(top_1=0.037313,top_5=0.135075), video_accuracy_meter(top_1=0.037313,top_5=0.135075)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (57.05% done), loss: 4.4956, meters: [accuracy_meter(top_1=0.037316,top_5=0.136397), video_accuracy_meter(top_1=0.037316,top_5=0.136397)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (57.89% done), loss: 4.4899, meters: [accuracy_meter(top_1=0.037500,top_5=0.137319), video_accuracy_meter(top_1=0.037500,top_5=0.137319)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (58.72% done), loss: 4.4852, meters: [accuracy_meter(top_1=0.037500,top_5=0.138571), video_accuracy_meter(top_1=0.037500,top_5=0.138571)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (59.56% done), loss: 4.4816, meters: [accuracy_meter(top_1=0.037676,top_5=0.139437), video_accuracy_meter(top_1=0.037676,top_5=0.139437)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (60.40% done), loss: 4.4760, meters: [accuracy_meter(top_1=0.037847,top_5=0.140278), video_accuracy_meter(top_1=0.037847,top_5=0.140278)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (61.24% done), loss: 4.4739, meters: [accuracy_meter(top_1=0.037842,top_5=0.140753), video_accuracy_meter(top_1=0.037842,top_5=0.140753)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (62.08% done), loss: 4.4686, meters: [accuracy_meter(top_1=0.038851,top_5=0.142568), video_accuracy_meter(top_1=0.038851,top_5=0.142568)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (62.92% done), loss: 4.4676, meters: [accuracy_meter(top_1=0.039333,top_5=0.142833), video_accuracy_meter(top_1=0.039333,top_5=0.142833)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (63.76% done), loss: 4.4645, meters: [accuracy_meter(top_1=0.039474,top_5=0.142928), video_accuracy_meter(top_1=0.039474,top_5=0.142928)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (64.60% done), loss: 4.4592, meters: [accuracy_meter(top_1=0.039610,top_5=0.143669), video_accuracy_meter(top_1=0.039610,top_5=0.143669)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (65.44% done), loss: 4.4545, meters: [accuracy_meter(top_1=0.040224,top_5=0.145032), video_accuracy_meter(top_1=0.040224,top_5=0.145032)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (66.28% done), loss: 4.4493, meters: [accuracy_meter(top_1=0.041456,top_5=0.146519), video_accuracy_meter(top_1=0.041456,top_5=0.146519)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (67.11% done), loss: 4.4460, meters: [accuracy_meter(top_1=0.042344,top_5=0.147344), video_accuracy_meter(top_1=0.042344,top_5=0.147344)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (67.95% done), loss: 4.4432, meters: [accuracy_meter(top_1=0.042593,top_5=0.148611), video_accuracy_meter(top_1=0.042593,top_5=0.148611)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (68.79% done), loss: 4.4393, meters: [accuracy_meter(top_1=0.042683,top_5=0.148780), video_accuracy_meter(top_1=0.042683,top_5=0.148780)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (69.63% done), loss: 4.4351, meters: [accuracy_meter(top_1=0.043373,top_5=0.149849), video_accuracy_meter(top_1=0.043373,top_5=0.149849)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (70.47% done), loss: 4.4336, meters: [accuracy_meter(top_1=0.043750,top_5=0.150446), video_accuracy_meter(top_1=0.043750,top_5=0.150446)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (71.31% done), loss: 4.4304, meters: [accuracy_meter(top_1=0.044118,top_5=0.150882), video_accuracy_meter(top_1=0.044118,top_5=0.150882)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (72.15% done), loss: 4.4279, meters: [accuracy_meter(top_1=0.044477,top_5=0.151163), video_accuracy_meter(top_1=0.044477,top_5=0.151163)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (72.99% done), loss: 4.4263, meters: [accuracy_meter(top_1=0.044253,top_5=0.151580), video_accuracy_meter(top_1=0.044253,top_5=0.151580)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (73.83% done), loss: 4.4222, meters: [accuracy_meter(top_1=0.044886,top_5=0.152415), video_accuracy_meter(top_1=0.044886,top_5=0.152415)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (74.66% done), loss: 4.4202, meters: [accuracy_meter(top_1=0.044944,top_5=0.152949), video_accuracy_meter(top_1=0.044944,top_5=0.152949)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (75.50% done), loss: 4.4174, meters: [accuracy_meter(top_1=0.044861,top_5=0.153889), video_accuracy_meter(top_1=0.044861,top_5=0.153889)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (76.34% done), loss: 4.4140, meters: [accuracy_meter(top_1=0.044780,top_5=0.154808), video_accuracy_meter(top_1=0.044780,top_5=0.154808)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (77.18% done), loss: 4.4097, meters: [accuracy_meter(top_1=0.045380,top_5=0.155978), video_accuracy_meter(top_1=0.045380,top_5=0.155978)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (78.02% done), loss: 4.4063, meters: [accuracy_meter(top_1=0.045699,top_5=0.156989), video_accuracy_meter(top_1=0.045699,top_5=0.156989)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (78.86% done), loss: 4.4029, meters: [accuracy_meter(top_1=0.046676,top_5=0.158378), video_accuracy_meter(top_1=0.046676,top_5=0.158378)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (79.70% done), loss: 4.3999, meters: [accuracy_meter(top_1=0.046579,top_5=0.158947), video_accuracy_meter(top_1=0.046579,top_5=0.158947)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (80.54% done), loss: 4.3972, meters: [accuracy_meter(top_1=0.046354,top_5=0.158854), video_accuracy_meter(top_1=0.046354,top_5=0.158854)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (81.38% done), loss: 4.3947, meters: [accuracy_meter(top_1=0.046392,top_5=0.159794), video_accuracy_meter(top_1=0.046392,top_5=0.159794)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (82.21% done), loss: 4.3930, meters: [accuracy_meter(top_1=0.046556,top_5=0.159949), video_accuracy_meter(top_1=0.046556,top_5=0.159949)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (83.05% done), loss: 4.3905, meters: [accuracy_meter(top_1=0.046338,top_5=0.160732), video_accuracy_meter(top_1=0.046338,top_5=0.160732)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (83.89% done), loss: 4.3892, meters: [accuracy_meter(top_1=0.046000,top_5=0.160875), video_accuracy_meter(top_1=0.046000,top_5=0.160875)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (84.73% done), loss: 4.3870, meters: [accuracy_meter(top_1=0.046287,top_5=0.161510), video_accuracy_meter(top_1=0.046287,top_5=0.161510)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (85.57% done), loss: 4.3851, meters: [accuracy_meter(top_1=0.046078,top_5=0.161887), video_accuracy_meter(top_1=0.046078,top_5=0.161887)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (86.41% done), loss: 4.3797, meters: [accuracy_meter(top_1=0.046723,top_5=0.163956), video_accuracy_meter(top_1=0.046723,top_5=0.163956)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (87.25% done), loss: 4.3759, meters: [accuracy_meter(top_1=0.046995,top_5=0.164663), video_accuracy_meter(top_1=0.046995,top_5=0.164663)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (88.09% done), loss: 4.3731, meters: [accuracy_meter(top_1=0.047381,top_5=0.165000), video_accuracy_meter(top_1=0.047381,top_5=0.165000)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (88.93% done), loss: 4.3693, meters: [accuracy_meter(top_1=0.047524,top_5=0.165920), video_accuracy_meter(top_1=0.047524,top_5=0.165920)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (89.77% done), loss: 4.3674, meters: [accuracy_meter(top_1=0.047897,top_5=0.166355), video_accuracy_meter(top_1=0.047897,top_5=0.166355)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (90.60% done), loss: 4.3626, meters: [accuracy_meter(top_1=0.048264,top_5=0.167245), video_accuracy_meter(top_1=0.048264,top_5=0.167245)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (91.44% done), loss: 4.3601, meters: [accuracy_meter(top_1=0.048050,top_5=0.167317), video_accuracy_meter(top_1=0.048050,top_5=0.167317)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (92.28% done), loss: 4.3575, meters: [accuracy_meter(top_1=0.048409,top_5=0.167614), video_accuracy_meter(top_1=0.048409,top_5=0.167614)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (93.12% done), loss: 4.3543, meters: [accuracy_meter(top_1=0.048986,top_5=0.169032), video_accuracy_meter(top_1=0.048986,top_5=0.169032)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (93.96% done), loss: 4.3506, meters: [accuracy_meter(top_1=0.049442,top_5=0.169866), video_accuracy_meter(top_1=0.049442,top_5=0.169866)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (94.80% done), loss: 4.3464, meters: [accuracy_meter(top_1=0.050221,top_5=0.171018), video_accuracy_meter(top_1=0.050221,top_5=0.171018)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (95.64% done), loss: 4.3408, meters: [accuracy_meter(top_1=0.051096,top_5=0.172807), video_accuracy_meter(top_1=0.051096,top_5=0.172807)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (96.48% done), loss: 4.3366, meters: [accuracy_meter(top_1=0.051848,top_5=0.174565), video_accuracy_meter(top_1=0.051848,top_5=0.174565)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (97.32% done), loss: 4.3333, meters: [accuracy_meter(top_1=0.052263,top_5=0.175216), video_accuracy_meter(top_1=0.052263,top_5=0.175216)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (98.15% done), loss: 4.3321, meters: [accuracy_meter(top_1=0.052030,top_5=0.175427), video_accuracy_meter(top_1=0.052030,top_5=0.175427)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (98.99% done), loss: 4.3299, meters: [accuracy_meter(top_1=0.052331,top_5=0.176377), video_accuracy_meter(top_1=0.052331,top_5=0.176377)], lr: 0.0050
INFO:root:Approximate meters: [0] train phase 0 (99.83% done), loss: 4.3267, meters: [accuracy_meter(top_1=0.052521,top_5=0.176891), video_accuracy_meter(top_1=0.052521,top_5=0.176891)], lr: 0.0050
INFO:root:Synced meters: [0] train phase 0 (100.00% done), loss: 4.3269, meters: [accuracy_meter(top_1=0.052538,top_5=0.176804), video_accuracy_meter(top_1=0.052538,top_5=0.176804)], lr: 0.0050, processed batches: 596
INFO:root:Plotting to Tensorboard for train phase 0
INFO:root:Summary name Learning Rate/train is illegal; using Learning_Rate/train instead. (repeated 60 times)
INFO:root:Done plotting to Tensorboard
INFO:root:Saving checkpoint to './checkpoint_101'...

I followed your advice and waited for a long time. The program did not stop, but it was unable to enter the next training phase. Can you help me? @mannatsingh

How to train FixResNeXt-101 32x*d?

❓ Questions and Help

I want to train FixResNeXt-101 32x48d or EfficientNet-B*. What should I do? Thanks a lot.
