Comments (6)

xvjiarui commented on May 28, 2024

Sorry for the late reply.
Could you please provide the error message?
The training procedure should be the same as mmdetection.

from gcnet.

aravind3134 commented on May 28, 2024

I tried to run a config file with the data location changed.

In my case, there are only 2 classes, and I also have to change the class names. I think that is the only reason I am getting the error.

Please let me know how to do it. What should be changed?

As of now, I get the following index error:

Traceback (most recent call last):
  File "./tools/train.py", line 103, in <module>
    main()
  File "./tools/train.py", line 99, in main
    logger=logger)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 60, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 189, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 358, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 260, in train
    for i, data_batch in enumerate(data_loader):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):

Thanks

xvjiarui commented on May 28, 2024

It seems that there is a problem with your data loader.
I suggest debugging with a single process, e.g. 1 GPU only, so that you can add breakpoints inside your code.
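
One way to act on the single-process suggestion is to index the dataset directly in the main process, so that a failing sample raises a full, readable traceback instead of the truncated one a worker process re-raises. The sketch below is only an illustration: `find_bad_samples` and `ToyDataset` are hypothetical names, not part of mmdetection or GCNet, and the toy dataset stands in for whatever dataset object the training script builds.

```python
def find_bad_samples(dataset, max_errors=5):
    """Index every sample in the current process and collect the failures."""
    failures = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]
        except Exception as exc:  # broad on purpose: report any kind of failure
            failures.append((idx, type(exc).__name__, str(exc)))
            if len(failures) >= max_errors:
                break
    return failures


class ToyDataset:
    """Hypothetical stand-in for a real dataset; sample 2 has a bad label."""

    def __init__(self):
        self.class_names = ["cat", "dog"]
        self.labels = [0, 1, 2, 1]  # label 2 is out of range for 2 classes

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.class_names[self.labels[idx]]


failures = find_bad_samples(ToyDataset())
for idx, exc_name, message in failures:
    print(f"sample {idx}: {exc_name}: {message}")
```

Once the failing index is known, a breakpoint (for example `import pdb; pdb.set_trace()`) can be placed in the dataset code for that sample only.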

aravind3134 commented on May 28, 2024

Hey, can you please tell me what changes are required to successfully train GCNet on a custom dataset created in the COCO format?

xvjiarui commented on May 28, 2024

I think there are two workarounds. Either of them should be fine.

  1. Convert your data into exactly the same format as the COCO annotations.
  2. Follow this to create your own dataset.
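
For the first option, it may help to check the annotation file before training. The sketch below assumes the usual COCO detection layout (top-level `images`, `annotations` and `categories` lists, with `bbox` given as `[x, y, width, height]`); `validate_coco` is a hypothetical helper written for this illustration, not a function from mmdetection or pycocotools, and the sample data is made up.

```python
def validate_coco(coco, require_segmentation=False):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append(f"missing top-level key: {key}")
    if problems:
        return problems

    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}

    for ann in coco["annotations"]:
        ann_id = ann.get("id")
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        bbox = ann.get("bbox", [])
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"annotation {ann_id}: bad bbox {bbox}")
        if require_segmentation and "segmentation" not in ann:
            problems.append(f"annotation {ann_id}: no segmentation")
    return problems


# Made-up two-class example; annotation 2 refers to a category that is not declared.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 100, 50], "area": 5000, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 3,
         "bbox": [30, 40, 60, 60], "area": 3600, "iscrowd": 0},
    ],
}

problems = validate_coco(sample)
print(problems)
```

Passing `require_segmentation=True` additionally flags annotations without a `segmentation` field, which matters when the chosen config trains a mask branch.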

aravind3134 commented on May 28, 2024

Hey, I am trying to train on my own data, which is in the same format as the COCO dataset, using one of the configuration files. As my data doesn't have a segmentation attribute, I tried running both my dataset and the COCO dataset with 'with_mask' set to 'False' in the config file. Do I need to change something else in the configuration file to make it work?

I am using the config file in this location: configs/gcnet/r50/mask_rcnn_r50_fpn_2x.py

Error:
Traceback (most recent call last):
  File "./tools/train.py", line 106, in <module>
    main()
  File "./tools/train.py", line 101, in main
    logger=logger)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 65, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 201, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 361, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 264, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 44, in batch_processor
    losses = model(**data)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/parallel/distributed.py", line 50, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/detectors/base.py", line 86, in forward
    return self.forward_train(img, img_meta, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/detectors/two_stage.py", line 183, in forward_train
    sampling_results, gt_masks, self.train_cfg.rcnn)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/mask_heads/fcn_mask_head.py", line 112, in get_target
    gt_masks, rcnn_train_cfg)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/core/mask/mask_target.py", line 10, in mask_target
    pos_assigned_gt_inds_list, gt_masks_list, cfg_list)
TypeError: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
    main()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
    cmd=process.args)
subprocess.CalledProcessError: Command '['/home/ubuntu/anaconda3/envs/tensorflow_p36/bin/python', '-u', './tools/train.py', '--local_rank=0', 'configs/gcnet/r50/mask_rcnn_r50_fpn_2x.py', '--launcher', 'pytorch']' returned non-zero exit status 1.
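
The traceback ends in `mask_target` inside the mask head with a `None` where ground-truth masks are expected, which suggests that the model still has a mask branch while the dataset no longer loads masks. The sketch below checks a config for that mismatch. It is a rough illustration that assumes the config keys look like `model['mask_head']` and `data['train']['with_mask']`, as the Mask R-CNN config and the `with_mask` setting mentioned above imply; `mask_config_conflict` is a hypothetical helper and the dicts are trimmed-down stand-ins for a real config file.

```python
def mask_config_conflict(cfg):
    """Return True if the model has a mask branch but training data loads no masks."""
    model = cfg.get("model", {})
    train = cfg.get("data", {}).get("train", {})
    has_mask_branch = model.get("mask_head") is not None
    loads_masks = bool(train.get("with_mask", False))
    return has_mask_branch and not loads_masks


# Trimmed-down stand-in for a Mask R-CNN style config (assumed key names).
mask_cfg = {
    "model": {"type": "MaskRCNN", "mask_head": {"type": "FCNMaskHead"}},
    "data": {"train": {"with_mask": False}},
}

# Same data settings, but a detector without a mask branch.
box_only_cfg = {
    "model": {"type": "FasterRCNN"},
    "data": {"train": {"with_mask": False}},
}

print(mask_config_conflict(mask_cfg))      # mask head present, masks not loaded
print(mask_config_conflict(box_only_cfg))  # no mask head, so no conflict
```

If the check reports a conflict, the two likely ways out are to provide segmentation annotations and keep `with_mask` enabled, or to train with a config whose model has no mask branch.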
