Comments (8)
Thank you for your issue.
At present, distributed mode is required for search even if only one GPU is used. This is hacky, and we are refactoring the search code; the new version will no longer have this problem.
from mmrazor.
Has this problem been solved in the current version? I've run into the same issue.
You can avoid this by trying distributed mode.
Also, using English is appreciated, so the community around the world can follow the discussion.
Where do I do the setup you mentioned?
You can set the job launcher to one of pytorch, slurm, or mpi (see here) to use distributed mode.
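For orientation, the three launcher types differ mainly in where the process rank comes from. The sketch below is illustrative only (the function name is made up, loosely modeled on mmcv.runner.init_dist), not mmcv's actual implementation:

```python
def init_dist_sketch(launcher: str) -> str:
    """Illustrative dispatch over launcher types, as in mmcv's init_dist.

    Each launcher derives the process rank from a different source;
    the returned strings here just describe that source.
    """
    dispatch = {
        'pytorch': 'read RANK/WORLD_SIZE from env (set by torchrun or torch.distributed.launch)',
        'slurm': 'derive rank from the SLURM_PROCID environment variable',
        'mpi': 'derive rank from the OMPI_COMM_WORLD_RANK environment variable',
    }
    if launcher not in dispatch:
        raise ValueError(f'Invalid launcher type: {launcher}')
    return dispatch[launcher]
```

With --launcher pytorch, you are responsible for ensuring the RANK-related environment variables exist, which is what the rest of this thread works through.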
$ python ./tools/mmcls/search_mmcls.py \
> configs/pruning/autoslim/autoslim_mbv2_search_8xb1024_ci10.py \
> output/epoch_50.pth \
> --work-dir output \
> --launcher pytorch
/home/tanghuayang/venv_torch/lib/python3.6/site-packages/mmrazor/utils/setup_env.py:33: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
f'Setting OMP_NUM_THREADS environment variable for each process '
/home/tanghuayang/venv_torch/lib/python3.6/site-packages/mmrazor/utils/setup_env.py:43: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
f'Setting MKL_NUM_THREADS environment variable for each process '
Traceback (most recent call last):
File "./tools/mmcls/search_mmcls.py", line 181, in <module>
main()
File "./tools/mmcls/search_mmcls.py", line 99, in main
init_dist(args.launcher, **cfg.dist_params)
File "/home/tanghuayang/venv_torch/lib64/python3.6/site-packages/mmcv/runner/dist_utils.py", line 18, in init_dist
_init_dist_pytorch(backend, **kwargs)
File "/home/tanghuayang/venv_torch/lib64/python3.6/site-packages/mmcv/runner/dist_utils.py", line 29, in _init_dist_pytorch
rank = int(os.environ['RANK'])
File "/usr/lib64/python3.6/os.py", line 669, in __getitem__
raise KeyError(key) from None
KeyError: 'RANK'
Is it necessary to configure cfg.dist_params? And if so, how should it be configured?
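For context, the traceback above fails because mmcv's _init_dist_pytorch reads RANK straight from the environment, so launching with plain python (no distributed launcher) raises KeyError. A minimal sketch of the two behaviors (function names here are illustrative, not mmcv API):

```python
import os

def read_rank_strict() -> int:
    # Mirrors the failing line in mmcv's _init_dist_pytorch:
    # os.environ['RANK'] raises KeyError when no launcher
    # (torchrun / torch.distributed.launch) has set it.
    return int(os.environ['RANK'])

def read_rank_with_default() -> int:
    # Illustrative workaround: fall back to rank 0 for a
    # single-process run.
    return int(os.environ.get('RANK', 0))
```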
It's running now; use the following command:
$ RANK=0 WORLD_SIZE=1 MASTER_ADDR=127.0.0.1 MASTER_PORT=1692 python ./tools/mmcls/search_mmcls.py \
configs/pruning/autoslim/autoslim_mbv2_search_8xb1024_ci10.py \
output/epoch_50.pth \
--work-dir output \
--launcher pytorch
But how can these configuration parameters be written into cfg.dist_params?
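One caveat (to the best of my understanding of mmcv, so treat it as an assumption): cfg.dist_params is forwarded to torch.distributed's init_process_group (e.g. the backend), while RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT are read from the environment, so they do not belong in the config. A hypothetical helper, not part of mmrazor, could export them in Python instead of on the command line:

```python
import os

def setup_single_gpu_env(master_port: str = '29500') -> None:
    """Illustrative helper: export the variables the pytorch launcher
    expects for a single-process run, before calling init_dist.

    setdefault leaves any values already exported by a real launcher
    (e.g. torchrun) untouched.
    """
    os.environ.setdefault('RANK', '0')
    os.environ.setdefault('WORLD_SIZE', '1')
    os.environ.setdefault('MASTER_ADDR', '127.0.0.1')
    os.environ.setdefault('MASTER_PORT', master_port)
```

Calling this at the top of main(), before init_dist(args.launcher, **cfg.dist_params), should have the same effect as prefixing the command with the environment variables.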
Related Issues (20)
- I can't reproduce dfad results
- How to get started??
- [Bug] TypeError: 'NoneType' object is not iterable
- Try to reproduce CWD in VOC data set
- [Bug] (suggested temporary fix) Pytorch >= 2 causes mmrazor.engine to fail HOT 4
- [Bug] (suggested fix) `nn.Parameter` are not added to root after being traced in `mmrazor.models.task_utils.tracer.fx.custom_tracer.build_graphmodule()` HOT 2
- [Bug] (suggested fix) `mmrazor.models.algorithms.quantization.mm_architecture.MMArchitectureQuant.sync_qparams()` fails if there are modules present in other modes but not in forward `mode='tensor'` HOT 4
- I want to obtain the current epoch value and associate it with the custom distillation loss
- cannot use recorder to obtain panoptic_head info from mask2former
- [Bug] `mmrazor.engine.runner.quantization_loops.QATValLoop` calls `after_val_epoch` hook twice with different keys, causing `mmengine.hooks.checkpoint_hook._save_best_checkpoint()` to fail with `KeyError` for the `save_best` config
- [Bug] Custom Distillation MMSeg CWD loss nan problem
- When I use methodoutputs to access the results of assigner, I only obtain one sample
- Regarding tables and accuracy
- [Bug] (suggested fix) `mmrazor.models.algorithms.mm_architecture.MMArchitectureQuant.get_deploy_model()` fails if `predict` mode lacks nodes from the `model.quantizer.tracer.skipped_methods` configuration, but the architecture `quantizer.prepare(fp32_model)` has these nodes. HOT 4
- Is this a dead project ? HOT 1
- I ran into a problem when using mmrazor to distill yolov5-s from yolov5-x HOT 1
- No Sign of activation quantization with QAT HOT 1
- MAP is stucked at 0 for Mobilenet V2 SSD QAT without pretrained model [Bug]
- [Docs] Backed by A100 compute! Season 3 of the InternLM hands-on camp is fully upgraded; a fun challenge mode awaits
- Missing keys after RTMDET knowledge distillation HOT 1