chengyangfu / pytorch-vgg-cifar10

This is a PyTorch implementation of the VGG network trained on the CIFAR10 dataset

License: MIT License

Python 12.84% Jupyter Notebook 86.63% Shell 0.53%

pytorch-vgg-cifar10's Issues

How is the data divided among workers

Hello, I want to know if the data is divided equally among all workers?

Is your code based on a code template?

I once saw AlexNet code that is very similar to your coding style (https://github.com/jiecaoyu/pytorch_imagenet)...
May I ask whether your code is based on a standard code template?

Question regarding VGG cfg

Sorry, I have a basic question.
In the cfg: [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],

what does 'M' stand for?
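
In VGG-style configuration lists such as this one, 'M' conventionally marks a max-pooling layer, and each integer is the number of output channels of a convolution. The sketch below assumes that convention (2x2 pooling with stride 2, and 3x3 convolutions with padding 1), which is how torchvision's VGG builds its layers; whether this repository matches it exactly should be checked against its source. The helper name `trace_cfg` is hypothetical. It traces how the feature map of a 32x32 CIFAR-10 image changes shape through the cfg.

```python
def trace_cfg(cfg, in_channels=3, size=32):
    """Return (layer, channels, height/width) for each entry of a VGG cfg.

    Assumption: 'M' is a 2x2 max pool with stride 2 (halves the spatial size);
    an integer is a 3x3 convolution with padding 1 (keeps the spatial size).
    """
    shapes = []
    channels = in_channels
    for v in cfg:
        if v == 'M':
            size //= 2  # pooling halves height and width
            shapes.append(('maxpool', channels, size))
        else:
            channels = v  # convolution changes the channel count only
            shapes.append(('conv3x3', channels, size))
    return shapes


cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']
for name, channels, size in trace_cfg(cfg):
    print(f"{name:8s} -> {channels:3d} channels, {size}x{size}")
```

Under these assumptions the five 'M' entries reduce a 32x32 input to 1x1, so the classifier would receive 512 features.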

Should it be decayed by 2?

pytorch-vgg-cifar10/main.py

Line 271 in 2a41b25

"""Sets the learning rate to the initial LR decayed by 10 every 30 epochs"""

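The distinction this issue raises can be made concrete with a small step-decay sketch. It compares a schedule that halves the learning rate every 30 epochs with one that divides it by 10, as the quoted docstring says. The function name `step_decay_lr` and the initial rate of 0.05 are illustrative assumptions; which factor main.py actually applies should be checked at the referenced line.

```python
def step_decay_lr(initial_lr, epoch, factor, step=30):
    """Learning rate after multiplying by `factor` once every `step` epochs."""
    return initial_lr * (factor ** (epoch // step))


# Compare "decayed by 2" (factor 0.5) with "decayed by 10" (factor 0.1).
for epoch in (0, 29, 30, 60, 90):
    by_two = step_decay_lr(0.05, epoch, factor=0.5)
    by_ten = step_decay_lr(0.05, epoch, factor=0.1)
    print(f"epoch {epoch:2d}: by 2 -> {by_two:.6f}, by 10 -> {by_ten:.6f}")
```

The two schedules agree for the first 30 epochs and diverge quickly afterwards, so a docstring that names the wrong factor can be misleading when reproducing results.
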
I need to train this model (VGG) on the CIFAR-10 dataset using an oversampling technique, but I do not know how to do it in PyTorch.
I want to simulate a real-world dataset, since real-world classes are usually imbalanced, and then adapt the network so it still learns the rare classes. Because CIFAR-10 is a balanced dataset, I first need to simulate class imbalance in it and then apply an oversampling technique. Could you give me an example?
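
One possible approach, sketched here in plain Python rather than taken from this repository, is to subsample some classes to create the imbalance and then draw training samples with probability inversely proportional to class frequency (random oversampling). The helper names, the toy label list, and the keep fractions are all illustrative assumptions. In PyTorch, the kept indices could be passed to `torch.utils.data.Subset` and the weights to `torch.utils.data.WeightedRandomSampler`.

```python
import random
from collections import Counter


def make_imbalanced(labels, keep_fraction, seed=0):
    """Return sorted indices where class c keeps keep_fraction[c] of its samples."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    kept = []
    for label, indices in by_class.items():
        # Classes not listed in keep_fraction are kept in full.
        n_keep = max(1, int(len(indices) * keep_fraction.get(label, 1.0)))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


def oversampling_weights(labels):
    """Per-sample weights inversely proportional to the sample's class count."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Toy stand-in for the CIFAR-10 label list: 10 classes, 100 samples each.
labels = [c for c in range(10) for _ in range(100)]
subset = make_imbalanced(labels, {0: 0.1, 1: 0.2})  # classes 0 and 1 become rare
subset_labels = [labels[i] for i in subset]
weights = oversampling_weights(subset_labels)

# Random oversampling: draw with replacement according to the weights.
rng = random.Random(0)
resampled = rng.choices(subset_labels, weights=weights, k=len(subset_labels))
print("imbalanced:", sorted(Counter(subset_labels).items()))
print("resampled: ", sorted(Counter(resampled).items()))
```

With these weights every class has the same total sampling probability, so rare classes are drawn about as often as common ones, at the cost of repeating their images.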

Which PyTorch version did you use to train this model? I am trying to call torch.load on my CPU, but even with map_location set to the CPU it still fails with an assertion error saying PyTorch was not compiled with CUDA enabled.

Which model is stored in the model_best.pth file?

I need a pretrained VGG16 model,
but loading the file gives this error:

RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.features.0.bias", "module.features.0.weight", "module.features.2.bias", "module.features.2.weight", "module.features.5.bias", "module.features.5.weight", "module.features.7.bias", "module.features.7.weight", "module.features.10.bias", "module.features.10.weight", "module.features.12.bias", "module.features.12.weight", "module.features.14.bias", "module.features.14.weight", "module.features.17.bias", "module.features.17.weight", "module.features.19.bias", "module.features.19.weight", "module.features.21.bias", "module.features.21.weight", "module.features.24.bias", "module.features.24.weight", "module.features.26.bias", "module.features.26.weight", "module.features.28.bias", "module.features.28.weight", "module.classifier.1.bias", "module.classifier.1.weight", "module.classifier.4.bias", "module.classifier.4.weight", "module.classifier.6.bias", "module.classifier.6.weight".
Unexpected key(s) in state_dict: "state_dict", "best_prec1", "epoch".

What is the exact model?
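
The "Unexpected key(s)" line suggests the file holds a checkpoint dictionary (with `state_dict`, `best_prec1`, and `epoch` entries) rather than a bare state dict. A minimal sketch of unwrapping it follows; the helper name `extract_state_dict` is hypothetical, and the toy dictionary stands in for whatever `torch.load` returns, with placeholder keys and values.

```python
def extract_state_dict(checkpoint):
    """Return the weights mapping, unwrapping a checkpoint dict if needed."""
    if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
        return checkpoint['state_dict']
    return checkpoint  # already a bare state dict


# Toy stand-in for the loaded checkpoint; keys and values are placeholders.
checkpoint = {
    'epoch': 1,
    'best_prec1': 0.0,
    'state_dict': {'layer.weight': 'tensor', 'layer.bias': 'tensor'},
}
state_dict = extract_state_dict(checkpoint)
print(sorted(state_dict))
```

The result would then be passed to `model.load_state_dict`. The parameter names must also match how the model is wrapped: a model wrapped as a whole in `DataParallel` expects a `module.` prefix on every key, which is what the "Missing key(s)" list above shows.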

The version of Pytorch and torchvision

Please pardon my poor English.
Which versions of PyTorch and torchvision did you use?
I ran into the following error:

RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead

I haven't figured out what the cause is ...

I ran your code and always get a NaN loss; can you help me?

I just ran ./run.sh and got a NaN loss after a few steps. Here is the printed log:

(base) root@For-Judy-And-Ian:~/pytorchProjects/pytorch-vgg-cifar10-master# ./run.sh
python main.py --arch=vgg11 --save-dir=save_vgg11 |& tee -a log_vgg11
Files already downloaded and verified
Epoch: [0 ][ 0 /391] Time 0.831 (0.831) Data 0.190 (0.190) Loss 2.3037 (2.3037) Prec@1 10.938 (10.938)
Epoch: [0 ][20 /391] Time 0.018 (0.052) Data 0.000 (0.009) Loss 2.2982 (2.3029) Prec@1 9.375 (9.487)
Epoch: [0 ][40 /391] Time 0.012 (0.035) Data 0.000 (0.005) Loss 2.2928 (2.3018) Prec@1 12.500 (9.546)
Epoch: [0 ][60 /391] Time 0.012 (0.030) Data 0.000 (0.003) Loss 2.2685 (2.2970) Prec@1 15.625 (10.720)
Epoch: [0 ][80 /391] Time 0.012 (0.027) Data 0.000 (0.002) Loss 2.1417 (2.2787) Prec@1 21.875 (11.960)
Epoch: [0 ][100/391] Time 0.016 (0.026) Data 0.000 (0.002) Loss 2.1417 (2.2518) Prec@1 22.656 (13.134)
Epoch: [0 ][120/391] Time 0.029 (0.024) Data 0.000 (0.002) Loss 1.9975 (2.2189) Prec@1 21.094 (14.463)
Epoch: [0 ][140/391] Time 0.028 (0.024) Data 0.000 (0.002) Loss 2.0889 (2.1959) Prec@1 26.562 (15.459)
Epoch: [0 ][160/391] Time 0.018 (0.023) Data 0.000 (0.001) Loss 2.0179 (2.1856) Prec@1 21.875 (16.193)
Epoch: [0 ][180/391] Time 0.012 (0.023) Data 0.000 (0.001) Loss 1.9825 (2.1645) Prec@1 25.000 (16.894)
Epoch: [0 ][200/391] Time 0.012 (0.022) Data 0.000 (0.001) Loss 1.8724 (2.1434) Prec@1 27.344 (17.623)
Epoch: [0 ][220/391] Time 0.012 (0.022) Data 0.000 (0.001) Loss 2.0147 (2.1258) Prec@1 25.000 (18.121)
Epoch: [0 ][240/391] Time 0.012 (0.021) Data 0.000 (0.001) Loss 1.8679 (2.1128) Prec@1 22.656 (18.458)
Epoch: [0 ][260/391] Time 0.016 (0.021) Data 0.000 (0.001) Loss 1.8262 (2.0923) Prec@1 28.125 (19.202)
Epoch: [0 ][280/391] Time 0.012 (0.021) Data 0.000 (0.001) Loss 1.7779 (2.0737) Prec@1 31.250 (19.834)
Epoch: [0 ][300/391] Time 0.011 (0.020) Data 0.000 (0.001) Loss 1.7415 (2.0569) Prec@1 38.281 (20.359)
Epoch: [0 ][320/391] Time 0.012 (0.020) Data 0.000 (0.001) Loss 1.7895 (2.0431) Prec@1 26.562 (20.863)
Epoch: [0 ][340/391] Time 0.012 (0.020) Data 0.000 (0.001) Loss 1.7198 (2.0292) Prec@1 31.250 (21.355)
Epoch: [0 ][360/391] Time 0.012 (0.019) Data 0.000 (0.001) Loss 1.9042 (2.0171) Prec@1 27.344 (21.827)
Epoch: [0 ][380/391] Time 0.012 (0.019) Data 0.000 (0.001) Loss 2.6430 (2.0338) Prec@1 12.500 (21.900)
Test[0/79] Time 0.136 (0.136) Loss 2.3228 (2.3228) Prec@1 10.938 (10.938)
Test[20/79] Time 0.004 (0.013) Loss 2.3267 (2.3337) Prec@1 7.812 (8.891)
Test[40/79] Time 0.013 (0.009) Loss 2.3235 (2.3322) Prec@1 10.156 (8.670)
Test[60/79] Time 0.011 (0.009) Loss 2.3311 (2.3303) Prec@1 10.156 (8.799)
* Prec@1 8.810
Epoch: [1 ][ 0 /391] Time 0.099 (0.099) Data 0.085 (0.085) Loss 2.3538 (2.3538) Prec@1 8.594 (8.594)
Epoch: [1 ][20 /391] Time 0.028 (0.021) Data 0.000 (0.005) Loss nan (nan) Prec@1 1.562 (8.036)
Epoch: [1 ][40 /391] Time 0.018 (0.019) Data 0.000 (0.003) Loss nan (nan) Prec@1 1.562 (5.011)
Epoch: [1 ][60 /391] Time 0.012 (0.017) Data 0.000 (0.002) Loss nan (nan) Prec@1 1.562 (3.893)
Epoch: [1 ][80 /391] Time 0.012 (0.016) Data 0.000 (0.002) Loss nan (nan) Prec@1 2.344 (3.279)
Epoch: [1 ][100/391] Time 0.013 (0.016) Data 0.002 (0.001) Loss nan (nan) Prec@1 2.344 (2.908)
Epoch: [1 ][120/391] Time 0.017 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 3.906 (2.686)
Epoch: [1 ][140/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.549)
Epoch: [1 ][160/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.451)
Epoch: [1 ][180/391] Time 0.017 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 3.906 (2.348)
Epoch: [1 ][200/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.320)
Epoch: [1 ][220/391] Time 0.011 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.238)
Epoch: [1 ][240/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.217)
Epoch: [1 ][260/391] Time 0.013 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 1.562 (2.176)
Epoch: [1 ][280/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.149)
Epoch: [1 ][300/391] Time 0.016 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 1.562 (2.108)
Epoch: [1 ][320/391] Time 0.018 (0.015) Data 0.007 (0.001) Loss nan (nan) Prec@1 0.781 (2.078)
Epoch: [1 ][340/391] Time 0.017 (0.015) Data 0.006 (0.001) Loss nan (nan) Prec@1 3.125 (2.067)
Epoch: [1 ][360/391] Time 0.018 (0.015) Data 0.006 (0.001) Loss nan (nan) Prec@1 3.906 (2.052)
Epoch: [1 ][380/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.010)
Test[0/79] Time 0.094 (0.094) Loss nan (nan) Prec@1 0.000 (0.000)
Test[20/79] Time 0.009 (0.014) Loss nan (nan) Prec@1 0.000 (0.335)
Test[40/79] Time 0.015 (0.015) Loss nan (nan) Prec@1 0.000 (0.419)
Test[60/79] Time 0.015 (0.015) Loss nan (nan) Prec@1 0.000 (0.538)
* Prec@1 0.540

License?

Hi Cheng-Yang,

Thanks for the CIFAR10 training code.

Would you be willing to release this code under an open source license? See e.g. the licenses here:

https://opensource.org/licenses

If so, then others will be more likely to contribute to your codebase, or incorporate it into their projects.

I used your code and the results are all 100% or 99.9%?

I used main.py to train on CIFAR10 and the result is hard to believe: the test accuracy is 100% or 99.9%.

Why not give the whole model to DataParallel?

I am confused by the following code in the main function:
model.features = torch.nn.DataParallel(model.features)
May I ask why the whole model is not given to DataParallel?
