yeyun111 / dlcv_for_beginners
Companion code for 《深度学习与计算机视觉》 (Deep Learning and Computer Vision)
Home Page: https://frombeijingwithlove.github.io/dlcv_for_beginners/
License: BSD 3-Clause "New" or "Revised" License
The crawler could not handle Chinese search keywords; I got it working with two changes.
First, the Chinese characters in the URL need to be percent-encoded (this requires an extra import string):
url = URL_TEMPLATE.format(keyword=keyword, index=index)
url = urllib.parse.quote(url, safe=string.printable)
Second, keywords.txt should be opened in 'r' mode rather than 'rb':
with open('keywords.txt', 'r') as f:
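The two fixes above can be sketched together. URL_TEMPLATE below is a hypothetical stand-in for the crawler's real template, which lives in the repo's downloader script:

```python
import string
import urllib.parse

# Hypothetical stand-in for the crawler's URL_TEMPLATE.
URL_TEMPLATE = 'https://example.com/search?word={keyword}&pn={index}'

def build_url(keyword, index):
    url = URL_TEMPLATE.format(keyword=keyword, index=index)
    # Percent-encode non-ASCII (e.g. Chinese) characters while leaving
    # printable ASCII such as ':', '/', '?', '=' and '&' untouched.
    return urllib.parse.quote(url, safe=string.printable)

print(build_url('深度学习', 0))
```

With safe=string.printable, only the non-ASCII bytes are escaped, so the URL structure survives intact.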
Page 243:
pool2 = mx.symbol.Pooling(data=conv1, pool_type="max",
                          kernel=(2, 2), stride=(2, 2))
Here data should be conv2; otherwise the network effectively has only one convolutional layer.
Page 244:
mean_r=128,
scale=0.00390625,
This is incorrect: the mean of this dataset is not 128, so subtracting it shifts the data badly. With these settings the final accuracy is only 0.1. Please fix.
Hi,
I'm using the adversarial-example-caffe scripts to test my single-channel model.
I tried changing caffe.io.load_image(image_path, color=False) and uncommenting the channel_swap setting, but it didn't help. Any ideas?
This is the traceback I'm getting:
Traceback (most recent call last):
  File "adv.py", line 129, in <module>
    make_n_test_adversarial_example(img, net, transformer, epsilon, label_index=label_index)
  File "adv.py", line 40, in make_n_test_adversarial_example
    attack_hwc = transformer.deprocess(data_blob, transformed_img + adversarial_noise[0])
  File "/home/raja/Documents/rs/caffe/python/caffe/io.py", line 174, in deprocess
    decaf_in += mean
ValueError: non-broadcastable output operand with shape (224,224) doesn't match the broadcast shape (1,224,224)
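The ValueError can be reproduced in isolation. This sketch (assuming the mean blob really has shape (1, 224, 224), as the message suggests) also shows one way the shapes can be reconciled:

```python
import numpy as np

# Minimal reproduction of the shape mismatch: an in-place add cannot
# broadcast a (1, 224, 224) mean onto a (224, 224) output operand.
decaf_in = np.zeros((224, 224))
mean = np.ones((1, 224, 224))
try:
    decaf_in += mean
except ValueError as e:
    print('in-place broadcast fails:', e)

# Dropping the leading singleton axis of the mean (or keeping the image
# 3-D, matching the mean's layout) resolves it:
decaf_in += mean[0]
print(decaf_in.shape)
```

For a single-channel model, the underlying fix is making sure the deprocessed array and the transformer's mean have the same number of dimensions.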
Regarding "the translation/bias vector can be folded directly into the matrix multiplication": in the equation below that sentence, shouldn't the first entry on the right-hand side be 1*x + 1*y? The subsequent equalities are wrong as well.
Page 31: "as shown in the left part of Figure 2-4, so (0,1) in the standard coordinate system, i.e. the vector along the horizontal axis" — shouldn't that be (1,0)?
Does anyone know how to use Caffe to classify one-dimensional, non-image data?
Sorry — the error I reported earlier was my own misunderstanding.
P36, first line: "the corresponding eigenvectors are 1.81 and 0.69"
should read: "the corresponding eigenvalues are 1.81 and 0.69".
P39, the singular value decomposition formula (2-16) should be
M = UΣV*
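The corrected formula M = UΣV* can be checked numerically. A small NumPy sketch (np.linalg.svd returns V* directly as its third output; the matrix M here is made up for illustration):

```python
import numpy as np

# Verify the SVD identity M = U Σ V* from the erratum for formula (2-16).
M = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(M)   # Vh is V* (the conjugate transpose of V)
Sigma = np.diag(s)
reconstructed = U @ Sigma @ Vh
print(np.allclose(M, reconstructed))
```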
When I run dlcv_for_beginners/chap7/caffe/gen_hdf5.py, it raises this error:
IOError: [Errno 2] No such file or directory: '../data.pkl'
Thanks
In the left figure, the derivative of y with respect to t is mistakenly written as the derivative of t with respect to b.
In the right figure, the second-to-last node is mislabeled y; it should be t.
If the images generated with the Ising model get larger, how should stop and order in the code change? Is there a formula or any reference material?
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0202 10:45:07.652467 4063 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0202 10:45:07.652498 4063 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0202 10:45:07.652503 4063 _caffe.cpp:142] Net('lenet.prototxt', 1, weights='mnist_lenet_iter_36000.caffemodel')
Traceback (most recent call last):
  File "recognize_digit.py", line 14, in <module>
    net = caffe.Net('lenet.prototxt', 'mnist_lenet_iter_36000.caffemodel', caffe.TEST)
RuntimeError: Could not open file mnist_lenet_iter_36000.caffemodel
Could you help me figure out what the problem is?
$ python gan_demo.py --cpu inputs/zig.jpg
C:\Users\Administrator\Anaconda3\envs\torchcpu\lib\site-packages\torch\nn\functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
C:\Users\Administrator\Anaconda3\envs\torchcpu\lib\site-packages\torch\nn\modules\loss.py:512: UserWarning: Using a target size (torch.Size([2000])) that is different to the input size (torch.Size([2000, 1])) is deprecated. Please ensure they have the same size.
  return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
Traceback (most recent call last):
  File "gan_demo.py", line 88, in <module>
    loss_d_real = d_real_loss.data.cpu().numpy()[0] if cuda else d_real_loss.data.numpy()[0]
IndexError: too many indices for array
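A likely cause, sketched with NumPy: on recent PyTorch versions the loss is a 0-dimensional tensor, so loss.data.numpy() returns a 0-d (scalar) array, and indexing it with [0] raises exactly this IndexError. Replacing the expression with d_real_loss.item() is my assumed fix, not something confirmed in the repo:

```python
import numpy as np

# A 0-d array, like what loss.data.numpy() yields for a scalar loss tensor.
scalar = np.array(1.5)
try:
    scalar[0]                 # this is what gan_demo.py does
except IndexError as e:
    print('indexing a 0-d array fails:', e)
print(scalar.item())          # .item() extracts the scalar safely
```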
errata.pdf version: Mar 24, 2018
P34, second-to-last paragraph:
"本征向量的意思就是说【对】经过变换后" ("an eigenvector means that [for] after the transformation")
should be changed to "本征向量的意思就是说:对向量x进行矩阵A所对应的线性变换后," ("an eigenvector means that: after applying the linear transformation corresponding to matrix A to the vector x,").
P35, last sentence of the second paragraph:
"This is the geometric meaning of an eigenvalue: the transformation scales vectors lying in the direction of the corresponding eigenvector,"
should be changed to:
"This is the geometric meaning of an eigenvalue: the linear transformation corresponding to matrix A scales the eigenvectors of A."
P91
Figure 3-11: this illustration is wrong.
P96, the first few characters of the second-to-last line:
∂d/∂a = 2c = 2(a+b)
should be changed to: ∂d/∂b = 2c = 2(a+b)
P128
Last line: in "而每个输出也会【收到】所有输入的影响" ("and each output is also affected by all the inputs"), the typo 收到 should be 受到.
P130
The caption below Figure 4-11 a): "分布式表片..." should be "分布式表征..." ("distributed representation").
P184, second-to-last sentence of the paragraph below "the representation of images":
Shouldn't the second dimension be the width?
Also, shouldn't "RBG" in Figure 6-1 be RGB?
Page 27, after section 2.1.1 on homogeneity: in "给一个向量缩放一个倍数在变换" ("scale a vector by a factor and then transform it"), the typo 在变换 should be 再变换.
Above formula (2-9):
"the projected length of u onto v is the length of u is |u|cos(θ)"
Shouldn't this read: "the projected length of u onto v is |u|cos(θ)"?
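The corrected sentence can be checked with a small NumPy sketch (the vectors u and v here are made up for illustration):

```python
import numpy as np

# The projected length of u onto v is |u| cos(theta), i.e. (u . v) / |v|.
u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
proj_len = np.dot(u, v) / np.linalg.norm(v)
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(proj_len, np.linalg.norm(u) * cos_theta)   # the two agree
```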
The text says that in the convolutional case the kernel size is 2×2 = 4, for a total of 4×9 = 36 weights. But since convolution shares its parameters, shouldn't there be only 2×2 = 4 weight parameters here rather than 36?
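For what it's worth, the shared-weight count the question describes works out like this (assuming one 2×2 kernel, a single input and output channel, and no bias):

```python
# Parameter count for the convolution in question.
kernel_h, kernel_w = 2, 2
weights_per_kernel = kernel_h * kernel_w    # 4 shared weights
output_positions = 9                        # e.g. a 3x3 output map
# Weight sharing means the same 4 weights are reused at all 9 positions:
# the layer has 4 parameters; 4 * 9 = 36 counts weight *applications*.
print(weights_per_kernel, weights_per_kernel * output_positions)
```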
Why is the validation accuracy always higher than the training accuracy? Isn't that a problem?
In the middle row, the result of [1,2,] with [2,3,0] should be 8, but the book says 9.
With the chapter 8 MXNet version of MNIST recognition, the accuracy stays very low and never changes. I'm using mxnet 1.0.1 — what could be going on?
Edition: January 2018, 1st edition, 2nd printing.
1. p101, third line: "slope less than 0 and greater than -1" should be "greater than 0 and less than 1".
2. p114, third line from the bottom: 实现 ("implementation") should be 实线 ("solid line").
3. p184, second line from the bottom: the second "height" should be "width".
4. p189, third line from the bottom of the body text: "upper right corner" should be "upper left corner".
5. p191, fourth annotation: 半斤 should be 半径 ("radius").
6. p216, sixth annotation: "the lower half is labeled 2" should be "the lower half is labeled 0".
7. p228, tenth line from the bottom of the code: zip(*dataset) should be zip(*datasets).
8. p229, sixth line: 第一部 should be 第一步 ("the first step").
9. p279, second line of the second-to-last paragraph: "FPR and FPR" should be "TPR and FPR".
I'm a beginner and may have misunderstood some of these; apologies if so.
Page 35, the definition of a positive definite matrix (formula 2-15):
The standard definition of positive definiteness doesn't include the condition "symmetric matrix", does it? Was "symmetric" added only because the book mostly discusses symmetric positive definite matrices?
Page 43, Table 2-1, bottom-right cell: the probability of "no dancing + no rain" should be 0.1, not 0.
Page 84, fifth line:
"and the larger the value, the closer the probability of classifying as Δ is to 1"
Should this be changed to:
"and the larger the value, the closer the probability of classifying as O is to 1"?
Page 84, fifth line: should the final character 远 ("far") be 原 ("original")?
Your work on model fusion is great and gave me a lot of inspiration. However, there is a problem: you rename the layers after training the odd/even caffemodels. For example, the conv1 layer in your odd caffemodel is named 'conv1', and after your change it becomes 'odd/conv1'. That layer will then be retrained, because of Caffe's rules (if we provide the weights argument to the caffe train command, the pretrained weights are loaded into the model by matching layers by name). So if you rename all the layers, the odd/even caffemodels are of no use, and the fusion model is in fact trained from scratch.
I'm not sure whether this is right, so I hope you can give me some advice.
i = 0
while i < 100:  # laugh 100 times
    print("ha")
should be:
i = 0
while i < 100:  # laugh 100 times
    print("ha")
    i = i + 1
otherwise it loops forever.
https://github.com/frombeijingwithlove/dlcv_for_beginners/blob/a741afa4e103673c914fd5a09919430000e136c4/chap9/gen_hdf5.py#L25
Caffe reads images in CHW order, but the file you save here is in HWC order.
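A minimal sketch of the suggested fix (array sizes are made up for illustration): transpose the HWC image to CHW before writing it to HDF5:

```python
import numpy as np

# An image as loaded in HWC order: (height, width, channels).
img_hwc = np.zeros((32, 32, 3), dtype=np.float32)
# Reorder the axes to the CHW layout Caffe expects: (channels, height, width).
img_chw = img_hwc.transpose(2, 0, 1)
print(img_chw.shape)
```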
In the January 2018 1st edition, 2nd printing:
Page 100, section 3.3.4 on vanishing gradients, first sentence of the last paragraph.
Original: "because the root cause of vanishing gradients is repeatedly multiplying gradients that are less than 0."
But going by the sigmoid example earlier in the text, vanishing gradients are caused by repeatedly multiplying gradients that are less than 1.
Should it be changed to: "because the root cause of vanishing gradients is repeatedly multiplying gradients that are less than 1"?
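The proposed wording can be backed up numerically: the sigmoid derivative never exceeds 0.25, so it is gradients smaller than 1 (not smaller than 0) that shrink under repeated multiplication:

```python
import math

# The sigmoid derivative s(x)(1 - s(x)) peaks at 0.25 (at x = 0).
def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

# Chaining ten layers' worth of the *maximal* sigmoid gradient:
g = 1.0
for _ in range(10):
    g *= sigmoid_grad(0.0)   # 0.25 each time
print(g)                     # 0.25**10, already tiny
```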
Today I looked closely at the GoogLeNet architecture diagram in chapter 4 and found several points of doubt; could you clarify?
Page 139 of the book, Figure 4-19, the GoogLeNet architecture diagram, reading from top to bottom:
I'm a beginner, so my questions may not be very professional, but the content of these two pages is genuinely confusing for readers. Please take a look. Thanks for your work, and I hope we can all keep improving together.
Has anyone managed to run the 2-D GANs code on Windows 10?
When I run the code, I get the following error:
    from scipy import ndimage
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\ndimage\__init__.py", line 161, in <module>
    from .filters import *
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\ndimage\filters.py", line 37, in <module>
    from scipy.misc import doccer
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\misc\__init__.py", line 67, in <module>
    from scipy.interpolate.pade import pade as pade
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\interpolate\__init__.py", line 175, in <module>
    from .interpolate import *
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\interpolate\interpolate.py", line 21, in <module>
    import scipy.special as spec
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\special\__init__.py", line 648, in <module>
    from ._ellip_harm import ellip_harm, ellip_harm_2, ellip_normal
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\special\_ellip_harm.py", line 7, in <module>
    from ._ellip_harm_2 import _ellipsoid, _ellipsoid_norm
  File "_ellip_harm_2.pyx", line 7, in init scipy.special._ellip_harm_2
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\integrate\__init__.py", line 93, in <module>
    from .bvp import solve_bvp
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\integrate\_bvp.py", line 11, in <module>
    from scipy.optimize import OptimizeResult
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\__init__.py", line 241, in <module>
    from ._minimize import *
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\_minimize.py", line 33, in <module>
    from .tnc import _minimize_tnc
  File "C:\Users\dell\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\tnc.py", line 37, in <module>
    from scipy.optimize import moduleTNC, approx_fprime
ImportError: cannot import name 'moduleTNC'

Process finished with exit code 1
I've already reinstalled the scipy package three times and it still doesn't work.
Could someone help?
1. The first pooling layer: the input is 96×55×55, the kernel size is 3, and the stride is 2, yet after pooling it is somehow still 48×55×55?! With this step wrong, what value does the rest have? Are readers just supposed to guess?
2. I don't know where this figure comes from; nothing I found online matches it.
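For reference, the standard no-padding pooling output-size formula gives 27, not 55, for these numbers (and pooling leaves the channel count unchanged, so 96 should stay 96):

```python
# Output spatial size of a pooling layer with no padding:
# out = floor((in - kernel) / stride) + 1
def pool_out(size, kernel, stride):
    return (size - kernel) // stride + 1

print(pool_out(55, 3, 2))   # each 55 -> 27; channels are untouched
```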
I came here from the Zhihu post on generating 2-D samples with a GAN. Regarding your statement that "the discriminator's input is not a 1-D sample; instead, the whole mini-batch is treated as a single sample whose dimension is the batch size (in the code, batch size equals cardinality)":
devneg actually gave a direct reply on this point.
$ python recognize_digit.py ../data/test.txt
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0205 15:06:08.584481 14628 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0205 15:06:08.584520 14628 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0205 15:06:08.584525 14628 _caffe.cpp:142] Net('lenet.prototxt', 1, weights='mnist_lenet_iter_36000.caffemodel')
I0205 15:06:08.586066 14628 net.cpp:51] Initializing net from parameters:
I0205 15:06:08.586148 14628 layer_factory.hpp:77] Creating layer data
I0205 15:06:08.586163 14628 net.cpp:84] Creating Layer data
I0205 15:06:08.586174 14628 net.cpp:380] data -> data
I0205 15:06:08.586195 14628 net.cpp:122] Setting up data
I0205 15:06:08.586206 14628 net.cpp:129] Top shape: 64 1 28 28 (50176)
I0205 15:06:08.586211 14628 net.cpp:137] Memory required for data: 200704
I0205 15:06:08.586216 14628 layer_factory.hpp:77] Creating layer conv1
I0205 15:06:08.586227 14628 net.cpp:84] Creating Layer conv1
I0205 15:06:08.586233 14628 net.cpp:406] conv1 <- data
I0205 15:06:08.586241 14628 net.cpp:380] conv1 -> conv1
I0205 15:06:08.586324 14628 net.cpp:122] Setting up conv1
I0205 15:06:08.586334 14628 net.cpp:129] Top shape: 64 20 24 24 (737280)
I0205 15:06:08.586339 14628 net.cpp:137] Memory required for data: 3149824
I0205 15:06:08.586351 14628 layer_factory.hpp:77] Creating layer pool1
I0205 15:06:08.586361 14628 net.cpp:84] Creating Layer pool1
I0205 15:06:08.586367 14628 net.cpp:406] pool1 <- conv1
I0205 15:06:08.586374 14628 net.cpp:380] pool1 -> pool1
I0205 15:06:08.586387 14628 net.cpp:122] Setting up pool1
I0205 15:06:08.586395 14628 net.cpp:129] Top shape: 64 20 12 12 (184320)
I0205 15:06:08.586400 14628 net.cpp:137] Memory required for data: 3887104
I0205 15:06:08.586405 14628 layer_factory.hpp:77] Creating layer conv2
I0205 15:06:08.586416 14628 net.cpp:84] Creating Layer conv2
I0205 15:06:08.586421 14628 net.cpp:406] conv2 <- pool1
I0205 15:06:08.586428 14628 net.cpp:380] conv2 -> conv2
I0205 15:06:08.586645 14628 net.cpp:122] Setting up conv2
I0205 15:06:08.586658 14628 net.cpp:129] Top shape: 64 50 8 8 (204800)
I0205 15:06:08.586663 14628 net.cpp:137] Memory required for data: 4706304
I0205 15:06:08.586673 14628 layer_factory.hpp:77] Creating layer pool2
I0205 15:06:08.586683 14628 net.cpp:84] Creating Layer pool2
I0205 15:06:08.586688 14628 net.cpp:406] pool2 <- conv2
I0205 15:06:08.586696 14628 net.cpp:380] pool2 -> pool2
I0205 15:06:08.586706 14628 net.cpp:122] Setting up pool2
I0205 15:06:08.586714 14628 net.cpp:129] Top shape: 64 50 4 4 (51200)
I0205 15:06:08.586719 14628 net.cpp:137] Memory required for data: 4911104
I0205 15:06:08.586725 14628 layer_factory.hpp:77] Creating layer ip1
I0205 15:06:08.586735 14628 net.cpp:84] Creating Layer ip1
I0205 15:06:08.586740 14628 net.cpp:406] ip1 <- pool2
I0205 15:06:08.586747 14628 net.cpp:380] ip1 -> ip1
I0205 15:06:08.589907 14628 net.cpp:122] Setting up ip1
I0205 15:06:08.589939 14628 net.cpp:129] Top shape: 64 500 (32000)
I0205 15:06:08.589944 14628 net.cpp:137] Memory required for data: 5039104
I0205 15:06:08.589962 14628 layer_factory.hpp:77] Creating layer relu1
I0205 15:06:08.589977 14628 net.cpp:84] Creating Layer relu1
I0205 15:06:08.589983 14628 net.cpp:406] relu1 <- ip1
I0205 15:06:08.589992 14628 net.cpp:367] relu1 -> ip1 (in-place)
I0205 15:06:08.590001 14628 net.cpp:122] Setting up relu1
I0205 15:06:08.590006 14628 net.cpp:129] Top shape: 64 500 (32000)
I0205 15:06:08.590013 14628 net.cpp:137] Memory required for data: 5167104
I0205 15:06:08.590018 14628 layer_factory.hpp:77] Creating layer ip2
I0205 15:06:08.590028 14628 net.cpp:84] Creating Layer ip2
I0205 15:06:08.590034 14628 net.cpp:406] ip2 <- ip1
I0205 15:06:08.590040 14628 net.cpp:380] ip2 -> ip2
I0205 15:06:08.590100 14628 net.cpp:122] Setting up ip2
I0205 15:06:08.590107 14628 net.cpp:129] Top shape: 64 10 (640)
I0205 15:06:08.590112 14628 net.cpp:137] Memory required for data: 5169664
I0205 15:06:08.590119 14628 layer_factory.hpp:77] Creating layer prob
I0205 15:06:08.590127 14628 net.cpp:84] Creating Layer prob
I0205 15:06:08.590133 14628 net.cpp:406] prob <- ip2
I0205 15:06:08.590142 14628 net.cpp:380] prob -> prob
I0205 15:06:08.590152 14628 net.cpp:122] Setting up prob
I0205 15:06:08.590160 14628 net.cpp:129] Top shape: 64 10 (640)
I0205 15:06:08.590165 14628 net.cpp:137] Memory required for data: 5172224
I0205 15:06:08.590171 14628 net.cpp:200] prob does not need backward computation.
I0205 15:06:08.590178 14628 net.cpp:200] ip2 does not need backward computation.
I0205 15:06:08.590183 14628 net.cpp:200] relu1 does not need backward computation.
I0205 15:06:08.590186 14628 net.cpp:200] ip1 does not need backward computation.
I0205 15:06:08.590191 14628 net.cpp:200] pool2 does not need backward computation.
I0205 15:06:08.590196 14628 net.cpp:200] conv2 does not need backward computation.
I0205 15:06:08.590201 14628 net.cpp:200] pool1 does not need backward computation.
I0205 15:06:08.590206 14628 net.cpp:200] conv1 does not need backward computation.
I0205 15:06:08.590212 14628 net.cpp:200] data does not need backward computation.
I0205 15:06:08.590217 14628 net.cpp:242] This network produces output prob
I0205 15:06:08.590227 14628 net.cpp:255] Network initialization done.
I0205 15:06:08.592061 14628 net.cpp:744] Ignoring source layer mnist
I0205 15:06:08.592398 14628 net.cpp:744] Ignoring source layer loss
Traceback (most recent call last):
  File "recognize_digit.py", line 23, in <module>
    image = cv2.imread(imgpath, cv2.IMREAD_GRAYSCALE).astype(np.float) - MEAN
AttributeError: 'NoneType' object has no attribute 'astype'
Could you help me see what's going wrong here?
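A likely cause: cv2.imread returns None instead of raising when the file is missing or unreadable, so the failure only surfaces later at .astype. This sketch (with cv2.imread stood in by a stub, and imgpath/MEAN mirroring the script's names) shows a guard that surfaces the real error:

```python
import numpy as np

MEAN = 128.0

def load_digit(imgpath, imread=lambda p: None):
    # The stub default mimics cv2.imread on a missing/unreadable file.
    raw = imread(imgpath)
    if raw is None:
        raise IOError('could not read image: %s' % imgpath)
    return raw.astype(np.float32) - MEAN

try:
    load_digit('../data/missing.png')
except IOError as e:
    print(e)
```

In the real script the check would go right after the cv2.imread call, before .astype; the most common trigger is a wrong relative path in test.txt.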
Is channel_hist on the second-to-last line undefined? It won't run in PyCharm.