
huawei-garbage's People

Contributors

litou-lyh, qlmx


huawei-garbage's Issues

About hardware requirements

What is the minimum hardware configuration needed to run this model?

Why is the index incremented by one in dataset.py?

I want to know why the index needs to be incremented by one:

def __getitem__(self, index):
    assert index <= len(self), 'index range error'
    index += 1
    img_path, label = self.env[index].strip().split(',')

    try:
        img = Image.open(img_path)
    except:
        print(img_path)
        print('Corrupted image for %d' % index)
        return self[index + 1]

    if self.transform is not None:
        if img.layers == 1:
            print(img_path)
        img = self.transform(img)

    if self.target_transform is not None:
        label = self.target_transform(label)
    return (img, int(label))

A few questions

Hello, I'm running your code in an IDE, but I didn't install the pinned package versions from requirements directly (some packages failed to download); I installed the required packages manually, at higher versions.
However, on line 50 of utils.py, "StackNet" cannot be resolved. I checked the PyTorch documentation and couldn't find a StackNet class. Which package does it come from?

On line 11 of radam.py there is an "import required". I looked through the optimizer code and didn't find anything wrong, and after removing "required" the file shows no errors either. Is it safe to remove this "required" import?
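For context, a sketch of how RAdam-style optimizers are usually written (not verified against this exact file): "required" is typically the sentinel exported by torch.optim.optimizer that marks arguments with no usable default, so removing the import only matters if the optimizer's __init__ still references it.

from torch.optim.optimizer import Optimizer, required

class ToyOptimizer(Optimizer):               # hypothetical optimizer, for illustration only
    def __init__(self, params, lr=required):
        if lr is required:                   # the sentinel flags that the caller forgot to pass lr
            raise ValueError('lr must be specified explicitly')
        super().__init__(params, dict(lr=lr))

    def step(self, closure=None):
        # no-op step; a real optimizer would update the parameters here
        return None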

customize_service.py imports a Huawei package. PTServingBaseService is not referenced anywhere else, and the Huawei site suggests it is only needed when running on Huawei Cloud, so I commented it out. Is that okay?

A small question about accuracy

Is the reported accuracy the number of correctly classified samples divided by the total size of the dataset, or the average of the per-class accuracies over the 40 classes?
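For reference, a minimal sketch (with made-up labels) contrasting the two definitions in the question: overall accuracy counts correct predictions over all samples, while the macro variant averages the per-class accuracies so every class is weighted equally.

import numpy as np

y_true = np.array([0, 0, 0, 1, 1, 2])        # made-up ground-truth labels
y_pred = np.array([0, 1, 0, 1, 1, 0])        # made-up predictions

overall = (y_true == y_pred).mean()          # correct predictions / total samples
per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
macro = float(np.mean(per_class))            # mean of per-class accuracies

print(overall, macro)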

When running train.py I get the following error. What is the cause? Thanks in advance.

Traceback (most recent call last):
  File "train.py", line 249, in <module>
    main()
  File "train.py", line 99, in main
    train_loss, train_acc, train_5 = train(train_loader, model, criterion, optimizer, epoch, use_cuda)
  File "train.py", line 158, in train
    inputs, targets = torch.atrutograd.Variable(inputs), torch.autograd.Variable(targets)
AttributeError: module 'torch' has no attribute 'atrutograd'

'PngImageFile' object has no attribute 'layers'

if self.transform is not None:
    if img.layers == 1:
        print(img_path)
    img = self.transform(img)

When I run the code, I get an error saying that 'PngImageFile' object has no attribute 'layers'.
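A possible workaround (an assumption on my part, not the author's fix): Pillow exposes .layers on JPEG images but not on PNGs, which is why this AttributeError appears; checking the image mode and converting to RGB avoids the format-specific attribute entirely.

from PIL import Image

def load_rgb(img_path):
    # Open the image and force 3-channel RGB so the JPEG-only `.layers`
    # attribute never needs to be inspected.
    img = Image.open(img_path)
    if img.mode != 'RGB':
        print(img_path)                 # mirrors the original code's logging of unusual images
        img = img.convert('RGB')
    return img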

args.py

Hello, where is the default checkpoint path in args.py supposed to point?

The first training epoch runs fine, but the second epoch fails with a CUDA out-of-memory error

Hello,
While testing this project I trained on a single GPU with batch size 2. The first epoch ran without problems, but the second epoch failed with: cuda runtime error(2): out of memory.
I suspect the memory from the first epoch is not released before the second epoch starts. Could that be the cause?
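A general PyTorch consideration rather than a confirmed diagnosis for this repository: memory that grows between epochs often comes from accumulating loss tensors together with their autograd graphs, or from running validation without torch.no_grad(). A self-contained toy sketch of the memory-safe pattern:

import torch
import torch.nn as nn

model = nn.Linear(8, 4)                      # toy model, purely illustrative
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

running_loss = 0.0
for _ in range(10):
    inputs = torch.randn(2, 8)
    targets = torch.randint(0, 4, (2,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    running_loss += loss.item()              # .item() drops the graph; `running_loss += loss` would keep every graph alive

with torch.no_grad():                        # evaluation without building autograd graphs
    _ = model(torch.randn(2, 8))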

ModuleNotFoundError: No module named 'adabound'

Thanks for sharing!
During training I get:
ModuleNotFoundError: No module named 'adabound'
Is the adabound implementation your own? Can I take another implementation from GitHub and drop it into the code?
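adabound is listed in the repository's package requirements (see the package list further below), so the intended fix is most likely installing the published package with pip install adabound rather than vendoring another implementation. A minimal usage sketch, assuming the package's standard API:

import adabound
import torch.nn as nn

model = nn.Linear(8, 4)                      # toy model, purely illustrative
optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)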

Runtime error

After running train.py I get FileNotFoundError: [Errno 2] No such file or directory: '/data0/search/qlmx/clover/garbage/res_16_288_last1/log.txt'. Where should this path be changed?
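Not verified against the code, but the hard-coded /data0/... path is almost certainly the default checkpoint/output directory configured in args.py (the same default asked about in the args.py issue above). Pointing it at a local, writable folder, as a hypothetical example:

import os

checkpoint_dir = './checkpoints/res_16_288'   # replace the /data0/search/qlmx/... default in args.py with something local
os.makedirs(checkpoint_dir, exist_ok=True)    # make sure the folder exists before log.txt is written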

Hello, test-set accuracy is only around 88. Is that normal?

train_loss:0.075142, val_loss:0.576665, train_acc:97.635311, train_5:99.954958, val_acc:87.491548, val_5:98.174442

Hello, the program runs correctly and quickly. When reproducing on the garbage classification dataset, the test-set top-1 accuracy is only around 88. Is this result normal?

The training batch size is 24 and the test batch size is 2. Thanks!

Parameter tuning for fewer classes

If I train on only a few garbage classes, how should the parameters be adjusted? With fewer classes (I chose 4), accuracy stays around 36 and won't improve.
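General advice rather than a confirmed fix: when training on a subset of the 40 classes, the kept labels must be remapped to contiguous indices 0..3 and the classifier head must output 4 logits, otherwise accuracy can stall. A small sketch with a hypothetical subset:

kept_labels = [14, 23, 36, 37]                                 # hypothetical subset of the original 40 class ids
remap = {old: new for new, old in enumerate(kept_labels)}      # 14 -> 0, 23 -> 1, 36 -> 2, 37 -> 3

def remap_label(original_label):
    # Raises KeyError if a sample from an unwanted class slipped into the split.
    return remap[original_label]

# The model's final layer should also be rebuilt with len(kept_labels) outputs
# (see the size-mismatch sketch under the resnet18/34 issue below).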

Runtime error

With the default configuration everything works, but after switching the network to resnet18 or resnet34 I get a size-mismatch error:
ret = torch.addmm(bias, input, weight.t()) RuntimeError: size mismatch, m1: [4 x 512], m2: [2048 x 2] at C:/w/1/s/windows/pytorch/aten/src\THC/generic/THCTensorMathBlas.cu:273
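The shapes in the error are consistent with a classifier head sized for a 2048-dimensional feature vector (resnet50/101/152) being attached to resnet18/34, whose final feature dimension is 512. A generic fix (an assumption about where the head is built in this repository) is to size the head from the backbone itself:

import torch.nn as nn
from torchvision import models

backbone = models.resnet18(pretrained=True)                    # or resnet34; illustrative backbone
num_classes = 40
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)  # in_features is 512 for resnet18/34, 2048 for resnet50+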

About pretrained weights

You have modified the ResNet architecture. When you load the res101 weights, are only the matching parts loaded? Can the newly added parts simply be skipped?
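For reference, how partial loading is usually done in PyTorch (a general sketch, not a statement about what this repository actually does): either pass strict=False, or filter the pretrained state dict down to the keys whose names and shapes still match the modified model.

import torch
from torchvision import models

model = models.resnet101(pretrained=False)                       # stand-in for the modified network
pretrained = models.resnet101(pretrained=True).state_dict()      # stand-in for the res101 checkpoint

model_dict = model.state_dict()
matched = {k: v for k, v in pretrained.items()
           if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(matched)                                       # newly added layers keep their random init
model.load_state_dict(model_dict)
# Shortcut when only missing/extra keys differ: model.load_state_dict(pretrained, strict=False)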

about preprocess_img

Is there a mistake in your image preprocessing? It doesn't seem to perform the NCHW conversion:
img = data['input_img']
img = img.unsqueeze(0)
img = img.to(self.device)
with torch.no_grad():
    pred_score = self.model(img)
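For context, a general preprocessing sketch (not this repository's exact pipeline): torchvision's ToTensor already converts a PIL image from HWC to CHW and scales it to [0, 1], so unsqueeze(0) on such a tensor yields NCHW; the question is whether data['input_img'] reaches the code above already in CHW form.

import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((288, 288)),        # 288 is an assumed input size, not confirmed from the code
    transforms.ToTensor(),                # HWC uint8 -> CHW float in [0, 1]
])

img = Image.new('RGB', (500, 375))        # dummy image standing in for data['input_img']
x = preprocess(img).unsqueeze(0)          # adds the batch dimension -> shape (1, 3, 288, 288), i.e. NCHW
print(x.shape)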

Error when running train.py

Training stops at certain images with the following error:

Traceback (most recent call last):
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 260, in <module>
    main()
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 110, in main
    train_loss, train_acc, train_5 = train(train_loader, model, criterion, optimizer, epoch, use_cuda)
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 163, in train
    for batch_idx, (inputs, targets) in enumerate(train_loader):
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
AttributeError: Traceback (most recent call last):
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/Desktop/huawei-garbage/dataset.py", line 60, in __getitem__
    if img.layers == 1:
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/PIL/Image.py", line 546, in __getattr__
    raise AttributeError(name)
AttributeError: layers

What is the cause of this?

About training

Does this code use a TTA (test-time augmentation) strategy during training?

Thanks to the author for sharing! The training steps are not detailed enough, so here is a more detailed training workflow.

Step 1: create a data folder and extract the entire garbage_classify archive into it.
Step 2: rename 'garbage_classify_V3.json' to 'garbage_classify_rule.json', then run preprocess.py to generate the training and test sets.
Step 3: for a single GPU, change line 85 of arg.py from parser.add_argument('--gpu-id', default='0, 1, 2, 3', ...) to default='0', and set '--train-batch' and '--test-batch' to suitable values (see the sketch after this list).
Step 4: run train.py.
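A minimal sketch of the single-GPU change in Step 3 (the argument names come from the steps above; the exact line number and defaults may differ in your copy of args.py):

import argparse

parser = argparse.ArgumentParser(description='huawei-garbage training options (illustrative subset)')
parser.add_argument('--gpu-id', default='0', type=str, help='single GPU instead of the original default "0, 1, 2, 3"')
parser.add_argument('--train-batch', default=24, type=int, help='pick a value that fits your GPU memory')
parser.add_argument('--test-batch', default=2, type=int)
args = parser.parse_args([])
print(args)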

Dataset download

https://modelarts-competitions.obs.cn-north-1.myhuaweicloud.com/garbage_classify/dataset/garbage_classify.zip

This dataset contains images of 40 kinds of garbage collected from daily life. Each sample pairs a garbage image with a TXT label file; the label file holds a single line with the image name and its numeric label, e.g. "image_0.jpg, 0" (name, label). The data indices are not contiguous, and there are about 14802 images in total. (A minimal parsing sketch follows the class list below.)

The 40 garbage classes:
{
"0": "其他垃圾/一次性快餐盒",
"1": "其他垃圾/污损塑料",
"2": "其他垃圾/烟蒂",
"3": "其他垃圾/牙签",
"4": "其他垃圾/破碎花盆及碟碗",
"5": "其他垃圾/竹筷",
"6": "厨余垃圾/剩饭剩菜",
"7": "厨余垃圾/大骨头",
"8": "厨余垃圾/水果果皮",
"9": "厨余垃圾/水果果肉",
"10": "厨余垃圾/茶叶渣",
"11": "厨余垃圾/菜叶菜根",
"12": "厨余垃圾/蛋壳",
"13": "厨余垃圾/鱼骨",
"14": "可回收物/充电宝",
"15": "可回收物/包",
"16": "可回收物/化妆品瓶",
"17": "可回收物/塑料玩具",
"18": "可回收物/塑料碗盆",
"19": "可回收物/塑料衣架",
"20": "可回收物/快递纸袋",
"21": "可回收物/插头电线",
"22": "可回收物/旧衣服",
"23": "可回收物/易拉罐",
"24": "可回收物/枕头",
"25": "可回收物/毛绒玩具",
"26": "可回收物/洗发水瓶",
"27": "可回收物/玻璃杯",
"28": "可回收物/皮鞋",
"29": "可回收物/砧板",
"30": "可回收物/纸板箱",
"31": "可回收物/调料瓶",
"32": "可回收物/酒瓶",
"33": "可回收物/金属食品罐",
"34": "可回收物/锅",
"35": "可回收物/食用油桶",
"36": "可回收物/饮料瓶",
"37": "有害垃圾/干电池",
"38": "有害垃圾/软膏",
"39": "有害垃圾/过期药物"
}
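A minimal sketch (hypothetical paths, matching the repository layout listed further below) for reading the per-image label files described above, where each .txt holds one line such as "image_0.jpg, 0":

import glob
import json
import os

data_dir = 'data/garbage_classify/train_data'
rule_path = 'data/garbage_classify/garbage_classify_rule.json'

with open(rule_path, encoding='utf-8') as f:
    id_to_name = json.load(f)          # assumed to map "0".."39" to the class names above

samples = []
for txt_path in glob.glob(os.path.join(data_dir, '*.txt')):
    with open(txt_path, encoding='utf-8') as f:
        name, label = [s.strip() for s in f.read().split(',')]
    samples.append((os.path.join(data_dir, name), int(label), id_to_name[label]))

print(len(samples), samples[:1])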

Full code structure (including the original dataset)

{repo_root}
├── models                // model definitions
├── utils                 // utility functions
│   ├── eval.py           // accuracy computation
│   ├── misc.py           // model saving, parameter initialization, optimizer selection
│   ├── radam.py
│   └── ...
├── args.py               // argument configuration
├── build_net.py          // model construction
├── dataset.py            // batched data loading
├── preprocess.py         // data preprocessing; generates the index/label files
├── train.py              // training entry point
├── transform.py          // data augmentation
├── data                  // create this folder and extract garbage_classify into it
│   ├── garbage_classify
│   │   ├── train_data                   // full training dataset
│   │   ├── garbage_classify_rule.json   // label file

The Python version is 3.6; the required packages are as follows:

pytorch>=1.0.1
torchvision>=0.2.2
matplotlib>=3.1.0
numpy>=1.16.4
scikit-image
pandas
sklearn
adabound
requests

To train on Python 3.7, the following code needs to be modified:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda(async=True)
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)

# Python 3.7 made async a reserved keyword, so it can no longer be used as an argument name; non_blocking replaces it (the same change also broke apache-airflow).
# cuda() itself no longer accepts async.

That is, remove async=True:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda()
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)
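Alternatively (a variant using the same variables as the snippet above), the asynchronous copy can be kept by switching to the non_blocking argument mentioned in the comment, rather than dropping it:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda(non_blocking=True)
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)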

If you still can't solve it, join QQ group 669412407 (mention "Huawei garbage classification training") so we can learn from each other. Thanks!

Error when running preprocess

Traceback (most recent call last):
  File "F:/garbage_classification/garbage clssification code/huawei-garbage-master/preprocess.py", line 68, in <module>
    for fold_, (trn_idx, val_idx) in enumerate(folds.split(result, labels)):
  File "F:\anaconda\lib\site-packages\sklearn\model_selection\_split.py", line 724, in split
    y = check_array(y, ensure_2d=False, dtype=None)
  File "F:\anaconda\lib\site-packages\sklearn\utils\validation.py", line 550, in check_array
    context))
ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.
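Not a confirmed diagnosis, but this sklearn error means folds.split() received an empty label array, i.e. preprocess.py collected zero samples; that usually points to a wrong data directory or json filename rather than a problem in the split itself. A hypothetical sanity check to run before calling folds.split():

import glob

txt_files = glob.glob('data/garbage_classify/train_data/*.txt')   # assumed layout; adjust to your own paths
assert len(txt_files) > 0, 'No label files found: check the data directory and the json/label file names'
print('Found %d label files' % len(txt_files))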
