
huawei-garbage's People

Contributors

litou-lyh, qlmx


huawei-garbage's Issues

About hardware requirements

What is the minimum hardware configuration needed to run this model?

Why is the index incremented by one in dataset.py?

I want to know why the index needs to be incremented by one:

def __getitem__(self, index):
    assert index <= len(self), 'index range error'
    index += 1
    img_path, label = self.env[index].strip().split(',')

    try:
        img = Image.open(img_path)
    except:
        print(img_path)
        print('Corrupted image for %d' % index)
        return self[index + 1]

    if self.transform is not None:
        if img.layers == 1:
            print(img_path)
        img = self.transform(img)

    if self.target_transform is not None:
        label = self.target_transform(label)
    return (img, int(label))

A few questions

Hello, I'm running your code in an IDE, but I didn't install the pinned package versions from requirements directly (some packages failed to download); I installed the required packages manually, at higher versions.
However, on line 50 of utils.py, "StackNet" cannot be resolved. I checked the PyTorch documentation and couldn't find a StackNet class. Which package does it come from?

On line 11 of radam.py there is an "import required". I looked through the optimizer code and didn't find anything wrong, and after removing "required" the file shows no errors either. Is it safe to remove this "required" import?
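For context, a sketch of how RAdam-style optimizers are usually written (not verified against this exact file): "required" is typically the sentinel exported by torch.optim.optimizer that marks arguments with no usable default, so removing the import only matters if the optimizer's __init__ still references it.

from torch.optim.optimizer import Optimizer, required

class ToyOptimizer(Optimizer):               # hypothetical optimizer, for illustration only
    def __init__(self, params, lr=required):
        if lr is required:                   # the sentinel flags that the caller forgot to pass lr
            raise ValueError('lr must be specified explicitly')
        super().__init__(params, dict(lr=lr))

    def step(self, closure=None):
        # no-op step; a real optimizer would update the parameters here
        return None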

customize_service.py imports a Huawei package. PTServingBaseService is not referenced anywhere else, and the Huawei site suggests it is only needed when running on Huawei Cloud, so I commented it out. Is that okay?

A small question about accuracy

Is the reported accuracy the number of correctly classified samples divided by the total size of the dataset, or the average of the per-class accuracies over the 40 classes?
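For reference, a minimal sketch (with made-up labels) contrasting the two definitions in the question: overall accuracy counts correct predictions over all samples, while the macro variant averages the per-class accuracies so every class is weighted equally.

import numpy as np

y_true = np.array([0, 0, 0, 1, 1, 2])        # made-up ground-truth labels
y_pred = np.array([0, 1, 0, 1, 1, 0])        # made-up predictions

overall = (y_true == y_pred).mean()          # correct predictions / total samples
per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
macro = float(np.mean(per_class))            # mean of per-class accuracies

print(overall, macro)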

When running train.py I get the following error. What is the cause? Thanks in advance.

Traceback (most recent call last):
  File "train.py", line 249, in <module>
    main()
  File "train.py", line 99, in main
    train_loss, train_acc, train_5 = train(train_loader, model, criterion, optimizer, epoch, use_cuda)
  File "train.py", line 158, in train
    inputs, targets = torch.atrutograd.Variable(inputs), torch.autograd.Variable(targets)
AttributeError: module 'torch' has no attribute 'atrutograd'

'PngImageFile' object has no attribute 'layers'

if self.transform is not None:
    if img.layers == 1:
        print(img_path)
    img = self.transform(img)

When I run the code, I get an error saying that 'PngImageFile' object has no attribute 'layers'.
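A possible workaround (an assumption on my part, not the author's fix): Pillow exposes .layers on JPEG images but not on PNGs, which is why this AttributeError appears; checking the image mode and converting to RGB avoids the format-specific attribute entirely.

from PIL import Image

def load_rgb(img_path):
    # Open the image and force 3-channel RGB so the JPEG-only `.layers`
    # attribute never needs to be inspected.
    img = Image.open(img_path)
    if img.mode != 'RGB':
        print(img_path)                 # mirrors the original code's logging of unusual images
        img = img.convert('RGB')
    return img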

args.py

Hello, where is the default checkpoint path in args.py supposed to point?

The first training epoch runs fine, but the second epoch fails with a CUDA out-of-memory error

Hello,
While testing this project I trained on a single GPU with batch size 2. The first epoch ran without problems, but the second epoch failed with: cuda runtime error(2): out of memory.
I suspect the memory from the first epoch is not released before the second epoch starts. Could that be the cause?
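A general PyTorch consideration rather than a confirmed diagnosis for this repository: memory that grows between epochs often comes from accumulating loss tensors together with their autograd graphs, or from running validation without torch.no_grad(). A self-contained toy sketch of the memory-safe pattern:

import torch
import torch.nn as nn

model = nn.Linear(8, 4)                      # toy model, purely illustrative
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

running_loss = 0.0
for _ in range(10):
    inputs = torch.randn(2, 8)
    targets = torch.randint(0, 4, (2,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    running_loss += loss.item()              # .item() drops the graph; `running_loss += loss` would keep every graph alive

with torch.no_grad():                        # evaluation without building autograd graphs
    _ = model(torch.randn(2, 8))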

ModuleNotFoundError: No module named 'adabound'

Thanks for sharing!
During training I get:
ModuleNotFoundError: No module named 'adabound'
Is the adabound implementation your own? Can I take another implementation from GitHub and drop it into the code?
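adabound is listed in the repository's package requirements (see the package list further below), so the intended fix is most likely installing the published package with pip install adabound rather than vendoring another implementation. A minimal usage sketch, assuming the package's standard API:

import adabound
import torch.nn as nn

model = nn.Linear(8, 4)                      # toy model, purely illustrative
optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)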

Runtime error

After running train.py I get FileNotFoundError: [Errno 2] No such file or directory: '/data0/search/qlmx/clover/garbage/res_16_288_last1/log.txt'. Where should this path be changed?
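Not verified against the code, but the hard-coded /data0/... path is almost certainly the default checkpoint/output directory configured in args.py (the same default asked about in the args.py issue above). Pointing it at a local, writable folder, as a hypothetical example:

import os

checkpoint_dir = './checkpoints/res_16_288'   # replace the /data0/search/qlmx/... default in args.py with something local
os.makedirs(checkpoint_dir, exist_ok=True)    # make sure the folder exists before log.txt is written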

Hello, test-set accuracy is only around 88. Is that normal?

train_loss:0.075142, val_loss:0.576665, train_acc:97.635311, train_5:99.954958, val_acc:87.491548, val_5:98.174442

Hello, the program runs correctly and quickly. When reproducing on the garbage classification dataset, the test-set top-1 accuracy is only around 88. Is this result normal?

The training batch size is 24 and the test batch size is 2. Thanks!

Parameter tuning for fewer classes

If I train on only a few garbage classes, how should the parameters be adjusted? With fewer classes (I chose 4), accuracy stays around 36 and won't improve.
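General advice rather than a confirmed fix: when training on a subset of the 40 classes, the kept labels must be remapped to contiguous indices 0..3 and the classifier head must output 4 logits, otherwise accuracy can stall. A small sketch with a hypothetical subset:

kept_labels = [14, 23, 36, 37]                                 # hypothetical subset of the original 40 class ids
remap = {old: new for new, old in enumerate(kept_labels)}      # 14 -> 0, 23 -> 1, 36 -> 2, 37 -> 3

def remap_label(original_label):
    # Raises KeyError if a sample from an unwanted class slipped into the split.
    return remap[original_label]

# The model's final layer should also be rebuilt with len(kept_labels) outputs
# (see the size-mismatch sketch under the resnet18/34 issue below).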

Runtime error

With the default configuration everything works, but after switching the network to resnet18 or resnet34 I get a size-mismatch error:
ret = torch.addmm(bias, input, weight.t()) RuntimeError: size mismatch, m1: [4 x 512], m2: [2048 x 2] at C:/w/1/s/windows/pytorch/aten/src\THC/generic/THCTensorMathBlas.cu:273
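The shapes in the error are consistent with a classifier head sized for a 2048-dimensional feature vector (resnet50/101/152) being attached to resnet18/34, whose final feature dimension is 512. A generic fix (an assumption about where the head is built in this repository) is to size the head from the backbone itself:

import torch.nn as nn
from torchvision import models

backbone = models.resnet18(pretrained=True)                    # or resnet34; illustrative backbone
num_classes = 40
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)  # in_features is 512 for resnet18/34, 2048 for resnet50+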

About pretrained weights

You have modified the ResNet architecture. When you load the res101 weights, are only the matching parts loaded? Can the newly added parts simply be skipped?
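For reference, how partial loading is usually done in PyTorch (a general sketch, not a statement about what this repository actually does): either pass strict=False, or filter the pretrained state dict down to the keys whose names and shapes still match the modified model.

import torch
from torchvision import models

model = models.resnet101(pretrained=False)                       # stand-in for the modified network
pretrained = models.resnet101(pretrained=True).state_dict()      # stand-in for the res101 checkpoint

model_dict = model.state_dict()
matched = {k: v for k, v in pretrained.items()
           if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(matched)                                       # newly added layers keep their random init
model.load_state_dict(model_dict)
# Shortcut when only missing/extra keys differ: model.load_state_dict(pretrained, strict=False)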

about preprocess_img

Is there a mistake in your image preprocessing? It doesn't seem to perform the NCHW conversion:
img = data['input_img']
img = img.unsqueeze(0)
img = img.to(self.device)
with torch.no_grad():
    pred_score = self.model(img)
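For context, a general preprocessing sketch (not this repository's exact pipeline): torchvision's ToTensor already converts a PIL image from HWC to CHW and scales it to [0, 1], so unsqueeze(0) on such a tensor yields NCHW; the question is whether data['input_img'] reaches the code above already in CHW form.

import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((288, 288)),        # 288 is an assumed input size, not confirmed from the code
    transforms.ToTensor(),                # HWC uint8 -> CHW float in [0, 1]
])

img = Image.new('RGB', (500, 375))        # dummy image standing in for data['input_img']
x = preprocess(img).unsqueeze(0)          # adds the batch dimension -> shape (1, 3, 288, 288), i.e. NCHW
print(x.shape)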

Error when running train.py

Training stops at certain images with the following error:

Traceback (most recent call last):
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 260, in <module>
    main()
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 110, in main
    train_loss, train_acc, train_5 = train(train_loader, model, criterion, optimizer, epoch, use_cuda)
  File "/home/ubuntu/Desktop/huawei-garbage/train.py", line 163, in train
    for batch_idx, (inputs, targets) in enumerate(train_loader):
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
AttributeError: Traceback (most recent call last):
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/Desktop/huawei-garbage/dataset.py", line 60, in __getitem__
    if img.layers == 1:
  File "/home/ubuntu/Desktop/huawei-garbage/venv/lib/python3.6/site-packages/PIL/Image.py", line 546, in __getattr__
    raise AttributeError(name)
AttributeError: layers

What is the cause of this?

About training

Does this code use a TTA (test-time augmentation) strategy during training?

Thanks to the author for sharing! The training steps are not detailed enough, so here is a more detailed training workflow.

Step 1: create a data folder and extract the entire garbage_classify archive into it.
Step 2: rename 'garbage_classify_V3.json' to 'garbage_classify_rule.json', then run preprocess.py to generate the training and test sets.
Step 3: for a single GPU, change line 85 of arg.py from parser.add_argument('--gpu-id', default='0, 1, 2, 3', ...) to default='0', and set '--train-batch' and '--test-batch' to suitable values (see the sketch after this list).
Step 4: run train.py.
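A minimal sketch of the single-GPU change in Step 3 (the argument names come from the steps above; the exact line number and defaults may differ in your copy of args.py):

import argparse

parser = argparse.ArgumentParser(description='huawei-garbage training options (illustrative subset)')
parser.add_argument('--gpu-id', default='0', type=str, help='single GPU instead of the original default "0, 1, 2, 3"')
parser.add_argument('--train-batch', default=24, type=int, help='pick a value that fits your GPU memory')
parser.add_argument('--test-batch', default=2, type=int)
args = parser.parse_args([])
print(args)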

Dataset download

https://modelarts-competitions.obs.cn-north-1.myhuaweicloud.com/garbage_classify/dataset/garbage_classify.zip

This dataset contains images of 40 kinds of garbage collected from daily life. Each sample pairs a garbage image with a TXT label file; the label file holds a single line with the image name and its numeric label, e.g. "image_0.jpg, 0" (name, label). The data indices are not contiguous, and there are about 14802 images in total. (A minimal parsing sketch follows the class list below.)

The 40 garbage classes:
{
"0": "其他垃圾/一次性快餐盒",
"1": "其他垃圾/污损塑料",
"2": "其他垃圾/烟蒂",
"3": "其他垃圾/牙签",
"4": "其他垃圾/破碎花盆及碟碗",
"5": "其他垃圾/竹筷",
"6": "厨余垃圾/剩饭剩菜",
"7": "厨余垃圾/大骨头",
"8": "厨余垃圾/水果果皮",
"9": "厨余垃圾/水果果肉",
"10": "厨余垃圾/茶叶渣",
"11": "厨余垃圾/菜叶菜根",
"12": "厨余垃圾/蛋壳",
"13": "厨余垃圾/鱼骨",
"14": "可回收物/充电宝",
"15": "可回收物/包",
"16": "可回收物/化妆品瓶",
"17": "可回收物/塑料玩具",
"18": "可回收物/塑料碗盆",
"19": "可回收物/塑料衣架",
"20": "可回收物/快递纸袋",
"21": "可回收物/插头电线",
"22": "可回收物/旧衣服",
"23": "可回收物/易拉罐",
"24": "可回收物/枕头",
"25": "可回收物/毛绒玩具",
"26": "可回收物/洗发水瓶",
"27": "可回收物/玻璃杯",
"28": "可回收物/皮鞋",
"29": "可回收物/砧板",
"30": "可回收物/纸板箱",
"31": "可回收物/调料瓶",
"32": "可回收物/酒瓶",
"33": "可回收物/金属食品罐",
"34": "可回收物/锅",
"35": "可回收物/食用油桶",
"36": "可回收物/饮料瓶",
"37": "有害垃圾/干电池",
"38": "有害垃圾/软膏",
"39": "有害垃圾/过期药物"
}
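A minimal sketch (hypothetical paths, matching the repository layout listed further below) for reading the per-image label files described above, where each .txt holds one line such as "image_0.jpg, 0":

import glob
import json
import os

data_dir = 'data/garbage_classify/train_data'
rule_path = 'data/garbage_classify/garbage_classify_rule.json'

with open(rule_path, encoding='utf-8') as f:
    id_to_name = json.load(f)          # assumed to map "0".."39" to the class names above

samples = []
for txt_path in glob.glob(os.path.join(data_dir, '*.txt')):
    with open(txt_path, encoding='utf-8') as f:
        name, label = [s.strip() for s in f.read().split(',')]
    samples.append((os.path.join(data_dir, name), int(label), id_to_name[label]))

print(len(samples), samples[:1])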

Full code structure (including the original dataset)

{repo_root}
├── models                // model definitions
├── utils                 // utility functions
│   ├── eval.py           // accuracy computation
│   ├── misc.py           // model saving, parameter initialization, optimizer selection
│   ├── radam.py
│   └── ...
├── args.py               // argument configuration
├── build_net.py          // model construction
├── dataset.py            // batched data loading
├── preprocess.py         // data preprocessing; generates the index/label files
├── train.py              // training entry point
├── transform.py          // data augmentation
├── data                  // create this folder and extract garbage_classify into it
│   ├── garbage_classify
│   │   ├── train_data                   // full training dataset
│   │   ├── garbage_classify_rule.json   // label file

The Python version is 3.6; the required packages are as follows:

pytorch>=1.0.1
torchvision>=0.2.2
matplotlib>=3.1.0
numpy>=1.16.4
scikit-image
pandas
sklearn
adabound
requests

To train on Python 3.7, the following code needs to be modified:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda(async=True)
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)

# Python 3.7 made async a reserved keyword, so it can no longer be used as an argument name; non_blocking replaces it (the same change also broke apache-airflow).
# cuda() itself no longer accepts async.

That is, remove async=True:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda()
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)
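Alternatively (a variant using the same variables as the snippet above), the asynchronous copy can be kept by switching to the non_blocking argument mentioned in the comment, rather than dropping it:

if use_cuda:
    inputs, targets = inputs.cuda(), targets.cuda(non_blocking=True)
    inputs, targets = torch.autograd.Variable(inputs), torch.autograd.Variable(targets)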

If you still can't solve it, join QQ group 669412407 (mention "Huawei garbage classification training") so we can learn from each other. Thanks!

Error when running preprocess

Traceback (most recent call last):
  File "F:/garbage_classification/garbage clssification code/huawei-garbage-master/preprocess.py", line 68, in <module>
    for fold_, (trn_idx, val_idx) in enumerate(folds.split(result, labels)):
  File "F:\anaconda\lib\site-packages\sklearn\model_selection\_split.py", line 724, in split
    y = check_array(y, ensure_2d=False, dtype=None)
  File "F:\anaconda\lib\site-packages\sklearn\utils\validation.py", line 550, in check_array
    context))
ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.
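Not a confirmed diagnosis, but this sklearn error means folds.split() received an empty label array, i.e. preprocess.py collected zero samples; that usually points to a wrong data directory or json filename rather than a problem in the split itself. A hypothetical sanity check to run before calling folds.split():

import glob

txt_files = glob.glob('data/garbage_classify/train_data/*.txt')   # assumed layout; adjust to your own paths
assert len(txt_files) > 0, 'No label files found: check the data directory and the json/label file names'
print('Found %d label files' % len(txt_files))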
