wyg1997 / workpiece_location

Python 100.00%

workpiece_location's Issues

Bug on Windows when the number of workers is 4

When the worker count is set to 4, self.CLASSES is empty in the dataloader object of one of the worker threads.

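batch_size can only be 1 when the number of labels differs

When samples have different numbers of labels, the sampler cannot pack them into a batch, so batch_size can only be set to 1.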
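One common workaround, not confirmed as the approach taken in this repository, is to pad every sample's label list to the longest one in the batch and carry a mask of the real entries. The sketch below is plain Python; `pad_labels` and `IGNORE` are hypothetical names, and the same logic could be wrapped in a function passed to the `collate_fn` argument of `torch.utils.data.DataLoader`.

```python
from typing import List, Sequence, Tuple

IGNORE = -1.0  # hypothetical padding value marking "no label"

Point = Tuple[float, float]


def pad_labels(
    batch: Sequence[Sequence[Point]],
) -> Tuple[List[List[Point]], List[List[bool]]]:
    """Pad each sample's keypoint list to the longest list in the batch.

    Returns the padded labels and a mask that is True for real labels.
    """
    max_len = max((len(labels) for labels in batch), default=0)
    padded, mask = [], []
    for labels in batch:
        n_pad = max_len - len(labels)
        padded.append(list(labels) + [(IGNORE, IGNORE)] * n_pad)
        mask.append([True] * len(labels) + [False] * n_pad)
    return padded, mask


# Two samples with 1 and 3 keypoints end up with the same length.
batch = [[(10.0, 20.0)], [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]]
padded, mask = pad_labels(batch)
```

The mask lets the loss skip padded entries, so the padding value itself never contributes to training.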

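Network training optimization

When training the angle and radius branches, do not compute a mask from the current location score; masking only the ignore label is enough.
When the angle and radius branches are present, do not concat these two features during refine.
Use a different initialization method for the convolution layer that produces the result (so that the output lies in [0, 1], which makes training easier).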

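Angle detection design

Algorithm 1

Network structure

Add an angle prediction branch, with the number of channels being one of:

2 (sinθ and cosθ). b3968c1
num_cls*2, i.e. a (sinθ, cosθ) pair is predicted for each class. bcfc5ed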

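Calculation method

Take the direction matrix output by the network. Given the sinθ and cosθ values at a point, compute arctan(sinθ/cosθ), then use the sign of sinθ to determine the actual value of θ.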
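As a minimal sketch of this decoding step, the arctangent and the sign check can be folded into a single `math.atan2` call, which picks the quadrant from the signs of both inputs. The function name `decode_angle` is hypothetical and is not taken from the repository.

```python
import math


def decode_angle(sin_value: float, cos_value: float) -> float:
    """Return the angle in degrees, wrapped into the 0-360 range.

    math.atan2 uses the signs of both arguments to choose the quadrant,
    so the inputs do not need to be normalised to unit length.
    """
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0


# A point in the third quadrant: sin and cos are both negative.
angle = decode_angle(-0.5, -math.sqrt(3) / 2)
```

Because `atan2` only depends on the ratio and signs of its arguments, a prediction whose sin and cos do not lie exactly on the unit circle still decodes to a well-defined angle.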

Ground Truth

Select the points in target that are above the threshold, and assign all of them the labelled angle (sinθ, cosθ).

Computing the loss and the angle error

The loss is computed only at the labelled points, using L2_loss. The formula is:

$\begin{aligned} loss&=\frac{(\cos\Theta-\cos\Theta_{0})^{2}+(\sin\Theta-\sin\Theta_{0})^{2}}{2}\\ &=\frac{1+\sin^{2}\Theta_{0}+\cos^{2}\Theta_{0}}{2}-\cos(\Theta-\Theta_{0}) \end{aligned}$

~~From this, the formula for the θ error can be derived:~~

$\cos(\Delta\Theta)=\frac{1+\sin^{2}\Theta_{0}+\cos^{2}\Theta_{0}}{2}-loss$

It is enough to compute the loss directly, compute the arctangent from the sin and cos values, and determine the angle from the sign of sin.
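The two quantities can be sketched for a single labelled point as follows. This is an illustration under assumptions rather than the repository's code: the function names are hypothetical, the ground-truth angle is taken in degrees, and the error is wrapped so that 350° versus 10° counts as 20° rather than 340°.

```python
import math


def angle_l2_loss(pred_sin: float, pred_cos: float, gt_deg: float) -> float:
    """Half the squared distance between predicted and labelled (sin, cos)."""
    gt = math.radians(gt_deg)
    return ((pred_cos - math.cos(gt)) ** 2 + (pred_sin - math.sin(gt)) ** 2) / 2


def angle_error_deg(pred_sin: float, pred_cos: float, gt_deg: float) -> float:
    """Absolute angle error in degrees, wrapped into the 0-180 range."""
    pred_deg = math.degrees(math.atan2(pred_sin, pred_cos))
    diff = (pred_deg - gt_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


# Prediction at 350 degrees against a label of 10 degrees.
pred = math.radians(350.0)
loss = angle_l2_loss(math.sin(pred), math.cos(pred), 10.0)
error = angle_error_deg(math.sin(pred), math.cos(pred), 10.0)
```

When the predicted pair lies on the unit circle, the loss reduces to 1 − cos(ΔΘ), so it grows monotonically with the wrapped angle error.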

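Possible problems

The values of sinθ and cosθ are constrained (sin²θ + cos²θ = 1), and it is unclear whether network training keeps that relationship accurate.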

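Results

The two approaches give similar results, and neither works well on the workpiece data. The reason is that the network has to regress angles over the full 0~360 degree range, which is difficult.

Another possible reason is that the location branch and the angle branch constrain each other, so training both at once is hard to fit. A fairly clear example: when the two losses are weighted, increasing the weight of the angle loss makes the location accuracy drop considerably.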

Black dots in the visualized target**

target** should be near-white points, i.e. pixel values close to 255, but they are actually displayed as black.

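Template matching parameters are unused

During template matching, settings such as the template's scale range and rotation angle are not used.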

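Incorrect target setup prevents the background from being distinguished during training

The previous training approach was as follows: the network output and target have the same size, [n, c, h, w], where c is the total number of classes + 1.

Every channel of target except the background channel holds Gaussian-distributed keypoint locations.

The problem is that the score in the background channel is always 0, so the background cannot be learned correctly during training, and background is easily mistaken for foreground.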
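The issue does not say how the target was corrected. One possible fix, shown here purely as an assumption, is to fill the background channel with one minus the maximum foreground score at each pixel. The sketch uses plain Python lists instead of tensors, and `gaussian_heatmap` and `build_target` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Heatmap = List[List[float]]


def gaussian_heatmap(h: int, w: int, cx: float, cy: float, sigma: float) -> Heatmap:
    """2D Gaussian with peak value 1.0 at (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)) for x in range(w)]
        for y in range(h)
    ]


def build_target(
    h: int,
    w: int,
    keypoints: Sequence[Tuple[int, float, float]],  # (class index, x, y)
    num_cls: int,
    sigma: float = 2.0,
) -> List[Heatmap]:
    """Return num_cls foreground channels plus one background channel (last)."""
    target = [[[0.0] * w for _ in range(h)] for _ in range(num_cls)]
    for cls, x, y in keypoints:
        heat = gaussian_heatmap(h, w, x, y, sigma)
        for r in range(h):
            for c in range(w):
                target[cls][r][c] = max(target[cls][r][c], heat[r][c])
    # Assumed fix: background is high wherever no foreground class is active.
    background = [
        [1.0 - max(target[k][r][c] for k in range(num_cls)) for c in range(w)]
        for r in range(h)
    ]
    return target + [background]


target = build_target(8, 8, [(0, 3, 4)], num_cls=2)
```

With this layout the background channel is 0 at a keypoint centre and approaches 1 far away from every keypoint, so it carries a non-trivial training signal.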

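TODO list

Localization

Model
- Support more backbones (~~currently only vgg is supported~~)
- ~~Support detection of object scale and angle~~ (moved to the next phase)
- ~~Support resuming training~~ (not implemented, because training needs few iterations)

Dataset and data loading
- Flip operation
- Lighting operation
- Random scaling operation
- Upsampling factor (stride parameter)
- sigma parameter of the 2D Gaussian function

Training
- Multiple optimizers
- Learning rate adjustment
- step adjustment

Functional changes
- Result visualization
- Evaluation metrics for test results
- Visdom visualization of the training process
- Training data visualization (to make debugging easier)
- Add a config file option for whether to visualize results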

Type error in opencv drawing operations on Windows

On lines 104 and 138 of tools/visualize.py, the image passed in is img[i], which is not in opencv's UMat format. This raises an error on Windows.

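Training fails when the training set is too small

Problem

When the training set has only 2-4 images, the loss is very large during the first few epochs, and training never produces a usable result.

Possible solutions

Add gradient clipping to avoid large parameter updates.
Result: training performs poorly. Are the clipped gradient updates behaving abnormally?
Delay gradient updates and combine this with data augmentation to expand the training data.
Result: training performs poorly.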
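For reference, gradient clipping by global norm works as in the sketch below. This is an illustration in plain Python with a hypothetical function name, not the code that was tried in this issue; in PyTorch the equivalent operation is provided by `torch.nn.utils.clip_grad_norm_`.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale gradients so their L2 norm does not exceed max_norm.

    The direction of the update is preserved; only its length changes.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total == 0.0 or total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)   # norm 5 is scaled to 1
untouched = clip_by_global_norm([0.3, 0.4], max_norm=1.0)  # norm 0.5 is kept
```

Clipping only bounds the size of each step. If the loss stays very large, every step is clipped to the same length, which may be one reason the updates looked abnormal.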

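The circle centered on a keypoint is drawn incorrectly during visualization

On line 129 of tools/visualize.py, the value is truncated to an integer; if the value is -1, the radius becomes 0.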
