luoyetx / JDA
C++ implementation of Joint Cascade Face Detection and Alignment.
License: BSD 3-Clause "New" or "Revised" License
Hi,
For non-face images, the resized images are stored in neg.hds. However, during training, neg.CalcFeatureValues() tries to access neg.imgs, neg.imgs_half, and neg.imgs_quarter, which are never initialized.
This looks like a bug to me. Please correct me if I am missing anything here.
Thanks and Regards,
I have emailed @luoyetx to apply for the training data, but to speed things up, could anybody share the data with me? Many thanks.
I want to test your trained model on FDDB and check the performance. I pulled your code and ran make, then put the FDDB data into the ./data folder as described in the README. Now I don't know how to start the test. You said the ./jda fddb command would do that, right? Which folder should I run it from?
For some reason, I have decided to share the data I have collected. I hope it can help you and help me. If you want the data, send me an email and I will reply with the URL and password to access it.
If you have any question about the data, ask me in this issue or send me an email.
Hello,
First of all, thank you for open-sourcing this code. It is excellent. I'm encountering an error during training where a leaf node can receive a NaN score. After this happens, training freezes. The error occurs in the following block of code:
if (node_idx >= nodes_n / 2) {
// we are on a leaf node
const int idx = node_idx - nodes_n / 2;
double pos_w, neg_w;
pos_w = neg_w = c.esp;
for (int i = 0; i < pos_n; i++)
pos_w += pos.weights[pos_idx[i]];
for (int i = 0; i < neg_n; i++)
neg_w += neg.weights[neg_idx[i]];
float score = 0.5 * log(pos_w / neg_w);
scores[idx] = isnan(score) ? 0. : score;
return;
}
I added the NaN check above the return myself to work around the issue, but I'm not sure setting the score to 0 is the proper solution. Do you have any insight on better ways to avoid this problem?
Your blog contains the following sentence:
"In 3000fps, since each random forest is built for a single landmark, the feature points used in its trees are never associated with other landmarks; the two feature points are sampled only in the neighborhood of the current landmark, and their pixel difference is used as the shape-indexed feature."
In addition, section 3.3 of the original 3000fps paper says:
1) ...
2) we only consider the pixel features in the local region of a landmark
Both of the above indicate that in 3000fps, when training the CART for one landmark, only shape-indexed features near that landmark itself are used.
But in your code, a Feature contains two landmark_ids. What is the reason for doing it this way?
I read your code in the CART part, but I can't find a detailed description in the paper. What reference does it cite? In other words, which paper gave you the details of feature generation and CART training?
We are all ** people, so let me write in Chinese to express my meaning more clearly and avoid ambiguity.
On line 57 of JDA/src/jda/data.cpp:
tr[j] = img.at<uchar>(x1_, y1_) - img.at<uchar>(x2, y2);
shouldn't this be:
tr[j] = img.at<uchar>(x1_, y1_) - img.at<uchar>(x2_, y2_);
How is your implementation going? I tried running the Python version, and the negative samples were nowhere near enough: I fed in 3,000 positive samples and 20,000 negatives (500*500), and by the 40th weak classifier of the first stage they were completely exhausted.
The shortfall is huge.
Is something wrong somewhere?
Thanks.
Hi, I was wondering if you ran any tests to measure how it performs against well-known detectors,
such as the OpenCV Viola-Jones detector. It would be great to see some numbers.
Thanks.
Your code selects the split with the maximum variance of residuals; I think it should be the minimum. Why did you do that? Also, could you send me your best FDDB ROC curves? [email protected]
hello, @luoyetx
See these two lines: https://github.com/luoyetx/JDA/blob/master/src/jda/data.cpp#L886-L887
The first line gets threads_n, the maximum number of threads available, so why set it to three times that on the next line? I think it has no effect and is equivalent to omp_set_num_threads(threads_n);
(I'm not familiar with OpenMP.) Thanks.
In JDA/src/jda/btcart.cpp, line 40:
cart.Initialize(stage, k);
should probably be:
cart.Initialize(stage, landmark_id);
Where is the model file?
Do you have an updated final version to release?
I think you could try the recently introduced Normalized Pixel Difference (NPD) features instead of the existing ones.
You can read more here: http://www.cbsr.ia.ac.cn/users/scliao/papers/Liao-PAMI15-NPD.pdf
I'm currently running tests using NPD for facial landmark alignment, and it clearly outperforms classical pixel-difference features.
Hello, I'm a beginner. How do I generate the landmarks in train.txt?
Following the idea of 3000fps, the two points used to extract each tree's feature value are generated within some radius of the same landmark_id. But in your code, in Cart::GenFeaturePool, two landmark_ids are generated randomly when computing feature values.
I'm a bit confused about this and would appreciate an explanation. Thanks.
The training algorithm keeps printing:
[05/10/16 - 17:18:51] Reset current_idx and restart, reset times = 148
[05/10/16 - 17:18:51] Current augment parameters, should flip = 1, rotation angle = 0
[05/10/16 - 17:18:54] Run out of background images
[05/10/16 - 17:18:54] Reset current_idx and restart, reset times = 149
[05/10/16 - 17:18:54] Current augment parameters, should flip = 1, rotation angle = 90
The code enters this function but never returns from it:
nega_n += hardNegaMining(joincascador, img, imgs, scores, shapes);
In the hardNegaMining function:
bool is_face = joincascador.Validate(img(o), img_h(h), img_q(q), score, shape, n);
if (is_face) {
#pragma omp critical
{
imgs.push_back(img(o).clone());
scores.push_back(score);
shapes.push_back(shape);
}
}
it tries to find a face among the negative samples, which seems impossible at this point.
Hi luoyetx, thanks for your code.
In your config.json, for each cart,
detection rate = 1 - drop_n / number_of_positive_samples = 1-2/50000 = 0.99996
max false positive rate = 1 - restart.th = 0.999
Therefore, the final detection rate is pow(0.99996, 5405) ≈ 0.806,
and the final false positive rate is pow(0.999, 5405) ≈ 0.0045.
Is that right? Maybe the DR is a little low and the FPR is still too high.
*********~/JDA/build$ ./jda test
terminate called after throwing an instance of 'jsmn::Error'
what(): File not exists!
Aborted (core dumped)
jda compiled OK. Is this a jsmn problem?
Hello,
Is there a GPU implementation of this algorithm? If not, is there any plan to support GPU in the future?
Thanks in advance.
In JDA/src/jda/data.cpp, lines 34 to 43:
switch (feature.scale) {
case Feature::ORIGIN:
scale = 1.0; break;
case Feature::HALF:
scale = 0.5; break;
case Feature::QUARTER:
scale = 0.25; break;
default:
scale = 1.0; break;
}
For example, if you want to pick a pixel in the half-scale version of the original image, you must multiply by a scale of 2 (or sqrt(2)?), not 0.5, so the code should be like this:
switch (feature.scale) {
case Feature::ORIGIN:
scale = 1.0; break;
case Feature::HALF:
scale = 2; break;
case Feature::QUARTER:
scale = 4; break;
default:
scale = 1.0; break;
}
How can I get the recall and precision? Does anybody know? Thanks a lot!
Since the JDA training process takes a long time, we need to save the intermediate state of training so that it can be resumed from that state.
Hello. In an earlier question you mentioned training with 10,000 faces annotated with five landmarks and 20,000 non-face images. Roughly how long does training take, and how good are the face detection and landmark annotation results?
Hi, in your code the three scales are 1, 1/sqrt(2), and 1/2,
but the paper says: "we generate three scales of images by down sampling the input image to half and one fourth."
So is this just your own choice?
Hi, thanks very much for your open-source code. I've learned a lot from it.
However, I have a question about this line in function CalcFeatureValues in data.cpp, line 168:
features[i][j] = feature.CalcFeatureValue(img, img_half, img_quarter, shape, stp_mc[i]);
As I understand it, features[i][j] means the i-th feature of the j-th image, and stp_mc means the transformation from the mean shape to the current shape. So the stp_mc used here should be stp_mc[idx[j]], am I right?
Here is the original code (template arguments restored, as markdown stripped the angle brackets):

```cpp
Mat_<int> DataSet::CalcFeatureValues(const vector<Feature>& feature_pool,
                                     const vector<int>& idx) const {
  const int n = feature_pool.size();
  const int m = idx.size();
  if (m == 0) {
    return Mat_<int>();
  }
  Mat_<int> features(n, m);
  #pragma omp parallel for
  for (int j = 0; j < m; j++) {
    const Mat& img = imgs[idx[j]];
    const Mat& img_half = imgs_half[idx[j]];
    const Mat& img_quarter = imgs_quarter[idx[j]];
    const Mat_<double>& shape = current_shapes[idx[j]];
    for (int i = 0; i < n; i++) {
      const Feature& feature = feature_pool[i];
      features[i][j] = feature.CalcFeatureValue(img, img_half, img_quarter, shape, stp_mc[i]);
    }
  }
  return features;
}
```