Comments (4)

wzyforgit commented on May 25, 2024

https://github.com/wzyforgit/deep-learning-debugger/blob/master/src/alg/detect/yolov8npose.cpp
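However, I only kept the pose-estimation code; you will need to port the rest yourself from the original Python file.
To use it, call these functions in order: init -> setImage -> analyze -> result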

from ncnn.

CloudsRL commented on May 25, 2024

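Hey, I ran into a similar problem. I hand-wrote the pre- and post-processing and got it working, but the results are not great: about 100 ms per frame.
Pre-processing is mainly resizing, so that the input matches the network's input size (I just resized directly, but some people say this gives worse results and that you have to pad instead), plus NCNN's substract_mean_normalize (this should be in all the various examples, though I am not sure how its parameters should be set).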
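Post-processing has two parts: restoring the original size, and extracting the useful entries from the result. The raw output is a very long table with 84 rows and 8400 columns. Each column is one candidate: the center XY, the width and height, and then the confidences for the 80 classes. You have to filter these by a threshold yourself and then run NMS (non-maximum suppression). Some code online filters based on grid sizes, but I could not follow it, so I hand-wrote a simple version. Try collecting some material online and experimenting first; if that still does not work, leave me your contact details (email) and I will send you this part of my code so we can learn from each other.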
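The post-processing described above can be sketched in plain C++ without OpenCV. This is only a minimal illustration, assuming the output is a tightly packed row-major float buffer with (4 + num_classes) rows and num_anchors columns, as in the 84 x 8400 layout mentioned in this comment. The names `Detection`, `decode`, `iou`, and `nms` are hypothetical helpers, not part of ncnn, and the NMS here is a simple greedy per-class variant rather than any particular library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical result type: box in network-input pixels.
struct Detection {
    float x, y, w, h;  // top-left corner, width, height
    int label;         // index of the best-scoring class
    float score;       // confidence of that class
};

// Read one candidate per column; keep it if its best class beats the threshold.
std::vector<Detection> decode(const float* data, int num_classes,
                              int num_anchors, float score_threshold) {
    assert(data != nullptr && num_classes > 0 && num_anchors > 0);
    std::vector<Detection> out;
    for (int a = 0; a < num_anchors; ++a) {
        int best = 0;
        float best_score = data[4 * num_anchors + a];
        for (int c = 1; c < num_classes; ++c) {
            const float s = data[(4 + c) * num_anchors + a];
            if (s > best_score) { best_score = s; best = c; }
        }
        if (best_score <= score_threshold) continue;
        const float cx = data[0 * num_anchors + a];
        const float cy = data[1 * num_anchors + a];
        const float w = data[2 * num_anchors + a];
        const float h = data[3 * num_anchors + a];
        Detection d = {cx - 0.5f * w, cy - 0.5f * h, w, h, best, best_score};
        out.push_back(d);
    }
    return out;
}

// Intersection over union of two boxes.
float iou(const Detection& a, const Detection& b) {
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: visit boxes by descending score and drop any box that overlaps
// an already kept box of the same class by more than iou_threshold.
std::vector<Detection> nms(std::vector<Detection> dets, float iou_threshold) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        bool keep = true;
        for (const Detection& k : kept) {
            if (k.label == d.label && iou(k, d) > iou_threshold) { keep = false; break; }
        }
        if (keep) kept.push_back(d);
    }
    return kept;
}
```

For the model discussed here the call would look like `nms(decode(ptr, 80, 8400, 0.25f), 0.45f)`, where `ptr` points at the output data and both thresholds are placeholder values to tune. The boxes still need to be mapped back to the original image size afterwards.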


zhenjing commented on May 25, 2024

Example 1: https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4
It uses the intermediate-layer outputs, similar to YOLOv5.


zhenjing commented on May 25, 2024

Post-processing based on YOLOv8's final output, using the OpenCV dnn library.
```cpp
// Fragment from inside a detection function. `yolo`, `bgr`, `prob_threshold`,
// `nms_threshold`, `scale`, `objects`, and the `Object` type are defined elsewhere.
// Needs <opencv2/core.hpp> and <opencv2/dnn.hpp> (for cv::dnn::NMSBoxes).
ncnn::Extractor ex = yolo.create_extractor();
ex.input("images", bgr);

ncnn::Mat out;
ex.extract("output", out);
//std::cout << "out dims: " << out.dims << " w: " << out.w << " h: " << out.h << " d: " << out.d << " c: " << out.c << std::endl;

// Output shape: [1, 84, 8400]; the 84 values are [cx, cy, w, h, 80 class scores].
// Wrap the ncnn buffer without copying, then transpose into a new matrix.
cv::Mat output_buffer(out.h, out.w, CV_32F, (float*)out.data);
cv::Mat output = output_buffer.t(); // [8400, 84]
//std::cout << "rows : " << output.rows << " cols: " << output.cols << std::endl;

const float score_threshold = prob_threshold;
std::vector<int> class_ids;
std::vector<float> class_scores;
std::vector<cv::Rect> boxes;

// Find the bounding box, class id, and class score of each candidate row.
for (int i = 0; i < output.rows; i++) {
    cv::Mat classes_scores = output.row(i).colRange(4, output.cols);
    cv::Point class_id;
    double maxClassScore;
    cv::minMaxLoc(classes_scores, 0, &maxClassScore, 0, &class_id);

    if (maxClassScore > score_threshold) {
        class_scores.push_back(static_cast<float>(maxClassScore));
        class_ids.push_back(class_id.x);
        float cx = output.at<float>(i, 0);
        float cy = output.at<float>(i, 1);
        float w = output.at<float>(i, 2);
        float h = output.at<float>(i, 3);

        // Convert center/size to top-left/size and rescale to the source image.
        int left = int((cx - 0.5 * w) * scale);
        int top = int((cy - 0.5 * h) * scale);
        int width = int(w * scale);
        int height = int(h * scale);

        boxes.push_back(cv::Rect(left, top, width, height));
    }
}

// NMS
std::vector<int> indices;
cv::dnn::NMSBoxes(boxes, class_scores, score_threshold, nms_threshold, indices);

for (size_t i = 0; i < indices.size(); i++) {
    Object obj;
    int index = indices[i];
    obj.label = class_ids[index];
    obj.prob = class_scores[index];
    obj.rect.x = boxes[index].x;
    obj.rect.y = boxes[index].y;
    obj.rect.width = boxes[index].width;
    obj.rect.height = boxes[index].height;
    objects.push_back(obj);

    printf("### box: {%f, %f, %f, %f}, class_idx: %d, score: %f\n", obj.rect.x, obj.rect.y, obj.rect.x + obj.rect.width, obj.rect.y + obj.rect.height, obj.label, obj.prob);
}
```
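The snippet above multiplies box coordinates by a single `scale`, which only works when the image was resized without padding. If the input is padded instead, as suggested earlier in the thread, the offsets have to be undone as well. Below is a minimal sketch of that bookkeeping, assuming centered padding and an aspect-preserving resize. `Letterbox`, `compute_letterbox`, `to_source_x`, and `to_source_y` are hypothetical names, and `scale` here is the resize factor, so it is the inverse of the `scale` used above. On normalization: the ncnn YOLO examples I am aware of pass no mean and a norm of 1/255 per channel to `substract_mean_normalize`, but check this against how your model was exported.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper type: how the source image sits inside the network input.
struct Letterbox {
    float scale;   // resize factor applied to the source image
    int new_w;     // resized width before padding
    int new_h;     // resized height before padding
    int pad_left;  // padding added on the left
    int pad_top;   // padding added on the top
};

// Compute an aspect-preserving resize plus centered padding.
Letterbox compute_letterbox(int src_w, int src_h, int dst_w, int dst_h) {
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    Letterbox lb;
    lb.scale = std::min(dst_w / static_cast<float>(src_w),
                        dst_h / static_cast<float>(src_h));
    lb.new_w = static_cast<int>(std::round(src_w * lb.scale));
    lb.new_h = static_cast<int>(std::round(src_h * lb.scale));
    lb.pad_left = (dst_w - lb.new_w) / 2;
    lb.pad_top = (dst_h - lb.new_h) / 2;
    return lb;
}

// Map a coordinate from network-input space back to the source image:
// remove the padding offset first, then undo the resize.
float to_source_x(const Letterbox& lb, float x) { return (x - lb.pad_left) / lb.scale; }
float to_source_y(const Letterbox& lb, float y) { return (y - lb.pad_top) / lb.scale; }
```

For a 1280x720 frame and a 640x640 network input this gives a resize factor of 0.5, a resized image of 640x360, and 140 rows of padding at the top. Box widths and heights only need to be divided by `scale`, since padding shifts positions but does not change sizes.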
