Comments (7)

5had3z commented on June 8, 2024

By the way, polygon_masks=True returns valid polygon and vertex data. In the meantime, I will probably render the mask image with a plugin that runs over these with cv::fillConvexPoly. It could be interesting to see how well that works compared to doing it manually in COCOReader::PixelwiseMasks, and potentially replace it...

from dali.

5had3z commented on June 8, 2024

If ratio=True, the polygon points are transformed internally, so the reader tries to rasterize the mask with normalized points and fails. I don't think it's unreasonable to want ratio=True for normalized bboxes together with a segmentation image.
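
To illustrate the failure mode described above, here is a minimal sketch. The helper name to_pixel_coords and the sample values are hypothetical and not part of DALI; the only assumption is that vertices are stored as normalized (x, y) pairs, as the later comments in this thread suggest. Casting normalized points straight to integers collapses them all to (0, 0), while scaling x by the width and y by the height first gives usable pixel coordinates.

```python
import numpy as np


def to_pixel_coords(vertices, height, width):
    """Scale normalized (x, y) vertices in [0, 1] to integer pixel coordinates."""
    # x scales with the image width, y with the image height.
    scale = np.array([width, height], dtype=np.float32)
    return (np.asarray(vertices, dtype=np.float32) * scale).astype(np.int32)


# Hypothetical triangle in normalized coordinates.
normalized = np.array([[0.25, 0.5], [0.75, 0.5], [0.5, 0.75]], dtype=np.float32)

# Without scaling, every point truncates to the origin, so nothing useful is drawn.
collapsed = normalized.astype(np.int32)

# With scaling, the points land inside a 100 x 200 (height x width) image.
pixels = to_pixel_coords(normalized, height=100, width=200)
```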

Regarding the segmentation image itself: it is initialized with zeros, but the polygon masks may not cover all pixels, and there may be a class with ID zero, so an incorrect loss will be applied to areas that should belong to the user-specified ignore class. I think a better option would be not to create a semantic segmentation at all, but rather a mask segmentation that corresponds to the labels and boxes. Instances then remain separated, so this can be used for panoptic segmentation rather than only semantic segmentation (the class is recovered via the label tensor index). For this mask segmentation, -1 can mark the void area (since we're using int32).
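
As a rough sketch of the idea above (not the reader's actual behavior), a mask that stores instance indices with -1 as void can be turned back into a semantic mask by indexing the label tensor. The function name instance_to_semantic and the sample arrays are made up for illustration, and the sketch assumes mask values are 0-based indices into the labels.

```python
import numpy as np

VOID = -1  # assumed void marker, as proposed above


def instance_to_semantic(instance_mask, labels, void_value=VOID):
    """Map per-pixel instance indices to class IDs, keeping void pixels as void."""
    semantic = np.full(instance_mask.shape, void_value, dtype=np.int32)
    valid = instance_mask >= 0
    # The class of each pixel is the label of the instance that covers it.
    semantic[valid] = labels[instance_mask[valid]]
    return semantic


# Two instances: instance 0 has class 17, instance 1 has class 0.
labels = np.array([17, 0], dtype=np.int32)
instance_mask = np.array([[-1, 0],
                          [1, 1]], dtype=np.int32)
semantic = instance_to_semantic(instance_mask, labels)
```

Because void is -1 rather than 0, pixels of class 0 stay distinguishable from uncovered pixels.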

I might still test the throughput of the current RLE implementation against cv::fillConvexPoly, since simplifying the codebase and removing pycocotools may be useful. RLE is not the prettiest thing to look at, and I'll need a deeper understanding of it to figure out how to change from class ID to mask ID.

5had3z commented on June 8, 2024

The current plan is an operator, rasterize_vert_poly, that rasterizes from the vert/poly list. It will basically follow the algorithm below, which works in plain Python.

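import cv2
import numpy as np

# Background is 0; polygon i (1-based) is filled with value i.
mask = np.zeros(image.shape[:-1], dtype=np.int32)
# Vertices are normalized (x, y), so scale by (width, height), not (height, width).
vert_px = (vert_cpu * np.array(mask.shape[::-1])).astype(np.int32)
for i, poly_ in enumerate(poly_cpu, 1):
    # Each poly row is (mask_idx, start, end) into the vertex list.
    points = vert_px[poly_[1] : poly_[2]]
    mask = cv2.fillConvexPoly(mask, points, i)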
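
The snippet above fills each polygon with its loop index. If polygons belonging to the same object should share a value, the rows can be grouped by their first column instead. The sketch below assumes each poly row is (mask_idx, start, end) with an exclusive end, matching how the rows are sliced above; split_polygons and the sample data are hypothetical.

```python
import numpy as np


def split_polygons(polygons, vertices):
    """Group vertex slices by mask index; an object may own several polygons."""
    by_mask = {}
    for mask_idx, start, end in polygons:
        # end is exclusive, matching vertices[start:end] in the snippet above.
        by_mask.setdefault(int(mask_idx), []).append(vertices[start:end])
    return by_mask


# Hypothetical data: mask 0 has two triangles, mask 1 has one quadrilateral.
polygons = np.array([[0, 0, 3], [0, 3, 6], [1, 6, 10]], dtype=np.int32)
vertices = np.arange(20, dtype=np.float32).reshape(10, 2)
by_mask = split_polygons(polygons, vertices)
```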

5had3z commented on June 8, 2024

Without actually benchmarking any performance difference, I have a plugin that works as expected.

JanuszL commented on June 8, 2024

Hi @5had3z,

Thank you for reporting the bug. Yes, indeed it is about the mishandling of the ratio=True case.
#5407 should fix that.
If you feel that you can improve the COCO reader behavior, we would be more than happy to review and merge a PR (the functionality itself is an external contribution as well, see #2248).

5had3z commented on June 8, 2024

I had my blinders on and didn't stop to think that I need panoptic, not instance, segmentation, so I need to make a COCO panoptic reader plugin anyway. I think there is some overlap with my existing Cityscapes plugins. If there is interest in reading the COCO panoptic format, I can upstream this, but there will probably have to be some lengthy design discussions.

FYI, the post-hoc rasterization of vertices + polygons is relatively simple to implement; however, I haven't benchmarked it to check whether it's egregiously slower than the existing implementation.

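#include <algorithm>
#include <execution>
#include <span>
#include <vector>

#include <opencv2/imgproc.hpp>

#include "dali/pipeline/operator/operator.h"

using ConstTensor = dali::ConstSampleView<dali::CPUBackend>;
using Tensor = dali::SampleView<dali::CPUBackend>;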

void rasterizeVertPoly(ConstTensor polyTensor, ConstTensor vertTensor, Tensor maskTensor, bool normCoords)
{
    struct Poly
    {
        int maskIdx;
        int startIdx;
        int endIdx;
    };

    const auto outShape = maskTensor.shape();
    cv::Mat maskMat(outShape[0], outShape[1], CV_32S, maskTensor.raw_mutable_data());
    maskMat.setTo(-1); // FIXME user defined void area

    std::span polyData(static_cast<const Poly*>(polyTensor.raw_data()), polyTensor.shape()[0]);
    std::span vertData(static_cast<const cv::Point2f*>(vertTensor.raw_data()), vertTensor.shape()[0]);

    // Transform poly points from float to int
    std::vector<cv::Point> vertices(vertData.size());

    if (normCoords)
    {
        std::transform(std::execution::unseq, vertData.begin(), vertData.end(), vertices.begin(),
            [h = outShape[0], w = outShape[1]](cv::Point2f p) { return cv::Point(p.x * w, p.y * h); });
    }
    else
    {
        std::transform(std::execution::unseq, vertData.begin(), vertData.end(), vertices.begin(),
            [](cv::Point2f p) { return static_cast<cv::Point>(p); });
    }

    for (const auto& polyPoint : polyData)
    {
        cv::Point* start = vertices.data() + polyPoint.startIdx;
        const int nPoints = polyPoint.endIdx - polyPoint.startIdx;
        cv::fillConvexPoly(maskMat, start, nPoints, polyPoint.maskIdx);
    }
}

template <>
void RasterizeVertPoly<dali::CPUBackend>::RunImpl(dali::Workspace& ws)
{
    const auto& polyIn = ws.Input<dali::CPUBackend>(0);
    const auto& vertIn = ws.Input<dali::CPUBackend>(1);
    auto& maskOut = ws.Output<dali::CPUBackend>(0);

    auto& tPool = ws.GetThreadPool();

    for (int sampleId = 0; sampleId < ws.GetRequestedBatchSize(0); ++sampleId)
    {
        tPool.AddWork([&, sampleId](int)
            { rasterizeVertPoly(polyIn[sampleId], vertIn[sampleId], maskOut[sampleId], mNormalizedCoords); });
    }

    tPool.RunAll();
}

JanuszL commented on June 8, 2024

Hi @5had3z,

Thank you for sharing your implementation. The current one is an external contribution and was not benchmarked thoroughly either. If you have some spare time, feel free to compare the two implementations to see which one yields better performance.
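
One way such a comparison could be set up is a small timing harness that reports samples per second for each rasterizer. This is only a sketch: samples_per_second is a hypothetical helper, and the two callables below are stand-ins, since neither the existing pixelwise mask path nor the plugin above is wrapped here.

```python
import time


def samples_per_second(fn, samples, repeats=5):
    """Return the best observed throughput of fn over several passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for sample in samples:
            fn(sample)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very coarse clocks.
    return len(samples) / max(best, 1e-12)


# Stand-in workloads; replace with calls into the two real rasterizers.
samples = [list(range(1000)) for _ in range(100)]
rate_a = samples_per_second(sum, samples)
rate_b = samples_per_second(sorted, samples)
```

Taking the best of several passes reduces noise from warm-up and scheduling, though a real benchmark would also need representative COCO samples and batch sizes.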

"if there is interest reading coco panoptic format I can upstream this"
We haven't heard such a request yet, but feel free to contribute and extend the COCO reader.
