
Comments (9)

glenn-jocher avatar glenn-jocher commented on June 29, 2024 1

Hello,

Thank you for referencing the issue and sharing your pseudocode. It looks like you're on the right track! For specific modifications to align with the suggestions in the linked issue, it would be helpful to know more about the exact challenges or errors you're encountering.

Since you're also exploring a TensorFlow Lite implementation, that's a great alternative approach. If you encounter any hurdles there or need further clarification on the pseudocode adjustments, feel free to reach out. We're here to help!

Good luck with your iOS project! 🚀

from ultralytics.

RonaldoBandeira avatar RonaldoBandeira commented on June 29, 2024 1

@fosteman You can check this work here. It runs fine. I tested it myself and the code is great, with nice optimizations such as running NMS on the GPU.

@glenn-jocher I don't know much about your roadmap or the licensing differences between AGPLv3 and GPLv3, but if you like, you can contact the original developer.


glenn-jocher avatar glenn-jocher commented on June 29, 2024 1

Hello @RonaldoBandeira,

Thank you for sharing the link to the iOS implementation of YOLOv8. It's great to hear about successful optimizations like running NMS on the GPU!
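For readers unfamiliar with the step, non-maximum suppression (NMS) keeps the highest-scoring box among heavily overlapping candidates and discards the rest. The following is a minimal CPU sketch for illustration only, not the linked project's GPU implementation. The `Box` type, the function names, and the assumption that `x` and `y` are box centers are all hypothetical.

```swift
// Hypothetical box type: x/y are assumed to be the box center.
struct Box {
    let x: Float, y: Float, width: Float, height: Float
    let score: Float
}

/// Intersection over union of two center-format boxes.
func iou(_ a: Box, _ b: Box) -> Float {
    let ax1 = a.x - a.width / 2, ax2 = a.x + a.width / 2
    let ay1 = a.y - a.height / 2, ay2 = a.y + a.height / 2
    let bx1 = b.x - b.width / 2, bx2 = b.x + b.width / 2
    let by1 = b.y - b.height / 2, by2 = b.y + b.height / 2
    let interW = max(0, min(ax2, bx2) - max(ax1, bx1))
    let interH = max(0, min(ay2, by2) - max(ay1, by1))
    let inter = interW * interH
    let union = a.width * a.height + b.width * b.height - inter
    return union > 0 ? inter / union : 0
}

/// Greedy NMS: visit boxes by descending score and keep a box only if it
/// does not overlap an already kept box by more than `iouThreshold`.
func nonMaxSuppression(_ boxes: [Box], iouThreshold: Float) -> [Box] {
    var kept: [Box] = []
    for candidate in boxes.sorted(by: { $0.score > $1.score }) {
        if kept.allSatisfy({ iou($0, candidate) <= iouThreshold }) {
            kept.append(candidate)
        }
    }
    return kept
}

let detections = [
    Box(x: 10, y: 10, width: 10, height: 10, score: 0.9),
    Box(x: 11, y: 10, width: 10, height: 10, score: 0.8), // overlaps the first
    Box(x: 50, y: 50, width: 10, height: 10, score: 0.7),
]
let kept = nonMaxSuppression(detections, iouThreshold: 0.5)
print("Kept \(kept.count) of \(detections.count) boxes")
```

Running a step like this before mask generation can shrink the number of boxes considerably, which matters because each surviving box needs its own mask.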

Regarding the licensing query, all usage of Ultralytics models, architectures, or code at any stage in R&D, development, and deployment requires an Ultralytics Enterprise License unless the entire project is open-sourced under AGPL-3.0. For more specific details or to discuss licensing further, please reach out directly to our licensing team.


github-actions avatar github-actions commented on June 29, 2024

👋 Hello @fosteman, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.


fosteman avatar fosteman commented on June 29, 2024

duplicate of #2879


fosteman avatar fosteman commented on June 29, 2024

Here's a working SegmentationModel.swift:

import Foundation
import UIKit
import CoreML
import Vision

class SegmentationModel {
    private var visionModel: VNCoreMLModel?
    private var request: VNCoreMLRequest?
    var onResults: (([BoundingBox], [SegmentationMask]) -> Void)?
    
    init() {
        loadModel()
        print("Model loaded")
    }
    
    private func loadModel() {
        do {
            let config = MLModelConfiguration()
            config.computeUnits = .all // Allow CPU, GPU, and Neural Engine (use .cpuOnly to force CPU execution)
            let model = try segment_amaranth_grill_2560(configuration: config)
            self.visionModel = try VNCoreMLModel(for: model.model)
            self.request = VNCoreMLRequest(model: visionModel!, completionHandler: { [weak self] request, error in
                self?.processClassifications(for: request, error: error)
            })
        } catch {
            print("Failed to load Vision ML model: \(error)")
        }
    }
    
    func performSegmentation(on buffer: CMSampleBuffer) {
        guard let request = self.request else { return }
        let handler = VNImageRequestHandler(cmSampleBuffer: buffer, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print("Failed to perform segmentation: \(error)")
        }
    }
    
    private func processClassifications(for request: VNRequest, error: Error?) {
        guard let results = request.results as? [VNCoreMLFeatureValueObservation] else {
            print("Unexpected results: \(String(describing: request.results))")
            return
        }

        guard let var1052 = results.last?.featureValue.multiArrayValue,
              let p = results.first?.featureValue.multiArrayValue else {
            print("Missing expected MultiArray outputs: \(String(describing: results.first)), \(String(describing: results.last))")
            return
        }

        // Parse the results
        let boundingBoxes = parseBoundingBoxes(from: var1052)
        let segmentationMasks = parseSegmentationMasks(from: p, using: boundingBoxes)

        // Notify the coordinator with results
        onResults?(boundingBoxes, segmentationMasks)
    }

    
    private func parseBoundingBoxes(from multiArray: MLMultiArray) -> [BoundingBox] {
        var boundingBoxes: [BoundingBox] = []
        let threshold: Float = 0.5
        let numberOfClasses = 4
        let segmentValuesCount = 32
        let elementSize = 4 + numberOfClasses + segmentValuesCount
        let totalElements = multiArray.count

        let featureVectors = multiArray.dataPointer.bindMemory(to: Float.self, capacity: multiArray.count)

        let startBoundingBoxLoop = CFAbsoluteTimeGetCurrent()
        for i in 0..<86016 {
            if i % 1000 == 0 {
                let currentTime = CFAbsoluteTimeGetCurrent()
                print("Processing bounding box \(i) / 86016. Time elapsed: \((currentTime - startBoundingBoxLoop) * 1000) ms")
            }

            let baseOffset = i * elementSize

            if baseOffset + elementSize <= totalElements {
                let x = featureVectors[baseOffset]
                let y = featureVectors[baseOffset + 1]
                let width = featureVectors[baseOffset + 2]
                let height = featureVectors[baseOffset + 3]

                var classScores = [Float]()
                for j in 0..<numberOfClasses {
                    let classIndex = baseOffset + 4 + j
                    if classIndex < totalElements {
                        classScores.append(featureVectors[classIndex])
                    } else {
                        print("Warning: Out-of-bounds class score index \(classIndex)")
                        classScores.append(0)
                    }
                }

                let maxClassScore = classScores.max() ?? 0
                let maxClassIndex = classScores.firstIndex(of: maxClassScore) ?? -1

                if maxClassScore > threshold {
                    var segmentationValues = [Float]()
                    for k in 0..<segmentValuesCount {
                        let segmentIndex = baseOffset + 4 + numberOfClasses + k
                        if segmentIndex < totalElements {
                            segmentationValues.append(featureVectors[segmentIndex])
                        } else {
                            print("Warning: Out-of-bounds segmentation value index \(segmentIndex)")
                            segmentationValues.append(0)
                        }
                    }
                    boundingBoxes.append(BoundingBox(x: x, y: y, width: width, height: height, classIndex: maxClassIndex, classScore: maxClassScore, segmentationValues: segmentationValues))
                }
            } else {
                print("Warning: Out-of-bounds baseOffset \(baseOffset)")
            }
        }

        print("Parsed \(boundingBoxes.count) bounding boxes")
        return boundingBoxes
    }

    private func parseSegmentationMasks(from multiArray: MLMultiArray, using boundingBoxes: [BoundingBox]) -> [SegmentationMask] {
        var segmentationMasks: [SegmentationMask] = []
        let multiArrayShape = multiArray.shape
        let channels = multiArrayShape[1].intValue
        let height = multiArrayShape[2].intValue
        let width = multiArrayShape[3].intValue
        let totalElements = multiArray.count

        let featureVectors = multiArray.dataPointer.bindMemory(to: Float.self, capacity: multiArray.count)

        let startSegmentationLoop = CFAbsoluteTimeGetCurrent()
        for (boxIndex, boundingBox) in boundingBoxes.enumerated() {
            if boxIndex % 100 == 0 {
                let currentTime = CFAbsoluteTimeGetCurrent()
                print("Processing segmentation mask \(boxIndex) / \(boundingBoxes.count). Time elapsed: \((currentTime - startSegmentationLoop) * 1000) ms")
            }

            var mask = [Float](repeating: 0, count: height * width)
            for i in 0..<channels {
                let segmentationValue = boundingBox.segmentationValues[i]
                let channelStartIndex = i * height * width

                if channelStartIndex + height * width <= totalElements {
                    for j in 0..<(height * width) {
                        mask[j] += segmentationValue * featureVectors[channelStartIndex + j]
                    }
                } else {
                    print("Warning: Out-of-bounds index in multiArray")
                }
            }

            let thresholdedMask = mask.map { $0 > 0.5 ? 1.0 : 0.0 }
            segmentationMasks.append(SegmentationMask(mask: thresholdedMask, width: width, height: height))
        }

        print("Parsed \(segmentationMasks.count) segmentation masks")
        return segmentationMasks
    }
}

struct BoundingBox {
    let x: Float
    let y: Float
    let width: Float
    let height: Float
    let classIndex: Int
    let classScore: Float
    let segmentationValues: [Float]
}

struct SegmentationMask {
    let mask: [Double]
    let width: Int
    let height: Int
}

The console output (from a physical iPhone 11 Pro Max running these calculations, not a simulator):

No image named 'logo' found in asset catalog for /private/var/containers/Bundle/Application/DA859767-B18E-4655-A620-15B7A9B4240B/chicken-gut-analysis-ar-hud.app
Model loaded
Processing bounding box 0 / 86016. Time elapsed: 0.05805492401123047 ms
Processing bounding box 1000 / 86016. Time elapsed: 25.381088256835938 ms
Processing bounding box 2000 / 86016. Time elapsed: 50.60398578643799 ms
Processing bounding box 3000 / 86016. Time elapsed: 86.36307716369629 ms
Processing bounding box 4000 / 86016. Time elapsed: 122.08402156829834 ms
Processing bounding box 5000 / 86016. Time elapsed: 157.79507160186768 ms
Processing bounding box 6000 / 86016. Time elapsed: 193.59803199768066 ms
Processing bounding box 7000 / 86016. Time elapsed: 229.53903675079346 ms
Processing bounding box 8000 / 86016. Time elapsed: 265.3390169143677 ms
Processing bounding box 9000 / 86016. Time elapsed: 289.6000146865845 ms
Processing bounding box 10000 / 86016. Time elapsed: 296.531081199646 ms
Processing bounding box 11000 / 86016. Time elapsed: 303.47204208374023 ms
Processing bounding box 12000 / 86016. Time elapsed: 310.371994972229 ms
Processing bounding box 13000 / 86016. Time elapsed: 317.2900676727295 ms
Processing bounding box 14000 / 86016. Time elapsed: 324.241042137146 ms
Processing bounding box 15000 / 86016. Time elapsed: 331.19308948516846 ms
Processing bounding box 16000 / 86016. Time elapsed: 338.14001083374023 ms
Processing bounding box 17000 / 86016. Time elapsed: 345.04103660583496 ms
Processing bounding box 18000 / 86016. Time elapsed: 352.0190715789795 ms
Processing bounding box 19000 / 86016. Time elapsed: 358.98005962371826 ms
Processing bounding box 20000 / 86016. Time elapsed: 365.9580945968628 ms
Processing bounding box 21000 / 86016. Time elapsed: 372.9090690612793 ms
Processing bounding box 22000 / 86016. Time elapsed: 379.90105152130127 ms
Processing bounding box 23000 / 86016. Time elapsed: 386.9110345840454 ms
Processing bounding box 24000 / 86016. Time elapsed: 393.8760757446289 ms
Processing bounding box 25000 / 86016. Time elapsed: 400.8220434188843 ms
Processing bounding box 26000 / 86016. Time elapsed: 407.77504444122314 ms
Processing bounding box 27000 / 86016. Time elapsed: 414.789080619812 ms
Processing bounding box 28000 / 86016. Time elapsed: 422.08099365234375 ms
Processing bounding box 29000 / 86016. Time elapsed: 430.0030469894409 ms
Processing bounding box 30000 / 86016. Time elapsed: 437.50107288360596 ms
Processing bounding box 31000 / 86016. Time elapsed: 444.4620609283447 ms
Processing bounding box 32000 / 86016. Time elapsed: 451.4220952987671 ms
Processing bounding box 33000 / 86016. Time elapsed: 472.675085067749 ms
Processing bounding box 34000 / 86016. Time elapsed: 493.1679964065552 ms
Processing bounding box 35000 / 86016. Time elapsed: 500.1339912414551 ms
Processing bounding box 36000 / 86016. Time elapsed: 507.0990324020386 ms
Processing bounding box 37000 / 86016. Time elapsed: 514.0920877456665 ms
Processing bounding box 38000 / 86016. Time elapsed: 521.0440158843994 ms
Processing bounding box 39000 / 86016. Time elapsed: 527.9980897903442 ms
Processing bounding box 40000 / 86016. Time elapsed: 534.9550247192383 ms
Processing bounding box 41000 / 86016. Time elapsed: 546.6680526733398 ms
Processing bounding box 42000 / 86016. Time elapsed: 573.078989982605 ms
Processing bounding box 43000 / 86016. Time elapsed: 583.0579996109009 ms
Processing bounding box 44000 / 86016. Time elapsed: 590.0090932846069 ms
Processing bounding box 45000 / 86016. Time elapsed: 596.966028213501 ms
Processing bounding box 46000 / 86016. Time elapsed: 628.2210350036621 ms
Processing bounding box 47000 / 86016. Time elapsed: 664.0499830245972 ms
Processing bounding box 48000 / 86016. Time elapsed: 677.8280735015869 ms
Processing bounding box 49000 / 86016. Time elapsed: 684.7680807113647 ms
Processing bounding box 50000 / 86016. Time elapsed: 691.7120218276978 ms
Processing bounding box 51000 / 86016. Time elapsed: 698.6880302429199 ms
Processing bounding box 52000 / 86016. Time elapsed: 705.6289911270142 ms
Processing bounding box 53000 / 86016. Time elapsed: 712.643027305603 ms
Processing bounding box 54000 / 86016. Time elapsed: 719.6170091629028 ms
Processing bounding box 55000 / 86016. Time elapsed: 726.5729904174805 ms
Processing bounding box 56000 / 86016. Time elapsed: 733.5770130157471 ms
Processing bounding box 57000 / 86016. Time elapsed: 740.5160665512085 ms
Processing bounding box 58000 / 86016. Time elapsed: 747.9240894317627 ms
Processing bounding box 59000 / 86016. Time elapsed: 755.9940814971924 ms
Processing bounding box 60000 / 86016. Time elapsed: 762.971043586731 ms
Processing bounding box 61000 / 86016. Time elapsed: 769.9110507965088 ms
Processing bounding box 62000 / 86016. Time elapsed: 776.960015296936 ms
Processing bounding box 63000 / 86016. Time elapsed: 784.0110063552856 ms
Processing bounding box 64000 / 86016. Time elapsed: 790.9690141677856 ms
Processing bounding box 65000 / 86016. Time elapsed: 799.8290061950684 ms
Processing bounding box 66000 / 86016. Time elapsed: 811.5969896316528 ms
Processing bounding box 67000 / 86016. Time elapsed: 821.4939832687378 ms
Processing bounding box 68000 / 86016. Time elapsed: 828.4980058670044 ms
Processing bounding box 69000 / 86016. Time elapsed: 835.5940580368042 ms
Processing bounding box 70000 / 86016. Time elapsed: 843.2900905609131 ms
Processing bounding box 71000 / 86016. Time elapsed: 851.6700267791748 ms
Processing bounding box 72000 / 86016. Time elapsed: 887.7869844436646 ms
Processing bounding box 73000 / 86016. Time elapsed: 926.6870021820068 ms
Processing bounding box 74000 / 86016. Time elapsed: 938.0390644073486 ms
Processing bounding box 75000 / 86016. Time elapsed: 948.0060338973999 ms
Processing bounding box 76000 / 86016. Time elapsed: 957.9260349273682 ms
Processing bounding box 77000 / 86016. Time elapsed: 967.8289890289307 ms
Processing bounding box 78000 / 86016. Time elapsed: 977.7380228042603 ms
Processing bounding box 79000 / 86016. Time elapsed: 987.6210689544678 ms
Processing bounding box 80000 / 86016. Time elapsed: 997.5130558013916 ms
Processing bounding box 81000 / 86016. Time elapsed: 1007.436990737915 ms
Processing bounding box 82000 / 86016. Time elapsed: 1017.3079967498779 ms
Processing bounding box 83000 / 86016. Time elapsed: 1027.2250175476074 ms
Processing bounding box 84000 / 86016. Time elapsed: 1037.1500253677368 ms
Processing bounding box 85000 / 86016. Time elapsed: 1047.0080375671387 ms
Processing bounding box 86000 / 86016. Time elapsed: 1056.8870306015015 ms
Parsed 15137 bounding boxes
Processing segmentation mask 0 / 15137. Time elapsed: 0.07700920104980469 ms

Each segmentation mask loops into eternity.

What's wrong with this code?

I really need to understand whether I am doing this right.


fosteman avatar fosteman commented on June 29, 2024

pseudocode

class SegmentationModel:
    define init():
        call loadModel()
        print "Model loaded"

    define loadModel():
        try:
            create model configuration
            load the model using the configuration
            create a Vision model from the CoreML model
            create a Vision request with the Vision model and a completion handler
        catch error:
            print "Failed to load Vision ML model: ", error

    define performSegmentation(buffer):
        if no Vision request exists:
            return
        create an image request handler with the buffer
        try:
            perform the request using the handler
        catch error:
            print "Failed to perform segmentation: ", error

    define processClassifications(request, error):
        if request results are not of the expected type:
            print "Unexpected results: ", request results
            return

        get the last result as var1052
        get the first result as p

        if var1052 or p is missing:
            print "Missing expected MultiArray outputs"
            return

        parse bounding boxes from var1052
        parse segmentation masks from p using the bounding boxes
        notify with the results

    define parseBoundingBoxes(multiArray):
        initialize empty list for bounding boxes
        set threshold value
        define number of classes
        define segment values count
        calculate element size
        get total elements in multiArray

        bind multiArray data to feature vectors

        for each index in range of elements:
            if index is a multiple of 1000:
                print progress

            calculate base offset

            if base offset plus element size is within total elements:
                extract x, y, width, and height from feature vectors
                extract class scores from feature vectors
                find max class score and its index

                if max class score is greater than threshold:
                    extract segmentation values from feature vectors
                    add new bounding box to the list
            else:
                print "Warning: Out-of-bounds baseOffset ", base offset

        print "Parsed ", number of bounding boxes, " bounding boxes"
        return bounding boxes

    define parseSegmentationMasks(multiArray, boundingBoxes):
        initialize empty list for segmentation masks
        get shape of multiArray
        extract channels, height, and width from shape
        get total elements in multiArray

        bind multiArray data to feature vectors

        for each bounding box index and bounding box in bounding boxes:
            if bounding box index is a multiple of 100:
                print progress

            initialize mask with zeros of size height * width

            for each channel:
                get segmentation value from bounding box
                calculate channel start index

                if channel start index plus height * width is within total elements:
                    for each pixel index in height * width:
                        add segmentation value multiplied by feature vector to mask
                else:
                    print "Warning: Out-of-bounds index in multiArray"

            threshold the mask values to binary
            add new segmentation mask to the list

        print "Parsed ", number of segmentation masks, " segmentation masks"
        return segmentation masks

struct BoundingBox:
    float x
    float y
    float width
    float height
    int classIndex
    float classScore
    list of float segmentationValues

struct SegmentationMask:
    list of double mask
    int width
    int height


glenn-jocher avatar glenn-jocher commented on June 29, 2024

Hello,

Thank you for sharing your pseudocode. It provides a good structure for a segmentation model using CoreML and Vision frameworks. However, if you're experiencing infinite loops or performance issues during segmentation mask processing, it might be related to how the bounding boxes and segmentation masks are being parsed and processed in real-time.
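A rough estimate suggests the mask loop is probably finite but extremely long rather than truly infinite. The sketch below multiplies out the nested loop bounds. The box count and channel count come from the log and code above; `protoHeight` and `protoWidth` are placeholder assumptions, since the real prototype size is not shown in the thread.

```swift
// Rough cost of the nested mask loop: boxes x channels x (height * width).
func maskLoopMultiplyAdds(boxes: Int, channels: Int,
                          protoHeight: Int, protoWidth: Int) -> Int {
    return boxes * channels * protoHeight * protoWidth
}

let boxes = 15_137     // "Parsed 15137 bounding boxes" in the log above
let channels = 32      // segmentValuesCount in the posted code
let protoHeight = 160  // assumption, for illustration only
let protoWidth = 160   // assumption, for illustration only

let total = maskLoopMultiplyAdds(boxes: boxes, channels: channels,
                                 protoHeight: protoHeight, protoWidth: protoWidth)
print("Approximate multiply-adds: \(total)")
```

Even with this small assumed prototype size the count is in the billions, and it grows with the square of the prototype resolution, so reducing the number of boxes before building masks is likely the biggest win.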

A few things to consider:

  1. Optimization: Ensure that the operations within your loops, especially those involving large datasets, are optimized. Avoid unnecessary computations inside critical loops.
  2. Debugging: Add more detailed logging to understand where exactly the code might be getting stuck or taking longer than expected.
  3. Validation: Check that all indices and calculations, especially within parseBoundingBoxes and parseSegmentationMasks, are correctly aligned with the data structures you're working with.
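On the validation point, one thing worth checking is the memory layout of the detection output. The sketch below assumes the array is channel-major, i.e. shaped like [4 + classes + mask coefficients][anchors], so the value for channel c and anchor i sits at `c * anchors + i`. That layout is an assumption here and should be confirmed against the `shape` and `strides` of the actual `MLMultiArray`. The posted code instead reads each anchor as a contiguous run of `elementSize` values, which would mix unrelated values if the layout is channel-major. The `Candidate` type and `decodeCandidates` function are hypothetical names, and a plain `[Float]` stands in for the model output so the example runs without Core ML.

```swift
// Hypothetical result type for one detection that passed the threshold.
struct Candidate {
    let x: Float, y: Float, width: Float, height: Float
    let classIndex: Int
    let classScore: Float
    let maskCoefficients: [Float]
}

/// Decodes a flat buffer laid out as [channels][anchors] (channel-major).
func decodeCandidates(from values: [Float], anchors: Int, classes: Int,
                      maskCoefficients: Int, threshold: Float) -> [Candidate] {
    let channels = 4 + classes + maskCoefficients
    precondition(values.count == channels * anchors,
                 "buffer size does not match the assumed shape")
    // Element (channel c, anchor i) lives at c * anchors + i.
    func value(_ c: Int, _ i: Int) -> Float { values[c * anchors + i] }

    var result: [Candidate] = []
    for i in 0..<anchors {
        // Find the best class for this anchor.
        var bestClass = -1
        var bestScore = -Float.infinity
        for c in 0..<classes {
            let score = value(4 + c, i)
            if score > bestScore { bestScore = score; bestClass = c }
        }
        guard bestScore > threshold else { continue }
        // Only gather mask coefficients for anchors that pass the threshold.
        let coeffs = (0..<maskCoefficients).map { value(4 + classes + $0, i) }
        result.append(Candidate(x: value(0, i), y: value(1, i),
                                width: value(2, i), height: value(3, i),
                                classIndex: bestClass, classScore: bestScore,
                                maskCoefficients: coeffs))
    }
    return result
}

// Tiny synthetic buffer: 3 anchors, 2 classes, 2 mask coefficients.
let buffer: [Float] = [
    10, 20, 30,     // x
    11, 21, 31,     // y
    2, 4, 6,        // width
    3, 5, 7,        // height
    0.1, 0.9, 0.2,  // class 0 scores
    0.8, 0.3, 0.1,  // class 1 scores
    1, 2, 3,        // mask coefficient 0
    4, 5, 6,        // mask coefficient 1
]
let candidates = decodeCandidates(from: buffer, anchors: 3, classes: 2,
                                  maskCoefficients: 2, threshold: 0.5)
print("Decoded \(candidates.count) candidates")
```

If far more boxes pass the threshold than there are objects in the frame, as with the 15137 boxes in the log, a layout mismatch like this is one plausible cause.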

If you continue to face issues, providing more specific details or error messages could help in diagnosing the problem more effectively.
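On the optimization point, the per-box mask step can be sketched as a weighted sum of prototype channels followed by a threshold. This is a sketch under assumptions: the prototype output is treated as a flat [channels][height * width] buffer, the summed values are treated as logits, and `binaryMask` is a hypothetical name. Under the logit assumption, sigmoid(v) > 0.5 is equivalent to v > 0, so the exponential can be skipped; note that the posted code thresholds the raw sum at 0.5 instead.

```swift
/// Combines prototype channels into one binary mask for a single detection.
/// `prototypes` is assumed to be a flat [channels][pixels] buffer.
func binaryMask(coefficients: [Float], prototypes: [Float], pixels: Int) -> [Bool] {
    precondition(prototypes.count == coefficients.count * pixels,
                 "prototype buffer does not match the assumed shape")
    var logits = [Float](repeating: 0, count: pixels)
    for (channel, weight) in coefficients.enumerated() {
        let base = channel * pixels
        for p in 0..<pixels {
            logits[p] += weight * prototypes[base + p]
        }
    }
    // sigmoid(v) > 0.5 is equivalent to v > 0.
    return logits.map { $0 > 0 }
}

// Two channels over a 2 x 2 prototype (4 pixels), for illustration.
let prototypes: [Float] = [
    1, 0, 2, 0.5,  // channel 0
    0, 1, 1, 0.5,  // channel 1
]
let mask = binaryMask(coefficients: [1, -1], prototypes: prototypes, pixels: 4)
print(mask)
```

Calling this only for the boxes that survive thresholding and NMS keeps the total work proportional to the number of real detections rather than the number of raw anchors.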


fosteman avatar fosteman commented on June 29, 2024

@glenn-jocher, here's your idea: #2879 (comment)

I do not understand what I'm missing in my code.

Could you take the pseudocode and perhaps edit it so that it does what you described in that comment?

Thanks!

In the meantime, I'm heads down implementing segmentation with the TensorFlow Lite framework in my iOS project instead!
