
Comments (15)

FSet89 commented on July 19, 2024

Did you find an answer to this problem?


rajeev-tbrew commented on July 19, 2024

@FSet89 No, I have not been able to find a solution. I saw that you have raised a similar issue where the difference is between CPU and NNPI. At least the issue I reported is somewhat validated :).


FSet89 commented on July 19, 2024

Yes, in my case the problem might be related to the quantization, as the inference device (NPU) may have some precision issues depending on how the quantization is performed (symmetric/asymmetric, per-channel/per-axis, int8/uint8...). Are you using a quantized model?
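
As a quick check, something like the following sketch prints each input tensor's quantization parameters via the plain TFLite Interpreter API, which shows how the model was quantized (the model path and `context` are placeholders):

    // Minimal sketch: print each input tensor's quantization parameters.
    // "model.tflite" and `context` are placeholders, not from this thread.
    import org.tensorflow.lite.Interpreter;
    import org.tensorflow.lite.Tensor;
    import org.tensorflow.lite.support.common.FileUtil;

    Interpreter interpreter =
            new Interpreter(FileUtil.loadMappedFile(context, "model.tflite"));
    for (int i = 0; i < interpreter.getInputTensorCount(); i++) {
        Tensor t = interpreter.getInputTensor(i);
        // Float tensors report scale 0 and zeroPoint 0; quantized tensors
        // report a non-zero scale (and, for asymmetric, a non-zero zeroPoint).
        System.out.printf("input %d: type=%s scale=%f zeroPoint=%d%n",
                i, t.dataType(),
                t.quantizationParams().getScale(),
                t.quantizationParams().getZeroPoint());
    }
    interpreter.close();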


rajeev-tbrew commented on July 19, 2024

Yes, it is a quantized model, but I feel the implementation should be consistent, or at least we should know how to make it behave similarly. I think this library is currently evolving and we should have more info coming soon.


wangtz commented on July 19, 2024

@lu-wang-g


lu-wang-g commented on July 19, 2024

@rajeev-tbrew, can you please provide the code snippet that creates the TensorBuffer and passes it to Model? Thanks!

@xunkai55 to investigate this further.


xunkai55 commented on July 19, 2024

Thanks for your nice words!

I'd like to know if there's any possibility of getting a model and a piece of data for debugging purposes, so that we can reproduce the problem on our side. In theory, GPU / CPU inference should be consistent.


rajeev-tbrew commented on July 19, 2024

@xunkai55 sorry for the delayed response. Here is how to reproduce it.

The model is the MediaPipe pose detector available at this URL.

To use the GPU, I used Kotlin to set the options with the following code:

    // Pick the GPU delegate if this device supports it; otherwise fall
    // back to CPU inference with 4 threads.
    val compatibilityList = CompatibilityList()
    val options = if (compatibilityList.isDelegateSupportedOnThisDevice) {
        Log.d("Output", "This device is GPU compatible")
        Model.Options.Builder().setDevice(Model.Device.GPU).build()
    } else {
        Log.d("Output", "This device is not GPU compatible")
        Model.Options.Builder().setNumThreads(4).build()
    }

    // Lazily create the model with the chosen options.
    val poseDetection: PoseDetectionMl by lazy {
        PoseDetectionMl.newInstance(context, options)
    }

    fun startPoseDetector(): PoseDetectionMl {
        return poseDetection
    }

In Java, the following code takes the image and prepares it for model input:

    // Resize, center-crop to 224x224, and normalize the image.
    // NormalizeOp(127.5f, 127.5f) applies (x - 127.5) / 127.5 per channel,
    // mapping pixel values from [0, 255] to [-1, 1].
    ImageProcessor imageProcessor = new ImageProcessor.Builder()
            .add(new ResizeOp(new_height, new_width, ResizeOp.ResizeMethod.BILINEAR))
            .add(new ResizeWithCropOrPadOp(224, 224))
            .add(new NormalizeOp(127.5f, 127.5f))
            .build();

    // Load the bitmap into a TensorImage and run the preprocessing pipeline.
    TensorImage tImage = TensorImage.fromBitmap(img);
    tImage = imageProcessor.process(tImage);

The following code then runs the model prediction and gets the boxes and scores tensors:

    PoseDetectionMl.Outputs outputs = model.process(tImage.getTensorBuffer());
    TensorBuffer boxes = outputs.getOutputFeature0AsTensorBuffer(); // 2254 ROIs
    TensorBuffer scores = outputs.getOutputFeature1AsTensorBuffer();
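
For reference, this is roughly how the two runs can be compared (a sketch of my own; `cpuScores` and `gpuScores` are hypothetical names for the buffers produced by the CPU and GPU runs):

    // Hypothetical helper: report the largest element-wise difference
    // between the CPU and GPU outputs.
    static float maxAbsDiff(float[] cpu, float[] gpu) {
        if (cpu.length != gpu.length) {
            throw new IllegalArgumentException("Output sizes differ");
        }
        float max = 0f;
        for (int i = 0; i < cpu.length; i++) {
            max = Math.max(max, Math.abs(cpu[i] - gpu[i]));
        }
        return max;
    }

    // Usage: TensorBuffer.getFloatArray() flattens the tensor to a float[].
    // float diff = maxAbsDiff(cpuScores.getFloatArray(), gpuScores.getFloatArray());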

I am sharing the input image that I used for both the GPU and CPU runs, along with the box and score outputs. Please let me know if you need any additional information. Thanks a lot for your help.

output_cpu_boxes.txt
output_cpu_scores.txt
output_gpu_boxes.txt
output_gpu_scores.txt
yoga-6c


xunkai55 commented on July 19, 2024

Thanks for reporting. I can reproduce an inconsistency on my machine and a Pixel 4. I will investigate further.


xunkai55 commented on July 19, 2024

Hi there,

TFLite by default turns on a switch for GPU inference that allows some precision loss in exchange for faster inference.

TFLite Support (org.tensorflow.lite.support.model.Model) adopts the default settings, so that switch is on. In a very simple demo, I turned off that switch (by using the TFLite Interpreter / delegate API directly) and the results then look identical.

Please take a look.

In the future, we need to revisit our API to explore ways to expose that option.

GpuDiffers.zip


xunkai55 commented on July 19, 2024

@rajeev-tbrew Here's how to replace Model with Interpreter (copy-pasted from the zipped project above):

    // Initialization: build an Interpreter with a GPU delegate that has
    // precision loss disabled, so results match CPU inference.
    Interpreter.Options gpuOptions = new Interpreter.Options();
    GpuDelegate.Options gpuDelegateOptions = new GpuDelegate.Options();
    gpuDelegateOptions.setPrecisionLossAllowed(false);
    GpuDelegate gpuDelegate = new GpuDelegate(gpuDelegateOptions);
    gpuOptions.addDelegate(gpuDelegate);
    Interpreter gpuInterpreter = new Interpreter(
            FileUtil.loadMappedFile(getApplicationContext(), MODEL), gpuOptions);

    // Inference: map each output index to a pre-allocated buffer.
    Map<Integer, Object> outputs = new HashMap<Integer, Object>();
    outputs.put(0, detection);
    outputs.put(1, score);
    gpuInterpreter.runForMultipleInputsOutputs(new Object[] {input.getBuffer()}, outputs);
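
One addition that is not in the zipped project: the interpreter and delegate hold native resources, so they should be released when inference is finished:

    // Cleanup: release native resources when done.
    gpuInterpreter.close();
    gpuDelegate.close();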


rajeev-tbrew commented on July 19, 2024

Thank you @xunkai55. It would be great to have this option exposed in the support library, as it makes using TFLite models quite simple. Please feel free to close this issue if you want to track this feature request in a separate thread, or keep it open until this option is made available in tflite-support. Thanks once again for your help.


mikaraento commented on July 19, 2024

We are looking into the differences here; however, I wanted to help set expectations: GPU inference will give different answers from CPU. Whether the difference is too large compared to the performance benefit depends on the use case, and we recommend testing your specific model and use case before production.

As an example, internal pose detection models have significantly different results on GPU vs. CPU even though they are floating-point models. In our testing, however, the difference is acceptable for our specific use cases.

If you need a systematic way to monitor for differences between accelerators for your model, please take a look at the mini-benchmark in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/acceleration/mini_benchmark


rajeev-tbrew commented on July 19, 2024

@mikaraento Thanks for your input. I have a slightly different view (which could be entirely wrong as well).

The model output in this case should not depend on where we run it, i.e. CPU or GPU. The reason it differs is that the GPU is being run in a setup where precision loss is accepted to speed up inference. If we remove that setting, the GPU and CPU give the exact same outputs. That's what @xunkai55 meant in his post by using the other API (Interpreter), which lets us run the model without precision loss on the GPU and get the same results as on the CPU.


lu-wang-g commented on July 19, 2024

Closing the issue for now. Feel free to reopen if you have further questions.

