Comments (19)

lu-wang-g commented on July 19, 2024

I suspect this is a memory fragmentation issue as well. ImageProcessor and TensorImage do a lot of allocation and freeing. Maybe it would help to declare a new TensorImage in each loop iteration instead of making it long-lived?

from tflite-support.

DavidRobb commented on July 19, 2024

I've just tried shifting the creation of various objects in and out of the loop, but I can't find any workaround for the OOM error apart from restarting the Activity every 20 minutes (obviously not ideal).

However, by putting everything in the main loop:

        for (i in 1..5000) {
            GlobalScope.run {
                // Process the image in Tensorflow

                val tflite = Interpreter(
                    FileUtil.loadMappedFile(applicationContext, MODEL_PATH),
                    Interpreter.Options().addDelegate(NnApiDelegate())
                )
                val detector =
                    ObjectDetectionHelper(
                        tflite,
                        FileUtil.loadLabels(applicationContext, LABELS_PATH)
                    )


                val inputIndex = 0
                val inputShape = tflite.getInputTensor(inputIndex).shape()
                val tfInputSize =
                    Size(inputShape[2], inputShape[1]) // Order of axis is: {1, height, width, 3}

                val cropSize = minOf(bitmap.width, bitmap.height)
                val tfip = ImageProcessor.Builder()
                    .add(ResizeWithCropOrPadOp(cropSize, cropSize))
                    .add(
                        ResizeOp(
                            tfInputSize.height,
                            tfInputSize.width,
                            ResizeOp.ResizeMethod.NEAREST_NEIGHBOR
                        )
                    )
                    .add(Rot90Op(-0 / 90))
                    .add(NormalizeOp(0f, 1f))
                    .build()

                val tfImageBuffer = TensorImage(DataType.UINT8)
                tfImageBuffer.load(bitmap)
                val tfImage = tfip.process(tfImageBuffer)
                // Perform the object detection for the current frame
                val predictions = detector.predict(tfImage)

                // Report only the top prediction
                val prediction = predictions.maxByOrNull { it.score }
                Log.i(TAG, " $i ${"%.2f".format(prediction?.score)} ${prediction?.label}")
            }
        }

I can observe a memory leak in the "Others" category of the memory profiler:
[image: memory profiler screenshot]

Any ideas please?

fergushenderson commented on July 19, 2024

Does it help if you insert a call to System.gc() in the loop?

If so, the issue may be that mapped memory (allocated by loadMappedFile, which calls FileChannel.map, which ends up calling the underlying system call mmap()) isn't being reclaimed until garbage collection happens, but garbage collection may not be happening because you're not running out of heap space.

DavidRobb commented on July 19, 2024

I've just tried adding a System.gc() call in the loop for both cases: where only processImage is in the loop, and where all objects are recreated in the loop.

There is no noticeable difference from the observations without the System.gc() call.

It still crashes with an OOM.

xunkai55 commented on July 19, 2024

@DavidRobb In your snippet, you never close the interpreter. Could you please add close calls and retry?

Ref: https://www.tensorflow.org/lite/api_docs/java/org/tensorflow/lite/Interpreter

DavidRobb commented on July 19, 2024

Sorry for the delay in replying. I've just tried adding tflite.close() to the main loop in the OOM example and it makes no difference.

However, I have discovered that it works fine (with or without the close) on the emulated Pixel XL API 28, which shows no sign of a memory leak.

But with the emulated Pixel 2 API 27 or my hardware Moto G 5, the leak exists and it crashes after a few thousand iterations.

xunkai55 commented on July 19, 2024

@DavidRobb Thanks for reporting, it could be really important.

Could you please try moving the creation and closing of the interpreter out of the loop? That can tell us whether the memory leak happens in the inference calls or in the create/close of the interpreter. Thanks!

DavidRobb commented on July 19, 2024

@xunkai55 Just tried this. Moving the tflite creation and close outside the loop provides results similar to my original code example:
No obvious memory leak shown in the profiler, but after 1726 iterations:

2021-06-17 10:48:53.682 1098-1098/com.android.example.camerax.tflite I/CameraActivity: 1725 nu null
2021-06-17 10:48:53.994 1098-1098/com.android.example.camerax.tflite I/CameraActivity: 1726 nu null
2021-06-17 10:48:54.012 1098-10834/com.android.example.camerax.tflite W/libc: pthread_create failed: couldn't allocate 1036288-bytes mapped space: Out of memory
2021-06-17 10:48:54.014 1098-10834/com.android.example.camerax.tflite E/libc++abi: terminating with uncaught exception of type std::__1::system_error: thread constructor failed: Try again
2021-06-17 10:48:54.014 1098-10834/com.android.example.camerax.tflite A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 10834 (.camerax.tflite), pid 1098 (.camerax.tflite)
2021-06-17 10:48:54.466 1098-1113/com.android.example.camerax.tflite W/libc: pthread_create failed: couldn't allocate 1036288-bytes mapped space: Out of memory

I suspect a memory fragmentation problem.

xunkai55 commented on July 19, 2024

@DavidRobb well received. Let me try this on my machine.

DavidRobb commented on July 19, 2024

@xunkai55 Any progress? Have you been able to reproduce the problem?

xunkai55 commented on July 19, 2024

Sorry, I was busy with other things. I will get back to you within 24 hours.

xunkai55 commented on July 19, 2024

I can reproduce the continuous growth of the "Others" section. However, I haven't found any clue as to how to solve it.

xunkai55 commented on July 19, 2024

Using the command adb shell dumpsys meminfo $APP_PACKAGE, I can see that most of the memory is occupied by "Ashmem":

Applications Memory Usage (in Kilobytes):
Uptime: 2137881 Realtime: 2137881

** MEMINFO in pid 3147 [com.android.example.camerax.tflite] **
                   Pss  Private  Private  SwapPss     Heap     Heap     Heap
                 Total    Dirty    Clean    Dirty     Size    Alloc     Free
                ------   ------   ------   ------   ------   ------   ------
  Native Heap    19532    19416        0        0    20992    18508     2483
  Dalvik Heap     4122     4024        0        0     3394     1697     1697
 Dalvik Other      604      604        0        0                           
        Stack      204      204        0        0                           
       Ashmem   120020   119604        0        0                           
    Other dev       14        0       12        0                           
     .so mmap     9090      116     4404        0                           
    .apk mmap     5409        0      320        0                           
    .dex mmap     9514        4     6008        0                           
    .oat mmap      631        0        0        0                           
    .art mmap     4203     3768        0        0                           
   Other mmap       63        4        0        0                           
      Unknown      917      900        0        0                           
        TOTAL   174323   148644    10744        0    24386    20205     4180
 
 App Summary
                       Pss(KB)
                        ------
           Java Heap:     7792
         Native Heap:    19416
                Code:    10852
               Stack:      204
            Graphics:        0
       Private Other:   121124
              System:    14935
 
               TOTAL:   174323       TOTAL SWAP PSS:        0
 
 Objects
               Views:       11         ViewRootImpl:        0
         AppContexts:        3           Activities:        1
              Assets:        2        AssetManagers:        4
       Local Binders:        3        Proxy Binders:       10
       Parcel memory:        2         Parcel count:       10
    Death Recipients:        0      OpenSSL Sockets:        0
            WebViews:        0
 
 SQL
         MEMORY_USED:        0
  PAGECACHE_OVERFLOW:        0          MALLOC_SIZE:        0
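
To watch a category such as Ashmem grow over time, the Pss Total column can be pulled out of each captured dump. The following is only a minimal sketch, assuming the dump text has already been captured to a string and keeps the column layout shown above; the function name pssTotalKb is hypothetical and not part of any Android tooling.

```kotlin
// Hypothetical helper: extracts the "Pss Total" value (in KB) for one
// category row of `dumpsys meminfo` output, e.g. "Ashmem" or "Native Heap".
// Assumes the row starts with the category name followed by whitespace and
// that the first numeric column is Pss Total, as in the dump above.
fun pssTotalKb(meminfo: String, category: String): Int? =
    meminfo.lineSequence()
        .map { it.trim() }
        // "Native Heap:" rows in the App Summary do not match "Native Heap ".
        .firstOrNull { it.startsWith("$category ") }
        ?.removePrefix(category)
        ?.trim()
        ?.split(Regex("\\s+"))
        ?.firstOrNull()
        ?.toIntOrNull()

fun main() {
    // A few rows copied from the dump above.
    val sample = """
          Native Heap    19532    19416        0        0    20992    18508     2483
                Stack      204      204        0        0
               Ashmem   120020   119604        0        0
                TOTAL   174323   148644    10744        0    24386    20205     4180
    """.trimIndent()

    println(pssTotalKb(sample, "Ashmem"))      // prints 120020
    println(pssTotalKb(sample, "Native Heap")) // prints 19532
}
```

Calling this on dumps taken every few hundred iterations would show whether the Ashmem row keeps climbing while the heap rows stay flat.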

xunkai55 commented on July 19, 2024

Well, I noticed another thing: when "Others" reached around 1GB, the profiler just halted, but the process/app still worked. I haven't noticed any app crash yet (with the Pixel 2 API 27 emulator on Linux).

xunkai55 commented on July 19, 2024

@DavidRobb As far as I know, the ashmem usage may be rooted in frequent mmaps. That explains the growing "Others" when you put everything, including creating the interpreter, in the loop, but it is probably not related to the out-of-memory error you noticed in the main thread.

@fergushenderson could you please correct me if I've misunderstood?

srjoglekar246 commented on July 19, 2024

@DavidRobb Can you try not using NNAPI? It looks like the NNAPI delegate might be using ashmem in its implementation, and initializing the delegate in a loop could cause problems.

DavidRobb commented on July 19, 2024

@srjoglekar246 Brilliant. Thank you. That has fixed it for me. It's been running for 2 hours 20 minutes now, whereas before it would crash after about 20 minutes.

As suggested by the TFLite documentation (https://www.tensorflow.org/lite/performance/nnapi), I've done:

        val options = Interpreter.Options()
        // Initialize interpreter with NNAPI delegate for Android Pie or above
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.P) {
            val nnApiDelegate = NnApiDelegate()
            options.addDelegate(nnApiDelegate)
        }
        val tflite = Interpreter(FileUtil.loadMappedFile(context, MODEL_PATH), options)

Thanks all for the help

fergushenderson commented on July 19, 2024

It looks like the real problem here was failing to call nnApiDelegate.close().
(Doc for that method says: "User is expected to call this method explicitly.")
Failing to close the NnApiDelegate will result in a memory leak.

That issue is still present in the camera-samples:

https://github.com/android/camera-samples/blob/42f76c4aab7c7c2d41110c3c127c20c08882e079/CameraXTfLite/app/src/main/java/com/example/android/camerax/tflite/CameraActivity.kt#L89
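
The lifecycle described above (create the delegate and interpreter once, reuse them for every frame, then close both explicitly) can be sketched roughly as follows. To keep the example runnable without Android or TFLite on the classpath, StubDelegate and StubInterpreter are hypothetical stand-ins for NnApiDelegate and Interpreter, and openHandles is an illustrative counter for native resources still held. Closing the interpreter before the delegate follows the order shown in the TFLite NNAPI guide, but treat that ordering as an assumption to verify against the current documentation.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Illustrative counter of "native" resources that are still open.
val openHandles = AtomicInteger(0)

// Hypothetical stand-in for org.tensorflow.lite.nnapi.NnApiDelegate.
class StubDelegate : AutoCloseable {
    init { openHandles.incrementAndGet() }
    override fun close() { openHandles.decrementAndGet() }
}

// Hypothetical stand-in for org.tensorflow.lite.Interpreter.
class StubInterpreter(private val delegate: StubDelegate) : AutoCloseable {
    init { openHandles.incrementAndGet() }
    fun run(frame: Int): Int = frame * 2 // placeholder for real inference
    override fun close() { openHandles.decrementAndGet() }
}

// Creates the delegate and interpreter once, reuses them for every frame,
// and releases both even if inference throws.
fun processFrames(frameCount: Int): Int {
    val delegate = StubDelegate()
    val interpreter = StubInterpreter(delegate)
    try {
        var last = 0
        for (i in 1..frameCount) {
            last = interpreter.run(i)
        }
        return last
    } finally {
        interpreter.close() // close the interpreter first
        delegate.close()    // then the delegate it was using
    }
}

fun main() {
    processFrames(5000)
    println(openHandles.get()) // prints 0: nothing is left open
}
```

In an Activity, the same pairing would typically live in lifecycle callbacks (for example, creation in onCreate and both close() calls in onDestroy) rather than in a single function.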

fergushenderson commented on July 19, 2024

Just to close the loop here: the original problem in the Camera sample app that led to this issue, where the sample app code wasn't calling close() on the NnApiDelegate, was fixed (by ggfan) back on Sept 28. For details, see android/camera-samples#417.
