
Comments (18)

hobu avatar hobu commented on July 20, 2024

I assume you're just running out of memory, and there's some kind of reference-counting bug in laspy that's causing objects to hang around and not be deleted.

from laspy.

grantbrown avatar grantbrown commented on July 20, 2024

Points are provided using numpy.frombuffer, so closing the file (and therefore the underlying memory map) and then trying to access the points is going to cause issues. If you're certain that you want to keep point data around after closing a file, you could do something like:

import glob, numpy as np, laspy

#assumes you are in a directory with a collection of las files
files = glob.glob("*.las")

f0 = files[0]
las = laspy.file.File(f0, mode="r")
points = las.points.copy()
las.close()
#list objects in main namespace
dir()
#points is still present in the namespace
#calling points now should not result in a crash
points

An even better solution would be to use a function to explicitly define what data you want out of each file before closing them.

import glob, numpy as np, laspy

#assumes you are in a directory with a collection of las files
files = glob.glob("*.las")

def process(fname):
    las = las.file.File(fname, mode="r")
    points = las.points.copy()
    las.close()
    return(points)

for i, f in enumerate(files):
    print i, f
    pts = process(f)
    print points

I'd be interested to know if you continue to have problems with an approach like that.


jeffreywolf avatar jeffreywolf commented on July 20, 2024

Thanks (@hobu/@grantbrown) for the quick reply.

I tried the last code block (with a few minor changes: the module name in process changed to 'laspy', and 'points' in the for loop changed to 'pts')...

import glob, numpy as np, laspy

#assumes you are in a directory with a collection of las files
files = glob.glob("*.las")

def process(fname):
    las = laspy.file.File(fname, mode="r")
    points = las.points.copy()
    las.close()
    return(points)

for i, f in enumerate(files):
    print i, f
    pts = process(f)
    print pts

The problem persists. Memory is still not deallocated. Here is a screenshot after passing through 141 files (a subset of all the files I am working with). I ran this on a 64GB RAM server. There are >250 las files in the directory.

[screenshot: server memory usage after processing 141 files]


grantbrown avatar grantbrown commented on July 20, 2024

I'm looking into the issue.


jeffreywolf avatar jeffreywolf commented on July 20, 2024

Thank you! Glad to help in any way, but I'm not familiar with the code-base.


grantbrown avatar grantbrown commented on July 20, 2024

Following the instructions here, I've managed to reduce the size of the memory leak in this commit on a feature branch. There are still quite a few remaining objects, but I wasn't immediately able to figure out why or where they came from.

I'll have to return to this later, but if either of you run across the culprit in the meantime definitely let me know.


grantbrown avatar grantbrown commented on July 20, 2024

Leak check script:
from pympler import tracker
import laspy

def g():
    f = laspy.file.File("simple.las")
    pts = f.points.copy()
    del(pts)
    f.close()
    del(f)

memory_tracker = tracker.SummaryTracker()
memory_tracker.print_diff()
print("########## START   ##########")

for i in xrange(1000):
    g()

memory_tracker.print_diff()


KAMI911 avatar KAMI911 commented on July 20, 2024

Hi,
I tested the memory_tracker code, and I got an OOM while testing on a 259,653,143-byte las file. I had to set the xrange limit to 2, because 10 caused a very long wait and huge memory usage, and 100-1000 caused an OOM.

The output of the script:

python2.7 ./mem.py 
                  types |   # objects |   total size
======================= | =========== | ============
                   list |        2623 |    266.38 KB
                    str |        2625 |    150.42 KB
                    int |         272 |      6.38 KB
                   dict |           2 |      1.30 KB
     wrapper_descriptor |           7 |    560     B
                weakref |           3 |    264     B
      member_descriptor |           2 |    144     B
                   code |           1 |    128     B
  function (store_info) |           1 |    120     B
                   cell |           2 |    112     B
      getset_descriptor |           1 |     72     B
      method_descriptor |           1 |     72     B
                  tuple |           0 |      8     B
         instancemethod |          -1 |    -80     B
########## START   ##########
                               types |   # objects |   total size
==================================== | =========== | ============
                                 int |    18546627 |    424.50 MB
                                list |          23 |    155.51 MB
                                dict |          38 |     47.89 KB
                            instance |         188 |     13.22 KB
                                 str |         121 |      6.42 KB
                              Struct |          10 |      5.75 KB
                lxml.etree._Document |          10 |    880     B
                 lxml.etree._Element |          10 |    720     B
                               float |          24 |    576     B
                                 set |           2 |    464     B
                                file |           2 |    288     B
                         numpy.dtype |           2 |    176     B
  <class 'laspy.header.HeaderManager |           2 |    128     B
         <class 'laspy.header.Header |           2 |    128     B

I am using the multiprocessing module to handle my transformation task here:
https://github.com/KAMI911/lactransformer/blob/master/lactransformer.py
It uses the https://github.com/KAMI911/lactransformer/blob/master/lib/TransformerWorkflow.py and https://github.com/KAMI911/lactransformer/blob/master/lib/LasPyConverter.py modules. Using multiprocessing around the laspy work can probably avoid the leak, so it can serve as a workaround until the developers find the problem. I will also try memory_tracker with my own code. 👍


grantbrown avatar grantbrown commented on July 20, 2024

Thanks for the help - I'll try to find some time this weekend to dig back into this issue.


grantbrown avatar grantbrown commented on July 20, 2024

Just to keep folks posted, I haven't forgotten about this issue, but am pretty strapped for time right now. I'll try to get back to this by next weekend. In the meantime, I'd be curious to know if anyone learns anything about the cause of the problem.


jeffreywolf avatar jeffreywolf commented on July 20, 2024

To keep folks posted from my end (as a laspy user): my workaround was to write a driver program that uses the multiprocessing and subprocess libraries to run each las tile in parallel through command-line calls to an outside program. The outside program called by subprocess opens a single 1 km x 1 km las tile (using laspy) and rasterizes that point cloud (using numpy and GDAL). The memory is freed when each subprocess finishes, so I was able to process a collection of >250 1 km x 1 km las tiles this way. This may be helpful information for others interested in using laspy to process multiple las files while the issue is being resolved.


WilliamWalker avatar WilliamWalker commented on July 20, 2024

Yes, great library. I too have recently encountered the following error after I open/close several LAS files consecutively: Error mapping file: [Error 8] Not enough storage is available to process this command

While monitoring my system resources, it appears that memory is allocated when a file is opened (e.g. inFile = laspy.file.File(las_path, mode='r')), but is not released by inFile.close(). If I restart my Python shell, the memory is released. Any update on this would be greatly appreciated.


grantbrown avatar grantbrown commented on July 20, 2024

This issue definitely needs to be fixed; however, I haven't yet had a chance to revisit it. In the meantime, using the multiprocessing trick mentioned by KAMI911 and jeffreywolf should hopefully help (as sub-optimal as this is).

From my initial research, the leak may arise from circular references between several of the worker classes in laspy (Header, HeaderManager, Reader, etc.).


grantbrown avatar grantbrown commented on July 20, 2024

I think I've contained the worst of the problem with bb41e5. It looks like there may still be a few bytes around after file close, but it doesn't seem to scale with the number of files processed. I'd be grateful if those of you who've had issues could test out the latest changes to see if you still run into problems (there could certainly be other problematic code paths).

Sorry it's taken me so long to return to the issue.


jonathan-voss avatar jonathan-voss commented on July 20, 2024

Does that actually work? You know __del__ is only ever called when the garbage collector is reclaiming that object, right? As far as I can tell, deleting references in __del__ should be no different from the normal behavior of a Python object.

I've been working on Python 3 compatibility for this library for a few days and my branch doesn't seem to have this issue. I only learned about this issue when I noticed that there's been a commit since I started my work.

I'm getting close to being confident that I haven't broken everything so maybe I'll send you a pull request soon? You probably won't like it, I changed a bunch of stuff.


grantbrown avatar grantbrown commented on July 20, 2024

You're right, I didn't continue testing to find the minimal subset of changes sufficient to deal with the leak. Looking again, it turns out the problem was that I'd already defined a __del__ method, which simply closed the file without doing the rest of the work needed to match the default object behavior. Refactored here, thanks.


jonathan-voss avatar jonathan-voss commented on July 20, 2024

Ah, I deleted those in my branch, as context managers are a better way to make sure files get closed at the right time.


grantbrown avatar grantbrown commented on July 20, 2024

Marking the issue resolved; version bumped to 1.4.1 and PyPI updated.

