Comments (23)
Can you give a reproducible example? laspy shouldn't load an entire file into memory until it's necessary. LAS files are memory mapped, however, so virtual memory will be allocated.
from laspy.
Aha, indeed regular LAS files are not loaded directly into memory. But is this currently not the case for LAZ files?
Just try to open this file, for instance (a 3.5 GB LAZ file with 700M points; it is from the Dutch National LiDAR dataset, which is distributed in such tiles). I tried to open it with laspy.file.File(), and memory usage rapidly climbed to over 20 GB (at which point I terminated the process).
Apart from that, it would be useful to me if just a subset of a dataset could be read from disk, e.g. get_points() with some parameters to load only the points that fall within a certain bounding box, a thinned subset, just first returns, etc.
One might of course argue that a dataset should be split into tiles that easily fit in main memory... that would be a perfectly reasonable assumption to me (even though it does not hold for our national dataset).
So feel free to decline this feature request. I also see there are already some issues about improving LAZ support, so perhaps this one can be closed.
PS. thanks for providing this library, it has already been very useful for me
I think laspy definitely should strive to handle laszip files in the same way that las files are handled, and Howard has suggested a feasible way to do so. Unfortunately, I don't have a lot of free development time right now, so unless we get a pull request implementing the functionality it may be some time before it happens.
As far as getting points within a bounding box, this should be feasible and relatively efficient (for las files). If you're willing to store all X, Y, and Z values, you can extract those and quickly determine which indices to keep. Alternatively, file objects are iterable and sliceable, so that could be easily used to fetch points in a bounding box without loading a bunch of stuff into memory.
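As a sketch of that index-mask approach (the coordinate arrays below are synthetic stand-ins for a file's x/y attributes in the laspy 1.x API used in this thread, and the bounding box values are hypothetical):

```python
import numpy as np

# Stand-in coordinate arrays; with laspy 1.x these would come from an
# open file, e.g. f = laspy.file.File("tile.las"); xs, ys = f.x, f.y
xs = np.array([0.5, 1.5, 2.5, 3.5])
ys = np.array([0.5, 1.5, 2.5, 3.5])

# Hypothetical bounding box
xmin, xmax, ymin, ymax = 1.0, 3.0, 1.0, 3.0

# Boolean mask of points inside the box; the matching point records
# could then be pulled out with something like f.points[keep]
keep = (xs >= xmin) & (xs <= xmax) & (ys >= ymin) & (ys <= ymax)
indices = np.where(keep)[0]
```

Only the coordinate arrays need to fit in memory for this; the full point records can stay on disk until they are indexed.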
Is it possible to write VLR information to LAS headers in a 32-bit Python environment without mapping to virtual memory? When looping through LAS files, I receive the following error when the loop attempts to read a file roughly larger than 400 MB:
File "C:\Python27\lib\site-packages\laspy-1.4.0-py2.7.egg\laspy\base.py", line 191, in map
raise laspy.util.LaspyException("Error mapping file: " + str(e))
LaspyException: Error mapping file: [Error 8] Not enough storage is available to process this command
I have 64-bit Windows 7, 32-bit Python, and laspy 1.4.0.
Also, this may be due to the memory leak mentioned in another issue. I have tried multiprocessing and subprocess.Popen with the same results. However, when I run my code on a single file at a time, regardless of size, it completes successfully.
Hmm, the memory leak issue should be taken care of in version 1.4.0. Can you post the complete code you're using?
I don't have access to any machines running 32 bit operating systems at the moment.
import os, struct, sys, glob
import laspy

def build_geotiff():
    ### Define GeoTiff ASCII parameters ###
    GTCitGeoKey = 'PCS Name = NAD_1983_2011_StatePlane_Florida_East_FIPS_0901_Ft_US' + '|'
    VertCitGeoKey = 'NAVD88 - Geoid12B (Feet)' + '|'
    # Build GeoAsciiParamsTag record
    vlr_body2 = GTCitGeoKey + VertCitGeoKey + '\x00'
    # Build GeoTiff projection key VLRs
    sGeoKey1 = struct.pack('4H', 1, 1, 0, 6)
    ### sKeyEntry ###
    # sKeyEntry = [wKeyID, wTIFFTagLocation, wCount, wValue_Offset]
    # GTModelTypeGeoKey: ModelTypeProjected
    sKeyEntry1 = struct.pack('4H', 1024, 0, 1, 1)
    # GTCitationGeoKey: Custom
    sKeyEntry2 = struct.pack('4H', 1026, 34737, len(GTCitGeoKey), 0)
    # ProjLinearUnitsGeoKey: Linear_Foot_US_Survey
    sKeyEntry3 = struct.pack('4H', 3076, 0, 1, 9003)
    # VerticalCSTypeGeoKey: VertCS_North_American_Vertical_Datum_1988
    sKeyEntry4 = struct.pack('4H', 4096, 0, 1, 5103)
    # VerticalCitationGeoKey: Custom
    sKeyEntry5 = struct.pack('4H', 4097, 34737, len(VertCitGeoKey), len(GTCitGeoKey))
    # VerticalUnitsGeoKey: Linear_Foot_US_Survey
    sKeyEntry6 = struct.pack('4H', 4099, 0, 1, 9003)
    # Combine strings
    vlr_body1 = sGeoKey1 + sKeyEntry1 + sKeyEntry2 + sKeyEntry3 + sKeyEntry4 + sKeyEntry5 + sKeyEntry6
    new_vlr1 = laspy.header.VLR(user_id = 'LASF_Projection',
                                record_id = 34735,
                                VLR_body = vlr_body1,
                                description = "GeoTiff Projection Keys")
    new_vlr2 = laspy.header.VLR(user_id = 'LASF_Projection',
                                record_id = 34737,
                                VLR_body = vlr_body2,
                                description = "GeoTiff ASCII parameters")
    new_vlrs = [new_vlr1, new_vlr2]
    return new_vlrs

if __name__ == "__main__":
    # Read LAS files
    las_files = glob.glob(sys.argv[1])
    # Sort in ascending order
    las_files.sort()
    # Loop over files
    for las_file in las_files:
        print("Attempting %s..." % os.path.basename(las_file))
        f = laspy.file.File(las_file, mode='rw')
        inVLRs = build_geotiff()
        f.header.vlrs = inVLRs
        f.close()
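For what it's worth, a packed GeoKeyDirectory like the one above can be sanity-checked by unpacking it back into 4-short records. This sketch builds a minimal directory with a single key using the same '4H' layout as the script (the key values are illustrative):

```python
import struct

# Directory header: wKeyDirectoryVersion, wKeyRevision, wMinorRevision, wNumberOfKeys
header = struct.pack('4H', 1, 1, 0, 1)
# One key entry: wKeyID, wTIFFTagLocation, wCount, wValue_Offset
# GTModelTypeGeoKey (1024) = ModelTypeProjected (1)
entry = struct.pack('4H', 1024, 0, 1, 1)
body = header + entry

# Unpack everything back into unsigned shorts and check the key count,
# which is the fourth short of the directory header
fields = struct.unpack('%dH' % (len(body) // 2), body)
num_keys = fields[3]
first_key = fields[4:8]
```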
Also, I'm running the following:
- Windows 7 64-bit, 24 GB RAM
- Python version 2.7.8 MSC v.1500 32 bit
- laspy version 1.4.0
I'll give this a test when I can, but off the bat I've got a couple questions:
- What is "def worker_bee(inLAS):" doing? Shouldn't this incomplete definition produce an error?
- Why is f opening the variable "inLAS" instead of "las_file"?
No signs of a memory leak on my machine with this modified script.
So you're running 32 bit python on a 64 bit machine? Is there any particular reason for that, or is it something you can upgrade?
Sorry, I copied and pasted parts of the code without cleaning it up first. I've updated it now.
32-bit Python is just what I have, mainly because I work with ArcMap, which comes with the 32-bit install. I haven't upgraded because I don't know the consequences of an upgrade.
@WilliamWalker: it may be worth reinstalling laspy from source. I think the previous version with the memory leak was also numbered 1.4.0.
@grantbrown: was the version number updated after fixing the memory leak?
My apologies, @jeffreywolf is correct. I'll bump the version to 1.4.1 and make sure the latest release is on pypi.
I think there's another problem with this script for files which contain waveform data; it may nuke the waveform packet descriptor VLR. @WilliamWalker: you probably want to carefully merge/extend the VLR list in your files rather than just completely overwriting it.
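One way to merge rather than overwrite, sketched here with (user_id, record_id, body) tuples standing in for laspy.header.VLR objects (the helper name and all values are hypothetical):

```python
def merge_vlrs(existing, new):
    # Keep every existing VLR unless a new one shares its
    # (user_id, record_id) pair, in which case the new one wins.
    replaced = {(v[0], v[1]) for v in new}
    kept = [v for v in existing if (v[0], v[1]) not in replaced]
    return kept + new

# Hypothetical example: a waveform packet descriptor VLR survives the merge,
# while the old projection keys record is replaced by the new one.
existing = [("LASF_Spec", 100, "waveform packet descriptor"),
            ("LASF_Projection", 34735, "old geokeys")]
new = [("LASF_Projection", 34735, "new geokeys"),
       ("LASF_Projection", 34737, "ascii params")]
merged = merge_vlrs(existing, new)
```

With real VLR objects the same idea applies, keyed on their user_id and record_id attributes, before assigning the result back to f.header.vlrs.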
@grantbrown Thanks for the warning. I've been struggling for weeks now trying to find some software that will write certain information into the headers. Even several major packages at my disposal (GlobalMapper, TerraSolid, GeoCue, LP360) could not write a particular tag. I've muddled through the LAS structure, trying to figure out the best way to store the information. Luckily, I typically deal with LAS v1.2 data coming out of TerraSolid, so no waveform packets, or any VLRs for that matter, are in the headers.
@jeffreywolf Thanks! I'll give 1.4.1 a try.
"(GlobalMapper, TerraSolid, GeoCue, LP360) could not write a particular tag."
Which tag(s) do you need to write? You can specify vertical datum systems that are exactly like your example using PDAL (it uses GDAL and libgeotiff to handle tag duties).
Yep, you should be OK with version 1.2. I may see if we can add a warning if users try to nuke required VLRs/EVLRs - it seems like an easy way to shoot yourself in the foot.
@hobu Thanks for the suggestion (and others you've given me on GIS SE). I've looked at PDAL in the past, but could never understand how to get PDAL, GDAL, and libgeotiff to play together on my machine. What's the best approach to install it on my machine (I've never had much luck compiling)?
Use Docker. I've already built everything for you. Follow this http://www.pdal.io/tutorial/docker.html
docker run -v //c/Users/Howard:/data pdal/master pdal translate /data/myfile.las /data/myvertical.las -f reprojection --filters.reprojection.out_srs=/data/myvertical.wkt
Have myvertical.wkt be the WKT that defines the COMPD_CS with the units and vertical datum you want.
Thank you! Does this install the python API as well?
You can do a few things with PDAL and Python, but there isn't really a "Python API" other than a mechanism that lets you read a PDAL pipeline and then consume the result with numpy.
Just an update: unfortunately, laspy 1.4.1 did not solve the issue, so I'm fairly certain it is a limitation of 32-bit mmap. I've also tried creating .bat files to run the files consecutively, and used subprocess.Popen, with similar results. However, the files created by the .bat file appear to have updated headers, even though I receive a "python.exe has stopped working" error window.
Just installed a 64 bit version of python and laspy. Everything ran without a hitch. Thanks again guys!
Please reopen if this is still an issue going forward.