Comments (8)
OK, I think I've solved my problem.
I needed to adjust the "start_first_evlr" property to be the full length of the LAS file as shown below. Not sure if that is the best way, but it seems to work. I needed to read the mmap size for the input LAS file (before the EVLR was appended) to get the index where the EVLR should be placed.
# update the "start_first_evlr" property if this is the first EVLR. Otherwise the laspy code will start with an offset of 0...
if (len(inFile_v14.header.evlrs) == 0):
theReader = laspy.base.Reader(inFileName,"r")
theDataProvider = laspy.base.DataProvider(inFileName, theReader)
theDataProvider.open("r")
theDataProvider.map()
endofthefile = theDataProvider._mmap.size()
outFile_v14.header.start_first_evlr = endofthefile
theDataProvider.close()
theReader.close()
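An alternative that avoids opening a second Reader/DataProvider entirely would be to compute the end-of-points offset from the header arithmetic itself: for a LAS file with no existing EVLRs, the EVLRs begin immediately after the last point record. A sketch with values inferred from the numbers reported later in this thread (hypothetical for any other file):

```python
# Values inferred from the offsets reported later in this thread
# (hypothetical for any other file): point data starting at byte 1307,
# 30-byte point records, 214,691,510 points.
offset_to_point_data = 1307
point_record_length = 30
point_count = 214691510

# EVLRs begin immediately past the last point record
start_first_evlr = offset_to_point_data + point_count * point_record_length
print(start_first_evlr)  # 6440746607
```

This reproduces the 6,440,746,607 offset mentioned below without touching the mmap at all.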
I don't think I should have had to do this, though... The laspy code in base.py seems to want to reset the mmap buffer to the size of the first EVLR (disregarding the point data content when the new EVLR is added). This takes place around line 904 in base.py in the "set_evlrs" method:
```python
self.data_provider.fileref.seek(0, 0)
dat_part_1 = self.data_provider.fileref.read(old_offset)
# Manually Close:
self.data_provider.close(flush=False)
self.data_provider.open("w+b")
self.data_provider.fileref.write(dat_part_1)
total_evlrs = sum([len(x) for x in value])
self.data_provider.fileref.write("\x00"*total_evlrs)
self.data_provider.fileref.close()
self.data_provider.open("r+b")
self.data_provider.map()
self.seek(old_offset, rel = False)
for evlr in value:
    self.data_provider._mmap.write(evlr.to_byte_string())
if self.has_point_records:
    self.data_provider.point_map()
self.populate_evlrs()
```
The full attached code seems to work:
```python
import datetime
import shutil

import laspy

# Set input / output filenames
evlrContentFile = "./LAS_DATA/evlrContentFile.txt"
inFileName = './LAS_DATA/infile.las'
outFileName = './LAS_DATA/outfile.las'

print('Update started: {:%Y-%m-%d %H:%M:%S}'.format(datetime.datetime.now()))

# Get EVLR string content
with open(evlrContentFile, "r") as evlrFile:
    evlrString = evlrFile.read()

# open input LAS file
inFile_v14 = laspy.file.File(inFileName, mode="r")

# make a copy of the input file, then open the copy in 'rw' mode
shutil.copy(inFileName, outFileName)
outFile_v14 = laspy.file.File(outFileName, mode="rw")

# create new EVLR record (EVLR_userid and EVLR_recordid are defined
# elsewhere in the full script)
new_evlr = laspy.header.EVLR(user_id=EVLR_userid, record_id=EVLR_recordid,
                             VLR_body=evlrString)

# update the "start_first_evlr" property if this is the first EVLR.
# Otherwise the laspy code will start with an offset of 0...
if len(inFile_v14.header.evlrs) == 0:
    theReader = laspy.base.Reader(inFileName, "r")
    theDataProvider = laspy.base.DataProvider(inFileName, theReader)
    theDataProvider.open("r")
    theDataProvider.map()
    endofthefile = theDataProvider._mmap.size()
    outFile_v14.header.start_first_evlr = endofthefile
    theDataProvider.close()
    theReader.close()

# outFile_v14 has the same, empty EVLR list as inFile
old_evlrs = inFile_v14.header.evlrs
old_evlrs.append(new_evlr)

# update the EVLR property
outFile_v14.header.evlrs = old_evlrs

# close and exit
outFile_v14.close()
inFile_v14.close()
print('Update Completed: {:%Y-%m-%d %H:%M:%S}'.format(datetime.datetime.now()))
```
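As a side note, the extra Reader/DataProvider pair above could probably be replaced with a plain `os.path.getsize` call, since for a file with no EVLRs appended yet the file length in bytes equals the offset just past the last point record. A sketch of that simplification (not laspy's own API, just standard library):

```python
import os

def end_of_points_offset(las_path):
    """For a LAS file with no EVLRs appended yet, the file length in
    bytes is the offset just past the last point record, i.e. the
    value to assign to header.start_first_evlr."""
    return os.path.getsize(las_path)
```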
I ran into another problem when applying my workaround to large LAS files.
I also posted at: http://gis.stackexchange.com/questions/207193/update-evlr-header-info-for-very-large-las-files-using-laspy
My method shown above works for smaller LAS files (tested on files up to 800MB), but fails on larger files (fails on a file 6.2GB in size).
The issue seems to be the file byte offset integer value (where the EVLR content is to be placed) is too large for laspy (using version 1.2.5 with python version 2.7.12 64-bit) to convert to a C long value.
My error stack trace looks like this:
```
Traceback (most recent call last):
  File "C:\Projects\Tampa\LASPY_EVLR\writeEVLR.py", line 112, in <module>
    main(sys.argv[1:])
  File "C:\Projects\Tampa\LASPY_EVLR\writeEVLR.py", line 103, in main
    outFile_v14.header.evlrs = old_evlrs
  File "C:\Users\amurdoch\AppData\Local\Continuum\Anaconda2\envs\lidartools\lib\site-packages\laspy\header.py", line 1146, in set_evlrs
    self.reader.set_evlrs(value)
  File "C:\Users\amurdoch\AppData\Local\Continuum\Anaconda2\envs\lidartools\lib\site-packages\laspy\base.py", line 848, in set_evlrs
    dat_part_1 = self.data_provider.fileref.read(old_offset)
OverflowError: Python int too large to convert to C long
```
Digging deeper, I found that the offset value for the beginning of the EVLR content is set to a value of 6,440,746,607 or about 6.44e9. I know this WILL fail to be converted to a double, but I'm not sure about C long types.
If this is a 32-bit unsigned long integer, then the conversion won't work... Is it possible to make this a 64-bit unsigned long integer? It looks like the max 32-bit unsigned long integer value is: 4,294,967,295.
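Comparing the failing offset against the two candidate limits confirms that a 32-bit unsigned integer is too small while a 64-bit one has plenty of headroom:

```python
# unsigned integer ranges
max_u32 = 2**32 - 1    # 4,294,967,295
max_u64 = 2**64 - 1    # 18,446,744,073,709,551,615

offset = 6440746607    # the start_first_evlr value that triggers the error
print(offset > max_u32)   # True  -> does not fit in 32 bits
print(offset < max_u64)   # True  -> fits easily in 64 bits
```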
This is a very large LAS file, though we have tons more of this size. This LAS file has 214,691,510 points.
I'm setting the "start_first_evlr" property to the end of the input file (when there are no EVLRs present) in the code from my previous post:
```python
endofthefile = theDataProvider._mmap.size()
outFile_v14.header.start_first_evlr = endofthefile
```
Then when I set the updated EVLR content, the Overflow Error shows up:
```python
outFile_v14.header.evlrs = old_evlrs
```
Do you have any workaround suggestions for me?
Hmm, start_first_evlr should be a ulonglong (64-bit unsigned), so I'm not sure where it would be getting converted to a double. From those error messages, it almost looks like the problem is in the read method of the file object; most of the file operations are done with mmap, but when resizing stuff for evlrs etc. we're just moving dumb bytes around. If that's the issue, we'll need to do something like create a temporary mmap in those methods.
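A temporary mmap along those lines might look roughly like this (a sketch, not laspy code; it assumes the file object is open in binary mode and backed by a real file descriptor):

```python
import mmap

def read_prefix_via_mmap(fileref, nbytes):
    """Read the first nbytes of an open binary file through a read-only
    mmap, slicing the map instead of calling fileref.read(nbytes)."""
    mm = mmap.mmap(fileref.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        # slicing a mmap uses Py_ssize_t indexing (64-bit on a 64-bit
        # build), sidestepping the 32-bit C-long limit hit by read()
        return mm[:nbytes]
    finally:
        mm.close()
```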
Grant,
I don't think the value is ever being converted to a double but to a C long type.
After reading the linked post, I checked the size of the ctypes within my script.
These lines tell me that the c_long type is 4 bytes and the c_ulonglong type is 8 bytes:
```python
import ctypes

print('ctypes long: ' + str(ctypes.sizeof(ctypes.c_long)))
print('ctypes ulonglong: ' + str(ctypes.sizeof(ctypes.c_ulonglong)))
```
That may be the problem. Based on the linked post, I think that I should be seeing a length of 8 bytes for a c_long type? Is that correct?
My version of python is from anaconda... (info determined with sys.version):
2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)]
I am running a 64-bit Win 7 machine.
Hopefully this is something simple that can be fixed on my end...
Do you have any advice for me to get the c_long type to use an 8 byte length?
Thanks for helping me out!
I actually think 4 bytes is fine for a c_long. Can you test whether the problem is that the "read" function ends up using a c_long? That would be an odd choice, but it would explain the problem.
On my installation, the file read function is using a 4 byte c_long (signed long).
I tested the base python file read function and I can't pass in an offset of more than the max signed long integer value (2,147,483,647). Not sure if this is due to a problem with my python installation, or if this is truly the max limit for reading from a file in python...
My test code is below:
```python
import sys

inFileName = r'C:\Projects\Tampa\LASPY_EVLR\LAS_DATA\input\Large_LAS\20505.las'
bit32_offset_signedlong = 2147483647

print("python version" + sys.version)
print("------------------------------")
print("opening file")
fileref = open(inFileName, "r")
print("------------------------------")
print("starting 32bit max read")
datpart_32bitmax = fileref.read(bit32_offset_signedlong)
print("success with 32bit max read")
print("------------------------------")
print("starting 32bit max plus one read")
datpart_32bitmaxplus1 = fileref.read(bit32_offset_signedlong + 1)
print("success with 32bit max plus one read")
```
and the output looks like this:
```
python version2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)]
------------------------------
opening file
------------------------------
starting 32bit max read
success with 32bit max read
------------------------------
starting 32bit max plus one read
Traceback (most recent call last):
  File "C:\Projects\Tampa\LASPY_EVLR\check_clong.py", line 19, in <module>
    datpart_32bitmaxplus1 = fileref.read(bit32_offset_signedlong + 1)
OverflowError: Python int too large to convert to C long
Press any key to continue . . .
```
Also cross-posted most of this question to:
http://stackoverflow.com/questions/39083644/python-file-read-int-too-large-to-convert-to-c-long
Thanks for taking a look at this
It looks like the size argument to the file read() call is limited to a signed C long integer, which limits the size of LAS file that can be edited (at least when adding EVLR content to the end of the file). I accepted the answer given here:
http://stackoverflow.com/questions/39083644/python-file-read-int-too-large-to-convert-to-c-long
So, it appears that the EVLR-editing portion of laspy doesn't support files larger than 2 GB (2,147,483,647 bytes). Does that sound accurate for now?
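For completeness, the usual workaround for this kind of limit is chunked copying, so that no single read() call exceeds the signed C long range. A sketch (not laspy code; the function name and chunk size are illustrative):

```python
def copy_prefix_in_chunks(src, dst, nbytes, chunk_size=64 * 1024 * 1024):
    """Copy the first nbytes of src to dst without ever issuing a
    single read() larger than chunk_size, avoiding the C-long limit."""
    remaining = nbytes
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            break  # source shorter than expected
        dst.write(chunk)
        remaining -= len(chunk)
```

Something like this inside set_evlrs would let the dat_part_1 copy work on files past the 2 GB mark.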
Closing, as it's an old, inactive issue.