
Comments (8)

ahm126 commented on July 2, 2024

OK, I think I've solved my problem.

I needed to set the "start_first_evlr" property to the full length of the LAS file, as shown below. I'm not sure that is the best way, but it seems to work. To get the offset where the EVLR should be placed, I read the mmap size of the input LAS file (before the EVLR was appended).

    # update the "start_first_evlr" property if this is the first EVLR.  Otherwise the laspy code will start with an offset of 0...
    if (len(inFile_v14.header.evlrs) == 0):
        theReader = laspy.base.Reader(inFileName,"r")
        theDataProvider = laspy.base.DataProvider(inFileName, theReader)
        theDataProvider.open("r")
        theDataProvider.map()
        endofthefile = theDataProvider._mmap.size()
        outFile_v14.header.start_first_evlr = endofthefile
        theDataProvider.close()
        theReader.close()
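A possibly simpler way to get the same end-of-file offset, without touching the private `_mmap` attribute, is to ask the operating system for the file size. The following is only a sketch in Python 3: `end_of_file_offset` is a hypothetical helper name, and a throwaway temporary file stands in for the LAS file.

```python
import os
import tempfile


def end_of_file_offset(path):
    """Return the file size in bytes, i.e. the offset just past the last byte."""
    return os.path.getsize(path)


# Demonstration with a throwaway file standing in for a LAS file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 1024)
    demo_path = tmp.name

demo_offset = end_of_file_offset(demo_path)
os.remove(demo_path)
print(demo_offset)
```

The returned value could then be assigned to `start_first_evlr` in place of `theDataProvider._mmap.size()`, assuming the file has no EVLRs yet.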

I don't think I should have had to do this, though... The laspy code in base.py seems to want to reset the mmap buffer to the size of the first EVLR (disregarding the point data content when the new EVLR is added). This takes place around line 904 in base.py in the "set_evlrs" method:

            self.data_provider.fileref.seek(0, 0)
            dat_part_1 = self.data_provider.fileref.read(old_offset)
            # Manually Close:
            self.data_provider.close(flush=False)
            self.data_provider.open("w+b")
            self.data_provider.fileref.write(dat_part_1)
            total_evlrs = sum([len(x) for x in value])
            self.data_provider.fileref.write("\x00"*total_evlrs)
            self.data_provider.fileref.close()
            self.data_provider.open("r+b")
            self.data_provider.map()
            self.seek(old_offset, rel = False)

            for evlr in value:
                self.data_provider._mmap.write(evlr.to_byte_string())

            if self.has_point_records:
                self.data_provider.point_map()
            self.populate_evlrs()

The full attached code seems to work:

    import datetime
    import shutil

    import laspy.base
    import laspy.file
    import laspy.header

    # Set input / output filenames
    evlrContentFile = "./LAS_DATA/evlrContentFile.txt"
    inFileName = './LAS_DATA/infile.las'
    outFileName = './LAS_DATA/outfile.las'

    print('Update started: {:%Y-%m-%d %H:%M:%S}'.format(datetime.datetime.now()))

    # Get EVLR string content
    with open(evlrContentFile, "r") as evlrFile:
        evlrString = evlrFile.read()

    # open input LAS file
    inFile_v14 = laspy.file.File(inFileName, mode = "r")

    # make a copy of the input file
    shutil.copy(inFileName, outFileName)

    # modify 'rw' the copied file
    outFile_v14 = laspy.file.File(outFileName, mode = "rw")

    # create new EVLR record
    new_evlr = laspy.header.EVLR(user_id = EVLR_userid, record_id = EVLR_recordid,
                    VLR_body = evlrString)

    # update the "start_first_evlr" property if this is the first EVLR.  Otherwise the laspy code will start with an offset of 0...
    if (len(inFile_v14.header.evlrs) == 0):
        theReader = laspy.base.Reader(inFileName,"r")
        theDataProvider = laspy.base.DataProvider(inFileName, theReader)
        theDataProvider.open("r")
        theDataProvider.map()
        endofthefile = theDataProvider._mmap.size()
        outFile_v14.header.start_first_evlr = endofthefile
        theDataProvider.close()
        theReader.close()

    # outFile_v14 has the same (empty) EVLR list as inFile_v14
    old_evlrs = inFile_v14.header.evlrs
    old_evlrs.append(new_evlr)

    # update the EVLR property
    outFile_v14.header.evlrs = old_evlrs

    # close and exit
    outFile_v14.close()
    inFile_v14.close()
    print('Update Completed: {:%Y-%m-%d %H:%M:%S}'.format(datetime.datetime.now()))
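For reference, the LAS 1.4 specification describes an EVLR as a 60-byte header (reserved, user ID, record ID, a 64-bit record length, description) followed by the payload. The sketch below packs that layout by hand with `struct`, in Python 3. It is not laspy's `to_byte_string`, and `build_evlr_bytes` is a hypothetical helper used only for illustration.

```python
import struct

# Little-endian, no padding: reserved (uint16), user ID (16 bytes),
# record ID (uint16), record length after header (uint64), description (32 bytes).
EVLR_HEADER = struct.Struct("<H16sHQ32s")


def build_evlr_bytes(user_id, record_id, body, description=b""):
    """Pack an EVLR header followed by its payload (all arguments are bytes/ints)."""
    header = EVLR_HEADER.pack(0, user_id, record_id, len(body), description)
    return header + body


demo_evlr = build_evlr_bytes(b"demo_user", 1, b"hello")
print(EVLR_HEADER.size, len(demo_evlr))
```

Because the record length is a 64-bit field, the format itself has no difficulty describing payloads or offsets well beyond 4 GB.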

from laspy.

ahm126 commented on July 2, 2024

I ran into another problem when applying my workaround to large LAS files.
I also posted at: http://gis.stackexchange.com/questions/207193/update-evlr-header-info-for-very-large-las-files-using-laspy

My method shown above works for smaller LAS files (tested on files up to 800MB), but fails on larger files (fails on a file 6.2GB in size).

The issue seems to be that the byte offset where the EVLR content is to be placed is too large to be converted to a C long (I'm using laspy 1.2.5 with 64-bit Python 2.7.12).

My error stack trace looks like this:

    Traceback (most recent call last):
      File "C:\Projects\Tampa\LASPY_EVLR\writeEVLR.py", line 112, in <module>
        main(sys.argv[1:])
      File "C:\Projects\Tampa\LASPY_EVLR\writeEVLR.py", line 103, in main
        outFile_v14.header.evlrs = old_evlrs
      File "C:\Users\amurdoch\AppData\Local\Continuum\Anaconda2\envs\lidartools\lib\site-packages\laspy\header.py", line 1146, in set_evlrs
        self.reader.set_evlrs(value)
      File "C:\Users\amurdoch\AppData\Local\Continuum\Anaconda2\envs\lidartools\lib\site-packages\laspy\base.py", line 848, in set_evlrs
        dat_part_1 = self.data_provider.fileref.read(old_offset)
    OverflowError: Python int too large to convert to C long

Digging deeper, I found that the offset value for the beginning of the EVLR content is set to a value of 6,440,746,607 or about 6.44e9. I know this WILL fail to be converted to a double, but I'm not sure about C long types.

If this is a 32-bit unsigned long integer, then the conversion won't work... Is it possible to make this a 64-bit unsigned long integer? It looks like the max 32-bit unsigned long integer value is: 4,294,967,295.
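The arithmetic can be checked directly. This short Python 3 sketch compares the reported offset against the 32-bit and 64-bit limits and tries to pack it with `struct`; the helper `can_pack` is a hypothetical name used only here.

```python
import struct

evlr_offset = 6440746607  # offset reported above

INT32_MAX = 2**31 - 1    # 2,147,483,647
UINT32_MAX = 2**32 - 1   # 4,294,967,295
UINT64_MAX = 2**64 - 1

fits_int32 = evlr_offset <= INT32_MAX
fits_uint32 = evlr_offset <= UINT32_MAX
fits_uint64 = evlr_offset <= UINT64_MAX


def can_pack(fmt, value):
    """Return True if value fits the fixed-size struct format fmt."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


print(fits_int32, fits_uint32, fits_uint64)
print(can_pack("<l", evlr_offset), can_pack("<L", evlr_offset), can_pack("<Q", evlr_offset))
```

The offset fits neither a signed nor an unsigned 32-bit integer, but fits comfortably in an unsigned 64-bit one.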

This is a very large LAS file, though we have tons more of this size. This LAS file has 214,691,510 points.

I'm setting the "start_first_evlr" property to the end of the input file (when there are no EVLRs present) in the code from my previous post:

    endofthefile = theDataProvider._mmap.size()
    outFile_v14.header.start_first_evlr = endofthefile

Then when I set the updated EVLR content, the Overflow Error shows up:

    outFile_v14.header.evlrs = old_evlrs

Do you have any workaround suggestions for me?


grantbrown commented on July 2, 2024

Hmm, start_first_evlr should be a ulonglong (64-bit unsigned), so I'm not sure where it would be getting converted to a double. From those error messages, it almost looks like the problem is in the read method of the file object; most of the file operations are done with mmap, but when resizing stuff for EVLRs etc. we're just moving dumb bytes around. If that's the issue, we'll need to do something like create a temporary mmap in those methods.

Perhaps related?
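The temporary-mmap idea mentioned above could look roughly like the following Python 3 sketch. It is not laspy code: `copy_prefix_via_mmap` is a hypothetical helper that copies the first `nbytes` of a file through a read-only mmap, one slice at a time, so no single `read()` call receives a huge size. It assumes a 64-bit interpreter (a 32-bit process generally cannot map a multi-gigabyte file in one piece) and a non-empty source file.

```python
import mmap
import os
import tempfile


def copy_prefix_via_mmap(src_path, dst_path, nbytes, chunk_size=64 * 1024 * 1024):
    """Copy the first nbytes of src_path to dst_path using a read-only mmap."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        view = mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            end = min(nbytes, len(view))
            for start in range(0, end, chunk_size):
                dst.write(view[start:min(start + chunk_size, end)])
        finally:
            view.close()


# Demonstration on a tiny file; a real LAS file would use the default chunk size.
with tempfile.TemporaryDirectory() as workdir:
    src_path = os.path.join(workdir, "in.bin")
    dst_path = os.path.join(workdir, "out.bin")
    with open(src_path, "wb") as f:
        f.write(b"0123456789")
    copy_prefix_via_mmap(src_path, dst_path, nbytes=6, chunk_size=4)
    with open(dst_path, "rb") as f:
        copied = f.read()
print(copied)
```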


ahm126 commented on July 2, 2024

Grant,

I don't think the value is ever being converted to a double but to a C long type.

After reading the linked post, I checked the size of the ctypes within my script.
These lines tell me that the c_long type is 4 bytes and the c_ulonglong type is 8 bytes:

    print('ctypes long: ' + str(ctypes.sizeof(ctypes.c_long)))
    print('ctypes ulonglong: ' + str(ctypes.sizeof(ctypes.c_ulonglong)))

That may be the problem. Based on the linked post, I think that I should be seeing a length of 8 bytes for a c_long type? Is that correct?

My version of python is from anaconda... (info determined with sys.version):
2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)]

I am running a 64-bit Win 7 machine.

Hopefully this is something simple that can be fixed on my end...
Do you have any advice for me to get the c_long type to use an 8 byte length?

Thanks for helping me out!


grantbrown commented on July 2, 2024

I actually think 4 bytes is fine for a c_long. Can you test whether the problem is that the "read" function ends up using a c_long? That would be an odd choice, but it would explain the problem.
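For context on the 4-byte result: 64-bit Windows uses the LLP64 data model, in which a C `long` stays 4 bytes even in a 64-bit process, while 64-bit Linux and macOS use LP64, in which it is 8 bytes. A quick Python 3 sketch for checking what a given interpreter reports:

```python
import ctypes
import struct
import sys

sizes = {
    "c_long": ctypes.sizeof(ctypes.c_long),
    "c_longlong": ctypes.sizeof(ctypes.c_longlong),
    "c_ulonglong": ctypes.sizeof(ctypes.c_ulonglong),
    "pointer": struct.calcsize("P"),
}

for name, size in sorted(sizes.items()):
    print(name, size)

# True on a 64-bit interpreter, regardless of how wide a C long is.
print("64-bit interpreter:", sys.maxsize > 2**32)
```

So a 4-byte `c_long` does not by itself indicate a broken or 32-bit installation.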


ahm126 commented on July 2, 2024

On my installation, the file read function is using a 4 byte c_long (signed long).
I tested the base Python file read function, and I can't pass in a size larger than the maximum signed 32-bit long value (2,147,483,647). I'm not sure whether this is due to a problem with my Python installation, or whether this is truly the limit for a single file read in Python...

My test code is below:

    import sys

    inFileName = r'C:\Projects\Tampa\LASPY_EVLR\LAS_DATA\input\Large_LAS\20505.las'
    bit32_offset_signedlong = 2147483647

    print("python version" + sys.version)
    print("------------------------------")

    print("opening file")
    fileref = open(inFileName, "r")
    print("------------------------------")

    print("starting 32bit max read")
    datpart_32bitmax = fileref.read(bit32_offset_signedlong)
    print("success with 32bit max read")
    print("------------------------------")

    print("starting 32bit max plus one read")
    datpart_32bitmaxplus1 = fileref.read(bit32_offset_signedlong + 1)
    print("success with 32bit max plus one read")

and the output looks like this:

    python version2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)]
    ------------------------------
    opening file
    ------------------------------
    starting 32bit max read
    success with 32bit max read
    ------------------------------
    starting 32bit max plus one read
    Traceback (most recent call last):
      File "C:\Projects\Tampa\LASPY_EVLR\check_clong.py", line 19, in <module>
        datpart_32bitmaxplus1 = fileref.read(bit32_offset_signedlong + 1)
    OverflowError: Python int too large to convert to C long
    Press any key to continue . . .

Also cross-posted most of this question to:
http://stackoverflow.com/questions/39083644/python-file-read-int-too-large-to-convert-to-c-long

Thanks for taking a look at this


ahm126 commented on July 2, 2024

It looks like the size argument of the file read function is limited to a signed C long, which limits the size of a LAS file that can be edited (at least when adding EVLR content to the end of the file). I accepted the answer given here:
http://stackoverflow.com/questions/39083644/python-file-read-int-too-large-to-convert-to-c-long

So, it appears that the EVLR editing portion of laspy doesn't support files greater than 2GB (2,147,483,647 bytes). Does that sound accurate for now?
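One general way around a per-call size limit is to copy the leading part of the file in bounded chunks instead of issuing one huge `read()`. The following Python 3 sketch illustrates the idea; it is not a patch to laspy, and `copy_prefix_in_chunks` is a hypothetical helper. In-memory streams stand in for the real files.

```python
import io


def copy_prefix_in_chunks(src, dst, nbytes, chunk_size=64 * 1024 * 1024):
    """Copy up to nbytes from src to dst, never reading more than chunk_size at once.

    Returns the number of bytes actually copied.
    """
    remaining = nbytes
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:  # source ended before nbytes were available
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return nbytes - remaining


# Demonstration: keep the first 13 bytes and drop what follows.
source = io.BytesIO(b"header+points" + b"old-evlr-bytes")
target = io.BytesIO()
copied = copy_prefix_in_chunks(source, target, nbytes=13, chunk_size=4)
print(copied, target.getvalue())
```

With real files, `src` and `dst` would be opened in binary mode (`"rb"` and `"wb"`), and each individual read would stay far below 2,147,483,647 bytes.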


tmontaigu commented on July 2, 2024

Closing, as it's an old, inactive issue.

