Comments (23)
Perhaps the FFI::Platypus closures are not thread safe? It could also be GDAL that is not thread safe.
It's probably only safe if the Perl is threaded and the os calling thread also has a Perl interpreter on it.
There is a test here:
https://github.com/PerlFFI/FFI-Platypus/blob/81d8c6910287aa61170f51f5cc261657f3ed1179/t/threads.t
from geo-gdal-ffi.
It might help to use a state
variable to declare $instance
inside the new
sub, rather than a package lexical as currently used.
Possibly related: #5
from geo-gdal-ffi.
Although using a state var would require a reasonable amount of re-engineering now that I look.
from geo-gdal-ffi.
It is not evident for me why a singleton in this case needs to be used, but for efficiency (maybe is it related to FFI implementaton?). The problem here is the use of new() in the BEGIN section that has side effects, not the singleton per se, anyway.
Eventually, I would consider the IO::Socket::SSL implementation case which has the same type of problems in initialization (needs to be done once in the main thread and never in other ones). In my code I experienced thread issues with HTTP::Tiny, but I solved by simply adopting a closure with a new() per thread even with SSL connections, for instance. Unfortunately that implementation needs to consider explicitly threads case if evailable at Config level.
from geo-gdal-ffi.
I guess the singleton is just for convenience, so that the user does not need to declare and initialize it.
from geo-gdal-ffi.
Just for reference, I added the following patch to the Debian package for Geo::GDAL::FFI, currently available in unstable:
from geo-gdal-ffi.
Sigh, the patch seems to mitigate the problem, but it does not solve the issue entirely. I just saw the same issue appearing again randomly in my code. It Is definitively an heisenbug. It disappears by not initializing the $instance
in the FFI.pm
module, but I'm not sure again if it is a true solution. I will try to dive into the problem better.
from geo-gdal-ffi.
Are you able to provide test code to reproduce?
from geo-gdal-ffi.
Are you able to provide test code to reproduce?
The code is public on the net in our official GitLab instance, but it is too complex for a test case. I could try to prepare something that forces the failure and possibly segfaults (it happens, too). Maybe reinstating the singleton in the BEGIN section could suffice.
from geo-gdal-ffi.
Can you provide a link to the code?
Also, is Geo::GDAL::FFI loaded before or after Thread::Queue in your code?
from geo-gdal-ffi.
I managed to write a simple script that fails with Perl 5.36 here:
$ /tmp/snippet.pl
thread in1: pushed 36.0724123035656
thread out1: popped 36.0724123035656
thread in2: pushed 56.6034915697859
thread out3: popped 56.6034915697859
thread in3: pushed 21.4827805831195
thread out5: popped 21.4827805831195
thread in4: pushed 58.0341478650556
thread out1: popped 58.0341478650556
thread in5: pushed 94.7460995362238
thread out4: popped 94.7460995362238
Scalars leaked: -3
Attempt to free unreferenced scalar: SV 0x55eee135c618, Perl interpreter: 0x55eee3f44de0 during global destruction.
Attempt to free unreferenced scalar: SV 0x55eee135c5d0, Perl interpreter: 0x55eee3f44de0 during global destruction.
Attempt to free unreferenced scalar: SV 0x55eee135c510, Perl interpreter: 0x55eee3f44de0 during global destruction.
Annullato
This is with the initialization of $instance
in the BEGIN section, which is the code currently in the master
branch. Changing the sleeping time or even repeating the run more times in a row, one can get only the Scalars leaked or even the other three messages.
Of course, order of the use clauses is not important. A constant is the number (3) of leaks. Typically the script crashes at the first join() in the main thread.
from geo-gdal-ffi.
Windows gives a "free to wrong pool" error. gdb backtrace is below.
I wonder if this is something in FFI::Platypus?
(gdb) bt
#0 0x00007ffed33847ab in perl538!Perl_drand48_r () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#1 0x00007fff141e2ccb in ?? () from C:\perls\5.38.0.1_PDL\perl\vendor\lib\auto\FFI\Platypus\Platypus.xs.dll
#2 0x00007ffed3322fec in perl538!Perl_clear_defarray () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#3 0x00007ffed3377932 in perl538!Perl_runops_standard () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#4 0x00007ffed332d2ad in perl538!Perl_call_sv () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#5 0x00007ffed32da70b in perl538!Perl_newRV () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#6 0x00007ffed32da91d in perl538!Perl_sv_clear () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#7 0x00007ffed32db0a9 in perl538!Perl_sv_free2 () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#8 0x00007ffed32d889e in perl538!Perl_rcpv_copy () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#9 0x00007ffed32db341 in perl538!Perl_sv_free2 () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#10 0x00007ffed332f926 in perl_destruct () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#11 0x00007fff1d2d1693 in ?? () from C:\perls\5.38.0.1_PDL\perl\lib\auto\threads\threads.xs.dll
#12 0x00007fff1d2d2c20 in ?? () from C:\perls\5.38.0.1_PDL\perl\lib\auto\threads\threads.xs.dll
#13 0x00007ffed3322fec in perl538!Perl_clear_defarray () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#14 0x00007ffed3377932 in perl538!Perl_runops_standard () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#15 0x00007ffed333363a in perl_run () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#16 0x00007ffed3386e23 in perl538!RunPerl () from C:\perls\5.38.0.1_PDL\perl\bin\perl538.dll
#17 0x00007ff79d751340 in ?? ()
#18 0x00007ff79d751146 in ?? ()
#19 0x00007fff3d5f257d in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\System32\kernel32.dll
#20 0x00007fff3ebaaa78 in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll
#21 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
And FWIW, adding this line to the create sub gets things to work.
eval 'require Geo::GDAL::FFI';
from geo-gdal-ffi.
eval 'require Geo::GDAL::FFI';
Ah, that was exactly the magic illusion that moving the initialization out of the BEGIN block gave to me before. Unfortunately, it could reappear later on, when it hurts more :-) What is evident is that it does not relate directly to some initialization done for every Perl thread because the magic number (3) and the number of threads do not coincide. To be safe I would try to move the init phase only into the main thread.
from geo-gdal-ffi.
Unfortunately, it could reappear later on, when it hurts more :-)
A pity.
from geo-gdal-ffi.
I found that apparently after moving off the BEGIN block the new() sub, no errors appear after commenting out
#$ffi->load_custom_type('::StringPointer' => 'string_pointer');
and all related attaches. I guess the whole problem is due to some oddities in FFI::Platypus
that are still obscure for me.
PS:
NOTE: As of version 0.61, this custom type is now deprecated since pointers to strings are supported in the [FFI::Platypus](https://metacpan.org/pod/FFI::Platypus) directly without custom types.
This module provides a [FFI::Platypus](https://metacpan.org/pod/FFI::Platypus) custom type for pointers to strings.
This is in FFI::Platypus::Type::StringPointer
from geo-gdal-ffi.
Wow! Apparently I got it. It is enough to use string* instead of string_pointer (and remove the load_custom_type) to have the master working with the snippet code, even with the original BEGIN block. \o/
from geo-gdal-ffi.
That's good news.
Would you be able to submit a new PR?
from geo-gdal-ffi.
@plicease - FYI.
It's a longish thread but #58 (comment) and #58 (comment) have the key diagnoses.
from geo-gdal-ffi.
Although I still get the free to wrong pool error using Strawberry Perl 5.38 and 5.32 after I replace string_pointer
with string*
and comment out the load_custom_type
line.
from geo-gdal-ffi.
Although I still get the free to wrong pool error using Strawberry Perl 5.38 and 5.32 after I replace
string_pointer
withstring*
and comment out theload_custom_type
line.
Yep, the last step is understanding what's wrong with the SetErrorHandling() call. For instance, if you comment out that step and move it after the thread creation, the whole script works. The string_pointer use instead seems the cause of the reference leaking. I will do some other trials because thread issues are always challenging with timing.
from geo-gdal-ffi.
Commenting out the CPLPushErrorHandler
avoids the crash.
Geo-GDAL-FFI/lib/Geo/GDAL/FFI.pm
Line 88 in ba5426e
Perhaps the FFI::Platypus closures are not thread safe? It could also be GDAL that is not thread safe.
from geo-gdal-ffi.
GDAL is mostly thread-safe. I already tried with CPLPushErrorHandler
(only main thread) and CPLSetErrorHandler
(all threads) and sticking of the closure (as strongly suggested in the doc): no results. Definitively I guess it can be called only and strictly after threads creation. See for instance, my fresh t/threads.t
for a PR ;-)
here
Of course, that needs to be documented. That said, I'm still confused about why and how the old binding deals with that (probably @ajolma can enlight us on that). That's because I'm currently using the same code in production with the old binding in Debian 11 and it works like a charm.
And talking about that, this is the piece of code where I found problems with multi-threading.
from geo-gdal-ffi.
Ok, if it is acceptable, I'm going to close PR #69 and create two new PRs: one for removing the string_pointer use and the other for thread-safety that introduces the init() and modifies the new() into a per-thread method. All with appropriate changes to documentation, too.
EDIT: in the meantime, I verified that the string_pointer use is the cause of the SV leaks in multi-thread context (and possibly other cases), so that is also a prereq to restructure the code for thread safety.
from geo-gdal-ffi.
Related Issues (20)
- CreateLayer fails if SpatialReference arg exists but is invalid (undef or blank) HOT 3
- Test failure with GDAL 3.01 and Proj 6.2 HOT 1
- FFI::Platypus::Declare is now a separate distribution, needs to be added to Makefile.PL HOT 1
- Can't locate FFI/Platypus/Declare.pm in @INC HOT 2
- Drive by comment: consider ignore_not_found HOT 1
- Add recent OGRGeometry functions HOT 6
- ubuntu 20.04 unable to install using cpan or cpanm HOT 10
- How to do coordinate transformations in Geo::GDAL::FFI HOT 2
- Is the data dir check needed for GDAL >= 2.3.1? HOT 1
- Should Geo::GDAL::FFI::Band inherit from Geo::GDAL::FFI::Object? HOT 1
- Tests fail on FreeBSD as NITF is not found HOT 17
- warning received during loading $FFI::Platypus 'used only once' HOT 1
- Type of the function argument(s) incorrect for GDALInvGeoTransform and GDALApplyGeoTransform HOT 3
- Typo in Geometry.pm
- Add Geo::GDAL::FFI::Layer::GetFeatureCount method
- Tests: use Test::TempDir::Tiny
- test failures in t/00.t
- Fails to build with GDAL 3.8.0 HOT 13
- Release version 0.11 to CPAN HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from geo-gdal-ffi.