Comments (5)
@dazza-codes sqlite is only used in the main process in a single thread. The child processes only do processing of imagery.
Can you be more specific about the deadlock? I've never encountered one. Is it the one described here https://bugs.python.org/issue35267 or a different one?
from rio-mbtiles.
Dunno for sure. While processing 1000s of GeoTIFFs on AWS-batch, using RGBA inputs that are fairly big (esp. uncompressed) and zoom levels of about 4..14, the conversions get hung up sometimes. So far, I've forked and sacrificed multiprocessing speed for reliability. While I have a bit of time, revisiting this speed tradeoff to learn more about the project and mess it up 😁
Maybe my mental model of the process is not entirely clear; I probably need to learn to generate a sequence diagram for this or something. I don't know whether the process pool/worker is still active when sqlite is accessed to write out the result of each call to process_tile. There is no explicit termination of the process pool and no context manager at work. I considered just setting the worker-tasks to None so that all the workers live as long as the pool because maybe something is wacky in sharing the global src when the process cleans up after 100 tasks, I dunno.
I started to wonder what happens with a map_async callback pattern that might try to save results as they return, but that would mess with sqlite thread safety if that's in the callback handler. I'm also considering whether each process_tile can just throw out a tmp-file (give reproject a tmp-file instead of a MemoryFile) and always return None, then gather up all the tmp-file tile outputs into sqlite after they're all done. Also curious about using dask.delayed in case it has better memory management for large inputs and 1000s of tiles. I see some new work with futures, will take a look at that 👓
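For what it's worth, the tmp-file idea could look roughly like the sketch below. It is not rio-mbtiles code: render_tile is a hypothetical stand-in for process_tile that writes placeholder bytes instead of a reprojected image, gather_tiles and build are made-up helper names, and the tiles table is a simplified take on the MBTiles layout (no TMS row flip, no metadata table). Workers only touch the filesystem and return None; sqlite is opened in the main process after every worker has finished.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ProcessPoolExecutor


def render_tile(tile, out_dir):
    """Hypothetical stand-in for process_tile: write tile bytes to a file, return None."""
    z, x, y = tile
    data = f"tile {z}/{x}/{y}".encode()  # placeholder for real image bytes
    path = os.path.join(out_dir, f"{z}-{x}-{y}.tile")
    with open(path, "wb") as f:
        f.write(data)


def gather_tiles(out_dir, db_path):
    """Load every *.tile file into sqlite, from the calling process only."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS tiles "
            "(zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_data BLOB)"
        )
        count = 0
        for name in sorted(os.listdir(out_dir)):
            if not name.endswith(".tile"):
                continue
            z, x, y = (int(part) for part in name[: -len(".tile")].split("-"))
            with open(os.path.join(out_dir, name), "rb") as f:
                conn.execute("INSERT INTO tiles VALUES (?, ?, ?, ?)", (z, x, y, f.read()))
            count += 1
        conn.commit()
        return count
    finally:
        conn.close()


def build(tiles, db_path, executor):
    """Render all tiles through the executor, then gather them into sqlite."""
    with tempfile.TemporaryDirectory() as out_dir:
        futures = [executor.submit(render_tile, t, out_dir) for t in tiles]
        for fut in futures:
            fut.result()  # returns None; re-raises any error from the worker
        return gather_tiles(out_dir, db_path)


if __name__ == "__main__":
    tiles = [(1, x, y) for x in range(2) for y in range(2)]
    with tempfile.TemporaryDirectory() as tmp:
        with ProcessPoolExecutor(max_workers=2) as pool:
            n = build(tiles, os.path.join(tmp, "out.mbtiles"), pool)
        print(n, "tiles written")
```

The trade-off is extra disk I/O in exchange for never having sqlite and the pool active at the same time, which should make it easier to tell which of the two is responsible for a hang.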
Explored various options and trade-offs in
- https://github.com/dazza-codes/rio-mbtiles/branches
- need to test several options with larger inputs and compare against current master
There is an interesting comment in https://docs.python.org/3/library/concurrent.futures.html
Changed in version 3.3: When one of the worker processes terminates abruptly,
a BrokenProcessPool error is now raised. Previously, behaviour was undefined
but operations on the executor or its futures would often freeze or deadlock.
This behavior might also apply to Pool when the worker init fails (don't know for sure).
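A small sketch of what the quoted note means in practice for concurrent.futures (it says nothing about multiprocessing.Pool, which is the open question above). The crash and collect functions are made up for illustration; crash simulates a worker dying abruptly by calling os._exit.

```python
import concurrent.futures as cf
import os
from concurrent.futures.process import BrokenProcessPool


def square(n):
    return n * n


def crash(_):
    """Simulate a worker dying abruptly (e.g. a segfault or an OOM kill)."""
    os._exit(1)


def collect(futures):
    """Gather results; record a broken pool instead of letting it propagate."""
    results, broken = [], False
    for fut in futures:
        try:
            results.append(fut.result())
        except BrokenProcessPool:
            broken = True
    return results, broken


if __name__ == "__main__":
    with cf.ProcessPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(square, 3), pool.submit(crash, None)]
        results, broken = collect(futures)
    # Futures still pending when the worker dies also fail with BrokenProcessPool,
    # so `results` may or may not contain the square.
    print("results:", results, "pool broken:", broken)
```

With an executor, a dead worker surfaces as an exception on the futures rather than as a silent hang, which is the behaviour the 3.3 change note describes.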
I'm closing for now. The sqlite3 module is only used from a single thread. Python's multiprocessing is more likely to be the cause of deadlocks.
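For context on the single-thread point: by default the sqlite3 module refuses to use a connection from any thread other than the one that created it, which is why writing from a map_async callback (callbacks run in a pool helper thread) would be a problem. A minimal sketch, with use_from_other_thread as a made-up helper name:

```python
import sqlite3
import threading


def use_from_other_thread(conn):
    """Try to use `conn` in a new thread; return the exception raised, or None."""
    outcome = {}

    def target():
        try:
            conn.execute("SELECT 1")
            outcome["error"] = None
        except sqlite3.ProgrammingError as exc:
            outcome["error"] = exc

    t = threading.Thread(target=target)
    t.start()
    t.join()
    return outcome["error"]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # default check_same_thread=True
    print(use_from_other_thread(conn))
    conn.close()
```

This only shows the module's thread check; it does not reproduce the hang reported in this issue.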