Comments (8)
I've made some changes to reduce memory fragmentation which helps a little. The
main user of memory is the lookup table to convert node ids to lat/lon when
reading the ways.
Currently all the nodes in the osm file are read to memory (a std::vector of
{int id; long x,y;}). This takes 12 bytes per node. France has 122 million
nodes and this works.out at c.1.5 GB). It is reallocating this array where
preprocessing first runs out of memory. I can improve things a little by
reserve() ing the space for these nodes, and France then compiles on my 2GB
machine.
Other space optimisations could be:
Read the file in two passes, work out which nodes are needed in the first pass, and then load these nodes only in the 2nd. The current drawing code uses about 80% of the nodes though so savings would be modest.
As above, but allow processing in batches. eg. Pass 1 read first 1M ways. Pass 2 load nodes for these ways and write ways to temp file. Repeat until all ways read. Then load all ways from temp file, sort, index, and write final db.
I think the second of these is the only way to read an arbitrarily large osm
file (eg europe or planet), but it would be slow traversing all nodes multiple
times.
Any thoughts?
Original comment by [email protected]
on 14 Feb 2011 at 3:49
- Changed state: Started
from monav.
This seems to help, the preprocessing gets further than before, but does not
finish , yet.
[11:05:43][0] 15800000 ways with 133634762 nodes
is as far as it goes.
Besides implementing a multi-pass approach you could try:
- Store only IDs of nodes for each way instead of coordinates and do the coordinate lookup on-thy-fly. When splitting ways you have to insert new nodes.
- Use a vector that uses realloc for PODS instead of allocate + copy ( e.g. QVector if I am not mistaken ). STL vectors have some bad design choices ( resize not using realloc, resize doubling the vector size, no shrink, clear not freeing memory etc ). Manual resizing might also squeeze out some more memory ( my dynamic graph in the Contraction Hierarchies plugin does this; it checks whether enough space is reserved and if not only enlarges by a small factor, e.g. 1.2, instead of 2 )
The multi-pass approach might be nice have have in the future, however, since
you have to load all ways into memory at some point your memory savings are
limited,
Original comment by [email protected]
on 15 Feb 2011 at 11:05
from monav.
What do you mean by doing "coordinate lookup on-the-fly". The only efficient
way I could see of going from node-id --> coordinate is to look it up in an
in-memory std::map or equivalent. It could be looked up from disk, but that
would be really slow (worst case scenario polling the osm.bz2 file, or a bit
faster from a custom generated file). Am I missing something obvious?
As far as doing a multi-pass approach goes it wouldn't ever be necessary to
load all ways into memory. They can be processed one by one, and then
bucket-sorted to one of a number of temporary files each of which will fit into
memory for sorting and indexing. It would be slow though.
Original comment by [email protected]
on 15 Feb 2011 at 2:24
from monav.
With on-the-fly lookup I mean the following:
Right now you store for each way its list of coordinates in the global node
array for all ways. Instead you could just store the nodes' ID and later when
you want to access a node's coordinates you perform the node-lookup with the
binary search.
Of course, this slows down the whole process if you have to access the way's
coordinates more than once.
By the way, the binary search is way more efficient then a std::map, as the
std::map is internally a fully fledged self-balancing binary tree. This usually
means an overhead of at least 8 bytes per entry.
Original comment by [email protected]
on 15 Feb 2011 at 2:31
from monav.
This issue was closed by revision f4c62ed6fc75.
Original comment by [email protected]
on 21 May 2011 at 2:12
- Changed state: Fixed
from monav.
It turns out none of the more minor changes you suggested allowed a large
enough increase in file size, so I've implemented a multi-pass preprocessor. It
scales poorly with increasing file size, but can handle France easeily, and
even the whole of Europe in under 2h with 2GB of RAM which should be OK.
N.B. Processing the whole of Europe isn't actually useful at the moment as the
resulting database file size (>4GB) breaks some assumptions in the client code,
and would need a new DB format to fix.
Original comment by [email protected]
on 21 May 2011 at 2:17
from monav.
Ah, I just hit this issue while processing Europe. The last comment shows that
it's a known issue. Maybe make a new bug out of it?
Original comment by [email protected]
on 18 Jul 2011 at 12:51
from monav.
Yeah, just go ahead and file a new bug report for that, I'll assign it to James.
Original comment by [email protected]
on 18 Jul 2011 at 1:01
from monav.
Related Issues (20)
- Building Monav against Qt 5 HOT 1
- Guidance on how to integrate Monav in a Windows 8 App Project
- monav-preprocessor: bad_alloc HOT 3
- MoNav Preprocessor HOT 2
- Misleading turn instructions when route is almost straight at a crossing HOT 4
- Routing prefers highway all the time HOT 1
- Plays only the first audio file in Qt Simulator HOT 1
- Way modificators greater than zero are ignored HOT 2
- build failure: QtMultimedia: No such file or directory HOT 7
- Assertion `edges->size() != ( int ) edgesBegin' failed. (for one specific route) HOT 1
- Some turns are not announced HOT 1
- libxml2 include path HOT 1
- Preprocessor linker error with libprotobuf
- monav 0.3.release-3+hg5585bb4a8d77 crashes on startup
- No map with Mapnik renderer and metaTileSize > 63
- Fix memory handling bugs HOT 1
- Fix preprocessor crash on small data sets without places HOT 2
- Fix QTileRenderer crash on small data sets HOT 2
- Preprocessor: Fix unpredicatable result of BinaryHeap::WasInserted() HOT 4
- Bicycle routing on footways / steps
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from monav.