omniscale / imposm-parser
Deprecated: Python parser for OpenStreetMap data
Home Page: http://imposm.org/docs/imposm.parser/latest/
License: Apache License 2.0
imposm.parser/setup.py", line 15 except OSError, ex: ^ SyntaxError: invalid syntax
Hi,
I installed on OSX like this:
brew install protobuf
pip install imposm.parser
Everything went fine, but when I tested the library, e.g.:
python -c"from imposm.parser.pbf import OSMPBF"
it reported an error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: dlopen(/Users/rlujo/env/invh/busni/pybiz/lib/python2.7/site-packages/imposm/parser/pbf/OSMPBF.so,
2): Symbol not found: __ZN6google8protobuf11MessageLite15ParseFromStringERKSs
Referenced from: /Users/rlujo/env/invh/busni/pybiz/lib/python2.7/site-packages/imposm/parser/pbf/OSMPBF.so
Expected in: flat namespace
in /Users/rlujo/env/invh/busni/pybiz/lib/python2.7/site-packages/imposm/parser/pbf/OSMPBF.so
I spent some time investigating. There seems to be a problem linking OSMPBF.so against libprotobuf.dylib: the signature of the method ZN6google8protobuf11MessageLite15ParseFromStringERK... is mismatched.
The solution in my case was upgrading XCode from 4.8 to 6.2 and reinstalling protobuf and imposm.parser:
pip uninstall imposm.parser
brew uninstall protobuf
brew install protobuf
pip install imposm.parser
Hope this will help someone.
When I try to parse the file contained in http://took.paulnorman.ca/imports/final.zip I get the following error
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/pnorman/osm/imposmenv/local/lib/python2.7/site-packages/imposm/parser/simple.py", line 113, in parse_it
parser.parse(input)
File "/home/pnorman/osm/imposmenv/local/lib/python2.7/site-packages/imposm/parser/xml/multiproc.py", line 110, in parse
chunker.read(self.mmap_queue, coords_callback=self.coords_callback)
File "/home/pnorman/osm/imposmenv/local/lib/python2.7/site-packages/imposm/parser/xml/multiproc.py", line 225, in read
xml_nodes.write(line)
ValueError: data out of range
When I parse the same file but passed through xmllint --format final.osm > final-format.osm
it works fine.
The heads of the two files are shown below:
<?xml version="1.0"?>
<osm version="0.6" upload="false" generator="uvmogr2osm">
<node lat="49.182733878" visible="true" lon="-122.885183474" id="-1"><tag k="addr:housenumber" v="12181"/><tag k="surrey:permit_date" v="19850801"/><tag k="addr:city" v="Surrey"/><tag k="addr:street" v="99 Avenue"/></node>
<node lat="49.164329338" visible="true" lon="-122.784702089" id="-2"><tag k="addr:housenumber" v="8885"/><tag k="surrey:permit_date" v="19881228"/><tag k="addr:city" v="Surrey"/><tag k="addr:street" v="158 Street"/></node>
<node lat="49.167453196" visible="true" lon="-122.831808057" id="-3"><tag k="addr:housenumber" v="9055"/><tag k="surrey:permit_date" v="20030130"/><tag k="addr:city" v="Surrey"/><tag k="addr:street" v="141A Street"/></node>
<?xml version="1.0"?>
<osm version="0.6" upload="false" generator="uvmogr2osm">
<node lat="49.182733878" visible="true" lon="-122.885183474" id="-1">
<tag k="addr:housenumber" v="12181"/>
<tag k="surrey:permit_date" v="19850801"/>
<tag k="addr:city" v="Surrey"/>
<tag k="addr:street" v="99 Avenue"/>
</node>
<node lat="49.164329338" visible="true" lon="-122.784702089" id="-2">
<tag k="addr:housenumber" v="8885"/>
"""
nodes and relations with empty tags will not be returned, but ways will be, since they might be needed for building relations.
"""
WHY!?
a Way can contain nodes
http://wiki.openstreetmap.org/wiki/Way
a Relation can contain nodes
http://wiki.openstreetmap.org/wiki/Relation
a Relation can contain other relations
http://wiki.openstreetmap.org/wiki/Relation:boundary
bloody hell
I took the OSM dump for my country: BY.osm.pbf (86 MB) and BY.osm.bz2 (138 MB). I am trying to get nodes, ways and relations, but not all of them are found. The following example prints the first node, first way and first relation in the OSM file if they are found, and also prints the total counts of nodes, ways and relations:
from __future__ import absolute_import, print_function
from imposm.parser import OSMParser

file_name = 'BY.osm.pbf'
node_count = 0L
way_count = 0L
rel_count = 0L


def nodes_callback(nodes):
    global node_count
    node_count += len(nodes)
    for node_id, node_tags, node_coord in nodes:
        if type(node_id) not in (int, long):
            print('node', node_id, 'NO INT', type(node_id))
        if node_id == 356241L:
            print('node', node_id)


def ways_callback(ways):
    global way_count
    way_count += len(ways)
    for way_id, way_tags, way_nodes in ways:
        if type(way_id) not in (int, long):
            print('way', way_id, 'NO INT', type(way_id))
        if way_id == 4418866L:
            print('way', way_id)


def relations_callback(relations):
    global rel_count
    rel_count += len(relations)
    for rel_id, rel_tags, rel_rels in relations:
        if type(rel_id) not in (int, long):
            print('rel', rel_id, 'NO INT', type(rel_id))
        if rel_id == 4034L:
            print('rel', rel_id)


def main():
    OSMParser(nodes_callback=nodes_callback, ways_callback=ways_callback,
              relations_callback=relations_callback).parse(file_name)
    print(node_count, way_count, rel_count)


if __name__ == '__main__':
    main()
Output for BY.osm.pbf with concurrency=None and concurrency=1:
way 4418866
rel 4034
185818 1188354 12874
Output for BY.osm.bz2 with concurrency=None and concurrency=1 has a slightly different result:
way 4418866
rel 4034
185818 1188354 12864
But with my own implementation I get a completely different result:
node 356241
way 4418866
rel 4034
9070000 1180000 10000
Code of my implementation:
from __future__ import absolute_import, print_function
from bz2 import BZ2File
from lxml import etree as ET


def parse(file, nodes_callback=None, ways_callback=None,
          relations_callback=None, batch=10000):
    node_batch, way_batch, relation_batch = [], [], []
    tags, nodes, members = {}, [], []
    # iterparse yields 'end' events, so child elements (tag, nd, member)
    # arrive before the node, way or relation that contains them.
    for event, elem in ET.iterparse(file):
        tag = elem.tag
        attrib = elem.attrib
        if tag == 'tag':
            tags[attrib['k']] = attrib['v']
        elif tag == 'nd' and ways_callback:
            nodes.append(int(attrib['ref']))
        elif tag == 'member' and relations_callback:
            members.append((int(attrib['ref']), attrib['type'], attrib['role']))
        elif tag == 'node':
            if nodes_callback:
                node_batch.append((int(attrib['id']), tags,
                                   (float(attrib['lon']), float(attrib['lat']))))
                if len(node_batch) == batch:
                    nodes_callback(node_batch)
                    node_batch = []
            tags, nodes, members = {}, [], []
            elem.clear()
        elif tag == 'way':
            if ways_callback:
                way_batch.append((int(attrib['id']), tags, nodes))
                if len(way_batch) == batch:
                    ways_callback(way_batch)
                    way_batch = []
            tags, nodes, members = {}, [], []
            elem.clear()
        elif tag == 'relation':
            if relations_callback:
                relation_batch.append((int(attrib['id']), tags, members))
                if len(relation_batch) == batch:
                    relations_callback(relation_batch)
                    relation_batch = []
            tags, nodes, members = {}, [], []
            elem.clear()
    # NOTE: a final batch smaller than `batch` is never passed to its
    # callback, so the counts printed above are multiples of the batch size.


class OSMParser(object):
    def __init__(self, nodes_callback=None, ways_callback=None,
                 relations_callback=None):
        self.nodes_callback = nodes_callback
        self.ways_callback = ways_callback
        self.relations_callback = relations_callback

    def parse(self, file_name):
        with BZ2File(file_name, 'rb') as file:
            parse(file, self.nodes_callback, self.ways_callback,
                  self.relations_callback)
It would be really nice to parse also version and author data.
<node id="338982517" lat="48.5212819" lon="9.0573507" version="7"
timestamp="2012-12-15T16:59:53Z" changeset="14283022" uid="290680"
user="wheelmap_visitor">
I see that your documentation website (http://imposm.org/docs/imposm/latest/) now covers a new version, 1.0.5, and I can find the source code for it on PyPI. Do you intend to update this repository as well?
Thanks!
Sorry, it's not really an issue, but I couldn't find a mailing-list/group to ask a question on. Hope you don't mind. Here goes:
I want to identify all of the farm regions in a PBF file (I've actually downloaded Wales.pbf from http://download.geofabrik.de/openstreetmap/europe/great_britain/ as it's quite small).
Firstly, to get hold of all of the appropriate ways and associated coordinates, do I need to keep a list of all coordinates which are read? i.e. do I need to do something like:
from imposm.parser import OSMParser

ways = []

def ways_callback(nodes):
    for node in nodes:
        ways.append(node)

coords = {}

def coords_callback(coords_list):
    for coord in coords_list:
        id = coord[0]
        coords[id] = coord

r = OSMParser(ways_callback=ways_callback,
              coords_callback=coords_callback)
r.parse('wales.osm.pbf')
or is there a smarter approach? (Obviously I could do my filtering of the ways in the ways_callback, but I haven't done that here for maximum flexibility).
From here, to get the coordinates of all ways which have a landuse containing farm I'm doing:
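import numpy as np
import matplotlib.pyplot as plt

def way_coords(way):
    # way is (osm_id, tags, refs); look up each referenced node's coordinate
    w_coords = []
    for coord_id in way[2]:
        w_coords.append(coords[coord_id][1:])
    return np.array(w_coords)

for way in ways:
    if 'landuse' in way[1]:
        if 'farm' in way[1]['landuse']:
            c = way_coords(way)
            plt.plot(c[:, 0], c[:, 1])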
Am I missing some part of the interface which simplifies the retrieval of coordinates for a given way?
Thanks!
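The lookup described in this question can be sketched without NumPy or matplotlib. This is only an illustration under stated assumptions: the helper names way_coordinates and farm_ways are hypothetical, coordinates are assumed to arrive as (osm_id, x, y) tuples and ways as (osm_id, tags, refs) tuples as in the snippets above, and references to nodes missing from the extract are skipped rather than raising a KeyError.

```python
def way_coordinates(way, coords):
    """Return the stored coordinate pairs for one way.

    `way` is an (osm_id, tags, refs) tuple and `coords` maps node id to
    an (x, y) pair. Refs without a stored coordinate are skipped.
    """
    osm_id, tags, refs = way
    return [coords[ref] for ref in refs if ref in coords]


def farm_ways(ways, coords):
    """Yield (way_id, coordinate list) for ways whose landuse contains 'farm'."""
    for way in ways:
        if 'farm' in way[1].get('landuse', ''):
            yield way[0], way_coordinates(way, coords)


# Synthetic data standing in for what the callbacks would collect.
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0)}
ways = [
    (10, {'landuse': 'farm'}, [1, 2, 3, 1]),
    (11, {'highway': 'residential'}, [1, 2]),
    (12, {'landuse': 'farmyard'}, [2, 99]),  # node 99 is not in coords
]
result = dict(farm_ways(ways, coords))
```

Storing only the coordinate pair per node id, instead of the whole tuple, keeps the dictionary a little smaller; whether that matters depends on the size of the extract.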
Hey!
I tried to make a single code base that works on both Python 2 and Python 3 and I failed; I had some problems with osm.cpp and conditional ifdefs...
I managed to run the test suite (through tox) on Python 3.5 using this code:
https://github.com/lechup/imposm-parser/tree/python3
Which is mainly based on this work:
sidewalklabs@64150b1
Cheers
Well, you know.
The multiprocessing parts fail when I use this in Celery. I have modified the source code myself to work around this. It would be great if the original code also worked with billiard.
import sys
from imposm.parser import OSMParser

def callback(data):
    raise Exception("I'm broken")

parser = OSMParser(relations_callback=callback, concurrency=1)
try:
    parser.parse_pbf_file("var/RU.osm.pbf")
except Exception as e:
    print(e)
    print("now you have to press ctrl-c")
    print("or kill all three processes with signal")
    sys.exit(100500) # does not help
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/qmax/Work/regions/.env/local/lib/python2.7/site-packages/imposm/parser/simple.py", line 113, in parse_it
parser.parse(input)
File "/home/qmax/Work/regions/.env/local/lib/python2.7/site-packages/imposm/parser/pbf/multiproc.py", line 70, in parse
pos_queue.put(pos)
File "/usr/lib/python2.7/multiprocessing/queues.py", line 311, in put
if not self._sem.acquire(block, timeout):
KeyboardInterrupt
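Until the parser itself shuts down cleanly, one possible workaround is to keep exceptions from escaping the callback at all. The sketch below is not part of imposm.parser: collecting_errors is a hypothetical helper, and it assumes the callback runs in the process that later inspects the errors list, which this report does not confirm.

```python
import functools


def collecting_errors(callback, errors):
    """Wrap `callback` so exceptions are appended to `errors` instead of raised."""
    @functools.wraps(callback)
    def wrapper(batch):
        try:
            callback(batch)
        except Exception as exc:
            # Record the failure and let the parser continue with the next batch.
            errors.append(exc)
    return wrapper


def callback(data):
    raise Exception("I'm broken")


errors = []
safe_callback = collecting_errors(callback, errors)
safe_callback([(1, {}, [])])  # does not raise

# With the parser this might be used as:
#   parser = OSMParser(relations_callback=safe_callback, concurrency=1)
#   parser.parse_pbf_file("var/RU.osm.pbf")
#   if errors:
#       raise errors[0]
```

The trade-off is that the whole file is still parsed after the first failure; the errors only surface once parsing has finished.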
I had to make the following change to setup.py in order to build/install imposm.parser on OSX.
ext_modules=[
    Extension("imposm.parser.pbf.OSMPBF",
        ["imposm/parser/pbf/osm.cc", "imposm/parser/pbf/osm.pb.cc"], libraries=['protobuf'],
        include_dirs=['/usr/local/include']  # Add include_dirs
    ),
],
Is there a way to explicitly tell imposm.parser where to look for the protobuf headers?
I first tried running python setup.py config -I/usr/local/include but that didn't seem to work.
My knowledge of setuptools is minimal, so I'm sorry if this is just a lack of understanding on my part.
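One way to avoid hard-coding the path would be to derive the search directories from an environment variable inside setup.py. The following is only a sketch: PROTOBUF_PREFIX is a hypothetical variable name invented for this example, not something imposm.parser's setup.py reads, and protobuf_search_dirs is likewise an illustrative helper.

```python
import os


def protobuf_search_dirs(environ=None, default_prefix="/usr/local"):
    """Build Extension keyword arguments that point at a protobuf install.

    PROTOBUF_PREFIX is a hypothetical variable chosen for this sketch.
    """
    environ = os.environ if environ is None else environ
    prefix = environ.get("PROTOBUF_PREFIX", default_prefix)
    return {
        "include_dirs": [os.path.join(prefix, "include")],
        "library_dirs": [os.path.join(prefix, "lib")],
    }


# In setup.py the result could be passed on to the extension, for example:
#   Extension("imposm.parser.pbf.OSMPBF",
#             ["imposm/parser/pbf/osm.cc", "imposm/parser/pbf/osm.pb.cc"],
#             libraries=["protobuf"],
#             **protobuf_search_dirs())
dirs = protobuf_search_dirs({"PROTOBUF_PREFIX": "/opt/protobuf"})
```

Both include_dirs and library_dirs are standard Extension arguments, so the header path and the linker path stay in sync.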
Hello,
first, thanks for your great software!
But with the current release I have a problem: sometimes (in about 1 of 10 tries) the parsing process hangs. You can reproduce the hang with the following script:
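import sys
from imposm.parser import OSMParser

class OSM(object):
    def check_relation(self, relation):
        sys.stdout.write(".")
        sys.stdout.flush()

# infile (the path of the OSM file to parse) is not defined in the report
osm = OSM()
print "get routes"
p = OSMParser(concurrency=4, relations_callback=osm.check_relation)
p.parse(infile)
print "done"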
it feels like sometimes a thread does not finish processing. i tried it with:
thanks for your help!
max.
Hello,
I have found an issue with the parser recently. When the input file does not have <node> tags inside of it, the parse result is incorrect.
Sample code:
#!/usr/bin/env python
from imposm.parser import OSMParser

class ParseCallback(object):
    def nodes(self, nodes):
        print ' nodes: %d' % len(nodes)

    def ways(self, ways):
        print ' ways: %d' % len(ways)

    def relations(self, relations):
        print ' relations: %d' % len(relations)

    def coords(self, coords):
        print ' coords: %d' % len(coords)

handler = ParseCallback()
parser = OSMParser(concurrency=1,
                   ways_callback=handler.ways,
                   relations_callback=handler.relations,
                   nodes_callback=handler.nodes,
                   coords_callback=handler.coords)
for file in ['Malacca.theme-parks.osm', 'Malacca.theme-parks.2.osm']:
    print file
    parser.parse(file)
Sample file 1 (Malacca.theme-parks.osm):
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2016-08-24T13:26:03Z" areas="2016-08-23T01:33:02Z"/>
<way id="170830704">
<nd ref="1819670323"/>
<nd ref="1819670381"/>
<nd ref="1819670388"/>
<nd ref="1819670409"/>
<nd ref="1819670318"/>
<nd ref="1819670351"/>
<nd ref="1819670347"/>
<nd ref="1819670376"/>
<nd ref="1819670323"/>
<tag k="tourism" v="theme_park"/>
</way>
<node id="1819670323" lat="2.2009914" lon="102.2490477"/>
<node id="1819670376" lat="2.2008949" lon="102.2491738"/>
<node id="1819670318" lat="2.2009512" lon="102.2495493"/>
<node id="1819670347" lat="2.2008600" lon="102.2492811"/>
<node id="1819670351" lat="2.2008846" lon="102.2494478"/>
<node id="1819670381" lat="2.2012250" lon="102.2495819"/>
<node id="1819670388" lat="2.2011070" lon="102.2496409"/>
<node id="1819670409" lat="2.2010235" lon="102.2496217"/>
</osm>
Sample file 2 (Malacca.theme-parks.2.osm, as you can see the dummy node is added here):
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2016-08-24T13:26:03Z" areas="2016-08-23T01:33:02Z"/>
<node id="0" lat="0" lon="0"/>
<way id="170830704">
<nd ref="1819670323"/>
<nd ref="1819670381"/>
<nd ref="1819670388"/>
<nd ref="1819670409"/>
<nd ref="1819670318"/>
<nd ref="1819670351"/>
<nd ref="1819670347"/>
<nd ref="1819670376"/>
<nd ref="1819670323"/>
<tag k="tourism" v="theme_park"/>
</way>
<node id="1819670323" lat="2.2009914" lon="102.2490477"/>
<node id="1819670376" lat="2.2008949" lon="102.2491738"/>
<node id="1819670318" lat="2.2009512" lon="102.2495493"/>
<node id="1819670347" lat="2.2008600" lon="102.2492811"/>
<node id="1819670351" lat="2.2008846" lon="102.2494478"/>
<node id="1819670381" lat="2.2012250" lon="102.2495819"/>
<node id="1819670388" lat="2.2011070" lon="102.2496409"/>
<node id="1819670409" lat="2.2010235" lon="102.2496217"/>
</osm>
Results:
Malacca.theme-parks.osm
coords: 7
coords: 1
ways: 0
nodes: 0
relations: 0
Malacca.theme-parks.2.osm
coords: 8
coords: 1
ways: 1
nodes: 0
relations: 0
As you can see, the way from the first file is not parsed.
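In both sample files the way appears before its nodes, and only the file with a leading dummy node is parsed correctly. A possible workaround, assuming the parser expects the conventional node, way, relation order (an assumption this report does not confirm), is to reorder the elements before parsing. The sketch uses only the standard library; reorder_osm is a hypothetical helper and it loads the whole document into memory, so it only suits small extracts.

```python
import xml.etree.ElementTree as ET

# Conventional element order in OSM XML. Any other element (for example
# <note> or <meta>) keeps rank 0 and therefore stays at the top.
_ORDER = {"node": 1, "way": 2, "relation": 3}


def reorder_osm(xml_text):
    """Return the OSM XML with nodes first, then ways, then relations."""
    root = ET.fromstring(xml_text)
    children = list(root)
    # list.sort is stable, so elements of the same kind keep their order.
    children.sort(key=lambda el: _ORDER.get(el.tag, 0))
    root[:] = children
    return ET.tostring(root, encoding="unicode")


sample = (
    '<osm version="0.6">'
    '<meta osm_base="2016-08-24T13:26:03Z"/>'
    '<way id="170830704"><nd ref="1819670323"/></way>'
    '<node id="1819670323" lat="2.2009914" lon="102.2490477"/>'
    '</osm>'
)
reordered = reorder_osm(sample)
```

The reordered text can then be written to a temporary file and handed to the parser in place of the original.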
It would be great to add Python 3 support. I think supporting 3.3+ would be enough.
I am using python 2.7 and I am trying to pip install imposm.parser.
The first thing I did was to install protobuf with pip install protobuf. That installed version 2.6.1 in C:\Python27\Lib\site-packages\google. Then I went here to download protoc-2.6.1-build2-windows-x86_32.exe which I renamed to protoc.exe and placed in C:\Python27\Lib\site-packages\google folder.
I then installed Microsoft Visual C++ 9.0 from here
Finally, I executed pip install imposm.parser and this is the error I get:
running build_ext
building 'imposm.parser.pbf.OSMPBF' extension
creating build\temp.win-amd64-2.7
creating build\temp.win-amd64-2.7\Release
creating build\temp.win-amd64-2.7\Release\imposm
creating build\temp.win-amd64-2.7\Release\imposm\parser
creating build\temp.win-amd64-2.7\Release\imposm\parser\pbf
C:\Users\Shiro\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -C:\Python27\include -IC:\Python27\PC /Tpimposm/parser/pbf/osm.cc Fobuild\temp.win-amd64-2.7\Release\imposm/parser/pbf/osm.obj
osm.cc
C:\Users\Shiro\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
c:\users\shiro\appdata\local\temp\pip-build-jel4jl\imposm.parser\imposm\parser\pbf\osm.pb.h(9) : fatal error C1083: Cannot open include file: 'google/protobuf/stubs/common.h': No such file or directory
error: command '"C:\Users\Shiro\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe"' failed with exit status 2
Edit: The header file reported as missing belongs to protobuf, which I just installed with pip install protobuf. The missing file is right there in the GitHub repository (google/protobuf/stubs/common.h). Shouldn't it exist on my system since I installed protobuf with pip? I don't understand...
Any plans of porting to python 3?
With a simple script, I can't collect all the nodes present in a .osm file returned from the standard osm API (0.6).
# -*- coding: utf-8 *-*
from imposm.parser import OSMParser

# simple class that handles the parsed OSM data.
class HighwayCounter(object):
    highways = 0

    def ways(self, ways):
        # callback method for ways
        for osmid, tags, refs in ways:
            self.highways += 1

class NodesCounter(object):
    n = 0

    def nodes(self, nodes):
        # callback method for nodes
        for osmid, tags, coords in nodes:
            self.n += 1

# instantiate counter and parser and start parsing
counter = HighwayCounter()
ncounter = NodesCounter()
p = OSMParser(concurrency=4, ways_callback=counter.ways,
              nodes_callback=ncounter.nodes)
p.parse('map.osm')

# done
print counter.highways
print ncounter.n
I have tested with various files, but the result is the same...
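The docstring quoted earlier on this page states that nodes with empty tags are not returned by the nodes callback, which may be why the node count looks too low here. A sketch of counting every node position through coords_callback instead follows; ElementCounter and count_file are hypothetical names, and the assumption that coords_callback receives one entry per node, tagged or not, is not confirmed in this report.

```python
class ElementCounter(object):
    """Counts the elements handed to the parser callbacks, batch by batch."""

    def __init__(self):
        self.coords = 0        # node positions, assumed to include untagged nodes
        self.tagged_nodes = 0  # nodes delivered through nodes_callback
        self.ways = 0

    def on_coords(self, coords):
        self.coords += len(coords)

    def on_nodes(self, nodes):
        self.tagged_nodes += len(nodes)

    def on_ways(self, ways):
        self.ways += len(ways)


def count_file(path):
    # Imported here so the counter can be exercised without imposm.parser.
    from imposm.parser import OSMParser
    counter = ElementCounter()
    OSMParser(concurrency=4,
              coords_callback=counter.on_coords,
              nodes_callback=counter.on_nodes,
              ways_callback=counter.on_ways).parse(path)
    return counter


# Synthetic batches standing in for what the parser would deliver.
demo = ElementCounter()
demo.on_coords([(1, 0.0, 0.0), (2, 1.0, 1.0), (3, 2.0, 2.0)])
demo.on_nodes([(1, {'amenity': 'cafe'}, (0.0, 0.0))])
demo.on_ways([(10, {'highway': 'residential'}, [1, 2, 3])])
```

Comparing coords with tagged_nodes after a run of count_file('map.osm') would show how many nodes carry no tags.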
I get this error
"Setup script exited with error: can't copy 'imposm\parser/pbf\osm.pb.cc': doesn't exist or not a regular file"
when installing imposm.parser with easy_install.
I have protobuf installed with pip, the win32 binaries of protobuf, the mingw build from pythonxy, and a correct PATH for the compiler and the protobuf binary.
Wonder if someone can help. I believe there are some incompatibilities with latest version of gcc to compile this library but didn't find much information ...
Using gcc-c++ version 4.8.3
When doing the pip install a chain of errors fills the screen but the very beginning seems to be
writing manifest file 'imposm.parser.egg-info/SOURCES.txt'
warning: manifest_maker: standard file '-c' not found
...
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/anaconda/include/python3.5m -c imposm/parser/pbf/osm.pb.cc -o build/temp.linux-x86_64-3.5/imposm/parser/pbf/osm.pb.o
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ [enabled by default]
...
imposm/parser/pbf/osm.pb.cc:586:24: error: expected '<' before '<:' token
if (static_cast<::google::protobuf::uint8>(tag) ==
^
imposm/parser/pbf/osm.pb.cc:586:24: error: expected type-specifier before '<:' token
imposm/parser/pbf/osm.pb.cc:586:24: error: expected '>' before '<:' token
:
any ideas where to look for the problem ? thanks