OSM Extract Polygon

License: MIT

Changelog

Contributors: andgem, davetha, morandd

v. 0.5.0:

  • added administrative level information to the geojson output

Description

This small and simple tool processes OSM pbf files to generate boundary polygons.

The main question it answers is: How do I extract the polygon of an administrative boundary?

In particular it looks for administrative boundaries (e.g., city boundaries, country boundaries, ...) and creates an output file per boundary that is in the Osmosis Polygon format.

Since version 0.3.0 it also supports GeoJSON as an output format.
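To make the Osmosis Polygon format more concrete, here is a minimal Python sketch of a reader for it. It assumes the commonly documented layout: a name on the first line, then one or more sections that each start with a section name and end with `END`, and a final `END` that closes the file. The function name `parse_poly` and the sample coordinates are illustrative and not part of this tool; the sketch also assumes well-formed input.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]


def parse_poly(text: str) -> Tuple[str, Dict[str, Ring]]:
    """Parse Osmosis polygon text into (name, {section_name: ring})."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    name = lines[0]  # first line: name of the polygon file
    sections: Dict[str, Ring] = {}
    i = 1
    while i < len(lines) and lines[i] != "END":  # the final END closes the file
        section = lines[i]  # e.g. "1"; a leading "!" marks a hole
        i += 1
        ring: Ring = []
        while lines[i] != "END":  # every section ends with its own END
            lon, lat = lines[i].split()[:2]  # longitude first, then latitude
            ring.append((float(lon), float(lat)))
            i += 1
        sections[section] = ring
        i += 1  # skip the END of this section
    return name, sections


# Illustrative sample, not real tool output.
sample = """\
Blankenfelde-Mahlow
1
   13.441906929016113   52.3632698059082
   13.440044403076172   52.363494873046875
   13.437420845031738   52.36367416381836
   13.441906929016113   52.3632698059082
END
END
"""

name, sections = parse_poly(sample)
print(name, len(sections["1"]))
```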

Download

Just head over to the Releases and grab the version for your operating system (macOS, Linux, and Windows supported).

Usage

Extracts administrative boundaries of OSM pbf files and produces polygon files compatible with Osmosis.

USAGE:
    osm_extract_polygon [FLAGS] [OPTIONS] --file <filename>

FLAGS:
    -g, --geojson      set this flag to generate geojson output
    -o, --overwrite    set this flag to overwrite files without asking; if neither this nor --skip is set the user is
                       being prompted should a file be overwritten.
    -s, --skip         set this flag to skip overwriting files; if neither this nor --overwrite is set the user is being
                       prompted should a file be overwritten.
    -h, --help         Prints help information
    -V, --version      Prints version information

OPTIONS:
    -f, --file <filename>          input file
    -x, --max <max_admin_level>    max administrative level (can take value from 1-11) [default: 8]
    -m, --min <min_admin_level>    minimum administrative level (can take value from 1-11) [default: 8]
    -p, --path <path>              path to which the output will be saved to [default: '<input_filename>_polygons/']
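
If the tool is called from a script, the options above can be assembled programmatically. The following Python sketch builds an argument list using only the flags from the usage text; the helper name `build_command` and the default binary location `./osm_extract_polygon` are assumptions made for illustration.

```python
from typing import List, Optional


def build_command(
    input_file: str,
    min_level: int = 8,
    max_level: int = 8,
    geojson: bool = False,
    overwrite: bool = False,
    skip: bool = False,
    path: Optional[str] = None,
    binary: str = "./osm_extract_polygon",  # assumed location of the binary
) -> List[str]:
    """Assemble an argument list from the flags listed in the usage text."""
    for level in (min_level, max_level):
        if not 1 <= level <= 11:
            raise ValueError("administrative levels must be in the range 1-11")
    if overwrite and skip:
        # The usage text describes --overwrite and --skip as alternatives.
        raise ValueError("cannot set both --overwrite and --skip")

    command = [binary, "--file", input_file,
               "--min", str(min_level), "--max", str(max_level)]
    if geojson:
        command.append("--geojson")
    if overwrite:
        command.append("--overwrite")
    if skip:
        command.append("--skip")
    if path is not None:
        command += ["--path", path]
    return command


print(build_command("karlsruhe-regbez-latest.osm.pbf", geojson=True, overwrite=True))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the binary is available at the assumed location.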

Example 1 - Simple use case

osm_extract_polygon -f karlsruhe-regbez-latest.osm.pbf

The program will create a folder <INPUT_PBF_FILE>_polygons/ in the same folder as the input file. For each administrative boundary it finds and extracts, this folder contains a .poly file. The name of the file is the name of the administrative boundary relation, potentially preceded by a prefix defined in the relation under the tag name:prefix.

Should more than one administrative boundary result in the same name, then, to avoid overwriting files, the filenames will get a postfix that corresponds to the ID of the relation the administrative boundary is based on. For example, processing data of Spain can result in the following three files: Vimianzo_12532173.poly, Vimianzo_348941.poly, Vimianzo_9482766.poly. Here 12532173, 348941, and 9482766 are the relation IDs mentioned above.
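
The naming rule described above can be sketched in Python as follows. This is not the tool's actual implementation: the helper names `base_name` and `poly_filenames` are hypothetical, joining prefix and name with an underscore is inferred from the example file name Stadt_Karlsruhe.poly, and the relation ID used for Karlsruhe is a placeholder.

```python
from collections import Counter
from typing import Dict, List, Tuple


def base_name(tags: Dict[str, str]) -> str:
    """Join the optional name:prefix tag and the name tag with an underscore."""
    parts = [tags[key] for key in ("name:prefix", "name") if tags.get(key)]
    return "_".join(parts)


def poly_filenames(relations: List[Tuple[int, Dict[str, str]]]) -> List[str]:
    """Return one file name per relation; colliding names get the relation ID."""
    counts = Counter(base_name(tags) for _, tags in relations)
    filenames = []
    for relation_id, tags in relations:
        base = base_name(tags)
        if counts[base] > 1:  # the same name occurs more than once
            base = f"{base}_{relation_id}"
        filenames.append(base + ".poly")
    return filenames


relations = [
    (12532173, {"name": "Vimianzo"}),
    (348941, {"name": "Vimianzo"}),
    (9482766, {"name": "Vimianzo"}),
    (1, {"name": "Karlsruhe", "name:prefix": "Stadt"}),  # placeholder relation ID
]
print(poly_filenames(relations))
```

Because the suffix is taken from the relation ID rather than from processing order, the same input always yields the same file names.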

For more information about the meaning of the minimum and maximum administrative level, see the OSM Wiki.

Example 2 - GeoJSON Output

In the next example we will create matching GeoJSON files in addition to the *.poly output. We do this by passing the command line parameter --geojson (or its short form -g) to the program.

./osm_extract_polygon -f berlin-latest.osm.pbf --geojson -o

This should create additional GeoJSON files in the subfolder berlin-latest.osm.pbf_polygons/. Note that we have also passed the parameter -o, which instructs the program to overwrite existing files in this folder without asking.

Example GeoJSON file in the output:

{
  "geometry": {
    "coordinates": [
      [
        [
          [
            13.441906929016113,
            52.3632698059082
          ],
          [
            13.440044403076172,
            52.363494873046875
          ],
          [
            13.437420845031738,
            52.36367416381836
          ],
          [
            13.437135696411133,
            52.36361312866211
          ],
          [
            13.436691284179688,
            52.36356735229492
          ],
          ...
        ]
      ]
    ],
    "type": "Polygon"
  },
  "properties": {
    "name": "Blankenfelde-Mahlow"
  },
  "type": "Feature"
}
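
A quick way to inspect the generated files is to read them with Python's standard `json` module. The sketch below collects the `name` property of every feature in the output folder. It assumes that the GeoJSON files use the extension `.geojson` and that each file holds a single Feature as in the example above; the function name `boundary_names` is illustrative.

```python
import json
from pathlib import Path
from typing import Dict


def boundary_names(folder: str) -> Dict[str, str]:
    """Map each *.geojson file name in `folder` to its "name" property."""
    names: Dict[str, str] = {}
    for path in sorted(Path(folder).glob("*.geojson")):  # assumed extension
        with path.open(encoding="utf-8") as handle:
            feature = json.load(handle)
        names[path.name] = feature.get("properties", {}).get("name", "")
    return names


if __name__ == "__main__":
    # Folder name taken from Example 2; the result is empty if it does not exist.
    print(boundary_names("berlin-latest.osm.pbf_polygons"))
```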

Use Case: Extracting a smaller OSM file of a city

Assume you want a small OSM file of a single city. The problem you might face is that the smallest file you can download is still very large. The tool Osmosis can extract parts of an OSM file when supplied with an Osmosis polygon file, but you don't have such a file (and creating one manually is burdensome).

In this example I will explain how to solve this problem for the city of Karlsruhe, Germany.

Preparation

  1. Get the newest release of osm_extract_polygon from the release page.
  2. Install Osmosis
  3. Obtain an OSM pbf file that contains Karlsruhe: Go to Geofabrik and download Karlsruhe Regierungsbezirk.

Execution

  1. Run osm_extract_polygon:
./osm_extract_polygon -f karlsruhe-regbez-latest.osm.pbf
  2. Verify that the program ran: a few hundred small *.poly files should now be in the folder karlsruhe-regbez-latest.osm.pbf_polygons/. The file you are interested in is Stadt_Karlsruhe.poly.
  3. Run Osmosis:
osmosis --read-pbf file="karlsruhe-regbez-latest.osm.pbf" --bounding-polygon file="karlsruhe-regbez-latest.osm.pbf_polygons/Stadt_Karlsruhe.poly" --write-xml file="karlsruhe.osm"

The output osm file you are interested in is karlsruhe.osm.
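
To repeat the Osmosis step for other boundaries, the command can be generated from the input file name and the boundary name. The Python sketch below only builds the argument list and mirrors the Osmosis call shown above; the helper name `osmosis_clip_command` is hypothetical, and it assumes the default output folder <input_filename>_polygons/ was used.

```python
from typing import List


def osmosis_clip_command(pbf: str, boundary: str, output: str) -> List[str]:
    """Build the Osmosis call that clips `pbf` to the boundary's .poly file."""
    # Default output folder of osm_extract_polygon: <input_filename>_polygons/
    poly_file = f"{pbf}_polygons/{boundary}.poly"
    return [
        "osmosis",
        "--read-pbf", f"file={pbf}",
        "--bounding-polygon", f"file={poly_file}",
        "--write-xml", f"file={output}",
    ]


command = osmosis_clip_command(
    "karlsruhe-regbez-latest.osm.pbf", "Stadt_Karlsruhe", "karlsruhe.osm"
)
print(" ".join(command))
```

With Osmosis installed, the list can be executed with `subprocess.run(command, check=True)`.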

osm_extract_polygon's Issues

Feature request: export to both .geojson and .poly

While it is possible to use other tools (like python2geojson.py) to convert .poly files to .geojson, it would be a nice feature for this tool if it is easy to add. Both output formats are useful: .poly for masking with osmium and .geojson for all other uses.

problem extracting admin level 3 - 4 within a country from country-specific pbf

I would like to extract admin levels within a specific country (e.g. USA) given its pbf file. The problem is that the output of osm_extract_polygon includes admin levels from neighboring countries as well (e.g. Canada). Is this a bug, or am I not using the tool properly?
Example:

osm_extract_polygon -m 3 -x 6 us-latest.osm.pbf
The output includes "quebec" but does not include Alaska.

Not working

╰─$ ./osm_extract_polygon -f kazakhstan-latest.osm.pbf
error: cannot set both -o (--overwrite) and -s (--skip)!

Admin Levels

Hey!
I stumbled upon your project a week or so ago. I had a use case where I needed admin level and place in the geo properties. I noticed while I was adding the functionality in that you had a "TODO: add admin level for boundaries".

Do you want it to be an optional flag to add admin level and/or place, or are you fine with it being always on? My code with it always on is a fork located here: https://github.com/davetha/osm_extract_polygon I can alter it if needed and do a pull request whenever I hear from you :)

Another feature I added that should likely be a CLI flag is tacking the relation ID onto the file name. When you're dealing with larger OSM files and many admin levels, you'll often run into conflicting names... As I was typing this, I saw #33 so maybe no longer an issue :)

Tutorial

Hello! I am working for a World Bank project. I think I would greatly benefit from using this package, but honestly I wasn't able to fully follow the instructions. If you have a chance to provide an example I could walk through, I would greatly appreciate it. Thanks!

Two regions at same admin_level with same name

The "name" property of the admin_level relation is, unfortunately, not always unique. In Spain for example we find that at admin_level 8 the following two towns have the same 'name' property. (Parenthetically, I note some other data quality issues in the Relations, like the property "population" is defined twice, once as a number and then again as an object with subproperties.)

  1. Google Maps Name: 28939 Arroyomolinos, Madrid, Spain
    OSM name: "Arroyomolinos"
    OSM Relation ID: "relation/341636"
  2. Google Maps Name: Arroyomolinos, Cáceres, Cáceres, Spain
    OSM name: "Arroyomolinos"
    OSM Relation ID: "relation/1864624"

So when I run this tool for Spain at admin_level=8, the first Arroyomolinos.poly gets overwritten by the second and is thus lost.

A solution could be to suffix either "_1", "_2", etc, or the @id property (like "relation/1864624") to nonunique output .poly filenames. Personally I'd opt for the former suffixing style.

Thanks for this great tool!

"_1", "_2", etc suffixes are nondeterministic

It seems that the "_1", "_2", etc suffixing added to regions with identical names is nondeterministic. That is, if you run this on a region like Peru the regions Amazonas_1, Amazonas_2, and Amazonas_3 can refer to different regions on repeated runs. Likewise, if two different people run this program they may get different Amazonas_1 regions. I presume this is an artifact resulting from parallelism.

To fix this, perhaps instead of using the _1, _2 etc suffix, use the Relation ID or Way ID instead. That will be both unique and deterministic.

configurable output directory

Small feature request: could you make the output directory configurable via the command line?

If this were configurable, this utility would be thread-safe. To illustrate, I am using this to extract admin boundaries from national OSM files by running it several times with min=7 max=7, min=6 max=6, etc. But since the output directory is always <input_filename>_polygons/, these commands must be run in sequence, when logically they could be run in parallel, using GNU parallel for example.
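
For reference, the usage text lists a -p/--path option, so each run can be given its own output folder. The following Python sketch builds one command per admin level with a distinct folder and shows how the commands could be run concurrently. The folder naming scheme and the helper names are hypothetical, and the sketch assumes that separate output folders are sufficient for the runs not to interfere with each other.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def per_level_commands(input_file: str, levels: Iterable[int]) -> List[List[str]]:
    """One command per admin level, each writing to its own output folder."""
    commands = []
    for level in levels:
        out_dir = f"{input_file}_level_{level}_polygons/"  # hypothetical naming
        commands.append([
            "./osm_extract_polygon",  # assumed location of the binary
            "--file", input_file,
            "--min", str(level), "--max", str(level),
            "--path", out_dir,
            "--overwrite",  # so that no process waits at an overwrite prompt
        ])
    return commands


def run_all(commands: List[List[str]], workers: int = 4) -> None:
    """Run the commands concurrently; raises if any of them fails."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))


if __name__ == "__main__":
    for cmd in per_level_commands("spain-latest.osm.pbf", [6, 7]):
        print(" ".join(cmd))  # call run_all(...) instead to execute them
```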

Feature request: clip to bounding .poly

As raised in this issue [https://github.com//issues/9], it is often the case that the .pbf file for a given country includes admin_level boundaries from neighboring countries.

This leads to a case where, for example, the norway.pbf file contains the complete polygon for several Swedish kommunes. If using this tool for multi-country applications we then have a situation where the Swedish kommune is then defined twice: once in Sweden and a second time in Norway. This is an issue when trying to visualize the results, since the two polygons overlap.

A nice feature would be if osm_extract_polygon could also accept a .poly boundary and clip all output regions so they fit inside that boundary.

I have no idea how hard/easy this is to implement. But I thought it's worth at least clearly stating the issue.

AndGem, thank you for the fast fix on my other Issue. You can reply to or close these two issues as you see fit. I just wanted to mention these two feature ideas. I think your tool is basically the only tool for extracting .poly files from OSM so it carries a lot of weight :)
