osmlab / atlas-generator
Distributed generation of Atlas shards
License: BSD 3-Clause "New" or "Revised" License
Readme.md contains the following instruction:
Downloads the country boundaries and the sharding tree files from the respective sub-folders available here ( https://apple.ent.box.com/s/3k3wcc0lq1fhqgozxr4mdi0llf95byo3 ).
When I tried to reach that location, I received the following error:
"This shared file or folder link has been removed or is unavailable to you."
Are the boundary files and the sharding tree available at a different location?
Currently, when using boundary.txt.gz to generate Atlas files, boundaries are supported only if they are POLYGONs. It would be great to have support for MULTIPOLYGONs as well, since most boundaries contain multiple polygons.
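Until MULTIPOLYGON is supported, one possible workaround is to split each MULTIPOLYGON into separate POLYGON entries before writing the boundary file. The following is a minimal sketch in plain Java that works on a single 2D WKT string only. The class name MultiPolygonSplitter and the method toPolygons are hypothetical and are not part of atlas or atlas-generator, and the sketch makes no assumption about the rest of the boundary file layout. With JTS on the classpath, iterating with Geometry.getNumGeometries() and Geometry.getGeometryN(i) would likely be a more robust route.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of atlas-generator.
public class MultiPolygonSplitter {

    private static boolean startsWithIgnoreCase(String text, String prefix) {
        return text.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    /**
     * Returns one POLYGON WKT string per member of a MULTIPOLYGON.
     * A POLYGON input is returned unchanged as a single-element list.
     */
    public static List<String> toPolygons(String wkt) {
        String text = wkt.trim();
        List<String> polygons = new ArrayList<>();
        if (startsWithIgnoreCase(text, "POLYGON")) {
            polygons.add(text);
            return polygons;
        }
        if (!startsWithIgnoreCase(text, "MULTIPOLYGON")) {
            throw new IllegalArgumentException("Expected POLYGON or MULTIPOLYGON WKT");
        }
        int open = text.indexOf('(');
        if (open < 0) {
            // For example "MULTIPOLYGON EMPTY": nothing to split.
            return polygons;
        }
        int depth = 0;
        int start = -1;
        for (int i = open; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
                if (depth == 2) {
                    // Depth 1 is the MULTIPOLYGON wrapper, depth 2 opens one polygon.
                    start = i;
                }
            } else if (c == ')') {
                if (depth == 2) {
                    polygons.add("POLYGON " + text.substring(start, i + 1));
                }
                depth--;
            }
        }
        return polygons;
    }
}
```

Each returned string can then be written as its own boundary entry for the same country, assuming the reader accepts several POLYGON entries per country.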
Right now, downloadBoundaries gets "https://dl.dropboxusercontent.com/s/5a5q6tro5lx4m07/world_boundaries_osm_20171013.txt.gz". This is (a) out of date and (b) in the wrong format.
When using ./gradlew run, the task errors out with the following:
org.openstreetmap.atlas.exception.CoreException: Job Atlas Generator failed.
at org.openstreetmap.atlas.generator.tools.spark.SparkJob.onRun(SparkJob.java:204)
at org.openstreetmap.atlas.utilities.runtime.Command.execute(Command.java:338)
at org.openstreetmap.atlas.utilities.runtime.Command.run(Command.java:282)
at org.openstreetmap.atlas.generator.AtlasGenerator.main(AtlasGenerator.java:63)
Caused by: org.openstreetmap.atlas.exception.CoreException: Invalid country boundary text file format.
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.readFromPlainText(CountryBoundaryMap.java:730)
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.fromPlainText(CountryBoundaryMap.java:189)
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMapArchiver.read(CountryBoundaryMapArchiver.java:69)
at org.openstreetmap.atlas.generator.AtlasGenerator.boundaries(AtlasGenerator.java:393)
at org.openstreetmap.atlas.generator.AtlasGenerator.start(AtlasGenerator.java:157)
at org.openstreetmap.atlas.generator.tools.spark.SparkJob.onRun(SparkJob.java:197)
... 3 more
Caused by: java.lang.ClassCastException: class org.locationtech.jts.geom.MultiPolygon cannot be cast to class org.locationtech.jts.geom.Polygon (org.locationtech.jts.geom.MultiPolygon and org.locationtech.jts.geom.Polygon are in unnamed module of loader 'app')
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.readFromPlainText(CountryBoundaryMap.java:723)
... 8 more
I cloned the repo and ran "./gradlew clean build", which succeeded. Then I ran "./gradlew clean run", and it fails:
2022-10-17 14:51:37 ERROR [main] Command:286 - Command execution failed.
org.openstreetmap.atlas.exception.CoreException: Job Atlas Generator failed.
at org.openstreetmap.atlas.generator.tools.spark.SparkJob.onRun(SparkJob.java:204)
at org.openstreetmap.atlas.utilities.runtime.Command.execute(Command.java:338)
at org.openstreetmap.atlas.utilities.runtime.Command.run(Command.java:282)
at org.openstreetmap.atlas.generator.AtlasGenerator.main(AtlasGenerator.java:63)
Caused by: org.openstreetmap.atlas.exception.CoreException: Invalid country boundary text file format.
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.readFromPlainText(CountryBoundaryMap.java:730)
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.fromPlainText(CountryBoundaryMap.java:189)
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMapArchiver.read(CountryBoundaryMapArchiver.java:69)
at org.openstreetmap.atlas.generator.AtlasGenerator.boundaries(AtlasGenerator.java:393)
at org.openstreetmap.atlas.generator.AtlasGenerator.start(AtlasGenerator.java:157)
at org.openstreetmap.atlas.generator.tools.spark.SparkJob.onRun(SparkJob.java:197)
... 3 more
Caused by: java.lang.ClassCastException: class org.locationtech.jts.geom.MultiPolygon cannot be cast to class org.locationtech.jts.geom.Polygon (org.locationtech.jts.geom.MultiPolygon and org.locationtech.jts.geom.Polygon are in unnamed module of loader 'app')
at org.openstreetmap.atlas.geography.boundary.CountryBoundaryMap.readFromPlainText(CountryBoundaryMap.java:723)
... 8 more
java version info:
java --version
openjdk 11.0.11 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)
Currently, we're passing the entire CountryBoundaryMap from the driver to the executors, which requires the entire thing to be serialized. There might be a better way to do this: pass only the boundaries being sliced, or only the boundaries requested in the job parameter.
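As a rough illustration of the second option, the sketch below filters a boundary lookup down to the requested countries before anything is shipped to the workers. It is plain Java with stand-in types: the real CountryBoundaryMap is not a java.util.Map, and the class name BoundarySubset, the method forCountries, and the use of WKT strings as values are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: boundaries keyed by ISO country code, values as WKT strings.
public class BoundarySubset {

    /**
     * Returns a new map holding only the boundaries of the requested countries.
     * Countries with no known boundary are skipped.
     */
    public static Map<String, List<String>> forCountries(
            Map<String, List<String>> allBoundaries, Set<String> requestedCountries) {
        Map<String, List<String>> subset = new HashMap<>();
        for (String country : requestedCountries) {
            List<String> boundaries = allBoundaries.get(country);
            if (boundaries != null) {
                subset.put(country, boundaries);
            }
        }
        return subset;
    }
}
```

Only the smaller subset would then need to be serialized, which is the saving this issue is after.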
In the AtlasGenerator job, calculating statistics is a separate (optional) stage.
The idea is to replace this with a Spark Double Accumulator earlier in the flow. The Accumulator can still support the custom AtlasStatistics class and be optional.
This might improve the overall runtime of the statistics portion, as it would be done inline with Atlas creation. However, I don't have data to back up this assumption. Writing a task in case there is interest in streamlining this portion of the job.
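To make the idea concrete, here is a plain-Java stand-in for the add and merge operations that a Spark accumulator is built around: each task adds values locally and the partial results are merged afterwards. The class name StatisticsAccumulator and the string keys are hypothetical. A real implementation would presumably extend Spark's AccumulatorV2 and carry AtlasStatistics instead of a map of doubles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an accumulator; it does not depend on Spark.
public class StatisticsAccumulator {

    private final Map<String, Double> totals = new HashMap<>();

    /** Records one value for a statistic, called inline while an Atlas is built. */
    public void add(String statistic, double value) {
        totals.merge(statistic, value, Double::sum);
    }

    /** Folds the partial totals of another accumulator into this one. */
    public void merge(StatisticsAccumulator other) {
        other.totals.forEach((statistic, value) -> totals.merge(statistic, value, Double::sum));
    }

    /** Returns a copy of the current totals. */
    public Map<String, Double> value() {
        return new HashMap<>(totals);
    }
}
```

Because merging is a plain sum per key, the order in which partial results arrive does not matter, which is the property an accumulator relies on.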
I get a generation failure in RUS locally on my system with the following sharded pbf files.
21/08/12 16:06:08 ERROR Executor: Exception in task 682.3 in stage 4.0 (TID 8766)
org.openstreetmap.atlas.exception.CoreException: Error during task Way Sectioned Atlas Creation for shard RUS_10-615-346 :
at org.openstreetmap.atlas.generator.AtlasGeneratorHelper.lambda$sectionAtlas$2f38d685$1(AtlasGeneratorHelper.java:431)
at org.apache.spark.api.java.JavaPairRDD$.$anonfun$pairFunToScalaFun$1(JavaPairRDD.scala:1044)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:222)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1371)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1298)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1362)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1186)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:360)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:311)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.openstreetmap.atlas.exception.CoreException: Couldn't find node at POINT (36.9266202 50.5129055) while sectioning Line [Line: id=362729841000000, polyLine=LINESTRING (36.9266202 50.5129055, 36.9259448 50.5128815, 36.9234523 50.512998, 36.9055813 50.5135254, 36.9048755 50.5135274), [Tags: [last_edit_user_name => Vorrutyer_bak], [last_edit_changeset => 32926830], [last_edit_time => 1438076138000], [last_edit_user_id => 622109], [iso_country_code => RUS], [highway => track], [last_edit_version => 1]]] for Atlas 10-615-346
at org.openstreetmap.atlas.geography.atlas.raw.sectioning.AtlasSectionProcessor.createNode(AtlasSectionProcessor.java:397)
at org.openstreetmap.atlas.geography.atlas.raw.sectioning.AtlasSectionProcessor.createEdge(AtlasSectionProcessor.java:329)
at org.openstreetmap.atlas.geography.atlas.raw.sectioning.AtlasSectionProcessor.createSections(AtlasSectionProcessor.java:454)
at org.openstreetmap.atlas.geography.atlas.raw.sectioning.AtlasSectionProcessor.section(AtlasSectionProcessor.java:663)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.openstreetmap.atlas.geography.atlas.raw.sectioning.AtlasSectionProcessor.run(AtlasSectionProcessor.java:186)
at org.openstreetmap.atlas.generator.AtlasGeneratorHelper.lambda$sectionAtlas$2f38d685$1(AtlasGeneratorHelper.java:426)
... 18 more
The pbf files I used in generating atlas files for this are:
And this sharding file:
sharding.txt
After copying the sharding txt file and pbf files to the correct folder, I just changed the countries in the gradle.properties file to RUS
generator.local.countries=RUS
and then ran
./gradlew run
I wasn't able to reproduce it when I included only the shard that was showing the issue. By including the shards around that shard, I was able to reproduce it.
As a note:
It looks like way 362729841 crosses over the corner of 10-616-345 but has no nodes in that shard. Shard 10-616-345 is kitty-corner to the shard that sees the issue (10-615-346). This way is not part of any relation and is not close to a boundary, so I'm not sure that the outstanding PR will address this issue. My guess is that there is an edge case somewhere when scanning for ways in a neighboring shard. Maybe things break if a way crosses a surrounding shard but has no nodes in any surrounding shard?
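The geometric situation described here can be checked with a small sketch: a line segment can cut across the corner of a rectangle while both of its end points lie outside it. The sketch uses java.awt.geom from the JDK. The class name ShardCornerCheck is hypothetical, and a shard is approximated as an axis-aligned rectangle in plain x/y coordinates, which is an assumption rather than how atlas models shards.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical check, not part of atlas-generator.
public class ShardCornerCheck {

    /**
     * Returns true when the segment (x1, y1)-(x2, y2) intersects the tile
     * although neither end point lies inside the tile.
     */
    public static boolean crossesWithoutNodes(
            Rectangle2D tile, double x1, double y1, double x2, double y2) {
        boolean endpointInside = tile.contains(x1, y1) || tile.contains(x2, y2);
        return tile.intersectsLine(x1, y1, x2, y2) && !endpointInside;
    }
}
```

For a tile spanning (0, 0) to (10, 10), the segment from (-2, 3) to (3, -2) clips the corner at the origin, so the check returns true. A lookup that finds ways only through the nodes located in a shard would miss such a segment.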
Create an integration test that can run a small contained Spark job to test AtlasGenerator.