
Comments (10)

karussell commented on May 22, 2024

Cool!

As a side effect, the PBF reader will be able to use multiple threads.

Why that? Via queue or something else?

from graphhopper.

NopMap commented on May 22, 2024

Yes it comes with two queues and multiple possible worker threads.

But first I have a problem. The PBF classes need two additional JAR files. I would like to add them as JARs not source to make sure they are unmodified due to their licensing.

How do you add these .jar dependencies to graphhopper maven?
Where would we park the required license files?


karussell commented on May 22, 2024

Hmmh, can we establish a plugin mechanism somehow? E.g. I would also like to add apache-compress (see tools project).

But I don't like that we blow up the graphhopper size and dependencies just for a single use case like import, which not everyone needs (e.g. on Android). I want graphhopper to have only two external dependencies (trove4j and, later on, probably lucene).

If the dependencies are in Maven Central then you just need an additional dependency section in the pom.xml. If not, it gets a bit more complicated: you need to add a repository or something similar. E.g. for protobuf this looks like:

<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>2.5.0</version>
</dependency>

Probably we can create an import class + jar in the tools project somehow?


karussell commented on May 22, 2024

Ok, if we add <scope>provided</scope> (or <optional>true</optional>?) to the dependency we could implement that in the core without requiring all users to have that bundled. I'll think about this.

I've read this and this

scope=provided means that the library is needed for compilation and
runtime, however it is provided by some sort of container. Typical
example: servlet-api
optional=true means that a library is needed for compilation, but it
is not necessary at runtime. Very often this is a symptom of poorly
made modules: it is best to isolate optional code into a different
module where the dependency is not an option. For example, in Velocity
Tools we had an optional dependency on an XML library, for a specific
XML tool. Isolating this code into a new module made this dependency
mandatory, but you have to include one more module in the using
project. 
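If the protobuf dependency ends up optional, the core would need some way to notice at runtime whether it is actually there. A minimal sketch of that idea, assuming a simple classpath probe is enough; `OptionalDependency` and `isPresent` are hypothetical names and not part of graphhopper, and `com.google.protobuf.Message` is used only as a representative class from protobuf-java:

```java
// Hypothetical helper (not graphhopper API): probes the classpath for an optional library.
final class OptionalDependency {
    private OptionalDependency() {
    }

    // Returns true if the named class can be located, false otherwise.
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, OptionalDependency.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Representative protobuf-java class; the real reader may depend on other classes too.
        boolean pbf = isPresent("com.google.protobuf.Message");
        System.out.println(pbf
                ? "PBF import available"
                : "PBF import disabled: protobuf-java is not on the classpath");
    }
}
```

With a check like this, an import entry point could fall back to XML or fail with a clear message instead of a `NoClassDefFoundError` deep inside the reader.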


NopMap commented on May 22, 2024

This is looking good. I have a working, standalone reader at the moment. Reading bavaria as PBF takes only 17% of the time compared to XML. Yes, no mistake: 83% less time. :-)

Now I have to check a few things and then move it into graphhopper. But I need the dependencies for that. Their size is unproblematic, 200k and 450k. Nothing next to trove. :-)


karussell commented on May 22, 2024

> Reading bavaria as PBF takes only 17% of the time compared to XML

Woot!

> Nothing next to trove. :-)

I know, but we'll need a lot more dependencies in the future, so we need to think about that and keep the core small and/or move the import section into another subproject. Not sure yet. But for now probably do that optional thing.


NopMap commented on May 22, 2024

I have integrated the PBF reader into graphhopper. The times for bavaria on my machine are now 165s with XML and 44s with PBF, so roughly 1/4 of the time overall. I checked the routing in the web demo and it seems to work fine.

We still have two parameters to tweak for performance: the number of worker threads used for parsing, and the maximum length of the queue, which matters when parsing is faster than processing in graphhopper. I set them to 2 workers and a queue size of 50000 because this gave the best performance on my machine with bavaria.
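To make the two tuning parameters concrete, here is a minimal sketch of a bounded producer/consumer pipeline. It is not the actual reader: the class name, the string "items" and the single queue are assumptions for illustration (the real reader uses two queues), and only the worker count and queue capacity correspond to the parameters described above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: parser workers feed a bounded queue, one consumer drains it.
final class BoundedParsePipeline {
    // Sentinel marking the end of the stream; compared by identity.
    private static final String END = new String("END");

    static long run(int workers, int queueCapacity, int blocks, int itemsPerBlock) {
        BlockingQueue<String> parsed = new ArrayBlockingQueue<>(queueCapacity);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int b = 0; b < blocks; b++) {
            final int block = b;
            pool.execute(() -> {
                try {
                    for (int i = 0; i < itemsPerBlock; i++) {
                        // put() blocks while the queue is full, so fast parsers cannot overrun the consumer
                        parsed.put("block" + block + "/item" + i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();

        // Once every worker has finished, append the sentinel so the consumer can stop.
        Thread closer = new Thread(() -> {
            try {
                pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
                parsed.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        closer.start();

        long consumed = 0;
        try {
            while (parsed.take() != END) {
                consumed++; // stands in for the processing step
            }
            closer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        // 2 workers and a capacity of 50000 mirror the values mentioned in the comment above.
        System.out.println(run(2, 50_000, 4, 1000) + " items consumed");
    }
}
```

The capacity is the only thing limiting memory here: a smaller value makes the workers wait more often, a larger one lets them run further ahead of the consumer.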

I added the dependencies to the pom.xml. This works; if we want to do it differently we can always change it. Be careful to use exactly these versions and do not configure it to pull the latest version. When I tried a mix of versions it crashed.

This time I did not break any tests, but how would you create a test for the new data format?


karussell commented on May 22, 2024

Cool, thanks! We could set the threads to half of the available CPUs of the machine or even make this configurable. I'll do this.

> When I tried a mix of versions it crashed.

thanks for this info!

> This time I did not break any tests, but how would you create a test for the new data format?

ok, I'll add a new andorra file and we'll see if the results are the same


NopMap commented on May 22, 2024

I have 8 cores (with hyperthreading). 2 threads was faster than one, more than that did not improve the speed. On the other hand I already had to put in a limiter to keep the consumption queue from overrunning.

Play with the values, but I don't think half the CPU cores is a good idea.


karussell commented on May 22, 2024

Ok, made 2 the default but made it configurable.
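A "default of 2, but configurable" setting could look roughly like the sketch below. The property key `pbf.workers` and the class `WorkerConfig` are made up for illustration and are not the names used in graphhopper:

```java
import java.util.Properties;

// Hypothetical sketch: resolve the number of parser workers from configuration.
final class WorkerConfig {
    static final int DEFAULT_WORKERS = 2;

    private WorkerConfig() {
    }

    static int workerThreads(Properties props) {
        String raw = props.getProperty("pbf.workers"); // hypothetical key
        if (raw == null) {
            return DEFAULT_WORKERS;
        }
        try {
            // Never go below one worker, whatever the configuration says.
            return Math.max(1, Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Unparseable values fall back to the default.
            return DEFAULT_WORKERS;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("pbf.workers", "4");
        System.out.println("workers = " + workerThreads(props));
    }
}
```

Falling back to the default on bad input is one possible choice; failing fast with an error would be equally defensible.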

This issue will be fixed via #64
