Giter VIP home page Giter VIP logo

vtquery's People

Contributors

apendleton avatar artemp avatar cyanrook avatar ericcarlschwartz avatar flippmoke avatar manaswinidas avatar mapsam avatar ozqu avatar springmeyer avatar stevage avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

vtquery's Issues

LeakSanitizer: detected memory leaks

Some memory leaks detected in the asan bulids. cc @artemp - think this is worth investigating?

https://travis-ci.org/mapbox/vtquery/jobs/501593581

=================================================================
==20064==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb287be1a2 in operator new(unsigned long) /home/travis/build/mapbox/mason/mason_packages/.build/llvm-5.0.0/build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:92:3
    #1 0x8e4a00 in node::Start(int, char**) (/home/travis/.nvm/versions/node/v10.15.2/bin/node+0x8e4a00)
    #2 0x7fdb274c7f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
    #3 0x89fd84 in _start (/home/travis/.nvm/versions/node/v10.15.2/bin/node+0x89fd84)
-----------------------------------------------------
Suppressions used:
  count      bytes template
    371     242469 v8::internal
-----------------------------------------------------
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
Aborted (core dumped)
npm ERR! Test failed.  See above for more details.
=================================================================
==20052==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7f13221af1a2 in operator new(unsigned long) /home/travis/build/mapbox/mason/mason_packages/.build/llvm-5.0.0/build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:92:3
    #1 0x8e4a00 in node::Start(int, char**) (/home/travis/.nvm/versions/node/v10.15.2/bin/node+0x8e4a00)
    #2 0x7f1320eb8f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
    #3 0x89fd84 in _start (/home/travis/.nvm/versions/node/v10.15.2/bin/node+0x89fd84)
-----------------------------------------------------
Suppressions used:
  count      bytes template
    550     229030 v8::internal
-----------------------------------------------------
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
/home/travis/.travis/functions: line 104: 20052 Aborted                 (core dumped) npm test
The command "npm test" failed and exited with 134 during .
Your build has been stopped.
/home/travis/.travis/functions: line 472: 20102 Aborted                 (core dumped) pkill -9 -P "${$}" &> /dev/null

Efficiency recommendations

As @artemp mentioned today over chat, it is important to ensure code is correct before optimizing it. Because, of course, Faster code that gets the wrong answer is not superior to correct code. per https://howardhinnant.github.io/coding_guidelines.html.

#28 indicates a potentially critical bug involving the closest point results, so clearly we should be wary to optimize around the closest point code path until #28 is resolved.

However there are other code paths that:

  • Look and behave correctly
  • Exhibit potential for optimization

Below are some recommendations to consider for optimization.

Use more appropriate container types

Currently the code is using std::map<std::string, variant_type> to hold feature properties. The most efficient container is the one that is fastest for your purpose. In our case creation and iteration speed are what matters (e.g. we don't search so that does not matter). In my experience std::vector<std::pair<std::string, variant_type>> can be created faster than std::map and therefore is a better container when fast search is not needed. So, I would recommend changing to std::vector<std::pair<std::string, variant_type>>.

Don't risk blocking the main loop

Only the code inside Execute() runs in the threadpool. The code before entering the threadpool and after (HandleOKCallback()) all run on the main event loop and therefore is synchronous. If that synchronous code is slow (does CPU intensive work or triggers many allocations) then that is going to block the main loop and potentially reduce the ability to dispatch work to the threadpool. This problem is also going to be more severe when the code is running in a larger application where other JS is running on the main event loop.

So, I would recommend reverting a175212#diff-d293222b917c4d9edf75c1288538554fL281 and moving the results_to_json_string(result_string_, results_); out of the HandleOKCallback() function and back into the Execute().

@@ -283,6 +283,7 @@ struct Worker : Nan::AsyncWorker {
             std::sort(results_.begin(), results_.end(), [](const ResultObject& a, const ResultObject& b) { return a.distance < b.distance; });
 
             // TODO(sam) create new results vector (from results_) of length specific to num_results option
+            results_to_json_string(result_string_, results_);
 
         } catch (const std::exception& e) {
             SetErrorMessage(e.what());
@@ -299,7 +300,6 @@ struct Worker : Nan::AsyncWorker {
     void HandleOKCallback() override {
         Nan::HandleScope scope;
 
-        results_to_json_string(result_string_, results_);
 
         auto const argc = 2u;
         v8::Local<v8::Value> argv[argc] = {

Reducing unnecessary copies/reallocation

This is especially important when custom objects are being copied because:

  • Custom objects may be large and poorly packed (and therefore larger than they would be if packed in a more optimized way).
  • Contain other object members that are large and require allocation and deallocation to be copied themselves (like std::vector or std::map).

Accidental copies are easy to have happen and a common cause of slow code. They are one of the reasons, in my experience, why it is easy to write C++ that is slower than Javascript (v8 goes to extreme lengths to avoid allocations for you). To write C++ faster than Javascript we need to aim for zero copy code. To achieve that we need to either pass objects by reference or leverage C++11 move semantics.

Okay, enough context, let's dive in...

We need to avoid copying the results passed to the json generator. Passing by reference here makes sense:

- void results_to_json_string(std::string & s, std::vector<ResultObject> results) {
+ void results_to_json_string(std::string & s, std::vector<ResultObject> const& results) {

Also the properties arg being passed to the ResultObject constructor is currently being copied. And the ResultObject itself is being allocated, then copied, when it could be constructed in place. To avoid these extra allocations do:

@@ -32,7 +32,7 @@ struct ResultObject {
     ResultObject(
         mapbox::geometry::point<double> p,
         double distance0,
-        std::map<std::string, ResultObject::variant_type> props_map,
+        std::map<std::string, ResultObject::variant_type> && props_map,
         std::string name,
         GeomType geom_type)
         : coordinates(p),
@@ -272,8 +272,7 @@ struct Worker : Nan::AsyncWorker {
                                 properties_map.insert(std::pair<std::string, variant_type>(key, value));
                             }
 
-                            ResultObject r(feature_lnglat, meters, properties_map, layer_name, original_geometry_type);
-                            results_.emplace_back(r);
+                            results_.emplace_back(feature_lnglat, meters, std::move(properties_map), layer_name, original_geometry_type);
                         }
                     } // end tile.layer.feature loop
                 }     // end tile.layer loop

Also, the individual properties are suffering from an extra copy currently due to the use of insert. Instead we want to use C++11 emplace:

@@ -267,9 +267,9 @@ struct Worker : Nan::AsyncWorker {
                             using variant_type = ResultObject::variant_type;
                             ResultObject::properties_type properties_map;
                             while (auto prop = feature.next_property()) {
-                                std::string key = std::string{prop.key()};
-                                variant_type value = vtzero::convert_property_value<variant_type>(prop.value());
-                                properties_map.insert(std::pair<std::string, variant_type>(key, value));
+                                properties_map.emplace(
+                                    std::string{prop.key()},
+                                    vtzero::convert_property_value<variant_type>(prop.value()));
                             }
 
                             ResultObject r(feature_lnglat, meters, properties_map, layer_name, original_geometry_type);

Also, all this is probably of minor cost compared to the re-allocation cost that is more hidden:

  • Reallocation of the ResultObjects inside the std::vector<ResultObject> as it grows
  • Reallocation of the ResultObjects inside the std::vector<ResultObject> when it is sorted

We want to avoid both:

  • the amount of reallocation
  • the cost of reallocation.

The best way to avoid the amount of allocation is to call vector.reserve(<max possible size>). However in our case it is tricky to know <max possible size> because we don't know how many ResultObject we will end up with. This needs some experimentation and may or may not result in faster code. This post has some details on this.

The real win possible in vtquery currently is in reducing the cost of reallocation. The cost is large currently because the ResultObject does not ban copying or implement a move assignment or move constructor. If we were to dig deep in https://www.slideshare.net/ripplelabs/howard-hinnant-accu2014 we'd learn that if we delete the copy constructor and copy assignment operator and define the move assignment or move constructor, then the reallocation happening during std::sort will magically start using move semantics and will avoid the reallocation from copying data around during sort.

This can be done by adding these operators to the ResultObject class:

+    // non-copyable
+    ResultObject(ResultObject const&) = delete;
+    ResultObject& operator=(ResultObject const&) = delete;
+
+    // movable
+    ResultObject(ResultObject&&) = default;
+    ResultObject& operator=(ResultObject&&) = default;

Be more lazy

Currently the feature properties are:

  • Being pushed into a std::map
  • Being decoded into a variant

These structures require allocations.

Given that:

  • All we do with the feature properties is ultimately turn them into JSON objects, and
  • When limiting results is implemented (I noticed num_results is currently not respected), we will only return a subset of features and therefore the creation of properties will be wasted.

I think:

  • We should be more lazy and not decode into a variant or std::map during the layer.next_feature() loop.
  • Instead we should take a reference to the data_view representing the vtzero::feature
  • After sorting, we can decode only the top N features directly from their data_view into a JSON object without std::map or variant intermediate objects.

Note: Unless a better solution were devised, this would probably require bringing back the std::deque<vtzero::layer> layers; that was removed in 84606a3 (to be able to keep the vtzero::layer referenced by the vtzero::feature alive). It should be possible to store a data_view representing the property values to avoid needing to keep the layer around (the layer is only needed to get the right property key/value, but once gotten it should be safe to use them without the layer alive?).

/cc @mapbox/core-tech

when to use std::int32_t and std::int64_t

Description

Looks like there have been some conversations around this and thought we should discuss all the places where we need to be explicit about the bit size of the int and when we shouldn't. I'll start us off with refs to previous conversations and lists of our current int32_t and int64_t usage. It would be great to get some follow-up comments about the reasoning behind this.

It looks like the idea is that we want to use int32_t for x, y, z, and extent until we cast them to doubles or int64_ts when we store them in a mapbox::geometry::point. Looks like we use int64_t for id and property values, referred to as v in the code.

References

int32_t/ uint32_t usage

int64_t/ uint64_t usage

v0.3.0 release

TODO:

  • Decide whether to land #73 at the same time
  • Decide whether to land to land #92
  • Update changelog
  • Publish binaries

Some code comments

I am just looking at the code with focus on the use of vtzero. Here are some things:

deduplication does not work when comparing tags across tiles/layers

Right now we're comparing tag integers between features to determine if they share the same properties. Unfortunately tag integers are not shared on a global space (tileset) and are only shared within a tile. For example 0, 0 might map to type, road in one tile, but it maps to color, purple in another tile.

For now, let's start by decoding properties and comparing them directly. This will likely result in a pretty significant slowdown, but will actually perform deduplication in the way we want.

cc @flippmoke

Revise Readme

This project is moving forward and close to being usable. It's time to revise the readme to orient to an external audience. So I think the Things to remember and Loop architecture proposals could be moved to tickets, where appropriate, and the develop section could be touched up to change node-cpp-skel -> vtquery.

/cc @mapsam

Why zxy values extracted as int64_t and then casted to uint32_t ?

vtquery/src/vtquery.cpp

Lines 505 to 514 in 80d6515

std::int64_t y = y_val->IntegerValue();
if (y < 0) {
return utils::CallbackError("'y' value must not be less than zero", callback);
}
// in-place construction
std::unique_ptr<TileObject> tile{new TileObject{static_cast<std::uint32_t>(z),
static_cast<std::uint32_t>(x),
static_cast<std::uint32_t>(y),
buffer}};

Looks untidy ^. Why not using int32Value() and store z,x,y as signed int ?
/cc @mapsam

Should we enforce a single "z"?

Once we go into the worker, we have the following information:

  • lng/lat
  • buffers with z/x/y

If we are looping through each buffer, we'll have to calculate the tile coordinates of the query lng/lat based on each buffer's z level. If we made the assumption that every buffer was at the same zoom level (defined outside of the tiles array) we could calculate the query tile coordinates once and essentially update their origin point according to each buffer's x/y values in the loop.

So the API would change to:

const tiles = [ { buffer: new Buffer(), x: 1, y: 2} ];
vtquery(tiles, lnglat, zoom, options, callback);

cc @springmeyer

Dangerous abort possible in HandleOKCallback

It is possible for the code inside

vtquery/src/vtquery.cpp

Lines 362 to 418 in c3bad9b

void HandleOKCallback() override {
Nan::HandleScope scope;
v8::Local<v8::Object> results_object = Nan::New<v8::Object>();
v8::Local<v8::Array> features_array = Nan::New<v8::Array>();
results_object->Set(Nan::New("type").ToLocalChecked(), Nan::New<v8::String>("FeatureCollection").ToLocalChecked());
// for each result object
while (!results_queue_.empty()) {
auto const& feature = results_queue_.back(); // get reference to top item in results queue
if (feature.distance < std::numeric_limits<double>::max()) {
// if this is a default value, don't use it
v8::Local<v8::Object> feature_obj = Nan::New<v8::Object>();
feature_obj->Set(Nan::New("type").ToLocalChecked(), Nan::New<v8::String>("Feature").ToLocalChecked());
// create geometry object
v8::Local<v8::Object> geometry_obj = Nan::New<v8::Object>();
geometry_obj->Set(Nan::New("type").ToLocalChecked(), Nan::New<v8::String>("Point").ToLocalChecked());
v8::Local<v8::Array> coordinates_array = Nan::New<v8::Array>(2);
coordinates_array->Set(0, Nan::New<v8::Number>(feature.coordinates.x)); // latitude
coordinates_array->Set(1, Nan::New<v8::Number>(feature.coordinates.y)); // longitude
geometry_obj->Set(Nan::New("coordinates").ToLocalChecked(), coordinates_array);
feature_obj->Set(Nan::New("geometry").ToLocalChecked(), geometry_obj);
// create properties object
v8::Local<v8::Object> properties_obj = Nan::New<v8::Object>();
for (auto const& prop : feature.properties_vector) {
set_property(prop, properties_obj);
}
// set properties.tilquery
v8::Local<v8::Object> tilequery_properties_obj = Nan::New<v8::Object>();
tilequery_properties_obj->Set(Nan::New("distance").ToLocalChecked(), Nan::New<v8::Number>(feature.distance));
std::string og_geom = getGeomTypeString(feature.original_geometry_type);
tilequery_properties_obj->Set(Nan::New("geometry").ToLocalChecked(), Nan::New<v8::String>(og_geom).ToLocalChecked());
tilequery_properties_obj->Set(Nan::New("layer").ToLocalChecked(), Nan::New<v8::String>(feature.layer_name).ToLocalChecked());
properties_obj->Set(Nan::New("tilequery").ToLocalChecked(), tilequery_properties_obj);
// add properties to feature
feature_obj->Set(Nan::New("properties").ToLocalChecked(), properties_obj);
// add feature to features array
features_array->Set(static_cast<uint32_t>(results_queue_.size() - 1), feature_obj);
}
results_queue_.pop_back();
}
results_object->Set(Nan::New("features").ToLocalChecked(), features_array);
auto const argc = 2u;
v8::Local<v8::Value> argv[argc] = {
Nan::Null(), results_object};
callback->Call(argc, static_cast<v8::Local<v8::Value>*>(argv));
}
};
to suffer from a C++ exception. Because there is no try/catch this exception will lead to an abort which would bring down the entire node process. So, this needs a try/catch to protect from DDOS potential on the API side.

The particular case where I noticed this was this backtrace:

Core file '/cores/core.46375' (x86_64) was loaded.
(lldb) thread backtrace all
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff50f6ae3e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff510a9150 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff50ec7312 libsystem_c.dylib`abort + 127
    frame #3: 0x00007fff4eea2f8f libc++abi.dylib`abort_message + 245
    frame #4: 0x00007fff4eea3113 libc++abi.dylib`default_terminate_handler() + 241
    frame #5: 0x00007fff5022deab libobjc.A.dylib`_objc_terminate() + 105
    frame #6: 0x00007fff4eebe7c9 libc++abi.dylib`std::__terminate(void (*)()) + 8
    frame #7: 0x00007fff4eebe26d libc++abi.dylib`__cxa_throw + 121
    frame #8: 0x0000000102bf479c vtquery.node`protozero::pbf_reader::next() + 204
    frame #9: 0x0000000102c0046b vtquery.node`vtzero::property_value::type() const + 43
    frame #10: 0x0000000102c002d5 vtquery.node`_ZN6vtzero13apply_visitorINS_6detail15convert_visitorIN6mapbox7feature5valueENS3_11vector_tile6detail22property_value_mappingEEEEEDTclclsr3stdE7declvalIT_EEtlN9protozero9data_viewEEEEOSA_NS_14property_valueE + 37
    frame #11: 0x0000000102bf0cfd vtquery.node`VectorTileQuery::set_property(vtzero::property const&, v8::Local<v8::Object>&) + 61
    frame #12: 0x0000000102bf609b vtquery.node`VectorTileQuery::Worker::HandleOKCallback() + 731
    frame #13: 0x0000000102bf4f4a vtquery.node`Nan::AsyncWorker::WorkComplete() + 58
    frame #14: 0x0000000102c00292 vtquery.node`Nan::AsyncExecuteComplete(uv_work_s*) + 18
    frame #15: 0x0000000100a5ff65 node`uv__work_done + 178
    frame #16: 0x0000000100a61e0e node`uv__async_io + 317
    frame #17: 0x0000000100a71e74 node`uv__io_poll + 1934
    frame #18: 0x0000000100a6228f node`uv_run + 321
    frame #19: 0x00000001008ce97d node`node::Start(int, char**) + 744
    frame #20: 0x0000000100001234 node`start + 52
  thread #2, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f617fe libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000100a6d378 node`uv_sem_wait + 16
    frame #2: 0x00000001008cf6b4 node`node::DebugSignalThreadMain(void*) + 49
    frame #3: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #4: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #5: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #3, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f617fe libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000100a75647 node`v8::base::Semaphore::Wait() + 23
    frame #2: 0x00000001009346a9 node`v8::platform::TaskQueue::GetNext() + 57
    frame #3: 0x00000001009348ab node`v8::platform::WorkerThread::Run() + 43
    frame #4: 0x0000000100a77357 node`v8::base::ThreadEntry(void*) + 87
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #4, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f617fe libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000100a75647 node`v8::base::Semaphore::Wait() + 23
    frame #2: 0x00000001009346a9 node`v8::platform::TaskQueue::GetNext() + 57
    frame #3: 0x00000001009348ab node`v8::platform::WorkerThread::Run() + 43
    frame #4: 0x0000000100a77357 node`v8::base::ThreadEntry(void*) + 87
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #5, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f617fe libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000100a75647 node`v8::base::Semaphore::Wait() + 23
    frame #2: 0x00000001009346a9 node`v8::platform::TaskQueue::GetNext() + 57
    frame #3: 0x00000001009348ab node`v8::platform::WorkerThread::Run() + 43
    frame #4: 0x0000000100a77357 node`v8::base::ThreadEntry(void*) + 87
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #6, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f617fe libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000100a75647 node`v8::base::Semaphore::Wait() + 23
    frame #2: 0x00000001009346a9 node`v8::platform::TaskQueue::GetNext() + 57
    frame #3: 0x00000001009348ab node`v8::platform::WorkerThread::Run() + 43
    frame #4: 0x0000000100a77357 node`v8::base::ThreadEntry(void*) + 87
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #7, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f6acee libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff510a7662 libsystem_pthread.dylib`_pthread_cond_wait + 732
    frame #2: 0x0000000100a6d49a node`uv_cond_wait + 9
    frame #3: 0x0000000100a602c9 node`worker + 227
    frame #4: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #8, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f6b432 libsystem_kernel.dylib`__ulock_wait + 10
    frame #1: 0x00007fff5109f6ba libsystem_platform.dylib`_os_unfair_lock_lock_slow + 140
    frame #2: 0x00007fff50fc3332 libsystem_malloc.dylib`szone_malloc_should_clear + 213
    frame #3: 0x00007fff50fc3201 libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #4: 0x00007fff50fc250b libsystem_malloc.dylib`malloc + 24
    frame #5: 0x00007fff4eea2628 libc++abi.dylib`operator new(unsigned long) + 40
    frame #6: 0x0000000102bfb383 vtquery.node`std::__1::vector<mapbox::geometry::point<long long>, std::__1::allocator<mapbox::geometry::point<long long> > >::reserve(unsigned long) + 67
    frame #7: 0x0000000102bfbcfc vtquery.node`vtzero::detail::get_result<mapbox::vector_tile::detail::polygon_geometry_handler<long long>, void>::type vtzero::detail::geometry_decoder<protozero::const_varint_iterator<unsigned int> >::decode_polygon<mapbox::vector_tile::detail::polygon_geometry_handler<long long> >(mapbox::vector_tile::detail::polygon_geometry_handler<long long>&&) + 364
    frame #8: 0x0000000102bfa6c9 vtquery.node`mapbox::geometry::geometry<long long, std::__1::vector> mapbox::vector_tile::detail::extract_geometry_polygon<long long>(vtzero::feature const&) + 105
    frame #9: 0x0000000102bf5779 vtquery.node`VectorTileQuery::Worker::Execute() + 1993
    frame #10: 0x0000000100a60240 node`worker + 90
    frame #11: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #12: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #13: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #9, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50fa1221 libsystem_m.dylib`___lldb_unnamed_symbol122$$libsystem_m.dylib + 72
    frame #1: 0x0000000102bf58f1 vtquery.node`VectorTileQuery::Worker::Execute() + 2369
    frame #2: 0x0000000100a60240 node`worker + 90
    frame #3: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #4: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #5: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #10, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f6acee libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff510a7662 libsystem_pthread.dylib`_pthread_cond_wait + 732
    frame #2: 0x0000000100a6d49a node`uv_cond_wait + 9
    frame #3: 0x0000000100a602c9 node`worker + 227
    frame #4: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #11, stop reason = signal SIGSTOP
    frame #0: 0x0000000102bfbc4a vtquery.node`vtzero::detail::get_result<mapbox::vector_tile::detail::polygon_geometry_handler<long long>, void>::type vtzero::detail::geometry_decoder<protozero::const_varint_iterator<unsigned int> >::decode_polygon<mapbox::vector_tile::detail::polygon_geometry_handler<long long> >(mapbox::vector_tile::detail::polygon_geometry_handler<long long>&&) + 186
    frame #1: 0x0000000102bfa6c9 vtquery.node`mapbox::geometry::geometry<long long, std::__1::vector> mapbox::vector_tile::detail::extract_geometry_polygon<long long>(vtzero::feature const&) + 105
    frame #2: 0x0000000102bf5779 vtquery.node`VectorTileQuery::Worker::Execute() + 1993
    frame #3: 0x0000000100a60240 node`worker + 90
    frame #4: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #12, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50c0c8af libz.1.dylib`___lldb_unnamed_symbol42$$libz.1.dylib + 175
    frame #1: 0x00007fff50c096b7 libz.1.dylib`inflate + 5432
    frame #2: 0x0000000102bf6e28 vtquery.node`void gzip::Decompressor::decompress<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, char const*, unsigned long) const + 232
    frame #3: 0x0000000102bf5180 vtquery.node`VectorTileQuery::Worker::Execute() + 464
    frame #4: 0x0000000100a60240 node`worker + 90
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #13, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50fc4f44 libsystem_malloc.dylib`tiny_free_list_add_ptr + 227
    frame #1: 0x00007fff50fda758 libsystem_malloc.dylib`tiny_free_no_lock + 570
    frame #2: 0x00007fff50fdb254 libsystem_malloc.dylib`free_tiny + 628
    frame #3: 0x0000000102bf582d vtquery.node`VectorTileQuery::Worker::Execute() + 2173
    frame #4: 0x0000000100a60240 node`worker + 90
    frame #5: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #6: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #7: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #14, stop reason = signal SIGSTOP
    frame #0: 0x0000000102bfba76 vtquery.node`std::__1::vector<mapbox::geometry::polygon<long long, std::__1::vector>, std::__1::allocator<mapbox::geometry::polygon<long long, std::__1::vector> > >::reserve(unsigned long) + 214
    frame #1: 0x0000000102bfa6f9 vtquery.node`mapbox::geometry::geometry<long long, std::__1::vector> mapbox::vector_tile::detail::extract_geometry_polygon<long long>(vtzero::feature const&) + 153
    frame #2: 0x0000000102bf5779 vtquery.node`VectorTileQuery::Worker::Execute() + 1993
    frame #3: 0x0000000100a60240 node`worker + 90
    frame #4: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #15, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50f6acee libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff510a7662 libsystem_pthread.dylib`_pthread_cond_wait + 732
    frame #2: 0x0000000100a6d49a node`uv_cond_wait + 9
    frame #3: 0x0000000100a602c9 node`worker + 227
    frame #4: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
  thread #16, stop reason = signal SIGSTOP
    frame #0: 0x00007fff50fc56a1 libsystem_malloc.dylib`tiny_free_list_remove_ptr + 96
    frame #1: 0x00007fff50fdaac8 libsystem_malloc.dylib`tiny_free_no_lock + 1450
    frame #2: 0x00007fff50fdb254 libsystem_malloc.dylib`free_tiny + 628
    frame #3: 0x0000000102bfc584 vtquery.node`mapbox::util::detail::variant_helper<mapbox::geometry::polygon<long long, std::__1::vector>, mapbox::geometry::multi_point<long long, std::__1::vector>, mapbox::geometry::multi_line_string<long long, std::__1::vector>, mapbox::geometry::multi_polygon<long long, std::__1::vector>, mapbox::geometry::geometry_collection<long long, std::__1::vector> >::destroy(unsigned long, void*) + 100
    frame #4: 0x0000000102bf582d vtquery.node`VectorTileQuery::Worker::Execute() + 2173
    frame #5: 0x0000000100a60240 node`worker + 90
    frame #6: 0x00007fff510a66c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #7: 0x00007fff510a656d libsystem_pthread.dylib`_pthread_start + 377
    frame #8: 0x00007fff510a5c5d libsystem_pthread.dylib`thread_start + 13
(lldb) quit

vtzero::convert_property_value inside worker

The set_property method does some variant decoding in the main thread while constructing the final JSON object to return to node. This, in theory, could happen within the worker threads before being handled in the callback, and increase performance.

vtquery/src/vtquery.cpp

Lines 154 to 160 in 95ab347

/// used to create the final v8 (JSON) object to return to the user
void set_property(vtzero::property const& property,
v8::Local<v8::Object>& properties_obj) {
auto val = vtzero::convert_property_value<mapbox::feature::value, mapbox::vector_tile::detail::property_value_mapping>(property.value());
mapbox::util::apply_visitor(property_value_visitor{properties_obj, std::string(property.key())}, val);
}

cc @springmeyer @flippmoke

Protecting against trouble on invalid limit values

I've done some poking at what happens when invalid limit values are passed. I've found that:

limit=0 segfaults the process

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..080f471 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 0},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

And node bench/vtquery.bench.js --iterations 50 --concurrency 1

limit=1000000000 hangs the process

Likely due to a large allocation.

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..080f471 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 1000000000},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

And node bench/vtquery.bench.js --iterations 50 --concurrency 1

limit=1000000000000 overflows

The error is Error: 'limit' must be a positive number, which is better than a crash but indicates that the number has overflowed to negative which should be protected. Instead we should have some reasonable limit to the max value and throw if it is over that value.

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..dfa5866 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 1000000000000},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

And node bench/vtquery.bench.js --iterations 50 --concurrency 1

regular results when distance values are the exact same

The following test asserts that result order remains the same when distances are the same value

test('results with same exact distance return in expected order', assert => {
  // this query returns four results, three of which are the exact same distance and different types of features
  const buffer = fs.readFileSync(path.resolve(__dirname+'/../node_modules/@mapbox/mvt-fixtures/real-world/chicago/13-2098-3045.mvt'));
  const ll = [-87.7964, 41.8675];
  vtquery([{buffer: buffer, z: 13, x: 2098, y: 3045}], ll, { radius: 10, layers: ['road'] }, function(err, result) {
    assert.equal(result.features[1].properties.type, 'turning_circle', 'is expected type');
    assert.ok(checkClose(result.features[1].properties.tilequery.distance, 9.436356889343624, 1e-6), 'is the proper distance');
    assert.equal(result.features[2].properties.type, 'service:driveway', 'is expected type');
    assert.ok(checkClose(result.features[2].properties.tilequery.distance, 9.436356889343624, 1e-6), 'is the proper distance');
    assert.equal(result.features[3].properties.type, 'pedestrian', 'is expected type');
    assert.ok(checkClose(result.features[3].properties.tilequery.distance, 9.436356889343624, 1e-6), 'is the proper distance');
    assert.end();
  });
});

This works on OSX, but fails on linux with a different order of values. Let's figure out how priority_queues are handled differently between the two OS's and then this test can be added back in. Plenty of background on this ticket in #42, including a Dockerfile to recreate the failing test.

Note: I've tried updating the CompareDistance operator to use <= with no luck.

cc @flippmoke @springmeyer

dedupe results

When querying multiple tiles, duplicate results are expected since vtquery doesn't know that two polygons split by a tile boundary are actually the same source feature. There are a few options for deduping:

  1. union geometries before querying
  2. based on feature id
  3. based on a hash of the properties

Unioning feels like an excellent long-term goal, but for now 2, and 3 feel most do-able. It's unlikely features will always have ID's, so going to start working on a hash of the properties objects. This isn't going to be a light operation, unfortunately, so it seems like a good idea to make deduping an option plus only use it when completely necessary. That said:

  • only dedupe when multiple tiles are being queried
  • add a dedupe option to the API
  • dedupe only when necessary since it will require decoding the properties

When to dedupe?

One concern of deduping as late as possible: if we don't dedupe while we add to the queue we run the chance of deduping results and missing out on other possible values that would make it into the queue were there only one result in the first place. Consider the following diagram:

screen shot 2018-01-12 at 9 53 58 am

If we had a limit of 5, we'd fill the queue with results A, B, C, E, and point at the original query point (distance 0). Deduping once we're past filling the queue would remove points A, B, and C in favor of the actual query point inside the polygon (instead of A, B, and C, which are just tile edge results). That leaves us with a queue of two results, where should have three (D) but we can't add it any longer.

I think if we start deduping, we may need to do it on demand, immediately when something is added to the queue. Flow would be:

  1. check if result should be added to the priority queue (this already exists)
  2. if so, generate a hash of the properties when adding to the queue (this will require us to store the hash in the ResultObject struct.)
  3. if it matches any other results in the queue, compare distances and remove the further result
  4. on to next possible feature to add

Performance hit hypothesis:

I think this will decrease performance for multi-tile queries heavily. That said, I think a decrease in performance is worth more valid results. Plus the user can opt out.

cc @mapbox/core-tech

More default compiler warnings

Enabling more compiler warnings for an existing project can be painful. It is better to do it from the start so you can think about how to achieve the cleanest and most warning free code as you design.

In this light I noticed we were missing some common compiler warning flags in this project, because I missed them in node-cpp-skel. So, we should pull over mapbox/node-cpp-skel#78

refs mapbox/cpp#37

More efficient response format

In profiling vtquery under load with @mapsam I recall seeing that the JSON response creation runs on the main event loop and takes a non-trivial amount of time to return as a JS object.

Given this fact there are several things we can consider that would speed up the response:

Async json generation in threadpool

We could generate the JSON response async inside the thread pool (so inside Execute() rather than on the main thread like is currently done (

void HandleOKCallback() override {
).

This would require serializing the JSON to a string and then passing the data back from the threadpool in the same way that we pass vector tiles back from the threadpool to the main thread in vtcomposite and vtshaver (as a node.Buffer).

Compressing the JSON response

Now that we're working with a string, we could also gzip compress the JSON string in the threadpool before sending back. This could help speed up the transfer of the data to clients.

benchmarks

Let's set up a benchmark script based on the node-cpp-skel async benchmarks so we can work through different looping ideas and sorting styles.

Other than total time, it'd be great to benchmark individual pieces of this process, such as:

  • sorting
  • vtzero reading
  • sending JSON back to node
  • more...

cc @springmeyer @GretaCB

Filter by geometries by bbox

I ran the slowest/most intense benchmark (query: all things - dense nine tiles) on OS X using node bench/vtquery.bench.js --iterations 5000 --concurrency 1 and a patch to disable all other tests besides query: all things - dense nine tiles. Then I profiled in Activity Monitor during the run.

What I see is:

My interpretation is that:

  • Parsing with vtzero is extremely cheap
  • Running closest_point is also pretty cheap
  • Decoding into mapbox::geometry objects is expensive due to memory allocation and deallocation

And therefore our overwhelming bottleneck (where > 50% of the time is taken) is mapbox::vector_tile::extract_geometry (https://github.com/mapbox/vector-tile/blob/97d8b89fe635f117ce7de25790028a65d9ce5172/include/mapbox/vector_tile.hpp#L15)

So, to reduce the latency of scenarios like this (large radius and multiple tiles) we'll need to speed up mapbox::vector_tile::extract_geometry.

Profiling output: https://callgraph.herokuapp.com/76849341d35452543e35c504964dcb94#thread-6

/cc @mapsam @flippmoke

Code consisitency/correctness around pointers

In regards to

vtquery/src/vtquery.cpp

Lines 26 to 29 in 5ffe8d7

static const char* GeomTypeStrings[] = {"point", "linestring", "polygon", "unknown"};
const char* getGeomTypeString(int enumVal) {
return GeomTypeStrings[enumVal];
}

@alliecrevier ask:

[nitpick] Since T const& instead of const T& is the style used throughout this code file, it might make sense to use this style for pointers as well. So instead of const char* and static const char* you would use char const* and static char const* respectively. It looks weird to me but it's consistent!

exclude layers option

For usability, it'd be nice to have an option named excludeLayers that is the same thing as layers but purely for saying "do not query these layers". This is helpful when you specifically don't want a particular layer in tilesets with many layers, like Mapbox Streets, but don't want to list all of the layers out.

Failing to compile with g++-6

All of a sudden in the #99 branch the g++ job started failing with:

  g++-6 '-DNODE_GYP_MODULE_NAME=vtquery' '-DUSING_UV_SHARED=1' '-DUSING_V8_SHARED=1' '-DV8_DEPRECATION_WARNINGS=1' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DBUILDING_NODE_EXTENSION' '-DDEBUG' '-D_DEBUG' '-DV8_ENABLE_CHECKS' -I/home/travis/.node-gyp/10.5.0/include/node -I/home/travis/.node-gyp/10.5.0/src -I/home/travis/.node-gyp/10.5.0/deps/uv/include -I/home/travis/.node-gyp/10.5.0/deps/v8/include  -fPIC -pthread -Wall -Wextra -Wno-unused-parameter -m64 -isystem /home/travis/build/mapbox/vtquery/node_modules/nan -isystem /home/travis/build/mapbox/vtquery/mason_packages/.link/include/ -Wall -Wextra -Wconversion -pedantic-errors -Wconversion -Wshadow -Wfloat-equal -Wuninitialized -Wunreachable-code -Wold-style-cast -Wno-error=unused-variable -g -O0 -std=c++14 -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 -D_GLIBCXX_USE_CXX11_ABI=0 -std=gnu++1y -Werror -MMD -MF ./Debug/.deps/Debug/obj.target/vtquery/src/vtquery.o.d.raw  -Weffc++ -c -o Debug/obj.target/vtquery/src/vtquery.o ../src/vtquery.cpp
In file included from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/config.hpp:61:0,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/aux_/config/msvc.hpp:19,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/aux_/config/adl.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/aux_/adl_barrier.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/bool_fwd.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/bool.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/not.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/mpl/assert.hpp:17,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/geometry/core/closure.hpp:22,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/geometry/geometry.hpp:25,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/mapbox/geometry/algorithms/detail/boost_adapters.hpp:7,
                 from /home/travis/build/mapbox/vtquery/mason_packages/.link/include/mapbox/geometry/algorithms/closest_point_impl.hpp:2,
                 from ../src/vtquery.cpp:11:
/home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/math/constants/constants.hpp: In static member function ‘static constexpr T boost::math::constants::detail::constant_half<T>::get(const mpl_::int_<5>&)’:
/home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/math/constants/constants.hpp:265:3: error: unable to find numeric literal operator ‘operator""Q’
   BOOST_DEFINE_MATH_CONSTANT(half, 5.000000000000000000000000000000000000e-01, "5.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e-01")
   ^
[ .....  snip ......]
       ^
/home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/math/special_functions/log1p.hpp:207:7: note: use -std=gnu++11 or -fext-numeric-literals to enable more built-in suffixes
/home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/math/special_functions/log1p.hpp:208:7: error: unable to find numeric literal operator ‘operator""Q’
       BOOST_MATH_BIG_CONSTANT(T, 64, -0.560026216133415663808e-6)
       ^
/home/travis/build/mapbox/vtquery/mason_packages/.link/include/boost/math/special_functions/log1p.hpp:208:7: note: use -std=gnu++11 or -fext-numeric-literals to enable more built-in suffixes

I think this is probably happening since I switched to node v10 with enabled the -std=gnu++1y flag. So we now have both -std=c++14 and -std=gnu++1y flags getting into the compile flags.

build binaries for node v8.x

Node.js v8.x is the most current LTS version of Node.js - let's make sure we're building binaries for it.

cc @mapbox/core-tech

Disable sort option

Sorting by distance will be on by default, but let's allow users to remove it for their own needs. This will also result in a performance gain.

cc @mapbox/core-tech

invalid ELF header when installing remote binary

Getting the following error when installing vtquery from a remote binary:

Error: /home/travis/...../node_modules/@mapbox/vtquery/lib/binding/vtquery.node: invalid ELF header

Quick searching suggests this error comes from using a binary built with on operating system architecture and trying to use it on another. The binary was built with Node 4 on Trusty, and I'm using it in the same.

cc @mapbox/core-tech

[Feature] Basic Filtering Options

Premise

VTQuery provides a way to search inside a Vector Tile but doesn't provide a way to filter out results. This feature request is to implement a basic filtering mechanism to return only results that match the filters.

Desired Capability

As part of the options parameter sent to VTQuery, an additional feature can be included labelled filters. filters will contain an array of filter objects which consist of a parameter, condition, and value. Each result returned by VTQuery will satisfy all filters.

  • parameter - This will be the key inside parameters which the filter will be applied to
  • condition - Basic filter options <, <=, >, >=, =, !=
  • value - A value for which the features parameter will be compared.

Example

filters: [ {'parameter': 'size', 'condition': '>', 'value': 50}] will cause VTQuery to return only features with a size greater than 50.

To Be Determined

  • Should a condition contains be implemented for strings?
    • This would enable "City StoreName" to be returned on a query of contains StoreName

optimizations (now and future)

This is a running list of optimizations to discuss/implement

  • dedupe based on ID in the feature if it exists
  • save a vector of data_views from vtzero once
  • use vtzero to loop through and run geometry against closest_point - add data_view to vector for usage later, and keep moving
  • convert "radius" to tile coordinates
  • work within integers instead of doubles for geometry.hpp
  • determine where to load tile buffers - probably in javascript and pass an array of buffers in?
  • get feature properties from data_views once loop is finished (currently saved as variants until part of the results object)
  • not returning specific geometry types (i.e. "only give me points")
  • how to handle "radius" across tiles? keeping the origin relative to the current tile so if you have a tile that is outside of the bounds of the origin point, the origin value increases (or goes negative) this will give distances as real numbers, rather than interpolating based on a relative origin
  • "sort during" instead of "sort after" architecture
  • bypass closest point when radius=0 and only use boost::geometry::within AND only work with polygons (open question - how to handle exact intersections of query point along lines/boundaries?)
  • non-copyable ResultsObject

reduce package size by removing mason deps

The .npmignore in this repository is missing some major pieces, including .mason, .toolchain and mason_packages. We should be using the .gitignore by default, but wanted to avoid include the cloudformation template in the npm package. This results in a ~54MB package tarball, which is crazy huge.

cc @mapbox/core-tech

Closest point returning unexpected results for polygons

The visual tests show that the results from closest point aren't completely expected. Some polygons' results are along the expected edge, but others are on the opposite side of the polygon. This appears related to where the query point is so perhaps this is a result of winding order or a negative value origin point?

Here's a gif showing the results from vtquery flipping as the point gets closer to the polygons on the lower portion of the image (this is not on a tile border, FYI):

tilequery-flipping-results

My expectations:

screenshot-localhost-5000 2017-10-25 08-38-13-111

These are Mapbox Streets tiles and I can recreate the issue at z15 with any buildings.

cc @flippmoke

dedupe is too strong

Deduplication of features is too strong right now. Consider two buildings (polygons) as two unique features but they have few properties (or no properties) and no ID. For all intensive purposes these should be two unique features to avoid removing important data.

Perhaps it's best to only dedupe based on IDs for now, while we think about other ways to best dedupe with properties.

cc @flippmoke

update mason packages to released versions

Before 0.1.x is released, let's make sure vtquery is using released versions of libraries. Here's where we are at:

✅ geometry.hpp 96d3505

https://github.com/mapbox/mason/tree/master/scripts/geometry

1e4d116#diff-a213025e5b87b712fdc0007bd3a0e87b most recent update was part of vtzero/vector-tile updates and moved away from the latest 0.9.2 official release. Working off this commit mapbox/geometry.hpp@96d3505 from geometry.hpp. Will need some 👀 from @flippmoke @artemp on this one I think.

Update: we'll plan on releasing on a gitsha!

vtzero 533b811

https://github.com/mapbox/mason/tree/master/scripts/vtzero

1e4d116#diff-a213025e5b87b712fdc0007bd3a0e87b commit with most recent update was related to vector-tile 2.x development branch - hoping current master should work just fine.

✅ spatial-algorithms cdda174

https://github.com/mapbox/mason/tree/master/scripts/spatial-algorithms

105f1d6#diff-a213025e5b87b712fdc0007bd3a0e87b commit with most recent update to the version shows no need to specify number type in closest point info

✅ vector-tile f4728da

https://github.com/mapbox/mason/tree/master/scripts/vector-tile

Pending 2.x release.

cc @mapbox/core-tech

When radius=0 (default) do we validate the tile and query point intersect?

When the radius is 0 (which is the default) should we just choose the first buffer in the tiles array? Or do we loop through each buffer still and confirm the zxy intersects with the query lng/lat? Presumably downstream implementors of this library, if using radius=0, will only pass the single buffer, but there's the chance that buffer does not intersect with the query point.

node v10 deprecations breaking travis

Node v10 jobs on travis are currently broken due to:

./src/vtquery.cpp:487:52: error: 'ToObject' is deprecated [-Werror,-Wdeprecated-declarations]
        v8::Local<v8::Object> tile_obj = tile_val->ToObject();
                                                   ^
/home/travis/.node-gyp/10.13.0/include/node/v8.h:2455:10: note: 'ToObject' has been explicitly marked deprecated here
  inline V8_DEPRECATED("Use maybe version", Local<Object> ToObject() const);
         ^
/home/travis/.node-gyp/10.13.0/include/node/v8config.h:327:29: note: expanded from macro 'V8_DEPRECATED'
  declarator __attribute__((deprecated))
                            ^
../src/vtquery.cpp:497:49: error: 'ToObject' is deprecated [-Werror,-Wdeprecated-declarations]
        v8::Local<v8::Object> buffer = buf_val->ToObject();

This could be solved by turning deprecations back into warnings like:

diff --git a/binding.gyp b/binding.gyp
index 6540cb7..cf2f13d 100644
--- a/binding.gyp
+++ b/binding.gyp
@@ -21,7 +21,8 @@
         '-Wuninitialized',
         '-Wunreachable-code',
         '-Wold-style-cast',
-        '-Wno-error=unused-variable'
+        '-Wno-error=unused-variable',
+        '-Wno-error=deprecated-declarations'
       ]
   },
   # `targets` is a list of targets for gyp to run.

or by fixing the deprecations, similar to mapbox/vtcomposite@9282bee

or just by porting to N-API

/cc @mapsam @flippmoke @mapbox/maps-api

union geometries

In order to more effectively solve deduping and cross-tile querying, it'd be great to union intersecting geometries (such as those split by a tile boundary).

This will require some work with boost and spatial-algorithms in order to move forward.

cc @artemp @flippmoke

radius=0 leads to zero results with PIP

When radius is 0, we expect point in polygon style results (ie 1 result if we’re within a single polygon). Right now we get zero results since the query point is a double (lng lat) but by the time we’re comparing the distance from closest_point we’re in integers, which then get converted to a lng lat and can be different than the original query point since it’s an interpolation. This means the distance check looks something like if (0.00011241241 <= 0) { ... } which evaluates to false.

Three possible solutions:

  1. We’ll want to use something other than closest_point for PIP (radius=0) operations
  2. Do all calculations in doubles instead of ints
  3. Instead of using zero (if radius is zero) we use some sort of very small number like if (0.0000234 < 0.5) to catch these instances

cc @flippmoke

API structure

Current options:

🍎 Single buffer query

const vtquery = require('vtquery');
vtquery(vector_tile_buffer, lng, lat, function(err, results) {
  // results
});

🍌 Multiple buffers in query

const vtquery = require('vtquery');
const bufferArray = [buffer1, buffer2, buffer3];
vtquery(bufferArray, lng, lat, function(err, results) {
  // results
});

🍊 options

const vtquery = require('vtquery');
const bufferArray = [buffer1, buffer2, buffer3];
const options = {
  longitude: 20.2,
  latitude: 47.3,
  radius: 10 // which unit? 
};
vtquery(bufferArray, options, function(err, results) {
  // results
});

🍇 options with bufferArray included

const vtquery = require('vtquery');
const options = {
  longitude: 20.2,
  latitude: 47.3,
  radius: 10, // which unit? 
  tiles: [
    { buffer: new Buffer(1), z: 1, x: 1, y: 1 },
    { buffer: new Buffer(2), z: 1, x: 0, y: 1 }
  ]
};
vtquery(options, function(err, results) {
  // results
});

Since we'll be required to turn tile coordinates into lng/lat, we'll need to know zxy values for each buffer, which means we are forced into option 🍇 unless we go with option 🍌 where we only query a single buffer and we'll know the zxy in javascript and can use spherical mercator.

Leaning towards 🍇.

cc @GretaCB @millzpaugh @springmeyer

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.