Comments (24)
Thanks @ianwjhalliday for pointing to the leveldown benchmark. This looks like a pretty representative test of ordinary database operations: it writes about a million entries, producing a database of around 110MB. I ran the benchmark on the same machine and binaries listed in the above comment and found NAPI has about a 5% perf impact. Here are the results for a few runs:
| Leveldown Nan on V8-Node 8.x | Leveldown NAPI on V8-Node-Napi 8.x |
|---|---|
| Elapsed: 45.867s | Elapsed: 47.619s |
| Elapsed: 44.805s | Elapsed: 47.535s |
| Elapsed: 45.134s | Elapsed: 47.506s |
| Elapsed: 45.054s | Elapsed: 46.482s |
| Elapsed: 44.739s | Elapsed: 47.694s |
| avg(elapsed) 45.1198s | avg(elapsed) 47.3672s (+4.98%) |
This workload is not very chatty in terms of calls out to the napi API, so the measured perf impact may be representative generally. Reducing the overhead of napi callback functions, which need to fetch the argument count, argument values, `this` value, etc., by consolidating all of those into a single call may help minimize the perf impact here, as well as in other modules. This needs some more investigation.
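For reference, the consolidated call described here is what appears later in this thread as `napi_get_cb_info`: a single call returns the argument count, the argument values, the `this` value, and the callback data. A minimal sketch of its use:

```c
#include <node_api.h>

// Minimal sketch: one napi_get_cb_info call fetches argc, argv, the
// `this` value, and the callback's associated data together, instead
// of one engine round-trip per item.
static napi_value Example(napi_env env, napi_callback_info info) {
  size_t argc = 2;   // in: capacity of argv; out: actual argument count
  napi_value argv[2];
  napi_value this_arg;
  void* data;        // pointer registered when the callback was created
  napi_get_cb_info(env, info, &argc, argv, &this_arg, &data);
  return argv[0];    // e.g. echo the first argument back
}
```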

---

@boingoing - any update on this?

---

| Leveldown Nan on V8-Node 8.x | Leveldown NAPI on V8-Node-Napi 8.x |
|---|---|
| Node https://github.com/nodejs/abi-stable-node api-prototype-8.x commit 1b30df1 | Node https://github.com/nodejs/abi-stable-node api-prototype-8.x commit 6e70e77 |
| Leveldown https://github.com/boingoing/leveldown master commit boingoing/leveldown@ef05005 | Leveldown https://github.com/boingoing/leveldown api-opaque-prototype commit boingoing/leveldown@48c8b83 |
| Elapsed: 45.951 Kernel elapsed: 1.234 User elapsed: 2.359 | Elapsed: 45.169 Kernel elapsed: 1.234 User elapsed: 2.296 |
| Elapsed: 43.893 Kernel elapsed: 1.265 User elapsed: 2.109 | Elapsed: 45.044 Kernel elapsed: 0.968 User elapsed: 2.171 |
| Elapsed: 46.150 Kernel elapsed: 0.968 User elapsed: 2.328 | Elapsed: 44.609 Kernel elapsed: 1.109 User elapsed: 1.812 |
| Elapsed: 45.691 Kernel elapsed: 1.078 User elapsed: 2.500 | Elapsed: 45.070 Kernel elapsed: 1.156 User elapsed: 2.171 |
| Elapsed: 42.064 Kernel elapsed: 0.859 User elapsed: 2.031 | Elapsed: 44.671 Kernel elapsed: 1.062 User elapsed: 1.953 |
| avg(elapsed) 44.7498 | avg(elapsed) 44.9126 (+0.3%) |
The above runs the leveldown unit test suite against the plain or instrumented Node/Leveldown binaries built from the respective repositories. Elapsed time for the suite is measured as the performance metric.

Tests were run on a Windows machine with this hardware:

- Intel Xeon E5-1620 v3 @ 3.50GHz
- 16GB DDR4 @ 2133MHz
- Samsung XP941 SSD

---

Is there a perf test in leveldown, or is this just running the unit tests? If there is no perf test, I think we might want to write at least a simple one that does something typical and measure that as well.

---

Next steps:

- Try it with a custom perf test, e.g. inserting a large data set into the database and reading it back out.
- Run it on other platforms such as macOS, Linux, etc.

---

leveldown does have its own perf benchmark suite. @boingoing is that included in your data?

---

@boingoing what is the unit of elapsed time?

---

@sampsongao all units are in seconds.

---

> Reducing the overhead of napi callback functions, which need to fetch the argument count, argument values, `this` value, etc., by consolidating all of those into a single call may help minimize the perf impact here, as well as in other modules.
I am working on a cuckoo hash table. Just calling into JS to fetch the various arguments is costing half the run time. Being able to fetch all arguments in a single call would be terrific.

---

@jorangreef I do not see a lot of opportunities to improve `napi_get_cb_info()`. We have already implemented the change you quoted (whereby you retrieve all function arguments in one call). @kenny-y do you see a way we can further reduce the overhead of that API call?

---

Thanks @gabrielschulhof
Sorry, I was not clear. What I meant was that I have a napi method which just does this:

```c
// Locals receiving the decoded arguments:
void* context;
char* key;
size_t keyLength;
int64_t keyOffset;
char* value;
size_t valueLength;
int64_t valueOffset;
size_t argc = 5;
napi_value argv[5];
napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
napi_get_value_external(env, argv[0], &context);
napi_get_buffer_info(env, argv[1], (void**) &key, &keyLength);
napi_get_value_int64(env, argv[2], &keyOffset);
napi_get_buffer_info(env, argv[3], (void**) &value, &valueLength);
napi_get_value_int64(env, argv[4], &valueOffset);
```

This alone takes a total of 200ms for 1,000,000 calls. Each napi call here adds overhead: about 100ms is spent in `napi_get_cb_info` and the rest is split across the others.

Is that right?

---

@jorangreef I experimentally added a benchmark to https://github.com/nodejs/node/tree/master/benchmark/napi/function_args with your combination of parameters in order to examine how much faster V8 is than N-API.

The V8 benchmark:

```cpp
#include <node.h>
#include <node_buffer.h>

using v8::External;
using v8::FunctionCallbackInfo;
using v8::Value;

void CallWithCombo1(const FunctionCallbackInfo<Value>& args) {
  v8::Local<v8::Context> context = args.GetIsolate()->GetCurrentContext();
  void* data = args[0].As<External>()->Value();
  (void) data;
  size_t buf1_length = node::Buffer::Length(args[1]);
  (void) buf1_length;
  char* buf1_data = node::Buffer::Data(args[1]);
  (void) buf1_data;
  int64_t int1 = args[2]->IntegerValue(context).FromJust();
  (void) int1;
  size_t buf2_length = node::Buffer::Length(args[3]);
  (void) buf2_length;
  char* buf2_data = node::Buffer::Data(args[3]);
  (void) buf2_data;
  int64_t int2 = args[4]->IntegerValue(context).FromJust();
  (void) int2;
}
```
The N-API benchmark:

```c
#include <node_api.h>
#include <stdint.h>

static napi_value CallWithCombo1(napi_env env, napi_callback_info info) {
  size_t argc = 5;
  napi_value argv[5];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
  void* data;
  napi_get_value_external(env, argv[0], &data);
  size_t buf1_length;
  void* buf1_data;
  napi_get_buffer_info(env, argv[1], &buf1_data, &buf1_length);
  int64_t int1;
  napi_get_value_int64(env, argv[2], &int1);
  size_t buf2_length;
  void* buf2_data;
  napi_get_buffer_info(env, argv[3], &buf2_data, &buf2_length);
  int64_t int2;
  napi_get_value_int64(env, argv[4], &int2);
  return NULL;
}
```
The result: there's definitely a performance degradation there, but, of course, that is to be expected. The question is: does the benefit of ABI stability outweigh the performance cost, and can we improve the performance?

---

One potential improvement I've found is to have a version of `napi_get_value_int64()` which is faster than what we currently have, at the expense of not dealing consistently with NaN and positive and negative infinity:

```cpp
napi_status napi_get_value_int64_fast(napi_env env,
                                      napi_value value,
                                      int64_t* result) {
  // Omit NAPI_PREAMBLE and GET_RETURN_STATUS because V8 calls here cannot
  // throw JS exceptions.
  CHECK_ENV(env);
  CHECK_ARG(env, value);
  CHECK_ARG(env, result);
  v8::Local<v8::Value> val = v8impl::V8LocalValueFromJsValue(value);
  // Empty context: https://github.com/nodejs/node/issues/14379
  v8::Local<v8::Context> context;
  *result = val->IntegerValue(context).FromJust();
  return napi_clear_last_error(env);
}
```
This doesn't seem to buy us that much though 😕, so I'm not sure it's worth implementing `napi_get_value_int64_fast()`.

---

@gabrielschulhof Not yet, I'm still trying to figure out whether there is anything that I can squeeze out...

---

Thanks for the benchmark @gabrielschulhof, that's awesome! 😃

> at the expense of not dealing consistently with NaN, and positive and negative infinity

Just speaking for myself, I personally would not mind if methods like `napi_get_value_int64` had these optimizations by default and we had slower `_validated` alternative methods for those who really need the checks done on the C side. These `_validated` methods could then do more checks and afford to be more paranoid. For example, with something like `napi_get_value_uint32` I think I have seen that overflow (without any exception) can occur if the user passes something greater than a `uint32`. If these checks are important to the module author, then they can always do the validations on the JS side of their module, and this should be faster. If they really need the validations done on the C side of their module, then they can fall back to the `_validated` alternatives.

Are there any optimizations that can be done for `napi_get_buffer_info`? For example, a version that only gets a pointer to the buffer, not the length? This might help in cases where the module author is already doing validation and throwing exceptions on the JS side.

Regarding `napi_get_cb_info`, is there a way to have JS call an N-API module method AND collect the arguments while still on the JS side, i.e. without the C method having to call back to JS?

If I understand things correctly, at present it's like this: JS calls the C method, and the C method calls back to JS asking for the arguments. Is there theoretically any way to do this instead: N-API JS collects all arguments and calls the N-API C method with argv?

At present, the N-API version of my hash table is slower than the JS version. The JS version can set an element in the hash table faster than the N-API version can just fetch the arguments (let alone set an element). I understand a hash table is by nature more chatty than typical N-API use cases, but it would be great if argument passing was not such a bottleneck.

---

We don't really call back into JS when retrieving arguments. That is, no JS code gets executed as part of argument retrieval. We do have to copy the arguments out of C++, though, because although they appear to be an array, with C++'s operator overloading we cannot assume that they map to a plain C array – in fact, they most likely don't.

---

@jorangreef one possible solution I can think of would be to use an `ArrayBuffer` instance and several `TypedArray` instances to construct a native structure, and then pass only a single argument to the binding. The binding can then cast the `ArrayBuffer` data to a native structure.
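A minimal sketch of the binding side of this idea, assuming a hypothetical packed argument structure (the struct layout and names are illustrative, not an existing API):

```c
#include <node_api.h>
#include <stdint.h>

// Hypothetical argument block that the JS side fills in through a
// Uint32Array view over a single ArrayBuffer before each call.
typedef struct {
  uint32_t key_offset;
  uint32_t key_length;
  uint32_t value_offset;
  uint32_t value_length;
} arg_block;

static napi_value SetFromBlock(napi_env env, napi_callback_info info) {
  size_t argc = 1;
  napi_value argv[1];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);

  void* data;
  size_t byte_length;
  // One call retrieves the entire argument block.
  napi_get_arraybuffer_info(env, argv[0], &data, &byte_length);

  arg_block* args = (arg_block*) data;  // reinterpret the raw bytes
  (void) args;
  return NULL;
}
```

The JS side would write into typed-array views over the same `ArrayBuffer` before each call, so only one argument crosses the boundary.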

---

@jorangreef I tested the `uint32_t` truncation, and AFAICT it's working as advertised. Please have a look at the PR and let me know if I've missed any scenarios!
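For reference, a sketch of the kind of round-trip check being described (the function name is hypothetical): JS passes a number in, the addon reads it as `uint32_t` and hands it back, so out-of-range inputs can be observed coming back truncated modulo 2^32 (e.g. 2^32 + 1 is expected to read back as 1) rather than overflowing in an undefined way:

```c
#include <node_api.h>
#include <stdint.h>

// Hypothetical test helper: decode argv[0] as uint32 and return it,
// making any truncation visible to the JS caller.
static napi_value RoundTripUint32(napi_env env, napi_callback_info info) {
  size_t argc = 1;
  napi_value argv[1];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);

  uint32_t value;
  napi_get_value_uint32(env, argv[0], &value);

  napi_value result;
  napi_create_uint32(env, value, &result);
  return result;
}
```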

---

@jorangreef OK, the external still has to go in a separate argument.

---

I just realized that using a large `ArrayBuffer` to hold the arguments implies that the memory for the two `Buffer`s is contiguous; otherwise it requires copying. Is this the case for your addon and, if not, can you rearrange the memory to make this the case?
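For illustration, one hypothetical way to satisfy that contiguity requirement is a single length-prefixed allocation (the type and field names are assumptions):

```c
#include <stdint.h>

// Hypothetical contiguous layout: lengths up front, then the key bytes
// immediately followed by the value bytes, all in one ArrayBuffer.
typedef struct {
  uint32_t key_length;
  uint32_t value_length;
  uint8_t bytes[];  // key_length key bytes, then value_length value bytes
} packed_entry;
```

The addon could then read both the key and the value out of one `ArrayBuffer` without any copying.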

---

Argh ... no Int64 `TypedArray`!

---

Would it make sense to have this in a new open issue?

---

@jorangreef I'm curious - are you using N-API directly, or via the node-addon-api package? If via the package, are you building on a version of Node.js that has an implementation of N-API? We've recently improved the performance of N-API function invocation.
This improvement has not yet landed in node-addon-api's version of N-API, but I have a PR in the pipe which should bring the changes to node-addon-api.
The performance gains are charted in the PR.

---

@jorangreef `napi_get_buffer_info()` accepts `nullptr` for the information you're not interested in. So, you could pass `nullptr` for the length to receive only the data.
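A small sketch of that, passing `NULL` for the unwanted length out-parameter:

```c
#include <node_api.h>

// Retrieve only the data pointer of a Buffer; NULL skips the length.
static char* BufferData(napi_env env, napi_value buf) {
  void* data;
  napi_get_buffer_info(env, buf, &data, NULL);
  return (char*) data;
}
```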