Comments (24)
Thanks @ianwjhalliday for pointing to the leveldown benchmark. This looks like a pretty representative test of ordinary database operations: it writes about a million entries, producing a database of around 110MB. I ran the benchmark on the same machine and binaries listed in the above comment and found NAPI has about a 5% perf impact. Here are the results for a few runs:
| Leveldown Nan on V8-Node 8.x | Leveldown NAPI on V8-Node-Napi 8.x |
|---|---|
| Elapsed: 45.867s | Elapsed: 47.619s |
| Elapsed: 44.805s | Elapsed: 47.535s |
| Elapsed: 45.134s | Elapsed: 47.506s |
| Elapsed: 45.054s | Elapsed: 46.482s |
| Elapsed: 44.739s | Elapsed: 47.694s |
| avg(elapsed) 45.1198s | avg(elapsed) 47.3672s (+4.98%) |
This workload is not very chatty in terms of calls out to the napi API, so the measured perf impact may be representative generally. Reducing the overhead of napi callback functions, which need to fetch the argument count, argument values, `this` value, etc., by consolidating all of those into a single call may help minimize the perf impact here, as well as in other modules. This needs some more investigation.
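For reference, the consolidated call described here is what appears later in this thread as `napi_get_cb_info`: a single call returns the argument count, the argument values, the `this` value, and the callback data. A minimal sketch of its use:

```c
#include <node_api.h>

// Minimal sketch: one napi_get_cb_info call fetches argc, argv, the
// `this` value, and the callback's associated data together, instead
// of one engine round-trip per item.
static napi_value Example(napi_env env, napi_callback_info info) {
  size_t argc = 2;   // in: capacity of argv; out: actual argument count
  napi_value argv[2];
  napi_value this_arg;
  void* data;        // pointer registered when the callback was created
  napi_get_cb_info(env, info, &argc, argv, &this_arg, &data);
  return argv[0];    // e.g. echo the first argument back
}
```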

---

@boingoing - any update on this?

---

| Leveldown Nan on V8-Node 8.x | Leveldown NAPI on V8-Node-Napi 8.x |
|---|---|
| Node https://github.com/nodejs/abi-stable-node api-prototype-8.x commit 1b30df1 | Node https://github.com/nodejs/abi-stable-node api-prototype-8.x commit 6e70e77 |
| Leveldown https://github.com/boingoing/leveldown master commit boingoing/leveldown@ef05005 | Leveldown https://github.com/boingoing/leveldown api-opaque-prototype commit boingoing/leveldown@48c8b83 |
| Elapsed: 45.951 Kernel elapsed: 1.234 User elapsed: 2.359 | Elapsed: 45.169 Kernel elapsed: 1.234 User elapsed: 2.296 |
| Elapsed: 43.893 Kernel elapsed: 1.265 User elapsed: 2.109 | Elapsed: 45.044 Kernel elapsed: 0.968 User elapsed: 2.171 |
| Elapsed: 46.150 Kernel elapsed: 0.968 User elapsed: 2.328 | Elapsed: 44.609 Kernel elapsed: 1.109 User elapsed: 1.812 |
| Elapsed: 45.691 Kernel elapsed: 1.078 User elapsed: 2.500 | Elapsed: 45.070 Kernel elapsed: 1.156 User elapsed: 2.171 |
| Elapsed: 42.064 Kernel elapsed: 0.859 User elapsed: 2.031 | Elapsed: 44.671 Kernel elapsed: 1.062 User elapsed: 1.953 |
| avg(elapsed) 44.7498 | avg(elapsed) 44.9126 (+0.3%) |
The above runs the leveldown unit test suite against the plain or instrumented Node/Leveldown binaries built from the respective repositories. Elapsed time for the suite is measured as the performance metric.

Tests were run on a Windows machine with this hardware:

- Intel Xeon E5-1620 v3 @ 3.50GHz
- 16GB DDR4 @ 2133MHz
- Samsung XP941 SSD

---

Is there a perf test in leveldown, or is this just running the unit tests? If there is no perf test, I think we might want to write at least a simple one that does something typical and measure that as well.

---

Next steps:

- Try it with a custom perf test, e.g. inserting a large data set into the database and reading it back out.
- Run it on other platforms such as macOS, Linux, etc.

---

leveldown does have its own perf benchmark suite. @boingoing is that included in your data?

---

@boingoing what is the unit of elapsed time?

---

@sampsongao all units are in seconds.

---

> Reducing the overhead of napi callback functions, which need to fetch the argument count, argument values, `this` value, etc., by consolidating all of those into a single call may help minimize the perf impact here, as well as in other modules.
I am working on a cuckoo hash table. Just calling into JS to fetch the various arguments is costing half the run time. Being able to fetch all arguments in a single call would be terrific.

---

@jorangreef I do not see a lot of opportunities to improve `napi_get_cb_info()`. We have already implemented the change you quoted (whereby you retrieve all function arguments in one call). @kenny-y do you see a way we can further reduce the overhead of that API call?

---

Thanks @gabrielschulhof
Sorry, I was not clear. What I meant was that I have a napi method which just does this:

```c
// Locals receiving the decoded arguments:
void* context;
char* key;
size_t keyLength;
int64_t keyOffset;
char* value;
size_t valueLength;
int64_t valueOffset;
size_t argc = 5;
napi_value argv[5];
napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
napi_get_value_external(env, argv[0], &context);
napi_get_buffer_info(env, argv[1], (void**) &key, &keyLength);
napi_get_value_int64(env, argv[2], &keyOffset);
napi_get_buffer_info(env, argv[3], (void**) &value, &valueLength);
napi_get_value_int64(env, argv[4], &valueOffset);
```

This alone takes a total of 200ms for 1,000,000 calls. Each napi call here adds overhead: about 100ms is spent in `napi_get_cb_info` and the rest is split across the others.

Is that right?

---

@jorangreef I experimentally added a benchmark to https://github.com/nodejs/node/tree/master/benchmark/napi/function_args with your combination of parameters in order to examine how much faster V8 is than N-API.

The V8 benchmark:

```cpp
#include <node.h>
#include <node_buffer.h>

using v8::External;
using v8::FunctionCallbackInfo;
using v8::Value;

void CallWithCombo1(const FunctionCallbackInfo<Value>& args) {
  v8::Local<v8::Context> context = args.GetIsolate()->GetCurrentContext();
  void* data = args[0].As<External>()->Value();
  (void) data;
  size_t buf1_length = node::Buffer::Length(args[1]);
  (void) buf1_length;
  char* buf1_data = node::Buffer::Data(args[1]);
  (void) buf1_data;
  int64_t int1 = args[2]->IntegerValue(context).FromJust();
  (void) int1;
  size_t buf2_length = node::Buffer::Length(args[3]);
  (void) buf2_length;
  char* buf2_data = node::Buffer::Data(args[3]);
  (void) buf2_data;
  int64_t int2 = args[4]->IntegerValue(context).FromJust();
  (void) int2;
}
```
The N-API benchmark:

```c
#include <node_api.h>
#include <stdint.h>

static napi_value CallWithCombo1(napi_env env, napi_callback_info info) {
  size_t argc = 5;
  napi_value argv[5];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
  void* data;
  napi_get_value_external(env, argv[0], &data);
  size_t buf1_length;
  void* buf1_data;
  napi_get_buffer_info(env, argv[1], &buf1_data, &buf1_length);
  int64_t int1;
  napi_get_value_int64(env, argv[2], &int1);
  size_t buf2_length;
  void* buf2_data;
  napi_get_buffer_info(env, argv[3], &buf2_data, &buf2_length);
  int64_t int2;
  napi_get_value_int64(env, argv[4], &int2);
  return NULL;
}
```
The result: there's definitely a performance degradation there, but, of course, that is to be expected. The question is: does the benefit of ABI stability outweigh the performance cost, and can we improve the performance?

---

One potential improvement I've found is to have a version of `napi_get_value_int64()` which is faster than what we currently have, at the expense of not dealing consistently with NaN and positive and negative infinity:

```cpp
napi_status napi_get_value_int64_fast(napi_env env,
                                      napi_value value,
                                      int64_t* result) {
  // Omit NAPI_PREAMBLE and GET_RETURN_STATUS because V8 calls here cannot
  // throw JS exceptions.
  CHECK_ENV(env);
  CHECK_ARG(env, value);
  CHECK_ARG(env, result);
  v8::Local<v8::Value> val = v8impl::V8LocalValueFromJsValue(value);
  // Empty context: https://github.com/nodejs/node/issues/14379
  v8::Local<v8::Context> context;
  *result = val->IntegerValue(context).FromJust();
  return napi_clear_last_error(env);
}
```
This doesn't seem to buy us that much though 😕, so I'm not sure it's worth implementing `napi_get_value_int64_fast()`.

---

@gabrielschulhof Not yet, I'm still trying to figure out whether there is anything that I can squeeze out...

---

Thanks for the benchmark @gabrielschulhof, that's awesome! 😃

> at the expense of not dealing consistently with NaN, and positive and negative infinity

Just speaking for myself, I personally would not mind if methods like `napi_get_value_int64` had these optimizations by default and we had slower `_validated` alternative methods for those who really need the checks done on the C side. These `_validated` methods could then do more checks and afford to be more paranoid. For example, with something like `napi_get_value_uint32` I think I have seen that overflow (without any exception) can occur if the user passes something greater than a `uint32`. If these checks are important to the module author, then they can always do the validations on the JS side of their module, and this should be faster. If they really need the validations done on the C side of their module, then they can fall back to the `_validated` alternatives.

Are there any optimizations that can be done for `napi_get_buffer_info`? For example, a version that only gets a pointer to the buffer, not the length? This might help in cases where the module author is already doing validation and throwing exceptions on the JS side.

Regarding `napi_get_cb_info`, is there a way to have JS call an N-API module method AND collect the arguments while still on the JS side, i.e. without the C method having to call back to JS?

If I understand things correctly, at present it's like this: JS calls the C method, and the C method calls back to JS asking for the arguments. Is there theoretically any way to do this instead: N-API JS collects all arguments and calls the N-API C method with argv?

At present, the N-API version of my hash table is slower than the JS version. The JS version can set an element in the hash table faster than the N-API version can just fetch the arguments (let alone set an element). I understand a hash table is by nature more chatty than typical N-API use cases, but it would be great if argument passing was not such a bottleneck.

---

We don't really call back into JS when retrieving arguments. That is, no JS code gets executed as part of argument retrieval. We do have to copy the arguments out of C++, though, because although they appear to be an array, with C++'s operator overloading we cannot assume that they map to a plain C array – in fact, they most likely don't.

---

@jorangreef one possible solution I can think of would be to use an `ArrayBuffer` instance and several `TypedArray` instances to construct a native structure, and then pass only a single argument to the binding. The binding can then cast the `ArrayBuffer` data to a native structure.
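A minimal sketch of the binding side of this idea, assuming a hypothetical packed argument structure (the struct layout and names are illustrative, not an existing API):

```c
#include <node_api.h>
#include <stdint.h>

// Hypothetical argument block that the JS side fills in through a
// Uint32Array view over a single ArrayBuffer before each call.
typedef struct {
  uint32_t key_offset;
  uint32_t key_length;
  uint32_t value_offset;
  uint32_t value_length;
} arg_block;

static napi_value SetFromBlock(napi_env env, napi_callback_info info) {
  size_t argc = 1;
  napi_value argv[1];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);

  void* data;
  size_t byte_length;
  // One call retrieves the entire argument block.
  napi_get_arraybuffer_info(env, argv[0], &data, &byte_length);

  arg_block* args = (arg_block*) data;  // reinterpret the raw bytes
  (void) args;
  return NULL;
}
```

The JS side would write into typed-array views over the same `ArrayBuffer` before each call, so only one argument crosses the boundary.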

---

@jorangreef I tested the `uint32_t` truncation, and AFAICT it's working as advertised. Please have a look at the PR and let me know if I've missed any scenarios!
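For reference, a sketch of the kind of round-trip check being described (the function name is hypothetical): JS passes a number in, the addon reads it as `uint32_t` and hands it back, so out-of-range inputs can be observed coming back truncated modulo 2^32 (e.g. 2^32 + 1 is expected to read back as 1) rather than overflowing in an undefined way:

```c
#include <node_api.h>
#include <stdint.h>

// Hypothetical test helper: decode argv[0] as uint32 and return it,
// making any truncation visible to the JS caller.
static napi_value RoundTripUint32(napi_env env, napi_callback_info info) {
  size_t argc = 1;
  napi_value argv[1];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);

  uint32_t value;
  napi_get_value_uint32(env, argv[0], &value);

  napi_value result;
  napi_create_uint32(env, value, &result);
  return result;
}
```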

---

@jorangreef OK, the external still has to go in a separate argument.

---

I just realized that using a large `ArrayBuffer` to hold the arguments implies that the memory for the two `Buffer`s is contiguous; otherwise it requires copying. Is this the case for your addon and, if not, can you rearrange the memory to make this the case?
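For illustration, one hypothetical way to satisfy that contiguity requirement is a single length-prefixed allocation (the type and field names are assumptions):

```c
#include <stdint.h>

// Hypothetical contiguous layout: lengths up front, then the key bytes
// immediately followed by the value bytes, all in one ArrayBuffer.
typedef struct {
  uint32_t key_length;
  uint32_t value_length;
  uint8_t bytes[];  // key_length key bytes, then value_length value bytes
} packed_entry;
```

The addon could then read both the key and the value out of one `ArrayBuffer` without any copying.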

---

Argh ... no Int64 `TypedArray`!

---

Would it make sense to have this in a new open issue?

---

@jorangreef I'm curious - are you using N-API directly, or via the node-addon-api package? If via the package, are you building on a version of Node.js that has an implementation of N-API? We've recently improved the performance of N-API function invocation.
This improvement has not yet landed in node-addon-api's version of N-API, but I have a PR in the pipe which should bring the changes to node-addon-api.
The performance gains are charted in the PR.

---

@jorangreef `napi_get_buffer_info()` accepts `nullptr` for the information you're not interested in. So, you could pass `nullptr` for the length to receive only the data.
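A small sketch of that, passing `NULL` for the unwanted length out-parameter:

```c
#include <node_api.h>

// Retrieve only the data pointer of a Buffer; NULL skips the length.
static char* BufferData(napi_env env, napi_value buf) {
  void* data;
  napi_get_buffer_info(env, buf, &data, NULL);
  return (char*) data;
}
```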