Comments (11)
To reproduce, you can modify the pub/sub sample to run the following code once a connection has been acquired (using 1.3.4):

```js
// Build a large array (~120,000 one-character strings) to use as the payload.
const SIZE = 120000;
const d = [];
for (let j = 0; j < SIZE; j++) {
  d.push('a');
}

// Publish the payload as JSON every 20 ms (QoS 0).
setInterval(() => {
  connection.publish('test', JSON.stringify({ msg: d }), 0);
}, 20);

// Log heap usage once per second.
setInterval(() => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`The script uses approximately ${Math.round(used * 100) / 100} MB`);
}, 1000);
```
Connecting to Greengrass, this logs:
The script uses approximately 52.79 MB
The script uses approximately 98.75 MB
The script uses approximately 144.13 MB
The script uses approximately 190.45 MB
The script uses approximately 233.85 MB
The script uses approximately 280.58 MB
The script uses approximately 327.43 MB
The script uses approximately 374.23 MB
The script uses approximately 417.31 MB
The script uses approximately 466.4 MB
<--- Last few GCs --->
[12871:0x57ff080] 21723 ms: Mark-sweep (reduce) 492.8 (502.2) -> 492.8 (502.2) MB, 8.7 / 0.0 ms (average mu = 0.248, current mu = 0.002) last resort GC in old space requested
[12871:0x57ff080] 21731 ms: Mark-sweep (reduce) 492.8 (502.2) -> 492.8 (502.2) MB, 7.8 / 0.0 ms (average mu = 0.146, current mu = 0.002) last resort GC in old space requested
<--- JS stacktrace --->
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: 0xa63060 node::Abort() [node]
2: 0x995c57 node::FatalError(char const*, char const*) [node]
3: 0xc3bc1e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xc3bf97 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xe04a05 [node]
6: 0xe16cd1 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
7: 0xdda89a v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
8: 0xdd3da4 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [node]
9: 0xdd5ea0 v8::internal::FactoryBase<v8::internal::Factory>::NewRawOneByteString(int, v8::internal::AllocationType) [node]
10: 0x103bf3a v8::internal::String::SlowFlatten(v8::internal::Isolate*, v8::internal::Handle<v8::internal::ConsString>, v8::internal::AllocationType) [node]
11: 0xc4eef9 v8::String::Utf8Length(v8::Isolate*) const [node]
12: 0xa4091a [node]
13: 0xca876b [node]
14: 0xca9d1c [node]
15: 0xcaa396 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
16: 0x14c8439 [node]
Signal received: -2102827800, errno: 32767
################################################################################
Resolved stacktrace:
################################################################################
0x00007fd2c0b8895b: ?? ??:0
0x00000000000a2213: s_print_stack_trace at module.c:?
0x000000000000f5e0: __restore_rt at sigaction.c:?
0x00007fd2c9cb5277: ?? ??:0
0x00007fd2c9cb6968: ?? ??:0
node() [0xa63071]
node(_ZN4node10FatalErrorEPKcS1_+0) [0x995c57]
node(_ZN2v85Utils16ReportOOMFailureEPNS_8internal7IsolateEPKcb+0x4e) [0xc3bc1e]
node(_ZN2v88internal2V823FatalProcessOutOfMemoryEPNS0_7IsolateEPKcb+0x347) [0xc3bf97]
node() [0xe04a05]
node(_ZN2v88internal4Heap34AllocateRawWithRetryOrFailSlowPathEiNS0_14AllocationTypeENS0_16AllocationOriginENS0_19AllocationAlignmentE+0xf1) [0xe16cd1]
node(_ZN2v88internal7Factory11AllocateRawEiNS0_14AllocationTypeENS0_19AllocationAlignmentE+0x9a) [0xdda89a]
node(_ZN2v88internal11FactoryBaseINS0_7FactoryEE26AllocateRawWithImmortalMapEiNS0_14AllocationTypeENS0_3MapENS0_19AllocationAlignmentE+0x14) [0xdd3da4]
node(_ZN2v88internal11FactoryBaseINS0_7FactoryEE19NewRawOneByteStringEiNS0_14AllocationTypeE+0x50) [0xdd5ea0]
node(_ZN2v88internal6String11SlowFlattenEPNS0_7IsolateENS0_6HandleINS0_10ConsStringEEENS0_14AllocationTypeE+0x18a) [0x103bf3a]
node(_ZNK2v86String10Utf8LengthEPNS_7IsolateE+0x19) [0xc4eef9]
node() [0xa4091a]
node() [0xca876b]
node() [0xca9d1c]
node(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0x16) [0xcaa396]
node() [0x14c8439]
################################################################################
Raw stacktrace:
################################################################################
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_backtrace_print+0x4b) [0x7fd2c0b8895b]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xa2213) [0x7fd2c0a88213]
/lib64/libpthread.so.0(+0xf5e0) [0x7fd2ca05b5e0]
/lib64/libc.so.6(gsignal+0x37) [0x7fd2c9cb5277]
/lib64/libc.so.6(abort+0x148) [0x7fd2c9cb6968]
node() [0xa63071]
node(_ZN4node10FatalErrorEPKcS1_+0) [0x995c57]
node(_ZN2v85Utils16ReportOOMFailureEPNS_8internal7IsolateEPKcb+0x4e) [0xc3bc1e]
node(_ZN2v88internal2V823FatalProcessOutOfMemoryEPNS0_7IsolateEPKcb+0x347) [0xc3bf97]
node() [0xe04a05]
node(_ZN2v88internal4Heap34AllocateRawWithRetryOrFailSlowPathEiNS0_14AllocationTypeENS0_16AllocationOriginENS0_19AllocationAlignmentE+0xf1) [0xe16cd1]
node(_ZN2v88internal7Factory11AllocateRawEiNS0_14AllocationTypeENS0_19AllocationAlignmentE+0x9a) [0xdda89a]
node(_ZN2v88internal11FactoryBaseINS0_7FactoryEE26AllocateRawWithImmortalMapEiNS0_14AllocationTypeENS0_3MapENS0_19AllocationAlignmentE+0x14) [0xdd3da4]
node(_ZN2v88internal11FactoryBaseINS0_7FactoryEE19NewRawOneByteStringEiNS0_14AllocationTypeE+0x50) [0xdd5ea0]
node(_ZN2v88internal6String11SlowFlattenEPNS0_7IsolateENS0_6HandleINS0_10ConsStringEEENS0_14AllocationTypeE+0x18a) [0x103bf3a]
node(_ZNK2v86String10Utf8LengthEPNS_7IsolateE+0x19) [0xc4eef9]
node() [0xa4091a]
node() [0xca876b]
node() [0xca9d1c]
node(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0x16) [0xcaa396]
node() [0x14c8439]
Connecting to AWS IoT Core does not work with SIZE=120000; SIZE must be reduced to 20000. The output is similar, but the MQTT connection times out:
The script uses approximately 11.83 MB
The script uses approximately 19.09 MB
The script uses approximately 27.92 MB
The script uses approximately 35.35 MB
The script uses approximately 42.44 MB
The script uses approximately 49.53 MB
The script uses approximately 58.26 MB
The script uses approximately 65.55 MB
The script uses approximately 72.75 MB
The script uses approximately 81.33 MB
The script uses approximately 88.62 MB
The script uses approximately 95.7 MB
The script uses approximately 104.43 MB
The script uses approximately 111.72 MB
The script uses approximately 119 MB
The script uses approximately 127.58 MB
The script uses approximately 134.34 MB
The script uses approximately 142.03 MB
The script uses approximately 150.62 MB
The script uses approximately 157.91 MB
The script uses approximately 165.19 MB
The script uses approximately 173.76 MB
AWS IoT MQTT interrupt
CrtError: libaws-c-mqtt: AWS_ERROR_MQTT_TIMEOUT, Time limit between request and response has been exceeded.
at MqttClientConnection._on_connection_interrupted (/home/ec2-user/environment/device/node_modules/aws-crt/dist/native/mqtt.js:336:32)
at /home/ec2-user/environment/device/node_modules/aws-crt/dist/native/mqtt.js:114:113 {
error: 5129,
error_code: 5129,
error_name: 'AWS_ERROR_MQTT_TIMEOUT'
}
N-API call failed: napi_call_function(env, this_ptr, function, argc, argv, NULL)
@ /codebuild/output/src902820455/src/aws-crt-nodejs/source/module.c:368: napi_pending_exception
Calling (error_code) => { this._on_connection_interrupted(error_code); }
Error: libaws-c-mqtt: AWS_ERROR_MQTT_TIMEOUT, Time limit between request and response has been exceeded.
Stack:
Error: libaws-c-mqtt: AWS_ERROR_MQTT_TIMEOUT, Time limit between request and response has been exceeded.
at MqttClientConnection._on_connection_interrupted (/home/ec2-user/environment/device/node_modules/aws-crt/dist/native/mqtt.js:336:32)
at /home/ec2-user/environment/device/node_modules/aws-crt/dist/native/mqtt.js:114:113
N-API call failed: aws_napi_dispatch_threadsafe_function( env, binding->on_connection_interrupted, NULL, on_interrupted, num_params, params)
@ /codebuild/output/src902820455/src/aws-crt-nodejs/source/mqtt_client_connection.c:96: napi_pending_exception
Fatal error condition occurred in /codebuild/output/src902820455/src/aws-crt-nodejs/source/mqtt_client_connection.c:96: aws_napi_dispatch_threadsafe_function( env, binding->on_connection_interrupted, NULL, on_interrupted, num_params, params)
Exiting Application
################################################################################
Resolved stacktrace:
################################################################################
0x00007f688c34d95b: ?? ??:0
0x00007f688c343423: ?? ??:0
0x00000000000a4111: s_on_connection_interrupted_call at mqtt_client_connection.c:?
node() [0xa32868]
node() [0x144c979]
node(uv_run+0x2f0) [0x14452f0]
node(_ZN4node13SpinEventLoopEPNS_11EnvironmentE+0x135) [0x9b5ed5]
node(_ZN4node16NodeMainInstance3RunEPKNS_16EnvSerializeInfoE+0x170) [0xaa44d0]
node(_ZN4node5StartEiPPc+0x10a) [0xa2fe0a]
0x00007f68907e9445: ?? ??:0
node() [0x9b2ecc]
################################################################################
Raw stacktrace:
################################################################################
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_backtrace_print+0x4b) [0x7f688c34d95b]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_fatal_assert+0x43) [0x7f688c343423]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xa4111) [0x7f688c24f111]
node() [0xa32868]
node() [0x144c979]
node(uv_run+0x2f0) [0x14452f0]
node(_ZN4node13SpinEventLoopEPNS_11EnvironmentE+0x135) [0x9b5ed5]
node(_ZN4node16NodeMainInstance3RunEPKNS_16EnvSerializeInfoE+0x170) [0xaa44d0]
node(_ZN4node5StartEiPPc+0x10a) [0xa2fe0a]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f68907e9445]
node() [0x9b2ecc]
Signal received: -1870884220, errno: 32616
################################################################################
Resolved stacktrace:
################################################################################
0x00007f688c34d95b: ?? ??:0
0x00000000000a2213: s_print_stack_trace at module.c:?
0x000000000000f5e0: __restore_rt at sigaction.c:?
0x00007f68907fd277: ?? ??:0
0x00007f68907fe968: ?? ??:0
0x0000000000198428: aws_fatal_assert at ??:?
0x00000000000a4111: s_on_connection_interrupted_call at mqtt_client_connection.c:?
node() [0xa32868]
node() [0x144c979]
node(uv_run+0x2f0) [0x14452f0]
node(_ZN4node13SpinEventLoopEPNS_11EnvironmentE+0x135) [0x9b5ed5]
node(_ZN4node16NodeMainInstance3RunEPKNS_16EnvSerializeInfoE+0x170) [0xaa44d0]
node(_ZN4node5StartEiPPc+0x10a) [0xa2fe0a]
0x00007f68907e9445: ?? ??:0
node() [0x9b2ecc]
################################################################################
Raw stacktrace:
################################################################################
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_backtrace_print+0x4b) [0x7f688c34d95b]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xa2213) [0x7f688c24d213]
/lib64/libpthread.so.0(+0xf5e0) [0x7f6890ba35e0]
/lib64/libc.so.6(gsignal+0x37) [0x7f68907fd277]
/lib64/libc.so.6(abort+0x148) [0x7f68907fe968]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x198428) [0x7f688c343428]
/home/ec2-user/environment/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xa4111) [0x7f688c24f111]
node() [0xa32868]
node() [0x144c979]
node(uv_run+0x2f0) [0x14452f0]
node(_ZN4node13SpinEventLoopEPNS_11EnvironmentE+0x135) [0x9b5ed5]
node(_ZN4node16NodeMainInstance3RunEPKNS_16EnvSerializeInfoE+0x170) [0xaa44d0]
node(_ZN4node5StartEiPPc+0x10a) [0xa2fe0a]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f68907e9445]
node() [0x9b2ecc]
If `connection.publish` is commented out, memory stays constant.

I run this in the Docker container node:12.18.0-slim.

The issue is that the `publish` method creates a callback closure that captures the `payload` (aws-crt-nodejs/lib/native/mqtt.ts, line 310 in 72267f5).
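To make the mechanism concrete, here is a minimal sketch of the pattern described above. This is not aws-crt-nodejs code: `fakeNativePublish` and `pendingCallbacks` are hypothetical stand-ins for the native binding. The point is that each publish creates a closure capturing the payload; if the native layer holds those callbacks longer than payloads arrive, every captured payload stays reachable and the heap grows.

```javascript
// Hypothetical stand-in for the native binding: it queues the completion
// callback instead of invoking it, the way a backed-up native layer would.
const pendingCallbacks = [];
function fakeNativePublish(payload, onComplete) {
  // Holding the callback keeps the captured `payload` reachable.
  pendingCallbacks.push(onComplete);
}

function publish(topic, payload) {
  return new Promise((resolve) => {
    // This closure captures `payload`; as long as the native layer holds
    // the callback, the payload cannot be garbage collected.
    fakeNativePublish(payload, () => resolve({ topic, bytes: payload.length }));
  });
}

// Publish 100 "large" payloads faster than the native side drains them.
const payloadBytes = 100 * 1024;
for (let i = 0; i < 100; i++) {
  publish('test', 'a'.repeat(payloadBytes));
}

// All 100 closures, and therefore all 100 payloads (~10 MB), are still live.
const retainedMB = (pendingCallbacks.length * payloadBytes) / (1024 * 1024);
console.log(`callbacks pending: ${pendingCallbacks.length}, ~${retainedMB} MB retained`);
```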
Closing this in favor of the related PR: https://www.github.com/awslabs/aws-crt-nodejs/pull/155, to run the build with tests up front. I've rebased the work starting with these changes. Thanks for your help!

I had intended to close the PR, not this issue.
Hey @DavidOgunsAWS, the memory leak is still present. I've taken over from @accnops; we're using version 1.3.8 of your library, as we thought the memory leak had been fixed from version 1.3.6 onwards.

I should elaborate a bit on my use case: currently quite a lot of publishes are happening per second, and we want to send at least 500 packets per second in the next deliverable. In Arthur's post we were still at only 100. As stated, each packet is 700-900 kB. This is a naive implementation that just sends stringified JSON, so we can still save bandwidth by packing the data being sent.

I've tried a few approaches to handling this MQTT connection over the past few days, and unfortunately none of them has worked stably under this load.
Aggregating the data
When I saw that the data was being sent at effectively 500 Hz with no aggregation going on, the CPU was running at over 100%. The first thing I did was add aggregation to the stream, with configurable parameters for either a timeout or a maximum number of aggregated packets before the packet is published.

This reduced the CPU usage a lot, to about 50%, when I chose to aggregate 5 packets. However, the MQTT connection still crashes after a few minutes due to sudden disconnect errors (the same errors that occur when sending a packet that is too large). Additionally, after a crash the process systematically restarts after ~45 seconds and doesn't seem to get out of that loop. It only survives for a few minutes if I turn off the data stream for the first minute, or reduce the packet rate in my environment.

After experimenting with the number of packets to find a crash-free aggregation configuration, I tried a hypothetical smaller packet size of about 70 kB, working from the hypothesis that we should eventually use heavily optimized packed data to achieve a high packet rate. This was more stable, and thanks to that stability we were able to verify that at normal CPU load (~50%) we still see a memory leak. The leak leads to increased CPU consumption, and eventually the CPU usage becomes too high and we experience a crash. My hope is that this is the main underlying issue.
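For reference, the count/timeout aggregation described above can be sketched roughly as follows. The class and parameter names (`Aggregator`, `maxPackets`, `flushTimeoutMs`) are mine for illustration, not from our actual codebase; the publish function is injected so the sketch is independent of the MQTT client.

```javascript
// Sketch of aggregation: buffer packets, flush when either `maxPackets`
// is reached or `flushTimeoutMs` elapses since the batch started.
class Aggregator {
  constructor(publishFn, { maxPackets = 5, flushTimeoutMs = 100 } = {}) {
    this.publishFn = publishFn;
    this.maxPackets = maxPackets;
    this.flushTimeoutMs = flushTimeoutMs;
    this.buffer = [];
    this.timer = null;
  }

  add(packet) {
    this.buffer.push(packet);
    if (this.buffer.length >= this.maxPackets) {
      this.flush();
    } else if (!this.timer) {
      // Start the timeout on the first packet of a new batch.
      this.timer = setTimeout(() => this.flush(), this.flushTimeoutMs);
    }
  }

  flush() {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.publishFn(JSON.stringify(batch));
  }
}

// Usage with a mock publish function: 7 packets, batches of 3.
const published = [];
const agg = new Aggregator((msg) => published.push(msg), { maxPackets: 3 });
for (let i = 0; i < 7; i++) agg.add({ seq: i });
agg.flush(); // flush the remaining partial batch
console.log(published.length); // 3 batches: [0..2], [3..5], [6]
```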
Summary of issues
- The CPU seems to suffer heavily when publishing: I got my Node process to go above 100% CPU when performing 500 publishes per second. So there is a bottleneck, apart from the network itself, on the number of publishes going on.
- The MQTT connection seems to crash on a large packet: when packet sizes are large, a large aggregated packet can't be sent over MQTT.

So there is a fundamental issue here: we're capped on both the rate at which we send data and the size of the data. Additionally, playing with the parameters, I haven't been able to get a stable connection going for the 500 packets per second. Do you think there is a maximum throughput for the MQTT connection?

- Crash triggered immediately when sending a lot of data after connecting.
- Memory leak when continuously publishing data, leading to ever-increasing CPU usage and ultimately crashing the process. Is this actually fixed? It's easy to reproduce.
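One mitigation worth noting for the pile-up described above (a sketch, not something we have verified against the library): `publish` in aws-crt-nodejs returns a Promise, and firing hundreds of publishes per second without awaiting them lets unacknowledged messages accumulate. Awaiting each publish before sending the next applies backpressure, trading throughput for bounded memory. `mockPublish` below is a stand-in for `connection.publish`.

```javascript
// Sketch: await each publish so unacknowledged messages cannot pile up.
let inFlight = 0;
let maxInFlight = 0;

// Stand-in for connection.publish: resolves asynchronously, like a real ack.
function mockPublish(topic, payload) {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) =>
    setTimeout(() => { inFlight--; resolve(); }, 1)
  );
}

async function publishWithBackpressure(packets) {
  for (const packet of packets) {
    // Never more than one message outstanding at a time.
    await mockPublish('test', JSON.stringify(packet));
  }
}

const packets = Array.from({ length: 20 }, (_, i) => ({ seq: i }));
const done = publishWithBackpressure(packets);
done.then(() => console.log(`max in flight: ${maxInFlight}`));
```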
Conclusion
Our next deliverable entails sending 500 packets per second. The next milestone entails 1300 packets per second, and the final product needs about 4500 packets per second. Even after optimizing the data size, I'm not sure we can achieve those numbers with this library. What is your advice on this issue and on the recommended usage of your API? What material could I provide to help you fix these breaking issues?
@laurentva Hi Laurent, if you are using AWS IoT Core as your MQTT broker, you should be aware that a single connection can support a maximum of 100 msg/sec at 5 kB per message, or 512 kB/sec of throughput. If you have sustained traffic at 100 msg/sec / 512 kB/sec or more, I would suggest looking into alternative services such as Amazon Kinesis Data Streams.
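A quick back-of-the-envelope check makes the gap concrete, taking the 512 kB/sec per-connection limit stated above and the 700 kB lower bound on packet size mentioned earlier in the thread:

```javascript
// Back-of-the-envelope check against the stated per-connection limit.
const limitKBps = 512;           // per-connection throughput limit (kB/s)
const packetKB = 700;            // smallest packet size mentioned in the thread
const targetPacketsPerSec = 500; // next deliverable's target rate

const requiredKBps = packetKB * targetPacketsPerSec;
const maxPacketsPerSec = limitKBps / packetKB;

console.log(`required: ${requiredKBps} kB/s`);                         // 350000 kB/s
console.log(`limit allows ~${maxPacketsPerSec.toFixed(2)} packets/s`); // ~0.73
```

That is, the target load is several orders of magnitude above what a single IoT Core connection permits, before any message-count limit is even considered.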
Hey! After a conversation with Massimiliano, he told me I should give you more information about my use case. The AWS IoT MQTT connection is between an on-premise server and a Greengrass edge server, which means the 512 kB/sec per-connection bandwidth limit should not apply.
Any progress on the resolution for this issue?
We replaced aws-crt-nodejs with the generic mqtt package and have moved on 😄 thanks again for the recommendation.
We believe this, and other similar issues, were fixed in v1.9.1.