Comments (14)

nmarcetic commented on May 18, 2024

@ioolkos Ah I see, the still-running nodes will now accept requests, got it now ;)
OK, testing now.

I scaled the StatefulSet like this:
kubectl scale statefulset mfx-adapter-mqtt --replicas=5 --namespace mfx

dergraf commented on May 18, 2024

Yeah, this is because of the enabled consistency features. From VerneMQ's perspective it looks like either the node is down or a network problem separates your two nodes. By default VerneMQ trades availability for consistency, which means that VerneMQ will stop serving requests (CONNECT, PUBLISH, SUBSCRIBE) and therefore favors consistency, e.g. uniqueness of client-ids among all cluster nodes.

But you can CHANGE this behaviour by trading consistency for availability!

What you are giving up if you trade consistency for availability:

Setting trade_consistency = on in vernemq.conf allows your currently registered clients to keep publishing/consuming messages (however, they can't subscribe to new topics), but new clients CAN'T register, because VerneMQ 'tries' to ensure uniqueness of the client-ids in the cluster.

Setting allow_multiple_sessions = on in vernemq.conf allows multiple clients to connect to the cluster using the same client-id; they then share the same subscriptions.
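As a minimal sketch, the corresponding vernemq.conf lines would look roughly like this (illustrative only; note that a later comment in this thread marks these two settings as obsolete in newer releases):

# sketch: trade consistency for availability, see the clustering docs linked below
trade_consistency = on
allow_multiple_sessions = on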

Depending on your use case, combined with proper authentication/authorization, trading consistency might be a good thing. It is definitely useful for a high-availability cluster setup.

Please check https://vernemq.com/docs/clustering.html to make sure you understand the consequences.

bayuemu commented on May 18, 2024

Thanks, I'll try it.

bayuemu commented on May 18, 2024

Runs OK! Thanks.

vedavidhbudimuri commented on May 18, 2024

Hey @dergraf

I'm also facing the same issue while hosting two VerneMQ nodes as two Docker containers on an EC2 instance.
Each node is responding to the other's ping request, so I guess they are connected.
I have attached the logs in the screenshot below.

[Screenshot: VerneMQ logs, 2017-08-21, 3:25 PM]

larshesel commented on May 18, 2024

@vedavidhbudimuri this happens when your cluster is partitioned. Check the cluster state using vmq-admin cluster show.
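For example, assuming the containers run the standard docker-vernemq image, something like this from the EC2 host (the container name is a placeholder):

docker exec -it <your-vernemq-container> vmq-admin cluster show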

vedavidhbudimuri commented on May 18, 2024

@larshesel vmq-admin cluster show

shows two of the nodes as true and the rest of the nodes as false. Is the issue because of that?
Initially, I tried with 6 nodes and later scaled down to two nodes.

larshesel commented on May 18, 2024

Yes, that's the problem. You have to explicitly tell VerneMQ that the nodes are gone, otherwise VerneMQ has no idea that this isn't a netsplit situation. See http://vernemq.com/docs/clustering/ and http://vernemq.com/docs/clustering/netsplits.html.
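Concretely, removing a node that is gone for good looks roughly like this, run from one of the nodes that is still up (the node name is whatever vmq-admin cluster show reports; see the netsplit docs above for the details and flags):

vmq-admin cluster leave node=VerneMQ@<dead-node-name>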

vedavidhbudimuri commented on May 18, 2024

Oops, I missed this. Thank you @larshesel

nmarcetic commented on May 18, 2024

@larshesel @dergraf Does this mean that I can't get k8s pod auto-scaling without consequences? Or even manual scale-down without manually removing nodes from the cluster?

I just faced the exact same problem: I scaled down from 5 to 3 instances and get
can't register client due to reason not_ready
I see 3 nodes with Running true and 2 nodes with Running false in vmq-admin cluster show.
Exactly the same case as described.

Manually scaling and adding/removing nodes from a cluster is not realistic in production or under high load in most use cases (especially when throughput is not constant). Just curious, do you guys have any advice here?

Is there a way to set allow_multiple_sessions and trade_consistency via env vars in the Docker image?

Thanks!

ioolkos commented on May 18, 2024

@nmarcetic thanks...

  1. If you want to be able to allow registrations during a netsplit, you have to allow for it: https://docs.vernemq.com/clustering/netsplits. Set allow_register_during_netsplit = on in the VerneMQ config, and also look at the other allow values described there (see the sketch after this list).
    The config values you mention (allow_multiple_sessions and trade_consistency) are obsolete; don't use them.

  2. The idea is not that you manually make the nodes leave. The pod termination should automatically make the node leave, I think. I don't know what you did and why it didn't work. Maybe related: vernemq/docker-vernemq#193

  3. A VerneMQ cluster is stateful, and membership is part of that state. Verne doesn't just forget a node because it was killed. That's why there's the leave command to tell the cluster. See point 2.
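For point 1, a rough sketch of that setting, either directly in vernemq.conf or (as far as I know) via the docker-vernemq environment-variable mapping, which prefixes the upper-cased setting name with DOCKER_VERNEMQ_:

# vernemq.conf
allow_register_during_netsplit = on

# or, equivalently, as an environment variable for the Docker image
DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT=on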

nmarcetic commented on May 18, 2024

@ioolkos Thanks for the prompt response.

I see now. I will try right now with DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT=on.
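In the StatefulSet spec that would look roughly like this (a sketch, assuming the standard container env syntax):

env:
  - name: DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT
    value: "on"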

nmarcetic commented on May 18, 2024

@ioolkos Looks the same 😞

Running with replicas=5

+-----------------------------------------------------------------+-------+
|                              Node                               |Running|
+-----------------------------------------------------------------+-------+
|VerneMQ@mfx-adapter-mqtt-0.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-1.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-2.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
+-----------------------------------------------------------------+-------+

Scaling down to replicas=3

+-----------------------------------------------------------------+-------+
|                              Node                               |Running|
+-----------------------------------------------------------------+-------+
|VerneMQ@mfx-adapter-mqtt-0.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-1.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-2.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local| false |
|VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local| false |
+-----------------------------------------------------------------+-------+

All pods have DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT: on set as an env variable, as described.
It only works when I scale down to 0 and then scale up to N; then I see a "clean" cluster with all nodes running true.
What am I missing?

ioolkos commented on May 18, 2024

Looks like you're mixing up two things: you now have a netsplit cluster, but unlike before, you can still register clients! This is what you configured with ALLOW_REGISTER. Normally, you could restart those 2 missing nodes, and then the netsplit would heal.

Your other completely unrelated issue is that you want to make those 2 nodes leave in Kubernetes. I do not know why this doesn't work as expected in your case. How do you actually scale down?
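For reference, one way to clean up after a manual scale-down, assuming no automated leave hook kicks in (node names taken from your cluster show output above; run the leave commands on one of the remaining pods, e.g. via kubectl exec):

# reduce the replica count first
kubectl scale statefulset mfx-adapter-mqtt --replicas=3 --namespace mfx
# then tell the cluster that the two terminated nodes are gone for good
vmq-admin cluster leave node=VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local
vmq-admin cluster leave node=VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local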
