Comments (14)

nmarcetic commented on May 18, 2024

@ioolkos Ah I see, the still-running nodes will now accept requests, got it now ;)
OK, testing now.

I scaled the StatefulSet like this:
kubectl scale statefulset mfx-adapter-mqtt --replicas=5 --namespace mfx

dergraf commented on May 18, 2024

Yeah, this is because of the enabled consistency features. From VerneMQ's perspective it looks like either the node is down or a network problem separates your two nodes. By default VerneMQ trades availability for consistency, which means that VerneMQ will stop serving requests (CONNECT, PUBLISH, SUBSCRIBE) and therefore favors consistency, e.g. uniqueness of client-ids among all cluster nodes.

But you can CHANGE this behaviour by trading consistency for availability!

What you are giving up if you trade consistency for availability:

Setting trade_consistency = on in vernemq.conf allows your currently registered clients to keep publishing/consuming messages (however, they can't subscribe to new topics), but new clients CAN'T register, because VerneMQ 'tries' to ensure uniqueness of the client-ids in the cluster.

Setting allow_multiple_sessions = on in vernemq.conf allows multiple clients to connect to the cluster using the same client-id; they then share the same subscriptions.
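As a minimal sketch, the corresponding vernemq.conf lines would look roughly like this (illustrative only; note that a later comment in this thread marks these two settings as obsolete in newer releases):

# sketch: trade consistency for availability, see the clustering docs linked below
trade_consistency = on
allow_multiple_sessions = on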

Depending on your use case, combined with proper authentication/authorization, trading consistency might be a good thing. It is definitely useful for a high-availability cluster setup.

Please check https://vernemq.com/docs/clustering.html to make sure you understand the consequences.

bayuemu commented on May 18, 2024

Thanks, I'll try it.

bayuemu commented on May 18, 2024

Runs OK! Thanks.

vedavidhbudimuri commented on May 18, 2024

Hey @dergraf

I'm also facing the same issue while hosting two VerneMQ nodes as two Docker containers on an EC2 instance.
Each node is responding to the other's ping request, so I guess they are connected.
I have attached the logs in the screenshot below.

[Screenshot: VerneMQ logs, 2017-08-21, 3:25 PM]

larshesel commented on May 18, 2024

@vedavidhbudimuri this happens when your cluster is partitioned. Check the cluster state using vmq-admin cluster show.
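For example, assuming the containers run the standard docker-vernemq image, something like this from the EC2 host (the container name is a placeholder):

docker exec -it <your-vernemq-container> vmq-admin cluster show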

vedavidhbudimuri commented on May 18, 2024

@larshesel vmq-admin cluster show

shows two of the nodes as true and the rest of the nodes as false. Is the issue because of that?
Initially, I tried with 6 nodes and later scaled down to two nodes.

larshesel commented on May 18, 2024

Yes, that's the problem. You have to explicitly tell VerneMQ that the nodes are gone, otherwise VerneMQ has no idea that this isn't a netsplit situation. See http://vernemq.com/docs/clustering/ and http://vernemq.com/docs/clustering/netsplits.html.
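Concretely, removing a node that is gone for good looks roughly like this, run from one of the nodes that is still up (the node name is whatever vmq-admin cluster show reports; see the netsplit docs above for the details and flags):

vmq-admin cluster leave node=VerneMQ@<dead-node-name>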

vedavidhbudimuri commented on May 18, 2024

Oops, I missed this. Thank you @larshesel

nmarcetic commented on May 18, 2024

@larshesel @dergraf Does this mean that I can't get k8s pod auto-scaling without consequences? Or even manual scale-down without manually removing nodes from the cluster?

I just faced the exact same problem: I scaled down from 5 to 3 instances and get
can't register client due to reason not_ready
I see 3 nodes with Running true and 2 nodes with Running false in vmq-admin cluster show.
Exactly the same case as described.

Manually scaling and adding/removing nodes from a cluster is not realistic in production or under high load in most use cases (especially when throughput is not constant). Just curious, do you guys have any advice here?

Is there a way to set allow_multiple_sessions and trade_consistency via env vars in the Docker image?

Thanks!

ioolkos commented on May 18, 2024

@nmarcetic thanks...

  1. If you want to be able to allow registrations during a netsplit, you have to allow for it: https://docs.vernemq.com/clustering/netsplits. Set allow_register_during_netsplit = on in the VerneMQ config, and also look at the other allow values described there (see the sketch after this list).
    The config values you mention (allow_multiple_sessions and trade_consistency) are obsolete; don't use them.

  2. The idea is not that you manually make the nodes leave. The pod termination should automatically make the node leave, I think. I don't know what you did and why it didn't work. Maybe related: vernemq/docker-vernemq#193

  3. A VerneMQ cluster is stateful, and membership is part of that state. Verne doesn't just forget a node because it was killed. That's why there's the leave command to tell the cluster. See point 2.
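For point 1, a rough sketch of that setting, either directly in vernemq.conf or (as far as I know) via the docker-vernemq environment-variable mapping, which prefixes the upper-cased setting name with DOCKER_VERNEMQ_:

# vernemq.conf
allow_register_during_netsplit = on

# or, equivalently, as an environment variable for the Docker image
DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT=on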

nmarcetic commented on May 18, 2024

@ioolkos Thanks for the prompt response.

I see now. I will try right now with DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT=on.
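In the StatefulSet spec that would look roughly like this (a sketch, assuming the standard container env syntax):

env:
  - name: DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT
    value: "on"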

nmarcetic commented on May 18, 2024

@ioolkos Looks the same 😞

Running with replicas=5

+-----------------------------------------------------------------+-------+
|                              Node                               |Running|
+-----------------------------------------------------------------+-------+
|VerneMQ@mfx-adapter-mqtt-0.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-1.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-2.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
+-----------------------------------------------------------------+-------+

Scaling down to replicas=3

+-----------------------------------------------------------------+-------+
|                              Node                               |Running|
+-----------------------------------------------------------------+-------+
|VerneMQ@mfx-adapter-mqtt-0.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-1.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-2.mfx-adapter-mqtt.mfx.svc.cluster.local| true  |
|VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local| false |
|VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local| false |
+-----------------------------------------------------------------+-------+

All pods have DOCKER_VERNEMQ_ALLOW_REGISTER_DURING_NETSPLIT: on set as an env variable, as described.
It only works when I scale down to 0 and then scale up to N; then I see a "clean" cluster with all nodes running true.
What am I missing?

ioolkos commented on May 18, 2024

Looks like you're mixing up two things: you now have a netsplit cluster, but unlike before, you can still register clients! This is what you configured with ALLOW_REGISTER. Normally, you could restart those 2 missing nodes, and then the netsplit would heal.

Your other completely unrelated issue is that you want to make those 2 nodes leave in Kubernetes. I do not know why this doesn't work as expected in your case. How do you actually scale down?
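For reference, one way to clean up after a manual scale-down, assuming no automated leave hook kicks in (node names taken from your cluster show output above; run the leave commands on one of the remaining pods, e.g. via kubectl exec):

# reduce the replica count first
kubectl scale statefulset mfx-adapter-mqtt --replicas=3 --namespace mfx
# then tell the cluster that the two terminated nodes are gone for good
vmq-admin cluster leave node=VerneMQ@mfx-adapter-mqtt-3.mfx-adapter-mqtt.mfx.svc.cluster.local
vmq-admin cluster leave node=VerneMQ@mfx-adapter-mqtt-4.mfx-adapter-mqtt.mfx.svc.cluster.local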
