
Comments (7)

dusansusic commented on May 27, 2024

With this config:

network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: [172.31.3.111, 172.31.3.112]

ES starts on both machines, but then I get this in the logs:

[2016-07-21 21:36:01,035][INFO ][env ] [Mortimer Toynbee] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-07-21 21:36:01,035][WARN ][env ] [Mortimer Toynbee] max file descriptors [65535] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-07-21 21:36:02,473][INFO ][node ] [Mortimer Toynbee] initialized
[2016-07-21 21:36:02,474][INFO ][node ] [Mortimer Toynbee] starting ...
[2016-07-21 21:36:02,578][INFO ][transport ] [Mortimer Toynbee] publish_address {172.17.0.2:9300}, bound_addresses {[fe80::42:acff:fe11:2]:9300}, {172.17.0.2:9300}
...
...

[2016-07-21 21:39:53,997][INFO ][discovery.zen ] [Ms. MODOK] failed to send join request to master [{Mortimer Toynbee}{6bOsLq4KR0iU0lvIDazsUw}{172.17.0.2}{172.17.0.2:9300}{master=true}], reason [RemoteTransportException[[Ms. MODOK][172.17.0.2:9300][internal:discovery/zen/join]]; nested: IllegalStateException[Node [{Ms. MODOK}{2MvhIxFiSiO0EnpGEM25VA}{172.17.0.2}{172.17.0.2:9300}{master=true}] not master for join request]; ]

So it binds to the docker0 NIC. Presumably, if it's bound to the Docker NIC, it doesn't know how to route traffic to eth0, or what?


dusansusic commented on May 27, 2024

I tried the setup you provided and I'm getting this kind of error on the slave machine:

[2016-07-21 22:39:32,792][INFO ][discovery.zen ] [Torpedo] failed to send join request to master [{Merlin}{ZV3sxHacRYypcUCrvS8ZzQ}{172.17.0.2}{172.17.0.2:9300}], reason [RemoteTransportException[[Torpedo][172.17.0.2:9300][internal:discovery/zen/join]]; nested: IllegalStateException[Node [{Torpedo}{vWsTy7nZRSC20HqEKBgKrg}{172.17.0.2}{172.17.0.2:9300}] not master for join request]; ]


dusansusic commented on May 27, 2024

Is there any possibility that I have to set net.ipv4.ip_forward = 1 at the system level?

EDIT: never mind, it's already turned on.


spujadas commented on May 27, 2024

You might be running into the 'usual' bind/publish issues (which aren't always easy to fix, mind you 😕).

If you're using IP addresses I believe you should:

  • Set network.bind_host to 0.0.0.0, to make sure that Elasticsearch is reachable on all of the container's addresses/NICs (including the container's local Docker NIC).

  • Set network.publish_host to your EC2 internal IP address (see the sketch after this list).

    This setting needs to be set explicitly as otherwise Elasticsearch can't know how it can be reached from outside the container.

    This assumes that all your internal IP addresses are reachable from the other nodes in your cluster: you can docker exec into each containerised node and make sure the other nodes are ping-able.
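
Putting those two settings together, a minimal elasticsearch.yml sketch for your first node (reusing the 172.31.3.111/172.31.3.112 addresses from your first comment, and not tested on your set-up) would be:

network.bind_host: 0.0.0.0
network.publish_host: 172.31.3.111
discovery.zen.ping.unicast.hosts: [172.31.3.111, 172.31.3.112]

with network.publish_host set to 172.31.3.112 on the second node.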

A few additional notes:

  • The Elastic blog has a post on Docker networking with Elasticsearch that may be helpful.
  • I haven't tested this, but as you're running on AWS EC2 you may want to have a look at the EC2 Discovery plugin (although to be honest I don't know how nicely it plays with a containerised Elasticsearch).
  • Indeed, as you mentioned, you want net.ipv4.ip_forward to be set to 1 (pretty sure that things would get messed up if set to 0); a quick check is shown below.
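
For reference, checking and enabling IP forwarding on the host is just standard sysctl, nothing specific to this image:

sysctl net.ipv4.ip_forward          # should print: net.ipv4.ip_forward = 1
sysctl -w net.ipv4.ip_forward=1     # enable it if it isn't (run as root)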

Let me know how it goes.


dusansusic commented on May 27, 2024

Hello, something is weird with your container; I don't know what, or why it doesn't discover the other nodes in the VPC. I pulled the official ES image, mounted the config dir, and elasticsearch.yml had this config:

Node1:

cluster.name: tech
node.master: true
node.data: true
network.host: 0.0.0.0
network.publish_host: 172.31.3.111
discovery.zen.ping.unicast.hosts: [172.31.3.111, 172.31.3.112]

Node2:

cluster.name: tech
node.master: true
node.data: true
network.host: 0.0.0.0
network.publish_host: 172.31.3.112
discovery.zen.ping.unicast.hosts: [172.31.3.111, 172.31.3.112]

and node2 successfully joined the cluster immediately. If I can help somehow, shoot; you've done a great job with this container.


spujadas commented on May 27, 2024

Thanks for the update, and glad to hear that you managed to get things working (albeit not with the sebp/elk image).

I did some tests. As I haven't got AWS handy, I used two DigitalOcean nodes instead, with private networking enabled, so that we have equivalent set-ups, and it's working perfectly (with both the elasticsearch:2 and the sebp/elk images)… provided that port 9300 is published (which I forgot to mention in the documentation, sorry about that).

Here's what I did.

On host 1 (private IP address 10.135.13.134), my config file /root/elk-docker/elasticsearch.yml.cluster58 reads:

cluster.name: my-application
node.name: node-1
network.host: 0.0.0.0
network.publish_host: 10.135.13.134
discovery.zen.ping.unicast.hosts: [10.135.5.29, 10.135.13.134]
discovery.zen.minimum_master_nodes: 2

On host 2 (private IP address 10.135.5.29), my config file /root/elk-docker/elasticsearch.yml.cluster58 reads:

cluster.name: my-application
node.name: node-2
network.host: 0.0.0.0
network.publish_host: 10.135.5.29
discovery.zen.ping.unicast.hosts: [10.135.5.29, 10.135.13.134]
discovery.zen.minimum_master_nodes: 2

I started the ELK container on each of the two hosts with:

# docker run -p 5601:5601 -p 9200:9200 -p 9300:9300 -p 5044:5044 -p 5000:5000 -v /root/elk-docker/elasticsearch.yml.cluster58:/etc/elasticsearch/elasticsearch.yml -it --name elk-cluster58 sebp/elk

Container logs on host 1 show:

 * Starting Elasticsearch Server                                                                      sysctl: setting key "vm.max_map_count": Read-only file system
                                                                                               [ OK ]
waiting for Elasticsearch to be up (1/30)
waiting for Elasticsearch to be up (2/30)
waiting for Elasticsearch to be up (3/30)
waiting for Elasticsearch to be up (4/30)
waiting for Elasticsearch to be up (5/30)
waiting for Elasticsearch to be up (6/30)
waiting for Elasticsearch to be up (7/30)
waiting for Elasticsearch to be up (8/30)
logstash started.
 * Starting Kibana4                                                                            [ OK ]
==> /var/log/elasticsearch/my-application.log <==
[2016-07-24 11:45:25,265][INFO ][env                      ] [node-1] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-07-24 11:45:25,268][WARN ][env                      ] [node-1] max file descriptors [65535] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-07-24 11:45:28,212][INFO ][node                     ] [node-1] initialized
[2016-07-24 11:45:28,212][INFO ][node                     ] [node-1] starting ...
[2016-07-24 11:45:28,375][INFO ][transport                ] [node-1] publish_address {10.135.13.134:9300}, bound_addresses {[::]:9300}
[2016-07-24 11:45:28,395][INFO ][discovery                ] [node-1] my-application/LnPwxhdIT7iH-0VTaEjQ5A
[2016-07-24 11:45:31,469][INFO ][cluster.service          ] [node-1] new_master {node-1}{LnPwxhdIT7iH-0VTaEjQ5A}{10.135.13.134}{10.135.13.134:9300}, added {{node-2}{zjwDkQuyRKmvpLhvNieEWA}{10.135.5.29}{10.135.5.29:9300},}, reason: zen-disco-join(elected_as_master, [1] joins received)
[2016-07-24 11:45:31,618][INFO ][http                     ] [node-1] publish_address {10.135.13.134:9200}, bound_addresses {[::]:9200}
[2016-07-24 11:45:31,619][INFO ][node                     ] [node-1] started
[2016-07-24 11:45:31,687][INFO ][gateway                  ] [node-1] recovered [0] indices into cluster_state
...
[2016-07-24 11:45:39,852][INFO ][cluster.metadata         ] [node-1] [.kibana] creating index, cause [api], templates [], shards [1]/[1], mappings [config]
[2016-07-24 11:45:40,426][INFO ][cluster.routing.allocation] [node-1] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.kibana][0]] ...]).
...
[2016-07-24 11:45:41,143][INFO ][cluster.routing.allocation] [node-1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.kibana][0]] ...]).
...
[2016-07-24 11:45:59,228][INFO ][cluster.metadata         ] [node-1] [.kibana] create_mapping [index-pattern]

Container logs on host 2 show:

 * Starting Elasticsearch Server                                                                       sysctl: setting key "vm.max_map_count": Read-only file system
                                                                                                [ OK ]
waiting for Elasticsearch to be up (1/30)
waiting for Elasticsearch to be up (2/30)
waiting for Elasticsearch to be up (3/30)
waiting for Elasticsearch to be up (4/30)
waiting for Elasticsearch to be up (5/30)
waiting for Elasticsearch to be up (6/30)
waiting for Elasticsearch to be up (7/30)
logstash started.
 * Starting Kibana4                                                                             [ OK ]
==> /var/log/elasticsearch/my-application.log <==
[2016-07-24 11:45:24,833][INFO ][env                      ] [node-2] using [1] data paths, mounts [[/var/lib/elasticsearch (/dev/disk/by-label/DOROOT)]], net usable_space [32gb], net total_space [39.2gb], spins? [possibly], types [ext4]
[2016-07-24 11:45:24,833][INFO ][env                      ] [node-2] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-07-24 11:45:24,833][WARN ][env                      ] [node-2] max file descriptors [65535] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-07-24 11:45:26,724][INFO ][node                     ] [node-2] initialized
[2016-07-24 11:45:26,724][INFO ][node                     ] [node-2] starting ...
[2016-07-24 11:45:26,813][INFO ][transport                ] [node-2] publish_address {10.135.5.29:9300}, bound_addresses {[::]:9300}
[2016-07-24 11:45:26,817][INFO ][discovery                ] [node-2] my-application/zjwDkQuyRKmvpLhvNieEWA
[2016-07-24 11:45:31,497][INFO ][cluster.service          ] [node-2] detected_master {node-1}{LnPwxhdIT7iH-0VTaEjQ5A}{10.135.13.134}{10.135.13.134:9300}, added {{node-1}{LnPwxhdIT7iH-0VTaEjQ5A}{10.135.13.134}{10.135.13.134:9300},}, reason: zen-disco-receive(from master [{node-1}{LnPwxhdIT7iH-0VTaEjQ5A}{10.135.13.134}{10.135.13.134:9300}])
[2016-07-24 11:45:31,517][INFO ][http                     ] [node-2] publish_address {10.135.5.29:9200}, bound_addresses {[::]:9200}
[2016-07-24 11:45:31,517][INFO ][node                     ] [node-2] started

Browsing to (or curling) http://46.101.172.205:9200/_cluster/health?pretty=true and http://46.101.156.50:9200/_cluster/health?pretty=true (where 46.101.172.205 and 46.101.156.50 are the public IP addresses of the hosts hosting the ES nodes) shows:

{
  "cluster_name" : "my-application",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 1,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

And I observe the same behaviour (without logs from Logstash and Kibana, of course) when using the official elasticsearch image and the config files provided above, and starting the container on each host with:

# docker run -p 9200:9200 -p 9300:9300 -v /root/elk-docker/elasticsearch.yml.cluster58:/usr/share/elasticsearch/config/elasticsearch.yml -it --name elastic-cluster58 elasticsearch:2

So (again provided that port 9300 is also published in addition to all the other ports) I can't see why the cluster won't work with the sebp/elk image in your set-up.

(Before I realised that I had forgotten about port 9300, both containers were running smoothly… but not as a cluster: the nodes were up and running, there were no actual error messages in the logs, but there was a warning about discovery timing out.)
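
Incidentally, a quick way to make sure the transport port is actually reachable from the other host (assuming netcat is available there; substitute the relevant private IP) is:

nc -zv 10.135.13.134 9300

which should report the connection as succeeded if port 9300 is published and not blocked.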

Hope that helps.


dusansusic commented on May 27, 2024

Hm, I saw that you had missed that, but in my case it was still without success. I just looked through my history from several days ago and found this:

  105  sudo docker run -d -p 5601:5601 -p 9200:9200 -p 9300:9300 -p 5044:5044 -p 5000:5000 -it --name elk1-tech xxxxxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/elk-technical
  106  curl http://172.31.3.111:9200/_cluster/health?pretty

I will try it out later today once more. Please update your docs and thanks! :)

