kyleaa / libcluster_ec2
License: MIT License
Hi @kyleaa , we're seeing this error in our production environment. Any idea what might be going on?
Thanks!
While it is technically obvious that the ex_aws dependency needs to be configured first, I completely missed it on my first attempt at using this strategy, since I had not used ex_aws in the application itself before.
Because of the way ex_aws behaves, it does not fall back to the instance role when nothing is configured at all, which causes this strategy to fail silently.
I would therefore suggest adding a hint about the access key configuration of the ex_aws dependency to this repository's README.
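A minimal sketch of the kind of hint meant here, mirroring the configuration form that appears in another report on this page. It assumes credentials may come from environment variables and should otherwise be taken from the EC2 instance role; the AWS_REGION variable name is likewise an assumption, not something this repository prescribes.

```elixir
# config/config.exs (sketch only; adjust to your own setup)
import Config

config :ex_aws,
  # Try the environment variable first, then fall back to the EC2 instance role.
  access_key_id: [{:system, "AWS_ACCESS_KEY_ID"}, :instance_role],
  secret_access_key: [{:system, "AWS_SECRET_ACCESS_KEY"}, :instance_role],
  # Assumption: the region is provided through AWS_REGION.
  region: {:system, "AWS_REGION"}
```

With the fallback listed explicitly, a missing environment variable should lead to an instance role lookup rather than to no credentials at all.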
Most of the time it works very well; thank you for maintaining it.
On rare occasions it fails to fetch the Instance Metadata, and as a result the VM shuts down.
I have seen two failure patterns.
The first pattern is:
22:58:38.274 [error] GenServer #PID<0.2279.0> terminating
** (RuntimeError) Instance Meta Error: {:error, %{reason: :econnrefused}}
You tried to access the AWS EC2 instance meta, but it could not be reached.
This happens most often when trying to access it from your local computer,
which happens when environment variables are not set correctly prompting
ExAws to fallback to the Instance Meta.
Please check your key config and make sure they're configured correctly:
For Example:
ExAws.Config.new(:s3)
ExAws.Config.new(:dynamodb)
(ex_aws 2.5.1) lib/ex_aws/instance_meta.ex:27: ExAws.InstanceMeta.request/3
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:124: ClusterEC2.Strategy.Tags.get_nodes/1
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:85: ClusterEC2.Strategy.Tags.load/1
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:77: ClusterEC2.Strategy.Tags.handle_info/2
(stdlib 4.3) gen_server.erl:1123: :gen_server.try_dispatch/4
(stdlib 4.3) gen_server.erl:1200: :gen_server.handle_msg/6
(stdlib 4.3) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
Last message: :load
The second pattern is:
05:44:04.090 [error] GenServer #PID<0.2280.0> terminating
** (RuntimeError) Instance Meta Error: HTTP response status code 503
Please check AWS EC2 IAM role.
(ex_aws 2.5.3) lib/ex_aws/instance_meta.ex:51: ExAws.InstanceMeta.retry_or_raise/4
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:124: ClusterEC2.Strategy.Tags.get_nodes/1
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:85: ClusterEC2.Strategy.Tags.load/1
(libcluster_ec2 0.8.1) lib/strategy/tags.ex:77: ClusterEC2.Strategy.Tags.handle_info/2
(stdlib 4.3) gen_server.erl:1123: :gen_server.try_dispatch/4
(stdlib 4.3) gen_server.erl:1200: :gen_server.handle_msg/6
(stdlib 4.3) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
Last message: :load
Both of them failed to get the instance_id from the Instance Metadata Service:
libcluster_ec2/lib/strategy/tags.ex
Line 124 in c50d97b
My configuration is probably correct, because it usually works normally.
config :ex_aws,
  access_key_id: [{:system, "AWS_ACCESS_KEY_ID"}, :instance_role],
  secret_access_key: [{:system, "AWS_SECRET_ACCESS_KEY"}, :instance_role],
  debug_requests: true,
  region: {:system, "AWS_REGION"}

config :libcluster,
  topologies: [
    example: [
      strategy: ClusterEC2.Strategy.Tags,
      config: [
        ec2_tagname: "exaws/cluster_name",
        ec2_tagvalue: System.fetch_env!("CLUSTER_NAME"),
        app_prefix: "my_app",
        ip_type: :private,
        show_debug: false
      ]
    ]
  ]
ex_aws retries requests to normal AWS APIs, but it doesn't retry requests to IMDS. I think temporary failures should stay at the warning level and should not raise a RuntimeError. What do you think?
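To make the suggestion concrete, here is a hedged sketch of a retry wrapper that logs a warning and tries again instead of crashing on the first failure. The module name RetrySketch, the function with_retry/3, and the {:ok, value} / {:error, reason} convention are all hypothetical; this is not how libcluster_ec2 or ex_aws is implemented.

```elixir
defmodule RetrySketch do
  require Logger

  # Calls `fun` up to `attempts` times, sleeping `delay_ms` between tries.
  # `fun` is assumed to return {:ok, value} or {:error, reason}; any other
  # return value is not handled by this sketch.
  def with_retry(fun, attempts, delay_ms) when attempts > 0 do
    case safe_call(fun) do
      {:ok, value} ->
        {:ok, value}

      {:error, reason} when attempts > 1 ->
        # Temporary failures are only logged as warnings.
        Logger.warning("metadata lookup failed: #{inspect(reason)}, retrying")
        Process.sleep(delay_ms)
        with_retry(fun, attempts - 1, delay_ms)

      {:error, reason} ->
        # Out of attempts: hand the error back to the caller.
        {:error, reason}
    end
  end

  # Turns a raised exception (such as the RuntimeError in the traces above)
  # into an error tuple so that it can be retried.
  defp safe_call(fun) do
    fun.()
  rescue
    exception -> {:error, exception}
  end
end
```

A caller could wrap the metadata lookup in an anonymous function and, on a final {:error, reason}, skip the current polling cycle rather than let the GenServer terminate.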
Hey guys. I was playing with it and I couldn't make it work. I think it is because my EC2 instance (t2.micro) has too little memory. My application is killed during compilation, so I can't even test it.
iex --sname n1 --cookie aws -S mix
==> poison
Compiling 4 files (.ex)
Killed
Do you know whether it's possible to connect two or more nodes on EC2 using the Epmd or Gossip strategy? I tried many times with those two strategies without any success either. Do you know of an article that shows how to do it with Epmd or Gossip for EC2 instances? I couldn't find anything specific to that.
I also noticed someone said there's an ElixirConf video that shows how to use libcluster_ec2.
Do you have the link to that video?
Thank you.
I am running into conflicts with ex_aws trying to use the latest ex_aws_dynamo.
Failed to use "ex_aws" (versions 2.0.0 to 2.0.2) because
apps/sms_routing/mix.exs requires ~> 2.0
libcluster_ec2 (version 0.2.1) requires ~> 1.1
** (Mix) Hex dependency resolution failed, change the version requirements of your dependencies or unlock them (by using mix deps.update or mix deps.unlock). If you are unable to resolve the conflicts you can try overriding with {:dependency, "~> 1.0", override: true}
{:ex_aws, "~> 2.0" },
{:ex_aws_dynamo, "~> 2.0" },
Like the title says
Specifically, this line: https://github.com/kyleaa/libcluster_ec2/blob/0.7.0/lib/libcluster_ec2.ex#L25
The code String.slice(body, 0..-2) produces the warning:
warning: negative steps are not supported in String.slice/2, pass 0..-2//1 instead
The fix is to modify the code according to the warning message.
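As a small sketch of that change, assuming an Elixir version that supports stepped ranges (1.12 or later): the module and function names below are hypothetical, and the input string is just an illustrative value.

```elixir
defmodule SliceSketch do
  # Drops the final character of `body`.
  # The explicit `//1` step is what the compiler warning asks for.
  def drop_last(body) when is_binary(body) do
    String.slice(body, 0..-2//1)
  end
end

SliceSketch.drop_last("us-east-1a")
# => "us-east-1"
```

The result is the same as with the old 0..-2 range; only the step is spelled out.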
Was just scanning the code and noticed a hard-coded IP address here. Should this not be a domain?
libcluster_ec2/lib/libcluster_ec2.ex
Line 22 in c50d97b
I can't get my environment on AWS Elastic Beanstalk to form a cluster. I am hoping somebody can help me here.
I have set up my project using a Dockerfile and deployed it to Elastic Beanstalk. It deploys fine, but it's not able to connect to other nodes. I am receiving the following messages:
20:47:23.103 [warn] [libcluster:example] unable to connect to :"[email protected]"
20:47:23.104 [warn] [libcluster:example] unable to connect to :"[email protected]"
There are two ec2 instances at the moment.
I added the following rules to the security groups inbound rules:
Custom TCP | TCP | 4369 | 0.0.0.0/0
Custom TCP | TCP | 9100 - 9155 | 0.0.0.0/0
Here is some other information I configured:
env.sh.eex
export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=<%= @release.name %>@$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
Dockerfile
FROM elixir:1.10.3-slim as builder
RUN mkdir /app
WORKDIR /app
ADD mix.* ./
RUN MIX_ENV=prod mix local.rebar
RUN MIX_ENV=prod mix local.hex --force
RUN MIX_ENV=prod mix deps.get --only prod
ADD . .
RUN MIX_ENV=prod mix release --overwrite
FROM elixir:1.10.3-slim
RUN mkdir /app
WORKDIR /app
RUN apt-get update && apt-get install -y curl
COPY --from=builder /app/_build/prod/rel/my-api .
EXPOSE 4000
CMD bin/my-api start
Dockerrun.aws.json
{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "xxxxxx",
    "Update": "true"
  },
  "Ports": [
    {
      "ContainerPort": "4000"
    }
  ]
}
config/prod.exs
config :libcluster,
  topologies: [
    example: [
      strategy: ClusterEC2.Strategy.Tags,
      config: [
        app_prefix: "my-api",
        ec2_tagname: "elasticbeanstalk:environment-name"
      ],
    ]
  ]
I also hardcoded the RELEASE_COOKIE.
Related to bitwalker/libcluster#74
libcluster is currently at 3.0.2, though it is not released on hex.pm yet. There should be a new release soon for compatibility with Erlang OTP 21.0.
Also, tesla 1.0.0 is released.
https://github.com/teamon/tesla/wiki/0.x-to-1.0-Migration-Guide
Can we have the mix.exs updated to reflect that?
E.g.
{:tesla, "~> 1.0.0"},
{:libcluster, "~> 2.0 or ~> 3.0"},
Thank you!
Can we add a license file please?
I'm trying to set up a small cluster based on EC2 tags. I've set up the libcluster config as follows:
topologies = [
  my_app: [
    strategy: ClusterEC2.Strategy.Tags,
    config: [
      ec2_tagname: "ex_cluster_id",
      ec2_tagvalue: ec2_tagvalue,
      app_prefix: "my_app",
      ip_type: :private
    ]
  ]
]

children = [
  {Cluster.Supervisor, [topologies, [name: MyAppWeb.ClusterSupervisor]]},
  MyAppWeb.Endpoint
]

opts = [strategy: :one_for_one, name: MyAppWeb.Supervisor]
Supervisor.start_link(children, opts)
Unfortunately, after building the release and running it on each node, I'm getting the following error:
19:00:20.139 [info] Application my_app_web exited: exited in: MyAppWeb.Application.start(:normal, [])
** (EXIT) an exception was raised:
** (ArgumentError) The module Cluster.Supervisor was given as a child to a supervisor but it does not exist.
(elixir) lib/supervisor.ex:629: Supervisor.init_child/1
(elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
(elixir) lib/supervisor.ex:615: Supervisor.init/2
(elixir) lib/supervisor.ex:564: Supervisor.start_link/2
(kernel) application_master.erl:277: :application_master.start_it_old/4
{"Kernel pid terminated",application_controller,"{application_start_failure,my_app_web,{bad_return,{{'Elixir.MyAppWeb.Application',start,[normal,[]]},{'EXIT',{#{'__exception__' => true,'__struct__' => 'Elixir.ArgumentError',message => <<\"The module Cluster.Supervisor was given as a child to a supervisor but it does not exist.\">>},[{'Elixir.Supervisor',init_child,1,[{file,\"lib/supervisor.ex\"},{line,629}]},{'Elixir.Enum','-map/2-lists^map/1-0-',2,[{file,\"lib/enum.ex\"},{line,1327}]},{'Elixir.Supervisor',init,2,[{file,\"lib/supervisor.ex\"},{line,615}]},{'Elixir.Supervisor',start_link,2,[{file,\"lib/supervisor.ex\"},{line,564}]},{application_master,start_it_old,4,[{file,\"application_master.erl\"},{line,277}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,my_app_web,{bad_return,{{'Elixir.MyAppWeb.Application',start,[normal,[]]},{'EXIT',{#{'__exception__' => true,'__struct__' =>
I'm assuming this has something to do with how I've set up the EC2 tags strategy, because when I use the gossip strategy locally it seems to work fine. Any insight you might have into the issue or a possible misconfiguration is appreciated!
As all the nodes boot up in the cluster I am seeing the following show up in my logs:
01:58:26.866 [error] Supervisor 'Elixir.MyApplication.ClusterSupervisor' had child 'Elixir.ClusterEC2.Strategy.Tags' started with 'Elixir.ClusterEC2.Strategy.Tags':start_link([#{'__struct__' => 'Elixir.Cluster.Strategy.State',config => [{ec2_tagname,<<"MyApplication">>},...],...}]) at <0.2182.0> exit with reason {fatal,{unexpected_end,{file,file_name_unknown},{line,2136},{col,48}}} in context child_terminated
My nodes compile Erlang from source once the EC2 instance is online, so there is a good 5 minutes or so during which the machine is up but there's nothing to connect to.
Shouldn't libcluster_ec2 just ignore errors like this?
Any idea what could be wrong here? I'm just upgrading the Erlang/Elixir versions of an app that works with the previous versions:
- ERLANG_VERSION="1:22.3.4-1"
- ELIXIR_VERSION="1.10.4-1"
+ ERLANG_VERSION="1:23.2.3-1"
+ ELIXIR_VERSION="1.11.2-1"
14:36:25.581 [info] Application xxx exited: XXX.Application.start(:normal, []) returned an error: shutdown: failed to start child: Cluster.Supervisor
** (EXIT) shutdown: failed to start child: :xxx
** (EXIT) an exception was raised:
** (ArgumentError) expected a keyword list, but an entry in the list is not a two-element tuple with an atom as its first element, got: {"content-type", "application/x-www-form-urlencoded"}
(elixir 1.11.2) lib/keyword.ex:475: Keyword.keys/1
(ex_aws 2.0.2) lib/ex_aws/auth.ex:110: ExAws.Auth.build_canonical_request/5
(ex_aws 2.0.2) lib/ex_aws/auth.ex:96: ExAws.Auth.signature/8
(ex_aws 2.0.2) lib/ex_aws/auth.ex:87: ExAws.Auth.auth_header/7
(ex_aws 2.0.2) lib/ex_aws/auth.ex:39: ExAws.Auth.headers/6
(ex_aws 2.0.2) lib/ex_aws/request.ex:27: ExAws.Request.request_and_retry/7
(ex_aws 2.0.2) lib/ex_aws/operation/query.ex:33: ExAws.Operation.ExAws.Operation.Query.perform/2
(libcluster_ec2 0.6.0) lib/strategy/tags.ex:141: ClusterEC2.Strategy.Tags.get_nodes/1
[os_mon] memory supervisor port (memsup): Erlang has closed
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed