canonical / eks-snap
Single-package EKS Distro
License: Apache License 2.0
Hi, team
Not sure if this is the right place to report this. I have a systemd service on an EKS node that drains the node on EC2 shutdown. Here is the systemd unit:
# cat /etc/systemd/system/aws-shutdown.service
[Unit]
Description=AWS Shutdown Service
After=multi-user.target
Before=shutdown.target reboot.target halt.target
Requires=network-online.target network.target
[Service]
KillMode=none
ExecStart=/bin/true
ExecStop=/usr/local/bin/drain-node
RemainAfterExit=yes
Type=oneshot
TimeoutStopSec=300
[Install]
WantedBy=multi-user.target
Content of /usr/local/bin/drain-node:
# cat /usr/local/bin/drain-node
#!/usr/bin/env bash
# Script to drain a node when node is shutting down
set -o errexit
set -o pipefail
/snap/bin/kubectl --kubeconfig /var/lib/kubelet/kubeconfig \
drain $(/usr/local/share/eks/imds /latest/meta-data/hostname) \
--ignore-daemonsets \
--delete-emptydir-data
systemctl enable aws-shutdown.service
systemctl start aws-shutdown.service
# systemctl status aws-shutdown.service
● aws-shutdown.service - AWS Shutdown Service
Loaded: loaded (/etc/systemd/system/aws-shutdown.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2024-01-04 07:22:16 UTC; 5min ago
Process: 38676 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 38676 (code=exited, status=0/SUCCESS)
Jan 04 07:22:16 ip-10-120-18-17 systemd[1]: Starting AWS Shutdown Service...
Jan 04 07:22:16 ip-10-120-18-17 systemd[1]: Finished AWS Shutdown Service.
I get this error when shutting down the EC2 instance:
internal error, please report: running "kubectl-eks.kubectl" failed: cannot create transient scope: DBus error "org.freedesktop.systemd1.TransactionIsDestructive": [Transaction for snap.kubectl-eks.kubectl-323cec7a-aa56-4e5f-8764-36638bbcb776.scope/start is destructive (systemd-poweroff.service has 'start' job queued, but 'stop' is included in transaction).]
Please help, thanks.
eks-distro has released a new version this month supporting Kubernetes 1.21: https://github.com/aws/eks-distro/releases
It would be great to see this snap get upgraded. It's incredibly convenient, however Kubernetes 1.18 is too old for our use case.
Thanks for the great work!
Related: #23
After setting up a cluster and applying user permissions as instructed, some commands still require sudo.
sudo usermod -a -G eks liam
sudo chown -f -R liam ~/.kube
# restart session
$ eks status
eks is running
Traceback (most recent call last):
File "/snap/eks/5/scripts/wrappers/status.py", line 205, in <module>
print_pretty(isReady, enabled, disabled)
File "/snap/eks/5/scripts/wrappers/status.py", line 44, in print_pretty
info = get_dqlite_info()
File "/snap/eks/5/scripts/wrappers/common/utils.py", line 72, in get_dqlite_info
with open("{}/info.yaml".format(cluster_dir), mode='r') as f:
PermissionError: [Errno 13] Permission denied: '/var/snap/eks/5/var/kubernetes/backend/info.yaml'
$ sudo eks status
eks is running
high-availability: yes
datastore master nodes: 192.168.1.201:19001 192.168.1.202:19001 192.168.1.203:19001
datastore standby nodes: none
Not sure if this is intended or a requirement, but might be worthwhile adding a note to the instructions if that's the case. Not a big deal though.
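One way to narrow this down, as a minimal sketch only: check whether the user really is in the eks group. The helper name in_group is hypothetical, and group membership is only one possible cause; the file in the traceback may simply not be group-readable.

```shell
#!/usr/bin/env bash
# Sketch: succeed if USER is a member of GROUP.
# in_group is a hypothetical helper name, not part of the eks snap.
in_group() {
  local user="$1" group="$2"
  # id -nG prints the user's group names separated by spaces;
  # grep -qx matches one whole line exactly.
  id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$group"
}
```

For example, `in_group liam eks && echo "liam is in eks"`. Note that `id -nG liam` reads the group database, while plain `id -nG` shows the groups of the current session, which only picks up a new group after logging in again.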
Hello,
I am trying to install the EKS-D snap on Ubuntu 20.04.2 LTS, and I am getting this error message:
error: cannot perform the following tasks:
I first installed snaps for both Docker and MicroK8s in the OS, and then ran this command:
sudo snap install eks --classic --edge
I tried this on an Ubuntu 20.04.2 LTS instance in Google Cloud Platform, as well as in a VM on Oracle VirtualBox, and got the same result.
Here is the output of 'microk8s inspect'.
inspection-report-20210213_175213.tar.gz
Is there anything else I need to do before installing the EKS-D snap?
Thanks,
Tony Iams
Please run microk8s inspect and attach the generated tarball to this issue.
inspection-report-worker.tar.gz
inspection-report-control.tar.gz
I used the instructions here https://snapcraft.io/eks
I additionally added the user permissions as in this link. The first two nodes joined without problems, but the third node does not join, as seen by running sudo eks kubectl get nodes (attached terminal screenshot). There is no error. I tried multiple times, including after restarting the node, but the result is the same.
Dear team,
While installing eks
sudo snap install eks --classic --edge
with snap version:
snap 2.48+20.04
snapd 2.48+20.04
series 16
ubuntu 20.04
kernel 5.4.0-56-generic
I get:
error: cannot perform the following tasks:
- Run configure hook of "eks" snap if present (run hook "configure":
-----
++ OPENSSL_CONF=/snap/eks/current/etc/ssl/openssl.cnf
++ for key in serviceaccount.key ca.key server.key front-proxy-ca.key front-proxy-client.key
++ '[' -f /var/snap/eks/2/certs/serviceaccount.key ']'
++ for key in serviceaccount.key ca.key server.key front-proxy-ca.key front-proxy-client.key
++ '[' -f /var/snap/eks/2/certs/ca.key ']'
++ for key in serviceaccount.key ca.key server.key front-proxy-ca.key front-proxy-client.key
++ '[' -f /var/snap/eks/2/certs/server.key ']'
++ for key in serviceaccount.key ca.key server.key front-proxy-ca.key front-proxy-client.key
++ '[' -f /var/snap/eks/2/certs/front-proxy-ca.key ']'
++ for key in serviceaccount.key ca.key server.key front-proxy-ca.key front-proxy-client.key
++ '[' -f /var/snap/eks/2/certs/front-proxy-client.key ']'
++ '[' -f /var/snap/eks/2/certs/ca.crt ']'
++ '[' -f /var/snap/eks/2/certs/front-proxy-ca.crt ']'
++ render_csr_conf
+++ get_ips
++++ /snap/eks/2/bin/hostname -I
+++ local 'IP_ADDR=192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b '
+++ [[ -z 192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b ]]
+++ /snap/eks/2/sbin/ifconfig cni0
+++ echo '192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b '
++ local 'IP_ADDRESSES=192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b '
++ cp /var/snap/eks/2/certs/csr.conf.template /var/snap/eks/2/certs/csr.conf.rendered
++ '[' '192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b ' == 127.0.0.1 ']'
++ '[' '192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b ' == none ']'
++ local ips= sep=
++ local -i i=3
+++ echo '192.168.64.2 fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b '
++ for IP_ADDR in $(echo "$IP_ADDRESSES")
++ ips+='IP.3 = 192.168.64.2'
++ sep='\n'
++ for IP_ADDR in $(echo "$IP_ADDRESSES")
++ ips+='\nIP.4 = fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b'
++ sep='\n'
++ /snap/eks/2/bin/sed -i 's/#MOREIPS/IP.3 = 192.168.64.2\nIP.4 = fdde:54b0:b6d0:805b:985a:41ff:fe43:f15b/g' /var/snap/eks/2/certs/csr.conf.rendered
++ '[' -f /var/snap/eks/2/certs/csr.conf ']'
++ local force
++ /snap/eks/2/usr/bin/cmp -s /var/snap/eks/2/certs/csr.conf.rendered /var/snap/eks/2/certs/csr.conf
++ force=false
++ false
++ '[' '!' -f /var/snap/eks/2/certs/front-proxy-client.crt ']'
+++ /snap/eks/2/usr/bin/openssl x509 -noout -issuer
++ '[' 'issuer= /CN=front-proxy-ca' == 'issuer=CN = 127.0.0.1' ']'
++ echo 0
+ '[' 0 == 1 ']'
+ '[' -e /var/snap/eks/2/args/containerd-template.toml ']'
+ grep -e 'stream_server_address = ""' /var/snap/eks/2/args/containerd-template.toml
+ grep -e '\-\-allow-privileged' /var/snap/eks/2/args/kubelet
+ '[' -f /root/snap/eks/common/istio-auth.lock ']'
+ '[' -f /root/snap/eks/common/istio-auth.lock ']'
+ need_api_restart=false
+ '[' -f /var/snap/eks/2/credentials/kubelet.config ']'
+ '[' -f /var/snap/eks/2/credentials/proxy.config ']'
+ '[' -f /var/snap/eks/2/credentials/scheduler.config ']'
+ '[' -f /var/snap/eks/2/credentials/controller.config ']'
+ for dir in "${SNAP_DATA}/credentials/ ${SNAP_DATA}/certs/ ${SNAP_DATA}/args/ ${SNAP_DATA}/var/lock"
+ chmod -R ug+rwX /var/snap/eks/2/credentials/ /var/snap/eks/2/certs/ /var/snap/eks/2/args/ /var/snap/eks/2/var/lock
+ chmod -R o-rwX /var/snap/eks/2/credentials/ /var/snap/eks/2/certs/ /var/snap/eks/2/args/ /var/snap/eks/2/var/lock
+ getent group eks
+ getent group eks
+ chgrp eks -R /var/snap/eks/2/credentials/ /var/snap/eks/2/certs/ /var/snap/eks/2/args/ /var/snap/eks/2/var/lock/ /var/snap/eks/2/var/kubernetes/backend/
+ false
+ '[' '!' -f /var/snap/eks/2/args/cluster-agent ']'
+ grep -e '\-\-timeout' /var/snap/eks/2/args/cluster-agent
--timeout 240
+ grep -e '\-\-cluster-cidr=10.152.183.0/24' /var/snap/eks/2/args/kube-proxy
+ '[' -e /var/snap/eks/2/args/cni-network/cni.yaml ']'
+ '[' -e /var/snap/eks/2/var/lock/ha-cluster ']'
+ echo 'Setting up the CNI'
Setting up the CNI
++ date +%s
+ start_timer=1607167542
+ timeout=120
+ KUBECTL='/snap/eks/2/kubectl --kubeconfig=/var/snap/eks/2/credentials/client.config'
+ sleep 5
++ date +%s
+ now=1607167554
+ [[ 1607167554 > 1607167662 ]]
+ sleep 5
++ date +%s
+ now=1607167560
+ [[ 1607167560 > 1607167662 ]]
+ sleep 5
++ date +%s
+ now=1607167583
+ [[ 1607167583 > 1607167662 ]]
++ date +%s
+ now=1607167590
+ [[ 1607167590 < 1607167662 ]]
+ /snap/eks/2/kubectl --kubeconfig=/var/snap/eks/2/credentials/client.config apply -f /var/snap/eks/2/args/cni-network/cni.yaml
configmap/calico-config unchanged
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
<exceeded maximum runtime of 5m0s>
-----)
What could be wrong? thanks!
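For reference, the end of the trace shows the hook polling with sleep 5 against a 120-second deadline before applying the CNI manifest, and the whole hook then hits the 5m0s limit. Below is a minimal sketch of that polling pattern, not the snap's actual code. The helper name wait_until is hypothetical, and the command being polled is not visible in the trace. The sketch compares epoch seconds with -ge, since inside [[ ]] the > and < operators compare strings rather than integers.

```shell
#!/usr/bin/env bash
# Sketch: run CMD repeatedly until it succeeds or TIMEOUT seconds pass.
# Usage: wait_until TIMEOUT INTERVAL CMD [ARGS...]
# wait_until is a hypothetical helper, not taken from the eks snap.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    # -ge is an integer comparison on epoch seconds.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}
```

For example, `wait_until 120 5 some-readiness-check` returns 0 as soon as the check passes and 1 once 120 seconds have elapsed, where some-readiness-check stands in for whatever condition is being waited on.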
This is not an issue in itself, but I'd like to know whether it would be possible to set up an EKS cluster with Kubernetes version 1.19 or 1.20.
Currently I have EKS-D installed on Ubuntu 20.04.
When installing this snap on a machine with a microk8s install, snap eks will install properly but won't work. The issue isn't trivial to spot. When doing eks inspect we can see that some services can't start as they conflict with microk8s services.
I'm not sure what the expected behavior is. Having both microk8s and eks on one machine may be an edge case that is not worth supporting. However, a good user experience would be to prevent the install or alert the user to the conflict.
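As a rough illustration of the kind of pre-install check suggested here, the sketch below scans `snap list` output for a given snap name. The helper name has_snap is hypothetical, and this is not something the eks snap is known to do today.

```shell
#!/usr/bin/env bash
# Sketch: read `snap list` output on stdin and succeed if NAME is installed.
# has_snap is a hypothetical helper name.
has_snap() {
  local name="$1"
  # Skip the header row; the snap name is the first column.
  awk -v n="$name" 'NR > 1 && $1 == n { found = 1 } END { exit (found ? 0 : 1) }'
}
```

It could be used as `snap list | has_snap microk8s && echo "microk8s is installed; its services may conflict with eks"`.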
On an EC2 t2.micro instance running Ubuntu 20.04, I can't get the cluster to initialize. Here's what I tried:
sudo snap install eks --classic --edge
eks.start
sudo snap eks status
# not running
Hi
I have a t4g instance in AWS (Graviton2 arm64) running ubuntu-focal-20.04-arm64-server
root@ip-172-31-27-168:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
And I've done:
snap install snapcraft --classic
git clone https://github.com/canonical/eks-snap.git
cd eks-snap
snapcraft --use-lxd
It runs for a while, but errors out with:
CC src/server.lo
CC src/stmt.lo
In file included from src/server.c:1:0:
src/server.h:2:21: fatal error: raft/uv.h: No such file or directory
compilation terminated.
Makefile:1163: recipe for target 'src/server.lo' failed
make: *** [src/server.lo] Error 1
make: *** Waiting for unfinished jobs....
Failed to run 'make -j2' for 'dqlite': Exited with code 2.
Verify that the part is using the correct parameters and try again.
Run the same command again with --debug to shell into the environment if you wish to introspect this failure.
Any suggestions?
snap install eks fails in China because it keeps failing to pull the pause image.
Is it possible to override the sandbox_image argument with a custom image to get around this?
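A possible workaround, sketched under assumptions that have not been verified against this snap: that its containerd template lives at /var/snap/eks/current/args/containerd-template.toml (a path of that form appears in another report's configure trace), that the template already contains a sandbox_image line, and that restarting the snap picks up the change. The helper name set_sandbox_image and the mirror registry below are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: point an existing sandbox_image line in a containerd TOML file at
# another image. set_sandbox_image is a hypothetical helper name.
# Uses GNU sed (-i, -E), as shipped on Ubuntu.
set_sandbox_image() {
  local file="$1" image="$2"
  # Keep the indentation and key, replace only the quoted value.
  sed -i -E "s|^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*|\1\"${image}\"|" "$file"
}
```

For example, `sudo bash -c 'source ./set-sandbox-image.sh; set_sandbox_image /var/snap/eks/current/args/containerd-template.toml registry.example.com/pause:3.1'` followed by `sudo snap restart eks`, where the script file name and registry.example.com are placeholders.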
# eks inspect
Inspecting Certificates
Inspecting services
Service snap.eks.daemon-cluster-agent is running
Service snap.eks.daemon-containerd is running
Service snap.eks.daemon-apiserver is running
Service snap.eks.daemon-apiserver-kicker is running
Service snap.eks.daemon-control-plane-kicker is running
Service snap.eks.daemon-proxy is running
Service snap.eks.daemon-kubelet is running
Service snap.eks.daemon-scheduler is running
Service snap.eks.daemon-controller-manager is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy current linux distribution to the final report tarball
Copy openSSL information to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting juju
/snap/eks/3/inspect.sh: line 285: store_juju_info: command not found
Inspecting kubeflow
/snap/eks/3/inspect.sh: line 288: store_kubeflow_info: command not found
Building the report tarball
Report tarball is at /var/snap/eks/3/inspection-report-20210125_075140.tar.gz
Environment:
Windows 10 host with 4 VirtualBox VMs running Ubuntu Server 18.04
eks installed on all using snap
cluster created successfully. cluster was functioning correctly for several hours.
Only dashboard was deployed. Dashboard was accessible from the host.
Then one node showed the error
failed command: WRITE FPDMA QUEUED .....
After some time two more nodes showed the same error. (screenshots below)
sudo eks inspect shows all services are running.
Does this have anything to do with eks?
I have been using VirtualBox with Ubuntu 18.04 VMs on this machine for a long time. I have not encountered this error before.
So today I set up 3x Ubuntu 20.04.2 LTS servers (eksd01, eksd02, eksd03) and ran snap install eks --classic --edge on each. I joined 02 and 03 to 01 in turn by running eks add-node on 01, then running the produced eks join. All fine at this point.
Later, I'm playing around with the consul-helm chart, which creates a pod and associated pv/pvc. On eksd01, the consul server pod is able to start, but on the others it's failing with a message about not being able to write to a location. I go digging around and I find this issue suggesting it's permissions: hashicorp/consul#3795.
# eksd01
ls /var/snap/eks/common/default-storage/ -alh
total 28K
drwxr-xr-x 7 root root 4.0K Apr 2 12:56 .
drwxr-xr-x 5 root root 4.0K Apr 2 07:53 ..
drwxrwxrwx 2 root root 4.0K Apr 2 12:56 consul-data-consul-eksd-server-0-pvc-a8abe292-c2d1-4890-a690-ecba8eff64b5
...
# eksd02
$ ls -alh /var/snap/eks/common/default-storage/
total 24K
drwxr-xr-x 6 root root 4.0K Apr 2 12:57 .
drwxr-xr-x 5 root root 4.0K Apr 2 08:23 ..
drwxr-xr-x 2 root root 4.0K Apr 2 12:49 default-data-default-eksd-server-1-pvc-bc7e3e99-49c8-41eb-ac50-87fe472537c5
...
Huh, weird, 0777 vs 0755.
I run chmod -R 0777 /var/snap/eks/common/default-storage/ and restart the failing pods. They come back happy, healthy and writing.
So yeah, I'm guessing it's something to do with joining nodes given the host running eks add-node hasn't had any issues.
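To compare nodes without changing anything, a small sketch like the following lists the volume directories that are not world-writable. The helper name list_non_world_writable is hypothetical; the path is the one from the listings above.

```shell
#!/usr/bin/env bash
# Sketch: print directories directly under ROOT whose "other" write bit is
# not set. list_non_world_writable is a hypothetical helper name.
list_non_world_writable() {
  local root="$1"
  # -mindepth/-maxdepth 1: only the per-volume directories under ROOT.
  # ! -perm -0002: exclude directories that are writable by "other".
  find "$root" -mindepth 1 -maxdepth 1 -type d ! -perm -0002 -print
}
```

Running `list_non_world_writable /var/snap/eks/common/default-storage` on each node should show which volumes ended up with 0755 rather than 0777.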
Some messaging and configuration refers to "microk8s". It isn't a secret that this snap is based on microk8s, but I wonder how confusing this might be for the user.
I ran into this with the user group configuration, sudo usermod -a -G microk8s $USER.
Both the user group naming and the printed help command would need updating.