
cluster-api-provider-packet's Introduction

Equinix Metal cluster-api Provider


This is the official cluster-api provider for Equinix Metal, formerly known as Packet. It implements cluster-api provider version v1beta1.

Packetbot works hard to keep your Kubernetes clusters in good shape

Upgrading to v0.8.X

IMPORTANT: We removed support for the very old packet-ccm cloud provider in this release. Please migrate to Cloud Provider Equinix Metal before upgrading.

  • Now based on CAPI 1.6; please see the Cluster API release notes for Kubernetes version compatibility and relevant upgrade notes.
  • The API version v1alpha3 has been completely removed in this release. Realistically, this was not used by anyone, but if you were using it, at this point it's likely easier to deploy a fresh cluster than to try to upgrade.
  • We're deprecating --metrics-bind-addr and defaulting to secure communications for the metrics server. Please see the upstream Cluster API PR for more info.
  • We've changed the tags applied to devices in the Equinix Metal API to start with "capp" instead of "cluster-api-provider-packet". This was done to enable longer cluster and machine names within the 80 character limit of the Equinix Metal API. If you have any automation that relies on the old tags, you'll need to update it.
  • Pursuant to the above, if you have a cluster that is likely to add new nodes WHILE you are upgrading the Cluster API Provider Packet component, add the cluster.x-k8s.io/paused annotation to your cluster object to pause reconciliation (see the example after this list). Remember to remove the annotation after the upgrade.
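
For example, a minimal sketch of pausing and later resuming reconciliation with kubectl; the cluster name my-cluster is a placeholder:

kubectl annotate clusters.cluster.x-k8s.io my-cluster cluster.x-k8s.io/paused=true
# ...upgrade the Cluster API Provider Packet component...
kubectl annotate clusters.cluster.x-k8s.io my-cluster cluster.x-k8s.io/paused-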

Upgrading to v0.7.X

IMPORTANT: Before you upgrade, please note that Facilities have been deprecated as of version v0.7.0.

  • Newly generated cluster yaml files will use Metro by default.
  • Facility is still usable, but you should migrate away from it as soon as you can
  • See here for more info on the facility deprecation: Bye Facilities, Hello (again) Metros
  • If you would like to upgrade your existing clusters from using facilities to using metros, please work with your Equinix support team to figure out the best course of action. We can also provide some support via our community Slack and the Equinix Helix community site.
  • The basic process will be to upgrade to v0.7.0, then replace facility: sv15 with metro: sv (insert your correct metro instead of sv, for more information check out our Metros documentation) in your existing PacketCluster and PacketMachineTemplate objects.
    • For example, to update a PacketCluster object from facility sv15 to metro sv
      • kubectl patch packetclusters my-cluster --type='json' -p '[{"op":"remove","path":"/spec/facility"},{"op":"add","path":"/spec/metro","value":"sv"}]'
    • To update a PacketMachineTemplate object from facility sv15 to metro sv PLEASE NOTE Most people do not set the facility on their PacketMachineTemplate objects, so you may not need to do this step.
      • kubectl patch packetmachinetemplate my-cluster-control-plane --type='json' -p '[{"op":"remove","path":"/spec/template/spec/facility"},{"op":"add","path":"/spec/template/spec/metro","value":"sv"}]'
  • The expectation is that if the devices are already in the correct metros you've specified, no disruption will happen to clusters or their devices. However, as with any breaking change, you should verify this outside of production before you upgrade.

Requirements

To use the cluster-api to deploy a Kubernetes cluster to Equinix Metal, you need the following:

  • An Equinix Metal API key
  • An Equinix Metal project ID
  • The clusterctl binary from the official cluster-api provider releases page
  • A Kubernetes cluster - the "bootstrap cluster" - that will deploy and manage the cluster on Equinix Metal.
  • kubectl - not absolutely required, but it is hard to interact with a cluster without it!

For the bootstrap cluster, any compliant cluster will work, including official Kubernetes, k3s, kind, and k3d.

Once you have your cluster, ensure your KUBECONFIG environment variable is set correctly.
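
For example, a minimal sketch using kind as the bootstrap cluster; kind writes its kubeconfig to the default location, so the export below is only needed if yours lives elsewhere:

kind create cluster
export KUBECONFIG="$HOME/.kube/config"
kubectl cluster-info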

Getting Started

You should then follow the Cluster API Quick Start Guide, selecting the 'Equinix Metal' tabs where offered.

Defaults

If you do not change the generated yaml files, they will use the following defaults. You can look in the templates/cluster-template.yaml file for details; an example of overriding them is shown after the list.

  • CPEM_VERSION (defaults to v3.7.0)
  • KUBE_VIP_VERSION (defaults to v0.6.4)
  • NODE_OS (defaults to ubuntu_20_04)
  • POD_CIDR (defaults to 192.168.0.0/16)
  • SERVICE_CIDR (defaults to 172.26.0.0/16)
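
A minimal sketch of overriding these defaults via environment variables before generating a cluster manifest; the cluster name, Kubernetes version, and machine counts are illustrative, and it assumes the provider-specific variables from the quick start (project ID, machine types, and so on) are already exported:

export NODE_OS="ubuntu_20_04"
export POD_CIDR="192.168.0.0/16"
export SERVICE_CIDR="172.26.0.0/16"
clusterctl generate cluster my-cluster \
  --infrastructure packet \
  --kubernetes-version v1.28.3 \
  --control-plane-machine-count 1 \
  --worker-machine-count 3 > my-cluster.yaml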

Reserved Hardware

If you'd like to use reserved instances for your cluster, edit your cluster yaml and add a hardwareReservationID field to your PacketMachineTemplates. That field can contain either a comma-separated list of hardware reservation IDs to use (which will cause the controller to ignore the facility and machineType you've specified), or just "next-available" to let the controller pick an available reservation that matches the machineType and facility you've specified. Here's an example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: PacketMachineTemplate
metadata:
  name: my-cluster-control-plane
  namespace: default
spec:
  template:
    spec:
      billingCycle: hourly
      machineType: c3.small.x86
      os: ubuntu_20_04
      sshKeys:
        - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDvMgVEubPLztrvVKgNPnRe9sZSjAqaYj9nmCkgr4PdK username@computer
      tags: []
      # If you want to specify the exact machines to use, provide a comma-separated list of UUIDs
      hardwareReservationID: "b537c5aa-2ef3-11ed-a261-0242ac120002,b537c5aa-2ef3-11ed-a261-0242ac120002"
      # Or let the controller pick from available reserved hardware in the project that matches machineType and facility with `next-available`
      #hardwareReservationID: "next-available"

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

Equinix has a cluster-api guide

You can reach the maintainers of this project at:

Development and Customizations

The sections above describe how to use the cluster-api provider for Packet (CAPP) as a regular user; you do not need to clone this repository or install any special tools other than the standard kubectl and clusterctl.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

cluster-api-provider-packet's People

Contributors

cheese-head, codinja1188, cpanato, cprivitere, ctreatma, davidspek, deitch, dependabot[bot], detiber, displague, gianarb, invidian, jhead-slg, johnstudarus, k8s-ci-robot, moadqassem, ncdc, neolit123, ocobles, prajyot-parab, rawkode, rkoster, rsmitty, stmcginnis, thebsdbox


cluster-api-provider-packet's Issues

Intermittent failures in PacketMachine deployment due to failed bootstrap secret read

During my testing of Cluster API, I have been intermittently hitting what appears to be a timing issue between the moment the bootstrap secret is created for a machine and the Packet controller attempting to consume it. When this occurs, the workaround is to manually delete the machine resource, which I may have to do for several machines before it works again.
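
For reference, the manual workaround amounts to something like the following; the machine name and namespace are placeholders for whatever machine is stuck:

kubectl delete machine <machine-name> -n <namespace>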

Controller log:

cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.581Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Machine Controller has not yet set OwnerRef {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.581Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.613Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Machine Controller has not yet set OwnerRef {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.613Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.626Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Machine Controller has not yet set OwnerRef {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.626Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.639Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Machine Controller has not yet set OwnerRef {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.639Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.716Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Reconciling PacketMachine   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.716Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Bootstrap data secret is not yet available  {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.729Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.729Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Reconciling PacketMachine   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.729Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Bootstrap data secret is not yet available  {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.729Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.736Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Reconciling PacketMachine   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.736Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Bootstrap data secret is not yet available  {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:38.736Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:39.166Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Reconciling PacketMachine   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:39.184Z    ERROR   controller-runtime.controller   Reconciler error    {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "error": "failed to create machine k8s-game-np-default-43e5-fz49q: impossible to retrieve bootstrap data from secret: failed to retrieve bootstrap data secret for PacketMachine dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q: Secret \"k8s-game-np-default-sqqlb\" not found"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager github.com/go-logr/zapr.(*zapLogger).Error
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager k8s.io/apimachinery/pkg/util/wait.JitterUntil
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager k8s.io/apimachinery/pkg/util/wait.Until
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager     /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:40.185Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Reconciling PacketMachine   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:40.185Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  Error state detected, skipping reconciliation   {"packetmachine": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q", "machine": "k8s-game-np-default-d5c944fdf-7v8bq", "cluster": "dev-ewr2-mine-k8s-game", "packetcluster": "dev-ewr2-mine-k8s-game"}
cluster-api-provider-packet-controller-manager-66d9fc947d-zlqln manager 2020-07-29T04:10:40.186Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "request": "dev-pkt-ewr2-mine-k8s-game/k8s-game-np-default-43e5-fz49q"}

Secret exists!

$ k get secret k8s-game-np-default-sqqlb -o yaml
apiVersion: v1
data:
  value: ...
kind: Secret
metadata:
  creationTimestamp: "2020-07-29T04:10:39Z"
  labels:
    cluster.x-k8s.io/cluster-name: dev-ewr2-mine-k8s-game
  name: k8s-game-np-default-sqqlb
  namespace: dev-pkt-ewr2-mine-k8s-game
  ownerReferences:
  - apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
    controller: true
    kind: KubeadmConfig
    name: k8s-game-np-default-sqqlb
    uid: 0428f42a-0282-4edb-91ed-970011d3c82b
  resourceVersion: "9495969"
  selfLink: /api/v1/namespaces/dev-pkt-ewr2-mine-k8s-game/secrets/k8s-game-np-default-sqqlb
  uid: 560772d3-ad97-4760-93a3-d0e8b60ce34f
type: cluster.x-k8s.io/secret

Module path should be sigs.k8s.io/cluster-api-provider-packet

When importing this package in other projects, the module path must match the source name.

$ go mod vendor
warning: ignoring symlink /Users/marques/src/openshift-installer/pkg/asset/store/data
go: sigs.k8s.io/[email protected]: parsing go.mod:
        module declares its path as: github.com/packethost/cluster-api-provider-packet
                but was required as: sigs.k8s.io/cluster-api-provider-packet

The source should be sigs.k8s.io/cluster-api-provider-packet.

Compare

module github.com/packethost/cluster-api-provider-packet

to the expected:

module sigs.k8s.io/cluster-api-provider-packet

Support different kubernetes version

In theory you can specify the Kubernetes version you want your cluster to run.

I am saying "in theory" because right now we do not support that. I feel like this is in some way related to #118, but I am not sure yet.

As you can see from cluster-template.yaml, you can specify the Kubernetes version:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-worker-a
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
    pool: worker-a
spec:
  replicas: ${WORKER_MACHINE_COUNT}
  clusterName: ${CLUSTER_NAME}
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
      pool: worker-a
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
        pool: worker-a
    spec:
      version: ${KUBERNETES_VERSION}
      clusterName: ${CLUSTER_NAME}
      bootstrap:
        configRef:
          name: ${CLUSTER_NAME}-worker-a
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
      infrastructureRef:
        name: ${CLUSTER_NAME}-worker-a
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: PacketMachineTemplate

We have to dig into it to see how the other cluster-api-providers implemented
this feature.

Support for ignition to enable usage of OS's like Flatcar

To support Flatcar Container Linux with cluster-api for Packet, I have been following the quickstart guide, but it fails when reading the Ignition config from https://metadata.packet.net/userdata.

What I did

$ export PACKET_API_KEY="..."
$ export PROJECT_ID="2c850bb4-70b4-4dcd-b60f-2d0be86d0ae4"
$ export FACILITY="ams1"
$ export NODE_OS="flatcar_stable"
$ export SSH_KEY="2020-06 [email protected]"
$ export POD_CIDR="172.25.0.0/16"
$ export SERVICE_CIDR="172.26.0.0/16"
$ export MASTER_NODE_TYPE="t1.small"
$ export WORKER_NODE_TYPE="t1.small"

$ kind create cluster
... (ok)
$ clusterctl init
... (ok)
$ clusterctl init --infrastructure packet
... (ok)
$ kubectl apply -f ./capi-dongsu.yaml
... (fail)

Error messages:

cat: unrecognized option '-------------------------------------------------------------------------------'
Try 'cat --help' for more information.
Ignition v0.33.0-1-ga7c8752-dirty
reading system config file "/usr/lib/ignition/base.ign"
parsing config with SHA512: 0131bd505bfe1b1215ca4ec9809701a3323bf448114294874f7249d8d300440bd742a7532f60673bfa0746c04de0bd5ca68d0fe9a8ecd59464b13a6401323cb4
parsed url from cmdline: ""
no config URL provided
reading system config file "/usr/lib/ignition/user.ign"
no config at "/usr/lib/ignition/user.ign"
GET https://metadata.packet.net/userdata: attempt #1
GET result: OK
parsing config with SHA512: d4e25b97bb56f8c6f56b052b48726ce2876d94c010911b3bb21099addf254545e2bdd9055aa37ca16c855eb9fb18f9bffccd598913d1b9023301069405126a1c
error at line 1, column 1
invalid character '#' looking for beginning of value
failed to fetch config: config is not valid
failed to acquire config: config is not valid
POST message to Packet Timeline
GET https://metadata.packet.net/metadata: attempt #1
GET result: OK
Ignition failed: config is not valid

capi-dongsu.yaml:

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfig
metadata:
  name: dongsu-capi-control-plane1-config
  namespace: default
spec:
  clusterConfiguration:
    controllerManager:
      extraArgs:
        enable-hostpath-provisioner: "true"
  initConfiguration:
    nodeRegistration:
      kubeletExtraArgs:
        cloud-provider: external
  postKubeadmCommands:
  - 'kubectl --kubeconfig /etc/kubernetes/admin.conf create secret generic -n kube-system
    packet-cloud-config --from-literal=cloud-sa.json=''{"apiKey": "{{ .apiKey }}","projectID":
    "2c850bb4-70b4-4dcd-b60f-2d0be86d0ae4"}'''
  - kubectl apply --kubeconfig /etc/kubernetes/admin.conf -f https://raw.githubusercontent.com/packethost/packet-ccm/master/deploy/releases/v1.0.0/deployment.yaml
  preKubeadmCommands:
  - swapoff -a
  - systemctl enable --now docker.service
  - mkdir -p /opt/bin /opt/cni/bin /etc/systemd/system/kubelet.service.d
  - curl -L "https://github.com/containernetworking/plugins/releases/download/v0.8.2/cni-plugins-linux-amd64-v0.8.2.tgz" | tar -C /opt/cni/bin -xz
  - curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.16.0/crictl-v1.16.0-linux-amd64.tar.gz" | tar -C /opt/bin -xz
  - curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/v1.18.3/bin/linux/amd64/{kubeadm,kubelet,kubectl}
  - chmod +x kubeadm kubelet kubectl
  - mv kubeadm kubelet kubectl /opt/bin
  - curl -sSL "https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.3/build/debs/kubelet.service" | sed "s:/usr/bin:/opt/bin:g" > /etc/systemd/system/kubelet.service
  - curl -sSL "https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.3/build/debs/10-kubeadm.conf" | sed "s:/usr/bin:/opt/bin:g" > /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: dongsu-capi
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 172.25.0.0/16
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: PacketCluster
    name: dongsu-capi
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketCluster
metadata:
  name: dongsu-capi
  namespace: default
spec:
  projectID: 2c850bb4-70b4-4dcd-b60f-2d0be86d0ae4
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: dongsu-capi
    cluster.x-k8s.io/control-plane: "true"
  name: dongsu-capi-master-0
  namespace: default
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
      kind: KubeadmConfig
      name: dongsu-capi-control-plane1-config
  clusterName: dongsu-capi
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: PacketMachine
    name: dongsu-capi-master-0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketMachine
metadata:
  name: dongsu-capi-master-0
  namespace: default
spec:
  OS: flatcar_stable
  billingCycle: hourly
  facility:
  - ams1
  machineType: t1.small
  sshKeys:
  - "2020-06 [email protected]"
  tags: []
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: dongsu-capi
    pool: worker-a
  name: dongsu-capi-worker-a
  namespace: default
spec:
  clusterName: dongsu-capi
  replicas: 3
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: dongsu-capi
      pool: worker-a
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: dongsu-capi
        pool: worker-a
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: dongsu-capi-worker-a
      clusterName: dongsu-capi
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: PacketMachineTemplate
        name: dongsu-capi-worker-a
      version: v1.18.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketMachineTemplate
metadata:
  name: dongsu-capi-worker-a
  namespace: default
spec:
  template:
    spec:
      OS: flatcar_stable
      billingCycle: hourly
      facility:
      - ams1
      machineType: t1.small
      sshKeys:
      - "2020-06 [email protected]"
      tags: []
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: dongsu-capi-worker-a
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cloud-provider: external
      preKubeadmCommands:
      - swapoff -a
      - systemctl enable --now docker.service
      - mkdir -p /opt/bin /opt/cni/bin /etc/systemd/system/kubelet.service.d
      - curl -L "https://github.com/containernetworking/plugins/releases/download/v0.8.2/cni-plugins-linux-amd64-v0.8.2.tgz" | tar -C /opt/cni/bin -xz
      - curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.16.0/crictl-v1.16.0-linux-amd64.tar.gz" | tar -C /opt/bin -xz
      - curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/v1.18.3/bin/linux/amd64/{kubeadm,kubelet,kubectl}
      - chmod +x kubeadm kubelet kubectl
      - mv kubeadm kubelet kubectl /opt/bin
      - curl -sSL "https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.3/build/debs/kubelet.service" | sed "s:/usr/bin:/opt/bin:g" > /etc/systemd/system/kubelet.service
      - curl -sSL "https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.3/build/debs/10-kubeadm.conf" | sed "s:/usr/bin:/opt/bin:g" > /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

As far as I understand, every Packet machine can accept "userdata" in cloud-init format. When simply creating a machine in the Packet UI, the machine gets configured without userdata. If we want to specify userdata, we need to configure it in the UI before creating the machine.

In the Packet UI for creating the machine, its help message shows:

Use this to execute script/package tasks or trigger more advanced configuration processes after the server is ready.
Paste your YAML or script here. Packet supports most cloud-init config formats. Support for formats other than #cloud-config and #! (script) are experimental.

So I think cluster-api generates a config whose first characters Ignition does not support, but I am not sure exactly where or how it gets configured. The cluster-api has a cloud-init config for userdata, though changing it does not seem to fix the issue.
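
For illustration, the two userdata formats look roughly like this; cloud-init userdata begins with a #cloud-config marker line (the '#' that Ignition rejects above), while Ignition expects a pure JSON document. Both snippets are minimal sketches, not taken from the failing cluster, and the Ignition spec version shown is illustrative:

#cloud-config
runcmd:
  - echo "cloud-init style userdata"

{"ignition": {"version": "2.2.0"}}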

Does anyone have an idea?

Thanks.

/cc @vbatts @t-lo

SERVICE_CIDR env var not honored

Packet's cluster-template.yaml has a clusterNetwork block with only pods: in it, so even if you export SERVICE_CIDR and provide the range when generating the template, the service CIDR is not included in the spec.
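
For reference, a sketch of what the clusterNetwork block would need to contain for the variable to take effect, using the template's usual variable substitution:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - ${POD_CIDR}
    services:
      cidrBlocks:
      - ${SERVICE_CIDR}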

Too many locations for versions

We have versions in too many places:

  • Makefile
  • templates/metadata-template.yaml
  • test/e2e/config/packet-dev.yaml
  • go.mod

These need to be centralized: we need one place to control versions, rather than duplicating them.

PACKET_API_TOKEN is not base64 encoded

When doing clusterctl init with the packet provider, the PACKET_API_TOKEN env var is required because the manager uses it to interact with the Packet API.

We create the secret, but its content is not base64 encoded, which means that when Kubernetes tries to attach it as an env var in the pod, it cannot decode it.
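
A minimal sketch of two ways the secret could be written so that Kubernetes can consume it; the secret and key names are illustrative, not the exact ones the provider generates:

# Option 1: base64-encode the value yourself under data
apiVersion: v1
kind: Secret
metadata:
  name: packet-api-token
type: Opaque
data:
  PACKET_API_TOKEN: <output of: echo -n "$PACKET_API_TOKEN" | base64>
---
# Option 2: let the API server encode it by using stringData
apiVersion: v1
kind: Secret
metadata:
  name: packet-api-token
type: Opaque
stringData:
  PACKET_API_TOKEN: <the raw token>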

multi-master support with template

Cluster API currently works with a single control plane node, but not with multiple masters.

Multi-master works via KubeadmControlPlane. As you can see from the graph, it works a bit differently:

https://cluster-api.sigs.k8s.io/images/control-plane-controller.png

The controller that manages KubeadmControlPlane does not create any servers until the IP is set. That creates two options:

  • Run only in single master mode, but having an explicit Machine for the master
  • Get the IP at cluster creation as part of cluster Reconcile()

CAPP currently does the first. We want to enable the second.
Packet is blocked on the second one by an internal Packet API issue around cluster management known internally as 4090. When that is fixed, this issue will be addressed.

Control Plane rolling update stall with EIP

cluster-api v0.3.7
capp v0.3.2
packet-ccm v1.1.0

I am hitting an issue with cluster-api's ability to roll the control plane nodes. This appears to be because of how we bind the EIP to the node via the lo:0 interface and how cluster-api tears down the node's etcd instance before fully draining the workloads running on the node.

To reproduce, bring up a 1-node control plane with 1 worker node, then edit the KubeadmControlPlane's kubeadmConfigSpec, changing the postKubeadmCommands to include an echo or other innocuous addition.

The new control plane node will begin deployment and eventually come into service. At some point, cluster-api kills the etcd instance of the older control plane node and the Packet CCM EIP health check moves the EIP to the new node. Once etcd goes away, the kube-apiserver panics and the node stalls, unable to reach the EIP, which is bound locally with no running kube-apiserver.

After several minutes, on the new control plane you can see various pods stuck in Terminating and/or Pending state. The cluster-api will not progress past this point.

# k get -A pods -o wide | grep -v Running
NAMESPACE        NAME                                               READY   STATUS        RESTARTS   AGE   IP              NODE                             NOMINATED NODE   READINESS GATES
core             cert-manager-webhook-69c8965665-49cfh              1/1     Terminating   0          11h   240.0.18.144    k8s-game-cp-1d5ce5-6wnjj         <none>           <none>
kube-system      cilium-operator-7597b4574b-bg94f                   1/1     Terminating   0          11h   10.66.5.5       k8s-game-cp-1d5ce5-6wnjj         <none>           <none>
kube-system      cilium-operator-7597b4574b-nlbjw                   0/1     Pending       0          20m   <none>          <none>                           <none>           <none>
kube-system      cilium-sjtk8                                       0/1     Pending       0          28m   <none>          <none>                           <none>           <none>
kube-system      coredns-66bff467f8-jznm9                           1/1     Terminating   0          11h   240.0.18.145    k8s-game-cp-1d5ce5-6wnjj         <none>           <none>
kube-system      coredns-66bff467f8-s77cv                           0/1     Pending       0          20m   <none>          <none>                           <none>           <none>
topolvm-system   controller-7d85c6bbbc-8ps5q                        0/5     Pending       0          20m   <none>          <none>                           <none>           <none>
topolvm-system   controller-7d85c6bbbc-ppvvz                        5/5     Terminating   0          11h   240.0.18.12     k8s-game-cp-1d5ce5-6wnjj         <none>           <none>

To get things moving again, you have to go onto the old control plane node and run ip addr del <EIP>/32 dev lo. Once this is done, the local kubelet can talk to the API again, cluster-api evicts the pods, and the old node is deleted.

I believe these issues may be related:

kubernetes-sigs/cluster-api#2937
kubernetes-sigs/cluster-api#2652

As a workaround, I created the following script along with a systemd service, which gets installed onto all control plane nodes. This setup allows the rolling update to occur without manual interaction.

Script:

#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

EIP=$1

while true; do
    rc=0
    # Probe the apiserver through the EIP; retry briefly to ride out transient failures.
    curl -fksS --retry 9 --retry-connrefused --retry-max-time 180 https://$EIP:6443/healthz || rc=$?
    # curl exit code 7 means it failed to connect, i.e. nothing is answering on the EIP anymore.
    if [[ $rc -eq 7 ]]; then
        echo "removing EIP $EIP"
        ifdown lo:0
        ip addr del $EIP/32 dev lo || true
        break
    fi
    echo ""
    sleep $(($RANDOM % 15))
done

postKubeadmCommands addition:

        cat <<EOT > /etc/systemd/system/packet-eip-health.service
        [Unit]
        Description=Packet EIP health check
        Wants=kubelet.service
        After=kubelet.service

        [Service]
        Type=simple
        Restart=on-failure
        ExecStart=/usr/local/bin/packet-eip-health.sh {{ .controlPlaneEndpoint }}

        [Install]
        WantedBy=multi-user.target
        EOT

        systemctl daemon-reload
        systemctl enable packet-eip-health
        systemctl start packet-eip-health

Figure out where we stand about operating systems

Currently, the cluster-template.yaml is tied to Ubuntu (maybe Debian as well),
because we use the BootstrapTemplate to install packages like kubeadm,
docker, kubectl, and so on:

kind: KubeadmConfig
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
metadata:
  name: "${CLUSTER_NAME}-control-plane1-config"
spec:
  preKubeadmCommands:
    - swapoff -a
    - apt-get -y update
    - DEBIAN_FRONTEND=noninteractive apt-get install -y apt-transport-https curl
    - curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
    - echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
    - curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    - apt-key fingerprint 0EBFCD88
    - add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
    - apt-get update -y
    - apt-get install -y ca-certificates socat jq ebtables apt-transport-https cloud-utils prips docker-ce docker-ce-cli containerd.io kubelet kubeadm kubectl
    - systemctl daemon-reload
    - systemctl enable docker
    - systemctl start docker

As you can see this does not work for every operating system.

I am not 100% sure how the other cluster-api providers manage this; I
remember AWS and DigitalOcean generating their own images.

I don't think we should put in the effort to support multiple operating
systems, but we have to be clear about it.

In general, the template can be modified in many ways by the end user to
make the installation process work for CentOS or Arch Linux; it is just a
matter of changing the package manager.

In any case, I like the idea of maintaining a list of images that people can
grab from Packet and use with Cluster API, mainly because it will decrease
the number of runtime actions that can fail along the way.

So, this issue is to understand what you think about it.

A first step should be to at least write documentation about where we stand,
to spare users the pain of discovering that CentOS does not work when trying
the cluster-api.

Move facility from Machine to Cluster

Right now the facility is part of the PacketMachine Spec. This has two issues:

  1. It drives you to the wrong assumption that you can easily deploy machines that are part of the same cluster in multiple facilities. This is not currently possible and very hard to achieve, because it raises network configuration, security, and latency concerns. Nobody in the Kubernetes community recommends that, and neither should we.
  2. I am in the process of supporting multi-master (#47). When requiring an IP for the cluster, we have to specify the facility. The fact that the facility is not part of the Cluster makes selecting it awkward and flaky (see the sketch after this list).
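
For context, a sketch of what the proposal implies for the PacketCluster spec, with the facility declared once at the cluster level; the field placement reflects the proposal and the values are illustrative:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketCluster
metadata:
  name: my-cluster
spec:
  projectID: <project-uuid>
  facility: ams1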

Reassign ElasticIP when control plane is not reachable

Currently, we create an elastic IP for every cluster and assign it to one of the control-plane nodes (the first one that gets created).

There is no process to re-assign it in case of a failure of that control plane node.

There are currently two possibilities:

  1. We can write a routine that checks the IPs for all the clusters and, if one does not work (the apiserver or host is down), assigns the IP to another control-plane node.
  2. We can use bird and BGP

Add support for reservation IDs with MachineDeployment

Issue #133 and related PR #134 reflect the request to deploy a single Machine with its attendant HardwareReservationID.

However, usually machines are deployed with a MachineDeployment, which has a PacketMachineTemplate.

This issue is to track what that would look like, eventually leading to a PR.

What to do when we encounter a provisioning failure

We have encountered a provisioning failure for your device test1-g7tqp-master-0.

The Packet API returns:

403 You are not authorized to view this device

Right now the MachineController retries endlessly, which does not sound great. Should we treat the error as we do when the controller receives a 404? In that case we assume the server is no longer running and mark the reconciliation as a success. That is not ideal, because a 403 may be fixed by generating a new API key. I think the API is not returning the right status code here.

@deitch

Add CSI to addons.yaml.template

addons.yaml.template includes the CCM (a basic necessity for getting the cluster to work) but not the CSI driver. We should consider including it, depending on the status of block storage support at Packet.

clusterctl init fails due to an invalid URL

I am trying to follow the steps in README.md, but the clusterctl init step fails with this error:

$ clusterctl --config=https://github.com/packethost/cluster-api-provider-packet/releases/latest/clusterctl.yaml init --infrastructure=packet
Error: failed to initialize the configuration reader: stat https://github.com/packethost/cluster-api-provider-packet/releases/latest/clusterctl.yaml: no such file or directory

The URL https://github.com/packethost/cluster-api-provider-packet/releases/latest/clusterctl.yaml is indeed invalid.

I also tried going back to the latest release, v0.1.0. However, the README of v0.1.0 does not seem to be compatible with the current clusterctl v0.3.6.

So I am not sure how to proceed.
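
For reference, current versions of clusterctl ship with the packet provider in their built-in registry, so initialization should not need a custom --config URL; a minimal sketch, with the API key supplied by you:

export PACKET_API_KEY="..."
clusterctl init --infrastructure packet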

Switch to Calico over Weave

The cluster is initialized with Weave's CNI, but to really do anything on Packet you need Calico for BGP + MetalLB.

Enable multi-master

Currently, the implementation assumes one master Machine. It is possible to do multi-master with kubeadm.

Elastic IP

Consider using an Elastic IP for the cluster, which is assigned to the master, rather than the first master's public IP.

It might be possible to do this via MetalLB, which would simplify it.

URLs to the Packet CCM and CSI could break

The URLs included in the cluster template are relative to the master branch on the CCM and CSI repositories. If this branch is renamed or the files are removed or relocated, in either repository, this provider will break.

postKubeadmCommands:
- 'kubectl --kubeconfig /etc/kubernetes/admin.conf create secret generic -n kube-system packet-cloud-config --from-literal=cloud-sa.json=''{"apiKey": "{{ .apiKey }}","projectID": "f2a2d7ad-886e-4207-bf38-10ebdf49cf84"}'''
- kubectl apply --kubeconfig /etc/kubernetes/admin.conf -f https://raw.githubusercontent.com/packethost/packet-ccm/master/deploy/releases/v1.0.0/deployment.yaml

These URLs should be pinned to a specific tag to make them less fragile.

Make status.addresses JSON values

I would like to use kubectl get machine X -o jsonpath... to parse out addresses assigned to a node. Unfortunately, the values in the status.addresses array are stringified Go data structures. Making these JSON values would allow me to parse out whatever I needed from the entries.

status:
  addresses:
  ...
  - address: packngo.IPAddressAssignment{IpAddressCommon:packngo.IpAddressCommon{ID:"REDACTED",
      Address:"10.100.2.17", Gateway:"10.100.2.16", Network:"10.100.2.16", AddressFamily:4,
      Netmask:"255.255.255.254", Public:false, CIDR:31, Created:"2020-07-25T14:54:10Z",
      Updated:"", Href:"/ips/REDACTED", Management:true,
      Manageable:true, Project:packngo.Href{Href:"/projects/REDACTED"},
      Tags:[], CustomData:map[]}, AssignedTo:packngo.Href{Href:"/devices/REDACTED"}}
    type: InternalIP
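
For illustration, this is the kind of query that becomes possible once the entries are plain JSON values; the machine name is a placeholder:

kubectl get machine <machine-name> -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'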

Uniqueness to machine info and ability to reference in kubeadmConfigSpec

Once the machine resource is created, it contains several useful variables, most notably the address block:

Ex:

  addresses:
    - address: 147.75.194.145
      type: ExternalIP
    - address: 147.75.105.125
      type: ExternalIP
    - address: '2604:1380:1:1200::37'
      type: ExternalIP
    - address: 10.99.189.55
      type: InternalIP

However, you can see that the type is not unique enough to distinguish them. Each respective kind of address, e.g. IPv4 vs. IPv6 and private vs. public, should have a unique type to reference.

Also, it appears as though the creation of the master and the IP it is able to advertise is coupled with the ability to create and apply an elastic IP. If you can omit {{ .controlPlaneEndpoint }} and have it default to one of the hosts' addresses, then maybe just the docs need to be updated; otherwise it would be valuable to be able to reference these addresses as variables during cluster initialization.

The end goal is that one does not have to have an elastic IP assigned to the master, and could instead reference a variable like InternalIP within the respective kubeadmConfigSpec blocks, e.g.:

    initConfiguration:
      localAPIEndpoint:
        advertiseAddress: {{ InternalIP }}
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external

Double masters

While testing, 2 masters were created instead of just the one I was intending to create. This worked as expected the 6 times before, but on the 7th attempt it gave me this weird behavior. The "cluster-api-provider-packet-controller-manager" logs are as follows:

2020-08-14T22:24:47.525Z INFO controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3 Machine instance is pending {"packetmachine": "default/kluister07-control-plane-8rkkc", "machine": "kluister07-control-plane-7gz9d", "cluster": "kluister07", "packetcluster": "kluister07", "instance-id": "99a54cbf-13ec-4475-9cef-4e19fc1ee86b"}

2020-08-14T22:24:54.415Z INFO controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3 Machine instance is pending {"packetmachine": "default/kluister07-control-plane-8rkkc", "machine": "kluister07-control-plane-7gz9d", "cluster": "kluister07", "packetcluster": "kluister07", "instance-id": "66343c24-7adb-45da-ae8b-5cedbc5b52b8"}

It had the same object name, but as you can see, the instance-id was different. I am leaving these machines and logs up just in case; let me know if there are any other log locations you need.

Write a doc about ipxe and custom OS

Recently I added a new doc folder that I called docs/experiences, where the idea is to write about how people are using cluster-api, because it is very flexible and it is not simple to document all of its pieces by themselves. I think it is nice to share what we do, and I hope those use cases will drive personalized solutions for other people.

I would like to ask @rsmitty and @andrewrynhard from Talos to write up their story of using iPXE and a custom OS (#131).

Should we have a chat about it?

/kind documentation
