open-cluster-management-io / multicluster-controlplane
A standalone controlplane to run ocm core.
License: Apache License 2.0
Update the document about how to deploy the multicluster-controlplane in a non-OCP cluster. The possible steps:
deploy_controlplane.sh
I am trying to deploy mc-cp in Linode (LKE). At least 40% of the time, things just don't work (no CSR, no CRD, etc.). From the logs, I don't see any error.
https://gist.github.com/tamalsaha/e6e5b9981c5375d823dfb88d412fbe7e
In case etcd / storage bandwidth was the issue, I tried restarting the mc-cp pod. That did not help.
I0829 06:27:13.355606 1 crds.go:80] ocm crd(managedclusteraddons.addon.open-cluster-management.io) is ready
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x2349ace]
goroutine 3391 [running]:
github.com/openshift/library-go/pkg/operator/resource/resourceapply.reportCreateEvent({0x0, 0x0}, {0x41b8050, 0xc003470500}, {0x0, 0x0})
/go/src/open-cluster-management.io/multicluster-controlplane/vendor/github.com/openshift/library-go/pkg/operator/resource/resourceapply/event_helpers.go:28 +0x2ee
github.com/openshift/library-go/pkg/operator/resource/resourceapply.ApplyCustomResourceDefinitionV1({0x41dbac0, 0xc00073ac80}, {0x7fedb0377b38, 0xc004c51140}, {0x0, 0x0}, 0xc003470500)
/go/src/open-cluster-management.io/multicluster-controlplane/vendor/github.com/openshift/library-go/pkg/operator/resource/resourceapply/apiextensions.go:22 +0x4f2
open-cluster-management.io/multicluster-controlplane/pkg/agent.(*AgentOptions).ensureCRDs(0xc0065fd680, {0x41dbac0, 0xc00073ac80}, {0x41d66a0, 0xc0076b8930})
/go/src/open-cluster-management.io/multicluster-controlplane/pkg/agent/agent.go:164 +0x1f0
open-cluster-management.io/multicluster-controlplane/pkg/agent.(*AgentOptions).RunAgent(0xc0065fd680, {0x41dbac0, 0xc00073ac80})
/go/src/open-cluster-management.io/multicluster-controlplane/pkg/agent/agent.go:133 +0x150
open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller.EnableSelfManagement({0x41dbac0, 0xc00073ac80}, 0xc00439e180?, {0x3a85f91, 0x5}, {0xc001a369c0, 0x24})
/go/src/open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller/ocmagent.go:93 +0x6ca
created by open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller.InstallSelfManagementCluster.func1
/go/src/open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller/ocmagent.go:55 +0x1e5
On restart, I don't see this panic any more. It seems the create event is recorded for an object that does not exist.
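The stack frame `reportCreateEvent({0x0, 0x0}, …)` suggests the events recorder passed in is a nil interface. A minimal sketch of a defensive guard (the `Recorder` interface here is a hypothetical simplification, not library-go's real type):

```go
package main

import "fmt"

// Recorder is a hypothetical, minimal stand-in for library-go's
// events.Recorder interface.
type Recorder interface {
	Eventf(reason, messageFmt string, args ...interface{})
}

// noopRecorder swallows events instead of panicking.
type noopRecorder struct{}

func (noopRecorder) Eventf(reason, messageFmt string, args ...interface{}) {}

// safeRecorder returns a usable recorder even when r is nil, the guard
// that would have turned this panic into a silently dropped event.
func safeRecorder(r Recorder) Recorder {
	if r == nil {
		return noopRecorder{}
	}
	return r
}

func main() {
	var r Recorder // nil, as in the crashing call
	safeRecorder(r).Eventf("CustomResourceDefinitionCreated", "crd %s created", "managedclusters")
	fmt.Println("event recorded (or dropped) without panicking")
}
```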
Now that I finally have an installation,
the command:
clusteradm --kubeconfig=<controlplane kubeconfig file> get token --use-bootstrap-token
fails for me as follows:
Error: resource mapping not found for name: "system:open-cluster-management:bootstrap" namespace: "" from "local": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1"
ensure CRDs are installed first
It seems clusteradm
expects a ClusterRole and ClusterRoleBinding to exist, but the standalone control plane does not even serve the RBAC APIs, let alone ClusterRoles.
But then, the README says to grab the token via clusteradm 🤷
I followed the instructions to build and start as a local binary:
make run
go mod tidy
go mod vendor
CGO_ENABLED=0 go build -ldflags="-s -w" -o bin/multicluster-controlplane cmd/server/main.go
hack/start-multicluster-controlplane.sh
multicluster-controlplane configurations in _output/controlplane/ocmconfig.yaml
dataDirectory: /Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm
apiserver:
port: 9443
etcd:
mode: embed
API SERVER secure port is free, proceeding...
Starting apiserver ...
Waiting for apiserver to come up
+++ [0626 22:56:32] On try 5, apiserver: : ok
use 'kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig' to access the controlplane
then on another terminal:
$ kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig get crds
error: the server doesn't have a resource type "crds"
$ kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig api-resources
NAME SHORTNAMES APIVERSION NAMESPACED KIND
bindings v1 true Binding
componentstatuses cs v1 false ComponentStatus
configmaps cm v1 true ConfigMap
endpoints ep v1 true Endpoints
events ev v1 true Event
limitranges limits v1 true LimitRange
namespaces ns v1 false Namespace
nodes no v1 false Node
persistentvolumeclaims pvc v1 true PersistentVolumeClaim
persistentvolumes pv v1 false PersistentVolume
pods po v1 true Pod
podtemplates v1 true PodTemplate
replicationcontrollers rc v1 true ReplicationController
resourcequotas quota v1 true ResourceQuota
secrets v1 true Secret
serviceaccounts sa v1 true ServiceAccount
services svc v1 true Service
apiservices apiregistration.k8s.io/v1 false APIService
Git Info: I am running code from main with
commit 12d2edb23043c630bd6d319a4fca5aa23e03834d (upstream/main)
Author: Wei Liu <[email protected]>
Date: Tue Jun 20 09:30:50 2023 +0800
The setup process downloads kubernetes v1.15.0-alpha.0, leading to a Kubernetes version replace (go: downgraded k8s.io/kubernetes v1.25.4 => v1.15.0-alpha.0) and, furthermore, causing go mod tidy to fail.
It seems that Kubernetes has a new Aggregated Discovery API:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3352-aggregated-discovery
The issue is that for mc-cp the response is not the same between the legacy mode and the new aggregated mode. This causes a problem where the addon framework thinks the CSR API type is not available.
https://github.com/open-cluster-management-io/addon-framework/blob/44852ea0722f413257fe49016009aaba25abbb42/pkg/utils/csr_helpers.go#L198
Legacy Response:
kind: APIGroupList
apiVersion: v1
groups:
- name: apiregistration.k8s.io
versions:
- groupVersion: apiregistration.k8s.io/v1
version: v1
preferredVersion:
groupVersion: apiregistration.k8s.io/v1
version: v1
- name: events.k8s.io
versions:
- groupVersion: events.k8s.io/v1
version: v1
preferredVersion:
groupVersion: events.k8s.io/v1
version: v1
- name: authentication.k8s.io
versions:
- groupVersion: authentication.k8s.io/v1
version: v1
preferredVersion:
groupVersion: authentication.k8s.io/v1
version: v1
- name: authorization.k8s.io
versions:
- groupVersion: authorization.k8s.io/v1
version: v1
preferredVersion:
groupVersion: authorization.k8s.io/v1
version: v1
- name: certificates.k8s.io
versions:
- groupVersion: certificates.k8s.io/v1
version: v1
preferredVersion:
groupVersion: certificates.k8s.io/v1
version: v1
- name: rbac.authorization.k8s.io
versions:
- groupVersion: rbac.authorization.k8s.io/v1
version: v1
preferredVersion:
groupVersion: rbac.authorization.k8s.io/v1
version: v1
- name: admissionregistration.k8s.io
versions:
- groupVersion: admissionregistration.k8s.io/v1
version: v1
preferredVersion:
groupVersion: admissionregistration.k8s.io/v1
version: v1
- name: apiextensions.k8s.io
versions:
- groupVersion: apiextensions.k8s.io/v1
version: v1
preferredVersion:
groupVersion: apiextensions.k8s.io/v1
version: v1
- name: coordination.k8s.io
versions:
- groupVersion: coordination.k8s.io/v1
version: v1
preferredVersion:
groupVersion: coordination.k8s.io/v1
version: v1
- name: discovery.k8s.io
versions:
- groupVersion: discovery.k8s.io/v1
version: v1
preferredVersion:
groupVersion: discovery.k8s.io/v1
version: v1
- name: flowcontrol.apiserver.k8s.io
versions:
- groupVersion: flowcontrol.apiserver.k8s.io/v1beta2
version: v1beta2
preferredVersion:
groupVersion: flowcontrol.apiserver.k8s.io/v1beta2
version: v1beta2
- name: cluster.open-cluster-management.io
versions:
- groupVersion: cluster.open-cluster-management.io/v1
version: v1
- groupVersion: cluster.open-cluster-management.io/v1beta2
version: v1beta2
- groupVersion: cluster.open-cluster-management.io/v1beta1
version: v1beta1
- groupVersion: cluster.open-cluster-management.io/v1alpha1
version: v1alpha1
preferredVersion:
groupVersion: cluster.open-cluster-management.io/v1
version: v1
- name: work.open-cluster-management.io
versions:
- groupVersion: work.open-cluster-management.io/v1
version: v1
- groupVersion: work.open-cluster-management.io/v1alpha1
version: v1alpha1
preferredVersion:
groupVersion: work.open-cluster-management.io/v1
version: v1
- name: addon.open-cluster-management.io
versions:
- groupVersion: addon.open-cluster-management.io/v1alpha1
version: v1alpha1
preferredVersion:
groupVersion: addon.open-cluster-management.io/v1alpha1
version: v1alpha1
- name: authentication.open-cluster-management.io
versions:
- groupVersion: authentication.open-cluster-management.io/v1beta1
version: v1beta1
- groupVersion: authentication.open-cluster-management.io/v1alpha1
version: v1alpha1
preferredVersion:
groupVersion: authentication.open-cluster-management.io/v1beta1
version: v1beta1
Aggregated Response:
kind: APIGroupDiscoveryList
apiVersion: apidiscovery.k8s.io/v2beta1
metadata: {}
items:
- metadata:
name: apiregistration.k8s.io
creationTimestamp:
versions:
- version: v1
resources:
- resource: apiservices
responseKind:
group: ''
version: ''
kind: APIService
scope: Cluster
singularResource: apiservice
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
- update
- watch
categories:
- api-extensions
subresources:
- subresource: status
responseKind:
group: ''
version: ''
kind: APIService
verbs:
- get
- patch
- update
freshness: Current
Using nativeClient.Discovery().WithLegacy() forces the addon-framework to use the legacy response format.
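To see why a legacy-only client misses groups, here is a self-contained sketch that extracts API group names from either discovery document shape shown above (the struct shapes are trimmed-down assumptions based on the YAML, not the real client-go types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// legacyList mirrors the minimal shape of a v1 APIGroupList response.
type legacyList struct {
	Groups []struct {
		Name string `json:"name"`
	} `json:"groups"`
}

// aggregatedList mirrors the minimal shape of an
// apidiscovery.k8s.io/v2beta1 APIGroupDiscoveryList response.
type aggregatedList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
	} `json:"items"`
}

// groupNames extracts API group names from either response format,
// dispatching on the document's "kind" field.
func groupNames(doc []byte) ([]string, error) {
	var probe struct {
		Kind string `json:"kind"`
	}
	if err := json.Unmarshal(doc, &probe); err != nil {
		return nil, err
	}
	switch probe.Kind {
	case "APIGroupList":
		var l legacyList
		if err := json.Unmarshal(doc, &l); err != nil {
			return nil, err
		}
		names := make([]string, 0, len(l.Groups))
		for _, g := range l.Groups {
			names = append(names, g.Name)
		}
		return names, nil
	case "APIGroupDiscoveryList":
		var a aggregatedList
		if err := json.Unmarshal(doc, &a); err != nil {
			return nil, err
		}
		names := make([]string, 0, len(a.Items))
		for _, it := range a.Items {
			names = append(names, it.Metadata.Name)
		}
		return names, nil
	}
	return nil, fmt.Errorf("unknown discovery kind %q", probe.Kind)
}

func main() {
	legacy := []byte(`{"kind":"APIGroupList","groups":[{"name":"certificates.k8s.io"}]}`)
	agg := []byte(`{"kind":"APIGroupDiscoveryList","items":[{"metadata":{"name":"apiregistration.k8s.io"}}]}`)
	for _, doc := range [][]byte{legacy, agg} {
		names, _ := groupNames(doc)
		fmt.Println(names)
	}
}
```

A client that only parses APIGroupList would return nothing for the aggregated document, which matches the addon framework concluding the CSR API is absent.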
I am installing mc-cp with --set enableSelfManagement=true. The CSR for the self
cluster is created but is not auto-approved. If I manually approve the CSR, everything seems to work as expected.
errors in multicluster-controlplane log:
W1128 15:52:58.669494 1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted
W1128 16:06:59.958809 1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted
W1128 16:21:46.038246 1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted
If you check the results of the integration test, you will find an error is thrown:
+++ [0420 07:30:59] On try 7, apiserver: : ok
use 'kubectl --kubeconfig=/home/runner/work/multicluster-controlplane/multicluster-controlplane/go/src/open-cluster-management.io/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig' to access the controlplane
12788
Joining the managed cluster integration-test to https://10.1.0.81:9443/ with clusteradm
./test/integration/hack/integration.sh: line 36: clusteradm: command not found
Error: hub-server is missing
Error: [managedclusters.cluster.open-cluster-management.io "integration-test" not found, no csr is approved yet for cluster integration-test]
Remove applied resources in the managed cluster integration-test ...
klusterlet is cleaned up already
Stop the controlplane ...
more details: https://github.com/open-cluster-management-io/multicluster-controlplane/actions/runs/4751653606/jobs/8441042752
/assign @ycyaoxdu
logs when you start the controlplane
I0606 10:14:38.833465 2585 options.go:411] the embedded etcd directory: /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/_output/controlplane/.ocm
I0606 10:14:38.833512 2585 etcd.go:34] Creating embedded etcd server
{"level":"warn","ts":1686017678.8376708,"caller":"fileutil/fileutil.go:57","msg":"check file permission","error":"directory \"/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/_output/controlplane/.ocm\" exist, but the permission is \"drwxr-xr-x\". The recommended permission is \"-rwx------\" to prevent possible unprivileged access to the data"}
/assign @ycyaoxdu
Hey all, seems like you're building something cool, and I want to use it!
Right now it's not the easiest path AFAICT though 😢
The two PRs against the README have helped a bit, but I can't quite get the helm chart itself to install successfully atm.
At first the openshift route CRD was required, but we don't use openshift.
Then I disabled the creation thereof, but now the pod crashes while looking for a multicluster-controlplane service.
Please advise, and please do consider merging the README PRs that are currently outstanding.
Thanks!
This issue is to discuss and track any effort to support cluster-proxy (and other addons) under the multicluster-controlplane architecture.
We need more tests in this area, e.g. restarting the controlplane pod, etc.
When running the multicluster controlplane, we see this error:
E0306 14:21:40.096577 70174 pathrecorder.go:108] duplicate path registration of "/healthz/etcd": original registration from goroutine 1 [running]:
runtime/debug.Stack()
/usr/local/go/src/runtime/debug/stack.go:24 +0x64
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).trackCallers(0x140003ff2d0, {0x14001de2e90, 0xd})
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:106 +0x2c
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).Handle(0x140003ff2d0, {0x14001de2e90, 0xd}, {0x103e7c1d8?, 0x14004a39560})
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:174 +0xe0
k8s.io/apiserver/pkg/server/healthz.InstallPathHandlerWithHealthyFunc({0x103e7cbf8, 0x140003ff2d0}, {0x102c7bc8b, 0x8}, 0x0, {0x14003419000?, 0x0?, 0x0?})
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:191 +0x524
k8s.io/apiserver/pkg/server/healthz.InstallPathHandler(...)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:165
k8s.io/apiserver/pkg/server/healthz.InstallHandler(...)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:134
k8s.io/apiserver/pkg/server.(*GenericAPIServer).installHealthz(0x0?)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz.go:98 +0xec
k8s.io/apiserver/pkg/server.(*GenericAPIServer).PrepareRun(0x140018c6c00)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/genericapiserver.go:438 +0x104
k8s.io/apiserver/pkg/server.(*GenericAPIServer).PrepareRun(0x140034f0300)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/genericapiserver.go:424 +0x34
k8s.io/kube-aggregator/pkg/apiserver.(*APIAggregator).PrepareRun(0x140034ffae0)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/kube-aggregator/pkg/apiserver/apiserver.go:433 +0x1b4
open-cluster-management.io/multicluster-controlplane/pkg/servers.(*server).Start(0x14003746b90, 0x140009d9b00?)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/pkg/servers/server.go:59 +0x88
open-cluster-management.io/multicluster-controlplane/pkg/cmd/controller.NewController.func1(0x14000459f00?, {0x14000665b00?, 0x4?, 0x102c75381?})
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/pkg/cmd/controller/controller.go:49 +0xe4
github.com/spf13/cobra.(*Command).execute(0x140008a4c00, {0x14000665aa0, 0x3, 0x3})
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:940 +0x658
github.com/spf13/cobra.(*Command).ExecuteC(0x140008a4900)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:1068 +0x320
github.com/spf13/cobra.(*Command).Execute(...)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:992
k8s.io/component-base/cli.run(0x140008a4900)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/component-base/cli/run.go:146 +0x264
k8s.io/component-base/cli.Run(0x105ee92c8?)
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/component-base/cli/run.go:46 +0x1c
main.main()
/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/cmd/server/main.go:26 +0x20
We need to support ManifestWorkReplicaSet in the controlplane by default.
Cannot install multiple controlplanes in a single cluster, because it is broken by the helm chart:
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "open-cluster-management:multicluster-controlplane" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "multicluster-controlplane1": current value is "multicluster-controlplane"
/assign @skeeey
Hey, just a question on this one.
For the ArgoCD Pull Integration, can you please provide installation and usage guidance in relation to this project?
Like should I install it in the same namespace as the multicluster-controlplane deployment?
Thanks!
We have created the clusterroles admin/edit/view, but the rules are null. For example:
get clusterrole edit -oyaml
aggregationRule:
clusterRoleSelectors:
- matchLabels:
rbac.authorization.k8s.io/aggregate-to-edit: "true"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
creationTimestamp: "2023-02-07T14:36:46Z"
labels:
kubernetes.io/bootstrapping: rbac-defaults
rbac.authorization.k8s.io/aggregate-to-admin: "true"
name: edit
resourceVersion: "88"
uid: 38dcc816-3e5e-4e7c-a9ba-a8ba824e25f5
rules: null
The expected behaviour is that the correct rules are populated inside.
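For context, the admin/edit/view rules are normally filled in by kube-controller-manager's clusterrole-aggregation controller; if that controller is not running in the standalone controlplane, `rules` stays null. A self-contained sketch of what the aggregation step does (the types are trimmed-down stand-ins, not the real RBAC API types):

```go
package main

import "fmt"

// PolicyRule and ClusterRole are minimal stand-ins for the RBAC types.
type PolicyRule struct {
	Verbs, APIGroups, Resources []string
}

type ClusterRole struct {
	Name   string
	Labels map[string]string
	Rules  []PolicyRule
}

// aggregate collects rules from every role carrying all the selector
// labels - the job the clusterrole-aggregation controller performs for
// roles like "edit" with rbac.authorization.k8s.io/aggregate-to-edit.
func aggregate(selector map[string]string, roles []ClusterRole) []PolicyRule {
	var out []PolicyRule
	for _, r := range roles {
		match := true
		for k, v := range selector {
			if r.Labels[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, r.Rules...)
		}
	}
	return out
}

func main() {
	selector := map[string]string{"rbac.authorization.k8s.io/aggregate-to-edit": "true"}
	roles := []ClusterRole{
		{
			Name:   "system:aggregate-to-edit-pods",
			Labels: map[string]string{"rbac.authorization.k8s.io/aggregate-to-edit": "true"},
			Rules:  []PolicyRule{{Verbs: []string{"get", "list"}, APIGroups: []string{""}, Resources: []string{"pods"}}},
		},
		{Name: "unrelated"},
	}
	fmt.Printf("aggregated %d rule(s)\n", len(aggregate(selector, roles)))
}
```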
I finally created my own load balancer service outside of the helm chart to get around the extra 'n' issue, but now, I'm seeing that the .ocm
dir is locked down enough that the pod cannot generate its own cert.
failed to generate root-ca CA certificate: mkdir /.ocm/cert: permission denied
I'll probably find a workaround for now, but please look into the permissions for this directory.
Thanks!
follow this document to add test for managedserviceaccount - https://github.com/open-cluster-management-io/managed-serviceaccount#usage
Because the controlplane needs to run the OCM hub, the Deployment resources need to be supported. Otherwise, the following error occurs when joining a cluster to the controlplane hub.
$ clusteradm join --hub-token $token --hub-apiserver ***
I1129 16:06:56.653006 2591989 recorder_in_memory.go:80] &Event{ObjectMeta:{dummy.172c19e1268f3014 dummy 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},InvolvedObject:ObjectReference{Kind:Pod,Namespace:dummy,Name:dummy,UID:,APIVersion:v1,ResourceVersion:,FieldPath:,},Reason:DeploymentCreateFailed,Message:Failed to create Deployment.apps/klusterlet -n open-cluster-management: the server could not find the requested resource (post deployments.apps),Source:EventSource{Component:clusteradm,Host:,},FirstTimestamp:2022-11-29 16:06:56.652865556 +0000 UTC m=+0.072841870,LastTimestamp:2022-11-29 16:06:56.652865556 +0000 UTC m=+0.072841870,Count:1,Type:Warning,EventTime:0001-01-01 00:00:00 +0000 UTC,Series:nil,Action:,Related:nil,ReportingController:,ReportingInstance:,}
Error: "join/operator.yaml" (*v1.Deployment): the server could not find the requested resource (post deployments.apps)
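The join fails because the controlplane does not serve the apps API group (compare the api-resources listing earlier: only core v1 resources plus apiservices). A sketch of a preflight check against such a listing (the map here is a hand-built stand-in for discovery output):

```go
package main

import "fmt"

// hasResource reports whether a group serves the named resource, given a
// map of API group -> resources as reported by `kubectl api-resources`.
func hasResource(served map[string][]string, group, resource string) bool {
	for _, r := range served[group] {
		if r == resource {
			return true
		}
	}
	return false
}

func main() {
	// Approximation of what this controlplane serves, per the earlier listing.
	served := map[string][]string{
		"":                       {"pods", "services", "configmaps", "secrets"},
		"apiregistration.k8s.io": {"apiservices"},
	}
	// The klusterlet operator is a Deployment; without apps/deployments,
	// clusteradm join cannot create it.
	fmt.Println(hasResource(served, "apps", "deployments")) // prints false
}
```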
/assign @ycyaoxdu
Run echo "# Controlplane " > /home/runner/work/changelog.txt
Error: An error occurred trying to start process '/usr/bin/bash' with working directory '/home/runner/work/multicluster-controlplane/multicluster-controlplane/go/src/open-cluster-management.io/multicluster-controlplane'. No such file or directory