
multicluster-controlplane's People

Contributors

aii-nozomu-oki, clyang82, mikeshng, morvencao, qiujian16, skeeey, tamalsaha, yanmxa, ycyaoxdu, zhiweiyin318

multicluster-controlplane's Issues

Panic in multicluster-cp pod

I0829 06:27:13.355606       1 crds.go:80] ocm crd(managedclusteraddons.addon.open-cluster-management.io) is ready
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x2349ace]

goroutine 3391 [running]:
github.com/openshift/library-go/pkg/operator/resource/resourceapply.reportCreateEvent({0x0, 0x0}, {0x41b8050, 0xc003470500}, {0x0, 0x0})
	/go/src/open-cluster-management.io/multicluster-controlplane/vendor/github.com/openshift/library-go/pkg/operator/resource/resourceapply/event_helpers.go:28 +0x2ee
github.com/openshift/library-go/pkg/operator/resource/resourceapply.ApplyCustomResourceDefinitionV1({0x41dbac0, 0xc00073ac80}, {0x7fedb0377b38, 0xc004c51140}, {0x0, 0x0}, 0xc003470500)
	/go/src/open-cluster-management.io/multicluster-controlplane/vendor/github.com/openshift/library-go/pkg/operator/resource/resourceapply/apiextensions.go:22 +0x4f2
open-cluster-management.io/multicluster-controlplane/pkg/agent.(*AgentOptions).ensureCRDs(0xc0065fd680, {0x41dbac0, 0xc00073ac80}, {0x41d66a0, 0xc0076b8930})
	/go/src/open-cluster-management.io/multicluster-controlplane/pkg/agent/agent.go:164 +0x1f0
open-cluster-management.io/multicluster-controlplane/pkg/agent.(*AgentOptions).RunAgent(0xc0065fd680, {0x41dbac0, 0xc00073ac80})
	/go/src/open-cluster-management.io/multicluster-controlplane/pkg/agent/agent.go:133 +0x150
open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller.EnableSelfManagement({0x41dbac0, 0xc00073ac80}, 0xc00439e180?, {0x3a85f91, 0x5}, {0xc001a369c0, 0x24})
	/go/src/open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller/ocmagent.go:93 +0x6ca
created by open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller.InstallSelfManagementCluster.func1
	/go/src/open-cluster-management.io/multicluster-controlplane/pkg/controllers/ocmcontroller/ocmagent.go:55 +0x1e5

On restart, I don't see this panic any more. It seems like the record event is written on an object that does not exist.
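
One reading of the trace, offered as an assumption rather than a confirmed diagnosis: the recorder argument passed to ApplyCustomResourceDefinitionV1 and reportCreateEvent prints as {0x0, 0x0}, which is how Go shows a nil interface value. A nil recorder would only be touched on the create path, which would also explain why the panic disappears after a restart, once the CRD already exists. The sketch below uses a hypothetical Recorder interface and noopRecorder type (not the library-go API) to show the failure mode and a no-op default that avoids it.

```go
package main

import "fmt"

// Recorder is a hypothetical stand-in for an event recorder interface.
type Recorder interface {
	Eventf(reason, format string, args ...any)
}

// noopRecorder discards every event; safe to use when no real recorder is set.
type noopRecorder struct{}

func (noopRecorder) Eventf(string, string, ...any) {}

// orNoop returns r, or a no-op recorder when r is a nil interface.
func orNoop(r Recorder) Recorder {
	if r == nil {
		return noopRecorder{}
	}
	return r
}

// reportCreated mimics a helper that records an event only on the create path.
func reportCreated(r Recorder, name string) {
	r.Eventf("Created", "created %s because it was missing", name)
}

// panics reports whether f panicked.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	var r Recorder // nil interface, like {0x0, 0x0} in the trace
	fmt.Println(panics(func() { reportCreated(r, "crd") }))         // true
	fmt.Println(panics(func() { reportCreated(orNoop(r), "crd") })) // false
}
```

Passing a real or no-op recorder from the caller would have the same effect; the wrapper is only one way to express the guard.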

get token fails in bootstrap

Now that I finally have an installation,

the command:

clusteradm --kubeconfig=<controlplane kubeconfig file> get token --use-bootstrap-token

fails for me as follows:

Error: resource mapping not found for name: "system:open-cluster-management:bootstrap" namespace: "" from "local": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1"
ensure CRDs are installed first

It seems clusteradm expects a ClusterRole and Binding to exist, but the standalone control plane does not even have RBAC APIs let alone ClusterRoles.

But then, the README says to grab the token via clusteradm 🤷

Server starts ok but missing CRDs and open cluster management API resources

I followed the instructions to build and start it as a local binary:

 make run
go mod tidy 
go mod vendor
CGO_ENABLED=0 go build -ldflags="-s -w" -o bin/multicluster-controlplane cmd/server/main.go 
hack/start-multicluster-controlplane.sh
multicluster-controlplane configurations in _output/controlplane/ocmconfig.yaml
dataDirectory: /Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm
apiserver:
  port: 9443
etcd:
  mode: embed

API SERVER secure port is free, proceeding...
Starting apiserver ...
Waiting for apiserver to come up
+++ [0626 22:56:32] On try 5, apiserver: : ok
use 'kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig' to access the controlplane

then on another terminal:

$ kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig get crds
error: the server doesn't have a resource type "crds"
$ kubectl --kubeconfig=/Users/paolo/go/src/github.com/pdettori/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig api-resources
NAME                     SHORTNAMES   APIVERSION                  NAMESPACED   KIND
bindings                              v1                          true         Binding
componentstatuses        cs           v1                          false        ComponentStatus
configmaps               cm           v1                          true         ConfigMap
endpoints                ep           v1                          true         Endpoints
events                   ev           v1                          true         Event
limitranges              limits       v1                          true         LimitRange
namespaces               ns           v1                          false        Namespace
nodes                    no           v1                          false        Node
persistentvolumeclaims   pvc          v1                          true         PersistentVolumeClaim
persistentvolumes        pv           v1                          false        PersistentVolume
pods                     po           v1                          true         Pod
podtemplates                          v1                          true         PodTemplate
replicationcontrollers   rc           v1                          true         ReplicationController
resourcequotas           quota        v1                          true         ResourceQuota
secrets                               v1                          true         Secret
serviceaccounts          sa           v1                          true         ServiceAccount
services                 svc          v1                          true         Service
apiservices                           apiregistration.k8s.io/v1   false        APIService

Git Info: I am running code from main with

commit 12d2edb23043c630bd6d319a4fca5aa23e03834d (upstream/main)
Author: Wei Liu
Date:   Tue Jun 20 09:30:50 2023 +0800

Discovery api results are inconsistent across api versions

It seems that Kubernetes has a new Aggregated Discovery api:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3352-aggregated-discovery

The issue is that for mc-cp the response is not the same between the legacy mode and the new aggregated mode. As a result, the addon framework thinks the CSR API type is not available.
https://github.com/open-cluster-management-io/addon-framework/blob/44852ea0722f413257fe49016009aaba25abbb42/pkg/utils/csr_helpers.go#L198

Legacy Response:

kind: APIGroupList
apiVersion: v1
groups:
- name: apiregistration.k8s.io
  versions:
  - groupVersion: apiregistration.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: apiregistration.k8s.io/v1
    version: v1
- name: events.k8s.io
  versions:
  - groupVersion: events.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: events.k8s.io/v1
    version: v1
- name: authentication.k8s.io
  versions:
  - groupVersion: authentication.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: authentication.k8s.io/v1
    version: v1
- name: authorization.k8s.io
  versions:
  - groupVersion: authorization.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: authorization.k8s.io/v1
    version: v1
- name: certificates.k8s.io
  versions:
  - groupVersion: certificates.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: certificates.k8s.io/v1
    version: v1
- name: rbac.authorization.k8s.io
  versions:
  - groupVersion: rbac.authorization.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: rbac.authorization.k8s.io/v1
    version: v1
- name: admissionregistration.k8s.io
  versions:
  - groupVersion: admissionregistration.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: admissionregistration.k8s.io/v1
    version: v1
- name: apiextensions.k8s.io
  versions:
  - groupVersion: apiextensions.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: apiextensions.k8s.io/v1
    version: v1
- name: coordination.k8s.io
  versions:
  - groupVersion: coordination.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: coordination.k8s.io/v1
    version: v1
- name: discovery.k8s.io
  versions:
  - groupVersion: discovery.k8s.io/v1
    version: v1
  preferredVersion:
    groupVersion: discovery.k8s.io/v1
    version: v1
- name: flowcontrol.apiserver.k8s.io
  versions:
  - groupVersion: flowcontrol.apiserver.k8s.io/v1beta2
    version: v1beta2
  preferredVersion:
    groupVersion: flowcontrol.apiserver.k8s.io/v1beta2
    version: v1beta2
- name: cluster.open-cluster-management.io
  versions:
  - groupVersion: cluster.open-cluster-management.io/v1
    version: v1
  - groupVersion: cluster.open-cluster-management.io/v1beta2
    version: v1beta2
  - groupVersion: cluster.open-cluster-management.io/v1beta1
    version: v1beta1
  - groupVersion: cluster.open-cluster-management.io/v1alpha1
    version: v1alpha1
  preferredVersion:
    groupVersion: cluster.open-cluster-management.io/v1
    version: v1
- name: work.open-cluster-management.io
  versions:
  - groupVersion: work.open-cluster-management.io/v1
    version: v1
  - groupVersion: work.open-cluster-management.io/v1alpha1
    version: v1alpha1
  preferredVersion:
    groupVersion: work.open-cluster-management.io/v1
    version: v1
- name: addon.open-cluster-management.io
  versions:
  - groupVersion: addon.open-cluster-management.io/v1alpha1
    version: v1alpha1
  preferredVersion:
    groupVersion: addon.open-cluster-management.io/v1alpha1
    version: v1alpha1
- name: authentication.open-cluster-management.io
  versions:
  - groupVersion: authentication.open-cluster-management.io/v1beta1
    version: v1beta1
  - groupVersion: authentication.open-cluster-management.io/v1alpha1
    version: v1alpha1
  preferredVersion:
    groupVersion: authentication.open-cluster-management.io/v1beta1
    version: v1beta1

Aggregated Response:

kind: APIGroupDiscoveryList
apiVersion: apidiscovery.k8s.io/v2beta1
metadata: {}
items:
- metadata:
    name: apiregistration.k8s.io
    creationTimestamp:
  versions:
  - version: v1
    resources:
    - resource: apiservices
      responseKind:
        group: ''
        version: ''
        kind: APIService
      scope: Cluster
      singularResource: apiservice
      verbs:
      - create
      - delete
      - deletecollection
      - get
      - list
      - patch
      - update
      - watch
      categories:
      - api-extensions
      subresources:
      - subresource: status
        responseKind:
          group: ''
          version: ''
          kind: APIService
        verbs:
        - get
        - patch
        - update
    freshness: Current

Using the nativeClient.Discovery().WithLegacy() forces the addon-framework to use the legacy response format.

errors in logs

errors in multicluster-controlplane log:

W1128 15:52:58.669494       1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted
W1128 16:06:59.958809       1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted
W1128 16:21:46.038246       1 watcher.go:229] watch chan error: etcdserver: mvcc: required revision has been compacted

integration test is flaky

If you check the results of the integration test, you will find that an error is thrown:

+++ [0420 07:30:59] On try 7, apiserver: : ok
use 'kubectl --kubeconfig=/home/runner/work/multicluster-controlplane/multicluster-controlplane/go/src/open-cluster-management.io/multicluster-controlplane/_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig' to access the controlplane
12788
Joining the managed cluster integration-test to https://10.1.0.81:9443/ with clusteradm
./test/integration/hack/integration.sh: line 36: clusteradm: command not found
Error: hub-server is missing
Error: [managedclusters.cluster.open-cluster-management.io "integration-test" not found, no csr is approved yet for cluster integration-test]
Remove applied resources in the managed cluster integration-test ... 
klusterlet is cleaned up already
Stop the controlplane ...

more details: https://github.com/open-cluster-management-io/multicluster-controlplane/actions/runs/4751653606/jobs/8441042752

/assign @ycyaoxdu
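
The log shows the run continuing after "clusteradm: command not found", so the later errors are follow-on failures. A fail-fast check for required tools would surface the real cause first. The sketch below is written in Go to match the project's language; requireTools is a hypothetical helper and the tool names are only examples.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireTools returns an error naming the first tool in names that cannot
// be found on PATH, or nil when all of them are available.
func requireTools(names ...string) error {
	for _, n := range names {
		if _, err := exec.LookPath(n); err != nil {
			return fmt.Errorf("required tool %q not found on PATH: %w", n, err)
		}
	}
	return nil
}

func main() {
	if err := requireTools("clusteradm", "kubectl"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("all required tools found")
}
```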

Investigate the etcd folder permission

Logs when you start the controlplane:

I0606 10:14:38.833465    2585 options.go:411] the embedded etcd directory: /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/_output/controlplane/.ocm
I0606 10:14:38.833512    2585 etcd.go:34] Creating embedded etcd server
{"level":"warn","ts":1686017678.8376708,"caller":"fileutil/fileutil.go:57","msg":"check file permission","error":"directory \"/Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/_output/controlplane/.ocm\" exist, but the permission is \"drwxr-xr-x\". The recommended permission is \"-rwx------\" to prevent possible unprivileged access to the data"}

/assign @ycyaoxdu

Service Missing on Installation

Hey all, seems like you're building something cool, and I'm wanting to use it!

Right now it's not the easiest path AFAICT though 😢

The two PRs against the README have helped a bit, but I can't quite get the helm chart itself to install successfully atm.

At first the openshift route CRD was required, but we don't use openshift.

Then I disabled the creation thereof, but now the pod crashes while looking for a multicluster-controlplane service.

Please advise, and please do consider merging the README PRs that are currently outstanding.

Thanks!

pathrecorder.go /healthz/etcd error

When running the multicluster controlplane, we see this error:

E0306 14:21:40.096577   70174 pathrecorder.go:108] duplicate path registration of "/healthz/etcd": original registration from goroutine 1 [running]:
runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x64
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).trackCallers(0x140003ff2d0, {0x14001de2e90, 0xd})
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:106 +0x2c
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).Handle(0x140003ff2d0, {0x14001de2e90, 0xd}, {0x103e7c1d8?, 0x14004a39560})
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:174 +0xe0
k8s.io/apiserver/pkg/server/healthz.InstallPathHandlerWithHealthyFunc({0x103e7cbf8, 0x140003ff2d0}, {0x102c7bc8b, 0x8}, 0x0, {0x14003419000?, 0x0?, 0x0?})
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:191 +0x524
k8s.io/apiserver/pkg/server/healthz.InstallPathHandler(...)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:165
k8s.io/apiserver/pkg/server/healthz.InstallHandler(...)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:134
k8s.io/apiserver/pkg/server.(*GenericAPIServer).installHealthz(0x0?)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/healthz.go:98 +0xec
k8s.io/apiserver/pkg/server.(*GenericAPIServer).PrepareRun(0x140018c6c00)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/genericapiserver.go:438 +0x104
k8s.io/apiserver/pkg/server.(*GenericAPIServer).PrepareRun(0x140034f0300)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/apiserver/pkg/server/genericapiserver.go:424 +0x34
k8s.io/kube-aggregator/pkg/apiserver.(*APIAggregator).PrepareRun(0x140034ffae0)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/kube-aggregator/pkg/apiserver/apiserver.go:433 +0x1b4
open-cluster-management.io/multicluster-controlplane/pkg/servers.(*server).Start(0x14003746b90, 0x140009d9b00?)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/pkg/servers/server.go:59 +0x88
open-cluster-management.io/multicluster-controlplane/pkg/cmd/controller.NewController.func1(0x14000459f00?, {0x14000665b00?, 0x4?, 0x102c75381?})
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/pkg/cmd/controller/controller.go:49 +0xe4
github.com/spf13/cobra.(*Command).execute(0x140008a4c00, {0x14000665aa0, 0x3, 0x3})
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:940 +0x658
github.com/spf13/cobra.(*Command).ExecuteC(0x140008a4900)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:1068 +0x320
github.com/spf13/cobra.(*Command).Execute(...)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/github.com/spf13/cobra/command.go:992
k8s.io/component-base/cli.run(0x140008a4900)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/component-base/cli/run.go:146 +0x264
k8s.io/component-base/cli.Run(0x105ee92c8?)
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/vendor/k8s.io/component-base/cli/run.go:46 +0x1c
main.main()
        /Users/chuyang/go/src/github.com/clyang82/multicluster-controlplane/cmd/server/main.go:26 +0x20

Add e2e tests

  1. run the standalone controlplane in a container (docker run ...) and then start a KinD cluster to join as a managed cluster.
  2. check the managedcluster to ensure it is available.

Cannot install multiple controlplane in a single cluster

Multiple controlplanes cannot be installed in a single cluster: the helm chart creates a cluster-scoped ClusterRole whose name does not vary per release, so a second release fails with:
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "open-cluster-management:multicluster-controlplane" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "multicluster-controlplane1": current value is "multicluster-controlplane"

/assign @skeeey

Question: ArgoCD Pull Integration

Hey, just a question on this one.

For the ArgoCD Pull Integration, can you please provide installation and usage guidance in relation to this project?

Like should I install it in the same namespace as the multicluster-controlplane deployment?

Thanks!

clusterrole admin/edit/view is null

We have created the clusterroles admin/edit/view, but their rules are null. For example:

 get clusterrole edit -oyaml
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.authorization.k8s.io/aggregate-to-edit: "true"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2023-02-07T14:36:46Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
  name: edit
  resourceVersion: "88"
  uid: 38dcc816-3e5e-4e7c-a9ba-a8ba824e25f5
rules: null

The expected behaviour is that each of these roles contains the correct rules.

Should not create bootstrap-secret repeatedly

I found that a new bootstrap secret is created every time the control plane starts.

This may lead to unexpected results while running clusteradm get token --use-bootstrap-token against the control plane.

Cannot Generate CA as .ocm dir is too restrictive.

I finally created my own load balancer service outside of the helm chart to get around the extra 'n' issue, but now, I'm seeing that the .ocm dir is locked down enough that the pod cannot generate its own cert.

failed to generate root-ca CA certificate: mkdir /.ocm/cert: permission denied

I'll probably find a workaround for now, but please look into the permissions for this directory.

Thanks!

The standalone controlplane needs to support Deployments

Because the controlplane needs to run the OCM hub, Deployment resources need to be supported. Otherwise, the following error occurs when joining a cluster to the controlplane hub.

$ clusteradm  join --hub-token $token --hub-apiserver ***
I1129 16:06:56.653006 2591989 recorder_in_memory.go:80] &Event{ObjectMeta:{dummy.172c19e1268f3014  dummy    0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] []  []},InvolvedObject:ObjectReference{Kind:Pod,Namespace:dummy,Name:dummy,UID:,APIVersion:v1,ResourceVersion:,FieldPath:,},Reason:DeploymentCreateFailed,Message:Failed to create Deployment.apps/klusterlet -n open-cluster-management: the server could not find the requested resource (post deployments.apps),Source:EventSource{Component:clusteradm,Host:,},FirstTimestamp:2022-11-29 16:06:56.652865556 +0000 UTC m=+0.072841870,LastTimestamp:2022-11-29 16:06:56.652865556 +0000 UTC m=+0.072841870,Count:1,Type:Warning,EventTime:0001-01-01 00:00:00 +0000 UTC,Series:nil,Action:,Related:nil,ReportingController:,ReportingInstance:,}
Error: "join/operator.yaml" (*v1.Deployment): the server could not find the requested resource (post deployments.apps)

fix release error

Run echo "# Controlplane " > /home/runner/work/changelog.txt
Error: An error occurred trying to start process '/usr/bin/bash' with working directory '/home/runner/work/multicluster-controlplane/multicluster-controlplane/go/src/open-cluster-management.io/multicluster-controlplane'. No such file or directory
