
platform-operators's People

Contributors

awgreene, bparees, deads2k, dobbymoodge, dtfranz, eggfoobar, exdx, joelanford, kevinrizza, ncdc, oceanc80, openshift-ci[bot], openshift-merge-bot[bot], openshift-merge-robot, timflannagan, wking


platform-operators's Issues

Improve debugging in the e2e suite

It's difficult to debug the current e2e testing suite because there's no visibility into the state/status of the PlatformOperator/BundleDeployment/ClusterOperator resources we're testing against. The CI feedback in #23 has seen some green runs, but some flakes have been popping up over time. When investigating those flakes, it's difficult to determine whether they're legitimate or whether there are bugs in rukpak, etc.

PlatformOperator is accepting the spec.package.name via editing

PlatformOperator accepts an edit to spec.package.name but does nothing with it.

I was trying to perform some testing on the Platform Operator "cert-manager" and found a few issues.

1. Create a file as follows:

# cat cert-manager.yaml

apiVersion: platform.openshift.io/v1alpha1
kind: PlatformOperator
metadata:
  name: cert-manager
spec:
  package:
    name: openshift-cert-manager-operator


2. Create a PlatformOperator as follows:

oc create -f cert-manager.yaml

# oc get co | grep cert-manager

cert-manager 1.7.1 True False False 35s


3. Edit the PlatformOperator and change spec.package.name to file-integrity-operator:

oc edit platformoperator cert-manager

platformoperator.platform.openshift.io/cert-manager edited

[root@bastion cluster412]# oc get platformoperator cert-manager -o yaml
apiVersion: platform.openshift.io/v1alpha1
kind: PlatformOperator
metadata:
  creationTimestamp: "2023-01-23T14:55:35Z"
  generation: 2
  name: cert-manager
  resourceVersion: "51430"
  uid: 5279aabc-54a4-4d64-a371-708dcfb2dfe5
spec:
  package:
    name: file-integrity-operator
status:
  activeBundleDeployment:
    name: cert-manager
  conditions:
  - lastTransitionTime: "2023-01-23T15:00:31Z"
    message: 'load bundle: unexpected response status "404 Not Found"'
    reason: BundleLoadFailed
    status: "False"
    type: Installed

After 10 minutes, spec.package.name has still not been reverted automatically, and the File Integrity Operator is not installed.

4. Do not revert the change; delete the PlatformOperator. The cert-manager operator's project still remains present:

oc get platformOperator cert-manager -o yaml

apiVersion: platform.openshift.io/v1alpha1
kind: PlatformOperator
metadata:
  creationTimestamp: "2023-01-23T15:07:46Z"
  generation: 2
  name: cert-manager
  resourceVersion: "61924"
  uid: cdfd215c-192e-42d7-85b9-da25e84963a1
spec:
  package:
    name: file-integrity-operator
status:
  activeBundleDeployment:
    name: cert-manager
  conditions:
  - lastTransitionTime: "2023-01-23T15:08:22Z"
    message: Successfully applied the cert-manager BundleDeployment resource
    reason: InstallSuccessful
    status: "True"
    type: Installed

[root@bastion cluster412]# oc delete platformOperator cert-manager
platformoperator.platform.openshift.io "cert-manager" deleted


Here are the observations:
1. We edited the PlatformOperator and changed spec.package.name to "file-integrity-operator". The name changed in the resource, but the cert-manager operator was not uninstalled and the file-integrity-operator was not installed. This means the existing CR is not usable.

2. The name was changed. Ideally, it should not be modifiable. If the name can be changed, the CR should be reusable: either it should uninstall the older operator and install the new operator according to spec.package.name, or it should revert spec.package.name to the previous name if the change is not acceptable, or the field should not be modifiable at all.

Why is the PlatformOperator modifiable? It should at least throw an error when modified. This is on an OCP 4.12 cluster.

As I understood it, I could reuse the existing CR (initially created for cert-manager) to install a different package, the File Integrity Operator. It didn't install the File Integrity Operator, and it didn't revert the change from file-integrity-operator back to cert-manager, so I am not able to reuse the existing CR, which confused me. If nothing happens after such a change, the field should be immutable, or there should at least be a notification or warning, such as "create a new CR to install a new package". Currently I can't tell what happens after the package change.

One more question: can a single PlatformOperator resource be used to install multiple operator packages? If not, then one CR can hold only one package name at a time. In that case, what happens to the existing package when we edit the CR? Is it uninstalled, or is it left untouched while the new package is installed?

The POM component doesn't compare the current and desired state of the generated BundleDeployment resource

The POM component which is being implemented in #41 attempts to gate the "sourcing" logic to avoid performing unnecessary work as upgrades aren't supported during phase 0. This sourcing logic is responsible for finding a registry+v1 bundle that satisfies the desired admin package name. During reconciliation, we avoid performing the sourcing logic when that BD resource already exists, so it's possible for a user to modify the underlying BD resource in the current implementation.

There are a couple of ways to prevent this behavior, but the majority require API changes:

  • Extend the PlatformOperators API and include a status ref field. The reconciliation logic would persist the sourcing decision there.
  • Inject an annotation on individual PlatformOperator resources during reconciliation.
  • Inject an annotation on the underlying BundleDeployment resource.
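
Whichever option is chosen for persisting the sourcing decision, the missing comparison itself could be sketched roughly as follows. This is a minimal illustration, assuming the desired spec can be recovered without re-running the sourcing logic; BundleDeploymentSpec here is a simplified, hypothetical stand-in for the rukpak type, ensureDesired is a made-up helper name, and the image references are placeholders.

```go
package main

import (
	"fmt"
	"reflect"
)

// BundleDeploymentSpec is a simplified, hypothetical stand-in for the
// rukpak BundleDeployment spec; only a few fields are modeled.
type BundleDeploymentSpec struct {
	ProvisionerClassName       string
	BundleProvisionerClassName string
	ImageRef                   string
}

// ensureDesired compares the spec found on the cluster with the desired
// spec. It returns the spec that should be persisted and whether a write
// is needed to undo drift (for example, a manual user edit).
func ensureDesired(current, desired BundleDeploymentSpec) (BundleDeploymentSpec, bool) {
	if reflect.DeepEqual(current, desired) {
		return current, false
	}
	return desired, true
}

func main() {
	desired := BundleDeploymentSpec{
		ProvisionerClassName:       "core-rukpak-io-plain",
		BundleProvisionerClassName: "core-rukpak-io-registry",
		ImageRef:                   "example.com/bundle@sha256:aaaa", // placeholder reference
	}

	// Simulate a user modifying the underlying BundleDeployment.
	current := desired
	current.ImageRef = "example.com/other@sha256:bbbb" // placeholder reference

	spec, changed := ensureDesired(current, desired)
	fmt.Println(changed, spec.ImageRef)
}
```

The sketch only covers the comparison step; where the desired spec comes from (a status ref field or an annotation, as listed above) is left open.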

Investigate why the BundleDeployment being generated in the POM controller has an empty metadata template field

When creating a PlatformOperator resource, a BundleDeployment resource will be generated under the hood. For whatever reason, the nested metadata field of that resource's spec.template field is empty:

apiVersion: core.rukpak.io/v1alpha1
kind: BundleDeployment
metadata:
  name: cert-manager
  ownerReferences:
    ...
spec:
  provisionerClassName: core-rukpak-io-plain
  template:
    metadata: {}
    spec:
      provisionerClassName: core-rukpak-io-registry
      source:
        image:
          ref: registry.redhat.io/cert-manager/cert-manager-operator-bundle@sha256:0cb53beabc016e0a3c2358cd3855fb35f37ef27417a955cf07e7668ca44487be
        type: image

Update the Rukpak image used to be a kustomize variable

Summary

As the project evolves alongside Rukpak, it will be nice to have a guarantee that the image we're using works with our manifests. Since we're using the main tag for the Rukpak image, this is not always guaranteed. It would be nice if we could have a kustomize variable that allows us to change the image used in one place and have that be generated in the manifests.

The kustomize manifests should default to the origin images

The kustomize manifests currently specify the upstream rukpak main branch tag, and use controller:latest for the POM container image. During downstream deployments, these images are substituted with what's present in the release payload, but for local dev workflows that rely on kubectl apply -f manifests, this opens the potential for incorrect feedback. There are images available in the quay.io/openshift/origin repository that we can use by default.
