websphere-liberty-operator's Introduction

IBM WebSphere Liberty Operator

The WebSphere Liberty Operator allows you to deploy and manage containerized Liberty applications securely and easily on Red Hat OpenShift as well as other Kubernetes-based platforms in a consistent way. You can also perform Day-2 operations such as gathering traces and memory dumps easily using the operator.

Documentation

For information on how to use the WebSphere Liberty Operator, see the documentation.

Issues and Contributions

For issues relating specifically to the operator, please use the GitHub Issues tracker. For more general issues relating to IBM WebSphere Application Server Liberty you can get help through the WASdev community or, if you have production licenses for WebSphere Application Server, via the usual support channels. We welcome contributions following our guidelines.

License

This project is licensed under the Apache License 2.0.

websphere-liberty-operator's People

Contributors

arturdzm, bradleymayo, davco01a, dependabot[bot], dtadcox, halim-lee, hibell, idlewis, kabicin, leochr, mcurran-us, mingcyu, nottycode, pbaity, smcclem, william-reames, yongja79

websphere-liberty-operator's Issues

[linter] Add license/documentation links to csv

This is required by the linter. It needs to be done for all three CRs in the owned section of the CSV, and should look something like this:

spec:
  customresourcedefinitions:
    owned:
    - description: 'Documentation For additional details regarding install parameters
        check: http://ibm.biz/<product>-readme. License By installing this product
        you accept the license terms http://ibm.biz/<product>-license.'

Populate CSV's description from a separate file

The CSV has a description section (here). Its content is currently specified in the CSV itself. Because the content is in Markdown format, it is hard to edit and format while it is embedded in the CSV. Instead, we should keep the content in a separate description.md file, and running make bundle should populate the CSV from it. The WebSphere Automation Operator does this; use it as a reference (link).

Probes default values

The dynamic linter includes this error:

***[ERROR] scanned-statefulset-dulfcbe-dyn-0-websphereliberty-app-sample.yaml: (failureThreshold * periodSeconds) + initialSecondsDelay for livenessProbe (60) less than values for readinessProbe (90) for container spec.template.spec.containers[0] (ContainerLivenessLongerThanReadiness) --> see references https://github.ibm.com/IBMPrivateCloud/content-verification/blob/master/docs/rules/ContainerLivenessLongerThanReadiness.md

The sample app runs with this configuration:

apiVersion: liberty.websphere.ibm.com/v1
kind: WebSphereLibertyApplication
metadata:
  name: websphereliberty-app-sample
spec:
  license:
    accept: false
    edition: IBM WebSphere Application Server
    productEntitlementSource: Standalone
    metric: Virtual Processor Core (VPC)
  applicationImage: registry.connect.redhat.com/ibm/open-liberty-samples:springPetClinic
  expose: true
  replicas: 3
  probes:
    liveness:
      httpGet:
        path: /
        port: 9443
        scheme: HTTPS
      initialDelaySeconds: 30 
    readiness: 
      httpGet:
        path: /
        port: 9443
        scheme: HTTPS
      initialDelaySeconds: 60
  resources:
    limits:
      cpu: '1'
      memory: 1Gi
    requests:
      cpu: 500m 
      memory: 400Mi
  statefulSet: {}
  serviceability:
    size: 1Gi
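
The linter's arithmetic can be reproduced with a short sketch. It assumes that the fields omitted from the sample fall back to the Kubernetes defaults (periodSeconds 10, failureThreshold 3), which is consistent with the 60 and 90 in the error message. The helper name probe_window is illustrative; the linter's actual implementation is not shown in this issue.

```python
# Kubernetes defaults for probe fields that are left unset.
DEFAULT_PERIOD_SECONDS = 10
DEFAULT_FAILURE_THRESHOLD = 3


def probe_window(probe: dict) -> int:
    """Return (failureThreshold * periodSeconds) + initialDelaySeconds."""
    period = probe.get("periodSeconds", DEFAULT_PERIOD_SECONDS)
    failures = probe.get("failureThreshold", DEFAULT_FAILURE_THRESHOLD)
    delay = probe.get("initialDelaySeconds", 0)
    return failures * period + delay


# Probe timing values taken from the sample CR above.
liveness = {"initialDelaySeconds": 30}
readiness = {"initialDelaySeconds": 60}

liveness_window = probe_window(liveness)
readiness_window = probe_window(readiness)

# The linter reports an error when the liveness window is the shorter one.
rule_violated = liveness_window < readiness_window
print(liveness_window, readiness_window, rule_violated)
```

Under the same assumption, raising the liveness initialDelaySeconds to 60 would make the two windows equal, which the wording of the error ("less than") suggests would no longer be flagged.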

Defaults for liveness and readiness probes cause Liberty application to not become ready

I am deploying a Liberty application using the WebSphere Liberty Operator with the default probe settings. The application is not able to reach the ready state.
Default configuration:

probes:
    readiness: {}
    liveness: {}

The above configuration creates the following probes in the deployment:

          readinessProbe:
            httpGet:
              path: /health/ready
              port: 9443
              scheme: HTTPS
            initialDelaySeconds: 10
            timeoutSeconds: 2
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 10
          terminationMessagePath: /dev/termination-log
          name: app
          livenessProbe:
            httpGet:
              path: /health/live
              port: 9443
              scheme: HTTPS
            initialDelaySeconds: 60
            timeoutSeconds: 2
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3

The problem seems to be that the liveness probe starts at 60 seconds and tries only 3 times, once every 10 seconds, which gives the application only 90 seconds (60 + 10 x 3) to become ready. That is not enough, because it typically takes about 2 minutes for the application to become ready. The liveness probe therefore kills the pod just as it is about to become ready, and the loop continues. This problem happened for at least 2 SVT applications. It can be resolved by updating the configuration in several ways. The liveness probe values below are working: failureThreshold is increased to 12, and periodSeconds is increased to 20 so that the probe checks every 20 seconds rather than every 10.

liveness: 
        httpGet:
          path: /health/live
          port: 9443
          scheme: HTTPS
        initialDelaySeconds: 60
        timeoutSeconds: 2
        periodSeconds: 20
        successThreshold: 1
        failureThreshold: 12

I suggest that we update the default liveness probe values to ones that work for a typical customer application. These defaults can be overridden, but more forgiving default values would be helpful.
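
To make the arithmetic concrete, the sketch below compares how long each liveness configuration allows before the container is restarted, using the same simplified formula as this report. The 120-second start-up time is the approximate figure quoted above, and the function name is illustrative.

```python
def seconds_before_restart(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate time a container has before failed liveness probes restart it."""
    return initial_delay + period * failure_threshold


TYPICAL_STARTUP_SECONDS = 120  # "typically 2 minutes", per the report above

# Operator defaults observed in the generated deployment.
default_budget = seconds_before_restart(initial_delay=60, period=10, failure_threshold=3)
# Values reported as working.
suggested_budget = seconds_before_restart(initial_delay=60, period=20, failure_threshold=12)

for name, budget in (("default", default_budget), ("suggested", suggested_budget)):
    verdict = "enough" if budget >= TYPICAL_STARTUP_SECONDS else "too short"
    print(f"{name}: {budget}s ({verdict})")
```

With these inputs the default configuration allows 90 seconds and the suggested one allows 300 seconds.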

Report reconciled version in status

Report the reconciled version at .status.versions.reconciled

This should be set only after the entire reconciliation process is completed successfully.

This is also checked by the linter.
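
The ordering requirement can be illustrated with a minimal sketch. It models the custom resource as a plain dictionary and the reconcile steps as callables; all names here are hypothetical, and it is written in Python for brevity rather than being taken from the operator's code.

```python
def reconcile(resource: dict, steps: list, operator_version: str) -> bool:
    """Run every step; record the version only if all of them succeed."""
    for step in steps:
        try:
            step(resource)
        except Exception:
            # A failed step ends the pass without touching the reported version.
            return False
    versions = resource.setdefault("status", {}).setdefault("versions", {})
    versions["reconciled"] = operator_version
    return True


def ok_step(resource: dict) -> None:
    pass  # stands in for creating or updating a managed object


def failing_step(resource: dict) -> None:
    raise RuntimeError("simulated failure")


failed = {"status": {}}
reconcile(failed, [ok_step, failing_step], "1.0.0")

succeeded = {"status": {}}
reconcile(succeeded, [ok_step, ok_step], "1.0.0")

print(failed["status"], succeeded["status"])
```

Writing the version as the last action means a partially reconciled resource never reports a version it has not fully reached.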

Create automated tests for base functionalities

Add automated tests to verify the functionality of the WL operator. Use the kuttl framework for the tests.

We can start by copying the tests from Open Liberty Operator and changing the relevant images from Open Liberty to WebSphere Liberty.

Operator README

The README for the operator must contain the following information.

Template Fields:

  • Name: Product Name
  • Introduction: Paragraph overview of the workload
  • Details: Describes what the workload is deploying
  • Prerequisites: Describes Environmental pre-requisites
  • SecurityContextConstraints Requirements: Describe the SecurityContextConstraints pre-requisite
  • Resources Required: Describes Minimum System Resources Required
  • Installing: Ways to install
  • Configuration: Define the parameters/configuration options
  • Storage: Define how workload storage is set up
  • Limitations: Define key limitations

Use WebSphere Automation README as an example/reference

Additional information:
Operator README

Enhance rendering in OpenShift UI

Review and ensure that the CR fields are rendering as expected in OpenShift. They should be logically grouped and ordered.

For some CR fields (especially the k8s in-lined types like probes/volumes), evaluate whether it's possible to hide or exclude advanced/optional fields to keep the config simpler.

[scan] Investigate and resolve items flagged by detect-secrets scanner

OnePipeline runs the detect-secrets scanner step (as part of the code-ci-compliance stage)

Review the results and address the actual issues. For the false positives, investigate if there is a way to add an override, so they don't continue to show up in the results and mark the step as a failure. OnePipeline uses this tool to scan: https://w3.ibm.com/w3publisher/detect-secrets/developer-tool

detect-secrets.log

Sample results:

FOUND SECRETS:
File: api/v1/webspherelibertyapplication_types.go Line: 307 Type: Secret Keyword
File: bundle/manifests/ibm-websphere-liberty.clusterserviceversion.yaml Line: 308 Type: Secret Keyword
File: bundle/manifests/liberty.websphere.ibm.com_webspherelibertyapplications.yaml Line: 1926 Type: Secret Keyword

Run e2e tests on minikube

Run the e2e tests on minikube

OLO runs the minikube tests as part of Travis (here). The scripts from it can be reused to run the tests on minikube as part of OnePipeline.

This relevant PR in RCO has some changes that are not yet in OLO's scripts - which may need to be ported to WLO.

Add documentation for manage TLS/certificates

Describe the default manage TLS behaviour with cert-manager or OpenShift Certificate Manager, and how to configure it (in the CR as well as in the operator's ConfigMap), override it, or disable it. Also document the reserved secret names used by the operator.

Since this is common for all runtime operators, add it to RCO's draft doc for v1 release

[scan] Assess the whitesource scan results

OnePipeline doesn't support whitesource at this time. It's on their roadmap. Alternative is to use GitHub Enterprise, which provides whitesource, to scan the operator source (the source has to be copied to GH Enterprise repo - not sure if there is a way to link to this GH public repo).

Following the guidelines from https://pages.github.ibm.com/Whitesource-IBM/whitesource/, the whitesource scan is enabled on the GitHub enterprise repo: https://github.ibm.com/websphere/liberty-operator

It links to this repo as a submodule. But it hasn't triggered a scan. Find a way to scan it and then assess the results. Whitesource should raise GitHub issues for the CVEs it detects. For WLO, expect at least a few issues to be reported.

Slack channel for requesting help with whitesource: #sos-whitesource

Enhance Makefile to easily run operator on cluster

Most of the time the operator can be run locally during development. But in some cases, especially for OLM scenarios, the operator should be deployed to run on the cluster. Enhance the Makefile to easily do this.

Add License Acceptance Option to WebSphere Liberty Operator

Now that the Operators will be delivered with Liberty/tWAS, we were discussing with Alasdair how the license will be displayed. There will be a pull-down choice that allows the customer to accept the license via WAS, Family Edition, CP4Apps or WHE. This will need to be implemented by the Operators. I think we need issues to track this, which I can get opened for both Operators.

Slack discussion: https://ibm-cloud.slack.com/archives/C020VHDP0M8/p1639495015013000

Implement Service Binding Spec

Service Binding Spec 1.0 has been released:
https://servicebinding.io/

In our operator, we provide some implementation via the service.binding field. More information is here: https://github.com/application-stacks/runtime-component-operator/blob/main/doc/user-guide-v1beta2.adoc#service-binding

Evaluate what's missing and implement.

The generated Secret only includes the service URL. When a Route is enabled, should the service URL be replaced by the Route URL, or should the Route URL be added via a separate key?

Modify CSV description as per linter requirements

The description in the CSV must match the linter rules:
"the CSV description must contain the same sections as the case README in markdown."

  description: |-
    # Introduction
    # Details
    Various Details
    ## Prerequisites
    Various prerequisites
    ## Resources Required
    Various Resources Required
    ## Installing
    Various Installation details
    # Configuration
    Various Configuration information
    ### Limitations
    Various Limitations
    ### SecurityContextConstraints Requirements
    SCC Requirements
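
A check along these lines could catch a missing section before the linter does. This is only a sketch: the list of required headings is copied from the example description above rather than from the linter's rules, and the function name is hypothetical.

```python
import re

# Section headings taken from the example description above (an assumption
# about what the linter requires, not its authoritative list).
REQUIRED_SECTIONS = [
    "Introduction",
    "Details",
    "Prerequisites",
    "Resources Required",
    "Installing",
    "Configuration",
    "Limitations",
    "SecurityContextConstraints Requirements",
]


def missing_sections(description: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required headings that do not appear in the description."""
    pattern = re.compile(r"^[ \t]*#+[ \t]+(.+)$", re.MULTILINE)
    headings = {match.group(1).strip() for match in pattern.finditer(description)}
    return [name for name in required if name not in headings]


sample = """\
# Introduction
# Details
Various Details
## Prerequisites
Various prerequisites
"""

missing = missing_sections(sample)
print(missing)
```

The check ignores heading level on purpose, since the example mixes #, ## and ### for sections of the same kind.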

Create default affinity to spread pods across nodes when arch is specified

The operator creates an affinity by default to spread out the pods on different nodes (here in common code). But when .spec.affinity.* is specified, it doesn't create this. We should make an exception to .spec.affinity.architecture so users can easily specify the arch(s) for their app and still get the default affinity to spread out instances.

Documentation

Gather documentation related details:

  • Potential limitation: initialDelaySeconds of probes can't be set to 0. It must be set to 1 or above to override the default value that the operator sets.
  • Troubleshooting: the operator pod runs out of memory/CPU. This can happen if a high number of applications are managed by the operator. Provide instructions to override the default values. Example is here
  • Ingress limitations: URI for service binding, endpoints

Config Map missing in operators global namespace

I installed the WebSphere Liberty operator in the global (openshift-operators) namespace and created a sample application CR with all the defaults.
The sample application was installed with the default manageTLS: true option, and the operator created all the resources needed for the sample to work with the default OpenShift certificates. There should be a Config Map in the global namespace for overriding the certificate issuer properties, but I cannot find it.

Here is the output from openshift-operators namespace.

oc get cm -n openshift-operators
NAME                         DATA   AGE
7111f50b.websphere.ibm.com   0      5h34m
kube-root-ca.crt             1      29h
openshift-service-ca.crt     1      29h

Create the base structure for the WL operator

  • create an operator project (use same operator-sdk version as RCO/OLO)
    • operator/project name: ibm-websphere-liberty
    • group name: liberty.websphere.ibm.com
    • API version: v1
  • create the APIs and the controllers
    • WebSphereLibertyApplication
    • WebSphereLibertyTrace
    • WebSphereLibertyDump
  • add the Makefile
  • add the scripts

Exclude vendor files. Let's not check them in. Add vendor/ to .gitignore

Update operator version, channel and name

Update the version of operator (in Makefile, Dockerfiles, etc) to 1.0.0
Update the channel and default channel to v1.0
Update the name of operator bundle/CSV from websphere-liberty to ibm-websphere-liberty

Run e2e tests with operator installed via a Subscription

The e2e tests run with operator deployed via operator-sdk run bundle (here). Change it to add a CatalogSource using the built Catalog image and install the operator via a Subscription (OwnNamespace mode).

The dev.sh script does the creation of CatalogSource, OperatorGroup and Subscription. The updateStrategy field in CatalogSource can be omitted since that's only needed for dev.

Add documentation for network policy

Describe the default network policy created out-of-the-box and document ways to configure or disable it.

Since this is common for all runtime operators, add it to RCO's draft doc for v1 release

Resolve e2e test failures

Some tests are failing in OnePipeline. Knative/Serverless needs to be installed on the cluster.

Results:
	Name: affinity
	State: fail

	Errors:
		resource Deployment:wlo-test-115/node-affinity-label-lib: .status.readyReplicas: key is missing from map

	Name: probe
	State: pass


	Name: trace
	State: fail

	Errors:
		resource StatefulSet:wlo-test-115/trace-ws: .status.readyReplicas: key is missing from map

	Name: storage
	State: fail

	Errors:
		resource StatefulSet:wlo-test-115/storage-wsliberty: .status.readyReplicas: key is missing from map

	Name: statefulset-strategy
	State: fail

	Errors:
		resource StatefulSet:wlo-test-115/statefulset-strategy-lib-app: .status.readyReplicas: key is missing from map

	Name: sso2-providers
	State: pass


	Name: sso1-social
	State: pass


	Name: service-types
	State: pass


	Name: service-certificate
	State: pass


	Name: service-binding2
	State: pass


	Name: service-binding1
	State: pass


	Name: routes
	State: pass


	Name: route-certificate
	State: pass


	Name: pullpolicy
	State: pass


	Name: knative2
	State: fail

	Errors:
		retrieving API resource for serving.knative.dev/v1, Kind=Service failed: the server could not find the requested resource

	Name: persistent-storage
	State: fail

	Errors:
		resource StatefulSet:wlo-test-115/persistent-wsliberty: .status.readyReplicas: key is missing from map

	Name: monitor
	State: pass


	Name: knative
	State: fail

	Errors:
		retrieving API resource for serving.knative.dev/v1, Kind=Service failed: the server could not find the requested resource

	Name: auto3
	State: pass


	Name: deployment-strategy
	State: pass


	Name: basic
	State: pass


	Name: image-stream
	State: pass


	Name: dump
	State: fail

	Errors:
		resource StatefulSet:wlo-test-115/dump-ws: .status.readyReplicas: key is missing from map

	Name: auto1
	State: pass


	Name: auto2
	State: pass


	Name: annotations
	State: pass



****** Scorecard tests failed...
Makefile:269: recipe for target 'test-pipeline-e2e' failed
make: *** [test-pipeline-e2e] Error 1

 === Execution of custom script for stage 'acceptance-test' finished. Exit Code: '2'. ===
