open-policy-agent / kube-mgmt
Sidecar for managing OPA instances in Kubernetes.
License: Apache License 2.0
The implementation is generic, so any set should work; however, the format is somewhat finicky (e.g., v1/services, NOT v1/Service or v1/service, etc.)
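As an illustration of the expected format, here is a small hypothetical validator (not part of kube-mgmt) that splits a replication spec and rejects capitalized resource names. Note it can only catch the capitalization mistake; the lowercase-plural requirement has to be checked against what the API server actually serves.

```python
import re

def parse_replicate(spec: str):
    """Split a '[group/]version/resource' replication spec.

    The resource must be the lowercase plural exactly as served by the
    Kubernetes API (check with `kubectl api-resources`); this sketch can
    only catch the capitalization mistake, not singular vs. plural.
    """
    parts = spec.split("/")
    if len(parts) == 2:
        group, (version, resource) = "", parts
    elif len(parts) == 3:
        group, version, resource = parts
    else:
        raise ValueError(f"bad replication spec: {spec!r}")
    if not re.fullmatch(r"[a-z0-9.]+", resource):
        raise ValueError(f"resource should be lowercase plural, got {resource!r}")
    return group, version, resource

print(parse_replicate("v1/services"))                   # ('', 'v1', 'services')
print(parse_replicate("extensions/v1beta1/ingresses"))  # ('extensions', 'v1beta1', 'ingresses')
```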
We should migrate to GitHub Actions, as travis-ci.org is being shut down in December.
Hi,
We run OPA 0.13.2 and kube-mgmt 0.10 with replication of namespaces, services and ingresses. It is a 6 pod deployment on GKE 1.13.x. We upgraded kube-mgmt from 0.8 to 0.10 yesterday. To verify this upgrade we first added a custom metrics exporter that counts the result items in /v1/data/kubernetes/{namespaces,ingresses,services}.
Analysing the metrics today it seems we have gaps in the OPA data. The following graph shows the spread between the lowest number of data items found versus the highest number. This line should be zero indicating all pods have the same number of data items.
Looking at a single incident, I see the namespace count in a single OPA pod drop from 554 to 0. It then climbs slowly to 1, 2, 3, and finally jumps back to 554, all over a period of 12 minutes. See the graph below illustrating this gap. Other incidents took longer, but all auto-recovered.
To verify that our metrics can be trusted, I made REST API calls to the affected OPA container. The metrics are correct.
$ curl -ks https://0:1443/v1/data/kubernetes/namespaces | jq '.result|keys' | wc -l
3
After recovery (which can be triggered by killing the kube-mgmt process) I see:
$ curl -ks https://0:1443/v1/data/kubernetes/namespaces | jq '.result|keys'|wc -l
546
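The same check can be done without shell plumbing; a minimal sketch (a hypothetical helper, not part of kube-mgmt) that counts the top-level keys of an OPA Data API response:

```python
import json

def count_result_keys(payload: str) -> int:
    """Count top-level keys under 'result' in an OPA /v1/data response,
    equivalent to `jq '.result|keys|length'`."""
    doc = json.loads(payload)
    return len(doc.get("result", {}))

sample = '{"result": {"default": {}, "kube-system": {}, "opa": {}}}'
print(count_result_keys(sample))  # 3
```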
I did report this issue earlier when using kube-mgmt 0.9, and the fix ought to be present in 0.10, but it clearly is not.
Logging of kube-mgmt shows these events, but based on other issues I assume they are not related:
opa-internal-7f98c9546f-2vmxn kube-mgmt E1018 08:29:30.686185 1 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: no kind "Status" is registered for version "v1" in scheme "github.com/open-policy-agent/kube-mgmt/pkg/configmap/configmap.go:102"
... and ...
opa-internal-7f98c9546f-2vmxn kube-mgmt time="2019-10-18T10:35:30Z" level=info msg="Sync channel for v1/namespaces closed. Restarting immediately."
opa-internal-7f98c9546f-2vmxn kube-mgmt time="2019-10-18T10:35:30Z" level=info msg="Syncing v1/namespaces."
opa-internal-7f98c9546f-2vmxn kube-mgmt time="2019-10-18T10:35:31Z" level=info msg="Listed v1/namespaces and got 544 resources with resourceVersion 720115319. Took 507.146764ms."
Logging of OPA is just full of normal stuff.
I do see a few PUT data requests, which result in the slow increase of namespace items, like:
opa-internal-7f98c9546f-2vmxn opa {"client_addr":"127.0.0.1:45916","level":"info","msg":"Received request.","req_id":116,"req_method":"PUT","req_path":"/v1/data/kubernetes/namespaces/sdrtdfuygiu","time":"2019-10-18T11:12:21Z"}
We've rolled back to kube-mgmt 0.8 for the time being, knowing that version pumps data slowly (~5 minutes for all the data to be PUT to OPA) and has a sync bug (deleted k8s resources do not always get deleted in OPA).
I tried to run the kube-mgmt sidecar to opa on my Kubernetes cluster and granted it limited privileges. The role that the deployment was running with only had access to its own namespace and nothing else.
I only had --replicate=v1/pods as the argument to kube-mgmt. I saw the following errors:
E0412 05:31:24.735947 1 reflector.go:201] github.com/open-policy-agent/kube-mgmt/pkg/policies/configmap.go:100: Failed to list *v1.ConfigMap: unknown (get configmaps)
If I understand the code right, it seems that kube-mgmt currently watches for resources across all namespaces.
File: pkg/policies/configmap.go
source := cache.NewListWatchFromClient(
    client,
    "configmaps",
    v1.NamespaceAll, // <<<---------------
    fields.Everything())
File: ./pkg/data/generic.go
source := cache.NewListWatchFromClient(
    client,
    s.ns.Resource,
    api.NamespaceAll, // <<<--------------
    fields.Everything())
As a result, kube-mgmt can only run if it is given a role with cluster-wide access to these resources.
I changed the cluster role binding to cluster-admin (basically, ran opa and kube-mgmt as root) and things worked fine.
It would be good if kube-mgmt could watch resources in the namespace that the user provides (and maybe default to all).
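The requested behavior could be sketched like this (illustrative Python standing in for the client-go call sites quoted above; in client-go, v1.NamespaceAll is the empty string):

```python
NAMESPACE_ALL = ""  # client-go's v1.NamespaceAll is the empty string

def watch_namespace(user_namespace=None):
    """Pick the namespace to watch: the user-provided one when given,
    otherwise fall back to watching all namespaces (today's behavior,
    which requires cluster-wide RBAC)."""
    return user_namespace if user_namespace else NAMESPACE_ALL

print(watch_namespace("opa"))  # opa
```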
After following the instructions for JSON loading, I could not access the data being loaded if the namespace contained a dash. I'm using kube-mgmt v0.11.
Here is an example configmap modified from the example in the docs I was using for testing.
kind: ConfigMap
apiVersion: v1
metadata:
  name: hello-data
  namespace: opa-data
  labels:
    openpolicyagent.org/data: opa
data:
  x.json: |
    {"a": [1,2,3,4]}
When applied, I should see a log like the following in OPA; however, there are no logs related to the data.
[INFO] Sent response.
resp_bytes = 0
resp_duration = 0.443091
resp_body = ""
client_addr = "127.0.0.1:38344"
req_id = 68
req_method = "PUT"
req_path = "/v1/data/opa-data/hello-data/x.json"
resp_status = 204
[INFO] Received request.
req_id = 69
req_method = "PUT"
req_path = "/v1/data/opa-data/hello-data/x.json"
req_body = |
{"a":[1,2,3,4]}
And the openpolicyagent.org/policy-status annotation never gets created or updated on the configmap. The pod has cluster wide access to read/update config maps. Everything works as expected if I switch from a namespace called opa-data to one called opa.
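One thing worth double-checking in this situation (a general Rego note, not necessarily the root cause here): a dash is not valid inside a dotted Rego reference, so data.opa-data parses as a subtraction; segments like opa-data must be referenced with bracket notation. A small hypothetical helper illustrating the rule:

```python
import re

IDENT = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")

def rego_ref(*segments):
    """Build a Rego reference for data loaded at data.<ns>.<name>.<key>.
    Segments that are not valid identifiers (e.g. contain a dash or dot)
    must use bracket notation."""
    ref = "data"
    for s in segments:
        ref += f".{s}" if IDENT.match(s) else f'["{s}"]'
    return ref

print(rego_ref("opa", "hello"))                          # data.opa.hello
print(rego_ref("opa-data", "hello-data", "x.json"))      # data["opa-data"]["hello-data"]["x.json"]
```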
time="2021-04-01T00:21:18Z" level=warning msg="First line of log stream."
E0401 00:21:18.310694 1 reflector.go:126] github.com/open-policy-agent/kube-mgmt/pkg/configmap/configmap.go:174: Failed to list *v1.ConfigMap: Get https://:443/api/v1/namespaces/opa/configmaps?limit=500&resourceVersion=0: dial tcp :443: connect: connection refused
OPA logs show the policies loaded through ConfigMaps, but querying /v1/data returns an empty response (HTTP 200 OK), while querying /v1/policies returns all the policies loaded through ConfigMaps.
How can I debug this issue?
This is an enhancement request.
CURRENT STATE
The OPA authentication token is passed in via the command line with the --opa-auth-token flag. This means OPA can be compromised by:
DESIRED STATE
For security reasons, provide a more secure authentication mechanism (TLS) or reduce access to the authentication token. To make authentication tokens more secure:
ALTERNATIVES
The following alternatives lower or eliminate the need for this enhancement request:
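One mitigation along these lines is reading the token from a file mounted from a Secret (the --opa-auth-token-file flag, visible in pod specs elsewhere in these issues, takes this shape). A sketch of the client side:

```python
from pathlib import Path

def bearer_header(token_file: str) -> dict:
    """Build the Authorization header from a token file mounted from a
    Kubernetes Secret, instead of passing the token on the command line
    (where it would be visible in `ps` output and the pod spec)."""
    token = Path(token_file).read_text().strip()
    return {"Authorization": f"Bearer {token}"}
```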
Hello,
I can see that the policy is removed from OPA if the corresponding ConfigMap is deleted.
However, I expected the same behavior when I simply remove the openpolicyagent.org/policy=rego label from the policy's ConfigMap.
Can you please advise whether removal by label is intentionally disabled or not?
kube-mgmt version: 0.10
Flags used:
args:
  - "--policies=my-opa"
  - "--enable-data"
  - "--require-policy-label"
Kubernetes version: v1.11.0+d4cacc0 (via OpenShift v3.11.0+0cbc58b)
If the use case I described looks like a good idea to you, I created this PR.
Thank you,
Andrey
Hi, is there any appetite to move to Go modules? We use it for CoreDNS and it works quite well.
Happy to send a PR that deletes 1M files :)
When testing policies with k8s I think it would be beneficial to have some kind of admission request generator so I could test policies on my IDE (vscode) without deploying to a k8s env such as minikube.
Also, it would be helpful to get a snapshot data JSON to simulate how OPA would cache requested information, e.g., if I ask OPA to cache v1/namespaces, what will the output data object look like in a given environment at a given point in time.
If those are already available please tell me how to achieve this kind of data - it would accelerate my policy development process.
Thanks!
Awesome project!
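To simulate such a snapshot by hand: judging from the data paths that appear elsewhere in these issues (e.g. /v1/data/kubernetes/ingresses/<namespace>/<name>), replicated objects land in a nested map keyed by resource, namespace, and name. A rough sketch of that layout (an assumption, not an official schema):

```python
def cache_snapshot(resources):
    """Build the nested structure replicated resources appear to use:
    data.kubernetes.<resource>[<namespace>][<name>] = full object
    (cluster-scoped resources, namespace=None, are keyed by name only)."""
    snap = {"kubernetes": {}}
    for resource, namespace, name, obj in resources:
        bucket = snap["kubernetes"].setdefault(resource, {})
        if namespace is None:
            bucket[name] = obj
        else:
            bucket.setdefault(namespace, {})[name] = obj
    return snap

snap = cache_snapshot([
    ("namespaces", None, "opa", {"kind": "Namespace"}),
    ("ingresses", "default", "web", {"kind": "Ingress"}),
])
print(snap["kubernetes"]["ingresses"]["default"]["web"]["kind"])  # Ingress
```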
Hi,
Users can use generateAdmissionControllerCerts to ask the chart to generate the certs for them during the deploy. So it seems every new deploy will use new certs when this flag is true?
If this is the case, I'm wondering if this will cause some issues.
During the deploy, depending on how maxUnavailable is set, you may have some new pods using new certs and old pods using old certs.
If OPA is deployed in a large cluster, what could happen is that the caBundle in the ValidatingWebhookConfiguration is updated at the same time, which causes requests to the old pods to fail. The new pods are then overloaded by the requests and start crashing. So this may cause all requests to OPA to fail.
I might miss something here. Is this true? Thanks!
I am using kube-mgmt as a sidecar to OPA to sync some Kubernetes resources into OPA. I am currently facing an issue where some ingresses that no longer exist in the cluster still show up in OPA's data cache. kube-mgmt continuously crashes (30+ times in 2 days) due to issues with the informer that it uses (see stack trace at the end). I am unable to reproduce this since I cannot make kube-mgmt crash on purpose, but the observed timeline of the issue seems to be as follows:
• An ingress is created
• kube-mgmt syncs the ingress
• Ingress shows up in OPA data cache
• After X amount of time, kube-mgmt informer gets stuck
• Ingress gets deleted
• kube-mgmt panics (see stack trace)
• kube-mgmt comes back up
• kube-mgmt syncs resources, but it doesn't delete the deleted ingress from OPA's cache
This makes sense, since the ingress doesn't exist in the cluster after kube-mgmt comes back up, so the informer never gets a notice to delete the ingress.
Possible solutions
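One candidate is a full re-sync on restart: diff the freshly listed resources against what is already cached in OPA and delete the leftovers. A sketch of the idea (illustrative, not kube-mgmt's actual code):

```python
def reconcile(listed: dict, cached: dict):
    """Return (to_put, to_delete): PUT everything the API server lists,
    and DELETE cached entries for resources that no longer exist, so a
    deletion missed during a crash is cleaned up on the next full sync."""
    to_put = dict(listed)
    to_delete = sorted(k for k in cached if k not in listed)
    return to_put, to_delete

listed = {"default/web": {"kind": "Ingress"}}
cached = {"default/web": {"kind": "Ingress"}, "default/old": {"kind": "Ingress"}}
print(reconcile(listed, cached)[1])  # ['default/old']
```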
Details:
Kubernetes version: v1.12.9
OPA version: 0.11.0
Kube-mgmt version: 0.8
kube-mgmt stacktrace
Message: nt-go/tools/cache.NewInformer.func1(0x1349e40, 0xc420fdf3c0, 0x1349e40, 0xc420fdf3c0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache/controller.go:317 +0x516
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache.(*DeltaFIFO).Pop(0xc42039dd40, 0xc4203edfb0, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache/delta_fifo.go:451 +0x27e
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache.(*controller).processLoop(0xc42031d200)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache/controller.go:147 +0x40
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache.(*controller).(github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache.processLoop)-fm()
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache/controller.go:121 +0x2a
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc420568fb0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:97 +0x5e
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc4205a3fb0, 0x3b9aca00, 0x0, 0x12a6701, 0xc4201ad8c0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:98 +0xbd
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc420568fb0, 0x3b9aca00, 0xc4201ad8c0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:52 +0x4d
github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache.(*controller).Run(0xc42031d200, 0xc4201ad8c0)
/go/src/github.com/open-policy-agent/kube-mgmt/vendor/k8s.io/client-go/tools/cache/controller.go:121 +0x237
created by github.com/open-policy-agent/kube-mgmt/pkg/data.(*GenericSync).Run
/go/src/github.com/open-policy-agent/kube-mgmt/pkg/data/generic.go:80 +0x3b6
Currently, when kube-mgmt is configured to register OPA as an admission controller, it does so using an ExternalAdmissionHookConfiguration from admissionregistration.k8s.io/v1alpha1. This should be updated to use a ValidatingWebhookConfiguration from admissionregistration.k8s.io/v1beta1. This would have the side benefit of bringing kube-mgmt's registration implementation in line with the docs on deploying OPA to Kubernetes: https://www.openpolicyagent.org/docs/kubernetes-admission-control.html
OPA has a readiness probe that leverages bundle API. This is the best way to ensure OPA is ready on initial start and guarantees it consistently keeps up to date with policy changes. Let me describe an example we are faced with the push approach.
The acme service is deployed with an opa sidecar. acme becomes ready and accepts requests. opa does not have a readiness probe b/c it is unaware of what data needs to be loaded into it. The acme service fails every request with 500/403 b/c opa is unable to process authorization requests to acme.
We have added a workaround by using an authorization request as a readiness probe, but this does not guarantee all types of policy data have been loaded into OPA. I would like to see other approaches to solve this problem, but it seems to us that the bundle API is the preferred way of handling it in OPA.
Ref: https://www.openpolicyagent.org/docs/v0.11.0/rest-api/#health-api
kube-mgmt should be able to recover if policy/data updates against OPA fail. Performing a complete re-synchronization is likely the best approach.
Make the image rootless to, e.g., comply with runAsNonRoot.
kube-mgmt/pkg/configmap/configmap.go
Line 250 in 191879e
There appears to be no check on the ConfigMap annotations here to only issue a PATCH to the apiserver when the desired value differs from the current one. I started investigating this because our OPA policies rarely change, yet I see PATCH requests constantly issued to the apiserver. This appears, perhaps, to be part of the problem.
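The fix could be as simple as a read-before-write guard (an illustrative sketch, not the actual kube-mgmt code path):

```python
STATUS_KEY = "openpolicyagent.org/policy-status"

def needs_patch(current_annotations: dict, desired_status: str) -> bool:
    """Only issue a PATCH when the annotation value actually differs,
    so unchanged policies generate no apiserver traffic."""
    return current_annotations.get(STATUS_KEY) != desired_status

print(needs_patch({}, '{"status":"ok"}'))                               # True
print(needs_patch({STATUS_KEY: '{"status":"ok"}'}, '{"status":"ok"}'))  # False
```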
For the purposes of setting a least-privilege authorization policy on OPA, it would be useful to have a list of OPA APIs that kube-mgmt uses.
Optionally, a sample authz policy would be even better for my specific case.
Some users have requested the ability to load policies into OPA via CRDs. Creating an issue to track the work.
kube-mgmt should provide a readiness check to ensure at least one sync has completed. This will enable seamless rolling upgrades of the pods.
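A minimal sketch of such a gate (illustrative, not an existing kube-mgmt feature): serve 503 until the first full sync has completed, then 200.

```python
import threading

class ReadinessGate:
    """Report ready only after the first complete sync, so rolling
    upgrades wait for a fully loaded pod before routing traffic to it."""
    def __init__(self):
        self._synced = threading.Event()

    def mark_synced(self):
        self._synced.set()

    def status_code(self) -> int:
        return 200 if self._synced.is_set() else 503

gate = ReadinessGate()
print(gate.status_code())  # 503
gate.mark_synced()
print(gate.status_code())  # 200
```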
Hello,
I'm using external data loaded from a ConfigMap labeled openpolicyagent.org/data=opa.
README.md says that I need to know the namespace of this ConfigMap when I try to access its data from a policy: data.<ns>.<key>.
I have 2 questions about this:
opa.runtime()
which gives me all the environment variables of OPA and then make the OPA container consume the namespace using the downwardAPI).
--policies flag)?
Kube-mgmt version being used: 0.10
Thank you,
Andrey
This is a question, not an issue.
I am trying to monitor the OPA running on my K8s cluster via Prometheus and Grafana. Can you please tell me whether there are metrics for kube-mgmt? Are they exposed on the metrics endpoint to be scraped by Prometheus?
OPA authentication supports bearer-token-based and TLS-based methods.
Hi everyone,
I'm having problems caching CRDs that I created into kube-mgmt.
Say I have a CRD like this:
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: examples.examples.example.com
spec:
  group: examples.example.com
  version: v1
  names:
    kind: Example
    plural: examples
  scope: Cluster
I'm trying to import it as "--replicate-cluster=examples.example.com/v1/examples"; however, this does not work.
Any suggestions?
Hey,
While deploying the opa helm chart (1.14.6) on our cluster we ran into an issue we haven't been able to solve during the last couple of days. All resources (except our policy cm) are deployed using this values.yaml:
opa:
  opa: false
  imageTag: 0.30.2
  logLevel: debug
  admissionControllerFailurePolicy: Fail
  admissionControllerNamespaceSelector:
    matchExpressions:
      - {key: openpolicyagent.org/webhook, operator: NotIn, values: [ignore]}
  admissionControllerRules:
    - operations: ["CREATE","UPDATE"]
      apiGroups: ["*"]
      apiVersions: ["*"]
      resources: ["ingressgates"]
      scope: "*"
    - operations: ["CREATE","UPDATE"]
      apiGroups: ["*"]
      apiVersions: ["*"]
      resources: ["ingresses"]
      scope: "*"
    - operations: ["CREATE","UPDATE"]
      apiGroups: ["*"]
      apiVersions: ["*"]
      resources: ["services"]
      scope: "*"
  imagePullSecrets:
    - dockerhub-imagepullsecret
  mgmt:
    configmapPolicies:
      enabled: true
      namespaces: [open-policy-agent]
      requireLabel: true
    data:
      enabled: true
  rbac:
    create: true
    rules:
      cluster:
        - apiGroups:
            - ""
          resources:
            - configmaps
          verbs:
            - get
            - list
            - watch
            - patch
Then I inserted this policy into the same namespace to test the functionality (should deny ALL ingresses):
apiVersion: v1
data:
  deny-all-ingress.rego: |
    package kubernetes.admission
    deny[msg] {
      input.request.kind.kind == "Ingress"
      msg := sprintf("INGRESS DENIED!")
    }
kind: ConfigMap
metadata:
  labels:
    openpolicyagent.org/policy: rego
  name: deny-all-ingress
After attempting to create an Ingress in the cluster, we receive this message from kubectl:
Error from server (InternalError): error when creating "ingress.yaml": Internal error occurred: failed calling webhook "webhook.openpolicyagent.org": the server could not find the requested resource
I find this log in the opa container:
[INFO] Received request.
req_params = |
{
"timeout": [
"30s"
]
}
client_addr = "10.110.0.2:25765"
req_id = 8
req_method = "POST"
req_path = "/"
req_body = <ingress data omitted>
[INFO] Sent response.
req_method = "POST"
req_path = "/"
resp_status = 404
resp_bytes = 99
resp_duration = 7.827122
resp_body = |
{
"code": "undefined_document",
"message": "document missing or undefined: data.system.main"
}
client_addr = "10.110.0.2:25765"
req_id = 8
These log lines suggest the webhook is at least reaching OPA. Unfortunately we haven't been able to find the cause of the 404. Is there anything wrong with our values.yaml? Or is there anything obvious we're overlooking?
Kind regards,
Jacco
Trying to use opa and kube-mgmt in a cluster to replicate nodes, deployments, and statefulsets causes high CPU usage (almost 2 cores) in OPA. When I disable replication of these objects, CPU usage goes back to 1 millicore.
The numbers of cluster objects are as follows:
Nodes: 171
Deployments: 2021
StatefulSets: 171
When we hit high CPU, OPA is not responsive and returns failures for all updates from kube-mgmt. Is there a way to batch updates from kube-mgmt to OPA?
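A sketch of the batching idea (illustrative, not an existing kube-mgmt option): coalesce per-resource updates and flush them as one bulk write, so rapid successive updates to the same path collapse into a single request.

```python
class BatchWriter:
    """Coalesce per-resource updates and flush them as one bulk write.
    Updates to the same path supersede each other between flushes,
    reducing the request volume against OPA."""
    def __init__(self):
        self.pending = {}

    def update(self, path, obj):
        self.pending[path] = obj

    def flush(self, send):
        batch, self.pending = self.pending, {}
        if batch:
            send(batch)
        return batch

w = BatchWriter()
w.update("nodes/n1", {"v": 1})
w.update("nodes/n1", {"v": 2})   # supersedes the previous update
sent = w.flush(lambda batch: None)
print(len(sent))  # 1
```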
This is a question regarding the base image of kube-mgmt.
The 0.11 release of kube-mgmt uses alpine VERSION_ID=3.11.2. This version has the vulnerability below:
CVE-2020-28928
In musl libc through 1.2.1, wcsnrtombs mishandles particular combinations of destination buffer size and source character limit, as demonstrated by an invalid write access (buffer overflow).
The above vulnerability is fixed in alpine:3.12.3 and above. Details below:
alpinelinux/docker-alpine#123
I can see that the latest kube-mgmt tag, 0.12-dev, has the updated alpine image. When can we expect this kube-mgmt release?
Currently, kube-mgmt can connect to OPA using HTTPS, but it is not possible to do so if OPA's certificate is signed by an internal CA.
In order to support this scenario, kube-mgmt should include flags to either specify a file containing OPA's CA, allow insecure connections, or both.
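A sketch of what those flags would do on the client side (illustrative Python using the standard ssl module, not kube-mgmt's Go implementation):

```python
import ssl

def opa_client_context(ca_file=None, insecure=False):
    """Build a TLS context for talking to OPA: trust an internal CA when
    one is given, or (less safely) skip certificate verification
    entirely. Sketch of the flags this issue requests."""
    ctx = ssl.create_default_context(cafile=ca_file)
    if insecure:
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```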
My kube-mgmt instance can't delete the ingresses that it added to the data section of the opa sidecar.
From kube-mgmt:
time="2018-02-06T17:13:52Z" level=error msg="Failed to remove u-nicholas/examplesvc-ingress-debug"
time="2018-02-06T17:13:52Z" level=error msg="Failed to remove u-nicholas/examplesvc-ingress"
From opa:
sses/u-nicholas/examplesvc-ingress
time="2018-02-06T17:13:52Z" level=info msg="Sent response." client_addr="127.0.0.1:43816" req_id=545957 req_method=PUT req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress resp_bytes=0 resp_duration=0.219423 resp_status=204
time="2018-02-06T17:13:52Z" level=info msg="Received request." client_addr="127.0.0.1:43816" req_id=545958 req_method=PUT req_params="map[]" req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress-debug
time="2018-02-06T17:13:52Z" level=info msg="Sent response." client_addr="127.0.0.1:43816" req_id=545958 req_method=PUT req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress-debug resp_bytes=0 resp_duration=0.112821 resp_status=204
time="2018-02-06T17:13:52Z" level=info msg="Received request." client_addr="127.0.0.1:43816" req_id=545959 req_method=PUT req_params="map[]" req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress
time="2018-02-06T17:13:52Z" level=info msg="Sent response." client_addr="127.0.0.1:43816" req_id=545959 req_method=PUT req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress resp_bytes=0 resp_duration=0.129813 resp_status=204
time="2018-02-06T17:13:52Z" level=info msg="Received request." client_addr="127.0.0.1:43816" req_id=545960 req_method=PUT req_params="map[]" req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress-debug
time="2018-02-06T17:13:52Z" level=info msg="Sent response." client_addr="127.0.0.1:43816" req_id=545960 req_method=PUT req_path=/v1/data/kubernetes/ingresses/u-nicholas/examplesvc-ingress-debug resp_bytes=0 resp_duration=0.102676 resp_status=204
I'm not sure if I'm reading the Rest API docs right, but there's not much about removing existing data.
The ingress is still there if I curl the opa API directly.
When running OPA in a namespace with other services, I'd like to limit which ConfigMaps are interpreted as policies. Right now it's either in a single namespace or by a label. It would be good if you could require both.
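The combined filtering could look like this (an illustrative predicate, not kube-mgmt's actual selection logic):

```python
POLICY_LABEL = "openpolicyagent.org/policy"

def is_policy_configmap(namespace, labels, allowed_namespaces, require_label):
    """Accept a ConfigMap as a policy only when BOTH filters pass: it
    lives in an allowed namespace AND (if required) carries the policy
    label. Sketch of the combined filtering this issue requests."""
    in_namespace = namespace in allowed_namespaces
    labeled = labels.get(POLICY_LABEL) == "rego"
    return in_namespace and (labeled or not require_label)

print(is_policy_configmap("opa", {POLICY_LABEL: "rego"}, {"opa"}, True))    # True
print(is_policy_configmap("other", {POLICY_LABEL: "rego"}, {"opa"}, True))  # False
print(is_policy_configmap("opa", {}, {"opa"}, True))                        # False
```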
Using the caching feature to sync Kubernetes resources, we currently load about 25 MB of data into each OPA container. Only about 1% of this data is ever used for authorization decisions, however. While OPA still performs very well, it is a lot harder to browse the data on a running server since 99% of it is essentially noise. Somehow enabling syncing of only the data of interest, rather than full objects, would help a lot here. I might find some time to look into this myself if it's a feature others would benefit from as well.
Sorry to annoy, but would love to see a new helm chart released with this merged change #86
❤️
We sync kubernetes namespace data to OPA.
With kube-mgmt 0.9 the response of GET /v1/data/kubernetes/namespaces looks like:
{
  "decision_id": "0330b1eb-2fc8-4f2f-befe-330b7bbe3032",
  "result": {
    "my-namespace": {
      "apiVersion": "v1",
      "kind": "Namespace",
      "metadata": {
        ...
With kube-mgmt 0.10-rc1 the response changed to:
{
  "result": {
    "": {                 # <-----------
      "my-namespace": {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
          ...
Note the empty-string hash key.
When I try to use the chart (version 2.0.0) with Kubernetes 1.22, I get this error:
no matches for kind "ValidatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
Kubernetes 1.21 works fine.
Kubernetes 1.22 stopped serving this API version; see: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#webhook-resources-v122.
I suggest updating to admissionregistration.k8s.io/v1.
cert-manager 1.4 deprecated v1beta1, v1alpha3, and v1alpha2, and cert-manager 1.6 has now removed these versions.
See: https://cert-manager.io/docs/contributing/crds/#versions
When trying to install the chart, the message "cert-manager CRD does not appear to be installed" appears, which confused me for a moment because cert-manager was clearly installed.
Given the following deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: opa
  namespace: opa
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: opa
    spec:
      containers:
        - name: opa
          image: openpolicyagent/opa
          args:
            - "run"
            - "--server"
          ports:
            - name: http
              containerPort: 8181
        - name: kube-mgmt
          image: openpolicyagent/kube-mgmt:0.10
          args:
            - --enable-policies=true
            - --policies=*
            - --require-policy-label=true
And given the following configmap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: test
  namespace: test
  labels:
    openpolicyagent.org/policy: rego
data:
  test.rego: |
    package kubernetes
    example = "Hello, Kubernetes!"
kube-mgmt does not appear to be loading the ConfigMap at all; if I set --policies=test with an explicit namespace, it loads the ConfigMaps fine.
I was under the impression that --policies=* would load from any namespace?
kube-mgmt with OPA can be very powerful for operating Kubernetes clusters at scale in large companies. Policies can enforce company-specific compliance rules, or Initializers can directly inject required parameters into Kubernetes objects. However, it is currently hard to discover this setup and quickly get off the ground with it. The only document I found was this webpage on the website, and even that is limited in its utility since it does not talk about RBAC.
Along with the official documentation, it would be better to have an examples directory with complete (directly usable) examples to get started.
Unsure if this is one issue or if I should log separate issues for each item.
I am seeing weird behavior in kube-mgmt after it has run for a while (e.g., more than 5 hours):
E0904 12:18:45.089926 1 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: no kind "Status" is registered for version "v1" in scheme "github.com/open-policy-agent/kube-mgmt/pkg/configmap/configmap.go:102"
kube-mgmt seems to stop honoring --require-policy-label, as all the ConfigMaps in the opa namespace get annotated with the openpolicyagent.org/policy-status annotation.
Resource syncs seem to crash periodically (on an hourly basis):
time="2019-09-04T10:51:03Z" level=info msg="Sync channel for extensions/v1beta1/ingresses closed. Restarting immediately."
time="2019-09-04T10:51:03Z" level=info msg="Syncing extensions/v1beta1/ingresses."
time="2019-09-04T10:51:08Z" level=info msg="Listed extensions/v1beta1/ingresses and got 319 resources with resourceVersion 1188097528. Took 5.462164663s."
time="2019-09-04T10:52:05Z" level=info msg="Loaded extensions/v1beta1/ingresses resources into OPA. Took 56.650552306s. Starting watch at resourceVersion 1188097528."
time="2019-09-04T11:56:18Z" level=info msg="Sync channel for extensions/v1beta1/ingresses closed. Restarting immediately."
time="2019-09-04T11:56:18Z" level=info msg="Syncing extensions/v1beta1/ingresses."
time="2019-09-04T11:56:19Z" level=info msg="Listed extensions/v1beta1/ingresses and got 319 resources with resourceVersion 1188317650. Took 872.624828ms."
time="2019-09-04T11:57:43Z" level=info msg="Loaded extensions/v1beta1/ingresses resources into OPA. Took 1m24.203876182s. Starting watch at resourceVersion 1188317650."
time="2019-09-04T12:39:50Z" level=info msg="Sync channel for extensions/v1beta1/ingresses closed. Restarting immediately."
time="2019-09-04T12:39:50Z" level=info msg="Syncing extensions/v1beta1/ingresses."
time="2019-09-04T12:39:53Z" level=info msg="Listed extensions/v1beta1/ingresses and got 319 resources with resourceVersion 1188471126. Took 2.790341522s."
time="2019-09-04T12:41:36Z" level=info msg="Loaded extensions/v1beta1/ingresses resources into OPA. Took 1m43.350371162s. Starting watch at resourceVersion 1188471126."
Unsure what the root cause may be, but here are some details about the versions being used:
kube-mgmt version: 0.9
OPA version: 0.12.2
I am trying to roll out OPA with the helm chart. The chart was rolled out successfully, but now I am having trouble enforcing the policy.
Version of Kubernetes:
k8s- v1.15.6
opa- opa:0.15.1
kube-mgmt- 0.10
What happened:
My ConfigMap looks like this:
Name: admission-control
Namespace: opa-test
Labels: openpolicyagent.org/policy=rego
Annotations: openpolicyagent.org/policy-status: {"status":"ok"}
Data
====
admission-policy.rego:
----
package kubernetes.admission

deny[reason] {
  some container
  input_containers[container]
  not startswith(container.image, "docker-integration.cernerrepos.net")
  reason := "container image refers to wrong registry, must be from docker-integration.cernerrepos.net"
}

input_containers[container] {
  container := input.request.object.spec.containers[_]
}

input_containers[container] {
  container := input.request.object.spec.template.spec.containers[_]
}
Events: <none>
- name: mgmt
  image: openpolicyagent/kube-mgmt:0.10
  imagePullPolicy: IfNotPresent
  resources: {}
  args:
    - --opa-auth-token-file=/bootstrap/mgmt-token
    - --opa-url=http://127.0.0.1:8181/v1
    - --replicate-path=kubernetes
    - --enable-data=false
    - --enable-policies=true
    - --policies=opa-test
    - --require-policy-label=true
I went ahead and looked at my kube-mgmt logs, and they show:
time="2020-01-27T22:55:11Z" level=warning msg="First line of log stream."
E0127 23:03:12.799556 1 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: no kind "Status" is registered for version "v1" in scheme "github.com/open-policy-agent/kube-mgmt/pkg/configmap/configmap.go:102"
E0127 23:04:53.804350 1 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: no kind "Status" is registered for version "v1" in scheme "github.com/open-policy-agent/kube-mgmt/pkg/configmap/configmap.go:102"
What you expected to happen:
Expected the policy to be enforced
I am not sure about these logs I am seeing in kube-mgmt.
Refer to https://github.com/open-policy-agent/kube-mgmt/blame/master/charts/opa/README.md#L27
The helm 3 command structure is now:
/# helm upgrade -i -n opa --create-namespace opa/opa
Error: "helm upgrade" requires 2 arguments
Usage: helm upgrade [RELEASE] [CHART] [flags]
The correct invocation looks like this:
helm upgrade opa opa/opa -i -n opa --create-namespace
The policies should be analyzed to determine which Kubernetes resources to replicate into OPA.
For example, given a policy like:
package kubernetes.admission

import data.kubernetes.resources.namespaces
import data.kubernetes.resources.pods

deny[msg] { ... }
We could establish a convention that Kubernetes data is inserted at a specific path (e.g., kubernetes.resources). Then the policies could be analyzed to determine which resource kinds are required.
This is not an issue but a query.
I am implementing OPA as a sidecar with Istio and Kubernetes, and I have successfully installed this combination using the OPA-Istio plugin:
https://github.com/open-policy-agent/opa-istio-plugin
But I want to implement kube-mgmt as well; is there any recommended approach for this?
When kube-mgmt is deployed and automatically registers OPA as an admission controller, it would be nice if it took care of generating the CA and server cert (and making the latter available to OPA), as this is currently done manually. This should be an optional feature that users can enable via a command line argument.
I tried deploying opa using helm on a Rancher k3s cluster running on a number of Raspberry Pis.
The deployment fails with an "exec format error" message
opa only supports the amd64 architecture, not the arm/v6 architecture.
Is there any reason why we wouldn't want to have multiple admission controllers (e.g., both validating and mutating) pointing to the same opa deployment? My org needs to use both features in our k8s cluster.
If not, I can submit a pull request to generate multiple admission controllers like below:
# values.yaml
admissionControllers:
  - admissionControllerKind: MutatingWebhookConfiguration
    admissionControllerFailurePolicy: Ignore
    admissionControllerNamespaceSelector:
      matchExpressions:
        - {key: openpolicyagent.org/webhook, operator: NotIn, values: [ignore]}
    admissionControllerSideEffect: Unknown
    admissionControllerRules:
      - operations: ["*"]
        apiGroups: ["*"]
        apiVersions: ["*"]
        resources: ["*"]
  - admissionControllerKind: ValidatingWebhookConfiguration
    admissionControllerFailurePolicy: Ignore
    admissionControllerNamespaceSelector:
      matchExpressions:
        - {key: openpolicyagent.org/webhook, operator: NotIn, values: [ignore]}
    admissionControllerSideEffect: Unknown
    admissionControllerRules:
      - operations: ["*"]
        apiGroups: ["*"]
        apiVersions: ["*"]
        resources: ["*"]
Is there a reason that kube-mgmt creates a new Kubernetes client for each replicated resource type? I think it would be more efficient to create a single Kubernetes client that's used for all k8s API requests.
Hi @tsandall ,
Facing one strange issue. For example, I have deployed one ConfigMap rego policy, which kube-mgmt compiled, validated, loaded into the OPA server, and annotated with status OK.
After that, if I introduce a syntax error (intentionally, just to validate) into the loaded policy using kubectl edit configmap, kube-mgmt annotates the status with ERROR, which is the normal behavior of kube-mgmt, but I observed the issues below.
To overcome these issues, I have to restart the OPA deployment.
I checked the opa and kube-mgmt logs as well and didn't find anything suspicious.
Please guide me here with your expertise.
Thx,
Sandeep
We have a setup that runs OPA validating and mutating webhooks as two different k8s deployments. Now, we would want the policy ConfigMaps for the validating and mutating hooks to be distinguished. Unfortunately, the ConfigMap label key/value is hard-coded, and both OPA instances load all the validating and mutating ConfigMaps. Could we add a CLI option to specify the policy label key and value to look for?
The default doesn't make much sense. We could at least have an empty list as the default and fall back to the release namespace when left empty; otherwise, use the provided list.