pulsar-helm-chart's Issues

Configuration .Values.tls.proxy.enableTlsWithBroker and .Values.broker.enableForProxyToBroker are conflicting

.Values.tls.proxy.enableTlsWithBroker:

proxy:
  # Applies to connections to standalone function worker, too.
  enableTlsWithBroker: false

.Values.broker.enableForProxyToBroker:

broker:
  enableForProxyToBroker: false

Similar problem seems to exist with .Values.tls.function.enableTlsWithBroker and .Values.tls.broker.enableForFunctionWorkerToBroker.

Wouldn't it make sense to have .Values.tls.broker.enabled instead?

Missing charts

  1. Use the current master branch of the helm charts
  2. Download dev-values.yaml
  3. helm install pulsar-hank -n hank-test -f dev-values.yaml ./pulsar/
    Output: Error: found in Chart.yaml, but missing in charts/ directory: kube-prometheus-stack, cert-manager, keycloak
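A likely workaround (assuming the dependency repositories declared in Chart.yaml are reachable; they may first need to be added with helm repo add) is to vendor the subcharts into charts/ before installing:

helm dependency update ./pulsar
helm install pulsar-hank -n hank-test -f dev-values.yaml ./pulsar/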

Unable to specify loadBalancerIP on Services

I'd like to set an internal static IP on a service of type LoadBalancer, but there is no configuration available to define this.
Could we get a service.loadBalancerIP option for the pulsar services so this could be set via helm?
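For illustration, the requested option could look like the following in values.yaml (the proxy.service.loadBalancerIP key is hypothetical; it does not exist in the chart today, which is the point of this request):

proxy:
  service:
    type: LoadBalancer
    loadBalancerIP: 10.0.0.50   # hypothetical: internal static IP to assign to the load balancer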

externalDNS version doesn't work with k8s 1.22

The version of externalDNS image used by the chart does not work in k8s 1.22.

The following version works: k8s.gcr.io/external-dns/external-dns:v0.10.2

Also, since the older Ingress API versions were removed, the ClusterRole rules need to be updated to this:

rules:
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get","watch","list"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get","watch","list"]
  - apiGroups: ["networking","networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get","watch","list"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get","watch","list"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get","watch","list"]

[Functions] Failed to resolve 'pulsar-broker.mypulsar.svc.cluster.local'

When deploying pulsar with TLS enabled and running a source function, the following error is logged by the spawned function pod:

java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: java.net.UnknownHostException: Failed to resolve 'pulsar-broker.mypulsar.svc.cluster.local' after 2 queries 
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?]
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704) ~[?:?]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
	at org.apache.pulsar.client.impl.ConnectionPool.lambda$createConnection$10(ConnectionPool.java:226) ~[java-instance.jar:?]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[java-instance.jar:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[java-instance.jar:?]
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:391) ~[java-instance.jar:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[java-instance.jar:?]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[java-instance.jar:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[java-instance.jar:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: java.net.UnknownHostException: Failed to resolve 'pulsar-broker.mypulsar.svc.cluster.local' after 2 queries 
	... 8 more

Repro:

  1. Deploy pulsar with TLS and functions enabled (values below)
  2. Create a partitioned topic and subscription
  3. Generate a source function:
                        bin/pulsar-admin sources create \
                          -t data-generator --name data-generator-source \
                          --source-config '{"sleepBetweenMessages":"10"}' \
                          --destination-topic-name persistent://public/default/test

Values:

enableAntiAffinity: no
enableTls: yes
tls:
  function:
    enableTlsWithBroker: true
    enableHostnameVerification: true
cert-manager:
  enabled: true
createCertificates:
  selfSigned:
    enabled: true
enableTokenAuth: yes
autoRecovery:
  enableProvisionContainer: yes
restartOnConfigMapChange:
  enabled: yes
image:
  zookeeper:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  bookie:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  bookkeeper:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  autorecovery:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  broker:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  proxy:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  functions:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
  function:
    repository: datastaxlunastreaming-all
    tag: 2.7.2_1.1.32
extra:
  broker: false
  brokerSts: true
  function: yes
  burnell: yes
  burnellLogCollector: yes
  pulsarHeartbeat: yes
  pulsarAdminConsole: yes
  functionsAsPods: yes
default_storage:
  existingStorageClassName: server-storage
volumes:
  data: #ASF Helm Chart
    storageClassName: existent-storage-class
zookeeper:
  replicaCount: 3
bookkeeper:
  replicaCount: 3
broker:
  component: broker
  replicaCount: 2
  ledger:
    defaultEnsembleSize: 1
    defaultAckQuorum:  1
    defaultWriteQuorum: 1
function:
  replicaCount: 1
  functionReplicaCount: 1
  runtime: "kubernetes"
proxy:
  disableZookeeperDiscovery: true
  useStsBrokersForDiscovery: true
  replicaCount: 2
  autoPortAssign:
    enablePlainTextWithTLS: yes
  service:
    type: ClusterIP
    autoPortAssign:
      enabled: yes
grafanaDashboards:
  enabled: yes
pulsarAdminConsole:
  replicaCount: 0
  service:
    type: ClusterIP
grafana: #ASF Helm Chart
  service:
    type: ClusterIP
pulsar_manager:
  service: #ASF Helm Chart
    type: ClusterIP
kube-prometheus-stack: # Luna Streaming Helm Chart
  enabled: no
  prometheusOperator:
    enabled: no
  grafana:
    enabled: no
    service:
      type: ClusterIP
pulsarSQL:
  service:
    type: ClusterIP

Prepare next release with support for OpenShift deployment

This is an issue to track the tasks for preparing the release with support for OpenShift deployment

Tasks

  • Publish docker image for datastax/burnell
  • Update datastax/burnell docker image tag in values.yaml
  • Publish docker image for datastax/pulsar-heartbeat
  • Update datastax/pulsar-heartbeat docker image tag in values.yaml
  • Publish docker image for datastax/pulsar-admin-console
  • Update pulsar-admin-console docker image tag and make deployment to support rootless image - update helm-chart-sources/pulsar/templates/admin-console/pulsar-admin-console-deployment.yaml: /root/dashboard/dist/config-override.js -> /home/appuser/dashboard/dist/config-override.js
  • Add documentation for OpenShift deployment
  • Add tests for kube-prometheus-stack upgrade to 16.x.x and merge #22 after tests exist
  • Test OpenShift deployment with TLS
  • Add documentation for setting up TLS with OpenShift
  • Test OpenShift monitoring integration where the built-in OpenShift Prometheus operator is used
  • Add documentation for OpenShift monitoring setup

Cannot set existing StorageClass with `existingStorageClassName`.

Problem

I tried setting existingStorageClassName both under the global default_storage and under a specific volume, pointing to an existing StorageClass ebs-pulsar, but the created PVCs still use the default StorageClass ebs-gp3.

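For reference, the storage class actually bound to each PVC can be checked with standard kubectl (the namespace shown is illustrative):

kubectl get pvc -n pulsar -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName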

My custom helm values:

components:
  zookeeper: true
  bookkeeper: true
  autorecovery: true
  broker: true
  functions: true
  proxy: true
  toolset: true
  pulsar_manager: true
monitoring:
  prometheus: false
  grafana: false
volumes:
  persistence: true
default_storage:
  existingStorageClassName: ebs-pulsar
antiAffinity:
  host:
    enabled: true
    mode: required
  zone:
    enabled: true
nodeSelector:
  dedicated: infrastructure
zookeeper:
  volumes:
    data:
      name: data
      size: 40Gi
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
bookkeeper:
  volumes:
    journal:
      name: journal
      size: 20Gi
    ledgers:
      name: ledgers
      size: 100Gi
    ranges:
      name: ranges
      size: 10Gi
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
autorecovery:
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
broker:
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
functions:
  volumes:
    data:
      name: logs
      size: 10Gi
      existingStorageClassName: ebs-pulsar
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
proxy:
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
toolset:
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure
pulsar_manager:
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: infrastructure

Env

Kubernetes: v1.21 in EKS
Helm: v3.3.4
Pulsar Chart: v2.7.6

Add documentation to disable `kube-prometheus-stack` optional features

Documentation should be added to cover the issue described below.

Disabling the conditional dependency via kube-prometheus-stack.enabled does not produce the expected behaviour of disabling all the Prometheus stack components. Each of them must be disabled individually in values.yaml by adding the following:

kube-prometheus-stack:
  enabled: false
  prometheusOperator:
    enabled: false
  grafana:
    enabled: false
    adminPassword: e9JYtk83*4#PM8
  alertmanager:
    enabled: false
  prometheus:
    enabled: false

It should be noted that unless all the Prometheus components are disabled, the helm chart will attempt to install the Prometheus CRDs. This can be an issue when the service account used to deploy the cluster does not have permission to install Custom Resource Definitions.

Pod in crashloop ‘data/zookeeper’: Permission denied

Hey there,

I am currently getting a crash loop on the zookeeper pod:

[conf/zookeeper.conf] Adding config quorumListenOnAllIPs = true
Current server id 1
Creating data/zookeeper/myid with id = 1
mkdir: cannot create directory ‘data/zookeeper’: Permission denied
kubectl get pods -n pulsar   
NAME                                                  READY   STATUS             RESTARTS   AGE
angry-shark-29-grafana-58fbb56d8f-j2ntg               2/2     Running            0          2m34s
angry-shark-29-kube-promet-operator-8bbcdbcf6-8mfpt   1/1     Running            0          2m34s
angry-shark-29-kube-state-metrics-7797486dfd-6xfm6    1/1     Running            0          2m34s
angry-shark-29-prometheus-node-exporter-mbcsk         1/1     Running            0          2m34s
angry-shark-29-prometheus-node-exporter-slxfk         1/1     Running            0          2m34s
angry-shark-29-prometheus-node-exporter-zlwk4         1/1     Running            0          2m34s
prometheus-angry-shark-29-kube-promet-prometheus-0    2/2     Running            0          2m32s
pulsar-autorecovery-59b5cbd74d-xx2qq                  1/1     Running            3          2m34s
pulsar-bastion-7f9656fdd7-czfbn                       1/1     Running            0          2m34s
pulsar-bookkeeper-0                                   0/1     Pending            0          2m59s
pulsar-broker-54cdf98dc9-pqdjd                        0/1     Pending            0          2m59s
pulsar-broker-c97d67cb8-sfd5j                         0/1     Init:0/1           0          2m33s
pulsar-proxy-559687647b-sthqj                         0/2     Pending            0          2m59s
pulsar-proxy-5fcbc8c9bd-h54fx                         0/3     Pending            0          2m33s
pulsar-zookeeper-0                                    0/1     CrashLoopBackOff   4          2m59s

Current HELM Values:

 storageValues = {
        "default_storage": {
          "provisioner": "kubernetes.io/aws-ebs",
          "type": "gp2",
          "fsType": "ext4",
          "extraParams": {
            "iopsPerGB": "10"
          }
        },
      };
const values = 
       {
        "fullnameOverride": "pulsar",
        "dnsName": "pulsar.example.com",
        "enableWaitContainers": "false",
        "rbac": {
          "create": true,
          "clusterRoles": true
        },
        "persistence": true,
        "enableAntiAffinity": false,
        "enableTls": false,
        "enableTokenAuth": false,
        "restartOnConfigMapChange": {
          "enabled": true
        },
        "extra": {
          "function": true,
          "burnell": true,
          "burnellLogCollector": true,
          "pulsarHeartbeat": true,
          "pulsarAdminConsole": true
        },
        "zookeeper": {
          "replicaCount": 1,
          "resources": {
            "requests": {
              "memory": "300Mi",
              "cpu": 0.3
            }
          },
          "configData": {
            "PULSAR_MEM": "\"-Xms300m -Xmx300m -Djute.maxbuffer=10485760 -XX:+ExitOnOutOfMemoryError\""
          }
        },
        "bookkeeper": {
          "replicaCount": 1,
          "resources": {
            "requests": {
              "memory": "512Mi",
              "cpu": 0.3
            }
          },
          "configData": {
            "BOOKIE_MEM": "\"-Xms312m -Xmx312m -XX:MaxDirectMemorySize=200m -XX:+ExitOnOutOfMemoryError\""
          }
        },
        "broker": {
          "component": "broker",
          "replicaCount": 1,
          "ledger": {
            "defaultEnsembleSize": 1,
            "defaultAckQuorum": 1,
            "defaultWriteQuorum": 1
          },
          "resources": {
            "requests": {
              "memory": "600Mi",
              "cpu": 0.3
            }
          },
          "configData": {
            "PULSAR_MEM": "\"-Xms400m -Xmx400m -XX:MaxDirectMemorySize=200m -XX:+ExitOnOutOfMemoryError\""
          }
        },
        "autoRecovery": {
          "resources": {
            "requests": {
              "memory": "300Mi",
              "cpu": 0.3
            }
          }
        },
        "function": {
          "replicaCount": 1,
          "functionReplicaCount": 1,
          "resources": {
            "requests": {
              "memory": "512Mi",
              "cpu": 0.3
            }
          },
          "configData": {
            "PULSAR_MEM": "\"-Xms312m -Xmx312m -XX:MaxDirectMemorySize=200m -XX:+ExitOnOutOfMemoryError\""
          }
        },
        "proxy": {
          "replicaCount": 1,
          "resources": {
            "requests": {
              "memory": "512Mi",
              "cpu": 0.3
            }
          },
          "wsResources": {
            "requests": {
              "memory": "512Mi",
              "cpu": 0.3
            }
          },
          "configData": {
            "PULSAR_MEM": "\"-Xms400m -Xmx400m -XX:MaxDirectMemorySize=112m\""
          },
          "autoPortAssign": {
            "enablePlainTextWithTLS": true
          },
          "service": {
            "autoPortAssign": {
              "enabled": true
            }
          }
        },
        "grafanaDashboards": {
          "enabled": true
        },
        "pulsarAdminConsole": {
          "replicaCount": 1
        },
        "kube-prometheus-stack": {
          "enabled": true,
          "prometheusOperator": {
            "enabled": true
          },
          "grafana": {
            "enabled": true,
            "adminPassword": "***********"
          }
        }
      }

Some help/hints much appreciated

Remove default credentials from values file

There are default credentials for configuring Tardigrade and Grafana in the values file:

tardigrade:
  access: access-key-generated-with-uplink
  accessKey: 2J7EJY4xTK6uHKqnCE4nAhdGfXqy
  secretKey: 4YeYwYdsoFFpvtNFuncWcTVqSTPL
  service:
    port: 7777
    type: ClusterIP

And:

  grafana:
    enabled: true
    # namespaceOverride: "monitoring"
    testFramework:
      enabled: false
    defaultDashboardsEnabled: true
    adminPassword: ZhF9sS8B7PQSTR

These default values should be removed.

Key duplication in configMaps causing errors on some deployment tools

In these two files I found duplicate keys in the rendered ConfigMaps:

https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/templates/broker-deployment/broker-configmap.yaml#L222
https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/templates/zookeeper/zookeeper-configmap.yaml#L40

For example, if I run helm template . and view the broker's ConfigMap output:

apiVersion: v1
kind: ConfigMap
metadata:
  name: "pulsar-broker"
  namespace: nuri-test
  labels:
    app: pulsar
    chart: pulsar-2.0.9
    release: RELEASE-NAME
    heritage: Helm
    component: broker
    cluster: pulsar
data:
  zookeeperServers:
    pulsar-zookeeper-ca:2181
  configurationStoreServers:
    pulsar-zookeeper-ca:2181
  clusterName: pulsar
  allowAutoTopicCreationType: "non-partitioned"
  PULSAR_EXTRA_OPTS: -Dpulsar.log.root.level=info
  PULSAR_GC: -XX:+UseG1GC
  PULSAR_LOG_LEVEL: info
  PULSAR_LOG_ROOT_LEVEL: info
  PULSAR_MEM: -Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled
    -Dio.netty.recycler.linkCapacity=1024 -XX:+ExitOnOutOfMemoryError
  backlogQuotaDefaultRetentionPolicy: producer_exception
  brokerDeduplicationEnabled: "false"
  exposeConsumerLevelMetricsInPrometheus: "false"
  exposeTopicLevelMetricsInPrometheus: "true"
  # Workaround for double-quoted values in old values files
  PULSAR_MEM: -Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ExitOnOutOfMemoryError
  PULSAR_GC: -XX:+UseG1GC 

The last two keys (PULSAR_MEM and PULSAR_GC) appear twice. Helm itself handles this without complaint.

However, for deployment automation tools such as kustomize and FluxCD this presents a problem as they depend on kyaml.

This is a known issue: kubernetes-sigs/kustomize#3480
There is even a chance that newer Helm versions will eventually reject it as well.

I haven't noticed this duplication/workaround in Apache's pulsar chart.

Would it be possible to remove the duplication, or handle it in a way that does not produce duplicate keys in the ConfigMaps?
There might be other instances of this duplication in other ConfigMaps; I haven't gone through all of them.
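For anyone reproducing this without kustomize, piping the rendered templates through a strict YAML parser surfaces the duplicates; for example, assuming yamllint is installed:

helm template . | yamllint -d "{rules: {key-duplicates: enable}}" -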

Thank you.

Enable annotations on services

The following services do not support annotations, which means that on AWS you cannot control the load balancer type used:

  • broker
  • pulsarSql
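For illustration, the desired values shape could look like this (the service.annotations keys for broker and pulsarSQL are hypothetical, since they do not exist yet; the AWS annotation is just an example):

broker:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb   # example AWS annotation
pulsarSQL:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb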

Add an option to delete PVCs while uninstalling the Helm Chart

While running tests on GKE with Fallout, we found that disks are not released even when the GKE cluster is deleted.
This is because the PVCs created by the Helm chart are not deleted when the chart is uninstalled.

It would be great to have an option to automatically delete the PVCs.
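In the meantime, a manual cleanup along these lines works; the app=pulsar label selector is an assumption, so verify it first with kubectl get pvc --show-labels:

helm uninstall pulsar -n pulsar
kubectl delete pvc -n pulsar -l app=pulsar   # assumes the chart's PVCs carry the app=pulsar label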

PulsarHeartBeat and Bastion components fail to find "rootCaSecretName"

When TLS is enabled and a customized root CA secret name (e.g. tls-ss-ca) is provided (see below), the PulsarHeartBeat and Bastion components fail to initialize with the error "MountVolume.SetUp failed for volume "certs" : secret "tls-ss-ca" not found".

tls:
  ... ... 
   rootCaSecretName: "tls-ss-ca"

Checking the secrets, no CA certificate secret is created with the customized name (e.g. 'tls-ss-ca'). However, a CA certificate secret with the default name 'pulsar-ss-ca' is created:

% kubectl get secrets | grep ss-ca
pulsar-ss-ca                                    kubernetes.io/tls                     3      7m32s

Move bookkeeper metadata initialization to its own job

Currently, bookkeeper shell metaformat --nonInteractive is called every time a bookie is started:

# This initContainer will make sure that the bookkeeper
# metadata is in zookeeper
- name: pulsar-bookkeeper-metaformat
  image: "{{ .Values.image.bookkeeper.repository }}:{{ .Values.image.bookkeeper.tag }}"
  imagePullPolicy: {{ .Values.image.bookkeeper.pullPolicy }}
  command: ["sh", "-c"]
  args:
    - >
      bin/apply-config-from-env.py conf/bookkeeper.conf &&
      bin/apply-config-from-env.py conf/bkenv.sh &&
      bin/bookkeeper shell metaformat --nonInteractive || true;

The bookkeeper shell metaformat command has been deprecated since 4.7.0:
https://bookkeeper.apache.org/docs/latest/reference/cli/#bookkeeper-shell-metaformat

This command is deprecated since 4.7.0, in favor of using initnewcluster for initializing a new cluster and nukeexistingcluster for nuking an existing cluster.

Consider moving the bookkeeper metadata initialization to its own job.
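A minimal sketch of what such a Job could look like (not the chart's actual template; it reuses the same image values and config scripts as the current initContainer and omits the env/ConfigMap wiring for brevity):

apiVersion: batch/v1
kind: Job
metadata:
  name: pulsar-bookkeeper-metadata-init   # illustrative name
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: bookkeeper-metadata-init
          image: "{{ .Values.image.bookkeeper.repository }}:{{ .Values.image.bookkeeper.tag }}"
          imagePullPolicy: {{ .Values.image.bookkeeper.pullPolicy }}
          command: ["sh", "-c"]
          args:
            - >
              bin/apply-config-from-env.py conf/bookkeeper.conf &&
              bin/apply-config-from-env.py conf/bkenv.sh &&
              bin/bookkeeper shell initnewcluster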

Error while installing: " unable to build kubernetes objects from release manifest"

Dears,
After following your instructions:

helm repo add datastax-pulsar https://datastax.github.io/pulsar-helm-chart
helm repo update
curl -LOs https://datastax.github.io/pulsar-helm-chart/examples/dev-values.yaml
helm install pulsar -f dev-values.yaml datastax-pulsar/pulsar

I get this:

$ helm install pulsar -f dev-values.yaml datastax-pulsar/pulsar
Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: [ValidationError(Prometheus.spec): unknown field "probeNamespaceSelector" in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field "probeSelector" in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field "shards" in com.coreos.monitoring.v1.Prometheus.spec]

Is there a bug in your templates?

Regards

Service Annotations in pulsar admin console incorrectly scoped

At this line:

The Helm values name for service annotations is .Values.pulsarAdminConsole.annotations, but it should be scoped to the service, consistent with the other components, as .Values.pulsarAdminConsole.service.annotations.

The workaround is to change your values.yaml to match, but that is then inconsistent with the standard and with the other components of the chart.
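For consistency, the values would then look like this (the annotation shown is just an example):

pulsarAdminConsole:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"   # example annotation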

DNS resolution errors with broker host names returned by Pulsar lookups

There's currently a conflicting problem with the Pulsar k8s deployment and how Pulsar load balancing works.

When a Pulsar broker starts, it will register itself as a broker in the internal Pulsar load balancer. Pulsar load balancer might immediately assign new namespace bundles to the broker and the topics might immediately get requests.

The conflicting problem is that DNS resolution for the broker's host name will fail with the current settings until the broker's readiness probe succeeds.

Pulsar might already return the hostname of a specific broker to a client, but the client cannot resolve the DNS name since the broker's readiness probe hasn't passed. This causes extra delays and also bugs when connecting to topics after a load balancing event. Pulsar clients usually backoff and retry. For Admin API HTTP requests, clients might not properly handle errors and for example Pulsar Proxy will fail the request when there's a DNS lookup issue.

Solution: the broker StatefulSet's service should use publishNotReadyAddresses: true.
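A sketch of the proposed setting on the broker headless service (names and ports are illustrative, not the chart's exact template):

apiVersion: v1
kind: Service
metadata:
  name: pulsar-broker
spec:
  clusterIP: None                  # headless service for the broker StatefulSet
  publishNotReadyAddresses: true   # publish DNS records before the readiness probe passes
  selector:
    component: broker
  ports:
    - name: pulsar
      port: 6650
    - name: http
      port: 8080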

There's useful information about stateful sets and publishNotReadyAddresses setting:
k8ssandra/cass-operator#18

There's an alternative solution in #198 which is fine for cases where TLS is disabled for brokers. Stable hostnames are required when using TLS to be able to do hostname verification for the certificates.

Review Port Definitions to Ensure Chart Flexibility

Observation

The current chart has hard-coded ports throughout. A good example is the Pulsar Admin Console's nginx configuration, which hard-codes the ports of the pulsar proxy service. However, the pulsar proxy service allows its ports to be defined. If a user were to deploy the chart with non-default ports for the proxy service (and possibly other services), the components might not integrate properly.

Solution

Review all hard-coded ports and use .Values to make them configurable. In the case of services, it can be easier to declare target ports using the pod's port names instead of the port numbers; using names makes the templates more readable and reduces the configuration necessary for a service (see the sketch below). This feature is described here: https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service.
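A small example of the named-port approach (generic Kubernetes, not tied to this chart): the Service references the pod's port by name, so renumbering the containerPort does not require touching the Service:

apiVersion: v1
kind: Service
metadata:
  name: pulsar-proxy
spec:
  selector:
    component: proxy
  ports:
    - name: http
      port: 8080
      targetPort: http   # resolves to the containerPort named "http" in the pod spec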

Istio service mesh compatibility

Currently, the way jobs start in this helm chart interferes with the use of an Istio service mesh.

However, Pulsar would otherwise be very compatible with a service mesh, and that would move the complexity of mTLS and ingress routing into a separate aspect of the environment, which is often desirable.

This issue can be reproduced as follows with a basic istio installation. From a new installation:

kubectl create namespace pulsar
kubectl label namespace pulsar istio-injection=enabled
helm upgrade --install  \
   pulsar  datastax-pulsar/pulsar \
  --namespace pulsar \
  --create-namespace 

This will result in a set of initialization jobs that hang.

If instead we disable istio injection in the namespace (kubectl label namespace pulsar istio-injection-), install DataStax Luna Pulsar via the helm chart, then re-enable istio and cycle the various pods that take part in the data plane, the system works as expected.

This indicates that a set of tweaks to the jobs (e.g. setting the pod label sidecar.istio.io/inject=false; see the sketch after the job list below) may make this chart compatible with istio.

Jobs observed:

  • pulsar-kube-prometheus-sta-admission-create
  • pulsar-kube-prometheus-sta-admission-patch
  • pulsar-dev-zookeeper-metadata

These jobs will not reach a completed state unless the associated istio-proxy container exits successfully (i.e. shell into the istio-proxy container and run kill -TERM 1). An alternative is to add a preStop condition to the main container that calls curl -sf -XPOST http://127.0.0.1:15020/quitquitquit to tell the istio-proxy to exit; another idea is a dedicated additional sidecar to manage this. See https://discuss.istio.io/t/best-practices-for-jobs/4968/2 for more.
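A sketch of the label-based approach: the fragment below would be merged into each affected Job's pod template (assuming standard Istio injection controls):

spec:
  template:
    metadata:
      labels:
        sidecar.istio.io/inject: "false"   # do not inject the istio-proxy sidecar into this Job's pods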

PulsarSQL worker node readiness and liveness probes fail

The readiness and liveness probes fail for the pulsarSQL worker nodes.

For example, from the pod in question:

pulsar@pulsar-sql-worker-84b789f6bc-wtztr:/pulsar/conf$ hostname -i
192.168.14.34

Calling the status endpoint as localhost fails:

pulsar@pulsar-sql-worker-84b789f6bc-wtztr:/pulsar/conf$ curl http://localhost:8090/v1/service/presto
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404 Not Found</h2>
<table>
<tr><th>URI:</th><td>/v1/service/presto</td></tr>
<tr><th>STATUS:</th><td>404</td></tr>
<tr><th>MESSAGE:</th><td>Not Found</td></tr>
<tr><th>SERVLET:</th><td>org.glassfish.jersey.servlet.ServletContainer-5f6e2ad9</td></tr>
</table>

</body>
</html>

As does calling the status endpoint via the pod IP:

pulsar@pulsar-sql-worker-84b789f6bc-wtztr:/pulsar/conf$ curl http://192.168.14.34:8090/v1/service/presto
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404 Not Found</h2>
<table>
<tr><th>URI:</th><td>/v1/service/presto</td></tr>
<tr><th>STATUS:</th><td>404</td></tr>
<tr><th>MESSAGE:</th><td>Not Found</td></tr>
<tr><th>SERVLET:</th><td>org.glassfish.jersey.servlet.ServletContainer-5f6e2ad9</td></tr>
</table>

</body>
</html>

Release request

Hello,

What is the schedule for releases? We'd like to use a previously patched commit but we're just waiting for the next release.
Thank you again for the continuous patches.

This is the patch we're after.
#141

Thanks!

Liveness check fails due to the way the broker is deployed as a StatefulSet

  1. Change the broker deployment from a Deployment to a StatefulSet
  2. Execute the helm install command
  3. The Pulsar heartbeat and function pods cannot start due to connection errors

It works fine after updating broker.component from broker to brokersts in the dev-values.yaml file. Is this the expected way?

Zookeeper service accessed by shortname

It is only possible to access the Zookeeper service via its short name (pulsar-zookeeper-ca:2181).

I understand that this works for DNS resolution since the brokers live in the same namespace and the domain is present in the search option of /etc/resolv.conf. However, we'd like to avoid generating certificates for short names. Is it possible to add an option to append a full domain to the zookeeper service?

For example something like:
{{ template "pulsar.fullname" . }}-{{ .Values.zookeeper.component }}-ca{{- if .Values.zookeeper.domain -}}.{{ .Values.zookeeper.domain }}{{- end -}}:2281

Thanks

Support existing Keycloak instance

Thanks for the OIDC plugin and including a setup for Keycloak in this helm chart!

For those of us who already have existing Keycloak instances, it would be great to be able to leverage those by configuring the Keycloak component to point to our existing instance rather than deploying a new one.

superUserRoles shouldn't need to contain the proxy role

Currently the sample "superUserRoles" is superuser,admin,websocket,proxy.

Why does this include all possible roles?

One reason seems to be that token generation with Burnell uses the "SuperRoles" environment variable, populated from .Values.superUserRoles, to generate the tokens.

- name: SuperRoles
  value: {{ .Values.superUserRoles }}

https://github.com/datastax/burnell/blob/5a7c261e498ff5b34356b0c164d357e9f3a8b81b/src/workflow/keys-jwt.go#L98

Update Ingress API Version

Sample warning when using the Ingress for PulsarSQL:

W0128 16:06:08.502415 22507 warnings.go:70] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
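A minimal sketch of the equivalent resource under the newer API (name, host, and port are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pulsar-sql
spec:
  rules:
    - host: sql.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pulsar-sql
                port:
                  number: 8090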

Upgrade Circle CI Base Image

[Action Required] Ubuntu 14.04 machine image deprecation & EOL

We are deprecating Ubuntu 14.04-based machine images on CircleCI in preparation for an EOL on Tuesday, May 31, 2022 to ensure your builds remain secure. For a detailed overview of how this will affect your workflow, read the blog article here.

We will also be conducting temporary brownouts on Tuesday, March 29, 2022, and again on Tuesday, April 26, 2022 during which these images will be unavailable.

We are contacting you because one or more of your projects has a job that either:

  • does not specify an image (uses machine: true in config)
  • explicitly uses an Ubuntu 14.04-based image

Jobs that do not specify an image default to using an Ubuntu 14.04-based image.

If you have specified an Ubuntu 14.04-based image or you are using machine: true in your config file, please see our migration guide to upgrade to a newer version of Ubuntu image in order to avoid any service disruption during the brownout & subsequent EOL.

We will also be releasing a CircleCI Ubuntu 22.04 image on April 22nd offering the flexibility to upgrade to the latest LTS version of Ubuntu image before we remove older versions permanently. A beta version of the image will be available March 21st.
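A minimal sketch of pinning a newer machine image in .circleci/config.yml (the image tag is an example; pick the one that fits the project):

version: 2.1
jobs:
  build:
    machine:
      image: ubuntu-2004:current   # replaces machine: true / the old Ubuntu 14.04 default
    steps:
      - checkout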

Allow configuring liveness and readiness probe timeouts

Kubernetes has a default probe timeout of 1 second.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes

timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.

The default 1-second timeout causes the probes to fail intermittently, which can cause undesired restarts.

Make the liveness and readiness probe timeouts configurable and set the default value to 5 seconds to prevent undesired restarts.
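For illustration, this is what the proposal amounts to on a probe definition (the path and port are illustrative, not the chart's exact values):

livenessProbe:
  httpGet:
    path: /status.html
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 5   # proposed default instead of the Kubernetes default of 1 second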

In the k8s docs it says

"Before Kubernetes 1.20, the field timeoutSeconds was not respected for exec probes: probes continued running indefinitely, even past their configured deadline, until a result was returned."
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes

Fix org.apache.kafka.common.errors.UnknownServerException on ktool deployment

When deploying the helm chart with the following values:

              enableAntiAffinity: no
              initialize: true  # ASF Helm Chart
              restartOnConfigMapChange:
                enabled: yes
              image:
                zookeeper:
                  repository: pulsar-repo
                  tag: 2.8.x
                bookie:
                  repository: pulsar-repo
                  tag: 2.8.x
                bookkeeper:
                  repository: pulsar-repo
                  tag: 2.8.x
                autorecovery:
                  repository: pulsar-repo
                  tag: 2.8.x
                broker:
                  repository: pulsar-repo
                  tag: 2.8.x
                proxy:
                  repository: pulsar-repo
                  tag: 2.8.x
                functions:
                  repository: pulsar-repo
                  tag: 2.8.x
                function:
                  repository: pulsar-repo
                  tag: 2.8.x
              extra:
                function: yes
                burnell: no
                burnellLogCollector: no
                pulsarHeartbeat: no
                pulsarAdminConsole: no
                autoRecovery: no
                functionsAsPods: yes
              default_storage:
                existingStorageClassName: server-storage
              volumes:
                data: #ASF Helm Chart
                  storageClassName: existent-storage-class
              zookeeper:
                replicaCount: 3
              bookkeeper:
                replicaCount: 3
              broker:
                component: broker
                replicaCount: 2
                ledger:
                  defaultEnsembleSize: 2
                  defaultAckQuorum:  2
                  defaultWriteQuorum: 2
                service:
                  annotations: {}
                  type: ClusterIP
                  headless: false
                  ports:
                  - name: http
                    port: 8080
                  - name: pulsar
                    port: 6650
                  - name: https
                    port: 8443
                  - name: pulsarssl
                    port: 6651
                  - name: kafkaplaintext
                    port: 9092
                  - name: kafkassl
                    port: 9093
                  - name: kafkaschemareg
                    port: 8001
                kafkaOnPulsarEnabled: true
                kafkaOnPulsar:
                  saslAllowedMechanisms: PLAIN
                  brokerEntryMetadataInterceptors: "org.apache.pulsar.common.intercept.AppendIndexMetadataInterceptor,org.apache.pulsar.common.intercept.AppendBrokerTimestampMetadataInterceptor"
                  kopSchemaRegistryEnable: true
              function:
                replicaCount: 1
                functionReplicaCount: 1
                runtime: "kubernetes"
              proxy:
                replicaCount: 2
                autoPortAssign:
                  enablePlainTextWithTLS: yes
                service:
                  type: ClusterIP
                  autoPortAssign:
                    enabled: yes
                configData:
                  PULSAR_MEM: "\"-Xms400m -Xmx400m -XX:MaxDirectMemorySize=112m\""
                  PULSAR_PREFIX_kafkaListeners: "SASL_PLAINTEXT://0.0.0.0:9092"
                  PULSAR_PREFIX_kafkaAdvertisedListeners: "SASL_PLAINTEXT://pulsar-proxy:9092"
                  PULSAR_PREFIX_saslAllowedMechanisms: PLAIN
                  PULSAR_PREFIX_kafkaProxySuperUserRole: superuser
                  PULSAR_PREFIX_kopSchemaRegistryProxyEnableTls: "false"
                  PULSAR_PREFIX_kopSchemaRegistryEnable: "true"
                  PULSAR_PREFIX_kopSchemaRegistryProxyPort: "8081"
                extensions:
                  enabled: true
                  extensions: "kafka"
                  containerPorts:
                    - name: kafkaplaintext
                      containerPort: 9092
                    - name: kafkassl
                      containerPort: 9093
                    - name: kafkaschemareg
                      containerPort: 8081
                  servicePorts:
                  - name: kafkaplaintext
                    port: 9092
                    protocol: TCP
                    targetPort: kafkaplaintext
                  - name: kafkassl
                    port: 9093
                    protocol: TCP
                    targetPort: kafkassl
                  - name: kafkaschemareg
                    port: 8081
                    protocol: TCP
                    targetPort: kafkaschemareg
              grafanaDashboards:
                enabled: no
              pulsarAdminConsole:
                replicaCount: 0
                service:
                  type: ClusterIP
              grafana: #ASF Helm Chart
                service:
                  type: ClusterIP
              pulsar_manager:
                service: #ASF Helm Chart
                  type: ClusterIP
              kube-prometheus-stack: # Luna Streaming Helm Chart
                enabled: no
                prometheusOperator:
                  enabled: no
                grafana:
                  enabled: no
                  adminPassword: 123
                  service:
                    type: ClusterIP
              pulsarSQL:
                service:
                  type: ClusterIP
              enableTls: no
              enableTokenAuth: no

The following error is thrown by the ktool pod:

[2022-01-01 12:49:53,587] ERROR Failed to start KSQL (io.confluent.ksql.rest.server.KsqlServerMain:66)
io.confluent.ksql.util.KsqlServerException: Could not get Kafka cluster configuration!
	at io.confluent.ksql.services.KafkaClusterUtil.getConfig(KafkaClusterUtil.java:96)
	at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.isKafkaAuthorizerEnabled(KsqlAuthorizationValidatorFactory.java:81)
	at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.create(KsqlAuthorizationValidatorFactory.java:51)
	at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:724)
	at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:637)
	at io.confluent.ksql.rest.server.KsqlServerMain.createExecutable(KsqlServerMain.java:152)
	at io.confluent.ksql.rest.server.KsqlServerMain.main(KsqlServerMain.java:59)
Caused by: org.apache.kafka.common.errors.UnknownServerException: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /10.244.4.18:9092

This is resolved by reverting ab5216de64e5868a378a2a08f0ff6efa0f0430ef, as can be seen by using that "version" of the helm chart. However, a better solution is needed.

Cannot start CertManager on latest minikube v1.23.0

I am trying to configure Pulsar with TLS on minikube on Mac, but the default/pulsar-cert-manager-cainjector-564d757c9f-hzpps:cert-manager pod errors with:

I0910 06:45:11.336198 1 request.go:645] Throttling request took 1.0469241s, request: GET:https://10.96.0.1:443/apis/authentication.k8s.io/v1?timeout=32s
E0910 06:45:12.194358 1 start.go:151] cert-manager/ca-injector "msg"="Error registering certificate based controllers. Retrying after 5 seconds." "error"="no matches for kind "MutatingWebhookConfigurat
Error: error registering secret controller: no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
Usage:
  ca-injector [flags]

I have followed the instructions in the README, and it used to work with a standard k8s installation.

Minikube version:

minikube version
minikube version: v1.23.0
commit: 5931455374810b1bbeb222a9713ae2c756daee10

Pulsar admin in bastion does not work

Today I installed this helm chart on a vanilla k8s environment, but I am having trouble using pulsar-admin:


(ctool-env) enrico.olivelli@eolivelli-rmbp16 pulsar % kubectl exec $(kubectl get pods -l component=bastion -o jsonpath="{.items[*].metadata.name}") -it -- /bin/bash
root@pulsar-bastion-7489594b85-fxjz7:/pulsar# 
root@pulsar-bastion-7489594b85-fxjz7:/pulsar# 
root@pulsar-bastion-7489594b85-fxjz7:/pulsar# 
root@pulsar-bastion-7489594b85-fxjz7:/pulsar# bin/pulsar-admin tenants list
Warning: Nashorn engine is planned to be removed from a future JDK release

null

Reason: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Connection refused: pulsar-proxy/10.109.252.132:8080

we have two problems here:

  • it does not work: Connection refused: pulsar-proxy/10.109.252.132:8080
  • I am still seeing the "Nashorn engine is planned to be removed from a future JDK release" warning, which has already been removed by Ming

Pulsar Broker metadata initialization should use the given Broker image instead of Zookeeper image

In the chart, the Pulsar Broker metadata initialization is called "zookeeperMetadata", which is misleading. It uses the Zookeeper image, which is wrong; the Pulsar Broker image should be used for initializing the Pulsar Broker metadata.

image: "{{ .Values.image.zookeeper.repository }}:{{ .Values.image.zookeeper.tag }}"
imagePullPolicy: {{ .Values.image.zookeeper.pullPolicy }}
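The proposed change would look roughly like this (assuming the chart's existing image.broker values, with a pullPolicy key parallel to the other components):

image: "{{ .Values.image.broker.repository }}:{{ .Values.image.broker.tag }}"
imagePullPolicy: {{ .Values.image.broker.pullPolicy }}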
