
Nebula Operator

Nebula Operator manages NebulaGraph clusters on Kubernetes and automates tasks related to operating a NebulaGraph cluster. It evolved from NebulaGraph Cloud Service and makes NebulaGraph a truly cloud-native database.

Quick Start

Install nebula operator

See install/uninstall nebula operator.

Create and destroy a nebula cluster

$ kubectl create -f config/samples/nebulacluster.yaml

A non-HA-mode nebula cluster will be created.

$ kubectl get pods -l app.kubernetes.io/cluster=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          1m
nebula-metad-0      1/1     Running   0          1m
nebula-storaged-0   1/1     Running   0          1m
nebula-storaged-1   1/1     Running   0          1m
nebula-storaged-2   1/1     Running   0          1m

See client service for how to access nebula clusters created by the operator.
If you are working with kubeadm locally, create a NodePort service and test that nebula is responding:

$ kubectl create -f config/samples/graphd-nodeport-service.yaml

$ nebula-console -u user -p password --address=192.168.8.26 --port=32236
2021/04/15 16:50:23 [INFO] connection pool is initialized successfully

Welcome to NebulaGraph!
(user@nebula) [(none)]>

Destroy the nebula cluster:

$ kubectl delete -f config/samples/nebulacluster.yaml

Resize a nebula cluster

Create a nebula cluster:

$ kubectl create -f config/samples/nebulacluster.yaml

In config/samples/nebulacluster.yaml the initial storaged replica count is 3.
Modify the file and change replicas from 3 to 5.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 5
    image: vesoft/nebula-storaged
    version: v3.6.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

Apply the replicas change to the cluster CR:

$ kubectl apply -f config/samples/nebulacluster.yaml

The storaged cluster will scale to 5 members (5 pods):

$ kubectl get pods -l app.kubernetes.io/cluster=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          2m
nebula-metad-0      1/1     Running   0          2m
nebula-storaged-0   1/1     Running   0          2m
nebula-storaged-1   1/1     Running   0          2m
nebula-storaged-2   1/1     Running   0          2m
nebula-storaged-3   1/1     Running   0          30s
nebula-storaged-4   1/1     Running   0          30s

Similarly, we can decrease the size of the cluster from 5 back to 3 by changing the replicas field again and reapplying the change.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 3
    image: vesoft/nebula-storaged
    version: v3.6.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

We should see the storaged cluster eventually reduce to 3 pods:

$ kubectl get pods -l app.kubernetes.io/cluster=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          10m
nebula-metad-0      1/1     Running   0          10m
nebula-storaged-0   1/1     Running   0          10m
nebula-storaged-1   1/1     Running   0          10m
nebula-storaged-2   1/1     Running   0          10m

In addition, you can install a nebula cluster with Helm.

Upgrade a nebula cluster

Create a nebula cluster with the version specified (v3.6.0):

$ kubectl apply -f config/samples/nebulacluster.yaml
$ kubectl get pods -l app.kubernetes.io/cluster=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          25m
nebula-metad-0      1/1     Running   0          26m
nebula-storaged-0   1/1     Running   0          22m
nebula-storaged-1   1/1     Running   0          24m
nebula-storaged-2   1/1     Running   0          25m

The container image version should be v3.6.0:

$ kubectl get pods -l app.kubernetes.io/cluster=nebula  -o jsonpath="{.items[*].spec.containers[*].image}" |tr -s '[[:space:]]' '\n' |sort |uniq -c
      1 vesoft/nebula-graphd:v3.6.0
      1 vesoft/nebula-metad:v3.6.0
      3 vesoft/nebula-storaged:v3.6.0

Now modify the file nebulacluster.yaml and change the version fields from v3.6.0 to v3.6.x.
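For example, the changed fields in each component section would look roughly like this. This is a sketch showing only the image/version lines; the rest of the spec stays as-is, and v3.6.x stands for the target patch release:

```yaml
  graphd:
    image: vesoft/nebula-graphd
    version: v3.6.x   # was v3.6.0
  metad:
    image: vesoft/nebula-metad
    version: v3.6.x   # was v3.6.0
  storaged:
    image: vesoft/nebula-storaged
    version: v3.6.x   # was v3.6.0
```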

Apply the version change to the cluster CR:

$ kubectl apply -f config/samples/nebulacluster.yaml

Wait a few minutes. The container image versions should be updated to v3.6.x:

$ kubectl get pods -l app.kubernetes.io/cluster=nebula  -o jsonpath="{.items[*].spec.containers[*].image}" |tr -s '[[:space:]]' '\n' |sort |uniq -c
      1 vesoft/nebula-graphd:v3.6.x
      1 vesoft/nebula-metad:v3.6.x
      3 vesoft/nebula-storaged:v3.6.x

Warning:

Rolling upgrades must stay within the same vX.Y version. If you need to upgrade the enterprise edition, please contact us.
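Since a rolling upgrade must stay within one vX.Y line, a quick sanity check on the two version tags before editing the manifest can help. A minimal sketch (the same_minor helper below is ours, not part of any tooling):

```shell
#!/bin/sh
# same_minor vX.Y.Z vX.Y.W -> exit 0 iff both tags are in the same vX.Y line
same_minor() {
  # ${var%.*} strips the trailing ".Z" patch segment, leaving "vX.Y"
  [ "${1%.*}" = "${2%.*}" ]
}

same_minor v3.6.0 v3.6.2 && echo "same minor line: rolling upgrade OK"
same_minor v3.5.0 v3.6.0 || echo "minor version jump: not a rolling upgrade"
```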

Backup and Restore a nebula cluster

See backup/restore nebula cluster .

Failover

If a minority of nebula components crash, the nebula operator will automatically recover from the failure. Let's walk through this in the following steps.

Create a nebula cluster:

$ kubectl create -f config/samples/nebulacluster.yaml

Wait until pods are up. Simulate a member failure by deleting a storaged pod:

$ kubectl delete pod nebula-storaged-2 --now

The nebula operator will recover from the failure by creating a new pod, nebula-storaged-2:

$ kubectl get pods -l app.kubernetes.io/cluster=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          15m
nebula-metad-0      1/1     Running   0          15m
nebula-storaged-0   1/1     Running   0          15m
nebula-storaged-1   1/1     Running   0          15m
nebula-storaged-2   1/1     Running   0          19s

Guidelines

FAQ

Please refer to FAQ.md

Community

Feel free to reach out if you have any questions. The maintainers of this project are reachable via:

Contributing

Contributions are welcome and greatly appreciated.

  • Start with some issues
  • Submit pull requests to us. Please refer to how-to-contribute.

Acknowledgements

nebula-operator draws on tidb-operator, a very good product with a similar architecture. Although the two products target different application scenarios, we would like to express our gratitude here.

License

NebulaGraph is under the Apache 2.0 license. See the LICENSE file for details.

nebula-operator's People

Contributors

804e, abby-cyber, damuji8, defp, dutor, harrischu, hezhizhen, jiayouxujin, jjsimps, kevinliu24, kqzh, megabyte875, qingz11, veezhang, verlocks, wey-gu, yixinglu


nebula-operator's Issues

Conf change precisely reconcile

Some of the configurations can be changed live with curl via the HTTP interface; some cannot.

If we scope them properly, ideally the operator could choose how to reconcile each configuration:

  • for those that need a process restart, do it
  • for those that don't need a restart, update them via the HTTP interface and update the conf file on the fly

Unable to install nebula-operator on an on-premise Rancher v17.17

I am receiving the error below but am not sure how to resolve it.

"transitioning": "error",
"transitioningMessage": "failed to install app nebula-operator. Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Certificate" in version "cert-manager.io/v1", unable to recognize "": no matches for kind "Issuer" in version "cert-manager.io/v1"]",

cluster upgrade failure

When I upgraded a cluster from v2.0.1 to v2.6.1, it still hadn't finished after waiting for about 10 minutes.


Running kubectl describe nc nebula shows the failure (screenshot omitted).

enablePVReclaim doesn't work

  1. Create a NebulaCluster resource using:
apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: {nebulacluster_name}
spec:
  graphd:
    config:
      heartbeat_interval_secs: "1"
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"
        memory: "2Gi"
    replicas: {graphd_num}
    image: vesoft/nebula-graphd
    version: v2-nightly
    service:
      type: NodePort
      externalTrafficPolicy: Local
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: nfs-client
  metad:
    config:
      heartbeat_interval_secs: "1"
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: {metad_num}
    image: vesoft/nebula-metad
    version: v2-nightly
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: nfs-client
  storaged:
    config:
      heartbeat_interval_secs: "1"
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: {storaged_num}
    image: vesoft/nebula-storaged
    version: v2-nightly
    storageClaim:
      resources:
        requests:
          storage: 1Gi
      storageClassName: nfs-client
  reference:
    name: statefulsets.apps
    version: v1
  schedulerName: default-scheduler
  imagePullPolicy: IfNotPresent
  enablePVReclaim: true
  2. Delete the NebulaCluster.

Expected result:
the PVCs are deleted

Actual result:
the PVCs are not deleted

GraphService.cpp:228] Unknown auth type:

集群运行环境:nebula-operator 1.0.0 nebula-graphd 3.0.2
[root@k8s-master1 ~]# kubectl get pod -n kube-nebula
NAME READY STATUS RESTARTS AGE
nebula-graphd-0 1/1 Running 0 34m
nebula-metad-0 1/1 Running 0 29m
nebula-operator-controller-manager-deployment-79fb98f94-pbfj7 2/2 Running 0 76m
nebula-operator-controller-manager-deployment-79fb98f94-vk98s 2/2 Running 2 5h7m
nebula-operator-scheduler-deployment-7898449b87-469zx 2/2 Running 1 5h7m
nebula-operator-scheduler-deployment-7898449b87-blpxw 2/2 Running 2 5h7m
nebula-storaged-0 1/1 Running 1 46m
nebula-storaged-1 1/1 Running 1 46m
nebula-storaged-2 1/1 Running 1 46m

客户端连接日志ba报错
1649927696(1)

nebula-graphd日志报错:
1649927740(1)
1649927790(1)

storageClaim bug

Error: NebulaCluster.apps.nebula-graph.io "nebula-cluster" is invalid: [spec.graphd.storageClaim.storageClassName:
Invalid value: "null": spec.graphd.storageClaim.storageClassName in body must be of type string: "null",
spec.metad.storageClaim.storageClassName: Invalid value: "null":
spec.metad.storageClaim.storageClassName in body must be of type string: "null", spec.storaged.storageClaim.storageClassName:
Invalid value: "null": spec.storaged.storageClaim.storageClassName in body must be of type string: "null"]
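The "null" values above appear to come from leaving storageClassName unset in the manifest. As a sketch of a workaround, explicitly naming a storage class on each storageClaim avoids the validation error (fast-disks here is just a placeholder class name):

```yaml
  graphd:
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks   # must be a non-null string
```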

Replace golint with revive

When I run make lint, the following appears:

WARN [runner] The linter 'golint' is deprecated (since v1.41.0) due to: The repository of the linter has been archived by the owner.  Replaced by revive. 

ngctl cmd tool

ngctl is a terminal command-line tool for nebula-operator. It has the following functions:

  • debug // debug a container in the specified pod
  • console // establish a connection to graphd
  • list // list all nebulaclusters
  • cluster-info // print cluster details, contains component status, resources, image versions etc.
  • version // print git commit info

more readable pvc name


graphd-bdrpgh-graphd-0

Maybe bdrpgh-graphd-0 would be clearer? If I have multiple NebulaClusters, the PVC names all sort under graphd.

Clarify add-ons installation instructions

Summary

Here it says users can install add-ons, but it's unclear what should happen after the installation. Do I need to configure anything for nebula to work with these add-ons?

Please add more detailed documentation on how to configure Nebula to work with these add-ons, how these add-ons work with Nebula, and what each of them is for.

Application practice optimization suggestions for building Nebula clusters in Aliyun cloud

Hi, Nebula-operator Dev Team:
When I was using nebula-operator recently, I found something rather interesting.
Currently nebula-operator only supports the storageClassName mechanism for creating storage dynamically, which I ran into when building clusters on the cloud. AliCloud provides local-storage (NVMe SSD) ECS instances that halve the build cost when creating a container cluster, and in my scenario I was able to get an additional 2TB of cloud disk space on this cluster.
However, nebula-operator needs to provide a mechanism for setting up PVCs for the services; otherwise it is not possible to use existing PVs.

Cannot download latest nebula-cluster chart

helm upgrade nebula nebula-operator/nebula-cluster -f values.yaml --version=1.1.0
Error: Failed to render chart: exit status 1: Error: failed to download "nebula-operator/nebula-cluster" at version "1.1.0"

nebula-operator/nebula-operator @ 1.1.0 seems to work fine though

The minimum supported version of Kubernetes

In the Install Guide, the minimum supported version of Kubernetes is 1.12. However, the add-on cert-manager v1.2.0 requires Kubernetes at least 1.16. And when I try to create a nebula cluster, it shows the error below:
failed to install CRD crds/nebulacluster.yaml: unable to recognize "":no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1"

My kubernetes server version is 1.13, and I found in the kubernetes documentation that the minimum version supporting this CRD API version is 1.16. So what is the minimum supported version of Kubernetes, actually? Can I use it with kubernetes v1.13?

Enabling authentication in a Helm deployment

I need to enable authentication when deploying with Helm, and found that the nebula-cluster.yaml file in the nebula-cluster chart is missing the following configuration:

spec:
  graphd:
    config: {{ toYaml .Values.nebula.graphd.config | nindent 6 }}

After adding that, I also had to add the enable_authorize setting to the values.yaml file to enable authentication:

nebula:
  graphd:
    config:
      "enable_authorize": "true"
      "auth_type": "password"

Add sidecar for nebula logging

Currently, we cannot review nebula logs via kubectl logs or other logging systems.
Add sidecars for nebula-graphd, nebula-metad, and nebula-storaged so that we can capture the logs via stdout.

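A minimal sketch of what such a sidecar could look like, assuming the component spec accepted a list of extra containers (the sidecarContainers field, log path, and volume name below are all assumptions, not the operator's confirmed API):

```yaml
  graphd:
    sidecarContainers:
      - name: graphd-log-tail
        image: busybox
        # stream the service log file to stdout so `kubectl logs -c graphd-log-tail` can show it
        command: ["sh", "-c", "tail -n +1 -F /usr/local/nebula/logs/graphd-stderr.log"]
        volumeMounts:
          - name: graphd-log                    # assumed name of the log volume
            mountPath: /usr/local/nebula/logs
```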

compatibility: v0.8.0 --> v0.9

There are some pain points that could be addressed if possible:

  • a manual CRD update is required; this could be a pitfall even in the case of a fresh deployment of the v0.9.0 operator
  • the upgrade path from v0.8.0 (v2.5.x) to v0.9 (v2.5.x) seems not to be possible due to the CRD change, which will force end-users running v0.8.0 in production into a redeployment

Thanks :)
PS. great job on the v0.9.0 on rolling upgrade support!

ngctl: small improvements

First, it's awesome to have ngctl, really love it, thank you so much!!

  • On the initial connection, the first line of +--------------------+ would read better with a newline before it:
$ngctl console
(root@nebula) [(none)]> show spaces
(root@nebula) [(none)]> +--------------------+
| Name               |
+--------------------+
| "basketballplayer" |
| "shareholding"     |
+--------------------+
  • Would it be possible to support the Up arrow for command history, or even Ctrl-R?

add more configurable options for pod template

During tests by friends at MEG, it was found that there could be a conflict between NebulaCluster pods and the istio-proxy sidecar. In order to disable the sidecar, the annotation fields below are needed.

(screenshot omitted)

For now, though, there is no configuration interface to do so.

Is it feasible to add configurable options for this kind of k/v pair via values.yaml?
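For reference, istio's documented opt-out is a pod-template annotation; if the chart exposed a pod-annotations knob (the podAnnotations field name below is hypothetical), the setting would look roughly like:

```yaml
  graphd:
    podAnnotations:
      sidecar.istio.io/inject: "false"   # istio's standard per-pod injection opt-out
```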
