
kmesh-net / kmesh


High Performance ServiceMesh Data Plane Based on Programmable Kernel

Home Page: https://kmesh.net

License: Apache License 2.0

Makefile 1.48% C 48.42% Go 44.65% Shell 4.99% Dockerfile 0.06% CMake 0.19% Smarty 0.19%
ebpf high-performance kernel kubernetes low-overhead microservice networking resiliency service-mesh traffic-management


kmesh's Issues

Add CI Checks

This is an umbrella issue for CI tasks.
There are many checks we should run in CI:

  • Boilerplate header check
  • Lint check
  • Build
  • Unit test
  • Integration test

Add an indication of whether a pod is managed by Kmesh

Currently, whether Kmesh takes effect on a pod depends not only on the namespace label istio.io/dataplane: kmesh, but also on whether the pod has a sidecar injected. So I propose we add a label or annotation to the pods that Kmesh manages.

This could be done in the Kmesh CNI plugin via a pod update, as sketched below.
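
A minimal sketch of how the CNI plugin could patch such an annotation using client-go; the annotation key kmesh.net/managed and the helper name are assumptions for illustration, not an existing Kmesh API:

package cniplugin

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// annotatePodManagedByKmesh marks a pod as managed by Kmesh via a strategic
// merge patch, so the CNI plugin does not need to read-modify-write the pod.
func annotatePodManagedByKmesh(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	patch := []byte(`{"metadata":{"annotations":{"kmesh.net/managed":"true"}}}`)
	if _, err := client.CoreV1().Pods(namespace).Patch(ctx, name, types.StrategicMergePatchType, patch, metav1.PatchOptions{}); err != nil {
		return fmt.Errorf("failed to annotate pod %s/%s: %w", namespace, name, err)
	}
	return nil
}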

deserial_update_elem failed

What happened:
After the Kmesh daemon started up, I saw this error:

time="2024-01-16T03:56:12Z" level=info msg="options InitDaemonConfig successful" subsys=manager
time="2024-01-16T03:56:13Z" level=info msg="bpf Start successful" subsys=manager
time="2024-01-16T03:56:13Z" level=info msg="controller Start successful" subsys=manager
time="2024-01-16T03:56:13Z" level=info msg="command StartServer successful" subsys=manager
time="2024-01-16T03:56:13Z" level=info msg="start write CNI config\n" subsys=cniplugin
time="2024-01-16T03:56:13Z" level=info msg="kmesh cni use chained\n" subsys=cniplugin
time="2024-01-16T03:56:13Z" level=info msg="Copied /usr/bin/kmesh-cni to /opt/cni/bin." subsys=cniplugin
time="2024-01-16T03:56:13Z" level=info msg="kubeconfig either does not exist or is out of date, writing a new one" subsys=cniplugin
time="2024-01-16T03:56:13Z" level=info msg="wrote kubeconfig file /opt/cni/bin/kmesh-cni-kubeconfig" subsys=cniplugin
time="2024-01-16T03:56:13Z" level=info msg="command Start cni successful" subsys=manager
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2
RouteConfiguration is not set
time="2024-01-16T03:56:22Z" level=error msg="RouteConfigUpdate deserial_update_elem failed" subsys=cache/v2

I am not sure why the bpf map write failed.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kmesh version:
  • Others:

Kmesh CNI overwrites the CNI conflist with an empty value

What happened:

After running Kmesh for some time, we could not restart the Kmesh daemon and were also unable to start new application pods.

By looking into the Kmesh logs, we see:

time="2024-03-06T10:06:41Z" level=info msg="Copied /usr/bin/kmesh-cni to /opt/cni/bin." subsys="cni installer"
time="2024-03-06T10:06:41Z" level=info msg="failed to read conflist: /etc/cni/net.d/10-calico.conflist, error parsing configuration list: unexpected end of JSON input" subsys="cni installer"
time="2024-03-06T10:06:41Z" level=error msg="can not found the valid cni config!\n" subsys="cni installer"
time="2024-03-06T10:06:41Z" level=error msg="can not found the valid cni config!\n" subsys=manager

What you expected to happen:

Kmesh should never write back an invalid or empty conflist; a validation sketch is shown below.
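
A minimal sketch, assuming the installer builds the new conflist in memory as a byte slice; the helper name and the temp-file-plus-rename approach are illustrative, not the existing installer code:

package cniinstall

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeConflistIfValid refuses to touch the existing conflist unless the new
// content parses as JSON and contains a non-empty "plugins" list.
func writeConflistIfValid(path string, newConflist []byte) error {
	var parsed struct {
		Plugins []json.RawMessage `json:"plugins"`
	}
	if err := json.Unmarshal(newConflist, &parsed); err != nil {
		return fmt.Errorf("refusing to write invalid conflist %s: %w", path, err)
	}
	if len(parsed.Plugins) == 0 {
		return fmt.Errorf("refusing to write conflist %s with an empty plugin list", path)
	}
	// Write to a temp file and rename, so a crash mid-write cannot leave a
	// truncated conflist behind (the "unexpected end of JSON input" symptom above).
	tmp := filepath.Join(filepath.Dir(path), "."+filepath.Base(path)+".tmp")
	if err := os.WriteFile(tmp, newConflist, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}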

How to reproduce it (as minimally and precisely as possible):

Not sure how to reproduce.

Anything else we need to know?:

Environment:

  • Kmesh version:
  • Others:

Support mtls for service to service communication

What would you like to be added:

For east-west service communication, we should encrypt the traffic to implement zero-trust network.

As for ingress traffic through the ingress gateway, we should do something similar, but IMO that can be done in v0.4.

Why is this needed:

HashName memory leak

What happened:

In the workload model, we introduced many bpf maps keyed by a string's hash value, and we use an internal object, HashName, to provide unique hash values; it stores a map internally. But the keys are never deleted, even when a workload or service is removed.

FYI: https://github.com/kmesh-net/kmesh/blob/main/pkg/controller_workload/workload/workload_hash.go
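
For illustration, a minimal sketch of what a hash-name store with deletion support could look like; the field names and collision handling are assumptions, not the actual workload_hash.go implementation:

package hashname

import "hash/fnv"

type HashName struct {
	strToNum map[string]uint32
	numToStr map[uint32]string
}

func NewHashName() *HashName {
	return &HashName{
		strToNum: make(map[string]uint32),
		numToStr: make(map[uint32]string),
	}
}

// Hash returns a stable uint32 for str, probing past collisions with other strings.
func (h *HashName) Hash(str string) uint32 {
	if num, ok := h.strToNum[str]; ok {
		return num
	}
	f := fnv.New32a()
	f.Write([]byte(str))
	num := f.Sum32()
	for {
		existing, ok := h.numToStr[num]
		if !ok || existing == str {
			break
		}
		num++
	}
	h.strToNum[str] = num
	h.numToStr[num] = str
	return num
}

// Delete is the piece that is missing today: it should be called when a
// workload or service is removed so the maps do not grow without bound.
func (h *HashName) Delete(str string) {
	if num, ok := h.strToNum[str]; ok {
		delete(h.numToStr, num)
		delete(h.strToNum, str)
	}
}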

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kmesh version:
  • Others:

Problems with libbpf in make build

What would you like to be added:
Add a note about the libbpf version used in the make image versus the libbpf version on the host.
Why is this needed:
The libbpf used in the make image is version 0.8.0, and the signature of bpf_map_create has changed across libbpf versions.
If the host version differs from the one in the make image, an error will occur.

Image Tag Mismatch Between Helm Chart and Repository

What happened:
Image Pull Failure When Installing Kmesh with Helm

Normal   Pulling      28m (x5 over 31m)      kubelet  Pulling image "ghcr.io/kmesh-net/kmesh:0.2.0"
Normal   BackOff      3m21s (x121 over 31m)  kubelet  Back-off pulling image "ghcr.io/kmesh-net/kmesh:0.2.0"

Then I found an image tag mismatch between the Helm chart and the image registry.

In Helm Chart:

    image:
      repository: ghcr.io/kmesh-net/kmesh
      tag: 0.2.0
    imagePullPolicy: IfNotPresent

In Image Repository:

docker pull ghcr.io/kmesh-net/kmesh:v0.2.0

What you expected to happen:
Sync Image Tags Between Helm Chart and Registry
How to reproduce it (as minimally and precisely as possible):
Prepare the cluster environment, then run the following command:

helm install kmesh ./deploy/helm -n kmesh-system --create-namespace

Anything else we need to know?:
Perhaps we should modify the script that generates the image so that it removes the v prefix from the tag.
Environment:

  • Kmesh version: v0.2.0
  • Others: istio :1.20.1

Kmesh support Workload model

Currently, Kmesh has implemented traffic governance functions for L4 and L7 through XDS protocol. However, in some scenarios, microservice applications focus more on L4 traffic governance, and L7 governance can be deployed as needed. The Istio community has launched a Workload model to provide lightweight L4 traffic governance functions, which Kmesh needs to consider supporting.

In Kmesh v0.1.0, enable L7 Routing on Kubernetes 1.27 and Istio 1.19

What would you like to be added:
We would like Kmesh to be compatible with Kubernetes 1.27 and Istio 1.19 for L7 routing.

Why is this needed:
Our production environment is running Kubernetes 1.27 and Istio 1.19. Currently, the latest Kubernetes version for which L7 routing works is 1.20, and the latest Istio version is 1.14.5 (see the table at the bottom of this issue for reference).

Steps To Reproduce Issue

Setup

  1. OS: OpenEuler 23.03
  2. Kubernetes Version: 1.27
  3. Istio Version: 1.19
  4. Kmesh Version: v0.1.0
  5. Backend Services: 2 HTTP Services httpecho-a and httpecho-b
  6. HTTP Client: Netutils

Step 1:
Deploy two backend services httpecho-a and httpecho-b using the below configs

  • Yaml for httpecho-a
# kubectl  apply -f service-a.yaml
apiVersion: v1
kind: Service
metadata:
  name: httpecho-a
  labels:
    app: httpecho-a
    service: httpecho-a
spec:
  ports:
  - port: 5000
    name: http
  selector:
    app: httpecho-a
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpecho-a
  labels:
    app: httpecho-a
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpecho-a
      version: v1
  template:
    metadata:
      labels:
        app: httpecho-a
        version: v1
    spec:
      containers:
      - name: httpecho-a
        image: docker.io/istio/examples-helloworld-v1
        resources:
          requests:
            cpu: "100m"
        imagePullPolicy: IfNotPresent #Always
        ports:
        - containerPort: 5000
  • Yaml for httpecho-b
# kubectl apply -f service-b.yaml
apiVersion: v1
kind: Service
metadata:
  name: httpecho-b
  labels:
    app: httpecho-b
    service: httpecho-b
spec:
  ports:
  - port: 5000
    name: httpb
  selector:
    app: httpecho-b
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpecho-b
  labels:
    app: httpecho-b
    version: v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpecho-b
      version: v2
  template:
    metadata:
      labels:
        app: httpecho-b
        version: v2
    spec:
      containers:
      - name: httpecho-b
        image: docker.io/istio/examples-helloworld-v1
        resources:
          requests:
            cpu: "100m"
        imagePullPolicy: IfNotPresent #Always
        ports:
        - containerPort: 5000

Step 2: Deploy netutils client for sending http requests

kubectl run --image=hwchiu/netutils netutils

Send requests to httpecho-a and httpecho-b with the netutils client. Note the difference in pod names in the responses from each service.

$ kubectl exec -it netutils -- curl httpecho-a:5000
Hello V1, routed from pod httpecho-a-gu83s


$ kubectl exec -it netutils -- curl httpecho-b:5000
Hello V1, routed from pod httpecho-b-t7wx5

Step 3: Deploy Virtual Service for L7 Url Routing

This rule should route all requests sent to httpecho-b to httpecho-a instead

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: route-b-to-a
spec:
  hosts:
  - httpecho-b
  http:
  - match:
    - port: 5000
    route:
    - destination:
        host: httpecho-a.default.svc.cluster.local
        port:
          number: 5000

Step 4: Send request to httpecho-b

Note the pod name in the response. If routing works, it should be httpecho-a-gu83s, and NOT httpecho-b-t7wx5.

$ kubectl exec -it netutils -- curl  httpecho-b:5000/hello
Hello V1, routed from pod httpecho-b-t7wx5

If you run the above reproduction steps on Kubernetes 1.20 and Istio 1.19, the request gets successfully routed to httpecho-a.

Additional Context

As part of our investigation, we tried running the same tests on other versions of Istio and Kubernetes. Below is an overview of our findings.

[kmesh_matrix: version compatibility matrix image attached to the original issue]

Support systemd cgroup driver

What would you like to be added:

In enableKmeshControl we write a mark ENABLE_KMESH_MARK = "0x1000" into the net cgroup of a pod. Currently this only works with the cgroupfs driver, but recent Kubernetes releases make systemd the default cgroup driver.
So I suggest we automatically detect which cgroup driver the kubelet uses and set the mark accordingly (a detection sketch is shown below).
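
A heuristic sketch only: it guesses the driver from the pod cgroup directory layout (systemd-managed kubelets use kubepods.slice, cgroupfs uses a plain kubepods directory). Reading the kubelet configuration would be more reliable; the function name and paths are assumptions:

package cgroupdriver

import "os"

// detectSystemdCgroupDriver returns true when the node appears to use the
// systemd cgroup driver, based on the presence of kubepods.slice.
func detectSystemdCgroupDriver() bool {
	// cgroup v1 net_cls hierarchy.
	if _, err := os.Stat("/sys/fs/cgroup/net_cls/kubepods.slice"); err == nil {
		return true
	}
	// cgroup v2 unified hierarchy.
	if _, err := os.Stat("/sys/fs/cgroup/kubepods.slice"); err == nil {
		return true
	}
	return false
}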

Why is this needed:

Support remote L7 traffic management

What would you like to be added:

As discussed, in Kmesh we want to implement a proxy (maybe called a waypoint) that is able to do L7 traffic management, telemetry, and security enforcement. It can be deployed on any node of the Kubernetes cluster or outside of it.

The waypoint can serve a single service, a namespace, or even be shared.

Why is this needed:

Currently Kmesh L7 depends on kernel enhancements, which are coupled to the OS version, and it cannot handle HTTPS, HTTP/2, and gRPC well. So we need a userspace proxy.

Support build kmesh from container

When I first ran build.sh to build Kmesh, it installed a lot of dependencies on my VM. Istio, by contrast, has an image named build-tools: a developer can run make build, make docker, make lint, make gen, and many other commands inside the container. That provides a good contributing experience for new contributors.

So let's provide a build-tools image and support building the Kmesh binary and image from the container.

Unable to compile Kmesh

Testing on openEuler 22.03 SP1 with the kernel already modified successfully, but running ./build.sh -b produces the following error:

BUILD   kmesh-daemon
# oncn.io/mesh/pkg/bpf
In file included from pkg/bpf/bpf_kmesh.go:23:
/root/kmesh/bpf/kmesh/include/kmesh_common.h:56:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '#pragma'
   56 | } outer_map SEC(".maps");
      | ^~~
/root/kmesh/bpf/kmesh/include/kmesh_common.h:56:1: error: expected identifier or '(' before '#pragma'
/root/kmesh/bpf/kmesh/include/kmesh_common.h:64:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '#pragma'
   64 | } inner_map SEC(".maps");
      | ^~~
/root/kmesh/bpf/kmesh/include/kmesh_common.h:64:1: error: expected identifier or '(' before '#pragma'
/root/kmesh/bpf/kmesh/include/kmesh_common.h: In function 'kmesh_get_ptr_val':
/root/kmesh/bpf/kmesh/include/kmesh_common.h:128:46: error: 'outer_map' undeclared (first use in this function)
  128 |  inner_map_instance = kmesh_map_lookup_elem(&outer_map, &idx);
      |                                              ^~~~~~~~~
/root/kmesh/bpf/kmesh/include/kmesh_common.h:128:46: note: each undeclared identifier is reported only once for each function it appears in
make: *** [Makefile:42: all] Error 2

Is this a version problem with one of the dependencies? How should it be fixed?

Feature: decouple dependency on kubeconfig file

Currently the Kmesh daemon needs a kubeconfig file path mounted at startup, and kmesh-cni has a similar requirement.

On real Kubernetes nodes there is normally no kubeconfig file available for ordinary pods to use.

For the Kmesh daemon, we can use the mounted service account to communicate with kube-apiserver (see the sketch below). For kmesh-cni, the Kmesh daemon should generate a kubeconfig file from the service account and pass its path to kmesh-cni.

Refer to Istio: https://github.com/istio/istio/blob/master/manifests/charts/istio-cni/templates/configmap-cni.yaml#L28-L31
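
A minimal sketch of the daemon side, assuming client-go: the in-cluster config is built from the mounted service account, so no kubeconfig file is needed (the kubeconfig generation for kmesh-cni is not shown):

package kubeclient

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newInClusterClient builds a Kubernetes client from the service account token
// and CA mounted at /var/run/secrets/kubernetes.io/serviceaccount.
func newInClusterClient() (kubernetes.Interface, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	return clientset, nil
}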

Missing null pointer check in ads_loader.go

What happened:
Missing null pointer check in ads_loader.go
e.g.:

func newApiSocketAddress(address *config_core_v3.Address) *core_v2.SocketAddress {
	var addr *config_core_v3.SocketAddress
	switch address.GetAddress().(type) {
	case *config_core_v3.Address_SocketAddress:
		addr = address.GetSocketAddress()
	default:
		return nil
	}
	if addr == nil || !nets.GetConfig().IsEnabledProtocol(addr.GetProtocol().String()) {
		return nil
	}
	return &core_v2.SocketAddress{
		// Protocol: core_v2.SocketAddress_Protocol(addr.GetProtocol()),
		Port: nets.ConvertPortToBigEndian(addr.GetPortValue()),
		Ipv4: nets.ConvertIpToUint32(addr.GetAddress()),
	}
}

What you expected to happen:
Add nil pointer checks in ads_loader.go; a sketch of the fix is shown below.
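
An illustrative sketch of the requested fix; the only change from the snippet above is the explicit nil guard on the address argument (generated Get* methods already tolerate a nil receiver, so the guard mainly makes the intent obvious):

func newApiSocketAddress(address *config_core_v3.Address) *core_v2.SocketAddress {
	// Guard against a nil Address before inspecting its oneof field.
	if address == nil {
		return nil
	}
	var addr *config_core_v3.SocketAddress
	switch address.GetAddress().(type) {
	case *config_core_v3.Address_SocketAddress:
		addr = address.GetSocketAddress()
	default:
		return nil
	}
	if addr == nil || !nets.GetConfig().IsEnabledProtocol(addr.GetProtocol().String()) {
		return nil
	}
	return &core_v2.SocketAddress{
		Port: nets.ConvertPortToBigEndian(addr.GetPortValue()),
		Ipv4: nets.ConvertIpToUint32(addr.GetAddress()),
	}
}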

Environment:

  • Kmesh version:0.2.0
  • Others:

Split manifest out from ./build

The target is to separate the deployment manifests from the build scripts; in the long run we may need to support deploying Kmesh using Helm or other tools like Kustomize.

Support label or annotate the pod to indicate that Kmesh is in charge of traffic management.

What would you like to be added:
Support label or annotate the pod to indicate that Kmesh is in charge of traffic management.

Why is this needed:
Kmesh supports collaborating with an existing mesh data plane. Consider the following scenario: a namespace has already been injected with a sidecar, and then the Kmesh data plane is injected into this namespace.

According to the design, existing pods in the namespace will continue to have their traffic governed by the sidecar. However, the traffic of new pods created in the namespace will be taken over by Kmesh, even though a new pod will still have a sidecar created (see the diagram in the original issue).

In this situation, there needs to be a way to tell the operations team which pods' traffic is being managed by Kmesh; otherwise, confusion may arise.

Kmesh v0.2.0 needs to be restarted for configuration changes to take effect (using Istio 1.19 and Kubernetes 1.27)

What happened:

With Kmesh v0.2.0 already running, I deployed a VirtualService for L7 URL routing. Below are my two observations:

  1. The exported bpf config was NOT updated with the routing rule
  2. The request did NOT get routed as expected.

However, after restarting Kmesh, the following was observed

  1. The exported bpf config was updated with the routing rule
  2. The request got routed as expected.

For reference, see these comments from another GitHub issue:

#133 (comment)
#133 (comment)

What you expected to happen:

After deploying the virtual service, the bpf map should get updated and the routing should take effect WITHOUT having to restart Kmesh

How to reproduce it (as minimally and precisely as possible):

Run the below steps with Kmesh v0.2.0 already running

Step 1

Deploy 2 Backend Services httpecho-a and httpecho-b

  • yaml for service httpecho-a
# kubectl  apply -f service-a.yaml
apiVersion: v1
kind: Service
metadata:
  name: httpecho-a
  labels:
    app: httpecho-a
    service: httpecho-a
spec:
  ports:
  - port: 5000
    name: http
  selector:
    app: httpecho-a
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpecho-a
  labels:
    app: httpecho-a
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpecho-a
      version: v1
  template:
    metadata:
      labels:
        app: httpecho-a
        version: v1
    spec:
      containers:
      - name: httpecho-a
        image: docker.io/istio/examples-helloworld-v1
        resources:
          requests:
            cpu: "100m"
        imagePullPolicy: IfNotPresent #Always
        ports:
        - containerPort: 5000
  • yaml for httpecho-b
# kubectl apply -f service-b.yaml
apiVersion: v1
kind: Service
metadata:
  name: httpecho-b
  labels:
    app: httpecho-b
    service: httpecho-b
spec:
  ports:
  - port: 5000
    name: httpb
  selector:
    app: httpecho-b
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpecho-b
  labels:
    app: httpecho-b
    version: v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpecho-b
      version: v2
  template:
    metadata:
      labels:
        app: httpecho-b
        version: v2
    spec:
      containers:
      - name: httpecho-b
        image: docker.io/istio/examples-helloworld-v1
        resources:
          requests:
            cpu: "100m"
        imagePullPolicy: IfNotPresent #Always
        ports:
        - containerPort: 5000

Step 2: Deploy netutils client for sending http requests

kubectl run --image=hwchiu/netutils netutils

Send requests to httpecho-a and httpecho-b with the netutils client. Note the difference in pod names in the responses from each service.

$ kubectl exec -it netutils -- curl httpecho-a:5000
Hello V1, routed from pod httpecho-a-gu83s


$ kubectl exec -it netutils -- curl httpecho-b:5000
Hello V1, routed from pod httpecho-b-t7wx5

Step 3: Deploy Virtual Service for L7 Url Routing

This rule should route all requests sent to httpecho-b to httpecho-a instead

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: route-b-to-a
spec:
  hosts:
  - httpecho-b
  http:
  - match:
    - port: 5000
    route:
    - destination:
        host: httpecho-a.default.svc.cluster.local
        port:
          number: 5000

Step 4: Send request to httpecho-b

Note the pod name in the response. If routing works, it should be httpecho-a-gu83s, and NOT httpecho-b-t7wx5.

$ kubectl exec -it netutils -- curl  httpecho-b:5000/hello
Hello V1, routed from pod httpecho-b-t7wx5

Also, export the bpf config using curl GET /bpf/kmesh/maps:15200. The newly added rule is NOT a part of the bpf config.

Step 5: Restart Kmesh and run the above command once again.

The request gets routed to httpecho-a

$ kubectl exec -it netutils -- curl  httpecho-b:5000/hello
Hello V1, routed from pod httpecho-a-gu83s

Export the bpf config again using curl GET /bpf/kmesh/maps:15200. The newly added rule is now part of the exported bpf config.

[bpf_config_image: screenshot of the exported bpf config from the original issue]

The same issue occurs when we try to delete the routing rule: the request keeps getting routed to httpecho-a. Once we restart Kmesh, the request gets routed to httpecho-b again.

Anything else we need to know?:

Environment:

  • Kmesh version: v0.2.0
  • Istio: 1.19.0
  • Kubernetes: 1.27.0

Improve ADS with istiod

The bootstrap config is currently almost static; we can only update the ADS server address, and it cannot communicate with the server over a secure port.

IMO, there are several steps to make ADS more secure:

  • Support secure ads communication with kmesh token
  • Support dynamic node ID generation; the node ID is the identity corresponding to the pod: "nodeType~" + ip + "~" + podName + "." + namespace + "~" + namespace + ".svc.cluster.local" (see the sketch after this list)
  • Bring nonce when ack/nack
  • We do not need to request listeners every time after CDS is handled; IMO we only need to send the request once per stream, otherwise istiod will push duplicate listeners
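
A minimal sketch of building the node ID in the format described above (Istio's "<type>~<ip>~<podName>.<namespace>~<namespace>.svc.cluster.local"); the function name and how the pod metadata is obtained are assumptions:

package nodeid

import "fmt"

// buildNodeID assembles the ADS node identity from pod metadata.
func buildNodeID(nodeType, podIP, podName, namespace string) string {
	return fmt.Sprintf("%s~%s~%s.%s~%s.svc.cluster.local",
		nodeType, podIP, podName, namespace, namespace)
}

// Example: buildNodeID("sidecar", "10.0.0.5", "kmesh-abc12", "kmesh-system")
// returns "sidecar~10.0.0.5~kmesh-abc12.kmesh-system~kmesh-system.svc.cluster.local".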

Refactor: ADS and WDS should share a connection

What would you like to be added:

With the introduction of the workload controller, we almost duplicated the client implementation from ADS. To make it simpler to maintain, we should refactor the client.

Why is this needed:

compile error: /usr/include/bpf/bpf_helpers.h:74:2: error: '#' is not followed by a macro parameter

  1. env:
    [root@localhost kmesh]# uname -r
    5.10.0-60.18.0.50.oe2203.x86_64
    [root@localhost kmesh]# yum info libbpf
    Last metadata expiration check: 22:58:20 ago on Fri 17 Nov 2023 11:45:50 AM CST.
    Installed Packages
    Name : libbpf
    Epoch : 2
    Version : 0.3
    Release : 1.h0.oe2203
    Architecture : x86_64
    Size : 239 k
    Source : libbpf-0.3-1.h0.oe2203.src.rpm
    Repository : @System
    From repo : everything
    Summary : Libbpf library
    URL : https://github.com/libbpf/libbpf
    License : LGPLv2 or BSD
    Description : A mirror of bpf-next linux tree bpf-next/tools/lib/bpf directory plus its
    : supporting header files. The version of the package reflects the version of
    : ABI.

[root@localhost kmesh]#

  2. Compile failed: /usr/include/bpf/bpf_helpers.h:74:2: error: '#' is not followed by a macro parameter
    [root@localhost kmesh]# ./build.sh
    ......
    In file included from ../../include/common.h:28:
    /usr/include/bpf/bpf_helpers.h:74:2: error: '#' is not followed by a macro parameter
    #if GNUC && !clang
    ^
    /usr/include/bpf/bpf_helpers.h:76:2: error: #else after #else
    #else
    ^
    /usr/include/bpf/bpf_helpers.h:80:2: error: #else after #else

  3. After the preceding error is temporarily worked around, the following error message is displayed when compilation continues:
    /home/wcy/code/kmesh/oncn-mda/ebpf_src/sock_redirect.c:23:31: error: use of undeclared identifier 'NULL'
    struct sock_key *redir_key = NULL;
    It looks like some basic header files are not included. Please check it, thanks.

Kmesh as the server-side proxy has a problem when working with an Envoy client

When Kmesh collaborates with Envoy, Kmesh on the client side can bypass Envoy and manage the traffic itself, which greatly improves performance.
However, when Kmesh works on the server side, a problem occurs: if the client sends messages through Envoy, the server side short-circuits its Envoy when Kmesh is present, but the client-side Envoy may be using mTLS encryption, so an error occurs when the server receives the messages.

Kmesh image start failed, cannot connect to istiod

What happened:
When I applied kmesh.yaml, an issue occurred:

[root@master config]# kubectl logs -f -n kmesh-system kmesh-deploy-gcdsn
time="2024-02-04T02:29:14Z" level=error msg="failed to get service istiod in namespace istio-system!" subsys=controller/envoy
time="2024-02-04T02:29:14Z" level=info msg="services "istiod" is forbidden: User "system:serviceaccount:kmesh-system:kmesh" cannot get resource "services" in API group "" in the namespace "istio-system"" subsys=controller/envoy
time="2024-02-04T02:29:14Z" level=info msg="options InitDaemonConfig successful" subsys=manager
time="2024-02-04T02:29:15Z" level=info msg="bpf Start successful" subsys=manager
time="2024-02-04T02:29:35Z" level=error msg="ads StreamAggregatedResources failed, rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 192.168.0.1:15010: i/o timeout"" subsys=manager

And when applying the fortio pod:

Events:
Type Reason Age From Message


Normal Scheduled 15s default-scheduler Successfully assigned default/fortio-client-deployment-5dd87867f8-gtnkm to master
Warning FailedCreatePodSandBox 13s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "9eda64cbd12cc5096acf1f30b6c6a8729cf233a7e1adf1dbcdb2cc74cd98e505" network for pod "fortio-client-deployment-5dd87867f8-gtnkm": networkPlugin cni failed to set up pod "fortio-client-deployment-5dd87867f8-gtnkm_default" network: failed to get k8s client: stat kmesh-cni-kubeconfig: no such file or directory

What you expected to happen:

[root@master config]# kubectl logs -f -n kmesh-system kmesh-deploy-wnpmr
time="2024-02-04T02:26:59Z" level=info msg="options InitDaemonConfig successful" subsys=manager
time="2024-02-04T02:27:00Z" level=info msg="bpf Start successful" subsys=manager
time="2024-02-04T02:27:00Z" level=info msg="controller Start successful" subsys=manager
time="2024-02-04T02:27:00Z" level=info msg="command StartServer successful" subsys=manager
time="2024-02-04T02:27:00Z" level=info msg="start write CNI config\n" subsys="cni installer"
time="2024-02-04T02:27:00Z" level=info msg="kmesh cni use chained\n" subsys="cni installer"
time="2024-02-04T02:27:02Z" level=info msg="Copied /usr/bin/kmesh-cni to /opt/cni/bin." subsys="cni installer"
time="2024-02-04T02:27:02Z" level=info msg="kubeconfig either does not exist or is out of date, writing a new one" subsys="cni installer"
time="2024-02-04T02:27:02Z" level=info msg="wrote kubeconfig file /etc/cni/net.d/kmesh-cni-kubeconfig" subsys="cni installer"
time="2024-02-04T02:27:02Z" level=info msg="command Start cni successful" subsys=manager

How to reproduce it (as minimally and precisely as possible):

kubectl apply -f kmesh.yaml -n kmesh-system

kubectl apply -f fortio-client.yaml in kmesh/test/performance/long_test/config

Anything else we need to know?:

Environment: oe2303

  • Kmesh version:
  • Others:

Removed resources in WDS may contain both Workload and Service types

What happened:

Currently we assume the removed resources are either all pods or all services. This is not right; for more details we can look at how the control plane (istiod) handles it. The current handling is:

func handleDeleteResponse(rsp *service_discovery_v3.DeltaDiscoveryResponse) error {
	var err error
	if strings.Contains(strings.Join(rsp.RemovedResources, ""), "Kubernetes//Pod") {
		// delete as a workload
		if err = RemoveWorkloadResource(rsp.GetRemovedResources()); err != nil {
			log.Errorf("RemoveWorkloadResource failed: %s", err)
		}
	} else {
		// delete as a service
		if err = RemoveServiceResource(rsp.GetRemovedResources()); err != nil {
			log.Errorf("RemoveServiceResource failed: %s", err)
		}
	}
	return err
}
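
An illustrative sketch of per-resource dispatch instead of the all-or-nothing branch above; the helper names mirror the snippet, but the splitting logic is an assumption about the intended fix, not existing code:

func handleDeleteResponseSplit(rsp *service_discovery_v3.DeltaDiscoveryResponse) error {
	var workloads, services []string
	// A single delta response can remove workloads and services at the same time,
	// so classify each removed resource name individually.
	for _, name := range rsp.GetRemovedResources() {
		if strings.Contains(name, "Kubernetes//Pod") {
			workloads = append(workloads, name)
		} else {
			services = append(services, name)
		}
	}
	var err error
	if len(workloads) > 0 {
		if err = RemoveWorkloadResource(workloads); err != nil {
			log.Errorf("RemoveWorkloadResource failed: %s", err)
		}
	}
	if len(services) > 0 {
		if err = RemoveServiceResource(services); err != nil {
			log.Errorf("RemoveServiceResource failed: %s", err)
		}
	}
	return err
}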

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kmesh version:
  • Others:

kubectl logs reports an error when starting Kmesh via a DaemonSet

On openEuler 23.03, after installing istio-1.19.3 and then installing Kmesh via kubectl apply -f kmesh.yaml, an error appears. The kmesh-deploy DaemonSet can run in the Kubernetes cluster,
but kubectl logs kmesh-deploy-t4zcj shows the following error:

[root@host-192-168-100-230 docker]# kubectl logs kmesh-deploy-t4zcj
time="2023-12-08T08:39:01Z" level=info msg="options InitDaemonConfig successful" subsys=manager
time="2023-12-08T08:39:01Z" level=info msg="bpf Start successful" subsys=manager
time="2023-12-08T08:39:01Z" level=info msg="controller Start successful" subsys=manager
time="2023-12-08T08:39:01Z" level=info msg="command StartServer successful" subsys=manager
panic: runtime error: index out of range [3] with length 0

goroutine 26 [running]:
encoding/binary.littleEndian.Uint32(...)
        /usr/lib/golang/src/encoding/binary/binary.go:80
oncn.io/mesh/pkg/nets.ConvertIpToUint32({0xc00044f8f0?, 0xc0004461b0?})
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/nets/nets.go:34 +0x4e
oncn.io/mesh/pkg/controller/envoy.newApiSocketAddress(0xb?)
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_loader.go:138 +0xf1
oncn.io/mesh/pkg/controller/envoy.newApiClusterLoadAssignment(0xc00047d1a0)
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_loader.go:107 +0x2ad
oncn.io/mesh/pkg/controller/envoy.(*AdsLoader).CreateApiClusterByCds(0xc000500f60, 0x2, 0xc00055f000)
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_loader.go:76 +0x258
oncn.io/mesh/pkg/controller/envoy.(*ServiceEvent).handleCdsResponse(0xc0005a1860, 0xc0000d4d80)
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_event.go:150 +0x393
oncn.io/mesh/pkg/controller/envoy.(*ServiceEvent).processAdsResponse(0xc0005a1860, 0xc0000d4d80)
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_event.go:112 +0x1a9
oncn.io/mesh/pkg/controller/envoy.(*AdsClient).runControlPlane(0xc000430690, {0x1b26a90, 0xc0008cf300})
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_client.go:142 +0x15c
created by oncn.io/mesh/pkg/controller/envoy.(*AdsClient).Run
        /root/kmesh/test/mugen-master/testcases/smoke-test/kmesh/oe_test_service_function/rpmbuild/BUILD/kmesh-0.0.1/pkg/controller/envoy/ads_client.go:160 +0x125
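
The panic above comes from ConvertIpToUint32 calling binary.LittleEndian.Uint32 on an empty slice. A hypothetical defensive variant (the real function lives in pkg/nets; this sketch only illustrates where the length guard belongs):

package netsutil

import (
	"encoding/binary"
	"fmt"
	"net"
)

func convertIpToUint32Safe(ip string) (uint32, error) {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return 0, fmt.Errorf("invalid IP address %q", ip)
	}
	v4 := parsed.To4()
	if v4 == nil {
		return 0, fmt.Errorf("not an IPv4 address: %q", ip)
	}
	// binary.LittleEndian.Uint32 panics on slices shorter than 4 bytes, which
	// matches the "index out of range [3] with length 0" panic in the log above.
	return binary.LittleEndian.Uint32(v4), nil
}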

Trying the demo example, I also did not see Kmesh participating in traffic governance.

REQUEST: New membership for LiZhenCheng9527

GitHub Username

LiZhenCheng9527

Organization you are requesting membership in

Kmesh

Requirements

  • I have reviewed the community membership guidelines
  • I have joined in the community mailing group and slack
  • I have watched kmesh-net/kmesh on GitHub to subscribe to updates (my account appears in the watchers list)
  • I have enabled two-factor authentication on my GitHub account
  • I am actively contributing to 1 or more Kmesh subprojects
  • I have spoken to my sponsors ahead of this application, and they have agreed to sponsor my application

Sponsors

@hzxuzhonghu

List of contributions to the Kmesh project

Questions about error return in ads_event.go

Please provide an in-depth description of the question you have:

func (svc *ServiceEvent) handleCdsResponse(rsp *service_discovery_v3.DiscoveryResponse) error {
	var (
		err     error
		cluster = &config_cluster_v3.Cluster{}
	)

	current := sets.New[string]()
	for _, resource := range rsp.GetResources() {
		if err = anypb.UnmarshalTo(resource, cluster, proto.UnmarshalOptions{}); err != nil {
			continue
		}
		current.Insert(cluster.GetName())
		// compare part[0] CDS now
		// Cluster_EDS need compare tow parts, compare part[1] EDS in EDS handler
		apiStatus := core_v2.ApiStatus_UPDATE
		newHash := hash.Sum64String(resource.String())
		if newHash != svc.DynamicLoader.ClusterCache.GetCdsHash(cluster.GetName()) {
			svc.DynamicLoader.ClusterCache.SetCdsHash(cluster.GetName(), newHash)
			log.Debugf("[CreateApiClusterByCds] update cluster %s, status %d, cluster.type %v",
				cluster.GetName(), apiStatus, cluster.GetType())
			svc.DynamicLoader.CreateApiClusterByCds(apiStatus, cluster)
		} else {
			log.Debugf("unchanged cluster %s", cluster.GetName())
		}
	}

	removed := svc.DynamicLoader.ClusterCache.GetResourceNames().Difference(current)
	for key := range removed {
		svc.DynamicLoader.UpdateApiClusterStatus(key, core_v2.ApiStatus_DELETE)
	}

	// TODO: maybe we don't need to wait until all clusters ready before loading, like cluster delete

	if len(svc.DynamicLoader.clusterNames) > 0 {
		svc.rqt = newAdsRequest(resource_v3.EndpointType, svc.DynamicLoader.clusterNames)
		svc.DynamicLoader.clusterNames = nil
	} else {
		svc.DynamicLoader.ClusterCache.Flush()
	}
	return nil
}

In the above code Kmesh processes the CDS response.
However, errors during processing are not returned; the function just returns nil at the end.

func (svc *ServiceEvent) processAdsResponse(rsp *service_discovery_v3.DiscoveryResponse) {
	var err error

	log.Debugf("handle ads response, %#v\n", rsp.GetTypeUrl())

	svc.ack = newAckRequest(rsp)
	if rsp.GetResources() == nil {
		return
	}

	switch rsp.GetTypeUrl() {
	case resource_v3.ClusterType:
		err = svc.handleCdsResponse(rsp)
	case resource_v3.EndpointType:
		err = svc.handleEdsResponse(rsp)
	case resource_v3.ListenerType:
		err = svc.handleLdsResponse(rsp)
	case resource_v3.RouteType:
		err = svc.handleRdsResponse(rsp)
	default:
		err = fmt.Errorf("unsupport type url %s", rsp.GetTypeUrl())
	}

	if err != nil {
		log.Error(err)
	}
}

So in the above function, won't the final error always be nil?

What do you think about this question?:
I don't know if this is a bug, or if there is another way of passing errors in xds (a sketch of propagating the errors is shown below).
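
An illustrative sketch only, showing one way handleCdsResponse could propagate failures instead of always returning nil, by collecting the unmarshal errors; this is not the current Kmesh code (errors.Join requires Go 1.20+):

func (svc *ServiceEvent) handleCdsResponse(rsp *service_discovery_v3.DiscoveryResponse) error {
	var errs []error
	cluster := &config_cluster_v3.Cluster{}
	for _, resource := range rsp.GetResources() {
		if err := anypb.UnmarshalTo(resource, cluster, proto.UnmarshalOptions{}); err != nil {
			// Record the failure instead of silently skipping the resource.
			errs = append(errs, fmt.Errorf("unmarshal cluster failed: %w", err))
			continue
		}
		// ... hash comparison, cache update and removal handling as in the snippet above ...
	}
	// errors.Join returns nil when errs is empty, so callers only see an error
	// when something actually went wrong.
	return errors.Join(errs...)
}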

Environment:

  • Kmesh version: 0.2.0

Commit apis generated from protobuf

What would you like to be added:

In Kmesh we generate some Golang and C API files from the proto definitions. I propose we commit them into this repo and add a CI task to check that these files stay in line with the original protos.

Why is this needed:

These protos are not frequently updated.
Without these api files, we cannot even run go mod tidy successfully.

Improve project readme

  1. The project logo is too large, which wastes page space and results in a poor reading experience.
  2. Project-related tags and badges should be added, e.g. CI, SLSA level, releases, OpenSSF best practices, etc.

mini test can not pass

Previously, Kmesh took effect for all sockets by default. However, this behavior was changed by commit 84c496d: from the Kubernetes perspective, you need to label the Kubernetes namespace so that pods in that namespace are managed by Kmesh. This actually sets the classid in the pod cgroup.

The mini test simplifies the test environment: there is no Kubernetes. Therefore, the test needs to manually create and mark a cgroup, so that sockets created by processes in that cgroup are managed by Kmesh (a sketch is shown below).
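
A minimal sketch, assuming a cgroup v1 net_cls hierarchy mounted at /sys/fs/cgroup/net_cls; the cgroup name and helper are illustrative, and the 0x1000 value mirrors the ENABLE_KMESH_MARK mentioned above:

package minitest

import (
	"fmt"
	"os"
	"path/filepath"
)

// createKmeshTestCgroup creates a net_cls cgroup, marks it so Kmesh manages the
// sockets of processes inside it, and moves the current process into it.
func createKmeshTestCgroup(name string) error {
	dir := filepath.Join("/sys/fs/cgroup/net_cls", name)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(dir, "net_cls.classid"), []byte("0x1000"), 0o644); err != nil {
		return err
	}
	pid := fmt.Sprintf("%d", os.Getpid())
	return os.WriteFile(filepath.Join(dir, "cgroup.procs"), []byte(pid), 0o644)
}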

Refactor CacheFactory and CacheFlush

Why is this needed:

type CacheFactory interface {
	StatusFlush(status core_v2.ApiStatus) int
	StatusDelete(status core_v2.ApiStatus)
	StatusReset(old, new core_v2.ApiStatus)
}

CacheFactory is the interface used to update and delete xds resources received from the xds control plane.

But currently the cache flushing procedure is quite hard to understand, and there are other caveats:

  1. It does not support delta cache flush, which is useful for delta xds (one possible interface extension is sketched after this list).
  2. The underlying implementations of CacheFactory store redundant resources.
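
An illustrative sketch only of one possible extension of CacheFactory with a delta-aware flush; the method name and signature are assumptions, not an agreed design:

type CacheFactory interface {
	StatusFlush(status core_v2.ApiStatus) int
	StatusDelete(status core_v2.ApiStatus)
	StatusReset(old, new core_v2.ApiStatus)
	// DeltaFlush flushes only the named resources, so a delta xds push does not
	// require walking and rewriting the whole cache.
	DeltaFlush(names []string) int
}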

Do we need to add issue templates?

We only get a blank issue form when creating issues. Should we provide templates for the commonly used issue types, like bug report, question, security vulnerability, and so on?

Improve unit test coverage

What would you like to be added:

Currently there are no unit tests; we should add test coverage for each module. Basically I suggest doing this per package.

Everyone can take part in this; please list the packages you are interested in below.

Use Github Actions to Automatically Sync Code Pushed to Github to Gitee

What would you like to be added:
Use Github Actions to Automatically Sync Code Pushed to Github to Gitee.
Why is this needed:
In addition to the code repository on GitHub, the Kmesh project also has a code repository on Gitee. But the Gitee repository lags far behind and is missing a lot of code, and newly released versions have also not been synced to it. Therefore, we hope a GitHub Action can automatically push the new code to Gitee when a version is released.

Why can't I import kmesh's own package?

Please provide an in-depth description of the question you have:
Before contributing to the Kmesh project, I tried to set up the Kmesh development environment. When I executed go mod tidy, I encountered the following error:

kmesh.net/kmesh/api/v2/endpoint: cannot find module providing package kmesh.net/kmesh/api/v2/endpoint: unrecognized import path "kmesh.net/kmesh/api/v2/endpoint": reading https://kmesh.net/kmesh/api/v2/endpoint?go-get=1: 404 Not Found

I tried many methods but none of them resolved it. Do you have any suggestions on how to solve this?
What do you think about this question?:
Maybe there's something I'm not handling correctly that is causing this.
Environment:

  • Kmesh version:0.1.0

End to end tests

What would you like to be added:

This is to improve test coverage of features; we should test each feature in a black-box way. I am not sure whether GitHub Actions runners allow running Kmesh. This may be a blocker, so we should verify it first.

The tasks can be divided as:

  • Provide script or go code to install kubernetes and kmesh
  • Prepare apps and framework that we need for e2e test
  • Write the first test as example
  • Write the other cases one by one

Why is this needed:

make docker failed

What happened:

$ make docker
docker build --build-arg arch=amd64 -f build/docker/kmesh.dockerfile -t  .
ERROR: "docker buildx build" requires exactly 1 argument.
See 'docker buildx build --help'.

Usage:  docker buildx build [OPTIONS] PATH | URL | -

Start a build
make: *** [Makefile:134: docker] Error 1

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kmesh version:
  • Others:
# docker version
Client: Docker Engine - Community
Version:           23.0.3
API version:       1.42
Go version:        go1.19.7
Git commit:        3e7cbfd
Built:             Tue Apr  4 22:06:10 2023
OS/Arch:           linux/amd64
Context:           default

Server: Docker Engine - Community
Engine:
 Version:          23.0.3
 API version:      1.42 (minimum version 1.12)
 Go version:       go1.19.7
 Git commit:       59118bf
 Built:            Tue Apr  4 22:06:10 2023
 OS/Arch:          linux/amd64
 Experimental:     false
containerd:
 Version:          1.6.20
 GitCommit:        2806fc1057397dbaeefbea0e4e17bddfbd388f38
runc:
 Version:          1.1.5
 GitCommit:        v1.1.5-0-gf19387a
docker-init:
 Version:          0.19.0
 GitCommit:        de40ad0
