AliyunContainerService / kube-eventer
kube-eventer emits Kubernetes events to sinks
License: Apache License 2.0
docker pull registry.aliyuncs.com/acs/kube-eventer-amd64:v1.1.0-c93a835-aliyun
Output:
Error response from daemon: Get https://registry.aliyuncs.com/v2/acs/kube-eventer-amd64/manifests/latest: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Aacs%2Fkube-eventer-amd64%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Could you please provide a working image address? Thanks.
The kube-eventer configuration is as follows:
image: registry.aliyuncs.com/acs/kube-eventer-amd64:v1.1.0-c93a835-aliyun
command:
- "/kube-eventer"
- --frequency=10s
- --stderrthreshold=10
- "--source=kubernetes:https://kubernetes.default"
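- --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=XXXXXX&level=Warning&namespaces=kube-system,default&label=Pre-k8s&kinds=Pod
The same Warning message was sent more than 20 times.
There is also a large delay between the actual alert time and last_occurrence_time (sometimes as much as an hour).
After kube-eventer starts, it listens on port 8084, but /metrics returns a 404 error.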
/ $ wget http://10.128.14.53:8084/metrics/
Connecting to 10.128.14.53:8084 (10.128.14.53:8084)
wget: server returned error: HTTP/1.1 404 Not Found
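--sink=dingtalk:[your_webhook_url]&label=[your_cluster_id]&level=[optional: Normal or Warning; default: Warning]
Does your_cluster_id in this snippet have to refer to an Alibaba Cloud Kubernetes cluster? How should this field be set for a self-built Kubernetes cluster?
The current DingTalk push format looks like this: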
<clusterid>
Level:Warning
Namespace:default
Name:pod-abcd
Message:Port 31666 was assigned to multiple services; please recreate service
Reason:PortAlreadyAllocated
Timestamp:3019-07-01 08:38:03 +0000 UTC
Level: Warning
Namespace: default
Name: pod-abcd
Message: Port 31666 was assigned to multiple services; please recreate service
Reason:PortAlreadyAllocated
Timestamp:3019-07-01 08:38:03 +0000 UTC
Based on your console's URL pattern and the kind of the event, show the corresponding URL:
Deployment/Pod
https://cs.console.aliyun.com/#/k8s/deployment/detail///default/deployment1/pods
StatefulSet
https://cs.console.aliyun.com/#/k8s/statefulset/detail///default/statefulset1/pods
DaemonSet
https://cs.console.aliyun.com/#/k8s/daemonset/detail///kube-system/daemonset1/pods
Each go file has a notice:
Copyright <year> Google Inc. All Rights Reserved.
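Was this project forked from a project that was open-sourced by Google Inc.?
The deployment command in the README still uses eventer and has not been changed to kube-eventer. The image version is also fairly old; it would be better to keep it consistent with the deploy/ directory.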
image: registry.cn-beijing.aliyuncs.com/acs/kube-eventer-amd64:v1.0.0-d9898e1-aliyun
name: kube-eventer
command:
- "/kube-eventer"
- "--source=kubernetes:https://kubernetes.default"
- "--sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=<My-Token>&label=Kubernetes-Test&level=Normal"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xad615f]
goroutine 28 [running]:
github.com/AliyunContainerService/kube-eventer/sinks/dingtalk.(*DingTalkSink).Ding(0xc00030c380, 0xc000144780)
/src/github.com/AliyunContainerService/kube-eventer/sinks/dingtalk/dingtalk.go:151 +0x30f
github.com/AliyunContainerService/kube-eventer/sinks/dingtalk.(*DingTalkSink).ExportEvents(0xc00030c380, 0xc000102720)
/src/github.com/AliyunContainerService/kube-eventer/sinks/dingtalk/dingtalk.go:93 +0x97
github.com/AliyunContainerService/kube-eventer/sinks.export(0x17ea640, 0xc00030c380, 0xc000102720)
/src/github.com/AliyunContainerService/kube-eventer/sinks/manager.go:145 +0x91
github.com/AliyunContainerService/kube-eventer/sinks.NewEventSinkManager.func1(0x17ea640, 0xc00030c380, 0xc00012f320, 0xc00012f380)
/src/github.com/AliyunContainerService/kube-eventer/sinks/manager.go:77 +0x216
created by github.com/AliyunContainerService/kube-eventer/sinks.NewEventSinkManager
/src/github.com/AliyunContainerService/kube-eventer/sinks/manager.go:73 +0x199
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-eventer
rules:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
  name: kube-eventer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-eventer
subjects:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-eventer
  namespace: kube-system
Applying this reports an error.
Current kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:16:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
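Does the your_cluster_id value in label=[your_cluster_id] of the configuration exist only in Alibaba Cloud environments? Can this project be used with a Kubernetes cluster built in our own private environment?
Is this a typo?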
W0904 05:46:30.011489 1 driver.go:90] Failed to export data to ElasticSearch sink: elastic: Error 400 (Bad Request): Failed to parse mapping [general]: No handler for type [string] declared on field [cluster_name] [type=mapper_parsing_exception]
Chat bots are similar and we should make this case more common.
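After trying it for a while, we found that alerts are also raised when a Deployment is updated or when Pods are scaled; any change in Pod state triggers an alert. Could it distinguish between an expected state change and one that really should raise an alert?
Could alerts be delayed, with Pods whose state has changed placed in an observation queue for further filtering?
Alerts this frequent make it impossible for us to tell whether we need to act on them.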
error: unable to recognize "eventer.yaml": no matches for kind "Deployment" in version "apps/v1beta2"
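Because a unique index is defined when the table is created, writing the same event to MySQL more than once causes a unique-index error.
Honeycomb is the least frequently used sink. Its removal is in progress.
With Elasticsearch 7.2, the log shows no errors, the index is created normally, and the log shows data being reported normally, but the corresponding index in ES contains no data.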
Add Timezone to dingtalk Sink
Due to #66 and #42, we have decided to support aggregating events with the same ref and reason. This is especially useful when a very large deployment cannot be scheduled for reasons such as a lack of resources.
deployment A - >
Pod A
Pod B
Pod C
...
Pod Z
A large number of events would be emitted:
Name: Pod A Reason: Pending Ref: Deploy A Message: insufficient cpu
Name: Pod B Reason: Pending Ref: Deploy A Message: insufficient cpu
Name: Pod C Reason: Pending Ref: Deploy A Message: insufficient cpu
...
Name: Pod Z Reason: Pending Ref: Deploy A Message: insufficient cpu
This proposal would aggregate the events into one batch, like below:
Name: Pod A and other 25 Pods Reason: Pending Ref: Deploy A Message: insufficient cpu
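A minimal sketch of the aggregation described above. It assumes each event is a plain dict with the hypothetical keys name, reason, ref and message; it is not kube-eventer's implementation, only an illustration of grouping by (ref, reason) and keeping the first message of each group:

```python
from collections import OrderedDict


def aggregate_events(events):
    """Group events that share (ref, reason) and summarise each group in one line."""
    groups = OrderedDict()  # preserves the order in which groups first appear
    for ev in events:
        key = (ev["ref"], ev["reason"])
        group = groups.setdefault(key, {"names": [], "message": ev["message"]})
        group["names"].append(ev["name"])

    lines = []
    for (ref, reason), group in groups.items():
        names = group["names"]
        if len(names) == 1:
            name = names[0]
        else:
            name = f"{names[0]} and other {len(names) - 1} Pods"
        lines.append(
            f"Name: {name} Reason: {reason} Ref: {ref} Message: {group['message']}"
        )
    return lines


# 26 pending Pods (Pod A .. Pod Z) that all belong to the same deployment.
events = [
    {"name": f"Pod {c}", "reason": "Pending", "ref": "Deploy A",
     "message": "insufficient cpu"}
    for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
]
summary = aggregate_events(events)
```

With the 26 events above, summary holds a single line naming Pod A and the 25 other Pods.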
We currently use Elasticsearch 6. After configuring the sink, sending events fails with an error saying that string is not supported.
W0813 03:37:00.073560 1 driver.go:90] Failed to export data to ElasticSearch sink: elastic: Error 400 (Bad Request): Failed to parse mapping [general]: No handler for type [string] declared on field [cluster_name] [type=mapper_parsing_exception]
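The error suggests that the mapping still declares fields with the legacy string type, which Elasticsearch replaced with text and keyword in 5.0 and no longer accepts in 6.x. The sketch below shows one way such a mapping could be rewritten before it is sent. The example mapping is hypothetical (only the general type and the cluster_name field come from the log), and the sink's real mapping may differ:

```python
def upgrade_string_fields(node):
    """Recursively replace legacy {"type": "string"} field definitions.

    not_analyzed strings become keyword, everything else becomes text.
    """
    if isinstance(node, dict):
        if node.get("type") == "string":
            legacy_index = node.get("index")
            upgraded = {
                k: upgrade_string_fields(v) for k, v in node.items() if k != "index"
            }
            upgraded["type"] = "keyword" if legacy_index == "not_analyzed" else "text"
            if legacy_index == "no":
                upgraded["index"] = False  # legacy "no" meant "do not index"
            return upgraded
        return {k: upgrade_string_fields(v) for k, v in node.items()}
    if isinstance(node, list):
        return [upgrade_string_fields(v) for v in node]
    return node


# Hypothetical legacy mapping, shaped like the one the error message implies.
legacy_mapping = {
    "general": {
        "properties": {
            "cluster_name": {"type": "string", "index": "not_analyzed"},
            "Message": {"type": "string"},
        }
    }
}
new_mapping = upgrade_string_fields(legacy_mapping)
```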
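As the title says: with a multi-replica deployment, will several instances all be watching, causing duplicate event pushes?
In my tests, no messages are pushed when level=Warning; alerts only work when level=Normal.
Environment: local self-built cluster
Format: --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=xxxx&label=online-cluster&level=Warning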
no kind "Deployment" is registered for version "apps/v1beta2"
no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
no kind "ClusterRoleBinding" is registered for version "rbac.authorization.k8s.io/v1"
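Deploying according to the project README fails with the errors above. Cluster version: 1.12.6-aliyun.1
When specifying the parameters of the mysql sink, a format like the following should be used: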
--sink=mysql:?root:123456@tcp(172.22.2.192:3306)/kube_event?charset=utf8
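The documentation should also state that users must create a kube_event database in advance, and create a kube_event table inside it; otherwise an error is reported.
Background
1. We need alerts routed by namespace, for example alerts from the pro namespace go to one designated group and alerts from the uat namespace go to another.
2. Can custom alert conditions be supported, for example sending only ErrImagePull errors to a designated group?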
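The two requests above amount to a routing table keyed on namespace and reason. A minimal sketch, assuming events are dicts with namespace and reason keys; the route structure and the example.invalid webhook URLs are hypothetical and not part of kube-eventer:

```python
def match_routes(event, routes):
    """Return the webhooks of every route whose filters accept the event.

    A route without a "namespaces" or "reasons" entry does not filter on it.
    """
    targets = []
    for route in routes:
        namespaces = route.get("namespaces")
        reasons = route.get("reasons")
        if namespaces and event["namespace"] not in namespaces:
            continue
        if reasons and event["reason"] not in reasons:
            continue
        targets.append(route["webhook"])
    return targets


# Hypothetical routing table: one group per namespace, plus a reason-only group.
ROUTES = [
    {"namespaces": {"pro"}, "webhook": "https://example.invalid/pro-group"},
    {"namespaces": {"uat"}, "webhook": "https://example.invalid/uat-group"},
    {"reasons": {"ErrImagePull"}, "webhook": "https://example.invalid/image-pull-group"},
]
```

An event may match several routes, so a pro-namespace ErrImagePull event would go to both the pro group and the image-pull group in this sketch.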
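Where can your_cluster_id be looked up?
Hello, I suggest adding an alert frequency setting. Otherwise, when a pod in the cluster has a problem, DingTalk receives alerts continuously whether the level is set to Warning or Normal, whereas the customer wants to receive only one alert per situation.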
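One common way to get "one alert per situation" is to suppress repeats of the same event within a time window. The sketch below assumes events are dicts and treats (namespace, name, reason) as the identity of a situation; the class name, the key choice and the window length are all assumptions rather than existing kube-eventer options:

```python
import time


class AlertSuppressor:
    """Allow one alert per (namespace, name, reason) within a time window."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock      # injectable so the behaviour can be tested
        self.last_sent = {}     # key -> time the last alert was allowed

    def should_send(self, event):
        key = (event["namespace"], event["name"], event["reason"])
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.window:
            return False        # same situation, still inside the window
        self.last_sent[key] = now
        return True
```

A sink wrapper could call should_send before pushing to DingTalk and drop the event when it returns False.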
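Some Warning alerts are confirmed not to affect actual use and cannot be eliminated for some time, so a Reason-based alert filtering rule is needed.
Could support be added for pushing messages to Slack, Mattermost, and Rocket? At the moment events cannot be pushed directly to a channel in tools such as Slack, Mattermost, or Rocket. I hope Mattermost support can be added; if possible, adding the other two would be even better.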
When sending to DingTalk in signed mode, an error occurs:
I1220 16:46:20.997661 1 eventer.go:67] /kube-eventer --source=kubernetes:https://kubernetes.default --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx&timestamp=1576831364338&sign=qBQH5DbKEs8NMyrpNZ%!B(MISSING)Id6Tp%!F(MISSING)MW4eYPAkWWS8O2RZSc%!D(MISSING)&label=c453bfb4987314ca9b31e5ed3c188aa2a&level=Normal
After the DingTalk robot's security settings are enabled, no messages are received.
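As far as DingTalk's custom-robot documentation describes it, the signature is an HMAC-SHA256 over "timestamp\nsecret", keyed with the robot's secret, then Base64-encoded and URL-encoded, and the timestamp must be close to the time of the request. If that is right, a timestamp and sign fixed in --sink would stop being accepted after a while, and the signature would have to be recomputed for each push. The %!B(MISSING) fragments in the log look like Go's fmt treating %2B-style escapes as format verbs, so the logged URL may not be the one actually used. A sketch of the computation, where the function name and the placeholder secret are hypothetical:

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def signed_webhook_url(webhook_url, secret, timestamp_ms=None):
    """Append timestamp and sign parameters to a DingTalk robot webhook URL.

    webhook_url is assumed to already contain ?access_token=...
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # current time in milliseconds
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Base64 output may contain +, / and =, so it has to be URL-encoded.
    sign = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"


url = signed_webhook_url(
    "https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx",
    "SEC-placeholder-secret",       # hypothetical secret
    timestamp_ms=1576831364338,     # timestamp taken from the log above
)
```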
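Currently events can only be stored in the kube_event table. We would like events from different clusters to be stored in different tables within the same database.
It would be good to support custom webhook robots.
Could DingTalk avoid pushing the same Warning message repeatedly?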
When using Elasticsearch as the event receiver, if use_namespace is not set in the sink parameter, the following error is displayed:
W1010 03:22:00.006045 1 driver.go:94] Failed to export data to ElasticSearch sink: elastic: Error 400 (Bad Request): Invalid index name [heapster-2019-10-10 07:21:55 +0000 UTC], must not contain the following characters [ , ", *, \, <, |, ,, >, /, ?] [type=invalid_index_name_exception]
So, if this parameter must be specified, please add it to the ES sink documentation. @rralcala @ringtail
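The rejected index name embeds a full timestamp ("2019-10-10 07:21:55 +0000 UTC"), whose spaces are among the characters listed in the error. A small sketch of building a date-only index name and checking it against that character list; the function names and the prefix-YYYY.MM.DD format are assumptions, not necessarily what the sink produces when use_namespace is set:

```python
from datetime import datetime, timezone

# Characters listed as forbidden in the error message above.
INVALID_INDEX_CHARS = set(' "*\\<|,>/?')


def daily_index_name(prefix, ts):
    """Build a per-day index name; ts is assumed to be timezone-aware."""
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"


def is_valid_index_name(name):
    """Check a name against the forbidden characters from the error message."""
    return not any(ch in INVALID_INDEX_CHARS for ch in name)


ts = datetime(2019, 10, 10, 7, 21, 55, tzinfo=timezone.utc)
good = daily_index_name("heapster", ts)
bad = "heapster-2019-10-10 07:21:55 +0000 UTC"
```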
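When debugging locally, how can --source= be configured to use a local kubeconfig?
Isn't it supposed to be auto-generated?
I only want event information sent to a custom URL once every 30 seconds; the URL will handle it from there.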