grafana / kubernetes-app
A set of dashboards and panels for kubernetes.
Home Page: https://grafana.com/plugins/grafana-kubernetes-app
License: Apache License 2.0
Hello.
Is it possible to run Prometheus outside the node?
Hi, I just added the kubernetes app to my Grafana v5. I already added my prometheus data source which receives metrics from kube-metrics and from the node_exporter.
After configuring all the plugin settings, Grafana displays the following error:
Saved but failed to connect to MY CLUSTERNAME. Error: [object Object]
I guess, my settings are not correct but I am not sure how to get further help because there are no messages in the grafana log.
Some of the dashboard panels do work, for example Cluster Disk Usage.
When I open a Dashboard, the following error shows: Templating init failed Cannot read property 'length' of undefined
I saw the presentation at GrafanaCon and I would really like to use the app but the setup is not completely clear (at least for me).
Hope you can provide some further debugging help.
Best
On the node info dashboard, everything works fine when all nodes are selected, but when I select one specific node, nothing is shown.
Grafana version: v5.2
Query:
sum((avg(irate(node_cpu_seconds_total{instance=~"$node", mode="system"}[5m])) * 100))
Hello,
It would be awesome if one could just helm install it, eh? ;)
Hi. I tried to run this plugin on an existing kube+prometheus+grafana setup and noticed grafana needs additional permissions to complete the setup (specifically: list deployments.apps in the namespace "kube-system").
Would you guys be able to add an RBAC config to this repo? I can't really figure out what permissions this app needs beyond adding them one at a time, every time I get an error.
I have installed Prometheus using the Prometheus Operator. Could you explain how I can use this plugin with that Prometheus? Is the integration even possible?
On the K8s Node Dashboard, the node_load panel is showing the raw node_load value which for a multi-core/processor architecture always gets over the 1,2 threshold defined.
We can modify it by hand but our changes would get overridden by the next update.
Could you use count(node_cpu{mode="system"}) in the query?
Thanks in advance!
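For illustration, a minimal sketch of the per-core normalization requested above. It assumes the "1,2" in the report means a warning threshold of 1 and a critical threshold of 2; the function names are hypothetical and not part of the plugin. In the dashboard itself the equivalent would be dividing the load value by count(node_cpu{mode="system"}), as suggested.

```python
def normalized_load(load: float, cpu_cores: int) -> float:
    """Return the load average per CPU core."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load / cpu_cores


def load_status(load: float, cpu_cores: int,
                warn: float = 1.0, crit: float = 2.0) -> str:
    """Classify the per-core load against the assumed 1 / 2 thresholds."""
    per_core = normalized_load(load, cpu_cores)
    if per_core >= crit:
        return "critical"
    if per_core >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # The same raw load of 6.0 reads very differently on 2, 4 and 8 cores.
    for cores in (2, 4, 8):
        print(cores, load_status(6.0, cores))
```

With the raw value, a load of 6.0 is above both thresholds on any machine; per core, it is only a problem on the smaller nodes.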
What is the recommended Prometheus installation type?
OpenShift 3.7 / 3.9 has integrated Prometheus. Will it be supported?
When I go to Plugins / kubernetes and click the config button, a red error page is shown:
Plugin Error
Fetch error: 404 Not Found Instantiating http://xxx:3000/public/app/plugins/app/grafana-kubernetes-app/module Loading app/plugins/app/grafana-kubernetes-app/module
kubernetes-app install method: grafana-cli plugins install grafana-kubernetes-app
grafana version: 5.2.1
prometheus version: 1.7.0
Hi,
I think the two titles 'Network Traffic(inbound)' and 'Network Traffic(outbound)' are wrong. For example, the panel titled Outbound uses the receive metric:
Title
Network Traffic (Outbound)
Metric
rate(container_network_receive_bytes_total{pod_name=~"$pod", kubernetes_io_hostname=~"$node"}[2m])
Version : 1.0.1
Trying to build locally on mac. Following
$ yarn install --frozen-lockfile
$ yarn run grunt
yarn run v1.12.3
$ /Users/edwardmcfarlane/go/src/github.com/grafana/kubernetes-app/node_modules/.bin/grunt
Running "clean:0" (clean) task
>> 1 path cleaned.
Running "copy:dist_js" (copy) task
Copied 16 files
Running "sass:dist" (sass) task
Running "typescript:build" (typescript) task
>> dist/panels/nodeData/nodeData.ts(27,5): error TS2346: Supplied parameters do not match any signature of call target.
>> dist/panels/podNav/podNav.ts(23,5): error TS2346: Supplied parameters do not match any signature of call target.
Warning: Task "typescript:build" failed. Use --force to continue.
Aborted due to warnings.
error Command failed with exit code 3.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
In the K8s Cluster dashboard, the disk usage panel uses node_filesystem_size{nodename=~"$node"}.
In my Prometheus instance (and I guess in others as well), this metric doesn't have a nodename label.
What am I missing here?
Hi everyone.
Can someone please quickly explain how we can connect this plugin to a Rancher (2.1.1) cluster? Our kubeconfig file looks as follows:
apiVersion: v1
kind: Config
clusters:
- name: "jls"
  cluster:
    server: "https://k8s.company.tld/k8s/clusters/c-ckrdr"
    api-version: v1
users:
- name: "user-rd343"
  user:
    token: "kubeconfig-user-rd343:cxdgf235zgsdg6436g325z45ujsgsdgwegwfwd4qr2"
contexts:
- name: "company"
  context:
    user: "user-rd343"
    cluster: "company"
current-context: "company"
We're perfectly able to use kubectl with this kubeconfig. No problems at all. But the plugin for grafana to monitor our cluster seems very reluctant to integrate.
Any suggestions are highly appreciated.
I use a local Grafana server with a Prometheus instance on k8s as the datasource. I have verified that the Prometheus datasource is correct, and the kubernetes-app Kubernetes datasource I configured shows up too. But no metrics are displayed on any dashboard. For example, on the K8s Cluster dashboard, I found that the node variable is not resolved correctly, although the cluster variable is. I don't understand the query of the node variable: will Prometheus accept the string node as a query parameter? I guess the plugin does some extra work to support this.
So my question is: why doesn't the dashboard work, and why are the defined variables, for example node in the K8s Cluster dashboard, not resolved correctly?
Another question: is there an implementation guide for the kubernetes-app plugin? I can't find any other documentation that explains it in depth, such as how the variables are defined.
Hello, I seemingly have added my first k8s cluster without issue, I have explained the process I used to add the first cluster here: #46 (comment)
My issue is that when I want to add another cluster, my dashboards end up looking like this:
Hello,
I can't find the option for connecting using a token. Is it possible?
I have Grafana and Prometheus installed and working fine. I modified prometheus.yaml to collect metrics data (cluster url). I installed the Kubernetes app plugin in Grafana and am trying to connect to our cluster. I am following the URL, but I do not have authentication details, and I get an error message saying that it could not connect to the cluster. What is wrong with my config?
I have some problems getting Prometheus and the Kubernetes cluster to work together. When I create a proxy to the cluster with kubectl proxy, all the Prometheus metrics disappear from the Kubernetes overview, and if I stop the proxy connection, the Prometheus metrics come back. There is no problem accessing the Prometheus service directly in the browser or on a Prometheus dashboard while the proxy is running. Both data sources have proxy access with TLS client auth; the Prometheus URL is the service IP from kubectl get services, while Kubernetes is accessed from localhost when the proxy is running.
root@grafana:~# grafana-cli plugins install kubernetes-app
Failed to send requesterrorApi returned invalid status: 404 Not Found
Error: ✗ Failed to send request. error: Api returned invalid status: 404 Not Found
NAME:
Grafana cli plugins install - install <plugin id> <plugin version (optional)>
USAGE:
Grafana cli plugins install [arguments...]
What should I configure to solve this issue? Thank you.
Hi,
I have a standalone grafana server with the plugin installed.
I was able to successfully connect to the GKE cluster (the node-exporter and kube-state-metrics were also successfully deployed automatically).
I went ahead and deployed Prometheus in the cluster and updated the scrape config to match the one provided on the add-cluster page.
Though I can see the node list, services, namespaces and pods, none of the metrics are shown. All I get in the dashboard is NA and an error stating 'Datasource named $datasource was not found'.
Please help.
Thanks,
Vikas
When running in the cluster, it would be nice if the kubernetes app would create a kubernetes datasource for the cluster automatically.
Hello,
In "K8s Container" I can select multiple labels for my container, and I can also see that the variable "$pod" is used in the metrics. My question is whether I can use the other labels (selected in "Pod filtering") in my metrics.
Regards
My Prometheus does not have any kube_* metrics.
Which Prometheus should I install?
Or is it the Kubernetes version? I have 1.10.5.
Hi!
I have a kubernetes cluster on which I recently installed prometheus, through prometheus operator, following official prometheus guide.
I connected Grafana Kubernetes App to kubernetes/prometheus and then noticed that the memory and CPU metrics shown on the K8s container dashboard were doubled.
After some debugging, I found the query below for CPU usage:
sum(irate(container_cpu_usage_seconds_total{pod_name=~"my-pod"}[2m])) by (pod_name)
Executing the same query without the aggregation (container_cpu_usage_seconds_total{pod_name=~"my-pod"}), I could see that there is an extra item that represents the sum of all containers in the pod, as in the table below:
So, this explains the doubled metrics.
The extra item with the sum of containers (the 3rd item in the above table) doesn't contain the container_name attribute.
To keep Grafana Kubernetes App from showing doubled values, I edited the queries in the dashboard definition by adding container_name=~".+" to the filter, so the query for cpu_usage now looks like this:
sum(irate(container_cpu_usage_seconds_total{pod_name=~"my-pod", container_name=~".+"}[2m])) by (pod_name)
I'm wondering if there is a better way to handle this. Have you ever seen this scenario before?
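To make the double counting concrete, here is a small sketch that mimics the aggregation in plain Python. The series values are made up for illustration (they are not from the report), and sum_by_pod is a hypothetical helper, not plugin or Prometheus code. It relies on two PromQL facts: regex matchers are fully anchored, and a missing label behaves like an empty string, which is why ".+" excludes the pod-level series.

```python
import re

# Illustrative series: two containers plus the pod-level aggregate that
# carries no container_name label (values are invented for the example).
SERIES = [
    ({"pod_name": "my-pod", "container_name": "app"}, 0.5),
    ({"pod_name": "my-pod", "container_name": "sidecar"}, 0.25),
    ({"pod_name": "my-pod"}, 0.75),
]


def sum_by_pod(series, container_regex=None):
    """Sum values per pod_name, optionally filtering on container_name."""
    totals = {}
    for labels, value in series:
        if container_regex is not None:
            # A missing label is treated as "", so ".+" does not match it.
            name = labels.get("container_name", "")
            if re.fullmatch(container_regex, name) is None:
                continue
        pod = labels["pod_name"]
        totals[pod] = totals.get(pod, 0.0) + value
    return totals


if __name__ == "__main__":
    print(sum_by_pod(SERIES))        # aggregate counted on top of containers
    print(sum_by_pod(SERIES, ".+"))  # only real containers are summed
```

Without the filter the pod total is twice the sum of its containers; with container_name=~".+" the aggregate series drops out.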
Hi all,
Can someone please send the documentation for creating a new cluster? After I click on new cluster and fill in the details I am stuck: I get errors like datasource name already exists, 404 HTTP error, etc.
It would be really helpful if someone has documentation for this. I would also ask the Grafana team to please update it.
I found that with Node-exporter version 1.6, many of the metrics no longer match the ones used in the plugin templates. Will an update fix this?
I used Helm to deploy Grafana, but I cannot get it to connect to the Kubernetes API. I cannot find instructions on how to create a ClusterRole for Grafana.
I've been able to install the kubernetes-app plugin into grafana running on our kubernetes cluster, but I'm not able to get the plugin to work. I think I probably don't have it configured correctly, but I'm not sure how to correct it. In my cluster config I have datasource set to "prometheus", and access set to "Server (Default)". But I'm not sure what to set for the http url or the auth settings. I have grafana and prometheus running as pods on the k8s cluster. (Installed via core-os prometheus-operator kube-prometheus.) So I'm not sure what http url to use: url of the kubernetes system? url of the prometheus service? The docs/help system aren't clear. Would appreciate if someone could help get this issue addressed, as the plugin looks quite nice. Thanks!
We are trying to deploy Grafana with the kubernetes-app plugin.
Installing plugins is easy, but I can't see any simple way to:
a) Enable the plugin
b) Configure the plugin via an API call (i.e. the datasource)
Additionally, as we modify the default dashboards, it should be possible not to import the default ones when the plugin is enabled (maybe an additional option for it).
The node dashboard says memory is in bits; I believe it is bytes,
so we have 8 times less RAM available than what the dashboard leads you to believe.
I have successfully configured the plugin to monitor my GKE Kubernetes cluster. I have Grafana installed using the official stable Helm chart (same for Prometheus).
My problem is that non-admin users get permission denied errors when trying to access the plugin page with the list of clusters, and the same happens for some of the dashboards (the node and container ones). I checked the Chrome console while reproducing this, and it seems to be the call made to /api/datasources that results in a 403 for non-admin users.
Is there anything I can configure to fix this?
Thanks
Hi,
Regarding the "detailed instructions to deploy them manually them with kubectl":
I don't see them.
Thanks!
https://www.dropbox.com/s/1ecgy6foslcczxj/Screenshot%202018-08-24%2009.54.44.png?dl=0
api/datasources/proxy/1/api/v1/query_range?query=sum((avg(irate(node_cpu%7Bnodename%3D~%22ip-10-10-2-217_ap-southeast-2_compute_internal%22%2C%20mode%3D%22iowait%22%7D%5B5m%5D))%20*%20100))&start=1535068162&end=1535068462&step=60"
In the k8s-node dashboard you can select per-node metrics. In the query inspector I find that every . is being rewritten to _, so the hostnames do not exist in Prometheus and the query returns nothing.
Questions: how is this controlled, and why is the character being changed?
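The mismatch described here can be reproduced in a few lines. This is only a sketch: replace_dots is a hypothetical stand-in for the rewriting seen in the query inspector (not the plugin's actual code), and the dotted hostname is inferred from the underscored one in the request above.

```python
def replace_dots(name: str) -> str:
    """Stand-in for the observed rewriting of '.' to '_' in node names."""
    return name.replace(".", "_")


def matching_nodes(selected: str, nodenames):
    """Return the nodename label values that equal the selected value."""
    return [n for n in nodenames if n == selected]


# nodename label values as Prometheus would hold them (assumed from the report).
NODENAMES = ["ip-10-10-2-217.ap-southeast-2.compute.internal"]

if __name__ == "__main__":
    original = NODENAMES[0]
    rewritten = replace_dots(original)
    print(rewritten)                             # value sent in the query
    print(matching_nodes(rewritten, NODENAMES))  # nothing matches
    print(matching_nodes(original, NODENAMES))   # the real name matches
```

Because the rewritten value no longer equals any nodename label, the query returns no series, which matches the empty panels.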
[root@localhost prometheus]# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.5", GitCommit:"51dd616cdd25d6ee22c83a858773b607328a18ec", GitTreeState:"clean", BuildDate:"2019-01-16T18:24:45Z", GoVersion:"go1.10.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.5", GitCommit:"51dd616cdd25d6ee22c83a858773b607328a18ec", GitTreeState:"clean", BuildDate:"2019-01-16T18:14:49Z", GoVersion:"go1.10.7", Compiler:"gc", Platform:"linux/amd64"}
image version:
prom/prometheus:v2.4.3
grafana/grafana:5.3.4
Then I deploy the grafana-kubernetes-app plugin, here is the configmap of prometheus:
[root@localhost prometheus]# cat prometheus-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-kubelet'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:10255'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
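As a reading aid for the __address__ rule in the kubernetes-service-endpoints job, the sketch below replays it with Python's re module. It assumes the usual Prometheus relabeling behaviour: source label values are joined with ";", the regex must match the whole joined string, and a non-matching replace action leaves the target label unchanged. rewrite_address is a hypothetical helper and the addresses are invented.

```python
import re

# Same pattern as the relabel rule: host, optional existing port, then the
# port taken from the prometheus.io/port annotation.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address: str, port_annotation: str) -> str:
    """Return the scrape address the rule would produce."""
    joined = ";".join([address, port_annotation])
    match = ADDRESS_RE.fullmatch(joined)  # relabel regexes are anchored
    if match is None:
        return address  # no match: the target label is left as it was
    return f"{match.group(1)}:{match.group(2)}"  # replacement $1:$2


if __name__ == "__main__":
    print(rewrite_address("10.0.0.5:8080", "9102"))  # port is swapped
    print(rewrite_address("10.0.0.5", "9102"))       # port is appended
    print(rewrite_address("10.0.0.5:8080", ""))      # no annotation, unchanged
```

So a service only gets its scrape port overridden when it carries the port annotation; otherwise the discovered address is kept.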
After completion:
[root@localhost prometheus]# kubectl get pod -n kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-779ffd89bd-rvd42                1/1     Running   0          3h29m
kube-state-metrics-fcdc7d964-qmbfg      1/1     Running   0          87m
kubernetes-dashboard-659798bd99-j6rj2   1/1     Running   0          3h26m
node-exporter-bhvvt                     1/1     Running   0          87m
node-exporter-jrncb                     1/1     Running   0          87m
node-exporter-q8tzh                     1/1     Running   0          87m
node-exporter-r52fg                     1/1     Running   0          87m
[root@localhost prometheus]# kubectl get pod -n monitoring
NAME                          READY   STATUS      RESTARTS   AGE
grafana-c7bf74c8-sbmmp        1/1     Running     0          93m
grafana-chown-rd96q           0/1     Completed   0          99m
prometheus-7c958795dd-6ndfj   1/1     Running     0          106m
When I visit grafana:
Only the k8s-container dashboard has data. I also see that the kube-state-metrics pod logs the following errors:
E0204 09:33:42.538610 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/pod.go:187: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "pods" in API group "" at the cluster scope
E0204 09:33:42.738156 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:78: Failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
E0204 09:33:42.923342 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/daemonset.go:82: Failed to list *v1beta1.DaemonSet: daemonsets.extensions is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "daemonsets" in API group "extensions" at the cluster scope
E0204 09:33:42.924153 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/cronjob.go:93: Failed to list *v2alpha1.CronJob: cronjobs.batch is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "cronjobs" in API group "batch" at the cluster scope
E0204 09:33:42.925186 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/replicaset.go:87: Failed to list *v1beta1.ReplicaSet: replicasets.extensions is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "replicasets" in API group "extensions" at the cluster scope
E0204 09:33:42.937971 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/namespace.go:80: Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "namespaces" in API group "" at the cluster scope
E0204 09:33:43.138574 1 reflector.go:205] k8s.io/kube-state-metrics/collectors/node.go:142: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "nodes" in API group "" at the cluster scope
Can anyone help me?
Thanks.
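The "forbidden" lines above name exactly which resources the service account may not list, so they can be turned into a draft RBAC rule set. The sketch below does that; it is an assumption-laden starting point, not an official manifest. The role name kube-state-metrics-read is hypothetical, only the verbs seen in the log are emitted (kube-state-metrics normally also needs watch), and a ClusterRoleBinding to the service account is still required separately.

```python
import json
import re

# Matches the "forbidden" messages as they appear in the log above.
FORBIDDEN_RE = re.compile(
    r'cannot (\w+) resource "([^"]+)" in API group "([^"]*)"'
)


def rules_from_log(lines):
    """Group denied verbs and resources by API group into RBAC rules."""
    grouped = {}
    for line in lines:
        match = FORBIDDEN_RE.search(line)
        if match is None:
            continue
        verb, resource, api_group = match.groups()
        entry = grouped.setdefault(api_group, {"resources": set(), "verbs": set()})
        entry["resources"].add(resource)
        entry["verbs"].add(verb)
    return [
        {
            "apiGroups": [group],
            "resources": sorted(entry["resources"]),
            "verbs": sorted(entry["verbs"]),
        }
        for group, entry in sorted(grouped.items())
    ]


def cluster_role(name, rules):
    """Wrap the rules in a ClusterRole object (JSON is accepted by kubectl)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": rules,
    }


# Shortened copies of three of the log lines above.
SAMPLE_LOG = [
    'pods is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "pods" in API group "" at the cluster scope',
    'daemonsets.extensions is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "daemonsets" in API group "extensions" at the cluster scope',
    'nodes is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "nodes" in API group "" at the cluster scope',
]

if __name__ == "__main__":
    role = cluster_role("kube-state-metrics-read", rules_from_log(SAMPLE_LOG))
    print(json.dumps(role, indent=2))
```

Feeding the full log through rules_from_log gives one rule per API group, which avoids adding permissions one error at a time.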
After some debugging on why the plugin was working on our dev cluster but was not working for our staging one, I've found that it requires legacy authorization to be enabled on the cluster.
See https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control for more details on the legacy authorization model.
On GKE this can be changed here:
It would be much better if the plugin could support token authentication (like mentioned in one of the issues below), or allow to use service account directly (in case it was deployed to the same cluster it's going to connect).
Currently I've just removed everything related to the Kubernetes Datasource provided by this plugin and changed the dashboards accordingly, so basically the plugin is not used anymore.
Most of the variables can be replaced with a Prometheus metric:
$node can be label_values(kube_node_info, node)
$namespace can be label_values(kube_namespace_created, namespace)
Both metrics are from kube-state-metrics.
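To check outside Grafana what such a variable would resolve to, the values can be read from the Prometheus HTTP API's series endpoint (/api/v1/series with a match[] selector). The sketch below only builds the URL and extracts one label from an already-decoded response; the base URL and node names are made up, and both helpers are hypothetical illustrations rather than what Grafana does internally.

```python
from urllib.parse import urlencode


def series_url(base_url: str, metric: str) -> str:
    """Build the series query URL for a single metric selector."""
    query = urlencode({"match[]": metric})
    return f"{base_url.rstrip('/')}/api/v1/series?{query}"


def label_values(series_response: dict, label: str):
    """Collect the distinct values of one label from a series response."""
    return sorted(
        {series[label] for series in series_response.get("data", []) if label in series}
    )


# Example response shape; the node names are invented for illustration.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": [
        {"__name__": "kube_node_info", "node": "node-b"},
        {"__name__": "kube_node_info", "node": "node-a"},
    ],
}

if __name__ == "__main__":
    print(series_url("http://prometheus:9090", "kube_node_info"))
    print(label_values(SAMPLE_RESPONSE, "node"))
```

If the list comes back empty, kube-state-metrics is probably not being scraped, and the variables above would be empty in Grafana as well.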
There are other issues that are probably related to this one:
#12
#22
#34
#41
#47
It appears that in addition to the Node Exporter and Kube State Metrics, a 3rd component (prometheus scraper) must be manually added by the user in order for this to function.
A user must manually do the following for this to work:
Without these steps, almost no metrics will work. These requirements are missing from the readme.
The Deploy button will deploy the following:
(1) A Prometheus configmap which contains the Prometheus jobs that collect metrics used by the dashboards in the kubernetes app
- Incorrect, the Grafana Kubernetes app does not do this ^^
(2) a Node Exporter deployment, and
(3) a Kube-State Metrics deployment
Unless I am missing something else possibly?
It would be so much simpler to install if the app were to use token auth not just private key auth because then people could create service accounts and use the app with a proper RBAC setup.
As it stands, the way you create TLS certs you have to use openssl and have access to the CA private key, at least with token you can use kubectl.
Hello.
I have Prometheus and Grafana running in my GKE (Google Kubernetes Engine) cluster, and when I try to test your app it can't connect to the cluster and shows a Bad Request or Bad Gateway error.
Have tried for hours and can't find out how to connect my cluster to this app.
I would really appreciate your help.
In the node dashboard, when selecting a node in the "filter by node" panel, the metrics are not shown.
The reason is the "slugify" of . to _, thus Prometheus doesn't find the nodename:
On AWS, the nodes are named like ip-172-20-xxx-xx.ec2.internal, so it results in a wrong query.
What is the reason for replacing . with _?
Can we easily remove it?
When selecting a node using the 'Kubernetes Node Info' panel, it appears that you use slugify to replace periods with underscores when a single node is selected, and this propagates its way back to the $node variable in some way.
Unfortunately this breaks all of the default panels for us, because they rely on $node to match nodename.
Hello, following the setup instructions, I am not able to connect our Grafana to the AWS EKS cluster. The following message is given:
I tried to deploy the cluster node exporters manually with the given JSON, but I am still not able to connect Grafana to EKS. Are there some RBAC files that I should also deploy?
Thanks in advance.
Grafana 5.1.3.
When going to the /datasources page, the icon for the datasource, http://HOSTNAME/public/plugins/grafana-kubernetes-datasource/img/logo.svg, always returns 404.
Grafana is running as a pod within K8s.
The logo is present in the repo, but the app seems to point at a wrong path.
I already have a full prometheus stack deployed in my cluster. I want to use these dashboards on my existing grafana server (outside of K8s). I have exposed my prometheus-server service using a LoadBalancer @ port 9090 and have added that datasource to my grafana. I've got about 80 of the panels working.
It looks like pod filtering doesn't work when you have many pods in the cluster and, by default, filtering tries to request everything. As it is done via a GET request, it cannot be handled well and fails with errors like:
Annotation Query Failed
{"data":null,"status":-1,"config":{"method":"GET","transformRequest":[null],"transformResponse":[null],"jsonpCallbackParam":"callback","url":"api/annotations","params":{"from":1526539366174,"to":1526541166174,"dashboardId":19},"retry":0,"headers":{"X-Grafana-Org-Id":1,"Accept":"application/json, text/plain, */*"}},"statusText":"","xhrStatus":"abort"}
Unexpected error
To reproduce: create a cluster with a few hundred pods and open the 'K8s container' dashboard.
Workaround: limit the scope of pods by selecting a small time range, a specific application and node, and overwrite this dashboard.
However, as this dashboard comes with the plugin, it will be reverted after the next redeployment or plugin update, so it would still be nice to have a better fix.
Prometheus Operator is a great way to deploy Prometheus. However, it generates scrape configs in a way that is incompatible with this plugin.