zabbix-nvidia-smi-multi-gpu's People

Contributors

fz883, henrofall, ixcat, jason-m, plambe, stackkorora, t0d0r, tqre-mc

zabbix-nvidia-smi-multi-gpu's Issues

Unable to discover when swapped to Agent Active

I added the template and the required scripts/config, then changed the type of all the prototypes to Zabbix agent (active). I am now getting this error in the Zabbix UI:

Invalid discovery rule value: cannot parse as a valid JSON object: invalid object format, expected opening character '{' or '[' at: 'The syntax of the command is incorrect. C:\windows\system32><!DOCTYPE html>'

Could be me just being silly though, who knows.

Perl one-liner instead of get_gpus_info.sh

Hi, this is great work, and I hope we can make it easier to use.
In my environment, I can replace the get_gpus_info.sh script with the following Perl one-liner:

nvidia-smi -L | perl -le 'while(<>){push @a,qq|{"{#GPUINDEX}":"$1", "{#GPUUUID}":"$2"}| if(/GPU (\d+).*UUID\: (.*)\)$/);} print qq|{"data":[\n| . join(",\n", @a) . qq|\n]}|;'

So, we can replace the UserParameter:

UserParameter=gpu.discovery,nvidia-smi -L | perl -le 'while(<>){push @a,qq|{"{#GPUINDEX}":"$1", "{#GPUUUID}":"$2"}| if(/GPU (\d+).*UUID\: (.*)\)$/);} print qq|{"data":[\n| . join(",\n", @a) . qq|\n]}|;'

Note that the PATH environment variable must be set so that the nvidia-smi and perl commands can be found.
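For hosts without Perl, the same discovery JSON can probably be produced with POSIX sed alone. The following is a minimal sketch, not the project's script: the function name gpu_discovery_json is hypothetical, and it assumes nvidia-smi -L prints lines of the form "GPU <index>: <name> (UUID: <uuid>)" as shown elsewhere on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): reads `nvidia-smi -L`
# output on stdin and prints Zabbix low-level discovery JSON.
# Assumes each GPU line looks like: GPU 0: <name> (UUID: GPU-...)
gpu_discovery_json() {
    printf '{"data":[\n'
    # First sed: keep only GPU lines and rewrite them as JSON objects.
    # Second sed: append a comma to every line except the last one.
    sed -n 's/^GPU \([0-9][0-9]*\):.*(UUID: \(.*\))$/{"{#GPUINDEX}":"\1", "{#GPUUUID}":"\2"}/p' |
        sed '$!s/$/,/'
    printf ']}\n'
}
```

It would be used as: nvidia-smi -L | gpu_discovery_json. Lines that do not match the pattern are silently skipped, so an unexpected output format yields an empty "data" array rather than invalid JSON.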

Cannot collect info of GPU resource on VM

I want to monitor GPU resources on a VM using Zabbix. I followed the instructions in the README, but on the VM I get the error shown below. On bare metal, it works.
If you know a solution, please let me know.

■ Environment
Host OS: vSphere ESXi 7.0U3
GPU: A40
Guest OS: Windows 10 Pro
GPU profile (Guest OS): NVIDIA GRID vGPU nvidia_a40-8q
GPU driver (Guest OS & Host OS): 510.47.03

Screenshot 2022-09-26 13 14 39

GPU Total Memory KMB

Hi,
I need help understanding and fixing this.
When I run this command, I see the value in MB: "sudo /usr/bin/nvidia-smi -L"

image

In Zabbix, the GPU memory information appears as KMB. How do I make it display in megabytes?

image

image

image

thanks a lot
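A likely explanation, offered as an assumption rather than a confirmed diagnosis: Zabbix adds K/M/G prefixes to values that carry a unit, so an item that returns MiB with the unit "MB" is shown as "KMB" once the value reaches 1000. One possible workaround is to return bytes and set the item unit to B, so that Zabbix does the scaling itself. The sketch below only shows the conversion; the function name mib_to_bytes is hypothetical, and the template's actual item units are not shown on this page.

```shell
#!/bin/sh
# Hypothetical helper: convert a memory value in MiB to bytes.
# The MiB value could come from, for example:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits -i 0
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}
```

The same effect can be had without a script by adding a custom multiplier of 1048576 in the item's preprocessing and changing its unit to B. Prefixing the unit with an exclamation mark to suppress the conversion may also work on recent Zabbix versions, but check this against the documentation for your release.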

get_gpus_info.sh not found

Hi,
I need help understanding and fixing this.

I followed the instructions, installed the script in /etc/zabbix/scripts/, and added /etc/zabbix/zabbix_agentd.d/userparameter_nvidia-smi.conf

I also imported the template and assigned it to the server. I tried running the command manually to make sure I hadn't missed anything.

image

image

image

Can you help me? Thanks.

UserParameter=gpu.number only discovers GeForce Cards

This line in the Linux config file:

UserParameter=gpu.number,/usr/bin/nvidia-smi -L | /bin/grep GeForce | /usr/bin/wc -l
makes Zabbix detect only GeForce cards; Tesla cards, for example, are ignored.

Why grep anyway?

[root@host~]# /usr/bin/nvidia-smi -L
GPU 0: Tesla V100-PCIE-16GB (UUID: GPU-yyyyyyyyy-xxxx-yyyyy-xxxx-yxyxyxyxyxyxy)
GPU 1: Tesla V100-PCIE-16GB (UUID: GPU-xxxxxxxxx-yyyy-xxxx-yyyy-xxxxyyyyyxxx)

It should be possible to grep for "GPU" and count those lines.
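A minimal sketch of that idea, assuming every device line printed by nvidia-smi -L starts with "GPU <index>:" as in the output above. The function name count_gpus is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count GPUs of any model from `nvidia-smi -L`
# output read on stdin, by counting lines that start with "GPU <digit>".
count_gpus() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function's exit status at 0 in that case.
    grep -c '^GPU [0-9]' || true
}
```

Applied to the config line, this might look like UserParameter=gpu.number,/usr/bin/nvidia-smi -L | /bin/grep -c '^GPU [0-9]' (untested against the agent here). Anchoring the pattern at the start of the line avoids counting the "GPU-" prefix inside UUIDs twice.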

FYI:

[root@host~]# nvidia-installer -v

nvidia-installer:  version 396.26  (buildmeister@swio-display-x64-rhel04-19)  Mon Apr 30 18:40:31 PDT 2018

License?

Hi,

many thanks for this useful plugin!
Could you please add a LICENSE to your project to allow contributions and "safe" usage of it?

Thanks! 😄

windows script

Hi, thank you for your work.
Could you make a Windows script for GPU discovery? I tried using awk, but awk for Windows doesn't handle quotes very well.

Number of GPUs unsupported item key.

Hi,
I need help understanding and fixing this.

I followed the instructions, installed the script in /etc/zabbix/scripts/, and added /etc/zabbix/zabbix_agentd.d/userparameter_nvidia-smi.conf

I also imported the template and assigned it to the server. I tried running the command manually to make sure I hadn't missed anything.

The problem is that information is missing in Zabbix, and there seems to be a problem with calculating the number of GPUs.

image

This is the result when I run the commands locally.

root@hostname:/etc/zabbix# sudo -u zabbix zabbix_agentd -t gpu.number
gpu.number                                    [t|9]
root@hostname:/etc/zabbix# sudo -u zabbix zabbix_agentd -t gpu.discovery
gpu.discovery                                 [t|{
"data":[
{"{#GPUINDEX}":"0", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"1", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"2", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"3", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"4", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"5", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"6", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"7", "{#GPUUUID}":"GPU-UUID"},
{"{#GPUINDEX}":"8", "{#GPUUUID}":"GPU-UUID"}
]
}]

Script Bash

Heyo,

I am having an issue with the bash script you provide.
It says: get_gpus_info.sh: 23: get_gpus_info.sh: Syntax error: redirection unexpected

Can you help me fix it?
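A message in this form usually comes from dash (the default /bin/sh on Debian and Ubuntu) when a script uses a bash-only redirection such as a here-string (<<<) or process substitution (< <(...)). Line 23 of get_gpus_info.sh is not shown on this page, so treat that as an assumption. If it holds, the simplest fix is to run the script with bash rather than sh. Alternatively, the construct can be rewritten portably, as in this sketch (the function name count_lines_portable is hypothetical and only illustrates the pattern):

```shell
#!/bin/sh
# Assumption: the failing line resembles this bash-only form,
#   while read -r line; do ...; done <<< "$text"
# which dash rejects with "redirection unexpected".
# A here-document feeds the loop the same text and works in any POSIX sh.
count_lines_portable() {
    text=$1
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done <<EOF
$text
EOF
    echo "$n"
}
```

To check which case applies, compare "sh get_gpus_info.sh" with "bash get_gpus_info.sh". If only the first fails, the script needs bash, and the UserParameter or the script's first line should invoke it explicitly.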

Decoder/Encoder monitoring is missing in windows zabbix config

UserParameter=gpu.utilization.dec.min[*].....
UserParameter=gpu.utilization.dec.max[*].....
UserParameter=gpu.utilization.enc.min[*].....
UserParameter=gpu.utilization.enc.max[*].....

are missing for Windows hosts, so the corresponding items show up as "unsupported item" in Zabbix.
