Comments (6)
OK, then this is not something that can be solved inside the exporter; it is the behavior of the nvidia-smi tool itself. My recommendations are the following:
- Do a clean reinstall of the Nvidia driver
- Try older/newer driver versions
- Try the game-ready driver
- Investigate the power options/NVIDIA control panel options that might cause such behavior
Please let me know if you can pinpoint it, so I can add it to the documentation.
I will close this ticket, since it seems nothing can be done at the exporter level. If you find something that indicates otherwise, feel free to report it so we can re-open.
Thanks.
from nvidia_gpu_exporter.
This is pretty difficult for me to debug. Can you disable metric scraping in Prometheus, put your monitor to sleep, and then make a single request from another machine (your cellphone or a laptop) to http://<YOUR_PC_IP>:9835/metrics and see if it wakes your monitor up?
This way we can be sure that the scraping operation itself actually wakes it up and it is not caused by something else.
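If a browser is not handy on the second machine, the single request described above could be sent with a small script instead. This is only a sketch: the function names `metrics_url` and `fetch_metrics_once` are made up for illustration, and the only details taken from this thread are port 9835 and the `/metrics` path.

```python
import sys
import urllib.request


def metrics_url(host: str, port: int = 9835) -> str:
    """Build the exporter's scrape URL for the given host (hypothetical helper)."""
    return f"http://{host}:{port}/metrics"


def fetch_metrics_once(host: str, port: int = 9835, timeout: float = 10.0) -> tuple[int, int]:
    """Send exactly one GET request and return (HTTP status, body size in bytes)."""
    with urllib.request.urlopen(metrics_url(host, port), timeout=timeout) as resp:
        body = resp.read()
        return resp.status, len(body)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python single_scrape.py <YOUR_PC_IP>
    status, size = fetch_metrics_once(sys.argv[1])
    print(f"status={status} bytes={size}")
```

Because the script sends one request and exits, any wake-up seen right after running it can be attributed to that single scrape rather than to periodic Prometheus polling.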
And a question for clarification: Is it your whole computer waking up from the sleep, or is it just the monitor?
If it is the whole computer, then the issue might be that your Ethernet card is configured to wake the computer from sleep when it receives a network packet. To disable this behavior, go to Device Manager, open your Ethernet adapter's properties, and go to the Power Management tab.
Then either disallow the device from waking the computer, or make sure that "Only allow a magic packet to wake the computer" is checked.
from nvidia_gpu_exporter.
> This is pretty difficult for me to debug. Can you disable metric scraping in Prometheus, put your monitor to sleep, and then make a single request from another machine (your cellphone or a laptop) to http://<YOUR_PC_IP>:9835/metrics and see if it wakes your monitor up?
Thanks, I will try and let you know.
P.S. I don't put the whole PC to sleep, just the monitor.
from nvidia_gpu_exporter.
I checked, and yes, scraping wakes up the monitor. I stopped Prometheus and opened http://<YOUR_PC_IP>:9835/metrics from my phone, and the monitor woke up. It instantly turns on and off again every time I refresh the metrics page.
from nvidia_gpu_exporter.
Thanks, this is helpful. Now I'll ask you to try one more thing, to find out whether the wake-up is caused by the exporter code or by querying the GPU itself.
Can you please do the following:
- Open a PowerShell prompt.
- Run the following command:
Start-Sleep -Seconds 30; nvidia-smi --query-gpu="timestamp,driver_version,count,name,serial,uuid,pci.bus_id,pci.domain,pci.bus,pci.device,pci.device_id,pci.sub_device_id,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current,pcie.link.width.max,index,display_mode,display_active,persistence_mode,accounting.mode,accounting.buffer_size,driver_model.current,driver_model.pending,vbios_version,inforom.img,inforom.oem,inforom.ecc,inforom.pwr,gom.current,gom.pending,fan.speed,pstate,clocks_throttle_reasons.supported,clocks_throttle_reasons.active,clocks_throttle_reasons.gpu_idle,clocks_throttle_reasons.applications_clocks_setting,clocks_throttle_reasons.sw_power_cap,clocks_throttle_reasons.hw_slowdown,clocks_throttle_reasons.hw_thermal_slowdown,clocks_throttle_reasons.hw_power_brake_slowdown,clocks_throttle_reasons.sw_thermal_slowdown,clocks_throttle_reasons.sync_boost,memory.total,memory.used,memory.free,compute_mode,utilization.gpu,utilization.memory,encoder.stats.sessionCount,encoder.stats.averageFps,encoder.stats.averageLatency,ecc.mode.current,ecc.mode.pending,ecc.errors.corrected.volatile.device_memory,ecc.errors.corrected.volatile.dram,ecc.errors.corrected.volatile.register_file,ecc.errors.corrected.volatile.l1_cache,ecc.errors.corrected.volatile.l2_cache,ecc.errors.corrected.volatile.texture_memory,ecc.errors.corrected.volatile.cbu,ecc.errors.corrected.volatile.sram,ecc.errors.corrected.volatile.total,ecc.errors.corrected.aggregate.device_memory,ecc.errors.corrected.aggregate.dram,ecc.errors.corrected.aggregate.register_file,ecc.errors.corrected.aggregate.l1_cache,ecc.errors.corrected.aggregate.l2_cache,ecc.errors.corrected.aggregate.texture_memory,ecc.errors.corrected.aggregate.cbu,ecc.errors.corrected.aggregate.sram,ecc.errors.corrected.aggregate.total,ecc.errors.uncorrected.volatile.device_memory,ecc.errors.uncorrected.volatile.dram,ecc.errors.uncorrected.volatile.register_file,ecc.errors.uncorrected.volatile.l1_cache,ecc.errors.uncorrected.volatile.l2_cache,ecc.errors.uncorrected.volatile.texture_memory,ecc.errors.uncorrected.volatile.cbu,ecc.errors.uncorrected.volatile.sram,ecc.errors.uncorrected.volatile.total,ecc.errors.uncorrected.aggregate.device_memory,ecc.errors.uncorrected.aggregate.dram,ecc.errors.uncorrected.aggregate.register_file,ecc.errors.uncorrected.aggregate.l1_cache,ecc.errors.uncorrected.aggregate.l2_cache,ecc.errors.uncorrected.aggregate.texture_memory,ecc.errors.uncorrected.aggregate.cbu,ecc.errors.uncorrected.aggregate.sram,ecc.errors.uncorrected.aggregate.total,retired_pages.single_bit_ecc.count,retired_pages.double_bit.count,retired_pages.pending,temperature.gpu,temperature.memory,power.management,power.draw,power.limit,enforced.power.limit,power.default_limit,power.min_limit,power.max_limit,clocks.current.graphics,clocks.current.sm,clocks.current.memory,clocks.current.video,clocks.applications.graphics,clocks.applications.memory,clocks.default_applications.graphics,clocks.default_applications.memory,clocks.max.graphics,clocks.max.sm,clocks.max.memory,mig.mode.current,mig.mode.pending" --format=csv
- Immediately put your monitor to sleep. The command you ran will wait for 30 seconds, then run the exact command the exporter runs when its scrape endpoint is called.
- Observe whether the monitor wakes up after 30 seconds, when the nvidia-smi command runs.
Also, please check the PowerShell output afterwards to make sure that the command actually ran in the background.
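The same wait-then-query test could also be scripted, which makes it easier to confirm afterwards that the command really ran. The sketch below is an illustration under assumptions: `build_command` and `delayed_query` are hypothetical names, and `FIELDS` is a small subset picked from the full command above, not the exporter's complete list. Only the `--query-gpu` and `--format=csv` flags come from that command.

```python
import subprocess
import sys
import time
from datetime import datetime

# Illustrative subset of the fields in the full command above (an assumption,
# not the exporter's complete field list).
FIELDS = ["timestamp", "name", "pstate", "temperature.gpu", "utilization.gpu"]


def build_command(fields: list[str]) -> list[str]:
    """Assemble the nvidia-smi invocation for the given query fields."""
    return ["nvidia-smi", f"--query-gpu={','.join(fields)}", "--format=csv"]


def delayed_query(fields: list[str], delay_seconds: float = 30.0) -> int:
    """Wait, run nvidia-smi once, and print when it ran plus its output."""
    time.sleep(delay_seconds)  # time to put the monitor to sleep
    started = datetime.now().isoformat(timespec="seconds")
    result = subprocess.run(build_command(fields), capture_output=True, text=True)
    print(f"[{started}] nvidia-smi exit code: {result.returncode}")
    print(result.stdout or result.stderr)
    return result.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python delayed_query.py 30
    delayed_query(FIELDS, float(sys.argv[1]))
```

The printed timestamp shows when the query actually executed, so it can be compared with the moment the monitor woke up. Running it with different subsets of `FIELDS` might also help narrow down whether a particular query field is involved, though that is untested here.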
from nvidia_gpu_exporter.
Yes, the monitor wakes up from this command.
from nvidia_gpu_exporter.