
Comments (13)

trexminer commented on August 23, 2024

We've been chasing an issue where the miner stops hashing after a dev fee session, and the log you provided indicates it might be the same issue. In this case, however, the watchdog correctly did its job and restarted the miner, but we would like to fix the root cause. Would you be willing to help us with the investigation? If so, we'll prepare a build that produces extra debugging info; if you could run it and then send us the log file, that would be much appreciated. How long does it usually take for the problem to show itself? Which CUDA version do you use?
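For reference, the driver and CUDA versions a rig is running can be read straight from the NVIDIA tools; a minimal check, assuming nvidia-smi is on the PATH and the CUDA toolkit is (optionally) installed:

nvidia-smi --query-gpu=driver_version,name --format=csv   # driver version per GPU
nvcc --version                                            # CUDA toolkit version, if the toolkit is installed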

CBDMINER commented on August 23, 2024

I have the same issue as outlined above. Is there any new information on how to resolve the error?

Jerkysan commented on August 23, 2024

I found out that if the time changes on your computer, this error will occur. Just an FYI.
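If clock jumps really are the trigger, it may help to confirm whether the system time is being stepped while the miner runs; a quick check on a systemd-based Linux rig (an assumption, adjust for other setups):

timedatectl status                                                    # shows NTP sync state
journalctl --since "6 hours ago" | grep -i "time has been changed"    # logged clock steps, if journald records them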

trexminer commented on August 23, 2024

There should be a message saying which GPU is idle prior to that. Please upgrade to 0.11.0 and send me the full log if the issue occurs again.

aleqx commented on August 23, 2024

That was the first error/warning message in the session. Regardless, I don't see a reason why you wouldn't display the GPU# there (or does the miner not know it?).

trexminer commented on August 23, 2024

If there was no message with a GPU#, as you've just said, then yes, the miner doesn't know which GPU caused the problem. The behaviour you're describing is not expected and appears to be a bug that needs investigation. If you start the miner with the --log-path trex.log -P parameters, it'll create a detailed log file which will help troubleshoot the issue.
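As a sketch of what such an invocation could look like (the algorithm is taken from the BCD log below; pool, wallet, and password are placeholders):

./t-rex -a bcd -o stratum+tcp://POOL_HOST:PORT -u YOUR_WALLET -p x --log-path trex.log -P   # writes a detailed log to trex.log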

aleqx commented on August 23, 2024

An example from mining BCD follows. There were no Xid errors reported by the driver in kern.log either. It seems odd that the miner can't identify which card; it's definitely not all 9 cards, as T-Rex seems to suggest. It's the only miner with this problem out of all the miners I have used in the past 2+ years.

[...]
20190609 13:33:52 GPU #0: Gigabyte GTX 1080 Ti - 38.31 MH/s
20190609 13:33:52 GPU #1: ASUS GTX 1070        - 22.72 MH/s
20190609 13:33:52 GPU #2: EVGA GTX 1070        - 21.59 MH/s
20190609 13:33:52 GPU #3: ASUS GTX 1070        - 22.95 MH/s
20190609 13:33:52 GPU #4: EVGA GTX 1080 Ti     - 38.42 MH/s
20190609 13:33:52 GPU #5: MSI GTX 1080 Ti      - 35.71 MH/s
20190609 13:33:52 GPU #6: Gigabyte GTX 1080 Ti - 38.40 MH/s
20190609 13:33:52 GPU #7: Gigabyte GTX 1080 Ti - 37.86 MH/s
20190609 13:33:52 GPU #8: EVGA GTX 1070        - 22.21 MH/s
20190609 13:33:52 Shares/min: 3.425 (Avr. 8.086)
20190609 13:33:52 Uptime: 2 hours 53 mins 8 secs | Algo: bcd | T-Rex v0.11.1
20190609 13:34:35 [ OK ] 1401/1401 - 278.40 MH/s, 201ms
20190609 13:34:36 [ OK ] 1402/1402 - 278.43 MH/s, 202ms
20190609 13:34:46 [ OK ] 1403/1403 - 278.49 MH/s, 202ms
20190609 13:35:35 Dev fee mined (44 secs)
20190609 13:35:56 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 21 secs ago
20190609 13:35:56 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 21 secs ago
20190609 13:36:01 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 26 secs ago
20190609 13:36:01 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 26 secs ago
20190609 13:36:06 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 31 secs ago
20190609 13:36:06 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 31 secs ago
20190609 13:36:11 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 36 secs ago
20190609 13:36:11 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 36 secs ago
20190609 13:36:16 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 41 secs ago
20190609 13:36:16 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 41 secs ago
20190609 13:36:21 WARN: GPU #0: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #1: ASUS GeForce GTX 1070 is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #2: EVGA GeForce GTX 1070 is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #3: ASUS GeForce GTX 1070 is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #4: EVGA GeForce GTX 1080 Ti is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #5: MSI GeForce GTX 1080 Ti is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #6: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #7: Gigabyte GeForce GTX 1080 Ti is idle, last activity was 46 secs ago
20190609 13:36:21 WARN: GPU #8: EVGA GeForce GTX 1070 is idle, last activity was 46 secs ago
20190609 13:36:22 WARN: WATCHDOG: T-Rex has a problem with GPU, terminating...

aleqx commented on August 23, 2024

CUDA 10.0

Sadly, I have very limited time to help with testing, and I don't mine with t-rex all the time either, but I can give it a try (I'm just not promising anything).

trexminer commented on August 23, 2024

Please try 0.12.0 when you have time; there is a chance that the error is fixed, although we are not 100% sure.

OverchenkoDev commented on August 23, 2024

I'm using v0.19.1 and have the same issue. The problem occurs right after the miner starts. This is my output:

20201209 15:12:49 T-Rex NVIDIA GPU miner v0.19.1 - [CUDA v10.0]
20201209 15:12:49 r.99a7206c3590
20201209 15:12:49
20201209 15:12:49 NVIDIA Driver v450.80.02
20201209 15:12:49 CUDA devices available: 3
20201209 15:12:49
20201209 15:12:49 WARN: DevFee 1% (ethash)
20201209 15:12:49
20201209 15:12:49 URL : my_pool
20201209 15:12:49 USER: my_user
20201209 15:12:49 PASS:
20201209 15:12:49
20201209 15:12:49 Starting on: my_pool
20201209 15:12:49 ApiServer: HTTP server started on 0.0.0.0:4067
20201209 15:12:49 ----------------------------------------------------
20201209 15:12:49 For control navigate to: http://172.17.0.1:4067/trex
20201209 15:12:49 ----------------------------------------------------
20201209 15:12:49 ApiServer: Telnet server started on 127.0.0.1:3333
20201209 15:12:49 WARN: GPU #2(000600): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 15:12:49 WARN: GPU #0(000100): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 15:12:49 WARN: GPU #1(000500): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 15:12:54 Using protocol: stratum1.
20201209 15:12:54 Authorizing...
20201209 15:12:54 Authorized successfully.
20201209 15:13:10 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 20 secs ago
20201209 15:13:10 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 20 secs ago
20201209 15:13:10 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 20 secs ago
20201209 15:13:15 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 25 secs ago
20201209 15:13:15 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 25 secs ago
20201209 15:13:15 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 25 secs ago
20201209 15:13:20 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 30 secs ago
20201209 15:13:20 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 30 secs ago
20201209 15:13:20 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 30 secs ago
20201209 15:13:25 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 35 secs ago
20201209 15:13:25 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 35 secs ago
20201209 15:13:25 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 35 secs ago
20201209 15:13:30 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 40 secs ago
20201209 15:13:30 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 40 secs ago
20201209 15:13:30 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 40 secs ago
20201209 15:13:35 WARN: GPU #0: MSI GeForce GTX 1070 Ti is idle, last activity was 45 secs ago
20201209 15:13:35 WARN: GPU #1: MSI GeForce GTX 1070 Ti is idle, last activity was 45 secs ago
20201209 15:13:35 WARN: GPU #2: MSI GeForce GTX 1070 Ti is idle, last activity was 45 secs ago
20201209 15:13:36 WARN: WATCHDOG: T-Rex has a problem with GPU, terminating...
20201209 15:13:36 WARN: WATCHDOG: recovering T-Rex
20201209 15:13:38 T-Rex NVIDIA GPU miner v0.19.1 - [CUDA v10.0]
....
20201209 16:42:39 WARN: shutdown t-rex, signal [2] received
20201209 16:42:39 Main loop finished. Cleaning up resources...
20201209 16:42:39 ApiServer: stopped listening on 0.0.0.0:4067
20201209 16:42:39 ApiServer: stopped listening on 127.0.0.1:3333
terminate called after throwing an instance of 'std::runtime_error'
what(): wrong device nonce index

I tried adding different GPU parameters, such as indexing and GPU index settings, but it didn't help. How can I solve this?

aleqx commented on August 23, 2024

@OverchenkoDev it would be best if the miner showed which GPU is at fault, but it still doesn't do that. If you are on Linux, you can run grep Xid /var/log/kern.log | tail, as Xid entries are NVIDIA hardware errors. They are usually overclocking problems (the overclock was pushed too far).

Note that the explanation of the Xid errors doesn't always tell you which particular overclock setting you need to bring down (memory, core, or power).
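Since each Xid line includes the PCI bus ID of the offending card, one way to narrow it down is to count the errors per bus ID and map that back to a GPU index; a sketch, assuming kern.log is the right log file on your distro and nvidia-smi is available:

grep Xid /var/log/kern.log | grep -oE 'PCI:[0-9a-fA-F:]+' | sort | uniq -c | sort -rn   # Xid error count per PCI bus ID
nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv                               # map bus IDs back to GPU indexes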

OverchenkoDev commented on August 23, 2024

@aleqx That's what I see after this command:

coser@rige0d55e8357e3:~$ grep Xid /var/log/kern.log | tail
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089746] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 0000000b 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089759] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 0000000c 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089772] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 0000000d 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089785] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 0000000e 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089798] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 0000000f 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089811] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 00000010 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089824] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 00000011 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089837] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 00000012 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089849] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 00000013 00000ffc ffffffff 00000007 00ffffff
Dec 9 12:34:44 rige0d55e8357e3 kernel: [ 7561.089862] NVRM: Xid (PCI:0000:01:00): 56, pid=2628, CMDre 00000014 00000ffc ffffffff 00000007 00ffffff

I find it difficult to understand this.

I also used benchmark mode and there were no problems. Output:

20201209 16:47:11 NVIDIA Driver v450.80.02
20201209 16:47:11 CUDA devices available: 3
20201209 16:47:11
20201209 16:47:11 WARN: BENCHMARK MODE (ethash)
20201209 16:47:11 WARN: EPOCH 1
20201209 16:47:11
20201209 16:47:11 WARN: GPU #0(000100): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 16:47:11 WARN: GPU #1(000500): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 16:47:11 WARN: GPU #2(000600): MSI GeForce GTX 1070 Ti, intensity set to 22
20201209 16:47:12 GPU #1: generating DAG 1.01 GB for epoch 1 ...
20201209 16:47:12 GPU #2: generating DAG 1.01 GB for epoch 1 ...
20201209 16:47:12 GPU #0: generating DAG 1.01 GB for epoch 1 ...
20201209 16:47:14 GPU #0: DAG generated [time: 1867 ms], memory left: 6.81 GB
20201209 16:47:14 GPU #2: DAG generated [time: 1887 ms], memory left: 6.81 GB
20201209 16:47:14 GPU #1: DAG generated [time: 1890 ms], memory left: 6.81 GB
20201209 16:47:31 GPU #0: using kernel #3
20201209 16:47:31 GPU #2: using kernel #3
20201209 16:47:31 GPU #1: using kernel #3
20201209 16:47:32 Total: 22.66 MH/s
20201209 16:47:34 Total: 67.98 MH/s
20201209 16:47:36 Total: 67.99 MH/s
20201209 16:47:38 Total: 67.99 MH/s
20201209 16:47:40 Found 1 share(s)
20201209 16:47:40 Found 1 share(s)
20201209 16:47:40 Found 1 share(s)
20201209 16:47:40 Total: 66.77 MH/s
20201209 16:47:41 Found 1 share(s)
20201209 16:47:41 Found 1 share(s)
20201209 16:47:41 Found 1 share(s)
20201209 16:47:42 Total: 68.15 MH/s
20201209 16:47:43 Found 1 share(s)
20201209 16:47:43 Found 1 share(s)
20201209 16:47:43 Found 1 share(s)
20201209 16:47:44 Total: 68.13 MH/s
.....

muratkavuncu commented on August 23, 2024

WARN: WATCHDOG: T-Rex has a problem with GPU, terminating...
I am getting this error, and it keeps repeating. If I restart the system, it is fixed. Is there another way?
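A common workaround (not a fix for the root cause) is to run the miner under a small wrapper that reboots the rig if the miner keeps dying shortly after start. A minimal sketch, assuming a Linux rig, passwordless sudo for reboot, and placeholder pool/wallet values:

#!/bin/sh
# Restart T-Rex whenever it exits; reboot the rig after 3 consecutive quick failures.
FAILS=0
while true; do
    START=$(date +%s)
    ./t-rex -a ethash -o stratum+tcp://POOL_HOST:PORT -u YOUR_WALLET -p x
    RUNTIME=$(( $(date +%s) - START ))
    if [ "$RUNTIME" -lt 120 ]; then
        FAILS=$((FAILS + 1))   # died within 2 minutes of starting
    else
        FAILS=0                # ran for a while, reset the counter
    fi
    if [ "$FAILS" -ge 3 ]; then
        sudo reboot
    fi
    sleep 10
done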
