What GPU could you recommend? I tried it with RTX 3090. SynthSeg takes up all the

Recommended GPU about synthseg HOT 15 CLOSED

bbillot commented on August 18, 2024

Recommended GPU

from synthseg.

Comments (15)

BBillot commented on August 18, 2024

Do you mean for training or testing?

Training SynthSeg can take quite a lot of memory. Using crops of 160 during training, I needed a 24 GB GPU. To decrease memory usage you can:

use smaller crops (128 for example)
decrease the size of the network (using fewer features, or 4 levels instead of 5)
deactivate the spatial deformation if your dataset is big enough; it is the thing that takes the most memory besides the segmentation network
move some of the augmentation to the CPU, in the while loop of build_model_inputs.py
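
As a rough illustration of why smaller crops help: the number of voxels in a cubic crop grows with the cube of its side length, so activation memory should shrink by roughly the same factor. This is a back-of-the-envelope sketch, not a measurement; `crop_voxel_ratio` is a hypothetical helper, and the real footprint also depends on the network size and the augmentation settings.

```python
def crop_voxel_ratio(new_side: int, old_side: int) -> float:
    """Ratio of voxel counts between two cubic crops with the given side lengths."""
    return (new_side / old_side) ** 3


# Compare the 128 crop suggested above with the 160 crop that needed a 24 GB GPU.
ratio = crop_voxel_ratio(128, 160)
print(f"A 128^3 crop holds {ratio:.1%} of the voxels of a 160^3 crop")
```

So going from 160 to 128 roughly halves the per-crop voxel count, which is why it is the first thing to try.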

Testing on a 12 GB GPU should be fine if you only use the segmentation model (i.e. if you don't use the --robust, --parc or --qc flags). Otherwise you might want to use a bigger GPU or do inference on the CPU. Alternatively, you can use --crop to decrease the size of the processed images, but that might crop out some parts of the brain.
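
One hedged way to try CPU inference without touching the code is to hide the GPU from TensorFlow through the standard `CUDA_VISIBLE_DEVICES` environment variable. The sketch below only assembles the command from flags quoted in this thread (`--i`, `--o`, `--parc`, `--crop`); the helper names and file names are hypothetical, and nothing is executed unless `run_on_cpu` is called.

```python
import os
import subprocess
import sys


def build_predict_command(script, nii_input, nii_output, crop=None, parc=False):
    """Assemble the argument list for SynthSeg_predict.py from flags quoted in this thread."""
    cmd = [sys.executable, script, "--i", nii_input, "--o", nii_output]
    if parc:
        cmd.append("--parc")
    if crop is not None:
        cmd += ["--crop"] + [str(c) for c in crop]
    return cmd


def cpu_only_env():
    """Copy the current environment and hide all CUDA devices from child processes."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_on_cpu(cmd):
    """Run the command with the GPU hidden, so TensorFlow falls back to the CPU."""
    return subprocess.run(cmd, env=cpu_only_env(), check=True)


# Hypothetical file names, for illustration only.
cmd = build_predict_command("SynthSeg_predict.py", "scan.nii.gz", "seg.nii.gz",
                            crop=(224, 224, 192), parc=True)
print(" ".join(cmd[1:]))
```

Building the argument list separately from running it makes it easy to print and check the command before launching a long job.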

RTX GPUs should be fine, assuming they are large enough. Titan Xp should also be fine for testing (not so much for training).

CUDNN issue: you might want to install tensorflow/keras with conda, since it will automatically install the matching version of cuda and cudnn, so you don't need to worry about this anymore.

Hope this helps
Benjamin

mishaberegov commented on August 18, 2024

My RTX 3090 actually freezes on the inference, not training. :(
That's why I wonder if this could be an environment issue, especially as simply replacing it with an RTX 2090 solves the issue for some of the scans.

BBillot commented on August 18, 2024

Hmm that's weird, I think it should work...

What command are you using? What's the size of your images?

Does it give you a memory error, or does it just freeze? I have sometimes observed that with some GPUs (the A6000, for example), but never with RTX cards...

mishaberegov commented on August 18, 2024

I use the following command, for example:

python ../../SynthSeg/scripts/commands/SynthSeg_predict.py \
    --i {nii_input} \
    --o test/{nii_output} --parc --crop 224 224 192 --fast

It prints all the usual things, and freezes after this:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22195 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
predicting 1/1
2023-04-26 16:03:12.545105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7

In fact, it's the same with any size. Ideally, I would like it to work with a 256x256x256 image, but for now the result is the same with any cropping settings.
It does indeed take 22 GB of GPU memory and doesn't produce any output.

Here's the full message:

Using TensorFlow backend.

SynthSeg 2.0 (fast)

using 1 thread
2023-04-26 15:57:09.065709: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2023-04-26 15:57:09.122867: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.123112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2023-04-26 15:57:09.126193: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.178028: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 15:57:09.209394: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2023-04-26 15:57:09.217617: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2023-04-26 15:57:09.282536: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2023-04-26 15:57:09.292220: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2023-04-26 15:57:09.443288: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2023-04-26 15:57:09.443753: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.444513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.445016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2023-04-26 15:57:09.445604: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2023-04-26 15:57:09.500297: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3399905000 Hz
2023-04-26 15:57:09.500968: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x17f2c30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-26 15:57:09.500993: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2023-04-26 15:57:09.590659: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.590864: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1379020 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-04-26 15:57:09.590881: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-04-26 15:57:09.591688: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.591793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2023-04-26 15:57:09.591827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.591841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 15:57:09.591854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2023-04-26 15:57:09.591867: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2023-04-26 15:57:09.591880: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2023-04-26 15:57:09.591893: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2023-04-26 15:57:09.591906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2023-04-26 15:57:09.591959: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.592087: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.592197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2023-04-26 15:57:09.592931: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.593859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-04-26 15:57:09.593871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2023-04-26 15:57:09.593877: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2023-04-26 15:57:09.593964: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.594103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.594205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22195 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
predicting 1/1
2023-04-26 16:03:12.545105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
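
The apparent freeze can be read off the log itself: the device is created at 15:57:09 and the next line only appears at 16:03:12. A minimal sketch that measures this gap from the two timestamps above (the variable names are illustrative, and the timestamp format is assumed to be the one shown in this log):

```python
from datetime import datetime

# Timestamps copied from the last two timestamped lines of the log above.
device_created = "2023-04-26 15:57:09.594205"
cudnn_loaded = "2023-04-26 16:03:12.545105"

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two TensorFlow-style log timestamps."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return delta.total_seconds()


stall = seconds_between(device_created, cudnn_loaded)
print(f"Stall before cuDNN was loaded: {stall:.1f} s ({stall / 60:.1f} min)")
```

The gap is about six minutes, during which nothing is printed.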

mishaberegov commented on August 18, 2024

Update.
I tried to wait a bit longer.
The segmentation finished, but there seems to be a compatibility issue.

2023-04-26 16:50:38.755224: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] **Internal: ptxas exited with non-zero error code 65280, output: ptxas fatal   : Value 'sm_86' is not defined for option 'gpu-name'**

Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
2023-04-26 16:50:39.649522: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 16:51:00.818533: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 2658926592 exceeds 10% of free system memory.

segmentation  saved in:    /home/medicinestand/morphometry/Samples/test/1.nii.gz

If you use this tool in a publication, please cite:
SynthSeg: domain randomisation for segmentation of brain MRI scans of any contrast and resolution
B. Billot, D.N. Greve, O. Puonti, A. Thielscher, K. Van Leemput, B. Fischl, A.V. Dalca, J.E. Iglesias

When I execute the nvcc --help command, the sm_86 is indeed not listed in the architectures options.

BBillot commented on August 18, 2024

What does the segmentation look like? I mean the one saved under /home/medicinestand/morphometry/Samples/test/1.nii.gz.

mishaberegov commented on August 18, 2024

It looks very good, just as it used to look under normal circumstances. However, the whole process took about 20 minutes. This seems to be a TensorFlow issue.

BBillot commented on August 18, 2024

Hmm okay, so the problem is definitely between TensorFlow and your GPU. Have you tried googling the TensorFlow warning?

2023-04-26 16:50:38.755224: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 65280, output: ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name'
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.

mishaberegov commented on August 18, 2024

Yeah, it seems that the ptxas from CUDA 10.1 doesn't support some newer GPUs.
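
To make the mismatch concrete, here is a small sketch that pulls the architecture out of the ptxas error and checks it against the CUDA toolkit version. The lookup table is an assumption based on NVIDIA's toolkit release notes (compute capability 8.6 needs CUDA 11.1 or newer) and should be verified before relying on it; `parse_sm` and `toolkit_supports` are hypothetical helpers.

```python
import re

# Minimum CUDA toolkit release able to target each compute capability.
# Assumption: values recalled from NVIDIA's release notes; verify before relying on them.
MIN_CUDA_FOR_CC = {
    (7, 0): (9, 0),
    (7, 5): (10, 0),
    (8, 0): (11, 0),
    (8, 6): (11, 1),
}


def parse_sm(message):
    """Extract (major, minor) from a ptxas 'sm_XY' error, or None if no match."""
    match = re.search(r"Value 'sm_(\d)(\d)' is not defined", message)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_supports(cc, cuda_version):
    """True if cuda_version (major, minor) is new enough for compute capability cc."""
    return cuda_version >= MIN_CUDA_FOR_CC[cc]


error = "ptxas fatal   : Value 'sm_86' is not defined for option 'gpu-name'"
cc = parse_sm(error)
print(cc, toolkit_supports(cc, (10, 1)), toolkit_supports(cc, (11, 1)))
```

Under these assumptions, the CUDA 10.1 libraries loaded in the log (libcudart.so.10.1) predate the RTX 3090's architecture, which matches the ptxas message.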

BBillot commented on August 18, 2024

Have you tried reinstalling all the CUDA stuff with a fresh conda install? For example, using one of the commands in the README?

mishaberegov commented on August 18, 2024

I have, but it had no effect. I'll try once more. For now, I replaced the ptxas in the CUDA directory with the ptxas from CUDA 11. The error is gone, but it takes around 6 minutes per scan.

BBillot commented on August 18, 2024

6 minutes is still too long; it should take somewhere between 10 and 20 seconds per scan, given that you're running both the segmentation and the parcellation.
Let me know if you manage to solve this :)

BBillot commented on August 18, 2024

Any news on this?

mishaberegov commented on August 18, 2024

Unfortunately, no. It seems to be a TensorFlow version limitation.

BBillot commented on August 18, 2024

Hmm okay, sorry I couldn't be of more help, but this sounds like a problem bigger than just SynthSeg...
Good luck though !
