
Comments (15)

BBillot commented on August 18, 2024

Do you mean for training or testing?

Training SynthSeg can take quite a lot of memory. Using crops of 160 during training, I needed a 24 GB GPU. To decrease memory usage you can:

  • use smaller crops (128 for example)
  • decrease the size of the network (fewer features, or 4 levels instead of 5)
  • deactivate the spatial deformation if your dataset is big enough; besides the segmentation network itself, it is the thing that takes the most memory
  • move some of the augmentation to the CPU, in the while loop of build_model_inputs.py
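
To give a feel for the first suggestion, here is a rough sketch of how much smaller a 128 crop is than a 160 crop. It assumes that activation memory scales roughly linearly with the number of voxels in a crop, which is an approximation and not a measurement of SynthSeg itself; the function name is hypothetical.

```python
def crop_voxel_ratio(new_size: int, old_size: int, ndim: int = 3) -> float:
    """Ratio of voxel counts between two cubic crops of the given edge lengths."""
    return (new_size ** ndim) / (old_size ** ndim)


ratio = crop_voxel_ratio(128, 160)
# (128 / 160) ** 3 = 0.8 ** 3 = 0.512
print(f"A 128 crop holds {ratio:.1%} of the voxels of a 160 crop")
```

Under that assumption, going from 160 to 128 would roughly halve the per-sample activation memory, although the network weights and any fixed overhead would not shrink.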

Testing on a 12 GB GPU should be fine if you only use the segmentation model (i.e. if you don't use the --robust, --parc or --qc flags). Otherwise you might want to use a bigger GPU or run inference on the CPU. Alternatively, you can use --crop to decrease the size of the processed images, but that might crop out some parts of the brain.
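
One generic way to try the CPU route is to hide the GPU from CUDA through the standard CUDA_VISIBLE_DEVICES environment variable. The sketch below only builds the command and the environment; the script path and the --i, --o and --crop flags are the ones quoted elsewhere in this thread, while the file names and helper functions are hypothetical.

```python
import os
import shlex

# Script path as quoted in this thread; adjust it to your own checkout.
PREDICT_SCRIPT = "../../SynthSeg/scripts/commands/SynthSeg_predict.py"


def build_predict_command(nii_input, nii_output, crop=None, extra_flags=()):
    """Assemble the prediction command as a list of arguments."""
    cmd = ["python", PREDICT_SCRIPT, "--i", nii_input, "--o", nii_output]
    if crop is not None:
        cmd += ["--crop"] + [str(c) for c in crop]
    cmd += list(extra_flags)
    return cmd


def cpu_only_env(base_env=None):
    """Copy an environment and hide every GPU from CUDA in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""  # an empty value exposes no CUDA devices
    return env


# Hypothetical file names, segmentation only, with a smaller crop.
cmd = build_predict_command("scan.nii.gz", "seg.nii.gz", crop=(160, 160, 160))
print(shlex.join(cmd))
# To actually run it on the CPU:
#   import subprocess
#   subprocess.run(cmd, env=cpu_only_env(), check=True)
```

Setting the variable only for the child process keeps the GPU available to everything else running in the same shell.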

RTX GPUs should be fine, assuming they are large enough. Titan Xp should also be fine for testing (not so much for training).

cuDNN issue: you might want to install tensorflow/keras with conda, since it will automatically install the matching versions of CUDA and cuDNN, so you don't need to worry about this anymore.

Hope this helps
Benjamin

from synthseg.

mishaberegov commented on August 18, 2024

My RTX 3090 actually freezes during inference, not training. :(
That's why I wonder whether this could be an environment issue, especially since simply swapping in an RTX 2090 solves the problem for some of the scans.


BBillot commented on August 18, 2024

Hmm that's weird, I think it should work...

What command are you using? What's the size of your images?

Does it give you a memory error, or does it just freeze? I have sometimes observed that with some GPUs (the A6000 for example), but never with RTX cards...


mishaberegov commented on August 18, 2024

I use the following command, for example:

python ../../SynthSeg/scripts/commands/SynthSeg_predict.py \
    --i {nii_input} \
    --o test/{nii_output} --parc --crop 224 224 192 --fast

It prints all the usual things, and freezes after this:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22195 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
predicting 1/1
2023-04-26 16:03:12.545105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7

In fact, it's the same with any size. Ideally, I would like it to work with 256x256x256 images, but for now the result is the same with any cropping setting.
It does take 22 GB of GPU memory and doesn't produce any output.

Here's the full message:

Using TensorFlow backend.

SynthSeg 2.0 (fast)

using 1 thread
2023-04-26 15:57:09.065709: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2023-04-26 15:57:09.122867: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.123112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2023-04-26 15:57:09.126193: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.178028: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 15:57:09.209394: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2023-04-26 15:57:09.217617: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2023-04-26 15:57:09.282536: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2023-04-26 15:57:09.292220: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2023-04-26 15:57:09.443288: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2023-04-26 15:57:09.443753: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.444513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.445016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2023-04-26 15:57:09.445604: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2023-04-26 15:57:09.500297: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3399905000 Hz
2023-04-26 15:57:09.500968: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x17f2c30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-26 15:57:09.500993: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2023-04-26 15:57:09.590659: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.590864: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1379020 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-04-26 15:57:09.590881: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-04-26 15:57:09.591688: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.591793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2023-04-26 15:57:09.591827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.591841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 15:57:09.591854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2023-04-26 15:57:09.591867: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2023-04-26 15:57:09.591880: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2023-04-26 15:57:09.591893: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2023-04-26 15:57:09.591906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2023-04-26 15:57:09.591959: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.592087: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.592197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2023-04-26 15:57:09.592931: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2023-04-26 15:57:09.593859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-04-26 15:57:09.593871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2023-04-26 15:57:09.593877: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2023-04-26 15:57:09.593964: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.594103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 15:57:09.594205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22195 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
predicting 1/1
2023-04-26 16:03:12.545105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7


mishaberegov commented on August 18, 2024

Update.
I tried to wait a bit longer.
The segmentation finished, but there seems to be a compatibility issue.

2023-04-26 16:50:38.755224: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 65280, output: ptxas fatal   : Value 'sm_86' is not defined for option 'gpu-name'

Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
2023-04-26 16:50:39.649522: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2023-04-26 16:51:00.818533: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 2658926592 exceeds 10% of free system memory.

segmentation  saved in:    /home/medicinestand/morphometry/Samples/test/1.nii.gz

If you use this tool in a publication, please cite:
SynthSeg: domain randomisation for segmentation of brain MRI scans of any contrast and resolution
B. Billot, D.N. Greve, O. Puonti, A. Thielscher, K. Van Leemput, B. Fischl, A.V. Dalca, J.E. Iglesias

When I run nvcc --help, sm_86 is indeed not listed among the architecture options.
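
The mismatch can be checked mechanically. The sketch below pulls the compute capability out of a TensorFlow log line like the ones above and compares it against a small, non-exhaustive table of the first CUDA toolkit release whose ptxas accepts each architecture. The table entries reflect NVIDIA's release history as I understand it, and the function names are hypothetical.

```python
import re

# First CUDA toolkit release whose ptxas knows each architecture.
# Small, non-exhaustive table; extend it for other GPUs as needed.
MIN_CUDA_FOR_SM = {
    "sm_70": (9, 0),   # Volta
    "sm_75": (10, 0),  # Turing
    "sm_80": (11, 0),  # Ampere (A100)
    "sm_86": (11, 1),  # Ampere (RTX 30 series)
}


def sm_from_tf_log(line):
    """Return e.g. 'sm_86' from a log line mentioning the compute capability."""
    match = re.search(r"compute\s*capability:?\s*(\d+)\.(\d+)", line, flags=re.IGNORECASE)
    return f"sm_{match.group(1)}{match.group(2)}" if match else None


def ptxas_supports(sm, cuda_version):
    """True if a toolkit of this (major, minor) version can target `sm`."""
    return cuda_version >= MIN_CUDA_FOR_SM[sm]


log_line = "name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)"
sm = sm_from_tf_log(log_line)
print(sm, "supported by CUDA 10.1 ptxas:", ptxas_supports(sm, (10, 1)))
```

For the RTX 3090 in this log, the check comes out negative for CUDA 10.1, which is consistent with the ptxas error above.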


BBillot commented on August 18, 2024

What does the segmentation look like? I mean the one saved under /home/medicinestand/morphometry/Samples/test/1.nii.gz.


mishaberegov commented on August 18, 2024

It looks very good, just as it used to look under normal circumstances. However, the whole process took about 20 minutes. It seems to be a TensorFlow issue.


BBillot commented on August 18, 2024

Hmm okay, so the problem is definitely between TensorFlow and your GPU. Have you tried googling the TensorFlow warning?

2023-04-26 16:50:38.755224: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 65280, output: ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name'
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.


mishaberegov commented on August 18, 2024

Yeah, it seems that the ptxas shipped with CUDA 10.1 doesn't support some newer GPUs.


BBillot commented on August 18, 2024

Have you tried reinstalling all the CUDA stuff with a fresh conda install? For example, using one of the commands in the README?


mishaberegov commented on August 18, 2024

I have, but it had no effect. I'll try once more. For now, I have replaced the ptxas in the CUDA directory with the ptxas from CUDA 11. The error is gone, but it still takes around 6 minutes per scan.
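
The TensorFlow warning quoted earlier suggests modifying $PATH to choose the ptxas location, which avoids overwriting files inside the CUDA 10.1 directory. A minimal sketch of that idea follows; the CUDA 11 directory is a hypothetical example path and the helper name is made up. As reported above, a newer ptxas removed the error but did not bring the runtime back to normal.

```python
import os
import shutil


def env_with_ptxas_dir(ptxas_dir, base_env=None):
    """Copy an environment and put `ptxas_dir` first on PATH in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    env["PATH"] = ptxas_dir + (os.pathsep + old_path if old_path else "")
    return env


# Hypothetical location of a newer toolkit; adjust to your system.
env = env_with_ptxas_dir("/usr/local/cuda-11.1/bin")

# Shows which ptxas would be found first, or None if there is none on PATH.
print(shutil.which("ptxas", path=env["PATH"]))
# The prediction script can then be launched with subprocess.run(..., env=env).
```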


BBillot commented on August 18, 2024

6 minutes is still too long; it should take somewhere between 10 and 20 seconds per scan, given that you're running both the segmentation and the parcellation.
Let me know if you manage to solve this :)


BBillot commented on August 18, 2024

Any news on this?


mishaberegov commented on August 18, 2024

Unfortunately, no. It seems to be a limitation of the TensorFlow version.


BBillot commented on August 18, 2024

Hmm okay, sorry I couldn't be of more help, but this sounds like a bigger problem than just SynthSeg...
Good luck though!

