Comments (3)
@tiangexiang Any ideas on this?
from ddm2.
Sorry for the late response! The error you reported points to a mismatch between the PyTorch version and the CUDA version. And you are right that the validation loader failure is probably due to a version mismatch as well. For that reason, I recommend reproducing the exact environment specified in environment.yaml, since it is guaranteed to work (be careful with the CUDA version though! It has to match your own hardware).
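A quick way to see which PyTorch build and CUDA version an environment actually ended up with is sketched below. The helper name describe_torch_env is made up for illustration and is not part of DDM2; it only relies on torch.__version__, torch.version.cuda and torch.cuda.is_available().

```python
import importlib.util


def describe_torch_env():
    """Report the installed PyTorch build and the CUDA version it was compiled against.

    Hypothetical helper for illustration only; not part of the DDM2 code base.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment at all.
        return {"torch": None}
    import torch

    return {
        "torch": torch.__version__,                  # e.g. the pinned torch version
        "cuda_build": torch.version.cuda,            # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(), # False if the driver/GPU is not usable
    }


print(describe_torch_env())
```

If cuda_build differs from the cudatoolkit version you installed, or cuda_available is False, the environment probably does not match the hardware.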
@tiangexiang Thanks for the reply. I checked very carefully, and to match my hardware I set up cudatoolkit=11.3 and the corresponding PyTorch versions as follows:
name: ddm2_experiment
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1
  - _openmp_mutex=4.5
  - _pytorch_select=0.1
  - blas=1.0
  - ca-certificates=2022.3.29
  - certifi=2021.10.8
  - cudatoolkit=11.3
  - freetype=2.11.0
  - giflib=5.2.1
  - intel-openmp=2021.4.0
  - jpeg=9d
  - lcms2=2.12
  - ld_impl_linux-64=2.35.1
  - libffi=3.3
  - libgcc-ng=9.3.0
  - libgomp=9.3.0
  - libpng=1.6.37
  - libstdcxx-ng=9.3.0
  - libtiff=4.2.0
  - libuv=1.40.0
  - libwebp=1.2.2
  - libwebp-base=1.2.2
  - lz4-c=1.9.3
  - mkl=2021.4.0
  - mkl-service=2.4.0
  - mkl_fft=1.3.1
  - mkl_random=1.2.2
  - ncurses=6.3
  - ninja=1.10.2
  - openssl=1.1.1n
  - pip=21.2.4
  - python=3.8.13
  - readline=8.1.2
  - setuptools=58.0.4
  - six=1.16.0
  - sqlite=3.38.2
  - tk=8.6.11
  - typing_extensions=4.1.1
  - wheel=0.37.1
  - xz=5.2.5
  - zlib=1.2.11
  - zstd=1.4.9
  - pip:
    - beautifulsoup4==4.11.1
    - charset-normalizer==2.0.12
    - cycler==0.11.0
    - dipy==1.5.0
    - filelock==3.6.0
    - fonttools==4.31.2
    - gdown==4.4.0
    - h5py==3.6.0
    - idna==3.3
    - imageio==2.16.1
    - joblib==1.1.0
    - kiwisolver==1.4.2
    - matplotlib==3.5.1
    - networkx==2.7.1
    - nibabel==3.2.2
    - numpy==1.22.3
    - opencv-python==4.5.4.58
    - packaging==21.3
    - pandas==1.4.1
    - pillow==9.1.0
    - pydicom==2.3.0
    - pyparsing==3.0.7
    - pysocks==1.7.1
    - python-dateutil==2.8.2
    - pytz==2022.1
    - pywavelets==1.3.0
    - pyyaml==6.0
    - requests==2.27.1
    - scikit-image==0.19.2
    - scikit-learn==1.0.2
    - scipy==1.8.0
    - seaborn==0.11.2
    - soupsieve==2.3.2.post1
    - statannot==0.2.3
    - threadpoolctl==3.1.0
    - tifffile==2022.3.25
    - timm==0.4.12
    - torch==1.8.0
    - torchvision==0.9.0
    - tqdm==4.63.1
    - urllib3==1.26.9
Even with the versions matched, I still hit an error in the validation part of the training:
Validation
Traceback (most recent call last):
  File "train_noise_model.py", line 92, in <module>
    for _, val_data in enumerate(val_loader):
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/anar/mambaforge-pypy3/envs/ddm2_experiment/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/anar/DDM2/data/mri_dataset.py", line 130, in __getitem__
    raw_input = raw_input[:,:,0]
IndexError: index 0 is out of bounds for axis 2 with size 0
Even trying the latest versions of torch & torchvision did not help at all 🙁
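The last line of the traceback reads like a NumPy indexing error: raw_input has length 0 along axis 2 at the point where raw_input[:,:,0] is taken. The sketch below reproduces that kind of failure and shows a guard that reports the full shape instead. It assumes raw_input is a NumPy array. The helper take_first_channel and the toy volume are hypothetical stand-ins, not the actual code in mri_dataset.py, and the out-of-range slice is only one possible way to end up with an empty axis.

```python
import numpy as np


def take_first_channel(raw_input):
    """Return raw_input[:, :, 0], failing early with the full shape if axis 2 is empty.

    Hypothetical guard for illustration; not the actual DDM2 implementation.
    """
    if raw_input.ndim != 3 or raw_input.shape[2] == 0:
        raise ValueError(f"expected shape (H, W, C) with C >= 1, got {raw_input.shape}")
    return raw_input[:, :, 0]


volume = np.zeros((8, 8, 4))  # toy stand-in for one loaded sample
empty = volume[:, :, 5:6]     # slice starts past the end of axis 2, so shape is (8, 8, 0)

try:
    empty[:, :, 0]            # same kind of IndexError as in the traceback
except IndexError as err:
    print("IndexError:", err)

try:
    take_first_channel(empty)
except ValueError as err:
    print("ValueError:", err)  # the message now includes the offending shape
```

Printing raw_input.shape just before line 130 for the validation samples would show whether the slice or volume range used for validation falls outside the data.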