Comments (5)
Hi tRosenflanz,
Thank you for your interest in our work.
According to your error log, it doesn't seem to be an out-of-memory problem.
The code is basically trying to access some tensor at an index out of its boundary.
Would you check if your training/testing data construction was correct?
(e.g. maybe you are using 1-based indexing instead of 0-based indexing?)
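One quick way to rule out an indexing problem is to scan the data for codes outside the valid range. The sketch below is only illustrative: it assumes the data is a list of visits, each a list of integer codes, with [-1] used as a separator entry, and the helper name `check_code_range` is hypothetical rather than part of the repository.

```python
def check_code_range(seqs, n_input_codes, separator=-1):
    """Return (visit_index, code) pairs whose code is outside [0, n_input_codes)."""
    bad = []
    for i, visit in enumerate(seqs):
        # Assumption: a visit equal to [separator] marks a boundary, not real codes.
        if list(visit) == [separator]:
            continue
        for code in visit:
            if not 0 <= code < n_input_codes:
                bad.append((i, code))
    return bad


seqs = [[0, 3, 9], [-1], [2, 10], [5]]
print(check_code_range(seqs, n_input_codes=10))  # [(2, 10)]
```

An empty result means every code is a valid 0-based index for the given n_input_codes; a code equal to n_input_codes usually points to 1-based indexing.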
Thanks,
Ed
from med2vec.
Hi Ed,
I checked, and I use 0-based indexing. Here is an error log with exception_verbosity=high. The interesting part is Constant{-1} followed by an error in the CudaNdarrayConstant:
- b_emb_rgrad2, Shared Input, Shape: (200,), ElemSize: 4 Byte(s), TotalSize: 800 Byte(s)
- b_hidden_rgrad2, Shared Input, Shape: (200,), ElemSize: 4 Byte(s), TotalSize: 800 Byte(s)
- Elemwise{Cast{int64}}.0, Shape: (6,), ElemSize: 8 Byte(s), TotalSize: 48 Byte(s)
- Elemwise{Cast{int64}}.0, Shape: (6,), ElemSize: 8 Byte(s), TotalSize: 48 Byte(s)
- jVector, Input, Shape: (6,), ElemSize: 4 Byte(s), TotalSize: 24 Byte(s)
- GpuElemwise{Exp}[(0, 0)].0, Shape: (6,), ElemSize: 4 Byte(s), TotalSize: 24 Byte(s)
- iVector, Input, Shape: (6,), ElemSize: 4 Byte(s), TotalSize: 24 Byte(s)
- Constant{1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Constant{-1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- GpuCAReduce{add}{1,1}.0, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- TensorConstant{1.0}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- GpuElemwise{Add}[(0, 1)].0, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- GpuCAReduce{add}{1,1}.0, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- mask, Input, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- CudaNdarrayConstant{error while transferring the value: error (an illegal memory access was encountered)copying data to host}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
- GpuElemwise{add,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuElemwise{mul,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuSubtensor{:int64:}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuDimShuffle{0,x}.0, Shape: (0, 1), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuElemwise{mul,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuDimShuffle{0,x}.0, Shape: (0, 1), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuElemwise{add,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuSubtensor{int64::}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuElemwise{sub,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
- GpuElemwise{sub,no_inplace}.0, Shape: (0, 47108), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
TotalSize: 9142276336.0 Byte(s) 8.514 GB
TotalSize inputs: 227357052.0 Byte(s) 0.212 GB
So I created a toy dataset:
import numpy as np
# Random visit lengths; entries of length 1 are replaced by [-1].
lens = [np.random.randint(1, 10) for x in range(1000)]
data = [list(np.random.randint(0, 10, size=x, dtype=int)) if x > 1 else [-1] for x in lens]
I then tried different values for <n_input_codes>, and it breaks at around 47000. I think this is due to a tensor with shape (47000, 47000), which would make sense: 47000^2 * 4 (bytes) * 2 (data + gradient) is about 17.7 GB (roughly 16.5 GiB), which overflows the memory.
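The arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes, as the comment does, a dense square float32 tensor of shape (n_codes, n_codes) plus one gradient copy of the same size; the function name `dense_square_bytes` is made up for illustration.

```python
def dense_square_bytes(n_codes, bytes_per_elem=4, copies=1):
    """Bytes needed for `copies` dense (n_codes, n_codes) tensors."""
    return n_codes * n_codes * bytes_per_elem * copies


GIB = 1024 ** 3
for n in (33000, 47000):
    one = dense_square_bytes(n) / GIB
    two = dense_square_bytes(n, copies=2) / GIB
    print(f"{n} codes: {one:.2f} GiB per matrix, {two:.2f} GiB with a gradient copy")
```

Because the cost grows with the square of the number of codes, a modest reduction in vocabulary size cuts the footprint substantially.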
I think this issue can be closed. I recreated the original dataset with fewer codes by grouping some of them together. The total number of codes is now 33000, which works just fine and trains decently well. I recommend adding a small note saying that a large number of codes can lead to issues.
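For anyone wanting to try the same workaround, here is a minimal sketch of grouping codes and renumbering them. The thread does not say how the codes were actually grouped, so the grouping function `group_of`, the helper `group_and_reindex`, and the [-1] separator handling are all assumptions made for illustration.

```python
def group_and_reindex(seqs, group_of, separator=-1):
    """Map each code to its group, then renumber groups contiguously from 0."""
    new_id = {}  # group key -> new 0-based index
    out = []
    for visit in seqs:
        # Assumption: [separator] entries are boundaries and are kept unchanged.
        if list(visit) == [separator]:
            out.append([separator])
            continue
        mapped = []
        for code in visit:
            idx = new_id.setdefault(group_of(code), len(new_id))
            if idx not in mapped:  # drop duplicates created by merging codes
                mapped.append(idx)
        out.append(mapped)
    return out, new_id


seqs = [[0, 1, 5], [-1], [4, 9]]
grouped, mapping = group_and_reindex(seqs, group_of=lambda c: c // 2)
print(grouped)       # [[0, 1], [-1], [1, 2]]
print(len(mapping))  # 3
```

The length of the returned mapping is the new vocabulary size to pass as the number of input codes.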
Thank you for the amazing paper and providing the code for it!
Thanks for the important info.
I never had this problem because my dataset had fewer than 40K unique codes.
I will add this (and your username) to the readme.
Best,
Ed