Comments (2)

mp2893 commented on September 26, 2024

Hi,

I've seen cost values similar to yours in my experiments, although I don't recall running into NaNs.
Since I don't know the details of your experiment (e.g., the dataset or the application), I can't say for sure why you are getting NaNs.
But I think simple gradient clipping would suffice in this situation.
I can't guarantee when, but I will add an option to turn on gradient clipping in the future.
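
For readers unfamiliar with the term, here is a minimal NumPy sketch of gradient clipping by global norm. It is not the med2vec code: the function name `clip_by_global_norm`, the threshold, and the example gradients are all hypothetical. The point it illustrates is that clipping acts on the gradients (rescaling them when their joint L2 norm is too large), not on the cost value itself.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        # Already within the threshold: return copies unchanged.
        return [g.copy() for g in grads], total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

# Hypothetical gradients for two parameters; joint norm is sqrt(9 + 16 + 144) = 13.
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = np.sqrt(sum(float(np.sum(g ** 2)) for g in clipped))
```

The clipped gradients keep their direction and would then be passed to the optimizer update in place of the raw ones. Note that clipping cannot repair a gradient that is already NaN, since scaling NaN still yields NaN.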


tRosenflanz commented on September 26, 2024

That sounds fair to me. I appreciate the help and understand that you have other things going on!

For all practical purposes, I think it is fair to say that my dataset is similar to the one from the paper: it is a list of ICD, CPT, and NDC codes from each person's visits to the doctor, transformed as described in the provided README (lists of indexed ints, with different patients separated by [-1]).

I have tried implementing gradient clipping by adding grad_clip on total_cost in the build_model method, but even with thresholds of -0.5 and 0.5 I still eventually get NaN (probably because that is not the right way to do it).
Here is the lengthy output of NanGuardMode, in case it helps:

Med2Vec$ CUDA_VISIBLE_DEVICES=0 THEANO_FLAGS=mode=NanGuardMode python med2vec.py /data/trosenfl/visit.pkl 32228 output.pkl --batch_size 10 --cr_size 500 --vr_size 1000 --window_size 3 --verbose --n_epoch 20
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10).  Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: Tesla P100-SXM2-16GB (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5110)
initializing parameters
building models
loading data
object
training start
epoch:0, iteration:0/1475771, cost:374.362030
epoch:0, iteration:10/1475771, cost:292.984375
epoch:0, iteration:20/1475771, cost:333.820068
epoch:0, iteration:30/1475771, cost:496.689789
Traceback (most recent call last):
  File "med2vec.py", line 321, in <module>
    train_med2vec(seqFile=args.seq_file, demoFile=args.demo_file, labelFile=args.label_file, outFile=args.out_file, numXcodes=args.n_input_codes, numYcodes=args.n_output_codes, embDimSize=args.cr_size, hiddenDimSize=args.vr_size, batchSize=args.batch_size, maxEpochs=args.n_epoch, L2_reg=args.L2_reg, demoSize=args.demo_size, windowSize=args.window_size, logEps=args.log_eps, verbose=args.verbose)
  File "med2vec.py", line 289, in train_med2vec
    cost = f_grad_shared(x, mask, iVector, jVector)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 884, in __call__
    self.fn() if output_subset is None else\
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/vm.py", line 513, in __call__
    storage_map=storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/vm.py", line 482, in __call__
    _, dt = self.run_thunk_of_node(current_apply)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/vm.py", line 402, in run_thunk_of_node
    compute_map=self.compute_map,
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/nanguardmode.py", line 344, in nan_check
    do_check_on(storage_map[var][0], node)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/nanguardmode.py", line 332, in do_check_on
    raise AssertionError(msg)
AssertionError: NaN detected
NanGuardMode found an error in the output of a node in this variable:
GpuElemwise{Composite{(((((((-i0) + (-i1)) / i2) + (((-i3) + (-i4)) / i5)) + (((-i6) + (-i7)) / i8)) + ((-i9) / i10)) + (i11 * i12))}}[(0, 0)] [id A] ''
 |GpuCAReduce{add}{1,1} [id B] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id C] ''
 |   |GpuSubtensor{int64::} [id D] ''
 |   | |GpuFromHost [id E] ''
 |   | | |x [id F]
 |   | |Constant{1} [id G]
 |   |GpuElemwise{add,no_inplace} [id H] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id J] ''
 |   |   |GpuSubtensor{:int64:} [id K] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | | |GpuDot22 [id M] ''
 |   |   | | | |GpuElemwise{maximum,no_inplace} [id N] ''
 |   |   | | | | |GpuElemwise{Add}[(0, 0)] [id O] ''
 |   |   | | | | | |GpuDot22 [id P] ''
 |   |   | | | | | | |GpuElemwise{maximum,no_inplace} [id Q] ''
 |   |   | | | | | | | |GpuElemwise{Add}[(0, 0)] [id R] ''
 |   |   | | | | | | | | |GpuDot22 [id S] ''
 |   |   | | | | | | | | | |GpuFromHost [id E] ''
 |   |   | | | | | | | | | |W_emb [id T]
 |   |   | | | | | | | | |GpuDimShuffle{x,0} [id U] ''
 |   |   | | | | | | | |   |b_emb [id V]
 |   |   | | | | | | | |CudaNdarrayConstant{[[ 0.]]} [id W]
 |   |   | | | | | | |W_hidden [id X]
 |   |   | | | | | |GpuDimShuffle{x,0} [id Y] ''
 |   |   | | | | |   |b_hidden [id Z]
 |   |   | | | | |CudaNdarrayConstant{[[ 0.]]} [id W]
 |   |   | | | |W_output [id BA]
 |   |   | | |b_output [id BB]
 |   |   | |Constant{-1} [id BC]
 |   |   |GpuElemwise{mul,no_inplace} [id BD] ''
 |   |     |GpuDimShuffle{0,x} [id BE] ''
 |   |     | |GpuSubtensor{:int64:} [id BF] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   | |mask [id BH]
 |   |     |   |Constant{-1} [id BC]
 |   |     |GpuDimShuffle{0,x} [id BI] ''
 |   |       |GpuSubtensor{int64::} [id BJ] ''
 |   |         |GpuFromHost [id BG] ''
 |   |         |Constant{1} [id G]
 |   |GpuElemwise{sub,no_inplace} [id BK] ''
 |   | |CudaNdarrayConstant{[[ 1.]]} [id BL]
 |   | |GpuSubtensor{int64::} [id D] ''
 |   |GpuElemwise{mul,no_inplace} [id J] ''
 |GpuCAReduce{add}{1,1} [id BM] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id BN] ''
 |   |GpuSubtensor{:int64:} [id BO] ''
 |   | |GpuFromHost [id E] ''
 |   | |Constant{-1} [id BC]
 |   |GpuElemwise{add,no_inplace} [id BP] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id BQ] ''
 |   |   |GpuSubtensor{int64::} [id BR] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | |Constant{1} [id G]
 |   |   |GpuElemwise{mul,no_inplace} [id BD] ''
 |   |GpuElemwise{sub,no_inplace} [id BS] ''
 |   | |CudaNdarrayConstant{[[ 1.]]} [id BL]
 |   | |GpuSubtensor{:int64:} [id BO] ''
 |   |GpuElemwise{mul,no_inplace} [id BQ] ''
 |GpuElemwise{Add}[(0, 1)] [id BT] ''
 | |CudaNdarrayConstant{9.99999993923e-09} [id BU]
 | |GpuCAReduce{add}{1,1} [id BV] ''
 |   |GpuElemwise{mul,no_inplace} [id BD] ''
 |GpuCAReduce{add}{1,1} [id BW] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id BX] ''
 |   |GpuSubtensor{int64::} [id BY] ''
 |   | |GpuFromHost [id E] ''
 |   | |Constant{2} [id BZ]
 |   |GpuElemwise{add,no_inplace} [id CA] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id CB] ''
 |   |   |GpuSubtensor{:int64:} [id CC] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | |Constant{-2} [id CD]
 |   |   |GpuElemwise{Composite{((i0 * i1) * i2)},no_inplace} [id CE] ''
 |   |     |GpuDimShuffle{0,x} [id CF] ''
 |   |     | |GpuSubtensor{:int64:} [id CG] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   |Constant{-2} [id CD]
 |   |     |GpuDimShuffle{0,x} [id CH] ''
 |   |     | |GpuSubtensor{int64:int64:} [id CI] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   |Constant{1} [id G]
 |   |     |   |Constant{-1} [id BC]
 |   |     |GpuDimShuffle{0,x} [id CJ] ''
 |   |       |GpuSubtensor{int64::} [id CK] ''
 |   |         |GpuFromHost [id BG] ''
 |   |         |Constant{2} [id BZ]
 |   |GpuElemwise{sub,no_inplace} [id CL] ''
 |   | |GpuSubtensor{int64::} [id BY] ''
 |   |GpuElemwise{mul,no_inplace} [id CB] ''
 |GpuCAReduce{add}{1,1} [id CM] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id CN] ''
 |   |GpuSubtensor{:int64:} [id CO] ''
 |   | |GpuFromHost [id E] ''
 |   | |Constant{-2} [id CD]
 |   |GpuElemwise{add,no_inplace} [id CP] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id CQ] ''
 |   |   |GpuSubtensor{int64::} [id CR] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | |Constant{2} [id BZ]
 |   |   |GpuElemwise{Composite{((i0 * i1) * i2)},no_inplace} [id CE] ''
 |   |GpuElemwise{sub,no_inplace} [id CS] ''
 |   | |CudaNdarrayConstant{[[ 1.]]} [id BL]
 |   | |GpuSubtensor{:int64:} [id CO] ''
 |   |GpuElemwise{mul,no_inplace} [id CQ] ''
 |GpuElemwise{Add}[(0, 1)] [id CT] ''
 | |CudaNdarrayConstant{9.99999993923e-09} [id BU]
 | |GpuCAReduce{add}{1,1} [id CU] ''
 |   |GpuElemwise{Composite{((i0 * i1) * i2)},no_inplace} [id CE] ''
 |GpuCAReduce{add}{1,1} [id CV] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id CW] ''
 |   |GpuSubtensor{int64::} [id CX] ''
 |   | |GpuFromHost [id E] ''
 |   | |Constant{3} [id CY]
 |   |GpuElemwise{add,no_inplace} [id CZ] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id DA] ''
 |   |   |GpuSubtensor{:int64:} [id DB] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | |Constant{-3} [id DC]
 |   |   |GpuElemwise{Composite{(((i0 * i1) * i2) * i3)},no_inplace} [id DD] ''
 |   |     |GpuDimShuffle{0,x} [id DE] ''
 |   |     | |GpuSubtensor{:int64:} [id DF] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   |Constant{-3} [id DC]
 |   |     |GpuDimShuffle{0,x} [id DG] ''
 |   |     | |GpuSubtensor{int64:int64:} [id DH] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   |Constant{1} [id G]
 |   |     |   |Constant{-2} [id CD]
 |   |     |GpuDimShuffle{0,x} [id DI] ''
 |   |     | |GpuSubtensor{int64:int64:} [id DJ] ''
 |   |     |   |GpuFromHost [id BG] ''
 |   |     |   |Constant{2} [id BZ]
 |   |     |   |Constant{-1} [id BC]
 |   |     |GpuDimShuffle{0,x} [id DK] ''
 |   |       |GpuSubtensor{int64::} [id DL] ''
 |   |         |GpuFromHost [id BG] ''
 |   |         |Constant{3} [id CY]
 |   |GpuElemwise{sub,no_inplace} [id DM] ''
 |   | |CudaNdarrayConstant{[[ 1.]]} [id BL]
 |   | |GpuSubtensor{int64::} [id CX] ''
 |   |GpuElemwise{mul,no_inplace} [id DA] ''
 |GpuCAReduce{add}{1,1} [id DN] ''
 | |GpuElemwise{Composite{((i0 * log(i1)) + (i2 * log1p((-i3))))},no_inplace} [id DO] ''
 |   |GpuSubtensor{:int64:} [id DP] ''
 |   | |GpuFromHost [id E] ''
 |   | |Constant{-3} [id DC]
 |   |GpuElemwise{add,no_inplace} [id DQ] ''
 |   | |CudaNdarrayConstant{[[  9.99999994e-09]]} [id I]
 |   | |GpuElemwise{mul,no_inplace} [id DR] ''
 |   |   |GpuSubtensor{int64::} [id DS] ''
 |   |   | |GpuSoftmaxWithBias [id L] ''
 |   |   | |Constant{3} [id CY]
 |   |   |GpuElemwise{Composite{(((i0 * i1) * i2) * i3)},no_inplace} [id DD] ''
 |   |GpuElemwise{sub,no_inplace} [id DT] ''
 |   | |CudaNdarrayConstant{[[ 1.]]} [id BL]
 |   | |GpuSubtensor{:int64:} [id DP] ''
 |   |GpuElemwise{mul,no_inplace} [id DR] ''
 |GpuElemwise{Add}[(0, 1)] [id DU] ''
 | |CudaNdarrayConstant{9.99999993923e-09} [id BU]
 | |GpuCAReduce{add}{1,1} [id DV] ''
 |   |GpuElemwise{Composite{(((i0 * i1) * i2) * i3)},no_inplace} [id DD] ''
 |GpuCAReduce{add}{1} [id DW] ''
 | |GpuElemwise{log,no_inplace} [id DX] ''
 |   |GpuElemwise{Composite{(i0 + (i1 / i2))},no_inplace} [id DY] ''
 |     |CudaNdarrayConstant{[  9.99999994e-09]} [id DZ]
 |     |GpuElemwise{Exp}[(0, 0)] [id EA] ''
 |     | |GpuCAReduce{add}{0,1} [id EB] ''
 |     |   |GpuElemwise{mul,no_inplace} [id EC] ''
 |     |     |GpuAdvancedSubtensor1 [id ED] ''
 |     |     | |GpuElemwise{maximum,no_inplace} [id EE] ''
 |     |     | | |W_emb [id T]
 |     |     | | |CudaNdarrayConstant{[[ 0.]]} [id W]
 |     |     | |Elemwise{Cast{int64}} [id EF] ''
 |     |     |   |iVector [id EG]
 |     |     |GpuAdvancedSubtensor1 [id EH] ''
 |     |       |GpuElemwise{maximum,no_inplace} [id EE] ''
 |     |       |Elemwise{Cast{int64}} [id EI] ''
 |     |         |jVector [id EJ]
 |     |GpuAdvancedSubtensor1 [id EK] ''
 |       |GpuCAReduce{add}{0,1} [id EL] ''
 |       | |GpuElemwise{Exp}[(0, 0)] [id EM] ''
 |       |   |GpuDot22 [id EN] ''
 |       |     |GpuElemwise{maximum,no_inplace} [id EE] ''
 |       |     |GpuDimShuffle{1,0} [id EO] ''
 |       |       |GpuElemwise{maximum,no_inplace} [id EE] ''
 |       |Elemwise{Cast{int64}} [id EF] ''
 |GpuFromHost [id EP] ''
 | |Elemwise{Cast{float32}} [id EQ] ''
 |   |Shape_i{0} [id ER] ''
 |     |iVector [id EG]
 |CudaNdarrayConstant{0.0010000000475} [id ES]
 |GpuCAReduce{pre=sqr,red=add}{1,1} [id ET] ''
   |W_emb [id T]


Apply node that caused the error: GpuElemwise{Composite{(((((((-i0) + (-i1)) / i2) + (((-i3) + (-i4)) / i5)) + (((-i6) + (-i7)) / i8)) + ((-i9) / i10)) + (i11 * i12))}}[(0, 0)](GpuCAReduce{add}{1,1}.0, GpuCAReduce{add}{1,1}.0, GpuElemwise{Add}[(0, 1)].0, GpuCAReduce{add}{1,1}.0, GpuCAReduce{add}{1,1}.0, GpuElemwise{Add}[(0, 1)].0, GpuCAReduce{add}{1,1}.0, GpuCAReduce{add}{1,1}.0, GpuElemwise{Add}[(0, 1)].0, GpuCAReduce{add}{1}.0, GpuFromHost.0, CudaNdarrayConstant{0.0010000000475}, GpuCAReduce{pre=sqr,red=add}{1,1}.0)
Toposort index: 138
Inputs types: [CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar), CudaNdarrayType(float32, scalar)]
Inputs shapes: [(), (), (), (), (), (), (), (), (), (), (), (), ()]
Inputs strides: [(), (), (), (), (), (), (), (), (), (), (), (), ()]
Inputs values: [CudaNdarray(nan), CudaNdarray(-75.6651153564), CudaNdarray(9.0), CudaNdarray(-68.2716598511), CudaNdarray(-68.2715835571), CudaNdarray(8.0), CudaNdarray(-60.6214866638), CudaNdarray(-60.6214637756), CudaNdarray(7.0), CudaNdarray(0.0), CudaNdarray(0.0), CudaNdarray(0.0010000000475), CudaNdarray(462.694641113)]
Outputs clients: [[HostFromGpu(GpuElemwise{Composite{(((((((-i0) + (-i1)) / i2) + (((-i3) + (-i4)) / i5)) + (((-i6) + (-i7)) / i8)) + ((-i9) / i10)) + (i11 * i12))}}[(0, 0)].0)]]
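
Two things stand out in the "Inputs values" line of this dump. The first input (i0) is already nan, and it is a sum of terms of the form i0 * log(i1) + i2 * log1p(-i3). The tenth and eleventh inputs (i9 and i10) are both 0.0, so the term (-i9) / i10 evaluates to 0/0. Both happen in the forward computation of the cost, so clipping gradients would not be expected to remove them. The NumPy sketch below reproduces both patterns with made-up values; the helper names, the example arrays, and the clipping epsilon of 1e-6 are assumptions for illustration, and whether a saturated softmax output is the actual cause in this run is not confirmed by the log.

```python
import numpy as np

def naive_terms(x, p):
    """Per-element x*log(p + 1e-8) + (1 - x)*log1p(-p), mirroring the dumped graph."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return x * np.log(p + np.float32(1e-8)) + (1 - x) * np.log1p(-p)

def clipped_terms(x, p, eps=np.float32(1e-6)):
    """Same expression, but with p kept strictly inside (0, 1) first.

    eps is 1e-6 rather than 1e-8 because 1 - 1e-8 rounds to 1.0 in float32.
    """
    p_safe = np.clip(p, eps, np.float32(1.0) - eps)
    return x * np.log(p_safe) + (1 - x) * np.log1p(-p_safe)

# Hypothetical targets and predictions; the first prediction is saturated at 1.0.
x = np.array([1.0, 0.0], dtype=np.float32)
p = np.array([1.0, 0.2], dtype=np.float32)

naive = naive_terms(x, p)     # first entry: 0 * log1p(-1) = 0 * (-inf) = nan
safe = clipped_terms(x, p)    # all entries finite

# Second pattern: a sum over zero pairs divided by a pair count of zero.
with np.errstate(invalid="ignore"):
    zero_over_zero = -np.float32(0.0) / np.float32(0.0)   # nan
```

If the second pattern applies, one thing worth checking is whether some mini-batches produce an empty iVector, since i10 in the dump is derived from its length.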
