
Comments (4)

mp2893 commented on June 26, 2024

Hi Victor,

Thanks for taking interest in my work.

As for your first question: no, I haven't tried other initialization strategies, but I think your approach makes sense. Would you care to contribute it to the repo?

For the second question: IIRC (it has been a long time since I wrote this code), ivec and jvec are constructed from the preprocessed patient records, so there is no concept of a "patient" in the minibatch. It is just a bunch of random visits from the EHR.
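
If it helps, the pair construction is roughly the sketch below. This is from memory, so the helper name and record format are a paraphrase rather than the actual preprocessing code: each visit is a list of code indices, and every ordered pair of distinct codes in a visit becomes one (i, j) example.

def build_pairs(visits):
    # visits: a list of visits, each a list of medical-code indices
    ivec, jvec = [], []
    for visit in visits:
        for i in visit:
            for j in visit:
                if i != j:  # pair every code with every other code in the same visit
                    ivec.append(i)
                    jvec.append(j)
    return ivec, jvec

# build_pairs([[0, 5, 7]]) -> ivec = [0, 0, 5, 5, 7, 7], jvec = [5, 7, 0, 7, 0, 5]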

Best
Ed


victorconan commented on June 26, 2024

Hi Ed! Thanks for the reply! I really appreciate it!

I am porting your code to TF2 and testing it. I will see if I can contribute to the repo. I am also comparing the results when I implement the code exactly as described in your paper. My data is larger (~2M patients, ~77k medical codes), and it seems to take 2.5 days to train one epoch on a single CPU...


mp2893 commented on June 26, 2024

Sounds interesting. Feel free to share any results from your experiments, so that others might gain new knowledge!


victorconan commented on June 26, 2024

I got my 10 epochs of training done, and I found that 80% of the codes end up with all-zero embeddings (I am taking ReLU(W_emb))... In general, the visit loss (~1e-3) is much smaller than the code loss (~10). It seems the co-occurrence loss dominates the training, and it has difficulty learning embeddings for most of the codes.
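
For reference, this is roughly how I counted the dead codes (emb_w here is the raw embedding matrix, as in the snippets below):

relu_w = tf.maximum(emb_w, 0)                        # ReLU(W_emb), [num_codes, emb_dim]
dead = tf.reduce_all(tf.equal(relu_w, 0.0), axis=1)  # True where a code's row is all zeros
dead_frac = tf.reduce_mean(tf.cast(dead, tf.float32))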

I also found that porting the code loss to TF 2 runs into an issue when calculating the exponential terms: taking the exponential of the inner products effectively requires the embedding vectors to be sparse, otherwise the values overflow float32:

# emb_w: [num_codes, emb_dim] raw code-embedding matrix; eps: small constant (e.g. 1e-8)
emb_w = tf.maximum(emb_w, 0)           # ReLU(W_emb)
emb_w_transpose = tf.transpose(emb_w)  # [emb_dim, num_codes]

# softmax denominator for every code i: sum_k exp(w_i . w_k)
norms = tf.reduce_sum(tf.math.exp(tf.matmul(emb_w, emb_w_transpose)), axis=1)

i = tf.gather(emb_w_transpose, ivec, axis=1)  # [emb_dim, batch]
j = tf.gather(emb_w_transpose, jvec, axis=1)  # [emb_dim, batch]

numerator = tf.math.exp(tf.reduce_sum(j * i, axis=0))  # exp(w_j . w_i); overflows for large dot products
denominator = tf.gather(norms, ivec)
cost = -tf.math.log(numerator / denominator + eps)
cost = tf.reduce_mean(cost)
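
To make the failure concrete: float32 tops out around 3.4e38, so the exponential overflows to inf as soon as an inner product passes roughly 88:

tf.math.exp(tf.constant(88.0))  # ~1.65e38, close to the float32 max
tf.math.exp(tf.constant(89.0))  # inf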

So I switched to tf.math.reduce_logsumexp below, which prevents an inf loss:

# same pairwise dot products as before, but kept in log space (no exp)
norms = tf.matmul(emb_w, emb_w_transpose)  # [num_codes, num_codes] logits

numerator = tf.reduce_sum(j * i, axis=0)   # w_j . w_i, i.e. the log of the old numerator
denominator = tf.math.reduce_logsumexp(tf.gather(norms, ivec), axis=1)
cost = -(numerator - denominator)          # -log softmax, computed stably
cost = tf.reduce_mean(cost)
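
The reason this avoids inf is, presumably, the standard max-shift identity behind reduce_logsumexp, which never exponentiates anything larger than zero:

# logsumexp(x) = max(x) + log(sum(exp(x - max(x))))
x = tf.constant([100.0, 101.0, 102.0])
tf.math.log(tf.reduce_sum(tf.math.exp(x)))  # inf in float32
tf.math.reduce_logsumexp(x)                 # ~102.41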

And it's 3 times slower...

