
retain's People

Contributors

davedecaprio, mp2893

retain's Issues

Reason for using 2 sets of attention weights?

Hello retain team,

Great job! Thank you for sharing it.
Could you explain why you use two sets of attention weights (visits and variables) instead of only one for the variables?
With a single variable-level set you could still get a visit contribution by aggregating, for instance averaging or summing the variable weights of each visit (see the sketch below).
Thanks in advance for your help.
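
As a rough illustration of how the two attention levels interact, here is a minimal NumPy sketch (not the repo's Theano code; the toy sizes and the stand-in attention values are assumptions) of a RETAIN-style context vector and of aggregating the variable-level weights back into a visit-level contribution, as suggested above:

    # Hedged sketch: how visit-level alpha and variable-level beta combine,
    # and how per-variable weights can be summed back to a per-visit score.
    import numpy as np

    rng = np.random.default_rng(0)
    n_visits, n_codes, emb_dim = 4, 10, 8                 # toy sizes (assumed)
    X     = rng.integers(0, 2, size=(n_visits, n_codes)).astype(float)  # multi-hot visits
    W_emb = rng.normal(size=(emb_dim, n_codes))           # code embedding matrix
    w_out = rng.normal(size=emb_dim)                      # output weight (binary case)

    V     = X @ W_emb.T                                   # visit embeddings v_j
    alpha = rng.dirichlet(np.ones(n_visits))              # stand-in for RNN_alpha output
    beta  = np.tanh(rng.normal(size=(n_visits, emb_dim))) # stand-in for RNN_beta output

    # Context vector: c = sum_j alpha_j * (beta_j ⊙ v_j)
    c = (alpha[:, None] * beta * V).sum(axis=0)
    logit = w_out @ c

    # Per-variable contribution of code k in visit j (roughly following the
    # paper's interpretation section, as I understand it):
    #   omega(j, k) = alpha_j * w_out · (beta_j ⊙ W_emb[:, k]) * x_{j,k}
    omega = alpha[:, None] * (beta @ (w_out[:, None] * W_emb)) * X

    # Visit-level contribution obtained by aggregating the variable weights:
    visit_contrib = omega.sum(axis=1)
    print(np.isclose(visit_contrib.sum(), logit))         # contributions add up to the logit

Because the per-variable contributions sum to the logit (ignoring the output bias), the same quantities can be read off at either granularity.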

Reason behind dividing GRUs' outputs by 2

Hello Edward,

Thank you for sharing the code for the RETAIN paper! I have been looking through it and found that in the attentionStep function of build_model you divide the GRUs' outputs by 2. What is the reasoning behind this operation? Relevant code:

def attentionStep(att_timesteps):
    # Reverse the embedded sequence up to the current timestep
    reverse_emb_t = temb[:att_timesteps][::-1]
    # Run the alpha- and beta-attention GRUs, re-reverse their outputs,
    # and scale both by 0.5 (the operation asked about above)
    reverse_h_a = gru_layer(tparams, reverse_emb_t, 'a', alphaHiddenDimSize)[::-1] * 0.5
    reverse_h_b = gru_layer(tparams, reverse_emb_t, 'b', betaHiddenDimSize)[::-1] * 0.5

Use of Bidirectional LSTM

Hi Edward,

This is more of a question than an actual issue, so I will close it immediately:

What do you think about using a bidirectional LSTM instead of just the reversed order? Do you think it impedes the interpretable spirit of RETAIN, or is it fine to use as long as it provides a boost in performance? I've been using a concatenated bidirectional LSTM with a moderate increase in AUC (see the sketch after this message). Since the outputs are concatenated, it seems plausible that the network will select whichever order provides more value, but I am not entirely sure this makes sense from a clinical perspective.

Do you have any thoughts on this?

Thanks,
Tim
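
For concreteness, here is a minimal PyTorch sketch of the variant described in the question (not part of this repo, which is written in Theano); the layer sizes and the concat merge are assumptions:

    # Hedged sketch: a bidirectional GRU for attention generation, with the
    # forward and backward outputs concatenated before the attention projections.
    import torch
    import torch.nn as nn

    class BiAttentionRNN(nn.Module):
        def __init__(self, emb_dim=128, hidden=64):
            super().__init__()
            # One bidirectional GRU each for visit-level and variable-level attention
            self.rnn_a = nn.GRU(emb_dim, hidden, bidirectional=True)
            self.rnn_b = nn.GRU(emb_dim, hidden, bidirectional=True)
            self.w_alpha = nn.Linear(2 * hidden, 1)        # scalar score per visit
            self.w_beta = nn.Linear(2 * hidden, emb_dim)   # per-dimension gate

        def forward(self, v):            # v: (seq_len, batch, emb_dim) visit embeddings
            g, _ = self.rnn_a(v)         # (seq_len, batch, 2*hidden)
            h, _ = self.rnn_b(v)
            alpha = torch.softmax(self.w_alpha(g), dim=0)  # softmax over visits
            beta = torch.tanh(self.w_beta(h))
            context = (alpha * beta * v).sum(dim=0)        # (batch, emb_dim)
            return context, alpha, beta

    # Usage on a toy batch
    v = torch.randn(5, 2, 128)           # 5 visits, batch of 2
    context, alpha, beta = BiAttentionRNN()(v)
    print(context.shape, alpha.shape)    # torch.Size([2, 128]) torch.Size([5, 2, 1])

Whether the backward direction's use of future visits is acceptable clinically is exactly the open question above; the sketch only shows that the attention shapes and the context computation carry over unchanged.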

Using RETAIN for a multi-class problem

Hi Edward,

This is more of a question on the RETAIN paper than on the actual code.
I have a use case where RETAIN could be a good solution; however, the problem is multi-class (5 classes) in nature.
I am interested to hear your initial thoughts on my approach:
I would implement the attention mechanism exactly as explained in the RETAIN paper, but enhance the model architecture to predict multiple classes as output (i.e., one set of alpha, beta, and context, but 5 nodes in the output layer); see the sketch after this message.

Do you expect the attention mechanism and its interpretation to still hold with this approach? Does having just one context vector for five classes make sense?

Thanks in advance!
-Priya
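
A minimal NumPy sketch of the proposed variant, assuming a single context vector feeding a 5-way softmax output layer (the sizes and weights here are toy values, not from the repo):

    # Hedged sketch: one shared context vector, five output nodes, and a
    # class-specific attribution read off the corresponding row of W.
    import numpy as np

    rng = np.random.default_rng(0)
    emb_dim, n_classes = 8, 5

    c = rng.normal(size=emb_dim)               # context vector from the alpha/beta attention
    W = rng.normal(size=(n_classes, emb_dim))  # output weights: one row per class
    b = np.zeros(n_classes)

    logits = W @ c + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                       # softmax over the 5 classes

    # The attention machinery is shared, but the attribution becomes
    # class-specific by scoring the context against W[k] for the class of interest.
    k = int(probs.argmax())
    per_dim_contrib = W[k] * c                 # sums (plus b[k]) to logits[k]
    print(np.isclose(per_dim_contrib.sum() + b[k], logits[k]))

Under this setup the attention weights are shared across classes, while the per-code interpretation can still be computed separately for each class through its row of W.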

Imbalance and noise handling

Hello Ed,

First of all, thank you for this amazing work. I was wondering if you could answer the following questions, it'd be really appreciated.

  1. How do you think RETAIN handles class imbalance? It performed well on my imbalanced dataset, but I couldn't work out why.

  2. Patients can have common medical codes like hypertension or diabetes, which can be regarded as noise when predicting other diseases. Can RETAIN learn to give such codes less weight even though they appear frequently?

Best Regards,

Feeding sequence in inverse order

Hi,

My query is more like a question than an issue. I am a bit confused about the following footnote in your paper (screenshot attached).

[screenshot of the footnote from the paper]

My question is: why would feeding the sequence in forward order generate the same e_1 and beta_1, given that they also depend on learnable parameters? (See the sketch after this message.)

Regards,

Usama.
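
A minimal NumPy sketch of the property the footnote appears to refer to, assuming fixed (already-trained) parameters and a plain tanh RNN as a stand-in for the attention RNNs:

    # Hedged sketch: with a forward RNN, the step-1 hidden state -- and hence
    # e_1 and beta_1 -- only ever sees v_1, so it is identical whether the
    # patient has 3 visits or 10. Feeding the reversed sequence makes the
    # state for visit 1 depend on the whole visit window.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 4
    W_h, W_x, b = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)

    def rnn(seq):                      # a plain tanh RNN with fixed parameters
        h, states = np.zeros(d), []
        for v in seq:
            h = np.tanh(W_h @ h + W_x @ v + b)
            states.append(h)
        return states

    visits = [rng.normal(size=d) for _ in range(10)]
    short_seq, long_seq = visits[:3], visits[:10]

    # Forward order: the state used for visit 1 never changes as visits accumulate.
    print(np.allclose(rnn(short_seq)[0], rnn(long_seq)[0]))             # True

    # Reversed order (as in the paper): visit 1 is processed last, so its state
    # depends on every visit in the window and changes as new visits arrive.
    print(np.allclose(rnn(short_seq[::-1])[-1], rnn(long_seq[::-1])[-1]))  # False

With forward feeding, e_1 and beta_1 come from a hidden state that only ever saw v_1, so once the parameters are fixed they are the same regardless of how many later visits exist; reversing the sequence is what makes them depend on the full window.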

error: unrecognized arguments: --dropout_context 0.8 --dropout_emb 0.0

Hi, I've just started using this repo and got the following error,

ajay@ajay-h8-1170uk:~/PythonProjects/retain-master$ python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100 --dropout_context 0.8 --dropout_emb 0.0. 
Using gpu device 0: GeForce GTX 570
usage: retain.py [-h] [--time_file TIME_FILE] [--model_file MODEL_FILE]
                 [--use_log_time {0,1}] [--embed_file EMBED_FILE]
                 [--embed_size EMBED_SIZE] [--embed_finetune {0,1}]
                 [--alpha_hidden_dim_size ALPHA_HIDDEN_DIM_SIZE]
                 [--beta_hidden_dim_size BETA_HIDDEN_DIM_SIZE]
                 [--batch_size BATCH_SIZE] [--n_epochs N_EPOCHS]
                 [--L2_output L2_OUTPUT] [--L2_emb L2_EMB]
                 [--L2_alpha L2_ALPHA] [--L2_beta L2_BETA]
                 [--keep_prob_emb KEEP_PROB_EMB]
                 [--keep_prob_context KEEP_PROB_CONTEXT] [--log_eps LOG_EPS]
                 [--solver {adadelta,adam}] [--simple_load] [--verbose]
                 <visit_file> <n_input_codes> <label_file> <out_file>
retain.py: error: unrecognized arguments: --dropout_context 0.8 --dropout_emb 0.0.

The code, though, seems to train fine using only:

 python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100  
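
Judging from the usage message above, the dropout options appear to be exposed as keep probabilities (--keep_prob_emb, --keep_prob_context) rather than as dropout rates. If the intent was a dropout rate of 0.8 on the context vector and 0.0 on the embedding, the corresponding invocation would presumably be:

 python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100 --keep_prob_context 0.2 --keep_prob_emb 1.0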

Including more features apart from diagnostic codes

Hi Retain team,

I am interested in using RETAIN for my research. I wanted to know how to prepare the data when there are additional features, such as vital signs or medications, beyond the diagnostic codes. Do I need to concatenate all the features together? For example, for a patient with diagnostic codes (c1 to c3) and vital signs (v1 to v3), should the input for a single visit be [c1, c2, c3, v1, v2, v3]?
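
A minimal sketch of one way to do this, assuming the repo's .seqs format is a pickled list of patients, each a list of visits, each visit a list of integer feature indices (the vocabulary size and feature names below are hypothetical):

    # Hedged sketch: give non-diagnostic features their own indices after the
    # diagnosis-code vocabulary and append them to each visit's code list.
    import pickle

    n_diag_codes = 942                              # diagnosis-code vocabulary size
    extra_vocab = {"bp_high": n_diag_codes + 0,     # hypothetical binned vital signs
                   "hr_high": n_diag_codes + 1,
                   "med_aspirin": n_diag_codes + 2}  # hypothetical medication code

    # One toy patient with two visits: diagnosis codes plus extra features per visit
    patient = [
        [12, 87, extra_vocab["bp_high"]],                               # visit 1: [c1, c2, v1]
        [12, 301, extra_vocab["hr_high"], extra_vocab["med_aspirin"]],  # visit 2
    ]

    with open("toy.seqs", "wb") as f:
        pickle.dump([patient], f)                   # retain.py would then be run with
                                                    # n_input_codes = 942 + len(extra_vocab)

Under this format each visit is a set of categorical codes, so continuous measurements like vital signs would first have to be discretized (e.g., binned into high/normal/low) before being assigned indices.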
