mp2893 / retain
RETAIN: Interpretable Predictive Model in Healthcare using Reverse Time Attention Mechanism
License: BSD 3-Clause "New" or "Revised" License
Hello retain team,
Great job! Thank you for sharing it.
Do you have an explanation for why you use two sets of attention weights (visit-level and variable-level) instead of only one for variables?
With a single set you could still get a visit-level contribution by aggregating, for instance averaging or summing the variable weights of each visit.
Thanks in advance for your help
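For concreteness, the aggregation proposed in the question can be sketched as below. All arrays are random placeholders; alpha/beta follow the paper's naming (visit-level and variable-level weights), not the repo's actual variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visits, n_vars = 3, 5

# RETAIN's two separate sets of weights:
alpha = rng.dirichlet(np.ones(n_visits))             # visit-level weights (sum to 1)
beta = np.tanh(rng.normal(size=(n_visits, n_vars)))  # variable-level weight vectors

# Combined influence of variable k in visit i is alpha[i] * beta[i, k]
combined = alpha[:, None] * beta

# The single-set alternative the question describes: recover a visit-level
# score from the variable weights alone, e.g. mean absolute weight per visit
visit_score = np.abs(beta).mean(axis=1)
```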
Hello Edward,
Thank you for sharing the code for the RETAIN paper! I have been looking through it and found that in the attention_step function of build_model you've divided the GRUs' outputs by 2. What is the reasoning behind this operation? Relevant code:
def attentionStep(att_timesteps):
    reverse_emb_t = temb[:att_timesteps][::-1]
    reverse_h_a = gru_layer(tparams, reverse_emb_t, 'a', alphaHiddenDimSize)[::-1] * 0.5
    reverse_h_b = gru_layer(tparams, reverse_emb_t, 'b', betaHiddenDimSize)[::-1] * 0.5
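For readers following along, the attention step that these GRU outputs feed can be sketched in NumPy as follows. Shapes and parameter names are illustrative (this is the paper's scheme, not the repo's exact API, and it omits the 0.5 scaling being asked about):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
t, emb_dim, alpha_dim, beta_dim = 4, 8, 8, 8  # t visits, illustrative sizes

emb = rng.normal(size=(t, emb_dim))      # visit embeddings v_1..v_t
g = rng.normal(size=(t, alpha_dim))      # RNN_alpha outputs (reverse time)
h = rng.normal(size=(t, beta_dim))       # RNN_beta outputs (reverse time)
w_alpha = rng.normal(size=alpha_dim)     # projection to a scalar per visit
W_beta = rng.normal(size=(beta_dim, emb_dim))

alpha = softmax(g @ w_alpha)             # one scalar attention weight per visit
beta = np.tanh(h @ W_beta)               # one weight vector per visit
context = (alpha[:, None] * beta * emb).sum(axis=0)  # context vector c_t
```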
Hi Edward,
This is more of a question than an actual issue so I will close it immediately:
What do you think about using a bidirectional LSTM instead of just the reversed order? Do you think it impedes the interpretable spirit of RETAIN, or is it fine to use as long as it provides a boost in performance? I've been using a concatenated bidirectional LSTM with a moderate increase in AUC. Because the outputs are concatenated, it seems plausible that the network will select whichever order provides more value, but I am not entirely sure this makes sense from a clinical perspective.
Do you have any thoughts on this?
Thanks,
Tim
Hi Edward,
This is more of a question on the RETAIN paper than on the actual code.
I have a use case where RETAIN could be a good solution; however, the problem is multi-class (5 classes) in nature.
I am interested to hear your initial thoughts on my approach:
I implement the attention mechanism exactly as explained in the RETAIN paper, but the model architecture is extended to predict a multi-class output (i.e., I will have one set of alpha, beta, and context vectors, but 5 nodes in the output layer).
Do you expect the attention mechanism and its interpretation to still hold with this approach? Does having just one context vector for five classes make sense?
Thanks in advance!
-Priya
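The multi-class extension described above amounts to replacing the sigmoid output with a 5-way softmax over the single context vector. A minimal sketch, with random placeholder weights and an assumed context dimension:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
context_dim, n_classes = 16, 5           # illustrative sizes

c = rng.normal(size=context_dim)         # one context vector per patient
W_out = rng.normal(size=(n_classes, context_dim))  # 5 output nodes
b_out = np.zeros(n_classes)

probs = softmax(W_out @ c + b_out)       # class probabilities summing to 1
```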
Hello Ed,
First of all, thank you for this amazing work. I was wondering if you could answer the following questions, it'd be really appreciated.
How do you think RETAIN handles class imbalance? I personally saw it handle imbalance well in my dataset, but couldn't work out why it worked so well.
Patients can have common medical codes like hypertension or diabetes, which can be regarded as noise when predicting other diseases. Can RETAIN assign such codes less weight even though they appear frequently?
Best Regards,
Hi, I've just started using this repo and got the following error,
ajay@ajay-h8-1170uk:~/PythonProjects/retain-master$ python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100 --dropout_context 0.8 --dropout_emb 0.0.
Using gpu device 0: GeForce GTX 570
usage: retain.py [-h] [--time_file TIME_FILE] [--model_file MODEL_FILE]
[--use_log_time {0,1}] [--embed_file EMBED_FILE]
[--embed_size EMBED_SIZE] [--embed_finetune {0,1}]
[--alpha_hidden_dim_size ALPHA_HIDDEN_DIM_SIZE]
[--beta_hidden_dim_size BETA_HIDDEN_DIM_SIZE]
[--batch_size BATCH_SIZE] [--n_epochs N_EPOCHS]
[--L2_output L2_OUTPUT] [--L2_emb L2_EMB]
[--L2_alpha L2_ALPHA] [--L2_beta L2_BETA]
[--keep_prob_emb KEEP_PROB_EMB]
[--keep_prob_context KEEP_PROB_CONTEXT] [--log_eps LOG_EPS]
[--solver {adadelta,adam}] [--simple_load] [--verbose]
<visit_file> <n_input_codes> <label_file> <out_file>
retain.py: error: unrecognized arguments: --dropout_context 0.8 --dropout_emb 0.0.
The code, though, seems to train fine using only:
python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100
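Judging from the usage message above, the parser exposes --keep_prob_emb and --keep_prob_context rather than dropout flags, which would explain the "unrecognized arguments" error. A corrected invocation might look like the following; whether the intended 0.8 should map to a keep probability of 0.8 or 0.2 depends on which convention the original dropout flags used, so treat the values as an assumption to verify:

```shell
python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts \
    /home/ajay/PythonProjects/retain-master \
    --simple_load --n_epochs 100 \
    --keep_prob_context 0.8 --keep_prob_emb 1.0
```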
Hi Retain team,
I am interested in using RETAIN for my research. I wanted to know how to prepare the data when there are more features, like vital signs or medications, in addition to diagnosis codes. Do I need to concatenate all the features together? For example, for a patient with diagnosis codes (c1 to c3) and vital signs (v1 to v3), should the input be [c1, c2, c3, v1, v2, v3] for a single visit?
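One plausible way to build such a concatenated per-visit input is sketched below: a multi-hot vector over the code vocabulary followed by the numeric vital-sign values. The vocabulary and values are hypothetical placeholders, not part of the repo:

```python
import numpy as np

# Hypothetical vocabularies: 3 diagnosis codes, 3 vital-sign features
code_vocab = {"c1": 0, "c2": 1, "c3": 2}
n_codes, n_vitals = len(code_vocab), 3

def visit_vector(codes, vitals):
    """Concatenate a multi-hot code vector with numeric vital signs."""
    x = np.zeros(n_codes + n_vitals)
    for c in codes:
        x[code_vocab[c]] = 1.0       # mark each code present in the visit
    x[n_codes:] = vitals             # e.g. normalized heart rate, BP, temp
    return x

v = visit_vector(["c1", "c3"], [0.7, 0.5, 0.9])
# v -> [1, 0, 1, 0.7, 0.5, 0.9]
```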