mp2893 / retain
RETAIN: Interpretable Predictive Model in Healthcare using Reverse Time Attention Mechanism
License: BSD 3-Clause "New" or "Revised" License
Hello retain team,
Great job! Thank you for sharing it.
Do you have an explanation for why you use two sets of attention weights (visit-level and variable-level) instead of only one for variables?
With a single set you could still get a visit-level contribution by aggregating, for instance averaging or summing the variable weights of each visit.
Thanks in advance for your help
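For concreteness, the aggregation proposed in the question can be sketched as below. All arrays are random placeholders; alpha/beta follow the paper's naming (visit-level and variable-level weights), not the repo's actual variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visits, n_vars = 3, 5

# RETAIN's two separate sets of weights:
alpha = rng.dirichlet(np.ones(n_visits))             # visit-level weights (sum to 1)
beta = np.tanh(rng.normal(size=(n_visits, n_vars)))  # variable-level weight vectors

# Combined influence of variable k in visit i is alpha[i] * beta[i, k]
combined = alpha[:, None] * beta

# The single-set alternative the question describes: recover a visit-level
# score from the variable weights alone, e.g. mean absolute weight per visit
visit_score = np.abs(beta).mean(axis=1)
```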
Hello Edward,
Thank you for sharing the code for the RETAIN paper! I have been looking through it and found that in the attention_step function of build_model you've divided the GRUs' outputs by 2. What is the reasoning behind this operation? Relevant code:
def attentionStep(att_timesteps):
    reverse_emb_t = temb[:att_timesteps][::-1]
    reverse_h_a = gru_layer(tparams, reverse_emb_t, 'a', alphaHiddenDimSize)[::-1] * 0.5
    reverse_h_b = gru_layer(tparams, reverse_emb_t, 'b', betaHiddenDimSize)[::-1] * 0.5
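For readers following along, the attention step that these GRU outputs feed can be sketched in NumPy as follows. Shapes and parameter names are illustrative (this is the paper's scheme, not the repo's exact API, and it omits the 0.5 scaling being asked about):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
t, emb_dim, alpha_dim, beta_dim = 4, 8, 8, 8  # t visits, illustrative sizes

emb = rng.normal(size=(t, emb_dim))      # visit embeddings v_1..v_t
g = rng.normal(size=(t, alpha_dim))      # RNN_alpha outputs (reverse time)
h = rng.normal(size=(t, beta_dim))       # RNN_beta outputs (reverse time)
w_alpha = rng.normal(size=alpha_dim)     # projection to a scalar per visit
W_beta = rng.normal(size=(beta_dim, emb_dim))

alpha = softmax(g @ w_alpha)             # one scalar attention weight per visit
beta = np.tanh(h @ W_beta)               # one weight vector per visit
context = (alpha[:, None] * beta * emb).sum(axis=0)  # context vector c_t
```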
Hi Edward,
This is more of a question than an actual issue so I will close it immediately:
What do you think about using a bidirectional LSTM instead of just the reversed order? Do you think it impedes the interpretable spirit of RETAIN, or is it fine to use as long as it provides a boost in performance? I've been using a concatenated bidirectional LSTM with a moderate increase in AUC. Because the outputs are concatenated, it seems plausible that the network will select whichever order provides more value, but I am not entirely sure this makes sense from a clinical perspective.
Do you have any thoughts on this?
Thanks,
Tim
Hi Edward,
This is more of a question on the RETAIN paper than on the actual code.
I have a use case where RETAIN could be a good solution; however, the problem is multi-class (5 classes) in nature.
I am interested to hear your initial thoughts on my approach:
I implement the attention mechanism exactly as explained in the RETAIN paper, but the model architecture is extended to predict a multi-class output (i.e., I will have one set of alpha, beta, and context vectors, but 5 nodes in the output layer).
Do you expect the attention mechanism and its interpretation to still hold with this approach? Does having just one context vector for five classes make sense?
Thanks in advance!
-Priya
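The multi-class extension described above amounts to replacing the sigmoid output with a 5-way softmax over the single context vector. A minimal sketch, with random placeholder weights and an assumed context dimension:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
context_dim, n_classes = 16, 5           # illustrative sizes

c = rng.normal(size=context_dim)         # one context vector per patient
W_out = rng.normal(size=(n_classes, context_dim))  # 5 output nodes
b_out = np.zeros(n_classes)

probs = softmax(W_out @ c + b_out)       # class probabilities summing to 1
```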
Hello Ed,
First of all, thank you for this amazing work. I was wondering if you could answer the following questions, it'd be really appreciated.
How do you think RETAIN handles class imbalance? I personally saw it handle imbalance well in my dataset, but couldn't work out why it worked so well.
Patients can have common medical codes like hypertension or diabetes, which can be regarded as noise when predicting other diseases. Can RETAIN assign such codes less weight even though they appear frequently?
Best Regards,
Hi, I've just started using this repo and got the following error,
ajay@ajay-h8-1170uk:~/PythonProjects/retain-master$ python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100 --dropout_context 0.8 --dropout_emb 0.0.
Using gpu device 0: GeForce GTX 570
usage: retain.py [-h] [--time_file TIME_FILE] [--model_file MODEL_FILE]
[--use_log_time {0,1}] [--embed_file EMBED_FILE]
[--embed_size EMBED_SIZE] [--embed_finetune {0,1}]
[--alpha_hidden_dim_size ALPHA_HIDDEN_DIM_SIZE]
[--beta_hidden_dim_size BETA_HIDDEN_DIM_SIZE]
[--batch_size BATCH_SIZE] [--n_epochs N_EPOCHS]
[--L2_output L2_OUTPUT] [--L2_emb L2_EMB]
[--L2_alpha L2_ALPHA] [--L2_beta L2_BETA]
[--keep_prob_emb KEEP_PROB_EMB]
[--keep_prob_context KEEP_PROB_CONTEXT] [--log_eps LOG_EPS]
[--solver {adadelta,adam}] [--simple_load] [--verbose]
<visit_file> <n_input_codes> <label_file> <out_file>
retain.py: error: unrecognized arguments: --dropout_context 0.8 --dropout_emb 0.0.
The code, though, seems to train fine using only:
python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts /home/ajay/PythonProjects/retain-master --simple_load --n_epochs 100
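Judging from the usage message above, the parser exposes --keep_prob_emb and --keep_prob_context rather than dropout flags, which would explain the "unrecognized arguments" error. A corrected invocation might look like the following; whether the intended 0.8 should map to a keep probability of 0.8 or 0.2 depends on which convention the original dropout flags used, so treat the values as an assumption to verify:

```shell
python2 retain.py RETAIN_DATA.3digitICD9.seqs 942 RETAIN_DATA.morts \
    /home/ajay/PythonProjects/retain-master \
    --simple_load --n_epochs 100 \
    --keep_prob_context 0.8 --keep_prob_emb 1.0
```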
Hi Retain team,
I am interested in using RETAIN for my research. I wanted to know how to prepare the data when there are more features, like vital signs or medications, in addition to diagnosis codes. Do I need to concatenate all the features together? For example, for a patient with diagnosis codes (c1 to c3) and vital signs (v1 to v3), should the input be [c1, c2, c3, v1, v2, v3] for a single visit?
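One plausible way to build such a concatenated per-visit input is sketched below: a multi-hot vector over the code vocabulary followed by the numeric vital-sign values. The vocabulary and values are hypothetical placeholders, not part of the repo:

```python
import numpy as np

# Hypothetical vocabularies: 3 diagnosis codes, 3 vital-sign features
code_vocab = {"c1": 0, "c2": 1, "c3": 2}
n_codes, n_vitals = len(code_vocab), 3

def visit_vector(codes, vitals):
    """Concatenate a multi-hot code vector with numeric vital signs."""
    x = np.zeros(n_codes + n_vitals)
    for c in codes:
        x[code_vocab[c]] = 1.0       # mark each code present in the visit
    x[n_codes:] = vitals             # e.g. normalized heart rate, BP, temp
    return x

v = visit_vector(["c1", "c3"], [0.7, 0.5, 0.9])
# v -> [1, 0, 1, 0.7, 0.5, 0.9]
```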