Comments (5)
To be fair, Med2Vec is a co-occurrence-based algorithm, so it will perform well in applications where co-occurrence information between codes plays an important role. But Med2Vec probably won't help you find a novel cure for cancer. For fraud detection, I think it will be helpful, since fraud detection can be seen as anomaly detection.
As for your questions:
-
Your question is valid. I actually tried both softmax and sigmoid in other papers when doing the visit-level prediction, but softmax almost always outperformed sigmoid. We think this is because softmax is a strong regularizer due to the normalizing denominator. Also, there aren't many codes per visit (typically fewer than 10 in most datasets), so using softmax instead of sigmoid doesn't have too drastic an impact.
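To make the two options concrete, here is a minimal sketch (not taken from the Med2Vec code; the logits and targets are made up) contrasting a softmax cross-entropy with a normalized multi-hot target against independent per-code sigmoid losses:

```python
import numpy as np

def softmax_xent(logits, multi_hot):
    # Softmax cross-entropy with a multi-label target: normalize the
    # multi-hot vector so it sums to 1, then take the expected
    # negative log-probability under the softmax distribution.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    t = multi_hot / multi_hot.sum()
    return -(t * np.log(p + 1e-12)).sum()

def sigmoid_bce(logits, multi_hot):
    # The sigmoid alternative: an independent binary cross-entropy
    # per code, with no normalization across the vocabulary.
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(multi_hot * np.log(p + 1e-12)
             + (1 - multi_hot) * np.log(1 - p + 1e-12)).sum()

logits = np.array([2.0, -1.0, 0.5, -2.0])  # scores over 4 codes
target = np.array([1.0, 0.0, 1.0, 0.0])    # 2 codes present in the visit
print(softmax_xent(logits, target), sigmoid_bce(logits, target))
```

The softmax version couples all codes through the normalizing denominator, which is the regularizing effect described above; the sigmoid version treats each code independently.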
-
If you use the input sequence as the label sequence, training will take a very long time because you are training a softmax with tens of thousands of possible outcomes. In my paper, I grouped the codes with existing groupers (such as the CCS diagnosis grouper) to reduce the output space. I suggest you do the same, as it significantly increases training speed and has minimal impact on overall performance (although that depends on what application you have in mind).
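The grouping step is essentially a dictionary lookup from raw codes to grouper categories. A small sketch of the idea (the real CCS tables ship as CSV files from AHRQ; the code-to-group pairs below are illustrative, not authoritative):

```python
# Hypothetical grouper map from ICD-9 codes to CCS category IDs.
# A real mapping would be loaded from the AHRQ CCS reference files.
CCS_MAP = {
    "4019":  "98",  # a hypertension code -> one CCS group (illustrative)
    "4011":  "98",  # another hypertension code -> same group
    "25000": "49",  # a diabetes code -> a different group (illustrative)
}

def group_visit(visit_codes, grouper):
    """Collapse a visit's raw codes into its (much smaller) set of groups."""
    return sorted({grouper[c] for c in visit_codes if c in grouper})

print(group_visit(["4019", "4011", "25000"], CCS_MAP))  # -> ['49', '98']
```

Because many raw codes collapse into one group, the label softmax shrinks from tens of thousands of outcomes to a few hundred.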
Thanks,
Ed
from med2vec.
Hi Kirk,
It's wonderful to meet another person with the same interest.
It would be great to have a distributed-training-enabled med2vec for people with large data.
I can't guarantee I'll promptly review the code, but it would be nice to have a pull request.
(or we can have a separate script that trains in distributed fashion)
Hi Xianlong,
Thanks for your interest in our work.
To answer your question:
-
To be fair, CHOA and MIMIC-III are very different datasets: the former is an outpatient record of 550K patients, while the latter contains ICU records of only 7K patients. There are also more codes per visit in MIMIC-III than in CHOA, so the performance cannot be straightforwardly compared. I haven't tested Med2Vec on MIMIC-III, but since MIMIC-III is a public dataset, you could run the evaluation yourself. It would be great if you could share the results as well.
-
That's a valid question. I think it depends on what you want to achieve with concept embedding. I was interested in uncovering the underlying relationships between different types of codes. For example, if you embed diagnosis codes and medication codes into the same latent space, you can easily find out which drugs are closely related to which diagnoses. Moreover, in Med2Vec, if you embed diagnosis/medication/procedure codes into the same latent space, you can study that latent space and see how each dimension relates to various diagnosis/medication/procedure codes (see Table 5 in the paper).
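The "which drugs are close to which diagnoses" query boils down to cosine similarity in the shared embedding space. A toy sketch (the code names and the tiny 2-D embedding matrix are invented for illustration; Med2Vec's learned matrix would be much larger):

```python
import numpy as np

# Toy shared embedding space: rows are codes, columns are latent
# dimensions. Diagnosis (dx:) and medication (rx:) codes share one matrix.
codes = ["dx:diabetes", "dx:hypertension", "rx:metformin", "rx:lisinopril"]
W = np.array([[0.9, 0.1],
              [0.1, 0.9],
              [0.8, 0.2],
              [0.2, 0.8]])

def nearest(query, k=1):
    """Return the k most cosine-similar codes of a different type."""
    q = W[codes.index(query)]
    sims = W @ q / (np.linalg.norm(W, axis=1) * np.linalg.norm(q))
    prefix = query.split(":")[0]
    ranked = [c for _, c in sorted(zip(-sims, codes)) if not c.startswith(prefix)]
    return ranked[:k]

print(nearest("dx:diabetes"))  # -> ['rx:metformin']
```

The same ranking, run per latent dimension instead of per query code, gives the kind of dimension-level interpretation shown in Table 5 of the paper.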
Thanks,
Ed
Hi Ed, thanks for your quick response.
I am working on a medical-related project (predicting "fraud" billing, "defining" patient status, etc.). Finding a good representation of medical concepts would be a great help for me, and this paper seems to achieve state-of-the-art performance (right? ^_^), so I would like to bother you with some detailed questions if you don't mind.
-
For visit-level prediction, you used softmax to predict the neighbor visit, but there are multiple codes per visit (so it is like a multi-label classification problem). Would it be better to use sigmoid instead of softmax?
-
I ran it with the MIMIC-III dataset. The training result seems very good (I evaluated it by looking at some ICD codes' neighbors), but the training loss is still very high even after 100 epochs (it starts at 400 and reaches 360 at 100 epochs). I think this is because of the softmax I mentioned above. Is this a problem? It seems that, this way, the loss for the code-level part doesn't matter very much (the loss for that part will be small).
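One way to sanity-check the absolute loss value: a softmax over a vocabulary of V codes starts near the uniform-distribution cross-entropy ln(V) per target code, so when V is in the thousands the summed loss stays numerically large even while neighbor retrieval improves. A back-of-the-envelope check (V = 4000 is an assumed vocabulary size of roughly MIMIC-III's order, not an exact count):

```python
import math

# Cross-entropy of a uniform softmax over V outcomes is ln(V) nats
# per target code; this is the floor the loss starts from, before
# any learning, for a single predicted code.
V = 4000  # assumed vocabulary size (illustrative)
uniform_loss = math.log(V)
print(uniform_loss)  # ~8.29 nats per predicted code
```

Summed over many codes per visit and many visit pairs per batch, values in the hundreds are not by themselves alarming; the trend and the retrieval quality matter more.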
Thanks
Xianlong
Gentlemen,
It's always great stumbling upon strangers on the internet discussing exactly the problem you work on... until you also realize that what you were so certain was a novel idea is already a thing.
Xianlong - Ed is absolutely right about the boon you'll get out of a representation strategy that's at least a nudge towards semantic "understanding."
One last thing: I'm currently running this in an admittedly much uglier fashion than y'all, but I do have distributed training configured and implemented. Is that something for which you'd appreciate a pull request, or would it really just be another thing you had to maintain?
Related Issues (20)
- TyperError: Expected Variable, got odict values HOT 4
- Negative Visit Forward Cross-Entropy on MIMIC-III HOT 1
- Questions about experiments HOT 1
- questions about the training data format HOT 3
- How to tune parameters to avoid cost:nan? HOT 1
- Where I can find the AHFS classification table? HOT 1
- Cannot able to Interpret Output of npz model File HOT 6
- Negative Code Embeddings HOT 2
- high training cost HOT 2
- Scatter plot from learned code representations HOT 16
- Epochs and loss during training HOT 3
- Mapping embeddings to ICD codes HOT 2
- NaN gradient may be due to weight initialization HOT 4
- Interpretation of learned representations
- How to make demo.txt
- GPU training fails HOT 5
- Cost and Weights are NAN HOT 2
- output file HOT 2
- Output model/weights? HOT 3
- Questions about complexity analysis HOT 5