stanfordmlgroup / chexbert
Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
License: Other
Thanks for this great work. I was following the setup instructions and came across this issue when trying to run inference on my data. It might be due to a mismatch between the dimensions of the input tensor and the weight matrix. Do you know how we can resolve this? Thanks.
Begin report impression labeling. The progress bar counts the # of batches completed:
The batch size is 18
0%| | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
File "label.py", line 147, in <module>
y_pred = label(checkpoint_path, csv_path)
File "label.py", line 96, in label
out = model(batch, attn_mask)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/CheXbert/src/models/bert_labeler.py", line 44, in forward
final_hidden = self.bert(source_padded, attention_mask=attention_mask)[0]
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 790, in forward
encoder_attention_mask=encoder_extended_attention_mask,
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 407, in forward
hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 368, in forward
self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 314, in forward
hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 216, in forward
mixed_query_layer = self.query(hidden_states)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/functional.py", line 1372, in linear
output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
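One way to debug this: opaque `CUBLAS_STATUS_EXECUTION_FAILED` errors during a BERT forward pass are often a downstream symptom of an earlier out-of-range token index in the embedding lookup (e.g. a tokenizer/checkpoint vocabulary mismatch), since CUDA errors surface asynchronously. Running the same batch on CPU usually replaces the opaque CUBLAS error with a readable `IndexError`. A minimal sketch of that failure mode (the vocabulary size of 10 here is an illustrative assumption, not the CheXbert vocabulary):

```python
import torch
import torch.nn as nn

# Tiny embedding standing in for BERT's word embeddings: vocab of 10 ids (0-9).
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

# Token id 12 is out of range for this vocabulary.
bad_ids = torch.tensor([[3, 12]])

try:
    emb(bad_ids)  # on CPU this fails immediately with a clear message
except IndexError as e:
    print("caught:", e)
```

If the CPU run raises an `IndexError` like this, the fix is to make sure the tokenizer producing the input ids matches the checkpoint's vocabulary; if the CPU run succeeds, the problem is more likely environment-specific (CUDA/cuBLAS/driver versions).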
In the paper, it is mentioned that the input is limited to 512 tokens, but that the model can easily be extended to work with longer reports. I was wondering how this would be done? Thanks.
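One common way to handle reports longer than the 512-token limit (not necessarily what the authors had in mind) is a sliding-window approach: split the token sequence into overlapping chunks that each fit the model, label every chunk, and aggregate the per-chunk predictions. A minimal, stdlib-only sketch of the chunking step; the `max_len` and `stride` values are assumptions, and the aggregation rule (e.g. taking the most severe label across chunks) would depend on the labeler:

```python
def chunk_tokens(tokens, max_len=512, stride=256):
    """Split a token list into overlapping windows of at most max_len tokens.

    Overlap (max_len - stride tokens) reduces the chance that a finding is
    cut in half at a chunk boundary.
    """
    if len(tokens) <= max_len:
        return [tokens]
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already covers the tail of the report
        start += stride
    return chunks

# Example: a 1000-token report becomes three overlapping 512/512/488-token chunks.
chunks = chunk_tokens(list(range(1000)))
print([len(c) for c in chunks])  # [512, 512, 488]
```

Each chunk would then be passed through the model separately, and the per-condition outputs merged; in practice the special tokens ([CLS]/[SEP]) also need to be re-added per chunk, which this sketch omits.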
The impression section has some problems that cannot be solved.
In the paper you used 687 manually annotated MIMIC-CXR reports to evaluate your models. Since MIMIC-CXR is already publicly available, is it possible to release the study IDs and manual annotations of those 687 reports?
Thanks!
Hi,
Assuming I have my own NLP labeling function and I want to compare the performance of my model to CheXbert, is there a way to get access to this CheXpert reports dataset?
All The Best,
Hello,
I'm trying to build the environment from the environment.yml in the repo, but I'm getting the following:
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- pyzmq==18.1.1=py37he6710b0_0
- libgcc-ng==9.1.0=hdf63c60_0
- glib==2.63.1=h5a9c865_0
- gst-plugins-base==1.14.0=hbbd80ab_1
...
I think this is because conda throws ResolvePackageNotFound when packages pinned to those exact build strings don't exist for the target platform, similar to what is described in this post.
I'm thinking you might need to move all dependencies to the pip section, but I'm not positive. I'm attempting to set up on macOS.
Thanks,
Ryan
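One workaround for this kind of failure (a suggestion, not an official fix from the maintainers) is to strip the Linux-specific build strings from environment.yml so conda resolves platform-appropriate builds instead of the exact ones exported on the original machine. A small sketch that rewrites lines like `- pyzmq==18.1.1=py37he6710b0_0` to `- pyzmq==18.1.1`; the regex assumes the standard `name==version=build` layout conda export produces:

```python
import re

def strip_builds(yml_text):
    """Drop conda build strings: '- name==ver=build' becomes '- name==ver'."""
    out = []
    for line in yml_text.splitlines():
        # Match '  - name==version=build' and keep only '  - name==version'.
        m = re.match(r"^(\s*-\s*[\w.-]+==?[\w.]+)=[\w.]+\s*$", line)
        out.append(m.group(1) if m else line)
    return "\n".join(out)

print(strip_builds("  - pyzmq==18.1.1=py37he6710b0_0"))
# prints "  - pyzmq==18.1.1"
```

After rewriting the file, `conda env create -f environment.yml` can resolve macOS builds of the same versions; packages that are genuinely Linux-only (e.g. libgcc-ng) would still need to be removed by hand.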