
multimodal_rrs's Introduction

Multimodal, Efficient Radiology Report Summarization

Contributors: Choonghan Kim, Seonhee Cho, Jiho Lee, Chetan Chilkunda
Collaborators: Joo Heung Yoon, M.D., at the University of Pittsburgh Medical Center


This study explores robust multimodal strategies in the shared task of radiology report summarization.

Specifically, this work aims to quantify how radiology image input affects different model architectures as the text input becomes increasingly corrupted. Baseline models include the text-to-text ImpressionGPT and the multimodal CheXOFA. As the text input degrades, we expect the multimodal models to perform better than text-only models, and we propose two new multimodal pipelines that leverage the image input to generalize against corrupted text. The expectation is that these text-agnostic generalizations become part of state-of-the-art pipelines for robust radiology report summarization. A secondary aspect of this work is the efficacy of prompt-based strategies with large language models compared with pre-training and fine-tuning approaches, which matters because of data privacy constraints in the medical domain.
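
The corruption procedure itself is not described in this README. As a minimal sketch, assuming that corruption means masking a randomly chosen fraction of the words, increasingly corrupted text inputs could be generated as shown below. The function name `corrupt_text`, the `[MASK]` token, the rate schedule, and the example sentence are all hypothetical and not taken from this repository.

```python
import random


def corrupt_text(text, rate, seed=0, mask_token="[MASK]"):
    """Mask a fraction `rate` of the words in `text` (count rounded to an integer).

    Hypothetical helper: the actual corruption scheme used in this study
    is not described in the README.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the corruption reproducible
    words = text.split()
    # Pick the positions to mask without replacement.
    n_mask = round(rate * len(words))
    masked = set(rng.sample(range(len(words)), n_mask))
    return " ".join(mask_token if i in masked else w for i, w in enumerate(words))


if __name__ == "__main__":
    # Made-up findings sentence, used only for illustration.
    findings = "Heart size is normal. No focal consolidation or pleural effusion."
    for rate in (0.0, 0.25, 0.5, 0.75):
        print(f"{rate:.2f}: {corrupt_text(findings, rate)}")
```

Sweeping `rate` from 0 to 1 gives a simple axis along which a text-only model and a multimodal model can be compared on the same reports.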

This work was motivated by a course project in 11-785 Introduction to Deep Learning, taught by Professor Bhiksha Raj.
We are currently working towards publishing this work in Spring 2024.

The directory above contains the three currently implemented models:

  • CheXOFA with standard and Low Rank Adaptation (LoRA) fine-tuning methodologies
  • Flamingo
  • Instruct-BLIP with LLaMA-7B

NOTE: The files in these directories are modified versions of files from the main model directories on GitHub or Hugging Face.
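
For readers unfamiliar with LoRA, the general idea is to freeze a pretrained weight matrix W and train only a low-rank update, so that the effective weight is W + (alpha / r) * B A, where B is d_out x r and A is r x d_in. The sketch below illustrates this in plain Python with toy dimensions. It is not the CheXOFA fine-tuning code from this repository, and the names `lora_effective_weight`, `alpha`, and `r` are illustrative.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A) for frozen W and trainable A, B.

    w: d_out x d_in frozen pretrained weights
    b: d_out x r and a: r x d_in low-rank factors (r is the LoRA rank)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # d_out x d_in update of rank at most r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def count_params(matrix):
    """Number of entries in a matrix given as a list of rows."""
    return len(matrix) * len(matrix[0])


if __name__ == "__main__":
    d_out, d_in, r = 4, 6, 1
    w = [[1.0] * d_in for _ in range(d_out)]  # frozen weights (toy values)
    a = [[0.5] * d_in for _ in range(r)]      # trainable, r x d_in
    b = [[0.0] for _ in range(d_out)]         # trainable, d_out x r, zero-initialised
    # With B = 0 the adapted layer starts out identical to the pretrained one.
    assert lora_effective_weight(w, a, b, alpha=2.0) == w
    print("full fine-tuning parameters:", count_params(w))
    print("LoRA parameters:", count_params(a) + count_params(b))
```

The parameter counts printed at the end show why the low-rank variant is cheaper to train: only A and B are updated, and their combined size grows with r rather than with d_out * d_in.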

