
chq-summ's Introduction

Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards

The code requires Python 3. Install the Python dependencies with:

pip install -r requirements.txt

The original MeQSum dataset is available here.

Running the code

  1. Download the pre-trained question-type identification and question-focus recognition models from here and place them in the current directory.

  2. Fine-tune the ProphetNet model on the MeQSum dataset

    Follow the instructions in the transformers repo: https://github.com/huggingface/transformers/tree/v4.1.1/examples/seq2seq

  3. Train MLE + RL Model

    python main.py --train_mode rl --trained_model_path /path/to/the/fine-tuned/prophetnet/model
    
  4. Test Model

    python main.py --model test --trained_model_path /path/to/the/saved/model
    
    

Reference

If you use this code in your research, please cite our paper:

@inproceedings{yadav-etal-2021-reinforcement,
    title = "Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards",
    author = "Yadav, Shweta  and
      Gupta, Deepak  and
      Ben Abacha, Asma  and
      Demner-Fushman, Dina",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-short.33",
    doi = "10.18653/v1/2021.acl-short.33",
    pages = "249--255"
    }

chq-summ's People

Contributors

shwetanlp

chq-summ's Issues

Memory grows with iterative training

Hi, when I train the model, GPU memory keeps growing from one iteration to the next, and the run eventually fails with an out-of-memory error.
I am using a V100 with 32 GB of memory, so this is unexpected.
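
The issue does not identify the cause. One common source of this pattern in PyTorch training loops is keeping loss or reward tensors (for example, summing them for logging) while they still reference their computation graph. The sketch below is a hypothetical helper, not code from this repository: `ScalarMeter` is an assumed name, and it only shows the general idea of converting values to plain Python floats before accumulating them.

```python
class ScalarMeter:
    """Running mean that stores plain Python floats only (hypothetical helper)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        # Tensor-like objects (such as a 0-dim torch.Tensor) expose .item(),
        # which returns a Python number, so no reference to the tensor or its
        # autograd graph is kept after this call.
        if hasattr(value, "item"):
            value = value.item()
        self.total += float(value)
        self.count += 1

    @property
    def mean(self):
        # Return 0.0 for an empty meter instead of dividing by zero.
        return self.total / self.count if self.count else 0.0
```

In a training loop this would be used as `meter.update(loss)` after the backward pass, in place of something like `running_loss += loss`. Whether this applies to `main.py` would need to be checked against the actual loop.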

Issue with pre-trained question-type identification and question-focus recognition models

@shwetanlp

Steps to reproduce the issue

1. Complete steps 1 and 2 from the repository's readme.md
2. Download the pre-trained models into the current directory
3. Run the command python main.py --train_mode rl --trained_model_path /path/to/the/fine-tuned/prophetnet/model

Expected result

It should train the model

Actual result

python main.py --train_mode rl --trained_model_path /home/sosukeaizen/Desktop/saved_model/prophetnet-meqsum-model

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "main.py", line 17, in <module>
    from reward.compute_question_type_reward import get_question_type_reward
  File "/home/sosukeaizen/Desktop/task/CHQ-Summ/reward/compute_question_type_reward.py", line 67, in <module>
    type_model.load_state_dict(torch.load('type_model/epochs-7'))
  File "/home/sosukeaizen/.local/lib/python3.8/site-packages/torch/serialization.py", line 579, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/sosukeaizen/.local/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/sosukeaizen/.local/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: 'type_model/epochs-7'

System information (as much as possible)

Ubuntu 20.04.5 LTS

Additional comments

It is not able to identify 'type_model/epochs-7'.
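
The traceback shows `torch.load` being handed a path that exists as a directory, while it expects a single checkpoint file. The issue does not say what the downloaded archive contains, so the following is only a sketch under the assumption that the archive was extracted into a directory with the same name as the expected file. `resolve_checkpoint_file` is a hypothetical helper, not part of this repository; it returns a file path when the choice is unambiguous and otherwise reports what it found.

```python
from pathlib import Path


def resolve_checkpoint_file(path):
    """Return a path to a single file for loaders that cannot open directories.

    Hypothetical helper: assumes a checkpoint directory may wrap one file.
    """
    p = Path(path)
    if p.is_file():
        return p
    if p.is_dir():
        files = sorted(f for f in p.iterdir() if f.is_file())
        if len(files) == 1:
            # Exactly one candidate, so use it.
            return files[0]
        names = ", ".join(f.name for f in files) or "no files"
        raise ValueError(
            f"{p} is a directory containing {names}; pass the checkpoint file itself"
        )
    raise FileNotFoundError(f"checkpoint not found: {p}")
```

Calling `resolve_checkpoint_file('type_model/epochs-7')` before the load call would either yield a loadable file or list the directory contents, which makes it easier to see how the download was unpacked.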

About MATINF

I want to reproduce and compare your results on MATINF. Can you provide your processed (translated) version of MATINF?

About the dataset?

In main.py, lines 288-289 and 306-308: is the train dataset the same as the val dataset? Is that right?
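
One way to answer this empirically, independent of how `main.py` builds its loaders, is to measure how many validation examples also appear in the training split. The sketch below assumes each example can be represented as a JSON-serializable dict; the function names and the `question`/`summary` keys in the usage note are illustrative assumptions, not the repository's data format.

```python
import hashlib
import json


def fingerprint(examples):
    """Return a set with one stable hash per example (dicts of JSON-serializable values)."""
    return {
        hashlib.sha256(json.dumps(ex, sort_keys=True).encode("utf-8")).hexdigest()
        for ex in examples
    }


def split_overlap(train_examples, val_examples):
    """Fraction of distinct validation examples that also occur in the training split."""
    train_fp = fingerprint(train_examples)
    val_fp = fingerprint(val_examples)
    if not val_fp:
        return 0.0
    return len(train_fp & val_fp) / len(val_fp)
```

For instance, `split_overlap(train, val)` on two lists of `{"question": ..., "summary": ...}` dicts gives 1.0 when every validation example is also in the training data and 0.0 when the splits are disjoint.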
