Multi-Question Generation

This is the codebase for the paper, Educational Multi-Question Generation for Reading Comprehension.

Data

For our experiments, we use the SQuAD v1.1 dataset, augmented with paraphrases of each of the given questions. We release this data here.

Setup

Run the following commands to create the necessary folders and install the dependencies:

mkdir eval
pip install -r requirements.txt

2QG Training

To train the 2QG model, we use the following command:

python prophetnet_twoq_finetune.py

Training takes about 2 days, so we also release our trained model. It is located in the Google Drive folder linked above.

Evaluation

We evaluate on the SQuAD validation set. To evaluate the different models as in the paper, we provide the bash script eval_experiments.sh. Before running it, edit the script so that it points to the correct path of the trained 2QG model.

To conduct the analysis shown in the section Toward Multi-Question Generation, we use the following command:

python eval_two_qg_n_samples.py --model <path_to_trained_eq_model> --num_samples <number_of_samples>

This will generate .npy files containing the PINC scores. You can then generate the boxplot distribution using the command:

python make_boxplot.py
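
For readers unfamiliar with PINC, the following is a minimal sketch of how such a score can be computed for a pair of questions. It is not the implementation used in this repository: the function names `ngrams` and `pinc`, the whitespace tokenization, and the choice to skip n-gram orders that the candidate is too short for are all assumptions made for illustration. PINC measures lexical novelty: the fraction of the candidate's n-grams that do not appear in the source, averaged over n-gram orders, so higher values mean the two questions share less wording.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source: str, candidate: str, max_n: int = 4) -> float:
    """Illustrative PINC score between two questions.

    Assumption: lowercased whitespace tokenization; n-gram orders for
    which the candidate has no n-grams are skipped rather than counted.
    """
    src_tokens = source.lower().split()
    cand_tokens = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand_tokens, n)
        if not cand_ngrams:
            continue  # candidate is shorter than n tokens
        overlap = len(cand_ngrams & ngrams(src_tokens, n))
        # Fraction of candidate n-grams that are new relative to the source
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    q1 = "What is the capital of France ?"
    q2 = "Which city serves as the French capital ?"
    print(f"PINC(q1, q2) = {pinc(q1, q2):.3f}")
```

Two identical questions score 0 and two questions with no words in common score 1, which matches the lexical diversity that the boxplot summarizes.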

Citation

If you extend or use this work, please cite our paper:

@inproceedings{rathod-etal-2022-educational,
    title = "Educational Multi-Question Generation for Reading Comprehension",
    author = "Rathod, Manav  and
      Tu, Tony  and
      Stasaski, Katherine",
    booktitle = "Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)",
    month = jul,
    year = "2022",
    address = "Seattle, Washington",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.bea-1.26",
    pages = "216--223",
    abstract = "Automated question generation has made great advances with the help of large NLP generation models. However, typically only one question is generated for each intended answer. We propose a new task, Multi-Question Generation, aimed at generating multiple semantically similar but lexically diverse questions assessing the same concept. We develop an evaluation framework based on desirable qualities of the resulting questions. Results comparing multiple question generation approaches in the two-question generation condition show a trade-off between question answerability and lexical diversity between the two questions. We also report preliminary results from sampling multiple questions from our model, to explore generating more than two questions. Our task can be used to further explore the educational impact of showing multiple distinct question wordings to students.",
}

If you have any questions about this work, feel free to reach out!

multiquestiongeneration's Issues

Questions about the results of 1qg+para data

Thank you very much for providing open-source code and proposing a novel way to generate multiple semantically similar but lexically diverse questions. However, we ran into some problems while using the code and hope you can help. In our runs, the 1QG+para experiments consistently failed to reproduce the reported results, especially for Q1-Q2. We also evaluated single-question generation with microsoft/prophetnet-large-uncased-squad-qg and found that BLEU-4 did not reach 25.8. Could you share your results? The results of our experiment are shown in the attached figure. Looking forward to your reply.
