combo's Introduction

COMBO

Merging Generated and Retrieved Knowledge for Open-Domain QA

Instructions

Setup

The experiments in the paper were run with Python 3.7.0, torch==1.11.0, and transformers==4.26.0.
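
As a quick sanity check before running anything, the sketch below compares the installed versions against the ones quoted above. It is not part of the repository; the helper names (installed_versions, report_mismatches) are made up for illustration, and it assumes each package exposes a __version__ attribute.

```python
import sys

# Versions quoted in the setup note above.
EXPECTED = {"python": "3.7.0", "torch": "1.11.0", "transformers": "4.26.0"}


def installed_versions():
    """Collect installed versions; a package that cannot be imported maps to None."""
    found = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in ("torch", "transformers"):
        try:
            module = __import__(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def report_mismatches(expected, found):
    """Return (package, expected, found) for every package whose version differs."""
    mismatches = []
    for name, want in expected.items():
        have = found.get(name)
        # Wheels may carry a local build suffix such as "+cu113"; ignore it.
        have_core = have.split("+")[0] if have else None
        if have_core != want:
            mismatches.append((name, want, have))
    return mismatches


if __name__ == "__main__":
    for name, want, have in report_mismatches(EXPECTED, installed_versions()):
        print(f"{name}: expected {want}, found {have}")
```

An empty output means the environment matches the quoted versions. Other versions may still work; the README only states what the paper's experiments used.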

Discriminator Training

We use Natural Questions (NQ) to illustrate how to train the discriminators with silver labels.

Step 1: Generate Silver Labels

cd passage_matching
python create_loo_data.py
bash infer_loo.sh
python create_evidentiality_data.py
python create_compatibility_data.py
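
The four Step 1 commands must run in order from inside passage_matching, and a missing script only shows up when its turn comes. The sketch below is a hypothetical helper, not something shipped with the repository: it checks that every script named in the listing above exists before running them one by one, stopping at the first failure. The STEP1 list is copied from the commands above; the function names are illustrative.

```python
import subprocess
from pathlib import Path

# Step 1 commands, copied from the listing above.
STEP1 = [
    ["python", "create_loo_data.py"],
    ["bash", "infer_loo.sh"],
    ["python", "create_evidentiality_data.py"],
    ["python", "create_compatibility_data.py"],
]


def missing_scripts(workdir, commands):
    """Return the script names from `commands` that are absent in `workdir`."""
    root = Path(workdir)
    return [cmd[1] for cmd in commands if not (root / cmd[1]).is_file()]


def run_in_order(workdir, commands):
    """Run each command inside `workdir`, stopping at the first failure.

    Raises FileNotFoundError if a script is missing, and
    subprocess.CalledProcessError if a command exits with a non-zero status.
    """
    missing = missing_scripts(workdir, commands)
    if missing:
        raise FileNotFoundError("missing scripts: " + ", ".join(missing))
    for cmd in commands:
        subprocess.run(cmd, cwd=str(workdir), check=True)


if __name__ == "__main__":
    # Only report what is missing; call run_in_order(...) to execute the steps.
    print(missing_scripts("passage_matching", STEP1))
```

Calling run_in_order("passage_matching", STEP1) from the repository root would then mirror the shell commands above.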

Step 2: Train and Infer with Discriminators

sbatch train_evidentiality.sh
sbatch train_compatibility.sh
bash submit_infer_compatibility_evidentiality_scores.sh

Step 3: Create Passage Pairs

python merge_infer_scores.py
python build_psg_pairs.py --matching_method {random|compatibility_2stage_optimal} --dataset {nq|tqa|webq}

where compatibility_2stage_optimal refers to the COMBO method in the paper.
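
To give a feel for what one-to-one optimal matching on compatibility scores means, here is a minimal sketch. It is not the code in build_psg_pairs.py and does not model the two-stage part of the method name; the score matrix is invented, and the function names are illustrative. It brute-forces every one-to-one pairing between generated and retrieved passages and keeps the one with the highest total score, next to a random pairing for contrast.

```python
import random
from itertools import permutations


def optimal_one_to_one(scores):
    """Return (assignment, total) maximizing the summed score.

    scores[i][j] is a made-up compatibility score between generated
    passage i and retrieved passage j. assignment[i] is the retrieved
    index paired with generated passage i. Brute force over all
    permutations, so only suitable for tiny inputs.
    """
    n = len(scores)
    best_perm, best_total = None, float("-inf")
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


def random_one_to_one(n, seed=0):
    """Return a random one-to-one pairing of n generated and n retrieved passages."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm


# Invented scores: rows are generated passages, columns are retrieved passages.
scores = [
    [0.9, 0.8, 0.1],
    [0.7, 0.1, 0.2],
    [0.3, 0.6, 0.5],
]
pairs, total = optimal_one_to_one(scores)
print("optimal pairing:", pairs, "total score:", round(total, 2))
print("random pairing:", random_one_to_one(len(scores)))
```

With these scores the optimal pairing is [1, 0, 2]: rows 0 and 1 both prefer column 0, but the one-to-one constraint gives column 0 to row 1 because that yields the higher total. For realistic sizes, scipy.optimize.linear_sum_assignment with maximize=True solves the same assignment problem without enumerating permutations.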

Reader Training

The reader code is adapted from Atlas. Please follow the Atlas setup instructions to configure the environment.

cd ../training
sbatch scripts/train_simple_merge.sh   # for Direct Merging method
sbatch scripts/train_nq_webq_pair.sh   # for Random Matching / COMBO method

combo's People

Contributors

yunx-z

combo's Issues

About Missing Code File: 'create_compatibility_data.py' - Gratitude and Inquiry

Thanks for your fantastic work. I am a reader of your paper titled "Merging Generated and Retrieved Knowledge for Open-Domain QA" and would like to express my gratitude for sharing your code on GitHub. While experimenting with the code, I noticed that there is a crucial file missing in the repository, namely "create_compatibility_data.py". It appears that this file plays a vital role in the execution and understanding of the code.

I understand that your time is valuable, but if possible, could you provide the missing file or guide me on where I might find it? I believe having access to this file would significantly assist me in comprehending and utilizing your work.

Thank you again for your hard work and contribution. I look forward to your response.

Question: About perfect matching in Compatibility-guided Optimal Matching

Thank you for your interesting work. I recently read your paper and have a small question.
I wonder why we need a perfect (1-1) matching between the two sets of contexts, rather than allowing 1-many (or many-1) matching.
For example:
(L1, L2, L3); (R1, R2, R3) can be mapped like (L1 -- R1), (L1 -- R2), (L2 -- R3) and (L3 -- R3)

Please Add Missing Code File

Following the commands in Readme.md, I found that three files are missing from the project, so it cannot run. I hope you can upload the complete code to GitHub.

Thank you very much for your great contribution to this paper.

Missing files: create_loo_data.py, create_compatibility_data.py, train_compatibility.sh
