Comments (11)
Thanks for your attention to this work!
We have no plan to release the training code.
from knover.
@christineaa I have one more question regarding the experiments in the QKConv paper.
When you trained and evaluated on the QReCC task, did you use the 14K conversations or the 80K question-answer pairs as the training dataset?
And similarly for the test dataset: did you evaluate at the conversation level or the question-answer level?
Thank you!
We used the question-answer pairs as the training/dev/test dataset, with 60.4K, 3.1K, and 16.4K samples respectively.
Hi, @christineaa . Thanks for your nice work.
I have one more question: how should I build the BM25 index for the QReCC task? I noticed you posted a link to the ml-qrecc repo. Should I download the webpages from both the Common Crawl and the Wayback Machine and build the BM25 index over them?
Thanks for your attention to this work!
Yes, you should download the web pages from both sources and follow the instructions in the ml-qrecc repo.
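For the full 54M-passage collection you would normally use an indexing toolkit (the ml-qrecc repo's instructions cover this), but as a self-contained sketch of what BM25 scoring actually computes, here is a minimal in-memory Okapi BM25 with the usual default parameters (k1=1.5, b=0.75). This is an illustration only, not the indexing pipeline used in the paper:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each whitespace-tokenized doc against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency of each term across the corpus.
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "quantum computing uses qubits",
]
scores = bm25_scores("cat mat", docs)
print(scores.index(max(scores)))  # 0: the first doc matches both query terms
```

Real toolkits add tokenization, stemming, and an inverted index on disk so that scoring 54M passages stays fast; the scoring formula itself is what's shown above.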
Thanks for your reply.
@christineaa I have further questions regarding the training and evaluation of the QKConv model on the QReCC dataset.
- In your paper, the third footnote states "We remove conversations without truth responses." What does this mean exactly? Did you apply it to the training dataset, the test dataset, or both? If available, please share the code for this processing; I cannot find anything relevant in the QKConv inference or dataset code.
- For the QReCC results in Table 2 of the QKConv paper, did you evaluate on the above "removed" version of qrecc-test.json or on the plain qrecc-test.json? I mean the qrecc-test.json from here.
Also, when you report Table 2, did you exclude test examples which do not have gold knowledge?
@robinsongh381 Thanks for your attention to this work!
- We remove samples whose "Truth_answer"/"Answer" field is an empty string, for both the training set (57,946 samples left) and the test set (15,024 samples left).
- We use the "removed" version of the test set, as coded in the evaluation code.
- We include samples without golden knowledge in Table 2, since the absence of golden knowledge does not affect the response generation evaluation; we exclude them only from the knowledge selection evaluation.
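For reference, the filtering step described in the first point can be sketched as follows. The field names ("Truth_answer"/"Answer") follow the public QReCC JSON format; treat this as an illustration of the rule, not the exact preprocessing code from the repo:

```python
def remove_empty_answers(samples):
    """Keep only samples whose gold answer is a non-empty string.

    QReCC files use either "Truth_answer" or "Answer" as the field
    name, so check whichever is present.
    """
    kept = []
    for ex in samples:
        answer = ex.get("Truth_answer", ex.get("Answer", ""))
        if answer.strip():
            kept.append(ex)
    return kept

# Toy example: the empty-answer sample gets dropped.
samples = [
    {"Question": "Who wrote Hamlet?", "Truth_answer": "William Shakespeare"},
    {"Question": "Unanswerable question", "Truth_answer": ""},
]
print(len(remove_empty_answers(samples)))  # 1
```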
@christineaa Thank you for your kind response.
I have a follow-up question on the third point.
The absence of gold knowledge indicates that the essential piece of information does not exist within the knowledge pool, and hence a factually correct, knowledge-grounded response cannot be obtained.
For this reason, previous works on QReCC evaluation, such as DPR-IHN [1] and CONQRR [2], have excluded such cases (i.e., examples without gold-knowledge annotations) from their evaluation.
What is your opinion on this?
Thank you
[1] Saving Dense Retriever from Shortcut Dependency in Conversational Search
[2] CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning
@robinsongh381
The knowledge base for QReCC contains 54M passages, so in most cases there is some knowledge relevant to the question. We also demonstrate in Table 5 & Table 6 how the model handles incorrectly retrieved knowledge.
However, DPR-IHN and CONQRR excluding samples without golden knowledge is a different case: they report knowledge selection Recall metrics as their main results, and Recall cannot be computed without golden knowledge.
Related Issues (20)
- How to input topic and knowledge in a Plato-KAG deployment environment
- How is the service API in Section 2.1 (Service Information) of the Link the World paper constructed?
- Plato-KAG documentation
- Error when loading data: [WARN] Invalid example: context too long / no context - Example
- WARN when reading data: context too long or no content
- Training with single_gpu raises TypeError: __new__() got multiple values for argument data_id
- No output in the output directory after PLATO stage-1 training
- InvalidArgumentError: Broadcast dimension mismatch
- Memory keeps growing during PLATO stage-1 training and overflows after 90k steps; what is the cause?
- Can responses generated by PLATO-KAG be ranked by the NSP model's score?
- On response generation after deploying the PLATO-KAG model
- What is the meaning of the mean_mlm_ce metric in KAG training?
- lr scheduler parameter settings
- Changing the allowed maximum conversation length in Plato-2
- Using PLATO-XL for inference on 3 or more GPUs
- Methods of PLATO-KAG pre-training for other languages
- Training plato2.2L
- Has the AG-DST model been open-sourced?