xwhan / knowledge-aware-reader

PyTorch implementation of the ACL 2019 paper "Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader"

Python 99.25% Shell 0.75%

acl2019 question-answering knowledge-base readingcomprehension pytorch graph-neural-networks

knowledge-aware-reader's Introduction

Code for the ACL 2019 paper:

Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader

Paper link: https://arxiv.org/abs/1905.07098

Model Overview:

Requirements

PyTorch 1.0.1
tensorboardX
tqdm
gluonnlp

Prepare data

mkdir datasets && cd datasets && wget https://sites.cs.ucsb.edu/~xwhan/datasets/webqsp.tar.gz && tar -xzvf webqsp.tar.gz && cd ..

Full KB setting

CUDA_VISIBLE_DEVICES=0 python train.py --model_id KAReader_full_kb --max_num_neighbors 50 --label_smooth 0.1 --data_folder datasets/webqsp/full/

Incomplete KB setting

Note: Hits@1 should match or slightly exceed the number reported in the paper. Further tuning of the threshold should give a better F1 score.
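As a rough illustration of what tuning the threshold can look like, the sketch below sweeps a score cutoff over per-question candidate scores and keeps the value with the best mean F1, with Hits@1 computed alongside. The function names, the input layout (a dict of candidate scores plus a set of gold answers per question), and the grid of cutoffs are assumptions made for illustration; they are not taken from this repository's evaluation code.

```python
from typing import Dict, List, Set, Tuple

# One example = (candidate answer -> score, set of gold answers). Assumed layout.
Example = Tuple[Dict[str, float], Set[str]]


def f1_at_threshold(scores: Dict[str, float], gold: Set[str], threshold: float) -> float:
    """F1 of the answer set formed by candidates scoring at or above the threshold."""
    predicted = {cand for cand, score in scores.items() if score >= threshold}
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def hits_at_1(scores: Dict[str, float], gold: Set[str]) -> float:
    """1.0 if the top-scoring candidate is a gold answer, else 0.0."""
    if not scores:
        return 0.0
    best = max(scores, key=scores.get)
    return 1.0 if best in gold else 0.0


def tune_threshold(examples: List[Example], grid: List[float]) -> Tuple[float, float]:
    """Return (best_threshold, best_mean_f1) over the grid; use held-out data."""
    best_t, best_f1 = grid[0], -1.0
    for t in grid:
        mean_f1 = sum(f1_at_threshold(s, g, t) for s, g in examples) / len(examples)
        if mean_f1 > best_f1:  # strict comparison keeps the smallest best cutoff
            best_t, best_f1 = t, mean_f1
    return best_t, best_f1


# Toy dev set with made-up scores, only to show the call pattern.
dev = [
    ({"a": 0.9, "b": 0.4, "c": 0.1}, {"a", "b"}),
    ({"x": 0.8, "y": 0.3}, {"x"}),
]
grid = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
best_t, best_f1 = tune_threshold(dev, grid)
mean_hits = sum(hits_at_1(s, g) for s, g in dev) / len(dev)
print(best_t, best_f1, mean_hits)
```

The cutoff should be selected on a development split and then applied unchanged to the test split, so that the reported F1 is not tuned on test data.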

30% KB

CUDA_VISIBLE_DEVICES=0 python train.py --model_id KAReader_kb_03 --max_num_neighbors 50 --use_doc --data_folder datasets/webqsp/kb_03/ --eps 0.05

10% KB

CUDA_VISIBLE_DEVICES=0 python train.py --model_id KAReader_kb_01 --max_num_neighbors 50 --use_doc --data_folder datasets/webqsp/kb_01/ --eps 0.05

50% KB

CUDA_VISIBLE_DEVICES=0 python train.py --model_id KAReader_kb_05 --num_layer 1 --max_num_neighbors 100 --use_doc --data_folder datasets/webqsp/kb_05/ --eps 0.05 --seed 3 --hidden_drop 0.05

Citation

@inproceedings{xiong-etal-2019-improving,
    title = "Improving Question Answering over Incomplete {KB}s with Knowledge-Aware Reader",
    author = "Xiong, Wenhan  and
      Yu, Mo  and
      Chang, Shiyu  and
      Guo, Xiaoxiao  and
      Wang, William Yang",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1417",
    doi = "10.18653/v1/P19-1417",
    pages = "4258--4264",
}

knowledge-aware-reader's Issues

Mapping between kb entity id and text

Hello, thanks a lot for releasing the code for the paper. It is very helpful. I am wondering where I can find the mapping from a KB entity ID (e.g. "m.16jpgj") to its text. I believe it is needed to obtain the GloVe embeddings of KB entities, which are part of the released preprocessed data.

I have looked at several sources, and it looks like people have been using the Freebase dump from the FastRDFStore package. I downloaded the Freebase dump from this repo, but about 17% of the KB entities in your preprocessed data are missing from it.

I would really appreciate it if you could provide a pointer to public data with the mapping, or release that mapping for WebQSP. Thanks!
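For anyone reproducing the coverage check described above, here is a minimal sketch of building an ID-to-name map and measuring how many entities are missing. It assumes the dump has already been reduced to tab-separated lines of the form `entity_id<TAB>predicate<TAB>"name"@lang`, and that names are stored under a `type.object.name` predicate. Both the layout and the helper names are assumptions for illustration, not something this repository provides, and the entity IDs in the sample are placeholders.

```python
from typing import Dict, Iterable, Set

NAME_PREDICATE = "type.object.name"  # assumed predicate carrying entity names
LANG_TAG = "@en"                     # keep English names only


def build_name_map(lines: Iterable[str]) -> Dict[str, str]:
    """Map entity ID -> first English name found in the triple lines."""
    names: Dict[str, str] = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip malformed lines
        entity_id, predicate, literal = parts[0], parts[1], parts[2]
        if predicate != NAME_PREDICATE or not literal.endswith(LANG_TAG):
            continue
        # Drop the language tag and the surrounding quotes.
        names.setdefault(entity_id, literal[: -len(LANG_TAG)].strip('"'))
    return names


def missing_fraction(entity_ids: Set[str], names: Dict[str, str]) -> float:
    """Fraction of entity IDs that have no name in the map."""
    if not entity_ids:
        return 0.0
    missing = sum(1 for e in entity_ids if e not in names)
    return missing / len(entity_ids)


# Placeholder triples, only to show the expected input shape.
sample_lines = [
    'm.0001\ttype.object.name\t"Example Entity"@en',
    'm.0001\ttype.object.name\t"Beispiel"@de',
    'm.0002\tcommon.topic.alias\t"Alias"@en',
]
names = build_name_map(sample_lines)
gap = missing_fraction({"m.0001", "m.0002"}, names)
print(names, gap)
```

In practice `lines` would be an open file handle over the dump and `entity_ids` the set of IDs found in the preprocessed data.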

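Hello, I have a small question

When running the code, I hit an error at the embedding step saying that the indices are not of type long. I changed this, but the error persists, so I would like to ask how to resolve it.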

How to generate rel_word_idx.npy file

Hello, Professor Xiong.
I'd like to ask you a question. I want to test the model on the WikiMovies dataset, but I don't know how to generate the rel_word_idx.npy file.

can't find the module

Dear author,
I have one question to ask you.
When running main.py, I get an error that there is no module named data_loader or graftnet.

How to run script.py i.e., arguments?

Hi,
Could you please tell me how the arguments for these should be given?
pred_kb_file = sys.argv[2]
pred_doc_file = sys.argv[3]
pred_hybrid_file = sys.argv[4]

Thanks
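For context, `sys.argv` holds the script name at index 0 followed by the command-line arguments in order, so the three variables above are filled from the second, third, and fourth arguments. The sketch below shows that mapping. The meaning of `sys.argv[1]` does not appear in the snippet, so it is kept under the hypothetical name `first_arg`, and the helper `read_args` and the file names are placeholders introduced only for illustration.

```python
from typing import Dict, List

USAGE = "usage: python script.py <first_arg> <pred_kb_file> <pred_doc_file> <pred_hybrid_file>"


def read_args(argv: List[str]) -> Dict[str, str]:
    """Map positional command-line arguments to the names used in the snippet."""
    if len(argv) < 5:
        raise SystemExit(USAGE)
    return {
        "first_arg": argv[1],        # hypothetical name; its purpose is not shown
        "pred_kb_file": argv[2],     # sys.argv[2]
        "pred_doc_file": argv[3],    # sys.argv[3]
        "pred_hybrid_file": argv[4], # sys.argv[4]
    }


# Equivalent to: python script.py ARG1 kb_preds.json doc_preds.json hybrid_preds.json
# (in the real script, the list passed here would be sys.argv)
example = read_args(
    ["script.py", "ARG1", "kb_preds.json", "doc_preds.json", "hybrid_preds.json"]
)
print(example)
```

Judging by the variable names, the three files would presumably hold predictions from the KB-only, document-only, and hybrid settings, but their expected format is not stated in the snippet.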

RuntimeError

consult

Dear Dr. Wang,
Does the project need the WikiMovies dataset?

Hello, the webqsp.tar.gz file no longer seems to be accessible.

How to find the topic entities mentioned by the question?

Hello, and thanks a lot for the code! I have a question: the paper has a marker (2) attached to the topic entities mentioned by the question. Following it, I can find the paper "Semantic Parsing for Single-Relation Question Answering", but I can't find the method used to determine the topic entities.
I would really appreciate your help when you have time. Thanks a lot!

Maybe a typo in the paper

Hi guys, thank you for open-sourcing the code! In the paper at the provided arXiv URL, I found something that may be a typo. Please see the red block in the following screenshot:

I cannot see any line that mentions 28.1 for F1, but that number appears in the ablation study table. Therefore, I guess the result in Table 1 should be 28.1, right? The same goes for Hits@1. Thanks :)

data access problem

The data file cannot be accessed now