cgraywang / deepex

Code repo for EMNLP21 paper "Zero-Shot Information Extraction as a Unified Text-to-Triple Translation"

Home Page: https://arxiv.org/pdf/2109.11171.pdf

License: Apache License 2.0

Python 97.32% Shell 2.68%

deepex's Issues

Details of output from OIE_2016.sh

Hello, thanks for your contribution to Open Information Extraction research.

I'm currently using your repo to create triples from raw Wikipedia text.

bash tasks/OIE_2016.sh

I changed data_dir to point to my own data and then ran the command above.
The code works fine and produces an output file called search_res.json.

The problem is that I can't find a description of the format of this output.
In particular, I can't find what each value stored under a key means.

{"deduplicated:": {"Another Temporary Gallery [SEP] gallery [SEP] The Museum": [8, 0.7081716619431973, [[0, 25], [65, 75]], 27, 0] ...

For example, I would like to understand this list:
[8, 0.7081716619431973, [[0, 25], [65, 75]], 27, 0]
Could you briefly explain what each of these values means?
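
Although the field meanings are not documented here, one entry can at least be unpacked mechanically. The following is a minimal sketch, not the repository's own code: `parse_entry` and the dictionary keys `head`, `relation`, `tail`, and `fields` are hypothetical names, and the only assumption is that each key joins three strings with " [SEP] ", as in the excerpt above. The five list values are deliberately left unnamed.

```python
import json

# One entry copied from the excerpt above; the JSON wrapper is closed by hand.
raw = (
    '{"deduplicated:": {"Another Temporary Gallery [SEP] gallery [SEP] The Museum": '
    '[8, 0.7081716619431973, [[0, 25], [65, 75]], 27, 0]}}'
)

def parse_entry(key, fields):
    """Split a ' [SEP] '-joined key into three parts; keep the values untouched."""
    head, relation, tail = key.split(" [SEP] ")
    return {"head": head, "relation": relation, "tail": tail, "fields": fields}

data = json.loads(raw)
# Note that the top-level key in the file really is "deduplicated:" with a colon.
entries = [parse_entry(k, v) for k, v in data["deduplicated:"].items()]

for e in entries:
    print(e["head"], "|", e["relation"], "|", e["tail"], "->", e["fields"])
```

In this entry the first and third key parts are 25 and 10 characters long, which equals the widths of the two spans [0, 25] and [65, 75]. That suggests, but does not confirm, that the third value holds character offsets.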

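Code implementation of the loss in Equation (1) of the paper

Which file contains the implementation of the loss in Equation (1) of the paper? I could only find that the Reranking function computes the L1 norm of the sentence embedding and the triple embedding.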

Will the script for the model "Magolor/deepex-ranking-model" be released?

Thank you for the great work on zero-shot IE. Where can I find the script used for pre-training Magolor/deepex-ranking-model?

About reproducing the results of deepex

Hi, thank you for your good work. I ran OIE2016 with your default parameters on one V100 GPU and got the following results:

Did I make a mistake that prevents me from reaching the 0.72 F1?

Also, it seems that the contrastive pre-training of the ranking model has not been released and that the code currently covers inference only. Testing takes about 3 hours. Am I right?

Hoping for your reply!

Unable to run bash tasks/OIE_2016.sh

PyTorch 1.7.1 with CUDA 10.1, single GPU.

After running

git clone --recursive git@github.com:cgraywang/deepex.git
cd ./deepex
conda create --name deepex python=3.7 -y
conda activate deepex
pip install -r requirements.txt
pip install -e .

bash tasks/OIE_2016.sh

I am getting

FileNotFoundError: [Errno 2] No such file or directory: 'result/OIE_2016.bert-large-cased.np.d2048.b6.sorted/P0_result.json'

I suspect this is because the output file was never generated. Digging into the trace log, I found:

RuntimeError
: RuntimeErrorRuntimeError
NCCL error in: /opt/conda/conda-bld/pytorch_1607370141920/work/torch/lib/c10d/ProcessGroupNCCL.cpp:784, invalid usage, NCCL version 2.7.8: RuntimeError: 
NCCL error in: /opt/conda/conda-bld/pytorch_1607370141920/work/torch/lib/c10d/ProcessGroupNCCL.cpp:784, invalid usage, NCCL version 2.7.8: NCCL error in: /opt/conda/conda-bld/pytorch_1607370141920/work/torch/lib/c10d/ProcessGroupNCCL.cpp:784, invalid usage, NCCL version 2.7.8RuntimeError

And some other NCCL errors.
I am not sure where to start debugging.

Thank you in advance.

I am trying to do a simple text-to-triple inference task (openIE) and I thought that this would be the way to start.
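
One place to start, offered as a guess rather than a diagnosis: NCCL's "invalid usage" error is often reported when more distributed processes are launched than there are GPUs available. Whether tasks/OIE_2016.sh does that is an assumption. The sketch below counts GPUs from text shaped like the output of `nvidia-smi -L` and compares the count with an intended number of processes; `count_gpus`, `check_launch`, the sample listing, and the process count of 8 are all hypothetical.

```python
def count_gpus(listing: str) -> int:
    """Count devices in text shaped like the output of `nvidia-smi -L`."""
    return sum(1 for line in listing.splitlines() if line.strip().startswith("GPU "))

def check_launch(num_processes: int, num_gpus: int) -> str:
    """Report whether every process could get a GPU of its own."""
    if num_gpus == 0:
        return "no GPU visible"
    if num_processes > num_gpus:
        return f"{num_processes} processes for {num_gpus} GPU(s): reduce the process count"
    return "ok"

# Made-up listing for a single-GPU machine; paste the real `nvidia-smi -L` output here.
sample_listing = "GPU 0: Example GPU (UUID: GPU-00000000-0000-0000-0000-000000000000)\n"

# 8 is a placeholder for however many processes the launch script requests.
print(check_launch(num_processes=8, num_gpus=count_gpus(sample_listing)))
```

If the check reports a mismatch, lowering the number of processes requested by the launch script to the number of GPUs would be the first thing to try.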

Reproducing results shown in paper

Could you describe the steps to reproduce the results in your paper for OIE 2016 and the other systems? You used the supervised-oie repo and the benchmark included in it. There is also a related repo from the same author, oie-benchmark, with the same evaluation. However, the two are slightly different, have since been updated, and neither seems to work out of the box when running the eval.sh script, which should produce the PR curve plots (part of what I was interested in comparing).

Looking at them, it's unclear which version of the benchmark corpus is being used, and whether you picked the test (or dev) split or used the whole dataset. The default script seems to use all the data, but the tasks you provide seem to use the test split.

Thanks

Tony

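Question about the metric comparison on the OIE datasets

For the Open Information Extraction task, the paper reports F1 = 72.6 for deepex on the OIE2016 dataset, and the appendix mentions that this is a top-3 metric. Do the metrics reported in the table for the other methods, such as PropS and RnnOIE, use top-1 or top-3?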

Evaluation on BenchIE

An evaluation on BenchIE would be great, since it is probably the best available dataset.

Link to Dataset: https://paperswithcode.com/dataset/benchie

Problem when running git clone

When downloading with 'git clone --recursive git@github.com:cgraywang/deepex.git', the following appears:

Cloning into 'deepex'...
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

What is the cause of this problem? Thanks.
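
A likely explanation, stated as general Git and SSH behavior rather than anything specific to this repository: "Permission denied (publickey)" means the SSH server rejected the connection because no accepted SSH key was offered, which typically happens when no key is registered with the GitHub account in use. Cloning a public repository over HTTPS does not need a key. The helper below, with the hypothetical name `ssh_to_https`, rewrites an SSH-style remote into its HTTPS form.

```python
import re

def ssh_to_https(remote: str) -> str:
    """Turn 'git@host:owner/repo.git' into 'https://host/owner/repo.git'."""
    match = re.fullmatch(r"git@([^:]+):(.+)", remote)
    if match is None:
        return remote  # not an SSH-style remote; leave it untouched
    host, path = match.groups()
    return f"https://{host}/{path}"

print(ssh_to_https("git@github.com:cgraywang/deepex.git"))
```

The resulting URL can be passed to git clone --recursive in place of the SSH one. If the repository's submodules are themselves referenced through SSH URLs, they would need the same treatment; that has not been checked here.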

Will the script for the model "Magolor/deepex-ranking-model" be released?

Hi,
did you release a script for "Magolor/deepex-ranking-model"?
I don't know how to use it.

Poor triple extractor performance (OpenIE)

I followed the README and successfully ran the OpenIE16 benchmark. I then modified the OIE_2016.json file to point to my directory, which holds a test.txt file containing just one line: "Julia owns two cats and one dog". The output, however, is really poor: the expected triples [Julia, owns, two cats] and [Julia, owns, one dog] have low scores, and there are many other ill-formed triples, some with even higher scores (complete output attached below).

Is this the normal behavior of the model? Why is the performance so poor? Is there a systematic issue with how I am doing this?

['$input_txt:$ Julia owns two cats and one dog',
 {'deduplicated:': {'Julia [SEP] owns two [SEP] One Dog': [2,
    0.15572701767086983,
    [[0, 5], [24, 31]],
    8,
    0],
   'Julia [SEP] owns two cats [SEP] One Dog': [5,
    0.38142501655966043,
    [[0, 5], [24, 31]],
    21,
    0],
   'Julia [SEP] cats [SEP] One Dog': [2,
    0.09584893216378987,
    [[0, 5], [24, 31]],
    6,
    0],
   'Two Cats [SEP] cats [SEP] One Dog': [2,
    0.09491016250103712,
    [[11, 19], [24, 31]],
    6,
    0],
   'Julia [SEP] owns two cats and [SEP] One Dog': [6,
    0.4070159122347832,
    [[0, 5], [24, 31]],
    26,
    0],
   'Julia [SEP] cats and [SEP] Two Cats': [2,
    0.14055378548800945,
    [[0, 5], [11, 19]],
    9,
    0],
   'Julia [SEP] two cats [SEP] One Dog': [1,
    0.06196947582066059,
    [[0, 5], [24, 31]],
    4,
    0],
   'Julia [SEP] one [SEP] Two Cats': [1,
    0.06188515014946461,
    [[0, 5], [11, 19]],
    4,
    0],
   'Julia [SEP] owns [SEP] Two Cats': [6,
    0.2982198027893901,
    [[0, 5], [11, 19]],
    20,
    0],
   'Julia [SEP] owns [SEP] One Dog': [4,
    0.17877793312072754,
    [[0, 5], [24, 31]],
    12,
    0],
   'Julia [SEP] two [SEP] One Dog': [3,
    0.1447404371574521,
    [[0, 5], [24, 31]],
    10,
    0],
   'Julia [SEP] and one [SEP] Two Cats': [5,
    0.32297115167602897,
    [[0, 5], [11, 19]],
    23,
    0],
   'Two Cats [SEP] owns [SEP] One Dog': [8,
    0.44091942673549056,
    [[11, 19], [24, 31]],
    32,
    0],
   'Julia [SEP] and [SEP] One Dog': [1,
    0.04122000187635422,
    [[0, 5], [24, 31]],
    3,
    0],
   'Julia [SEP] cats and one [SEP] Two Cats': [2,
    0.10967723815701902,
    [[0, 5], [11, 19]],
    8,
    0],
   'Two Cats [SEP] one [SEP] One Dog': [2,
    0.08130411058664322,
    [[11, 19], [24, 31]],
    6,
    0],
   'Two Cats [SEP] owns two [SEP] One Dog': [8,
    0.4609282175078988,
    [[11, 19], [24, 31]],
    36,
    0],
   'Julia [SEP] and [SEP] Two Cats': [3,
    0.14550211280584335,
    [[0, 5], [11, 19]],
    12,
    0],
   'Two Cats [SEP] two [SEP] One Dog': [4,
    0.19086267473176122,
    [[11, 19], [24, 31]],
    16,
    0],
   'Two Cats [SEP] and [SEP] One Dog': [6,
    0.1381131475791335,
    [[11, 19], [24, 31]],
    21,
    0]}}]
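
To make the comparison in the question concrete, here is a small sketch that works on four of the entries above. It assumes, without confirmation from the repository, that the second list value is the score the question refers to and that the third value holds character offsets of the first and last key parts in the input sentence. `rank_by_score` and `span_text` are hypothetical helper names.

```python
text = "Julia owns two cats and one dog"

# Four entries copied from the output above (key -> list of five values).
candidates = {
    "Julia [SEP] owns [SEP] Two Cats": [6, 0.2982198027893901, [[0, 5], [11, 19]], 20, 0],
    "Julia [SEP] owns [SEP] One Dog": [4, 0.17877793312072754, [[0, 5], [24, 31]], 12, 0],
    "Two Cats [SEP] owns [SEP] One Dog": [8, 0.44091942673549056, [[11, 19], [24, 31]], 32, 0],
    "Two Cats [SEP] owns two [SEP] One Dog": [8, 0.4609282175078988, [[11, 19], [24, 31]], 36, 0],
}

def rank_by_score(entries):
    """Sort (key, values) pairs by the second list value, highest first."""
    return sorted(entries.items(), key=lambda kv: kv[1][1], reverse=True)

def span_text(sentence, span):
    """Return the slice of the sentence covered by a [start, end] pair."""
    start, end = span
    return sentence[start:end]

ranked = rank_by_score(candidates)
for key, values in ranked:
    head_span, tail_span = values[2]
    print(f"{values[1]:.3f}  {key}  "
          f"({span_text(text, head_span)!r}, {span_text(text, tail_span)!r})")
```

On this subset the two expected triples come last, which matches the observation in the question. The slices recover "Julia", "two cats", and "one dog", so the spans are consistent with character positions in the input sentence.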

Help running inference

Hi,
I am unable to figure out how to run OIE inference using the deepex models. Could you please help by answering:

Where is the model?
How do I run OIE on the model?

Thanks.