sheffieldnlp / naacl2018-fever
Fact Extraction and VERification baseline published in NAACL2018
Home Page: http://fever.ai
License: Apache License 2.0
I noticed that https://jamesthorne.co/fever/model.tar.gz is not responding, so I cannot download the pre-trained model. Are there any other sites from which I can download it?
Thank you.
# if using a CPU, set
export CUDA_DEVICE=-1
# if using a GPU, set
export CUDA_DEVICE=0  # or cuda device id
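A hedged sketch (not taken from the repo) of how a training script might consume the CUDA_DEVICE variable exported above, with -1 selecting the CPU:

```python
import os

# Assumption: the scripts read CUDA_DEVICE from the environment;
# -1 means run on CPU, any non-negative id selects a GPU.
cuda_device = int(os.environ.get("CUDA_DEVICE", "-1"))
use_gpu = cuda_device >= 0
```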
To run
Extra:
How long is evaluation supposed to take? I'm running the evidence retrieval step on 8 vCPUs, 16 GB RAM, and an SSD, and for the dev set it's projecting almost 10 hours.
Is this expected?
Do not give any partial credit if only a partial set of documents is returned.
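One reading of the rule above can be sketched as an all-or-nothing check (a hypothetical helper, not the scorer's actual code): credit is awarded only when every gold document is among the retrieved ones.

```python
def document_credit(gold, retrieved):
    """Return 1.0 only if every gold document was retrieved, else 0.0."""
    return 1.0 if set(gold) <= set(retrieved) else 0.0

full = document_credit(["A", "B"], ["A", "B", "C"])   # 1.0: all gold found
partial = document_credit(["A", "B"], ["A"])          # 0.0: no partial credit
```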
I see this after cloning the latest version.
Traceback (most recent call last):
File "src/scripts/build_tfidf.py", line 34, in
count_matrix, doc_dict = get_count_matrix(
NameError: name 'get_count_matrix' is not defined
Looking at drqascripts.retriever.build_tfidf:
class TfIdfBuilder(builtins.object):
get_count_matrix(self)
But it is called as a bare module-level function in src/scripts/build_tfidf.py, not via an instance.
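A minimal reproduction of this mismatch (hypothetical stand-in names mirroring the issue): a method defined on a class is not visible as a module-level name, so calling it bare raises the NameError in the traceback.

```python
class TfIdfBuilder:
    # Stand-in for drqascripts.retriever.build_tfidf.TfIdfBuilder
    def get_count_matrix(self):
        return "count_matrix"

builder = TfIdfBuilder()
result = builder.get_count_matrix()  # correct: call via an instance

try:
    get_count_matrix()  # how build_tfidf.py calls it -> NameError
except NameError as e:
    error = str(e)
```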
Played with the interactive mode and observed the following behaviour: the documents retrieved are relevant only to some part of the claim, e.g.:
I am wondering whether:
Label distribution.
Claim length
Hi,
Inside the "process_tfidf_drqa.py" file there is a line that tries to import OnlineTfidfDocRanker (i.e. "from drqascripts.retriever.build_tfidf_lines import OnlineTfidfDocRanker"); however, I cannot locate this file in the provided codebase. Could you tell me where I can find it?
I got the error
Traceback (most recent call last):
File "src/scripts/retrieval/document/eval_mrr.py", line 22, in
evidence = set([t[1] for t in js["evidence"] if isinstance(t,list) and len(t)>1])
TypeError: unhashable type: 'list'
when running
python src/scripts/retrieval/document/eval_mrr.py --split dev --count 5
It seems that t[1] can be a list, but isn't it supposed to be a string?
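A hedged sketch of one possible workaround (not the maintainers' fix): since t[1] can itself be a list of page names, converting list entries to tuples makes them hashable before adding them to the set.

```python
# Toy evidence structure mirroring the issue: t[1] is sometimes a
# string and sometimes a list of pages.
js = {"evidence": [[0, "Page_A"], [1, ["Page_B", "Page_C"]]]}

evidence = {
    tuple(t[1]) if isinstance(t[1], list) else t[1]
    for t in js["evidence"]
    if isinstance(t, list) and len(t) > 1
}
```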
Hi,
I hit an error when trying to build the Dockerfile:
command:
$ sudo docker build .
error:
Step 21/23 : RUN conda install -y pytorch=0.3.1 torchvision -c pytorch
...
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/pytorch/linux-64/pytorch-0.3.1-py36_cuda8.0.61_cudnn7.1.2_3.tar.bz2>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
Cheers,
When trying to train a model I get an error:
FileNotFoundError: [Errno 2] No such file or directory: 'data/fever/train.ns.pages.p1.jsonl'
I believe some part of the instructions is missing; perhaps running the script src/scripts/dataset/download_dataset.py with the appropriate parameters to get the needed files?
Thanks,
Slavko
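A hedged sketch of a pre-flight check (a hypothetical helper, with the path copied from the error above) that would report the missing preprocessing step before training starts:

```python
from pathlib import Path

# Assumption: the trainer expects this file, per the FileNotFoundError.
required = Path("data/fever/train.ns.pages.p1.jsonl")
message = None
if not required.exists():
    message = (
        f"Missing {required}: run the dataset download/sampling "
        "scripts first."
    )
```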
In particular:
Use of common sense/world knowledge
3k labels from each class to create balanced dev/test
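The balanced-split idea above can be sketched as sampling the same number of examples per label (3k per class in the issue; 2 here for the demo). This is an illustrative reconstruction, not the repo's actual sampling code.

```python
import random

random.seed(0)
labels = ["SUPPORTS", "REFUTES", "NOT ENOUGH INFO"]
data = [{"id": i, "label": labels[i % 3]} for i in range(30)]

per_class = 2  # would be 3000 for the real dev/test split
by_label = {}
for ex in data:
    by_label.setdefault(ex["label"], []).append(ex)

dev = [ex for exs in by_label.values()
       for ex in random.sample(exs, per_class)]
```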
When I ran src/scripts/rte/da/eval_da.py, it showed "Did not use initialization regex that was passed: .*token_embedder_tokens\._projection.*weight". My allennlp version is 0.2.3. How can I solve this error?
When running the following command:
PYTHONPATH=src python src/scripts/retrieval/ir.py --db data/fever/fever.db --model data/index/fever-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz --in-file data/fever-data/dev.jsonl --out-file data/fever/dev.sentences.p5.s5.jsonl --max-page 5 --max-sent 5
The new version of the code gives me a key error.
The exception comes from the line (from drqascripts/retriever/build_tfidf.py in function count)
col.extend([DOC2IDX[doc_id]] * len(counts))
I do not get this error if I set the --parallel flag to false. I am not sure, but I am guessing the issue lies in the multiprocessing part of the code.
Thanks for your help!
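A hedged, defensive rewrite of the failing line above (toy data, not the repo's fix): skip doc ids that are absent from DOC2IDX instead of raising KeyError.

```python
# Toy stand-ins for the structures in drqascripts/retriever/build_tfidf.py.
DOC2IDX = {"doc_a": 0, "doc_b": 1}
counts = [3, 5]

col = []
for doc_id in ["doc_a", "doc_missing"]:
    if doc_id in DOC2IDX:  # guard against ids missing from the index
        col.extend([DOC2IDX[doc_id]] * len(counts))
```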
Wikipedia articles are pre-tokenized and just need splitting on spaces; claims need to be tokenized properly.
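The two tokenization paths can be sketched as follows; a naive regex stands in for a real tokenizer here, purely for illustration.

```python
import re

# Pre-tokenized Wikipedia text: a whitespace split is enough.
wiki_line = "Barack Obama was born in Hawaii ."
wiki_tokens = wiki_line.split(" ")

# Raw claim text: needs actual tokenization (regex used as a stand-in).
claim = "Barack Obama was born in Hawaii."
claim_tokens = re.findall(r"\w+|[^\w\s]", claim)
```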
Currently the builddb script uses Facebook's logging; we should use FEVER's logging instead.
Building wheels for collected packages: drqa, fever-scorer, drqa, fever-scorer
Running setup.py bdist_wheel for drqa ... done
Stored in directory: /tmp/pip-ephem-wheel-cache-cfhla23v/wheels/25/a8/71/6390f88d8b3ecda4c32998985670851ed7281bfa8ced27196e
Running setup.py bdist_wheel for fever-scorer ... done
Stored in directory: /tmp/pip-ephem-wheel-cache-cfhla23v/wheels/e0/c6/1a/8ff7f96802122bf337bfc8e05852f7d5618a6cffc95b5ee624
Running setup.py bdist_wheel for drqa ... error
Complete output from command /home/ubuntu/anaconda3/envs/fever/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-vnfl6_v5/drqa/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-5hao0iio --python-tag cp36:
Traceback (most recent call last):
File "", line 1, in
File "/home/ubuntu/anaconda3/envs/fever/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-vnfl6_v5/drqa/setup.py'
Failed building wheel for drqa
Running setup.py clean for drqa
Complete output from command /home/ubuntu/anaconda3/envs/fever/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-vnfl6_v5/drqa/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" clean --all:
Traceback (most recent call last):
File "", line 1, in
File "/home/ubuntu/anaconda3/envs/fever/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-vnfl6_v5/drqa/setup.py'
Failed cleaning build dir for drqa
Running setup.py bdist_wheel for fever-scorer ... error
Complete output from command /home/ubuntu/anaconda3/envs/fever/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-vnfl6_v5/fever-scorer/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-3rmxr6dz --python-tag cp36:
Traceback (most recent call last):
File "", line 1, in
File "/home/ubuntu/anaconda3/envs/fever/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-vnfl6_v5/fever-scorer/setup.py'
Failed building wheel for fever-scorer
Running setup.py clean for fever-scorer
Complete output from command /home/ubuntu/anaconda3/envs/fever/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-vnfl6_v5/fever-scorer/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" clean --all:
Traceback (most recent call last):
File "", line 1, in
File "/home/ubuntu/anaconda3/envs/fever/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-vnfl6_v5/fever-scorer/setup.py'
Failed cleaning build dir for fever-scorer
When I trained the Decomposable Attention model, I got this error:
File "src/scripts/rte/da/train_da.py", line 9, in <module>
from allennlp.data import Vocabulary, Dataset, DataIterator, DatasetReader, Tokenizer, TokenIndexer
ImportError: cannot import name 'Dataset'
I installed allennlp from source, and when I looked into the data folder there was nothing called Dataset. Does anybody know how to fix this problem? Thanks!
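A hedged sketch of a defensive import (an assumption, not the repo's fix): the Dataset class exists in old allennlp releases such as 0.2.3 but was removed later, so guarding the import yields a clearer message than a bare ImportError.

```python
try:
    from allennlp.data import Dataset  # present in old allennlp only
    have_dataset = True
except ImportError:
    # Newer allennlp (or no allennlp at all): suggest the pinned version.
    Dataset = None
    have_dataset = False
```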
My hunch is that the baseline uses a specific version of drqa, different from the one in drqa's repo. However, when I try to install it with the following command, as specified in the requirements:
pip3 install git+git://github.com/j6mes/drqa@fever#egg=DrQA-0.1.3
I got the error:
Building wheels for collected packages: drqa, drqa
Running setup.py bdist_wheel for drqa ... done
Stored in directory: /private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-ephem-wheel-cache-ok8ikj6r/wheels/2a/62/41/ddc1e0efc8a4f3becd45012e6624752e2fe5fbf733a5b61d3a
Running setup.py bdist_wheel for drqa ... error
Complete output from command /usr/local/opt/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-install-6vmieha8/drqa/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-wheel-t6ci4qrb --python-tag cp36:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-install-6vmieha8/drqa/setup.py'
----------------------------------------
Failed building wheel for drqa
Running setup.py clean for drqa
Complete output from command /usr/local/opt/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-install-6vmieha8/drqa/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" clean --all:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/private/var/folders/gk/hj0gqfws7sj4s3dk8rnckplc0000gn/T/pip-install-6vmieha8/drqa/setup.py'
----------------------------------------
Failed cleaning build dir for drqa
Successfully built drqa
Failed to build drqa
Can somebody help please? Thank you!