Comments (7)
Hi! By default it caches the predictions and references used to compute the metric in ~/.cache/huggingface/datasets/metrics (not ~/.datasets/). Let me update the documentation, @bhavitvyamalik.
The cache is used to store all the predictions and references passed to add_batch, so that the metric can be computed later when compute is called.
I think the issue might come from the default cache directory. Can you check that you have the right permissions? Otherwise, feel free to set cache_dir to another location.
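For illustration, here's a minimal sketch of that flow with cache_dir pointed somewhere custom (the /tmp/metric_cache path is just an example of a writable location):

from datasets import load_metric

# cache_dir controls where add_batch writes its temporary Arrow files
metric = load_metric('accuracy', cache_dir='/tmp/metric_cache')

# each call appends a batch of predictions/references to the on-disk cache
metric.add_batch(predictions=[1, 1], references=[1, 0])
metric.add_batch(predictions=[1, 1], references=[1, 0])

# compute() reads all cached batches back and returns the score
print(metric.compute())  # {'accuracy': 0.5}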
Hi, I'm facing the same issue.
I'm trying to run beir/examples/retrieval/evaluation/dense/evaluate_sbert_multi_gpu.py. Doing so, I end up with the error below.
Traceback (most recent call last):
File "evaluate_sbert_multi_gpu.py", line 62, in
results = retriever.retrieve(corpus, queries)
File "/data/user/beir/beir/retrieval/evaluation.py", line 23, in retrieve
return self.retriever.search(corpus, queries, self.top_k, self.score_function, **kwargs)
File "/data/user/beir/beir/retrieval/search/dense/exact_search_multi_gpu.py", line 150, in search
cos_scores_top_k_values, cos_scores_top_k_idx, chunk_ids = metric.compute()
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/evaluate/module.py", line 433, in compute
self._finalize()
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/evaluate/module.py", line 390, in _finalize
self.data = Dataset(**reader.read_files([{"filename": f} for f in file_paths]))
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 260, in read_files
pa_table = self._read_files(files, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 195, in _read_files
pa_table: Table = self._get_table_from_filename(f_dict, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 331, in _get_table_from_filename
table = ArrowReader.read_table(filename, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 352, in read_table
return table_cls.from_file(filename)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/table.py", line 1065, in from_file
table = _memory_mapped_arrow_table_from_file(filename)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/table.py", line 52, in _memory_mapped_arrow_table_from_file
pa_table = opened_stream.read_all()
File "pyarrow/ipc.pxi", line 750, in pyarrow.lib.RecordBatchReader.read_all
File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: Expected to be able to read 80088040 bytes for message body, got 80088032
Steps to reproduce:
conda create -n beir python=3.7 -y
conda activate beir
pip install beir
pip install evaluate
Change the variable dataset from "nfcorpus" to "quora"
Run the command: CUDA_VISIBLE_DEVICES=0,1,2,3 python beir/examples/retrieval/evaluation/dense/evaluate_sbert_multi_gpu.py
@lhoestq any idea what might be causing this?
Also, I tested the function on some small data and got the same message:
Python 3.8.5 (default, Jan 27 2021, 15:41:15)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datasets import load_metric
>>> metric = load_metric('accuracy')
>>> metric.add_batch(predictions=[1, 1, 1, 1], references=[1, 1, 0, 0])
2021-05-15 16:39:17.240991: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
>>> metric.compute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/metric.py", line 391, in compute
self._finalize()
File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/metric.py", line 342, in _finalize
self.writer.finalize()
File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/arrow_writer.py", line 370, in finalize
self.stream.close()
File "pyarrow/io.pxi", line 132, in pyarrow.lib.NativeFile.close
File "pyarrow/error.pxi", line 112, in pyarrow.lib.check_status
OSError: error closing file
Hi @hyusterr,
If you look at the example provided in metrics/accuracy.py, it only does metric.compute() to calculate the accuracy. Here's an example:
from datasets import load_metric
metric = load_metric('accuracy')
output = metric.compute(predictions=[1, 1, 1, 1], references=[1, 1, 0, 0])
print(output['accuracy']) # 0.5
I thought I could use Metric to collect predictions and references; this follows the steps from Hugging Face's sample colab.
BTW, I fixed the problem by setting a different cache_dir in load_metric, but I'm still wondering about the mechanism.
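In case it helps others hitting the same permission error, a minimal sketch of that workaround in an eval-loop style (the batches and the /tmp path are purely illustrative):

from datasets import load_metric

# any directory you have write permissions for should work here
metric = load_metric('accuracy', cache_dir='/tmp/my_metric_cache')

# collect predictions/references batch by batch, e.g. inside an evaluation loop
for preds, refs in [([1, 1], [1, 1]), ([1, 1], [0, 0])]:
    metric.add_batch(predictions=preds, references=refs)

print(metric.compute())  # {'accuracy': 0.5}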
I tried this code in a Colab notebook and it worked fine (with GPU enabled):
from datasets import load_metric
metric = load_metric('accuracy')
metric.add_batch(predictions=[1, 1, 1, 1], references=[1, 1, 0, 0])  # add_batch returns None
final_score = metric.compute()
print(final_score)  # {'accuracy': 0.5}
Also, in load_metric, I saw that cache_dir is optional and it defaults to ~/.datasets/.
Closing this for now. Will re-open it should the issue still persist.