Comments (6)
Thank you for the kind response. When I removed the --fp16
option, I obtained a similar result (though task-specific performance still differs from the pretrained model). Although I know the RTX 2000 series supports mixed precision, I suspect there is some issue with it.
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness | Avg. |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| 69.15 | 82.25 | 74.72 | 81.63 | 78.63 | 78.39 | 69.97 | 76.39 |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
from simcse.
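The variance attributed to fp16 above comes from half-precision rounding: each fp16 value carries only ~11 bits of mantissa, so dot products and norms accumulate small errors that compound over a training run. A minimal, self-contained sketch of the effect (using random stand-in vectors, not the actual SimCSE encoder):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random "sentence embeddings" standing in for encoder outputs
# (illustrative only -- not the actual SimCSE model).
a = rng.standard_normal(768)
b = rng.standard_normal(768)

def cosine(x, y):
    # Plain cosine similarity; computed in whatever dtype x and y carry.
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

sim_fp32 = cosine(a.astype(np.float32), b.astype(np.float32))
sim_fp16 = cosine(a.astype(np.float16), b.astype(np.float16))

# fp16 rounding perturbs the similarity slightly; during training these
# small per-step differences compound, which is one source of the
# run-to-run variance observed with --fp16.
print(f"fp32: {sim_fp32:.6f}  fp16: {sim_fp16:.6f}  "
      f"diff: {abs(sim_fp32 - sim_fp16):.2e}")
```

The per-pair difference is tiny, but gradients accumulate such perturbations at every step, so two otherwise identical runs can diverge noticeably.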
Hi,
Thanks for reporting this. I think it is mainly caused by differences in GPU and CUDA versions. The last reproduced result looks good to me, though (fp16 does introduce a lot of variance).
Hi, have you tried running python simcse_to_huggingface.py --path result/my-unsup-simcse-bert-base-uncased/
to convert the model's state dict and config before evaluation?
Thank you for the quick response. I just tried running the script before evaluation, but I obtained the same results.
$ python simcse_to_huggingface.py --path result/my-unsup-simcse-bert-base-uncased/
SimCSE checkpoint -> Huggingface checkpoint for result/my-unsup-simcse-bert-base-uncased/
$ python evaluation.py --model_name_or_path result/my-unsup-simcse-bert-base-uncased/ --pooler cls_before_pooler --task_set sts --mode test
(some log ...)
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness | Avg. |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| 65.14 | 79.35 | 70.48 | 80.72 | 76.45 | 74.21 | 70.97 | 73.90 |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
In that case, I'm not quite sure how to interpret your results at this point. I tried the scripts on Google Colab (pytorch=1.8.1, cuda=10.1, gpu=Tesla K80) and got an average performance of 75.20, similar to the result reproduced in #25. Hopefully this helps.
Also, it seems to me that intrinsic differences between GPU devices may affect the performance by up to 1 point, and the optimal hyperparameters are likely to differ across devices. So I suggest trying some simple tuning of batch size, learning rate, and pooling method on your own device, and seeing whether the results improve.
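The tuning suggested above can be sketched as a small grid over the three knobs mentioned. The flag names below follow the repo's example training script (train.py with Hugging Face Trainer-style arguments); treat them as assumptions to check against your local copy. The sketch only builds and prints the command lines rather than launching them:

```python
import itertools

# Small grid over the hyperparameters suggested above.
batch_sizes = [64, 128]
learning_rates = [1e-5, 3e-5, 5e-5]
poolers = ["cls", "cls_before_pooler"]

commands = []
for bs, lr, pooler in itertools.product(batch_sizes, learning_rates, poolers):
    # Flag names are assumed from the repo's example script; verify locally.
    cmd = (
        f"python train.py"
        f" --model_name_or_path bert-base-uncased"
        f" --per_device_train_batch_size {bs}"
        f" --learning_rate {lr}"
        f" --pooler_type {pooler}"
        f" --output_dir result/tune-bs{bs}-lr{lr}-{pooler}"
    )
    commands.append(cmd)

# Print the sweep so it can be inspected or dispatched one run at a time,
# whichever fits the device's memory budget.
for cmd in commands:
    print(cmd)
```

Each run can then be converted with simcse_to_huggingface.py and scored with evaluation.py as shown earlier in the thread, keeping whichever configuration gives the best dev STS score.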
I think so. Experiments without fp16 would be better for reproducing the reported results and testing other variants. I'm closing this issue now. Thanks again :)