Comments (4)
Hi @daegonYu,
This is a hyperparameter for tuning. Empirically, we observe that a lower temperature leads to better performance but may cause training instability under float16 precision for large models. A lower temperature allows the logits to vary over a wider range and thus gives more flexibility.
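For concreteness, here is a minimal sketch of how the temperature enters an InfoNCE loss with in-batch negatives. This is illustrative PyTorch code, not the actual E5 training code; the function name, batch setup, and use of in-batch negatives are assumptions:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, doc_emb, temperature=0.01):
    """Illustrative InfoNCE with in-batch negatives.

    query_emb, doc_emb: (batch, dim) L2-normalized embeddings,
    where doc_emb[i] is the positive for query_emb[i].
    """
    # Cosine similarity matrix; values lie in [-1, 1].
    sim = query_emb @ doc_emb.T
    # Dividing by the temperature rescales the logits:
    # t = 0.01 stretches [-1, 1] to [-100, 100].
    logits = sim / temperature
    # Positive pairs sit on the diagonal.
    labels = torch.arange(query_emb.size(0), device=query_emb.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings:
q = F.normalize(torch.randn(8, 768), dim=-1)
d = F.normalize(torch.randn(8, 768), dim=-1)
print(info_nce_loss(q, d, temperature=0.01))
```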
from unilm.
“A lower temperature allows the logits to vary in a wider range and thus has more flexibility.” This can be read as saying that the embeddings find it easier to learn more diverse representations. But the FAQ at https://huggingface.co/intfloat/multilingual-e5-base says:
3. Why does the cosine similarity scores distribute around 0.7 to 1.0?
This is a known and expected behavior as we use a low temperature 0.01 for InfoNCE contrastive loss.
For text embedding tasks like text retrieval or semantic similarity, what matters is the relative order of the scores instead of the absolute values, so this should not be an issue.
If the logits can vary over a wider range, I would expect the cosine similarities to be distributed over a wide range as well, yet they are concentrated between 0.7 and 1.0. These two statements seem contradictory, which makes this hard to understand. Simply put, I wonder why lowering the temperature allows learning a wider range of logits.
from unilm.
The logits are calculated as cosine_similarity / t. Since cosine similarity lies in [-1, 1], the logits fall in [-100, 100] with t = 0.01, in [-50, 50] with t = 0.02, and so on.
However, this does not mean the learned cosine similarities will span a wider range. On the contrary, the cosine similarities tend to concentrate as the temperature becomes lower.
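A small numerical sketch makes both effects visible (the cosine values below are made up for illustration): a low temperature stretches the logit range, and at the same time it lets a modest cosine margin already saturate the softmax, so training has little pressure to spread the similarities further apart:

```python
import torch

# Hypothetical cosine similarities: one positive, two negatives.
cos = torch.tensor([0.9, 0.7, 0.5])

for t in (1.0, 0.02, 0.01):
    logits = cos / t  # t = 0.01 -> [90, 70, 50]
    probs = torch.softmax(logits, dim=0)
    print(f"t={t}: logits={logits.tolist()} probs={probs.tolist()}")

# t=1.0  -> probs ~ [0.40, 0.33, 0.27]: the positive barely wins,
#           so the loss keeps pushing similarities apart.
# t=0.01 -> probs ~ [1.0, 2e-9, 4e-18]: a 0.2 cosine margin is
#           already decisive, so the learned similarities can stay
#           concentrated (e.g. around 0.7 to 1.0).
```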
from unilm.