I use this code to compute a WEAT:


WEAT returns nothing about wefe HOT 4 CLOSED

dccuchile commented on May 19, 2024

WEAT returns nothing

from wefe.

Comments (4)

raffaem commented on May 19, 2024

Yeah, the weat return as follows:

{'query_name':  [MY QUERY NAME], 'result': nan, 'weat': nan, 'effect_size': nan}


raffaem commented on May 19, 2024

Is it possible to know why it is not returning a result?


pbadillatorrealba commented on May 19, 2024

Hello

Based on what you describe (the query returns values with some models but not with others), I would infer that the problem arises when the query word sets are converted to embedding sets: at least one word set is losing more than 20% of its words. In that case, WEFE invalidates the query by default, which is why the result fields come back as nan.
This can happen because the model you are using has no capitalized words, has no accented words, or simply does not contain the words in its vocabulary.
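To make the threshold idea concrete, here is a minimal sketch of the check described above. It is not WEFE's implementation: the helper names are hypothetical, the vocabulary is a plain Python set standing in for a model, and the strict "greater than 20%" comparison is an assumption.

```python
def lost_fraction(words, vocabulary):
    """Return the fraction of `words` that are missing from `vocabulary`."""
    if not words:
        return 0.0
    missing = [w for w in words if w not in vocabulary]
    return len(missing) / len(words)


def sets_over_threshold(word_sets, vocabulary, threshold=0.2):
    """Return {set name: lost fraction} for every set that loses more than `threshold`."""
    report = {}
    for name, words in word_sets.items():
        fraction = lost_fraction(words, vocabulary)
        if fraction > threshold:  # assumption: strict comparison
            report[name] = fraction
    return report


# Toy vocabulary standing in for a lowercase-only embedding model.
vocabulary = {"she", "he", "woman", "man", "science", "art"}
word_sets = {
    "Female terms": ["She", "woman", "girl", "mother", "she"],
    "Male terms": ["he", "man"],
}

flagged = sets_over_threshold(word_sets, vocabulary)
print(flagged)
```

Any set that shows up in `flagged` would be enough, under this reading, to invalidate the whole query.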

How queries are invalidated when too many words are missing is described in the warning in this subsection:
https://wefe.readthedocs.io/en/latest/user_guide.html#word-preprocessors

You can use the parameter warn_not_found_words=True to see which words are being lost when converting the query to embeddings.

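from wefe.query import Query
from wefe.word_embedding_model import WordEmbeddingModel

# wv, model_name, the word sets and the weat metric object come from your own code.
wefemodel = WordEmbeddingModel(wv, model_name)
query = Query(target_sets, attribute_sets, target_sets_names, attribute_sets_names)
# warn_not_found_words=True reports each word that is not found in the model vocabulary.
result_weat = weat.run_query(
    query, wefemodel, calculate_p_value=True, warn_not_found_words=True,
)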

A possible solution would be to use a word preprocessor (specified through the run_query parameters preprocessor_args or secondary_preprocessor_args).

wefemodel = WordEmbeddingModel(wv, model_name)
query = Query(target_sets, attribute_sets, target_sets_names, attribute_sets_names)
result_weat = weat.run_query(
    query,
    wefemodel,
    calculate_p_value=True,
    secondary_preprocessor_args={"lowercase": True, "strip_accents": True},
    warn_not_found_words=True,
)

In practical terms, this parameter tells run_query to look up each word of each set in its original form first; if that form is not in the model vocabulary, the word is preprocessed (lowercased and stripped of accents) and the search is retried.
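The two-step lookup can be sketched in plain Python as follows. This only illustrates the idea and is not how WEFE implements it: `strip_accents` and `lookup` are hypothetical helpers, and the vocabulary is a plain set rather than a real model.

```python
import unicodedata


def strip_accents(word):
    """Remove combining accent marks, e.g. 'café' -> 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def lookup(word, vocabulary):
    """Try the original word first, then its lowercased, accent-free form."""
    if word in vocabulary:
        return word
    fallback = strip_accents(word.lower())
    if fallback in vocabulary:
        return fallback
    return None  # the word is lost


# Toy vocabulary: mostly lowercase and without accents.
vocabulary = {"maria", "cafe", "Paris"}
found = [lookup(w, vocabulary) for w in ["Paris", "María", "Café", "Tokyo"]]
print(found)
```

Here "Paris" matches as written, "María" and "Café" match only after preprocessing, and "Tokyo" is still lost and would count against its set.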

Pablo.


raffaem commented on May 19, 2024

Hello,

Thank you for your support and your prompt and detailed answer.

I'm making sure that all the words of the word sets are present in the embedding before running the query. So I don't think that's the problem.

Anyway, I think WEFE should throw an exception by default instead of returning nothing.
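In the meantime, a caller-side guard along these lines could turn a silent nan into an error. It is only a sketch: `require_valid_result` is a hypothetical helper, not part of WEFE, and it assumes the result is a dict shaped like the one shown earlier in this thread.

```python
import math


def require_valid_result(result, keys=("result", "weat", "effect_size")):
    """Raise ValueError if any of the given result fields is nan."""
    bad = [
        key for key in keys
        if isinstance(result.get(key), float) and math.isnan(result[key])
    ]
    if bad:
        raise ValueError(
            f"Query {result.get('query_name')!r} was invalidated; nan fields: {bad}"
        )
    return result


# Example with a dict shaped like the output reported above.
invalid = {
    "query_name": "my query",
    "result": float("nan"),
    "weat": float("nan"),
    "effect_size": float("nan"),
}
try:
    require_valid_result(invalid)
except ValueError as error:
    print(error)
```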

I will try again next week.

Thank you again

