Comments (2)
Hi @sergeichukd,
We provide a number of classification and information retrieval metrics that can be used during training via the EvalCallback object in tensorflow_similarity.callbacks. In general we tend to use the binary_accuracy metric, as it can be thought of as the within-threshold precision multiplied by the recall over the total number of elements in the database. There are more details in the docstrings under the classification metrics dir.
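To make the binary_accuracy idea concrete, here is a rough pure-NumPy sketch of the decision it scores (an illustration of the concept only, not the library's implementation): a query's nearest neighbor is accepted as a match when its distance is within the threshold, and the prediction counts as correct when that accept/reject decision agrees with the labels.

```python
import numpy as np

def toy_binary_accuracy(nn_dist, nn_label, true_label, threshold):
    """Toy within-threshold binary accuracy (illustration only).

    A query is predicted to match its nearest neighbor when the
    distance is <= threshold. The prediction is scored correct when:
      * it accepts and the neighbor label equals the true label, or
      * it rejects and the neighbor label differs from the true label.
    """
    nn_dist = np.asarray(nn_dist)
    nn_label = np.asarray(nn_label)
    true_label = np.asarray(true_label)

    predicted_match = nn_dist <= threshold
    label_agrees = nn_label == true_label
    correct = np.where(predicted_match, label_agrees, ~label_agrees)
    return correct.mean()

# 4 queries: two correct accepts, one miss (good neighbor rejected),
# one false accept (wrong neighbor within threshold) -> accuracy 0.5.
acc = toy_binary_accuracy(
    nn_dist=[0.1, 0.2, 0.6, 0.3],
    nn_label=[1, 2, 3, 1],
    true_label=[1, 2, 3, 2],
    threshold=0.5,
)
```

Lowering the threshold trades recall for precision, which is why the calibration step below searches for the threshold that maximizes this metric.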
Regarding the comment: the issue here is that we train the model using a subset of the classes but include all classes in the validation dataset. This gives us a sense of how well the model will generalize to new, unseen classes, but it also means the validation metrics will lag behind the train metrics.
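That class-disjoint split can be sketched as follows (a hypothetical helper in plain NumPy, not part of the library): some classes are held out of training entirely while the validation set keeps every class, so validation measures generalization to classes the model never saw.

```python
import numpy as np

def class_disjoint_split(x, y, train_classes):
    """Hypothetical helper: train on a subset of classes, validate on all.

    Returns (x_train, y_train), (x_val, y_val) where the training split
    contains only `train_classes` and the validation split keeps every
    class, including ones never seen during training.
    """
    x, y = np.asarray(x), np.asarray(y)
    train_mask = np.isin(y, list(train_classes))
    return (x[train_mask], y[train_mask]), (x, y)

x = np.arange(6)
y = np.array([0, 0, 1, 1, 2, 2])
(train_x, train_y), (val_x, val_y) = class_disjoint_split(x, y, train_classes={0, 1})
# train_y contains only classes {0, 1}; val_y still contains class 2.
```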
Let me know if you have any questions about the EvalCallback object or any of the metrics. Here is a more detailed example of running some of the evaluations after the model has finished training.
```python
# Assumes you have trained a model and that index data != query data.
from collections import Counter

import numpy as np

from tensorflow_similarity.retrieval_metrics import MapAtK, PrecisionAtK, RecallAtK
from tensorflow_similarity.search import NMSLibSearch

# Add all examples to the index.
brute_force_search = NMSLibSearch(
    distance="cosine",
    dim=model.output.shape[1],
    method="brute_force",
)

# Create or clear the index.
try:
    model.reset_index()  # clear the index
except AttributeError:
    model.create_index(brute_force_search)  # or create it

model.index(index_x, y=index_y, data=index_human_readable_data)

calibrate_metrics = model.calibrate(
    query_x,
    y=query_y,
    thresholds_targets={"0.99": 0.99, "0.95": 0.95, "0.90": 0.90, "0.85": 0.85, "0.80": 0.80},
    calibration_metric="binary_accuracy",
)

eval_cal = model.evaluate_classification(
    query_x,
    y=query_y,
    extra_metrics=["precision", "binary_accuracy", "recall", "npv", "fpr"],
)


def make_recall_at_k(k: int) -> RecallAtK:
    return RecallAtK(k=k, average="macro")


def make_precision_at_k(k: int) -> PrecisionAtK:
    return PrecisionAtK(k=k, average="macro")


def make_map_at_r(targets_y: np.ndarray, max_class_count: int) -> MapAtK:
    class_counts = Counter(targets_y)
    max_class_count = min(max(class_counts.values()), max_class_count)
    return MapAtK(
        r=class_counts,
        clip_at_r=True,
        k=max_class_count,
        name="map@R",
    )


def make_r_precision(targets_y: np.ndarray, max_class_count: int) -> PrecisionAtK:
    class_counts = Counter(targets_y)
    max_class_count = min(max(class_counts.values()), max_class_count)
    return PrecisionAtK(
        r=class_counts,
        clip_at_r=True,
        k=max_class_count,
        name="R_Precision",
    )


recall_at_k = [make_recall_at_k(k) for k in [1, 2, 4, 8, 16, 32]]
precision_at_k = [make_precision_at_k(k) for k in [1, 2, 4, 8, 16, 32]]

metrics = [
    make_map_at_r(df["label"].cat.codes.values, 300),
    make_r_precision(df["label"].cat.codes.values, 300),
]
metrics.extend(recall_at_k + precision_at_k)

eval_cal = model.evaluate_retrieval(
    query_x,
    y=query_y,
    retrieval_metrics=metrics,
)
```
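For reference, the map@R configuration above (r per class with clip_at_r) averages precision over the first R retrieved results, where R is the number of same-class items in the index for that query. A small pure-Python sketch of that computation (conceptual only, not the library code):

```python
def average_precision_at_r(ranked_labels, query_label, r):
    """Conceptual AP@R: accumulate precision at each rank (up to r)
    where a relevant item appears, then divide by r. Dividing by r
    rather than by the number of hits penalizes queries that fail to
    retrieve all r same-class items."""
    hits = 0
    ap = 0.0
    for rank, label in enumerate(ranked_labels[:r], start=1):
        if label == query_label:
            hits += 1
            ap += hits / rank
    return ap / r

# Query of class 1 whose class appears twice in the index (R = 2):
perfect = average_precision_at_r([1, 1, 0], query_label=1, r=2)  # 1.0
partial = average_precision_at_r([1, 0, 1], query_label=1, r=2)  # 0.5
```

map@R is then the mean of this quantity over all queries, which is why the factory above passes the per-class counts as r.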
Thank you, @owenvallis!
This callback is extremely helpful for me