How do I get the embedding matrix for all entities and relations? (pykeen issue, closed)

pykeen commented on May 24, 2024

How do I get the embedding matrix for all entities and relations?

from pykeen.

Comments (10)

mberr commented on May 24, 2024 2

@mali-git to access the weight data, I have to call

from pykeen.pipeline import pipeline
result = pipeline(
    model='TransE',
    dataset='umls',
)
model = result.model
entity_embeddings = model.entity_embeddings._embeddings.weight.data
relation_embeddings = model.relation_embeddings._embeddings.weight.data

Has this been updated recently or did I overlook something? 😄

Hi, this way should still be possible.

For a more future-proof way you can use

entity_embeddings = model.entity_embeddings()

or more explicitly

entity_embeddings = model.entity_embeddings(indices=None)

which does not depend on the underlying Embedding implementation details, but uses the interface RepresentationModule.

Also, we are extending our framework to have a unified way to store more than one representation for entities/relations, cf. here:

pykeen/src/pykeen/models/nbase.py

Line 310 in 68af86e

class ERModel(

Briefly, the change is to always store a list of representations. In that case, for TransE, you would need to use model.entity_representations[0](). We'll update the docs once we have actually migrated the models.
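The access paths mentioned in this comment can be wrapped in a small version-tolerant helper. This is only a sketch: `get_entity_embeddings` is a hypothetical name and not part of PyKEEN, and the `Fake...`, `OldStyleModel`, and `NewStyleModel` classes are stand-ins for a trained model so that the snippet runs without PyKEEN installed. It assumes nothing beyond the attribute names quoted above.

```python
def get_entity_embeddings(model):
    """Return all entity embeddings, trying the list-based layout first.

    Hypothetical helper (not part of PyKEEN); it relies only on the
    attribute names quoted in this thread.
    """
    representations = getattr(model, "entity_representations", None)
    if representations is not None:
        # List-based layout: TransE keeps its single representation at index 0.
        return representations[0]()
    # Single-module layout: call the representation without indices.
    return model.entity_embeddings()


# Stand-ins for a trained model, so the sketch runs without PyKEEN.
class FakeRepresentation:
    def __init__(self, rows):
        self.rows = rows

    def __call__(self, indices=None):
        # Mirrors the call style above: no indices means "all rows".
        if indices is None:
            return self.rows
        return [self.rows[i] for i in indices]


class OldStyleModel:
    def __init__(self, rows):
        self.entity_embeddings = FakeRepresentation(rows)


class NewStyleModel:
    def __init__(self, rows):
        self.entity_representations = [FakeRepresentation(rows)]


rows = [[0.1, 0.2], [0.3, 0.4]]  # made-up values
old_result = get_entity_embeddings(OldStyleModel(rows))
new_result = get_entity_embeddings(NewStyleModel(rows))
```

With a real model you would pass `result.model` instead of the stand-in objects; which branch is taken depends on the installed PyKEEN version.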


mali-git commented on May 24, 2024 1

Hi @xin-zhao0,

pipeline_result contains the trained model which contains the embeddings:

model = pipeline_result.model
entity_embeddings = model.entity_embeddings.weight.data
relation_embeddings = model.relation_embeddings.weight.data

Each row in entity_embeddings/relation_embeddings contains the embedding of a specific entity/relation. For instance, entity_embeddings[0] returns the embedding of the entity with id 0.

To get the ids of the entities/relations, you can use the internal mappings:

entity_to_id = model.triples_factory.entity_to_id
relation_to_id = model.triples_factory.relation_to_id

Putting everything together you can get the entity embeddings for the entities 'brazil' and 'china' contained in the Nations dataset as follows:

import torch

# pipeline_result is the object returned by pykeen.pipeline.pipeline(...)
model = pipeline_result.model
entity_embeddings = model.entity_embeddings.weight.data
relation_embeddings = model.relation_embeddings.weight.data
entity_to_id = model.triples_factory.entity_to_id

# Look up the row indices of the requested entities, then select those rows
entities = ['brazil', 'china']
entity_ids = torch.tensor([entity_to_id[entity] for entity in entities], dtype=torch.long)
embedded_entities = entity_embeddings[entity_ids]
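To keep labels and vectors together, the mapping and the matrix can be combined into one dictionary. The sketch below uses a hypothetical helper name (`embeddings_by_label`) and a plain nested list with made-up values in place of the real tensor, so it runs on its own. It only assumes that `entity_to_id` maps labels to row indices, as described above.

```python
def embeddings_by_label(label_to_id, matrix, labels=None):
    """Map each label to its row in the embedding matrix.

    Hypothetical helper; works with anything that can be indexed
    by an integer row id (nested list, array, tensor).
    """
    if labels is None:
        labels = label_to_id.keys()
    return {label: matrix[label_to_id[label]] for label in labels}


# Made-up stand-ins for model.triples_factory.entity_to_id and the weights.
entity_to_id = {"brazil": 0, "burma": 1, "china": 2}
entity_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

selected = embeddings_by_label(entity_to_id, entity_embeddings, ["brazil", "china"])

# Inverting the mapping recovers the label that belongs to a given row.
id_to_entity = {idx: label for label, idx in entity_to_id.items()}
```

The same helper can be pointed at `relation_to_id` and the relation embeddings, since both follow the same label-to-row convention.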


mali-git commented on May 24, 2024 1

Thank you. For models like RESCAL and some others, there is a relation-specific matrix M. Is that returned in the results as part of the model?

The relation matrices are also stored in model.relation_embeddings.weight.data, but they need to be reshaped. For RESCAL, you can recover the relation matrices as follows (note that some models have a dedicated relation_dim, which must be used instead when reshaping the relation embeddings):

import torch

# rescal is assumed to be a trained RESCAL model
# Get the relation matrix of the relation with ID 0
relation_embeddings = rescal.relation_embeddings.weight.data
relation = relation_embeddings[0].view(rescal.embedding_dim, rescal.embedding_dim)

# Get the relation matrices for the relations 'officialvisits' and 'reldiplomacy' contained in the Nations dataset
relation_to_id = rescal.triples_factory.relation_to_id
relations = ['officialvisits', 'reldiplomacy']
relation_ids = torch.tensor([relation_to_id[relation] for relation in relations], dtype=torch.long)
relation_matrices = relation_embeddings[relation_ids].view(-1, rescal.embedding_dim, rescal.embedding_dim)
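What the view(embedding_dim, embedding_dim) call does can be illustrated without PyTorch: each row of the relation weight matrix holds d*d numbers, which are cut into d consecutive rows of length d (row-major order, as for a contiguous tensor). The helper `as_square_matrix` and the values below are hypothetical and only meant as a sketch of that reshaping step.

```python
def as_square_matrix(flat, dim):
    """Reshape dim*dim flat values into a dim x dim nested list (row-major)."""
    if len(flat) != dim * dim:
        raise ValueError(f"expected {dim * dim} values, got {len(flat)}")
    # Row r takes the slice flat[r*dim : (r+1)*dim].
    return [list(flat[row * dim:(row + 1) * dim]) for row in range(dim)]


# Made-up flat relation vector for embedding_dim = 3.
flat_relation = [0, 1, 2, 3, 4, 5, 6, 7, 8]
relation_matrix = as_square_matrix(flat_relation, 3)
```

The length check mirrors the caveat above: if a model uses a separate relation_dim, reshaping with the wrong dimension fails instead of silently producing a wrong matrix.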


cthoyt commented on May 24, 2024 1

New docs at https://pykeen.readthedocs.io/en/latest/tutorial/first_steps.html#using-learned-embeddings


shimst3r commented on May 24, 2024 1

Thanks @cthoyt, very much appreciated. 👍


cthoyt commented on May 24, 2024

@mali-git before we close this, let's add some documentation in the "getting started" tutorial


mali-git commented on May 24, 2024

Sounds good!


xin-zhao0 commented on May 24, 2024

Thank you. For models like RESCAL and some others, there is a relation-specific matrix M. Is that returned in the results as part of the model?


shimst3r commented on May 24, 2024

@mali-git to access the weight data, I have to call

from pykeen.pipeline import pipeline
result = pipeline(
    model='TransE',
    dataset='umls',
)
model = result.model
entity_embeddings = model.entity_embeddings._embeddings.weight.data
relation_embeddings = model.relation_embeddings._embeddings.weight.data

Has this been updated recently or did I overlook something? 😄


cthoyt commented on May 24, 2024

@shimst3r @xin-zhao0 I've started a PR where we translated some of this discussion into the documentation. Thanks for keeping us honest ;)

