
Comments (7)

mali-git avatar mali-git commented on June 4, 2024 1

Hi @PhaelIshall,

Are there instructions on running this on multiple GPUs so that I don't get this error?

Currently, we don't support multi-GPU training. However, we plan to integrate support in the next major release.

I tried to go through the relevant parts, and there is no mention of the hardware used for the experiments (sorry if I missed it!). There is a mention of the overall hours for all experiments and that each one took a maximum of 24 hours, so I guess something is wrong if it is exceeding that for me (mostly for RotatE and TuckER, which are relatively large)? I would appreciate it if you could provide the hardware information (again, sorry if it is indeed in the paper!)

For the WN18RR and FB15K-237 experiments, we used Tesla V100s, and for Kinships and YAGO3-10 experiments, we used RTX2080s.

Here you can find an overview of the training time: https://github.com/pykeen/benchmarking/blob/master/ablation/summary/paper/trellis_scatter_training_time.pdf
In addition, in each result-summary, we saved the training and evaluation time, too (e.g., https://github.com/pykeen/benchmarking_fixed_epochs/blob/master/ablation/results/rotate/wn18rr/random/adam/2020-04-25-19-04_217bcf38-2101-461b-9593-b133b4201e6a/0000_wn18rr_rotate/best_pipeline/2020-08-27-01-59-50/replicates/replicate-00000/results.json).
Some model configurations took more than 24 hours because model training had started before the 24-hour deadline was reached. In those cases, the model was trained to completion and the HPO terminated afterward (which in rare cases took 1-2 days of training).

from pykeen.

mberr avatar mberr commented on June 4, 2024 1

@PhaelIshall

Out of memory error screenshot here, triggered by running this code:
import torch
map_location = torch.device('cpu')
model = torch.load('trained_model.pkl', map_location=map_location)  # distmult wn18
predictions_df = model.score_all_triples()
And also by running any predict all_triples experiment on the server I mentioned.

Please note the comments and warnings in the documentation of score_all_triples.

score_all_triples is always an expensive operation since it will compute scores for all possible triples, i.e. num_entities**2 * num_relations many; for medium to large knowledge graphs this number will be prohibitively large, e.g. for FB15k-237 it is 49,863,620,925, i.e. approx 50 billion. When passing k=None (the default), the memory consumption will also be huge since the scores for all these triples need to be stored, e.g. for FB15k-237, the scores alone require 199.5 GiB. You should have seen a warning emitted from your call to score_all_triples, most likely right before you encounter the OOM issue.
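
For intuition, the arithmetic behind the quoted figures can be reproduced in a few lines. Note that the entity/relation counts below are an assumption chosen so that they reproduce the quoted triple count; they are not queried from pykeen here:

```python
# Back-of-envelope reproduction of the figures quoted above.
# ASSUMPTION: these counts are picked to match the quoted 49,863,620,925
# figure for FB15k-237, not read from pykeen.
num_entities = 14_505
num_relations = 237

# one score per (head, relation, tail) combination
total_triples = num_entities ** 2 * num_relations
print(total_triples)  # 49863620925, i.e. ~50 billion

# one float32 score (4 bytes) per triple
print(total_triples * 4 / 1e9)  # ~199.5 GB just for the scores
```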

I do not know your exact use-case, but often one does not need scores for all possible triples. pykeen includes various methods to efficiently (memory and performance) compute subsets of these scores which are likely of interest:

  • score_all_triples(k=NUM) computes scores for all triples, but keeps only the triples with the NUM largest scores (i.e., highest plausibility according to the KGE model). This is still computationally expensive since all scores are computed, but requires less memory, since low scores can be discarded along the way.
  • predict_scores_all_heads / predict_scores_all_relations / predict_scores_all_tails compute scores for all triples (*, r, t) / (h, *, t) / (h, r, *), i.e. all triples where either relation & tail / head & tail / head & relation are fixed, and only one "dimension" is varied. This is for instance used in the default link prediction evaluation protocol.
  • predict_scores predicts the scores for a batch of triples. This is the most flexible option, but less computationally efficient in case you want to score against all entities/relations.
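
The memory-saving idea behind the k=NUM variant in the first bullet can be illustrated with plain PyTorch: compute scores in batches and only ever keep the current top-k, so memory stays bounded by k plus one batch. This is a minimal sketch with a made-up scoring function, not pykeen's actual implementation:

```python
import torch

torch.manual_seed(0)

def score_batch(batch: torch.Tensor) -> torch.Tensor:
    """Made-up stand-in for a KGE model's scoring function (hypothetical)."""
    return batch.float().sum(dim=-1)

all_items = torch.arange(1000).unsqueeze(-1)  # pretend "triples"
k = 5
top_scores = torch.full((0,), float("-inf"))
top_items = all_items[:0]

for batch in all_items.split(100, dim=0):
    scores = score_batch(batch)
    # merge the batch with the running top-k, then truncate back to k entries,
    # so at most (k + batch_size) scores are held in memory at any time
    merged_scores = torch.cat([top_scores, scores])
    merged_items = torch.cat([top_items, batch])
    top_scores, idx = merged_scores.topk(k=min(k, merged_scores.numel()))
    top_items = merged_items[idx]

print(top_items.flatten().tolist())  # the 5 highest-scoring items: [999, 998, 997, 996, 995]
```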

EDIT: In case you want to do some on-the-fly processing while computing all scores, e.g. to keep the triples with minimum score, compute descriptive statistics over scores, etc., you can use the following part of score_all_triples as a starting point for your own method:

for r, e in itt.product(
    range(self.num_relations),
    range(0, self.num_entities, batch_size),
):
    # calculate batch scores
    hs = torch.arange(e, min(e + batch_size, self.num_entities), device=self.device)
    hr_batch = torch.stack([
        hs,
        hs.new_empty(1).fill_(value=r).repeat(hs.shape[0]),
    ], dim=-1)
    scores[r, e:e + batch_size, :] = self.predict_scores_all_tails(hr_batch=hr_batch).to(scores.device)
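
As a minimal illustration of such on-the-fly processing, the sketch below tracks a running minimum and mean over batched scores without ever materializing the full score tensor. Here predict_batch_scores is a made-up stand-in for the model call, not a pykeen API:

```python
import torch

torch.manual_seed(0)

def predict_batch_scores(batch: torch.Tensor) -> torch.Tensor:
    """Made-up stand-in for e.g. model.predict_scores_all_tails (hypothetical).
    Returns a (batch_size, num_entities) score matrix."""
    return torch.randn(batch.shape[0], 7)

candidates = torch.arange(100)
running_min = float("inf")
total, count = 0.0, 0

for batch in candidates.split(10):
    scores = predict_batch_scores(batch)
    # update descriptive statistics, then let the batch's scores be freed
    running_min = min(running_min, scores.min().item())
    total += scores.sum().item()
    count += scores.numel()

mean = total / count
```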


mberr avatar mberr commented on June 4, 2024 1

@mberr Thank you for your explanation. I narrowed the task down to picking the top 1 tail prediction for every head/relation. I want to be able not only to score the triples but to see the top prediction for each one and manually verify them. This is why I am loading each head/relation from the test file and calling predict_tails on them (please let me know if there is another way to do that on your available datasets).

Right now I am not exactly sure whether you want to compute the top tail entities for every possible combination of head and relation, or only those combinations which occur in a list of evaluation triples / pairs.

Assuming you only want to do this for the existing combinations from a set of evaluation triples, and in case you already have the ID-based triples, you can also use them to do batched predictions with predict_scores_all_tails. You can access the ID-based triples as the attribute mapped_triples of the TriplesFactory.

Your code could extend the following template:

k = 1
# get all the unique (head, relation) ID combinations from the mapped_triples
hr = factory.mapped_triples[:, :2].unique(dim=0)
# batched computation
for hr_batch in hr.split(batch_size, dim=0):
    # compute scores for a batch of (head-relation) pairs, and all tail entities
    # shape: (batch_size, num_entities)
    scores = model.predict_scores_all_tails(hr_batch=hr_batch.to(model.device))
    # get top-k entities
    top_scores, top_ids = scores.topk(k=k, dim=-1, largest=True)
    # ... do something with these scores ...
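
To see the tensor mechanics of this template in isolation, here is a self-contained toy version where a made-up scoring function stands in for model.predict_scores_all_tails (beyond the torch calls, none of the names below are pykeen APIs):

```python
import torch

def predict_scores_all_tails(hr_batch: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for the model call in the template (hypothetical):
    score(h, r, t) = h + r + t, over 4 tail entities."""
    tails = torch.arange(4)
    return hr_batch.sum(dim=-1, keepdim=True).float() + tails.float()

# ID-based triples as they would come from factory.mapped_triples
mapped_triples = torch.tensor([
    [0, 1, 2],
    [0, 1, 3],  # duplicate (head, relation) pair -> collapsed by unique()
    [2, 0, 1],
])
hr = mapped_triples[:, :2].unique(dim=0)  # [[0, 1], [2, 0]]

k, batch_size = 1, 2
for hr_batch in hr.split(batch_size, dim=0):
    # shape: (batch_size, num_entities)
    scores = predict_scores_all_tails(hr_batch=hr_batch)
    top_scores, top_ids = scores.topk(k=k, dim=-1, largest=True)
    print(top_ids.flatten().tolist())  # tail entity 3 scores highest for both pairs
```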


cthoyt avatar cthoyt commented on June 4, 2024

There isn't any support in PyKEEN for multiple GPUs now - in our last two papers we referenced a couple of other libraries that have that kind of functionality. We also did a massive benchmarking paper where we reported the training times (and hardware) for many model/dataset combinations. I'm on my phone right now and can't link, but there should be some links to the last two papers on pykeen.github.io


PhaelIshall avatar PhaelIshall commented on June 4, 2024

I tried to go through the relevant parts, and there is no mention of the hardware used for the experiments (sorry if I missed it!). There is a mention of the overall hours for all experiments and that each one took a maximum of 24 hours, so I guess something is wrong if it is exceeding that for me (mostly for RotatE and TuckER, which are relatively large)? I would appreciate it if you could provide the hardware information (again, sorry if it is indeed in the paper!)


PhaelIshall avatar PhaelIshall commented on June 4, 2024

@mberr Thank you for your explanation. I narrowed the task down to picking the top 1 tail prediction for every head/relation. I want to be able not only to score the triples but to see the top prediction for each one and manually verify them. This is why I am loading each head/relation from the test file and calling predict_tails on them (please let me know if there is another way to do that on your available datasets).


cthoyt avatar cthoyt commented on June 4, 2024

Multi-GPU support is now possible with the PyTorch Lightning plugin - you can find it at https://pykeen.readthedocs.io/en/stable/contrib/lightning.html

