Comments (7)
Hi @PhaelIshall,
Are there instructions on running this on multiple GPUs so that I don't get this error?
Currently, we don't support multi-GPU training. However, it is planned to integrate support in the next major release.
I tried to go through the relevant parts, and there is no mention of the hardware used for the experiments (sorry if I missed it!). There is a mention of the overall hours for all experiments, and that each one took a maximum of 24 hours, so I guess something is wrong if it is exceeding that for me (mostly for RotatE and TuckER, which are relatively large)? I would appreciate it if you could provide the hardware information (again, sorry if it is indeed in the paper!)
For the WN18RR and FB15K-237 experiments, we used Tesla V100s, and for Kinships and YAGO3-10 experiments, we used RTX2080s.
Here you can find an overview of the training time: https://github.com/pykeen/benchmarking/blob/master/ablation/summary/paper/trellis_scatter_training_time.pdf
In addition, in each result-summary, we saved the training and evaluation time, too (e.g., https://github.com/pykeen/benchmarking_fixed_epochs/blob/master/ablation/results/rotate/wn18rr/random/adam/2020-04-25-19-04_217bcf38-2101-461b-9593-b133b4201e6a/0000_wn18rr_rotate/best_pipeline/2020-08-27-01-59-50/replicates/replicate-00000/results.json).
Some model configurations took more than 24 hours because their training had started before the 24-hour deadline was reached. In those cases, the model was trained to completion and the HPO terminated afterward (which in rare cases took an additional 1-2 days of training).
Out-of-memory error (screenshot attached), triggered by running this code:
import torch
map_location = torch.device('cpu')
model = torch.load('trained_model.pkl', map_location=map_location)  # DistMult on WN18
predictions_df = model.score_all_triples()
And also by running any predict all_triples experiment on the server I mentioned.
Please notice the comments and warning in the documentation of score_all_triples. score_all_triples is always an expensive operation, since it computes scores for all possible triples, i.e., num_entities**2 * num_relations many; for medium to large knowledge graphs this number is prohibitively large, e.g., for FB15k-237 it is 49,863,620,925, i.e., approximately 50 billion. When passing k=None (the default), the memory consumption is also huge, since the scores for all these triples need to be stored; e.g., for FB15k-237, the scores alone require 199.5 GB. You should have seen a warning emitted from your call to score_all_triples, most likely right before you encountered the OOM issue.
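The quoted figures can be reproduced with a few lines of arithmetic. Note that the entity/relation counts below are back-derived from the quoted triple count, and the 4-byte float32 score size is an assumption:

```python
# Back-of-the-envelope estimate of the score_all_triples cost for FB15k-237.
# The entity count and the float32 (4-byte) score size are assumptions
# chosen to match the figures quoted above.
num_entities = 14_505
num_relations = 237

# every possible (head, relation, tail) combination
num_triples = num_entities ** 2 * num_relations
print(num_triples)  # 49863620925 (approx. 50 billion)

# storing one float32 score per triple
score_bytes = num_triples * 4
print(f"{score_bytes / 1e9:.1f} GB")  # approx. 199.5 GB
```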
I do not know your exact use case, but often one does not need scores for all possible triples. pykeen includes various methods to efficiently (in both memory and runtime) compute subsets of these scores which are likely of interest:
- score_all_triples(k=NUM) computes scores for all triples, but keeps only the NUM triples with the largest scores (i.e., the highest plausibility according to the KGE model). This is still computationally expensive, since all scores are computed, but it requires less memory, since low scores can be discarded in between.
- predict_scores_all_heads / predict_scores_all_relations / predict_scores_all_tails compute scores for all triples (*, r, t) / (h, *, t) / (h, r, *), i.e., all triples where either relation & tail / head & tail / head & relation are fixed, and only one "dimension" is varied. This is, for instance, used in the default link prediction evaluation protocol.
- predict_scores predicts the scores for a batch of triples. This is the most flexible option, but it is less computationally efficient in case you want to score against all entities/relations.
EDIT: In case you want to do some on-the-fly processing while computing all scores, e.g., to keep the triples with minimum score, or to compute descriptive statistics over the scores, you can check the following part of score_all_triples to write your own method:
pykeen/src/pykeen/models/base.py, lines 631 to 641 at commit fb2e19d
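The on-the-fly idea can be sketched without pykeen at all. Below is a minimal, dependency-free example that streams (score, triple) pairs and keeps only the k lowest-scoring triples with a bounded-size heap; the scoring function is a toy stand-in, not a KGE model:

```python
import heapq

def k_lowest_scores(scored_triples, k):
    """Keep the k (score, triple) pairs with the LOWEST scores while
    streaming, using a size-k heap of negated scores."""
    heap = []  # stores (-score, triple); heap[0] holds the largest kept score
    for score, triple in scored_triples:
        item = (-score, triple)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:  # -score > -max_kept  <=>  score < max_kept
            heapq.heapreplace(heap, item)
    return sorted((-neg, t) for neg, t in heap)

# toy stand-in for iterating over all (h, r, t) combinations with one relation
triples = [(h, 0, t) for h in range(100) for t in range(100)]
scored = ((float((h - t) ** 2), (h, r, t)) for h, r, t in triples)
lowest = k_lowest_scores(scored, k=3)  # three triples with score 0.0 (h == t)
```

Memory stays bounded by k regardless of how many triples are streamed, which is exactly why the in-between discarding mentioned above helps.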
@mberr Thank you for your explanation. I narrowed the task down to picking the top-1 tail prediction for every head/relation pair. I want to be able not only to score the triples, but also to see the top prediction for each one and manually verify them. This is why I am loading each head/relation from the test file and calling predict_tails on them (please let me know if there is another way to do that on your available datasets).
Right now I am not exactly sure whether you want to compute the top tail entities for every possible combination of head and relation, or only those combinations which occur in a list of evaluation triples / pairs.
Assuming you only want to do this for the existing combinations from a set of evaluation triples, and in case you already have the ID-based triples, you can also use them to do batched predictions with predict_scores_all_tails. You can access the ID-based triples via the mapped_triples attribute of the TriplesFactory.
Your code could extend the following template:

k = 1
# get all unique (head, relation) ID combinations from the mapped_triples
hr = factory.mapped_triples[:, :2].unique(dim=0)
# batched computation
for hr_batch in hr.split(batch_size, dim=0):
    # compute scores for a batch of (head, relation) pairs and all tail entities
    # shape: (batch_size, num_entities)
    scores = model.predict_scores_all_tails(hr_batch=hr_batch.to(model.device))
    # get the top-k entities
    top_scores, top_ids = scores.topk(k=k, dim=-1, largest=True)
    # ... do something with these scores ...
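For the manual-verification step, the same top-1-tail-per-pair logic can be illustrated without torch or pykeen. Everything below (the helper name and the toy scoring function) is illustrative, not pykeen API:

```python
def top1_tails(hr_pairs, num_entities, score_fn):
    """For each (head, relation) pair, return the tail entity with the
    highest score, together with that score."""
    result = {}
    for h, r in hr_pairs:
        scores = [score_fn(h, r, t) for t in range(num_entities)]
        best_t = max(range(num_entities), key=scores.__getitem__)
        result[(h, r)] = (best_t, scores[best_t])
    return result

# toy scoring function: prefers tails equal to (h + r) mod num_entities
num_entities = 10
toy_score = lambda h, r, t: -abs((h + r) % num_entities - t)
preds = top1_tails({(0, 1), (3, 4)}, num_entities, toy_score)
# preds[(0, 1)] == (1, 0) and preds[(3, 4)] == (7, 0)
```

The resulting (head, relation) -> (tail, score) mapping is then easy to dump to a file for manual inspection.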
There isn't any support in PyKEEN for multiple GPUs right now - in our last two papers we referenced a couple of other libraries that have that kind of functionality. We also did a massive benchmarking paper where we reported the training times (and hardware) for many model/dataset combinations. I'm on my phone right now and can't link, but there should be some links to the last two papers on pykeen.github.io
- The benchmarking paper "Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework" is summarized here: https://pykeen.github.io/2020/08/07/benchmarking.html (pdf)
- The software paper "PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings" that mentions other libraries with multiple GPU support is here: https://arxiv.org/abs/2007.14175 (pdf)
Multi-GPU support is now possible with the PyTorch Lightning plugin - you can find it at https://pykeen.readthedocs.io/en/stable/contrib/lightning.html