Comments (7)
+1. I have the same issue when I use the forward() and loss() functions separately in the trainer.
Hi @ferzcam (and @renzhonglu11),

this likely comes from DistMult using a regularizer by default, cf.

pykeen/src/pykeen/models/unimodal/distmult.py
Lines 85 to 86 in d1222b7

collect_regularization_term not only collects the regularization terms from the different places where they accumulate, but also releases their references. When it is not called, the term accumulates indefinitely and keeps holding references to buffers, so over time you will run out of memory.
To fix it, either
- configure the model without a regularizer, e.g., for DistMult pass regularizer=None, or
- if you want to make use of the regularizer, make sure to call collect_regularization_term on the model and include the result in your loss term.
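The collect-and-release behaviour can be sketched in plain Python. This is a toy stand-in with illustrative names, not PyKEEN's actual Regularizer class; in PyKEEN the pending terms are tensors attached to the computation graph, which is why retaining them leaks memory:

```python
# Toy sketch of the accumulate-and-release pattern behind
# collect_regularization_term(). Names are illustrative, not PyKEEN's API.

class ToyRegularizer:
    def __init__(self):
        self.pending = []  # terms accumulated during forward passes

    def update(self, term):
        # called from a representation's forward pass
        self.pending.append(term)

    def collect(self):
        # sum the accumulated terms and release the references
        total = sum(self.pending)
        self.pending.clear()
        return total


reg = ToyRegularizer()
for _ in range(3):  # three forward passes within one batch
    reg.update(0.5)

loss = 1.0 + reg.collect()  # fold the term into the loss
print(loss)                 # 2.5
print(len(reg.pending))     # 0 -> nothing carries over to the next batch
```

The important part is that collect() both sums and clears: folding the term into the loss and dropping the references happen in one step, so no batch can leave anything behind.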
@mberr your solution works. Thanks a lot!😁
You can see what the resulting structure looks like with

```python
from pykeen.datasets import get_dataset
from pykeen.models import DistMult

dataset = get_dataset(dataset="nations")
model = DistMult(triples_factory=dataset.training)
print(model)
```
```
DistMult(
  (loss): MarginRankingLoss(
    (margin_activation): ReLU()
  )
  (interaction): DistMultInteraction()
  (entity_representations): ModuleList(
    (0): Embedding(
      (_embeddings): Embedding(14, 50)
    )
  )
  (relation_representations): ModuleList(
    (0): Embedding(
      (regularizer): LpRegularizer()
      (_embeddings): Embedding(55, 50)
    )
  )
  (weight_regularizers): ModuleList()
)
```
Notice how the relation_representations have a regularizer attached. Compare this to
```python
from pykeen.datasets import get_dataset
from pykeen.models import DistMult

dataset = get_dataset(dataset="nations")
model = DistMult(triples_factory=dataset.training, regularizer=None)
print(model)
```
resulting in
```
DistMult(
  ...
  (relation_representations): ModuleList(
    (0): Embedding(
      (_embeddings): Embedding(55, 50)
    )
  )
  ...
)
```
The regularization term of the relation embedding is updated here

pykeen/src/pykeen/nn/representation.py
Lines 189 to 191 in d1222b7

i.e., in the Embedding's forward call.
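To see why this makes skipping the collection step a problem, here is a toy illustration (plain Python with hypothetical names, not PyKEEN's Embedding) of a forward call that keeps appending pending terms until someone collects them:

```python
# Toy illustration of the leak: each forward call appends a pending
# regularization term (in PyKEEN, a tensor holding graph references),
# and only an explicit collect step would release them.
# Names are illustrative, not PyKEEN's API.

class ToyEmbedding:
    def __init__(self):
        self.pending_terms = []

    def forward(self, indices):
        # the real Embedding.forward updates its regularizer here
        self.pending_terms.append(len(indices) * 0.1)
        return indices


emb = ToyEmbedding()
for batch in range(100):       # 100 batches, the terms are never collected
    emb.forward([1, 2, 3])

print(len(emb.pending_terms))  # 100 -> the backlog grows with every batch
```

Because the update happens inside forward, any training loop that calls forward() but never collects the terms will see exactly this unbounded growth.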
Nice, now it makes more sense. I also found out where PyKEEN calls collect_regularization_term(). Thanks for your explanation.
Thanks a lot. But I am still not quite sure whether this is really the reason. I took a look at PyKEEN's source code, and it seems PyKEEN calculates the loss in a single function in the trainer (

pykeen/src/pykeen/training/training_loop.py
Line 643 in d1222b7

). What I did was first call a model's forward() (e.g., DistMult's) to compute the predictions, and then call loss() to compute the loss value (similar to what @ferzcam did). Then the memory grows per epoch during training, on both GPU and CPU.
However, when I put the forward and loss calculation together in one function, just like PyKEEN does, and call that function instead, the memory no longer increases. I am confused about this. 🧐
In my KGE model:

```python
class PykeenKGE:
    def training_step(self, batch):
        x_batch, y_batch = batch
        yhat_batch = self.forward(x_batch)
        loss_batch = self.loss(yhat_batch, y_batch)
        # collecting the regularization term here also releases its references
        return loss_batch + self.model.collect_regularization_term()
```

In the trainer:

```python
batch_loss = self.model.training_step(batch)
```
Hi @mberr. Thanks for the explanation. I was able to make it work now!