Comments (4)
The differences between the two configs that were posted above are:
- dim: 100 -> 400
- loss_fn: ranking -> softmax
- max_norm: 1 -> none
- num_batch_negs: 100 -> 50
- num_uniform_negs: 0 -> 1000
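Written out as Python overrides on a shared base config (just a sketch; key names match the configs posted later in this thread, and "none" becomes None):

earlier_overrides = dict(
    dimension=100,
    loss_fn='ranking',
    max_norm=1,
    num_batch_negs=100,
    num_uniform_negs=0,
)
latest_overrides = dict(
    dimension=400,
    loss_fn='softmax',
    max_norm=None,
    num_batch_negs=50,
    num_uniform_negs=1000,
)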
@ledw ran some ablation experiments, starting from the latest config and resetting one option at a time back to its value in the earlier config, to see which changes have the biggest impact on the MRR:
- dim: 400 -> 100
  pos_rank: 43.8787, mrr: 0.64014, r1: 0.517911, r10: 0.836764, r50: 0.920003, auc: 0.987667
- loss: softmax -> ranking
  pos_rank: 98.5577, mrr: 0.309776, r1: 0.179623, r10: 0.562789, r50: 0.796812, auc: 0.964517
- loss: softmax -> ranking, with cosine distance
  pos_rank: 388.779, mrr: 0.195723, r1: 0.133602, r10: 0.313572, r50: 0.478204, auc: 0.976249
- max_norm: none -> 1
  pos_rank: 42.7176, mrr: 0.661254, r1: 0.541069, r10: 0.849622, r50: 0.925336, auc: 0.982936
- negatives: 50 -> 100 batch, 1000 -> 0 uniform
  pos_rank: 44.1177, mrr: 0.624536, r1: 0.496927, r10: 0.831161, r50: 0.918937, auc: 0.984603
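The procedure itself can be sketched in a few lines, reusing earlier_overrides from above. Here train_and_eval is a hypothetical helper standing in for a full PBG train-plus-eval run; it is not part of the torchbiggraph API:

def run_ablations(latest_config, earlier_overrides, train_and_eval):
    results = {}
    for key, earlier_value in earlier_overrides.items():
        config = dict(latest_config)  # copy the latest config...
        config[key] = earlier_value   # ...then reset a single option
        results[key] = train_and_eval(config)  # returns stats such as MRR
    return results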
Sorry for the late reply.
I don't remember who ran that experiment or with what configuration. I do have a config for FB15k with TransE lying around, which achieves the following stats:
pos_rank: 72.3282, mrr: 0.475054, r1: 0.318422, r10: 0.750859, r50: 0.885883, auc: 0.973625, count: 59071
These are better than the ones you were getting, but still worse than the ones in the paper. I'm posting that config below, and I'll keep looking for the config we ran those experiments with.
entity_base = "data/FB15k"
def get_torchbiggraph_config():
config = dict(
entity_path=entity_base,
num_epochs=200,
entities={
'all': {'num_partitions': 1},
},
relations=[{
'name': 'all_edges',
'lhs': 'all',
'rhs': 'all',
'operator': 'translation',
}],
dynamic_relations=True,
edge_paths=[],
checkpoint_path='model/fb15k',
dimension=100,
global_emb=False,
max_norm=1,
comparator='dot',
loss_fn='ranking',
margin=0.2,
lr=0.1,
num_uniform_negs=0,
num_batch_negs=100,
eval_fraction=0, # to reproduce results, we need to use all training data
)
return config
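If it helps, a config module like this can be passed directly to the standard command-line entry points, something like torchbiggraph_train fb15k_config.py -p edge_paths=<path to partitioned train edges>, followed by torchbiggraph_eval on the test edges. Note that edge_paths is left empty above, so the paths have to be supplied as overrides or filled in; exact flag names may vary across versions.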
We got numbers that are better than those we reported in the paper for TransE:
Stats: pos_rank: 58.1476, mrr: 0.668699, r1: 0.559327, r10: 0.843527, r50: 0.916956, auc: 0.974048, count: 59071
Here's the config:
entity_base = "data/FB15k"  # not in the original snippet; assumed to match the config above

def get_torchbiggraph_config():
    config = dict(
        entity_path=entity_base,
        num_epochs=50,
        entities={
            'all': {'num_partitions': 1},
        },
        relations=[{
            'name': 'all_edges',
            'lhs': 'all',
            'rhs': 'all',
            'operator': 'translation',
        }],
        dynamic_relations=True,
        edge_paths=[],
        checkpoint_path='model/fb15k',
        dimension=400,
        global_emb=False,
        comparator='dot',
        loss_fn='softmax',
        lr=0.1,
        num_uniform_negs=1000,
        eval_fraction=0,  # to reproduce results, we need to use all training data
    )
    return config
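Compared to the config above: dimension goes from 100 to 400, the loss switches from ranking (with a margin) to softmax, max_norm is dropped, and 1000 uniform negatives are used. num_batch_negs isn't set here, so it presumably falls back to the library default of 50, which matches the diff at the top of the thread.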
I guess the main takeaway here is that it's the loss function that really brings the big gains; second to that, but quite far behind, is the negative sampling. The two might correlate with each other, of course. I'm closing this as I think we've answered the initial request and done some additional research. Feel free to reopen if there's more to talk about.
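For anyone comparing the two objectives, here is a rough sketch of their shapes on raw scores (illustrative only, not PBG's exact implementation; pos holds the scores of true edges, negs the scores of sampled negatives):

import torch
import torch.nn.functional as F

def ranking_loss(pos, negs, margin=0.2):
    # Margin ranking: penalize any negative that scores within `margin` of the positive.
    # pos: (batch,), negs: (batch, n_negs)
    return F.relu(margin - pos.unsqueeze(1) + negs).sum()

def softmax_loss(pos, negs):
    # Cross-entropy of the positive against itself plus all sampled negatives.
    logits = torch.cat([pos.unsqueeze(1), negs], dim=1)
    target = torch.zeros(logits.shape[0], dtype=torch.long)  # positive sits at index 0
    return F.cross_entropy(logits, target, reduction='sum')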