Comments (11)
In training, graph2vec uses only the center, not the nei_list, so the set of words will be the same. However, I wonder whether the order of extracted_features, which differs between the two versions, will hurt the results.
from graph2vec.
Hi ZiruiYan,
Not sure if this is related, but when I run the code on the provided sample, the results are not good.
To validate the result, I have added the following code:
```python
import collections

def test(model, document_collections):
    ranks = []
    second_ranks = []
    for doc_id in range(len(document_collections)):
        inferred_vector = model.infer_vector(document_collections[doc_id].words)
        sims = model.docvecs.most_similar([inferred_vector], topn=len(model.docvecs))
        # Assumes the document at position doc_id carries the tag "g_<doc_id>".
        rank = [docid for docid, sim in sims].index("g_" + str(doc_id))
        ranks.append(rank)
        second_ranks.append(sims[1])
    # Histogram of (rank, count); rank 0 means the graph is most similar to itself.
    print(sorted(collections.Counter(ranks).items()))
```
Each inferred vector should be most similar to its own trained vector, so a result of (0, 50) would be ideal. What I am getting looks completely random:
[(0, 1), (1, 2), (2, 3), (4, 1), (5, 2), (8, 2), (10, 2), (11, 2), (14, 1), (16, 3), (19, 1), (21, 1), (22, 2), (23, 1), (24, 1), (25, 2), (27, 1), (28, 2), (29, 1), (32, 2), (34, 1), (35, 3), (36, 1), (37, 1), (39, 1), (40, 3), (41, 1), (43, 1), (44, 1), (45, 2), (50, 3)]
Can you provide a fix for the code so that it works the same as the original?

The provided samples are synthetic data. For inference you have to use a large learning rate.

Could you point me to the dataset and learning rate to use so that inference gives reasonable results?

Synthetic data means ER (Erdős–Rényi) graphs. A learning rate above 0.05 helps.

So if I try the NCI1 set from the original paper with a learning rate of 0.05, inference should be OK, right?
Do you have any non-synthetic datasets in the JSON format that the software expects?

Thanks!!! I really appreciate this. I would also like to see inference working. I do not know whether the authors of the original paper checked the inference.

Hi Benedek, good news: I found a bug in my test procedure.
Now I am getting almost perfect inference results on the synthetic set: [(0, 50), (1, 1)]
using the parameters --learning-rate 0.05 --down-sampling 0.001 --epochs 500
The correct test code:
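```python
import collections

def test(model, document_collections):
    ranks = []
    second_ranks = []
    for doc_id in range(len(document_collections)):
        inferred_vector = model.infer_vector(document_collections[doc_id].words)
        # On gensim >= 4.0, model.dv replaces model.docvecs.
        sims = model.docvecs.most_similar(positive=[inferred_vector], topn=len(model.docvecs))
        # Look up the document's actual tag instead of assuming "g_<doc_id>".
        rank = [docid for docid, sim in sims].index(document_collections[doc_id].tags[0])
        ranks.append(rank)
        second_ranks.append(sims[1])
    # Histogram of (rank, count); rank 0 means the graph is most similar to itself.
    print(sorted(collections.Counter(ranks).items()))
```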
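For readers who want to see what the rank histogram measures without installing gensim, here is a minimal, dependency-free sketch of the same self-similarity check. The function names `cosine` and `self_rank_histogram` and the toy vectors are hypothetical and only for illustration; they are not part of graph2vec or gensim. The tags are deliberately not in positional order, which is the situation where looking up a graph by its actual tag matters.

```python
import collections
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def self_rank_histogram(trained, inferred):
    """Return sorted (rank, count) pairs.

    trained:  dict mapping tag -> trained vector
    inferred: dict mapping tag -> vector inferred for the same graph
    Rank 0 means the graph's own trained vector is the nearest neighbour.
    """
    ranks = []
    for tag, vector in inferred.items():
        # Order all trained tags from most to least similar to the inferred vector.
        ordered = sorted(trained, key=lambda t: cosine(vector, trained[t]), reverse=True)
        ranks.append(ordered.index(tag))
    return sorted(collections.Counter(ranks).items())


# Hypothetical toy data: three graphs whose tags do not match their position.
trained = {"g_7": [1.0, 0.0], "g_2": [0.0, 1.0], "g_5": [-1.0, 0.2]}
inferred = {"g_7": [0.9, 0.1], "g_2": [0.1, 0.8], "g_5": [-0.7, 0.1]}
histogram = self_rank_histogram(trained, inferred)
print(histogram)
```

In this toy case every inferred vector sits closest to its own trained vector, so all graphs land at rank 0.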
On an unrelated note, suppose my node data is multidimensional, i.e. each node has more than one label.
Any idea how to use graph2vec in that case? Of course, I can run the feature extraction separately for each label and merge the results into a single TaggedDocument before calling Doc2Vec. Are there any other options?
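The merge-per-label idea described above could look roughly like the sketch below. It assumes the features for each label have already been extracted separately into lists of strings per graph; the function name `merge_label_features`, the label names, and the feature strings are all hypothetical. Each feature is prefixed with its label name so that identical strings coming from different labels stay distinct words.

```python
def merge_label_features(per_label_features):
    """Merge per-label feature lists into one word list per graph.

    per_label_features: dict mapping label name -> dict mapping graph tag -> list of feature strings
    Returns a dict mapping graph tag -> combined list of prefixed feature strings.
    """
    merged = {}
    for label_name, features_by_graph in per_label_features.items():
        for tag, features in features_by_graph.items():
            # Prefix with the label name to avoid collisions between labels.
            merged.setdefault(tag, []).extend(f"{label_name}:{feature}" for feature in features)
    return merged


# Hypothetical features extracted separately for two node labels.
per_label = {
    "atom": {"g_0": ["a1", "a2"], "g_1": ["a3"]},
    "charge": {"g_0": ["a1"], "g_1": ["c9", "a3"]},
}
merged = merge_label_features(per_label)
print(merged)
```

Each merged list could then be wrapped as `TaggedDocument(words=merged[tag], tags=[tag])` before training Doc2Vec. Whether this gives better embeddings than a single combined node label is not something this thread establishes.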