Comments (10)
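In HetGNN the downstream task should be separate from training: once training has converged, you take the node embeddings and run the evaluation on them, so the eval step does not run the model over the full graph a second time. Given that, I don't quite understand your question yet. Could you elaborate?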
from openhgnn.
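I see what you mean. However, on a large graph, collecting the embeddings of the whole graph batch by batch is slow, because every batch has to run random walks. Full-graph training skips the walk step (but needs a huge amount of memory). Are there any ideas for optimizing this, so that the full-graph embeddings can be obtained faster on large graphs?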
from openhgnn.
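For example, I want the embeddings of all nodes for a downstream task. With mini-batch training, the walk paths have to be built on the fly before each batch goes through the model, which takes a long time. Full-graph training does no walking at all, but it needs a lot of memory. So my question is whether there is a way to build the full-graph embeddings in batches by going straight through the model, without doing the walks.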
from openhgnn.
Specifically, it is this line.
from openhgnn.
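My understanding is this: training can currently run in mini-batch mode, while eval only supports full-batch mode, and full-batch does not work for large graphs. Is that right?
1. HetGNN's aggregation scheme requires Random Walk with Restart (RWR). het_graph is the graph formed by the walks done in advance, so at eval time the input to the model is the walked graph.
2. To solve eval on large graphs, you probably still need to do what training does: run mini-batches, obtain the node embeddings batch by batch, and concatenate them at the end.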
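Point 2 can be sketched as follows. This is a minimal illustration rather than OpenHGNN's actual eval code: `embed_in_batches` and `fake_embed_batch` are hypothetical names, and `embed_batch` stands in for "sample the walked neighbourhood of the batch and run the trained model on it".

```python
from typing import Callable, Dict, List, Sequence


def embed_in_batches(
    node_ids: Sequence[int],
    embed_batch: Callable[[Sequence[int]], List[List[float]]],
    batch_size: int,
) -> Dict[int, List[float]]:
    """Run a per-batch embedding function over all nodes and stitch the results."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    embeddings: Dict[int, List[float]] = {}
    for start in range(0, len(node_ids), batch_size):
        batch = node_ids[start:start + batch_size]
        # Hypothetical stand-in for the model forward pass on this batch.
        batch_emb = embed_batch(batch)
        if len(batch_emb) != len(batch):
            raise ValueError("embed_batch must return one vector per node")
        # Key by node id so the final result does not depend on batch order.
        for nid, vec in zip(batch, batch_emb):
            embeddings[nid] = vec
    return embeddings


def fake_embed_batch(batch: Sequence[int]) -> List[List[float]]:
    # Toy replacement for the trained model: a 2-dimensional vector per node.
    return [[float(n), float(n) * 2.0] for n in batch]


all_emb = embed_in_batches(list(range(10)), fake_embed_batch, batch_size=4)
```

Because each batch is independent, the same loop can be split across several workers or GPUs by giving each one a disjoint slice of `node_ids` and merging the resulting dictionaries.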
from openhgnn.
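Thanks, that is what I am doing now, but the efficiency is worrying. So it seems that as soon as the graph is split into batches, the walks are unavoidable?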
from openhgnn.
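During training, each epoch only samples a subset of the steps for the train pass, whereas eval has to go through every single step, which is painfully slow.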
from openhgnn.
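The current setup is 8-GPU training plus single-GPU eval. The one speed-up I can think of is to snapshot the mini-batch walk results ahead of time for use in eval.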
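A minimal sketch of that snapshot idea, assuming the walk results for eval can be fixed once and reused. `load_or_build_walks`, `toy_walk` and the file name are hypothetical and not part of OpenHGNN:

```python
import os
import pickle
import tempfile
from typing import Callable, Dict, List, Sequence


def load_or_build_walks(
    path: str,
    node_ids: Sequence[int],
    walk_fn: Callable[[int], List[int]],
) -> Dict[int, List[int]]:
    """Return walk results per node, computing them only if no snapshot exists."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    # walk_fn is a hypothetical stand-in for the per-node random walk sampler.
    walks = {nid: walk_fn(nid) for nid in node_ids}
    # Write to a temporary file first so an interrupted run leaves no partial snapshot.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(walks, fh)
    os.replace(tmp_path, path)
    return walks


calls: List[int] = []


def toy_walk(nid: int) -> List[int]:
    # Records each invocation so the cache hit on the second call is visible.
    calls.append(nid)
    return [nid, nid + 1, nid + 2]


snapshot_path = os.path.join(tempfile.mkdtemp(), "walks.pkl")
first = load_or_build_walks(snapshot_path, [0, 1, 2], toy_walk)
second = load_or_build_walks(snapshot_path, [0, 1, 2], toy_walk)
```

The second call reads the snapshot instead of walking again. The snapshot is only valid for the graph and node set it was built from, so it would need to be rebuilt whenever either changes.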
from openhgnn.
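"Thanks, that is what I am doing now, but the efficiency is worrying. So it seems that as soon as the graph is split into batches, the walks are unavoidable?"
The walks are needed even if the graph is not split. That is dictated by the HetGNN model itself.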
The key idea of most graph neural networks (GNNs) is to aggregate feature information from a node’s direct (first-order) neighbors, such as GraphSAGE [7] or GAT [31]. However, directly applying these approaches to heterogeneous graphs may raise several issues:
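HetGNN argues that direct aggregation has limitations, so it proposes RWR as the way to decide which neighbors to aggregate. Personally I think this part is decoupled from the rest of the model, so you could skip the walks here and use the original graph directly. If you are willing to trade some model performance for better efficiency, this is an approach worth trying.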
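To make the trade-off concrete, here is a toy comparison of the two neighbor-selection strategies. It is a sketch under assumptions: the function names, the restart probability, the step count and the tiny author/paper/venue graph are all illustrative and are not taken from OpenHGNN.

```python
import random
from collections import Counter
from typing import Dict, List


def rwr_top_k(adj: Dict[int, List[int]], node_type: Dict[int, str], start: int,
              k: int, restart_prob: float = 0.5, num_steps: int = 1000,
              seed: int = 0) -> Dict[str, List[int]]:
    """Random walk with restart from `start`; keep the k most visited nodes per type."""
    rng = random.Random(seed)
    visits: Counter = Counter()
    current = start
    for _ in range(num_steps):
        if rng.random() < restart_prob or not adj[current]:
            current = start  # restart (or dead end): jump back to the start node
        else:
            current = rng.choice(adj[current])
            if current != start:
                visits[current] += 1
    by_type: Dict[str, List[int]] = {}
    for nid, _count in visits.most_common():
        bucket = by_type.setdefault(node_type[nid], [])
        if len(bucket) < k:
            bucket.append(nid)
    return by_type


def direct_neighbors(adj: Dict[int, List[int]], node_type: Dict[int, str],
                     start: int) -> Dict[str, List[int]]:
    """Cheaper alternative: group only the first-order neighbors by type."""
    by_type: Dict[str, List[int]] = {}
    for nid in adj[start]:
        by_type.setdefault(node_type[nid], []).append(nid)
    return by_type


# Toy graph: author 0 wrote papers 1 and 2, both published at venue 3.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
node_type = {0: "author", 1: "paper", 2: "paper", 3: "venue"}

walked = rwr_top_k(adj, node_type, start=0, k=2)
direct = direct_neighbors(adj, node_type, start=0)
```

On this graph the walk reaches the venue two hops away, while the direct version only sees the two papers. That is the kind of information the cheaper option gives up.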
from openhgnn.
OK, thx. For now I will try building the embeddings with 8 GPUs in parallel.
from openhgnn.