Comments (19)

hellomuse commented on July 18, 2024

The embeddings are generated from the yelp_triple dataset, so they use information from every edge. Won't that affect the link prediction results?

from hegan.

librahu commented on July 18, 2024

When the embeddings are generated, 20% of the edges of the original graph are removed.

from hegan.

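hellomuse commented on July 18, 2024

Thank you for your reply!

I could not find where edges are removed when generating the embeddings. Do I need to make that change myself?

If it is convenient, could you upload the two split files, yelp_ub.test_0.8_new and yelp_ub.train_0.8_lr?

Sorry for asking so many questions. I look forward to your reply.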

from hegan.

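librahu commented on July 18, 2024

A random split is enough. Assign each edge a random number between 0 and 1: if the number is less than 0.8, keep the edge; otherwise remove it and put it in the test set.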
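
A minimal sketch of the per-edge random split described above. The function name `split_edges`, the fixed seed, and the toy (user, business) edge list are assumptions for illustration and do not come from the HeGAN repository.

```python
import random


def split_edges(edges, keep_prob=0.8, seed=0):
    """Randomly split edges into train and test lists, one draw per edge."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for edge in edges:
        # Draw a uniform number in [0, 1): below keep_prob the edge stays
        # in the training graph, otherwise it goes to the test set.
        if rng.random() < keep_prob:
            train.append(edge)
        else:
            test.append(edge)
    return train, test


# Toy (user, business) edges, purely illustrative.
edges = [(u, b) for u in range(100) for b in range(10)]
train_edges, test_edges = split_edges(edges)
print(len(train_edges), len(test_edges))
```

Because each edge is drawn independently, the split is only approximately 8:2.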

from hegan.

 avatar commented on July 18, 2024

Is HeGAN trained on the original HIN (including both train and test edges) while the downstream task (e.g. link prediction) is conducted on the split dataset (test set)? How did you construct the file "*_triple.dat"?

from hegan.

librahu commented on July 18, 2024

@isHaoyiFan
(1) Following previous work, for link prediction and recommendation we train HeGAN on the training set (the HINs with 20% of the edges removed). For node clustering and node classification, HeGAN is trained on the original HINs.
(2) To construct "_triple.dat", we treat the HIN as a set of triples. For example, for an edge "a1 - r1 - p1" in the original HIN, we add two triples to "_triple.dat": (a1, p1, r1) and (p1, a1, r1'), where r1' is the inverse of r1.
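
The triple construction in (2) could be sketched as follows. The function `edges_to_triples` and the `inverse_of` mapping are hypothetical names used only for illustration. The actual on-disk layout of "_triple.dat" (ID encoding, separators) is not specified here, so none is assumed.

```python
def edges_to_triples(edges, inverse_of):
    """Expand each (source, relation, target) edge into two triples.

    Each output triple is ordered (head, tail, relation), following the
    example (a1, p1, r1) and (p1, a1, r1') given in the comment above.
    """
    triples = []
    for source, relation, target in edges:
        # Forward direction with the original relation.
        triples.append((source, target, relation))
        # Reverse direction with the inverse relation.
        triples.append((target, source, inverse_of[relation]))
    return triples


# The single edge "a1 - r1 - p1" from the example.
edges = [("a1", "r1", "p1")]
inverse_of = {"r1": "r1'"}  # hypothetical relation-to-inverse mapping
triples = edges_to_triples(edges, inverse_of)
print(triples)
```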

from hegan.

 avatar commented on July 18, 2024

@librahu Yea, I see! Thanks.

from hegan.

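wangyouze commented on July 18, 2024

Hello, when I run he_gan.py, the code fails with an error saying the file ../data/dblp_lp/dblp_ap.train_0.8_lr is missing. However, the data folder contains neither dblp nor the corresponding files. How can I obtain these files? Sorry to bother you.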

from hegan.

librahu commented on July 18, 2024

@wangyouze All the original files are available here: https://github.com/librahu/HIN-Datasets-for-Recommendation-and-Network-Embedding

from hegan.

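960924 commented on July 18, 2024

Hello, I split yelp_triple.dat 8:2 into a training set yelp_ub.train_0.8_lr and a test set yelp_ub.test_0.8_new, but when I run link prediction, both F1 and accuracy are 1. Is my split wrong, or does the embedding generation need to be modified?
I look forward to your reply. Thank you.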

from hegan.

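librahu commented on July 18, 2024

@960924 The link prediction steps are as follows:
(1) Split yelp_triple.dat 8:2 into train and test.
(2) Train the embeddings on the network formed by the train set.
(3) Sample one negative example for each edge in train, and train a logistic regression model.
(4-1) Logistic-regression-based evaluation: sample one negative example for each edge in test, predict with the logistic regression model trained in (3), and evaluate.
(4-2) Inner-product-based evaluation: sample one negative example for each edge in test, and evaluate using the inner product for each edge in test.

I don't know whether you ran the experiment following these steps.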
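
A rough sketch of negative sampling and the inner-product evaluation in (4-2), written with the standard library only. The names `sample_negatives` and `inner_product_accuracy`, the sigmoid with a 0.5 threshold, and the toy embeddings are all assumptions for illustration. The repository's own evaluation code may differ.

```python
import math
import random


def sample_negatives(pos_edges, all_edges, items, seed=0):
    """Pair every positive (u, b) edge with one sampled negative (u, b')."""
    rng = random.Random(seed)
    known = set(all_edges)  # train + test edges, so negatives are true non-edges
    labeled = []
    for u, b in pos_edges:
        labeled.append((u, b, 1))
        # Resample until (u, b') is not a known edge. This assumes that no
        # user is connected to every item, otherwise the loop never ends.
        while True:
            neg = rng.choice(items)
            if (u, neg) not in known:
                labeled.append((u, neg, 0))
                break
    return labeled


def inner_product_accuracy(labeled, user_emb, item_emb):
    """Score each pair by sigmoid(inner product) and threshold at 0.5."""
    correct = 0
    for u, b, label in labeled:
        score = sum(x * y for x, y in zip(user_emb[u], item_emb[b]))
        prob = 1.0 / (1.0 + math.exp(-score))
        correct += int((prob >= 0.5) == bool(label))
    return correct / len(labeled)


# Toy data: two users, two items, hand-picked embeddings.
user_emb = {0: [1.0, 0.0], 1: [0.0, 1.0]}
item_emb = {0: [2.0, -2.0], 1: [-2.0, 2.0]}
all_edges = [(0, 0), (1, 1)]
labeled = sample_negatives(all_edges, all_edges, items=[0, 1])
accuracy = inner_product_accuracy(labeled, user_emb, item_emb)
print(labeled, accuracy)
```

The same `sample_negatives` helper could produce the labeled train pairs for step (3). Checking negatives against the union of train and test edges keeps a held-out positive from being labeled as a negative.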

from hegan.

960924 commented on July 18, 2024

@librahu I ran the experiment with your code, but test_y and pred_label turn out to be exactly the same, which is why the accuracy is 1. I can't tell where the problem is.

from hegan.

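960924 commented on July 18, 2024

@librahu
    for u, b, label in self.train_link_label:
        train_x.append(embedding_list[u] + embedding_list[b])
        train_y.append(float(label))
For (3), I only see the operation above applied to each edge in train, and I'm not sure whether that is what you mean by sampling a negative example. Apart from this step, everything else matches the steps in your reply. But the pred_label produced by logistic regression is always identical to test_y, so F1 and accuracy are always 1. I look forward to your reply. Thank you!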

from hegan.

Mengjie-Guo commented on July 18, 2024

In the link prediction task, are the pretrained embeddings trained on the HINs without 20% of the edges, or on the whole HINs?

from hegan.

librahu commented on July 18, 2024

@Mengjie-Guo without 20% edges

from hegan.

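Mengjie-Guo commented on July 18, 2024

The pretrained vectors used for link prediction are obtained with metapath2vec. When producing them, did you take the vectors with the best test performance, or vectors from a less well trained (underfitted) metapath2vec run, to avoid overfitting when they are trained again with HeGAN?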

from hegan.

librahu commented on July 18, 2024

@Mengjie-Guo
Just use the default parameters to get the pretrained vectors, because these vectors are also trained when HeGAN is trained.

from hegan.

960924 commented on July 18, 2024

@Mengjie-Guo Hello, are you working on link prediction? Could I discuss some questions with you? Thanks.

from hegan.

xinchen1412 commented on July 18, 2024

In the classification and clustering experiments, are the pretrained embeddings trained on the whole HINs? That is, are the embeddings used for classification and clustering the same? Looking forward to your reply.

from hegan.
