
OpenHGNN


OpenI Community (Chinese) | OpenHGNN [CIKM2022] | Space4HGNN [SIGIR2022] | Benchmark&Leaderboard | Slack Channel

This is an open-source toolkit for Heterogeneous Graph Neural Networks, based on DGL (Deep Graph Library) and PyTorch. It integrates SOTA models for heterogeneous graphs.

News

2023-07-17 release v0.5

We release the latest version v0.5.0

  • New models and datasets.
  • 4 New tasks: pretrain, recommendation, graph attacks and defenses, abnorm_event detection.
  • TensorBoard visualization.
  • Maintenance and test module.
2023-02-24 OpenI Excellent Incubation Award

OpenHGNN won the Excellent Incubation Program Award of the OpenI Community! For more details: https://mp.weixin.qq.com/s/PpbwEdP0-8wG9dsvRvRDaA

2023-02-21 First Prize of CIE

The algorithm library supports the project "Intelligent Analysis Technology and Scale Application of Large-Scale Complex Heterogeneous Graph Data", led by BUPT with participation from ANT GROUP, China Mobile, Haizhi Technology, and others. This project won the first prize of the 2022 Chinese Institute of Electronics "Science and Technology Progress Award".

2023-01-13 release v0.4

We release the latest version v0.4.

  • New models
  • Provide pipelines for applications
  • More models supporting mini-batch training
  • Benchmark for million-scale graphs
2022-08-02 paper accepted
Our paper [ OpenHGNN: An Open Source Toolkit for Heterogeneous Graph Neural Network ](https://dl.acm.org/doi/abs/10.1145/3511808.3557664) was accepted at the CIKM 2022 short paper track.
2022-06-27 release v0.3

We release the latest version v0.3.

  • New models
  • API Usage
  • Simple customization of user-defined datasets and models
  • Visualization tools of heterogeneous graphs
2022-02-28 release v0.2

We release the latest version v0.2.

2022-01-07 Joined the OpenI Community
OpenI community users can enjoy the following features:
  • Brand-new Chinese documentation
  • Free computing resources — Cloud Brain usage tutorial
  • The latest OpenHGNN features
    • New models: [KDD2017] Metapath2vec, [TKDE2018] HERec, [KDD2021] HeCo, [KDD2021] SimpleHGN, [TKDE2021] HPN, [ICDM2021] HDE, fastGTN
    • New logging support
    • New Meituan Waimai dataset

Key Features

  • Easy-to-Use: OpenHGNN provides easy-to-use interfaces for running experiments with the given models and datasets. We also integrate optuna for hyperparameter optimization.
  • Extensibility: Users can define customized tasks/models/datasets to apply new models to new scenarios.
  • Efficiency: The backend dgl provides efficient APIs.

Get Started

Requirements and Installation

  • Python >= 3.6

  • PyTorch >= 1.9.0

  • DGL >= 0.8.0

  • CPU or NVIDIA GPU, Linux, Python3

1. Python environment (Optional): We recommend using Conda package manager

conda create -n openhgnn python=3.6
source activate openhgnn

2. Install PyTorch: Follow their tutorial to run the proper command according to your OS and CUDA version. For example:

pip install torch torchvision torchaudio

3. Install DGL: Follow their tutorial to run the proper command according to your OS and CUDA version. For example:

pip install dgl -f https://data.dgl.ai/wheels/repo.html

4. Install openhgnn:

  • install from pypi
pip install openhgnn
  • install from source
git clone https://github.com/BUPT-GAMMA/OpenHGNN
# If you encounter a network error, try git clone from OpenI as follows.
# git clone https://git.openi.org.cn/GAMMALab/OpenHGNN.git
cd OpenHGNN
pip install .
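
To verify the installation, a quick import check (a sanity check, not an official step):

python -c "import openhgnn"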

5. Install gdbi (Optional):

  • install gdbi from git
pip install git+https://github.com/xy-Ji/gdbi.git
  • install graph database from pypi
pip install neo4j==5.16.0
pip install nebula3-python==3.4.0

Running an existing baseline model on an existing benchmark dataset

python main.py -m model_name -d dataset_name -t task_name -g 0 --use_best_config --load_from_pretrained

usage: main.py [-h] [--model MODEL] [--task TASK] [--dataset DATASET] [--gpu GPU] [--use_best_config] [--load_from_pretrained] [--use_database]

optional arguments:

-h, --help show this help message and exit

--model -m name of models

--task -t name of task

--dataset -d name of datasets

--gpu -g controls which gpu you will use. If you do not have gpu, set -g -1.

--use_best_config use_best_config means you can use the best config for the dataset with the model. If you want to set different hyper-parameters, modify openhgnn/config.ini manually. The best_config will override the parameters in config.ini.

--load_from_pretrained will load the model from a default checkpoint.

--use_database get the dataset from a database

e.g.:

python main.py -m GTN -d imdb4GTN -t node_classification -g 0 --use_best_config

Note: If you are interested in a specific model, refer to the models list below.

Refer to the docs for more basic and in-depth usage.
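
The same run can also be launched from the Python API. A minimal sketch using the Experiment class that appears in the examples further below; see the docs for the full interface:

from openhgnn import Experiment

# Equivalent of the CLI example above: GTN on imdb4GTN, node classification, GPU 0.
experiment = Experiment(model='GTN', dataset='imdb4GTN', task='node_classification',
                        gpu=0, use_best_config=True)
experiment.run()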

Use TensorBoard to visualize your train result

tensorboard --logdir=./openhgnn/output/{model_name}/

e.g.:

tensorboard --logdir=./openhgnn/output/RGCN/

Note: To visualize results, you need to train the model first.

Use gdbi to get a graph dataset

Take neo4j and the imdb dataset as an example:

  • Construct CSV files for the dataset (node-level: A.csv, edge-level: A_P.csv)
  • Import the CSV files into the database:
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS row
CREATE (:graphname_labelname {ID: row.ID, ... });
  • Add user information for database access in the config.py file:
self.graph_address = [graph_address]
self.user_name = [user_name]
self.password = [password]
  • e.g.:
python main.py -m MAGNN -d imdb4MAGNN -t node_classification -g 0 --use_best_config --use_database

Supported Models with specific task

Each link gives some basic usage.

| Model                        | Node classification | Link prediction | Recommendation |
|------------------------------|---------------------|-----------------|----------------|
| TransE [NIPS 2013]           |                     | ✔️              |                |
| TransH [AAAI 2014]           |                     | ✔️              |                |
| TransR [AAAI 2015]           |                     | ✔️              |                |
| TransD [ACL 2015]            |                     | ✔️              |                |
| Metapath2vec [KDD 2017]      | ✔️                  |                 |                |
| RGCN [ESWC 2018]             | ✔️                  | ✔️              |                |
| HERec [TKDE 2018]            | ✔️                  |                 |                |
| HAN [WWW 2019]               | ✔️                  | ✔️              |                |
| KGCN [WWW 2019]              |                     |                 | ✔️             |
| HetGNN [KDD 2019]            | ✔️                  | ✔️              |                |
| HeGAN [KDD 2019]             | ✔️                  |                 |                |
| HGAT [EMNLP 2019]            |                     |                 |                |
| GTN [NeurIPS 2019] & fastGTN | ✔️                  |                 |                |
| RSHN [ICDM 2019]             | ✔️                  | ✔️              |                |
| GATNE-T [KDD 2019]           |                     | ✔️              |                |
| DMGI [AAAI 2020]             | ✔️                  |                 |                |
| MAGNN [WWW 2020]             | ✔️                  |                 |                |
| HGT [WWW 2020]               | ✔️                  |                 |                |
| CompGCN [ICLR 2020]          | ✔️                  | ✔️              |                |
| NSHE [IJCAI 2020]            | ✔️                  |                 |                |
| NARS [arxiv]                 | ✔️                  |                 |                |
| MHNF [arxiv]                 | ✔️                  |                 |                |
| HGSL [AAAI 2021]             | ✔️                  |                 |                |
| HGNN-AC [WWW 2021]           | ✔️                  |                 |                |
| HeCo [KDD 2021]              | ✔️                  |                 |                |
| SimpleHGN [KDD 2021]         | ✔️                  |                 |                |
| HPN [TKDE 2021]              | ✔️                  | ✔️              |                |
| RHGNN [arxiv]                | ✔️                  |                 |                |
| HDE [ICDM 2021]              |                     | ✔️              |                |
| HetSANN [AAAI 2020]          | ✔️                  |                 |                |
| ieHGCN [TKDE 2021]           | ✔️                  |                 |                |

Candidate models

Contributors

OpenHGNN Team [GAMMA LAB], DGL Team, and Peng Cheng Laboratory.

See more in CONTRIBUTING.

Cite OpenHGNN

If you use OpenHGNN in a scientific publication, we would appreciate citations to the following paper:

@inproceedings{han2022openhgnn,
  title={OpenHGNN: An Open Source Toolkit for Heterogeneous Graph Neural Network},
  author={Hui Han and Tianyu Zhao and Cheng Yang and Hongyi Zhang and Yaoqi Liu and Xiao Wang and Chuan Shi},
  booktitle={CIKM},
  year={2022}
}


openhgnn's Issues

cannot import name 'Experiment' from 'openhgnn'

🐛 Bug

To Reproduce

I try to use the Experiment module:

from openhgnn import Experiment

error:

cannot import name 'Experiment' from 'openhgnn'

Environment

  • OpenHGNN Version: 0.2.1
  • Backend Library & Version: PyTorch 1.11.0, DGL 0.6.1
  • OS: Win11
  • Python version: 3.8.13

error in HetGNN_sampler.py

line 168, in assign_features_to_blocks
assign_simple_node_features(blocks[0].srcdata, g, ntypes)
AttributeError: 'dict' object has no attribute 'srcdata'

Error to run without Cuda

File "C:\Users\XyZ\OpenHGNN\openhgnn\models\GTN_sparse.py", line 220, in forward
sum_g = dgl.adj_sum_graph(A, 'w_sum')
AttributeError: module 'dgl' has no attribute 'adj_sum_graph'

This issue came up while I ran the command: python main.py -m GTN -d imdb4GTN -t node_classification -g -1 --use_best_config

Can someone tell me where I went wrong?

Run Example

❓ Questions and Help

import argparse

import dgl
import numpy as np
import scipy.sparse as sp
import torch as th
from dgl import transforms as T
from dgl.data import DGLDataset
from openhgnn import Experiment
from openhgnn.dataset import AsNodeClassificationDataset, generate_random_hg

category = 'p'
meta_paths_dict = {'pap': [('p', 'to', 'a'), ('a', 'to', 'p')]}


class MyNCDataset(DGLDataset):
    def __init__(self):
        super().__init__(name='my-nc-dataset')

    def process(self):
        # Generate a random heterogeneous graph with labels on target node type.
        self._g = load_acm()
        transform = T.Compose([T.ToSimple(), T.AddReverse()])
        # self._g = transform(self._g)

    # Some models require meta paths, you can set meta path dict for this dataset.
    @property
    def meta_paths_dict(self):
        return meta_paths_dict

    def __getitem__(self, idx):
        return self._g

    def __len__(self):
        return 1

def load_acm():
    # Note: helpers such as make_sparse_eye, nei_to_edge_index, preprocess_*_features,
    # sp_adj_to_tensor and train_test_split are the reporter's own utilities and are
    # not shown in this issue.
    path = "../data/acm/"
    ratio = [1, 5, 10, 20]
    label = np.load(path + "labels.npy").astype('int32')
    nei_a = np.load(path + "nei_a.npy", allow_pickle=True)
    nei_s = np.load(path + "nei_s.npy", allow_pickle=True)
    feat_p = sp.load_npz(path + "p_feat.npz").astype("float32")
    feat_a = sp.load_npz(path + "a_feat.npz").astype("float32")
    feat_s = make_sparse_eye(60)
    pap = sp.load_npz(path + "pap.npz")
    psp = sp.load_npz(path + "psp.npz")
    pos = sp.load_npz(path + "pos.npz")

    label = th.LongTensor(label)
    nei_a = nei_to_edge_index([th.LongTensor(i) for i in nei_a])
    nei_s = nei_to_edge_index([th.LongTensor(i) for i in nei_s])
    feat_p = preprocess_sp_features(feat_p)
    feat_a = preprocess_sp_features(feat_a)
    feat_s = preprocess_th_features(feat_s)
    pap = sp_adj_to_tensor(pap)
    psp = sp_adj_to_tensor(psp)
    pos = sp_adj_to_tensor(pos)
    # nei_a size: (2, 13407)
    edge = {
        ('a', 'to', 'p'): (nei_a.flip([0])[0], nei_a.flip([0])[1]),
        ('s', 'to', 'p'): (nei_s.flip([0])[0], nei_s.flip([0])[1])
    }
    g = dgl.heterograph(edge)
    g.nodes['p'].data['x'] = feat_p
    g.nodes['s'].data['x'] = feat_s
    g.nodes['a'].data['x'] = feat_a
    g.nodes['p'].data['label'] = label

    # data[('p', 'a', 'p')].edge_index = pap
    # data[('p', 's', 'p')].edge_index = psp
    # data[('p', 'pos', 'p')].edge_index = pos

    for r in ratio:
        mask = train_test_split(
            g.nodes['p'].data['label'].detach().cpu().numpy(), seed=np.random.randint(0, 35456, size=1),
            train_examples_per_class=r,
            val_size=1000, test_size=None)
        train_mask_l = f"{r}_train_mask"
        train_mask = mask['train'].astype(bool)
        val_mask_l = f"{r}_val_mask"
        val_mask = mask['val'].astype(bool)

        test_mask_l = f"{r}_test_mask"
        test_mask = mask['test'].astype(bool)

        g.nodes['p'].data[train_mask_l] = th.from_numpy(train_mask)
        g.nodes['p'].data[val_mask_l] = th.from_numpy(val_mask)
        g.nodes['p'].data[test_mask_l] = th.from_numpy(test_mask)

    return g

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', '-m', default='RGCN', type=str, help='name of models')
    parser.add_argument('--dataset', '-d', default='acm', type=str, help='acm or cora')
    parser.add_argument('--gpu', '-g', default='-1', type=int, help='-1 means cpu')
    parser.add_argument('--mini-batch-flag', action='store_true')

    args = parser.parse_args()

    ds = MyNCDataset()
    new_ds = AsNodeClassificationDataset(ds, target_ntype='author', labeled_nodes_split_ratio=[0.8, 0.1, 0.1],
                                         prediction_ratio=1, label_mask_feat_name='label_mask')

    experiment = Experiment(conf_path='./my_config.ini', max_epoch=1, model=args.model, dataset=new_ds,
                            task='node_classification', mini_batch_flag=args.mini_batch_flag, gpu=args.gpu,
                            test_flag=False, prediction_flag=False, batch_size=100, use_uva=False)
    experiment.run()
WARNING:root:The OGB package is out of date. Your version is 1.3.5, while the latest version is 1.3.6.
------------------------------------------------------------------------------
 Basic setup of this experiment: 
     model: RGCN    
     dataset: my-nc-dataset-as-nodepred   
     task: node_classification. 
 This experiment has following parameters. You can use set_params to edit them.
 Use print(experiment) to print this information again.
------------------------------------------------------------------------------
batch_size: 100
dataset_name: my-nc-dataset-as-nodepred
device: cpu
dropout: 0.2
fanout: 4
gpu: -1
hidden_dim: 64
hpo_search_space: None
hpo_trials: 100
in_dim: 64
load_from_pretrained: True
lr: 0.01
max_epoch: 0
mini_batch_flag: False
model_name: RGCN
n_bases: 40
num_layers: 3
optimizer: Adam
output_dir: ./openhgnn/output/RGCN
patience: 50
prediction_flag: True
seed: 0
test_flag: False
use_best_config: False
use_self_loop: False
use_uva: False
validation: True
weight_decay: 0.0001

08 May 15:02    INFO  [Config Info]	Model: RGCN,	Task: node_classification,	Dataset: Dataset("my-nc-dataset-as-nodepred", num_graphs=1, save_path=/home/yhkj/.dgl/my-nc-dataset-as-nodepred)
08 May 15:02    INFO  [NC Specific] Modify the out_dim with num_classes
Traceback (most recent call last):
  File "/hgnn/inference.py", line 21, in <module>
    prediction_res = experiment.run()
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/experiment.py", line 105, in run
    flow = build_flow(self.config, trainerflow)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/trainerflow/__init__.py", line 46, in build_flow
    return FLOW_REGISTRY[flow_name](args)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/trainerflow/node_classification.py", line 41, in __init__
    self.model = build_model(self.model).build_model_from_args(self.args, self.hg).to(self.device)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/models/RGCN.py", line 41, in build_model_from_args
    return cls(args.hidden_dim,
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/models/RGCN.py", line 73, in __init__
    self.layers.append(RelGraphConvLayer(
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/openhgnn/models/RGCN.py", line 167, in __init__
    self.conv = dglnn.HeteroGraphConv({
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/dgl/nn/pytorch/hetero.py", line 132, in __init__
    self.mods = nn.ModuleDict(mods)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/torch/nn/modules/container.py", line 322, in __init__
    self.update(modules)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/torch/nn/modules/container.py", line 398, in update
    self[key] = module
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/torch/nn/modules/container.py", line 329, in __setitem__
    self.add_module(key, module)
  File "/home/yhkj/anaconda3/envs/ssl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 388, in add_module
    raise KeyError("attribute '{}' already exists".format(name))
KeyError: "attribute 'to' already exists"

Process finished with exit code 1

Hello, dear author!
I followed the official example strictly, but got an error on my own dataset!
I would like to know the reason.
Thank you for your help.

Why is the embedding of meta paths different for different nodes in acm4GTN?

❓ Questions and Help

Hi, I found that there are different meta-path embeddings for nodes of different types when I run the HGSL model on the node classification task with acm4GTN. Normally, there should be one meta-path embedding per meta-path type. So could you tell me why the embedding of meta-paths differs across nodes in acm4GTN?

Poor accuracy on the DBLP dataset

❓ Questions and Help

I would like to know why, when I test various algorithms on the node classification task with the DBLP dataset on the platform, the accuracy is only around 30. This differs greatly from the DBLP accuracy reported in other papers. Could you tell me the reason?

Repeated dataset processing / ineffective cache of preprocessed dataset files

The dataset module is designed to cache raw data, once processed into the framework's format, as a .bin file so that later runs can load it quickly without reprocessing. However, some datasets do not implement the has_cache method, so the raw files are re-read and re-processed every time. For example, openhgnn/dataset/gtn_dataset.py could add an implementation like the sketch below.
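
A minimal sketch of the suggested method, assuming the processed graph is saved as graph.bin under self.save_path:

import os

def has_cache(self):
    # Tell DGLDataset that the processed file exists, so load() is used
    # instead of re-running process().
    return os.path.isfile(os.path.join(self.save_path, 'graph.bin'))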

New user questions

Hi all, thanks for making this library available. I am trying to use it for my benchmarks, but I am having a bit of trouble.

I want to evaluate my own dataset for recommendation. On the website, there is an example only for node classification. I started digging into the git repository and found an example for link_prediction under examples/customization.

I decided to settle for link_prediction, because I don't know what would be the equivalent to AsLinkPredictionDataset for recommendation.

I want to compute hits@k, but it is not clear where to change the metric, since I couldn't find it as an input of AsLinkPredictionDataset, config.ini, or OpenHGNN, so I have no idea how to change it.

In OGB benchmarks, they do the hits@k by providing a neg_df and a positive_df and comparing scores_pos > scores_neg. Maybe this could be part of the link_prediction pipeline to support hits@k?

I could also calculate the metric on my own if I could save the predictions, but it is not clear how to do inference or access the model after it is trained. I couldn't find it in the tutorials or examples.

In summary:
It would be nice to have:

  1. Tutorial for recommendation system;
  2. How to change metrics;
  3. Can I calculate hits@K in OGB style, where I compare hits@K on a given neg set?
  4. How to save/load the trained model and do inference?

thanks very much!!
Felipe

my code

import torch as th
from openhgnn.dataset import AsLinkPredictionDataset, generate_random_hg
from dgl import transforms as T
from dgl import DGLHeteroGraph
from dgl.data import DGLDataset
from dgl.dataloading.negative_sampler import GlobalUniform
import os
import numpy as np
meta_paths_dict = {}  # {'APA': [('author', 'author-paper', 'paper'), ('paper', 'rev_author-paper', 'author')]}
target_link = [('DRUG', 'DRUG_DIS', 'DIS')]

class MySplitLPDatasetWithNegEdges(DGLDataset):
    def __init__(self):
        super().__init__(name='my-split-lp-dataset-with-neg-edges',
                         force_reload=True)

    def process(self):
        hg, neg_edges = np.load('pathtomydataset.npy', allow_pickle=True)
        self._neg_val_edges, self._neg_test_edges = neg_edges['valid'], neg_edges['test']
        self._g = hg

    @property
    def neg_val_edges(self):
        return self._neg_val_edges

    @property
    def neg_test_edges(self):
        return self._neg_test_edges

    @property
    def meta_paths_dict(self):
        return meta_paths_dict

    def __getitem__(self, idx):
        return self._g

    def __len__(self):
        return 1


def train_with_custom_lp_dataset(dataset):
    from openhgnn.config import Config
    from openhgnn.start import OpenHGNN
    config_file = ["../../openhgnn/config.ini"]
    config = Config(file_path=config_file, model='RGCN', dataset=dataset, task='link_prediction', gpu=-1)
    OpenHGNN(args=config)

if __name__ == '__main__':
    mySplitLPDatasetWithNegEdges = AsLinkPredictionDataset(MySplitLPDatasetWithNegEdges(), target_link=target_link,
                                                           target_link_r=None,
                                                           force_reload=True)
    train_with_custom_lp_dataset(mySplitLPDatasetWithNegEdges)

space4hgnn: error when running rank.py

Hi, after following the space4hgnn README to run "Run a single experiment" and "Run a batch of experiments", step "3.2 Analyze with figures" fails: running rank.py raises the error below, and distribution.py reports the same error.
/home/L/OpenHGNN/space4hgnn/figure/rank.py
Traceback (most recent call last):
File "/home/L/anaconda3/envs/openhgnn2/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'subgraph'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/L/OpenHGNN/space4hgnn/figure/rank.py", line 21, in
df=df[df['subgraph'] != 'mixed']
File "/home/L/anaconda3/envs/openhgnn2/lib/python3.7/site-packages/pandas/core/frame.py", line 3458, in getitem
indexer = self.columns.get_loc(key)
File "/home/L/anaconda3/envs/openhgnn2/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 'subgraph'

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [14328, 334]] is at version 1; expected version 0 instead

When I run the HGSL model, the acm4GTN dataset works normally. When I switch to the dblp4GTN or imdb4GTN dataset, this error occurs (the following is for dblp4GTN):

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [14328, 334]] is at version 1; expected version 0 instead.

And I set undirected_relations = author-paper,paper-conference for the dblp4GTN dataset in the config.ini file. I can see that the tensor of shape [14328, 334] is node type "paper" in the variable h_dict, but I don't know why it doesn't work. Or could you add the datasets "dblp4GTN" and "imdb4GTN" for the HGSL model? And what are the differences between "acm4GTN" and the other "*4GTN" datasets?
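
A general way to locate the offending in-place operation (a standard PyTorch debugging aid, not an OpenHGNN-specific fix) is anomaly detection, which makes the backward error point at the forward op that produced the modified tensor:

import torch

# Enable before training; it slows execution, so use it only while debugging.
torch.autograd.set_detect_anomaly(True)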

Help needed: Wanted behavior of Experiment.specific_trainerflow.get method and task/trainerflow registration

Hi,
I am trying to create a new trainer flow, as well as a new task. I am struggling a bit and have a few questions:
When I register them with @register_flow(str_flow) and @register_task(str_task), must str_task and str_flow be identical?
Because my flow is not specific to a model, it is not in the specific_trainerflow dictionary defined in the Experiment class. So line 92 in experiment.py ( trainerflow = self.specific_trainerflow.get(self.config.model, self.config.task) ) returns the key of the task as the trainerflow_key. Is this the wanted behavior?

Thanks!

Something weird with the order of setting arguments when using best config

self.hidden_dim = self.h_dim * self.num_heads

Hi,

I found that Config first reads config arguments from config.ini and then uses best_config.py to replace them when --use_best_config is set. However, derived values like self.hidden_dim in config.py are computed while Config reads args from config.ini, and they do not change even when best_config.py later modifies self.h_dim and self.num_heads, because best_config.py runs inside OpenHGNN(). Probably it's a bug, I think, since a change of self.h_dim or self.num_heads should change self.hidden_dim. See the sketch below.
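
A minimal standalone sketch of the staleness described above (a hypothetical class, just to illustrate the ordering):

class Config:
    def __init__(self):
        self.h_dim = 64
        self.num_heads = 4
        # Derived once at read time; goes stale if the inputs change later.
        self.hidden_dim = self.h_dim * self.num_heads

conf = Config()
conf.h_dim = 128                 # what the best-config override effectively does later...
assert conf.hidden_dim == 256    # ...but hidden_dim still reflects h_dim == 64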

Mini-batch training issues with individual models

❓ Questions and Help

When training ieHGCN on my own data, the forward function needs blocks[0].dsttype, but blocks[0].dsttype returns ['_N']:

    with hg.local_scope():
        hg.ndata['h'] = h_dict
        # formulas (2)-1
        dst_inputs = self.W_self(dst_inputs)
        query = {}
        key = {}
        attn = {}
        attention = {}
        
        # formulas (3)-1 and (3)-2
        for ntype in hg.dsttypes:
            query[ntype] = self.linear_q[ntype](dst_inputs[ntype])
            key[ntype] = self.linear_k[ntype](dst_inputs[ntype])

About HetGNN evaluation

Hi, my heterogeneous graph is fairly large, and running the evaluation on the full graph runs out of memory. Running it in batches exactly like training would require the random-walk process again. Is there a better way?

Error when running the HGSL model

🐛 Bug

When I run the suggested command:

python main.py -m HGSL -d acm4GTN -t node_classification -g 0 --use_best_config

this raise an error like:

Traceback (most recent call last):
File "main.py", line 21, in <module>
experiment.run()
File "/workspace/OpenHGNN/openhgnn/experiment.py", line 97, in run
flow = build_flow(self.config, trainerflow)
File "/workspace/OpenHGNN/openhgnn/trainerflow/__init__.py", line 46, in build_flow
return FLOW_REGISTRY[flow_name](args)
File "/workspace/OpenHGNN/openhgnn/trainerflow/node_classification.py", line 42, in __init__
self.model = build_model(self.model).build_model_from_args(self.args, self.hg).to(self.device)
File "/workspace/OpenHGNN/openhgnn/models/HGSL.py", line 106, in build_model_from_args
mp_emb_dim = hg.nodes["paper"].data["pap_m2v_emb"].shape[1]
File "/opt/conda/lib/python3.7/site-packages/dgl/view.py", line 73, in __getitem__
return self._graph._get_n_repr(self._ntid, self._nodes)[key]
File "/opt/conda/lib/python3.7/site-packages/dgl/frame.py", line 622, in __getitem__
return self._columns[name].data
KeyError: 'pap_m2v_emb'

It seems there is no pap_m2v_emb key in the paper nodes' data, so how can I fix it?


more error update:
When I just set mp_emb_dim = 0 to skip this line, more errors are raised, such as hidden_dim, mini_batch_flag, ... not defined in config. Besides, when I finally got the model running, another exception was raised (screenshot omitted).

Do you have an updated version of the model?

Sincere thanks.

To Reproduce

Steps to reproduce the behavior:

1. cd OpenHGNN
2. python main.py -m HGSL -d acm4GTN -t node_classification -g 0 --use_best_config

Expected behavior

Environment

  • OpenHGNN Version (e.g., 1.0):
  • PyTorch latest, DGL latest
  • Linux
  • python main.py -m HGSL -d acm4GTN -t node_classification -g 0 --use_best_config
  • best_config for recommend

How to train model using own dataset?

❓ Questions and Help

I want to train my own HNN data. Could you tell me how to edit the code? The data in ./openhgnn/dataset are downloaded from https://s3.cn-north-1.amazonaws.com.cn/dgl-data/ and are .bin files. So how could I change this dataset?
Please help!

Cannot run GTN with the acm4GTN dataset

Running:
python main.py -m GTN -t node_classification -d acm4GTN -g 0 --use_best_config

Error message:
Using backend: pytorch
Use the best config.
Done saving data into cached files.
Modify the out_dim with num_classes
0%| | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 24, in
OpenHGNN(args=config)
File "/home/special/user/lihaoran/OpenHGNN_clone_from_github/openhgnn/start.py", line 17, in OpenHGNN
result = flow.train()
File "/home/special/user/lihaoran/OpenHGNN_clone_from_github/openhgnn/trainerflow/node_classification.py", line 77, in train
loss = self._full_train_step()
File "/home/special/user/lihaoran/OpenHGNN_clone_from_github/openhgnn/trainerflow/node_classification.py", line 109, in _full_train_step
loss.backward()
File "/opt/miniconda3/lib/python3.7/site-packages/torch/_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/opt/miniconda3/lib/python3.7/site-packages/torch/autograd/init.py", line 149, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
File "/opt/miniconda3/lib/python3.7/site-packages/torch/autograd/function.py", line 87, in apply
return self._forward_cls.backward(self, *args) # type: ignore[attr-defined]
File "/opt/miniconda3/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 544, in backward
gidxA.reverse(), A_weights, gidxC, dC_weights, gidxB.number_of_ntypes())
File "/opt/miniconda3/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 638, in csrmm
CSRMM.apply(gidxA, A_weights, gidxB, B_weights, num_vtypes)
File "/opt/miniconda3/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 528, in forward
gidxC, C_weights = _csrmm(gidxA, A_weights, gidxB, B_weights, num_vtypes)
File "/opt/miniconda3/lib/python3.7/site-packages/dgl/sparse.py", line 548, in _csrmm
A, F.to_dgl_nd(A_weights), B, F.to_dgl_nd(B_weights), num_vtypes)
File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.call
File "dgl/_ffi/_cython/./function.pxi", line 232, in dgl._ffi._cy3.core.FuncCall
File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [17:18:53] /opt/dgl/src/array/cuda/csr_mm.cu:87: Check failed: e == CUSPARSE_STATUS_SUCCESS: CUSPARSE ERROR: 11
Stack trace:
[bt] (0) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7fd13c2565df]
[bt] (1) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(std::pair<dgl::aten::CSRMatrix, dgl::runtime::NDArray> dgl::aten::cusparse::CusparseSpgemm<float, int>(dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray)+0x625) [0x7fd13c6accd5]
[bt] (2) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(std::pair<dgl::aten::CSRMatrix, dgl::runtime::NDArray> dgl::aten::CSRMM<2, long, float>(dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray)+0x59e) [0x7fd13c6af81e]
[bt] (3) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::CSRMM(dgl::aten::CSRMatrix, dgl::runtime::NDArray, dgl::aten::CSRMatrix, dgl::runtime::NDArray)+0x10d6) [0x7fd13c493466]
[bt] (4) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(+0x48cfa8) [0x7fd13c493fa8]
[bt] (5) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(+0x48d724) [0x7fd13c494724]
[bt] (6) /opt/miniconda3/lib/python3.7/site-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7fd13c4d5c78]
[bt] (7) /opt/miniconda3/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x163ea) [0x7fd1136f03ea]
[bt] (8) /opt/miniconda3/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x1695b) [0x7fd1136f095b]

GPU: A100-PCIE
DGL version: dgl-cu111-0.8a211008
It looks like the logits can be obtained, but backpropagation fails.

Strangely, there is no problem at all when running the imdb4GTN dataset.
Running acm4GTN with MHNF reports the same error.

I see there are two GTN implementations, GTN_sparse.py and GTN.py, and GTN_sparse is the default. GTN.py can run acm4GTN, but the accuracy is only around 60%.

Error when running GTN&fastGTN

Thank you very much for being able to provide this tool. I get an error when I run fastGTN using:

python main.py -m fastGTN -t node_classification -d acm4GTN -g 0 --use_best_config

The error is as follows:

Traceback (most recent call last):
File "D:/github/OpenHGNN/main.py", line 30, in
OpenHGNN(args=config)
File "D:\github\OpenHGNN\openhgnn\start.py", line 19, in OpenHGNN
result = flow.train()
File "D:\github\OpenHGNN\openhgnn\trainerflow\node_classification.py", line 112, in train
train_loss = self._full_train_step()
File "D:\github\OpenHGNN\openhgnn\trainerflow\node_classification.py", line 152, in _full_train_step
logits = self.model(self.hg, h_dict)[self.category]
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\github\OpenHGNN\openhgnn\models\fastGTN.py", line 119, in forward
hat_A = self.layers[i](A)
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\github\OpenHGNN\openhgnn\models\fastGTN.py", line 180, in forward
sum_g = dgl.adj_sum_graph(A, 'w_sum')
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl\transforms\functional.py", line 2766, in adj_sum_graph
C_gidx, C_weights = F.csrsum(gidxs, weights)
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl\backend\pytorch\sparse.py", line 817, in csrsum
nrows, ncols, C_indptr, C_indices, C_eids, C_weights = CSRSum.apply(gidxs, *weights)
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl\backend\pytorch\sparse.py", line 668, in forward
gidxC, C_weights = _csrsum(gidxs, weights)
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl\sparse.py", line 776, in _csrsum
C, C_weights = _CAPI_DGLCSRSum(As, [F.to_dgl_nd(w) for w in A_weights])
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl_ffi_ctypes\function.py", line 188, in call
check_call(_LIB.DGLFuncCall(
File "D:\Program Files (x86)\anaconda\envs\OpenHGNN\lib\site-packages\dgl_ffi\base.py", line 65, in check_call
raise DGLError(py_str(_LIB.DGLGetLastError()))
dgl._ffi.base.DGLError: [15:31:21] C:\Users\Administrator\dgl-0.5\src\array\kernel.cc:471: Check failed: A[i].indptr->dtype == idtype (int64 vs. int32) : The ID types of all graphs must be equal.

I use the following software versions:

python = 3.8
cudatoolkit = 11.3.1
torch = 1.11.0+cu113
dgl-cu113 = 0.8.1 & 0.8.0

Then I ran the same version of the software on my ubuntu server with no errors.

What's the required version of numpy? An error happens.

❓ Questions and Help

The error is as follows:
np.object was a deprecated alias for the builtin object. To avoid this error in existing code, use object by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
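
This is NumPy 1.24 removing the deprecated np.object alias, not an OpenHGNN-specific bug; either pin numpy below 1.24 or replace the alias with the builtin, e.g. (a minimal sketch):

import numpy as np

# np.object was removed in NumPy 1.24; the builtin object is the drop-in replacement.
arr = np.array([1, 'a'], dtype=object)  # instead of dtype=np.object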

Attribute error

I am training the HetGNN model for node classification. When I try to run the training script, I get the following error. Please help me:
AttributeError: 'dict' object has no attribute 'srcdata'

This demo is no longer valid

❓ Questions and Help

How to build a new dataset
Overview

We use dgl.heterograph as our graph data structure.

The APIs dgl.save_graphs and dgl.load_graphs can be used to store a graph locally and load it back, as sketched below.
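
A minimal save/load sketch (the node and edge types here are placeholders, not a real OpenHGNN dataset):

import dgl
import torch as th

# Build a small dgl.heterograph and store it as graph.bin.
g = dgl.heterograph({
    ('paper', 'written-by', 'author'): (th.tensor([0, 1]), th.tensor([0, 0])),
})
dgl.save_graphs('graph.bin', [g])

# load_graphs returns (list_of_graphs, label_dict).
glist, _ = dgl.load_graphs('graph.bin')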
The Flow

Process your dataset as a [dgl.heterograph](https://docs.dgl.ai/en/latest/guide/graph-heterogeneous.html#guide-graph-heterogeneous).
Store it as graph.bin and compress it as dataset_name4model_name.zip.
Upload the zip file to s3.
If the dataset is a Heterogeneous Information Network, you can modify the [AcademicDataset](https://github.com/BUPT-GAMMA/OpenHGNN/blob/main/openhgnn/dataset/academic_graph.py) directly, or refer to it to build a new dataset class.

We give a demo to build a new dataset.

demo
This demo is no longer valid

A runtime error in the link prediction task based on the TransE model

🐛 Bug

I found an error when I tried to test the TransE model. The relevant code is as follows:

import argparse

from openhgnn import Experiment
from openhgnn.dataset import AsLinkPredictionDataset

# MyLPDatasetWithPredEdges, target_link and target_link_r are defined by the
# reporter and are not shown in this issue.

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', '-m', default='TransE', type=str, help='name of models')
    parser.add_argument('--gpu', '-g', default='-1', type=int, help='-1 means cpu')
    parser.add_argument('--mini-batch-flag', action='store_true')

    args = parser.parse_args()

    ds = MyLPDatasetWithPredEdges()
    new_ds = AsLinkPredictionDataset(ds, target_link=target_link, target_link_r=target_link_r,
                                     split_ratio=[0.8, 0.1, 0.1], force_reload=True)

    experiment = Experiment(conf_path='../config.ini', max_epoch=3, model=args.model, dataset=new_ds,
                            task='link_prediction', mini_batch_flag=args.mini_batch_flag, gpu=args.gpu,
                            test_flag=True, prediction_flag=True, batch_size=100)
    experiment.run()

When the training process is over and the test process is about to begin, the following error occurs:

------------------------------------------------------------------------------
 Basic setup of this experiment: 
     model: TransE    
     dataset: my-lp-dataset-as-linkpred   
     task: link_prediction. 
 This experiment has following parameters. You can use set_params to edit them.
 Use print(experiment) to print this information again.
------------------------------------------------------------------------------
batch_size: 100
dataset_name: my-lp-dataset-as-linkpred
device: cpu
dis_norm: 1
filtered: filtered
gpu: -1
hidden_dim: 400
hpo_search_space: None
hpo_trials: 100
load_from_pretrained: False
lr: 1.0
margin: 4.0
max_epoch: 3
mini_batch_flag: False
model_name: TransE
neg_size: 13
optimizer: SGD
output_dir: ./openhgnn/output\TransE
patience: 3
prediction_flag: False
score_fn: transe
seed: 0
test_flag: False
test_percent: 0.1
use_best_config: False
valid_percent: 0.01
weight_decay: 0.0001

17 Feb 16:50    INFO  [Init Task] The task: link prediction, the dataset: Dataset("my-lp-dataset-as-linkpred", num_graphs=1, save_path=\.dgl\my-lp-dataset-as-linkpred), the evaluation metric is roc_auc, the score function: transe 
17 Feb 16:50    INFO  [Train Info] epoch 000
100%|███████████████████████████████████████████████████████████| 1011/1011 [01:00<00:00, 16.65it/s]
17 Feb 16:51    INFO  [Train Info] epoch 000 loss: 1749.946214914322
Traceback (most recent call last):
  File "./OpenHGNN/examples/applications/link_prediction/train.py", line 21, in <module>
    experiment.run()
  File ".\OpenHGNN\openhgnn\experiment.py", line 107, in run
    result = flow.train()
  File ".\OpenHGNN\openhgnn\trainerflow\TransX_trainer.py", line 53, in train
    epoch = self._train()
  File ".\OpenHGNN\openhgnn\trainerflow\TransX_trainer.py", line 82, in _train
    val_metric = self._test_step('valid')
  File ".\OpenHGNN\openhgnn\trainerflow\TransX_trainer.py", line 127, in _test_step
    return {mode: self.task.evaluate(n_emb, r_emb, mode)}
  File ".\OpenHGNN\openhgnn\tasks\link_prediction.py", line 129, in evaluate
    p_score = th.sigmoid(self.ScorePredictor(eval_hg, n_embedding, r_embedding))
  File ".\OpenHGNN\openhgnn\models\TransE.py", line 36, in forward
    h_emb = self.n_emb(h.to(self.device))
  File "\Anaconda3\envs\threatrace\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "\Anaconda3\envs\threatrace\lib\site-packages\torch\nn\modules\sparse.py", line 162, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "\Anaconda3\envs\threatrace\lib\site-packages\torch\nn\functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not DGLGraph

exit code 1

Environment

  • OpenHGNN Version =1.0:
  • PyTorch 1.13.1+cu116, DGL 1.0.0:
  • OS Windows:
  • Python version: 3.7.16

Model reference consultation

Hi, do you have any references for the homo_GNN model? I could not find the reference source of the homo_GNN model in the project.

"dblp4HAN" dataset bug

🐛 Bug

When I ran "python -u /home/wj/dgl/OpenHGNN-main/main.py -m HAN -d dblp4HAN -t node_classification -g 6 --use_best_config --load_from_pretrained" with openhgnn, I got an error as "UnboundLocalError: local variable '_dataset' referenced before assignment".

To Reproduce

Steps to reproduce the behavior:

1.Just run as the command I shown above.

Expected behavior

  1. I traced the code and found that the source was implemented with many "elif" branches to distinguish the dataset name, but without an "else". So when we input an invalid dataset name, it reports an error about the variable instead of the dataset name we entered. We can just add an "else" to improve the error report, as sketched below.
  2. I found that "dblp4HAN" is introduced in the README.md file in openhgnn/dataset, but there is actually no such dataset, so could we just fix this file?
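
A sketch of the suggested fix (a hypothetical function mirroring the elif chain, not the library's actual code): ending with an explicit else makes an unknown dataset name fail loudly instead of leaving _dataset unbound.

def build_dataset(dataset_name):
    if dataset_name == 'acm4GTN':
        _dataset = 'hin_node_classification'
    elif dataset_name == 'imdb4GTN':
        _dataset = 'hin_node_classification'
    else:
        raise ValueError(f'Unsupported dataset name: {dataset_name}')
    return _dataset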

Environment

  • OpenHGNN Version (e.g., 1.0):
  • Backend Library & Version (e.g., PyTorch 0.4.1, DGL 0.7.0):
  • OS (e.g., Linux):
  • Running command you used (e.g., python main.py -m GTN -d imdb4GTN -t node_classification -g 0 --use_best_config):
  • Model configuration you used (e.g., details of the model configuration you used in config.ini):
  • Python version: 3.8
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

None.

Identical HetGNN embeddings

Running HetGNN on the provided academic4HetGNN.zip dataset, some embedding results turn out completely identical, for reasons unknown. Is this expected?

The test is as follows:

import numpy as np

emb = np.load('emb50.npy')
vals = emb[:, 0]

for i in np.unique(vals):
    idx = np.argwhere(vals == i)
    r = idx.reshape(1, -1).squeeze(0)
    if len(r) > 1:
        print('index for {}:\n'.format(i), r)
        for j in r:
            print(emb[j])

[Doc] ReadTheDocs parameters

🐛 Bug

The parameters written in the docstring of __init__ should correspond to the input parameters. For example, we lost the ntypes in the HGT. Remember to check all the other models.

Bugs in mini-batch training

🐛 Bug

To Reproduce

An error occurs in the _mini_train_step function in trainerflow/node_classification.py when using mini_batch_flag in the node_classification task with the SimpleHGN model:

import argparse
from openhgnn.experiment import Experiment

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', '-m', default='SimpleHGN', type=str, help='name of models')
    parser.add_argument('--task', '-t', default='node_classification', type=str, help='name of task')
    # link_prediction / node_classification
    parser.add_argument('--dataset', '-d', default='imdb4MAGNN', type=str, help='name of datasets')
    parser.add_argument('--gpu', '-g', default='0', type=int, help='-1 means cpu')
    parser.add_argument('--use_best_config', action='store_true', help='will load utils.best_config')
    parser.add_argument('--load_from_pretrained', action='store_true', help='load model from the checkpoint')
    args = parser.parse_args()

    experiment = Experiment(model=args.model, dataset=args.dataset, task=args.task, gpu=args.gpu,
                            use_best_config=args.use_best_config, load_from_pretrained=args.load_from_pretrained, mini_batch_flag = True, batch_size=64)
    experiment.run()

Expected behavior

Minibatch training on a large heterograph

Environment

  • torch==1.12.1
  • dgl-cu113==0.9.0 # for CUDA support
  • openhgnn==0.3.0
  • Linux
  • Python 3.8.13

Additional context

  • the default minibatch sampler is MultiLayerFullNeighborSampler
  • blocks is a list (line 164), while the expected input of the model's forward function (e.g. SimpleHGN) is a full hg (line 159), as the two snippets below show:
for i, (input_nodes, seeds, blocks) in enumerate(loader_tqdm):
    blocks = [blk.to(self.device) for blk in blocks]
    ...
    logits = self.model(blocks, emb)[self.category]
def forward(self, hg, h_dict):
    with hg.local_scope():
        hg.ndata['h'] = h_dict
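
A possible workaround (a hypothetical sketch, not the library's fix) is to dispatch in forward on whether the trainer passed a list of blocks or a full heterograph:

import torch.nn as nn

class BlocksOrGraphModel(nn.Module):
    """Sketch: accept either a full heterograph or a list of DGLBlocks."""
    def __init__(self, layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def forward(self, g, h_dict):
        if isinstance(g, list):          # mini-batch: one block per layer
            for layer, block in zip(self.layers, g):
                h_dict = layer(block, h_dict)
        else:                            # full graph: reuse the same graph
            for layer in self.layers:
                h_dict = layer(g, h_dict)
        return h_dict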

Obtaining metapaths and attention scores

❓ Questions and Help

Hi, I noticed in the FastGTN paper that they point out the top meta-paths. My question is: how do you extract these specific meta-paths? I have data where I want to find the best meta-paths, but I don't know the meta-paths beforehand (e.g., I don't know that author-paper-author is a meta-path). Can I specify a dataset without meta-paths and have the model produce them? I don't see a function for this. Also, I am assuming that the attention scores in the model are the A_hat; if not, how do we get the attention scores?

Thanks in advance

No trainflow for ie-HGCN?

❓ Questions and Help

It seems there is no specific trainerflow for the newly added model ie-HGCN in openhgnn/experiment.py, so how can I run this model?
Thanks for the reply.

Segmentation fault on GTN with 3 layers

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Change num_layers in GTN to 3
  2. run python ./OpenHGNN/main.py -m GTN -d acm4GTN -t node_classification -g 0

Namespace(dataset='acm4GTN', gpu=0, load_from_pretrained=False, model='GTN', task='node_classification', use_best_config=False)

Basic setup of this experiment:
model: GTN
dataset: acm4GTN
task: node_classification.
This experiment has following parameters. You can use set_params to edit them.
Use print(experiment) to print this information again.

adaptive_lr_flag: True
dataset_name: acm4GTN
device: cuda:0
gpu: 0
hidden_dim: 128
hpo_search_space: None
hpo_trials: 100
identity: True
load_from_pretrained: False
lr: 0.005
max_epoch: 50
mini_batch_flag: False
model_name: GTN
norm_emd_flag: True
num_channels: 2
num_layers: 3
optimizer: Adam
out_dim: 16
output_dir: ./openhgnn/output/GTN
patience: 10
seed: 0
use_best_config: False
weight_decay: 0.001

10 Jan 00:43 INFO [Config Info] Model: GTN, Task: node_classification, Dataset: acm4GTN
Done saving data into cached files.
10 Jan 00:43 INFO [NC Specific] Modify the out_dim with num_classes
10 Jan 00:43 INFO [Feature Transformation] Feat is 0, nothing to do!
0%| | 0/50 [00:00<?, ?it/s]Segmentation fault (core dumped)

Expected behavior

No segmentation fault.

Environment

  • OpenHGNN Version (e.g., 1.0): Install from most recent source with pip -e
  • Backend Library & Version (e.g., PyTorch 0.4.1, DGL 0.7.0): pytorch=2.0.0=py3.8_cuda11.8_cudnn8.7.0_0, dgl=1.1.2.cu118=py38_0
  • OS (e.g., Linux): Linux
  • Running command you used (e.g., python main.py -m GTN -d imdb4GTN -t node_classification -g 0 --use_best_config): python ./OpenHGNN/main.py -m GTN -d acm4GTN -t node_classification -g 0
  • Model configuration you used (e.g., details of the model configuration you used in config.ini):
    [GTN]
    learning_rate = 0.005
    weight_decay = 0.001

hidden_dim = 128
out_dim = 16
num_channels = 2
num_layers = 3

seed = 0
max_epoch = 50
patience = 10

identity = True
norm_emd_flag = True
adaptive_lr_flag = True
mini_batch_flag = False

  • Python version: 3.8.18
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100): Tesla T4
  • Any other relevant information: None

Cannot run the commands in tests/scripts/run_experiments.py on multiple datasets

For example, 'python main.py -m NSHE -t node_classification -d acm4NSHE -g 0 --use_best_config' gives the following message:

Traceback (most recent call last):
File "main.py", line 28, in
OpenHGNN(args=config)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/start.py", line 19, in OpenHGNN
result = flow.train()
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/nshe_trainer.py", line 56, in train
self._test_step()
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/nshe_trainer.py", line 136, in _test_step
metric = self.task.evaluate(logits, 'f1_lr')
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/tasks/node_classification.py", line 78, in evaluate
pred = logits[mask].argmax(dim=1).to('cpu')
UnboundLocalError: local variable 'mask' referenced before assignment

and for 'python main.py -m KGCN -d LastFM4KGCN -t recommendation -g 0 --use_best_config', we have

Traceback (most recent call last):
File "main.py", line 28, in <module>
OpenHGNN(args=config)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/start.py", line 18, in OpenHGNN
flow = build_flow(args, trainerflow)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/__init__.py", line 46, in build_flow
return FLOW_REGISTRY[flow_name](args)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/kgcn_trainer.py", line 19, in __init__
super(KGCNTrainer, self).__init__(args)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/base_flow.py", line 48, in __init__
self.task = build_task(args)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/tasks/__init__.py", line 37, in build_task
return TASK_REGISTRY[args.task](args)
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/tasks/recommendation.py", line 13, in __init__
self.dataset = build_dataset(args.dataset, 'recommendation')
File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/dataset/__init__.py", line 75, in build_dataset
return DATASET_REGISTRY[_dataset](dataset, logger=kwargs['logger'])
KeyError: 'logger'

Cannot train with GPU

python main.py -m KGCN -d LastFM4KGCN -t recommendation -g 0 --use_best_config

RuntimeError: Tensor for argument #2 'mat1' is on CPU, but expected it to be on GPU (while checking arguments for addmm)

Confusion about the order of HetGNN embeddings

In x = self.model(blocks[0], input_features), the returned x is a dict. How does the embedding of each node_type in it correspond to the order of the input nodes of blocks[0]?

After checking, I found it is not the node order given by blocks[0].srcnodes[node_type].data[dgl.NID].
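
A hedged sketch, based on DGL's usual block convention (worth verifying for HetGNN specifically): the rows of each ntype's output align with the destination nodes of the last block, whose original IDs live in dstnodes, not srcnodes. A toy check:

import dgl
import torch as th

# Hypothetical toy heterograph, just to demonstrate the ID alignment.
g = dgl.heterograph({('a', 'to', 'p'): (th.tensor([0, 1, 2]), th.tensor([0, 0, 1]))})
sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
loader = dgl.dataloading.DataLoader(g, {'p': th.tensor([1, 0])}, sampler, batch_size=2)

for input_nodes, output_nodes, blocks in loader:
    # Original IDs of the output rows, in row order; matches output_nodes.
    print(blocks[-1].dstnodes['p'].data[dgl.NID])
    print(output_nodes)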

Failed to import embedding flows.

When running the 'embedding' task with the 'MAGNN' model, it raises the error 'Failed to import embedding flows.':

from openhgnn import Experiment

experiment = Experiment(model='MAGNN', dataset='acm4GTN', task='embedding', gpu=-1, lr=0.05,
                        hidden_dim=64, max_epoch=30, num_layers=3)
experiment.run()

SLiCE model

❓ Questions and Help

Hello! Thank you for the wonderful library.

I needed some help in running the code for the SLiCE model on any dataset for link prediction that you might have tested on. Can you please help me with the steps of doing the same?

Error when running fastGTN

I get an error when I run fastGTN using:
python main.py -m fastGTN -t node_classification -d acm4GTN -g 0 --use_best_config
The error is as follows:
AttributeError: 'Config' object has no attribute 'identity'
