microsoft / ptgnn

A PyTorch Graph Neural Network Library

License: MIT License

Dockerfile 0.25% Python 99.75%

graph-neural-networks deep-learning pytorch geometric-deep-learning gnn

ptgnn's Issues

Docker is not building

The provided Dockerfile does not build.

Cannot get high accuracy when using create_ggnn_mp_layers

Hello,

Thanks for this great repo. I am using it to experiment with varmisuse, and I have a question about it. The implementation provides two different layer constructors in ptgnn/ptgnn/implementations/varmisuse/train.py (at commit ef13a9f): create_mlp_mp_layers(num_edges: int), defined at line 42, and create_ggnn_mp_layers(num_edges: int), defined at line 76. The GGNN model is built with create_mlp_mp_layers (line 114: message_passing_layer_creator=create_mlp_mp_layers,).

As I understand it, MLP layers are just fully connected layers with no message passing for graph learning, so I replaced create_mlp_mp_layers with create_ggnn_mp_layers. The results are not promising: only 72.50 test accuracy on the same split as in #1, whereas the MLP layers give a much higher 81.13 test accuracy. Something seems to be wrong, but I cannot figure out what.

Best wishes.
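
A note on terminology that may help with the question above: in the general message-passing formulation, "MLP" and "GGNN" variants differ in how messages and node updates are computed, not in whether neighbours exchange information. The sketch below is not ptgnn code. It is a minimal pure-Python illustration in which all names (message_passing_step, message_fns, update_fn) are hypothetical, and the one-dimensional functions are stand-ins for a learned MLP or gated unit. Whether create_mlp_mp_layers follows this pattern should be confirmed in train.py at the lines cited in the issue.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    node_states: List[Vector],
    edges_by_type: Dict[str, List[Tuple[int, int]]],
    message_fns: Dict[str, Callable[[Vector], Vector]],
    update_fn: Callable[[Vector, Vector], Vector],
) -> List[Vector]:
    """One round of message passing.

    Each (src, dst) edge sends message_fns[edge_type](state of src) to dst.
    Messages arriving at a node are summed, then update_fn combines the old
    state with the summed message to give the new state.
    """
    dim = len(node_states[0])
    incoming = [[0.0] * dim for _ in node_states]
    for edge_type, edges in edges_by_type.items():
        message_fn = message_fns[edge_type]
        for src, dst in edges:
            message = message_fn(node_states[src])
            incoming[dst] = [a + b for a, b in zip(incoming[dst], message)]
    return [update_fn(h, m) for h, m in zip(node_states, incoming)]


# Toy graph: three nodes in a chain, one hypothetical edge type.
states = [[1.0], [2.0], [3.0]]
edges = {"NextToken": [(0, 1), (1, 2)]}

# Stand-in for a per-edge-type MLP: scale by 2, then ReLU.
message_fns = {"NextToken": lambda h: [max(0.0, 2.0 * x) for x in h]}

# Stand-in for the update: a residual sum. A gated variant would use a
# GRU-style cell here instead; the exchange of messages is the same.
new_states = message_passing_step(
    states, edges, message_fns, lambda h, m: [a + b for a, b in zip(h, m)]
)
print(new_states)  # [[1.0], [4.0], [7.0]]
```

In this picture, swapping the message function or the update function changes the layer type while the graph structure is still used in both cases, so a gap in accuracy between the two variants would more likely come from hyperparameters or layer configuration than from one variant ignoring the edges. That is a hypothesis, not a diagnosis of the numbers reported above.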

Training the model for varmisuse task

Hey! I tried to run training of the varmisuse model in order to explore how it works on data from unseen projects. I have a few questions regarding it:

The dataset format seems to have changed compared to the published version of the data. I found the following issue in another repository, but unfortunately only after I had already reorganized the data myself: I converted the json files into jsonlines and changed the structure from project/{train|test|valid}/files to {train|test|valid}/files. It would be nice to either add the reorganizing script to this repo or link to that issue in the README.
After reorganizing the data, I tried to run training with the default settings (minibatch size = 300) on an instance with 94 GB RAM and 48 CPUs. The instance has no GPU because I wanted to measure memory usage first and allocate a suitable GPU instance afterward. Unfortunately, training fails with an OOM error: it quickly uses all 94 GB and asks for more. I also tried a smaller version of the dataset with only 1 project each in train/validation/test, and it didn't really help: with a minibatch size of 100 and a single project in the train part I still got OOM. Is this expected behavior?
Which instance do you recommend for training the model? In particular, how much RAM do I need, and how long does training take on, say, a V100?
Do you have a pre-trained model that you can share? Maybe I can avoid training altogether and just run the already trained model on different data.

Thanks a lot in advance and thanks for great projects and papers!
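
The reorganization described in the first question can be sketched roughly as follows. This is not the script from the linked issue. It assumes, hypothetically, that every input file is an uncompressed .json file holding a JSON array of graph samples, and the function name reorganize and the output naming scheme are made up for illustration.

```python
import json
from pathlib import Path

SPLITS = ("train", "valid", "test")


def reorganize(src_root: Path, dst_root: Path) -> int:
    """Convert src_root/<project>/<split>/*.json into dst_root/<split>/*.jsonl.

    Returns the number of output files written.
    """
    written = 0
    for project_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        for split in SPLITS:
            split_dir = project_dir / split
            if not split_dir.is_dir():
                continue
            out_dir = dst_root / split
            out_dir.mkdir(parents=True, exist_ok=True)
            for json_file in sorted(split_dir.glob("*.json")):
                with json_file.open(encoding="utf-8") as f:
                    samples = json.load(f)  # assumption: a JSON array of samples
                # Prefix with the project name so that files from different
                # projects cannot overwrite each other in the flat layout.
                out_file = out_dir / f"{project_dir.name}-{json_file.stem}.jsonl"
                with out_file.open("w", encoding="utf-8") as f:
                    for sample in samples:
                        f.write(json.dumps(sample) + "\n")  # one sample per line
                written += 1
    return written
```

It would be called as reorganize(Path("original_data"), Path("reorganized_data")), with both paths being placeholders. If the files are gzip-compressed, gzip.open in text mode can replace the plain open calls. Note that json.load reads a whole file into memory, which is fine for a one-off conversion but is a separate concern from the OOM seen during training.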

Add conda recipe

Describe the new feature:

It would be great to have a conda recipe so that it can be included with projects that have more complicated build processes (for example, using libraries that need C/C++ compilers).

What is the current outcome?

Have a recipe for ptgnn on the conda-forge channel.

Is it backward-compatible?

Yes, and it would be forward-compatible as well, because the conda bots can automatically fetch new sdists pushed to pypi.

Cannot run on the varmisuse task

Hi,

Thanks for this wonderful work; it is really helpful. However, when I try the varmisuse task, it does not run correctly, even at the first step. May I ask for help?

Thanks

How to obtain the raw function

Hi Miltos,

Thanks for this great project. While experimenting with the VarMisuse task on the released data at https://www.microsoft.com/en-us/download/details.aspx?id=56844, I ran into a problem.
I want to recover the original function from the graphs, but it seems that the functions themselves were not released. I tried to reconstruct the function by following the NextToken edges in the graph, but that failed. It seems that neither the graph's entry node nor the node with index 1 is the beginning of the function, and following the NextToken edges does not yield a complete function; see the following sample.

The filename of this sample is 'test\Nancy.Tests\Unit\Bootstrapper\Base\BootstrapperBaseFixtureBase.cs'.
So may I ask for some advice on how to get the original functions?

Thanks
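
One way to approach the NextToken traversal described above is to avoid assuming any particular start index and instead begin at every token that has no incoming NextToken edge. The sketch below assumes the node labels and NextToken edges have already been extracted from a sample into a dict and a list of (source, target) pairs, and that each token has at most one successor. The function name token_sequences and the toy data are hypothetical and do not come from the released dataset.

```python
from typing import Dict, List, Tuple


def token_sequences(
    node_labels: Dict[int, str],
    next_token_edges: List[Tuple[int, int]],
) -> List[List[str]]:
    """Follow NextToken edges from every token that has no predecessor."""
    successor = dict(next_token_edges)  # assumes at most one successor per token
    targets = {dst for _, dst in next_token_edges}
    # A chain starts at a node that is the source of an edge but never a target.
    starts = sorted(src for src in successor if src not in targets)

    sequences = []
    for start in starts:
        tokens: List[str] = []
        seen = set()
        node = start
        while node is not None and node not in seen:  # guard against cycles
            seen.add(node)
            tokens.append(node_labels[node])
            node = successor.get(node)
        sequences.append(tokens)
    return sequences


# Toy example in which the first token is node 5, not node 1.
labels = {1: "x", 2: "=", 3: "1", 4: ";", 5: "int"}
edges = [(5, 1), (1, 2), (2, 3), (3, 4)]
print(token_sequences(labels, edges))  # [['int', 'x', '=', '1', ';']]
```

If this returns several sequences, the token chain is broken in the graph and the pieces would have to be ordered by some other means. In any case, a traversal like this can only recover the token labels stored in the graph, so the original whitespace, comments, and formatting of the source file would not be recoverable this way.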
