Comments (5)
Hi Cameron,
Thank you for your interest in our work!
As you rightfully noted, some of our final chosen hyperparameters did not propagate to the public GitHub's run file, and this caused a bit of a discrepancy. Sorry for the inconvenience! We are already preparing a new commit to fix that.
In the meantime, I think the key hyperparameter to change from your setting is `hint_mode`, which should be `encoded_decoded_nodiff`. You already figured out the other important hyperparameter to change (`hint_teacher_forcing_noise`).
Further, you can think of both `pgn` and `pgn_mask` as PGNs. ("mask" is a hyperparameter for the PGN, masking out possible predictions for the edge targets so that they follow the graph's edges. Sometimes this is a perfect inductive bias, sometimes it is very wrong.) What we did in the paper is, per task, report the better result out of those two in the "PGN" column.
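To make the masking idea concrete, here is a minimal sketch of what restricting edge-target predictions to the graph's edges could look like. It is an illustration only, not the CLRS implementation: the function names `masked_argmax` and `unmasked_argmax`, the toy graph, and the scores are all assumptions made up for this example.

```python
import math


def unmasked_argmax(logits):
    """Pick, for each node, the highest-scoring target among all nodes."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def masked_argmax(logits, adjacency):
    """Pick, for each node, the highest-scoring target among its neighbours.

    Scores of non-neighbours are replaced by -inf, so they can never win.
    If a node has no neighbours at all, index 0 is returned for that row.
    """
    preds = []
    for row_logits, row_adj in zip(logits, adjacency):
        masked = [s if a else -math.inf for s, a in zip(row_logits, row_adj)]
        preds.append(max(range(len(masked)), key=masked.__getitem__))
    return preds


# Hypothetical 3-node path graph (0-1, 1-2) with self-loops.
adjacency = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
# Hypothetical per-node scores over candidate targets (illustrative values).
logits = [
    [0.1, 0.2, 0.9],
    [0.3, 0.1, 0.5],
    [0.8, 0.2, 0.4],
]

print(unmasked_argmax(logits))           # node 0 may point at node 2, a non-neighbour
print(masked_argmax(logits, adjacency))  # every prediction follows an existing edge
```

When the true targets always lie along edges, the mask removes impossible answers for free; when they do not, the mask rules out the correct answer, which is the "very wrong" case mentioned above.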
The mean reduction patch only affects processors that use the mean aggregation, which we never use in our official experiments, as the max aggregator was always superior.
I hope this is helpful. If you have any other issues, please don't hesitate to contact us.
Thanks,
Petar
from clrs.
To follow up on this, PR #94 integrates these hyperparameters into the main codebase.
from clrs.
Thank you for the quick response! I was able to replicate the paper results much more closely with the new specifications.
from clrs.
Hello, I just wanted to confirm: were the paper settings for GAT number of heads = 1 and head size = 128?
from clrs.
Hi Cameron, I am not completely sure at this time, but what we report as "GAT" is actually the maximum performance out of `gat`, `gat_full`, `gatv2`, and `gatv2_full`, and I think we also swept the number of heads over [1, 4, 8].
Basically, the best performance we were able to get out of all of these GAT variants, we reported as "GAT", due to a reduced amount of horizontal space.
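For anyone trying to reproduce a column reported this way, the per-task selection could be sketched roughly as follows. The helper `best_per_task`, the task names, and every score below are hypothetical placeholders, not results from the paper; only the variant names come from this thread.

```python
def best_per_task(scores_by_variant):
    """Return {task: (best_variant, best_score)} across all variants.

    `scores_by_variant` maps a variant name to a {task: score} dict.
    Ties keep the variant that was seen first.
    """
    best = {}
    for variant, task_scores in scores_by_variant.items():
        for task, score in task_scores.items():
            if task not in best or score > best[task][1]:
                best[task] = (variant, score)
    return best


# Placeholder scores, invented purely to show the selection logic.
placeholder_scores = {
    "gat": {"task_a": 0.50, "task_b": 0.70},
    "gat_full": {"task_a": 0.60, "task_b": 0.65},
    "gatv2": {"task_a": 0.55, "task_b": 0.72},
}

for task, (variant, score) in sorted(best_per_task(placeholder_scores).items()):
    print(f"{task}: {score:.2f} (from {variant})")
```

The same helper would apply to picking between `pgn` and `pgn_mask` per task. If the number of heads is swept as well, each (variant, heads) pair can be treated as its own entry in the input dict.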
from clrs.