Comments (4)
Hello,
All transformer layers take X, E and y as input. Even if the output dimension of y is eventually 0, y is still useful. The only thing that is not trained is mlp_out_y, which you can disable if you want.
For the regressor model in the conditional generation experiments, on the contrary, the output dimensions of X and E are 0, but the output dimension of y is 1 or 2.
Clement
from digress.
y is indeed not used for computing the loss. The input y to the transformer is the graph-level feature of the noisy_data, computed by compute_extra_data. The output y from the transformer is not used as input to the next denoising step.
@haoming-codes Yes, and as a result the network layers that use y are not updated during training.
The part of the network that transforms y in the last transformer layer (y_y, e_y, x_y) is also not being trained. But I see what you mean by y still being useful, since it at least feeds the time information into the other variables in the network. Thanks for clarifying!
Best,
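For anyone who wants to check this on their own setup, here is a minimal PyTorch sketch of the situation discussed in this thread. It is not the DiGress code: ToyLayer, ToyModel, params_without_grad and mlp_out_X are hypothetical names, the edge features E are left out for brevity, and only x_y, y_y and mlp_out_y are borrowed from the names mentioned above. It assumes a loss that ignores the y output and lists the parameters that receive no gradient.

```python
import torch
import torch.nn as nn


class ToyLayer(nn.Module):
    """Toy stand-in for one transformer layer (hypothetical, edges omitted)."""

    def __init__(self, dx: int, dy: int):
        super().__init__()
        self.x_x = nn.Linear(dx, dx)  # node features -> node features
        self.y_x = nn.Linear(dy, dx)  # global feature y conditions the nodes
        self.x_y = nn.Linear(dx, dy)  # pooled node features -> global feature
        self.y_y = nn.Linear(dy, dy)  # global feature -> global feature

    def forward(self, X, y):
        # X: (batch, nodes, dx), y: (batch, dy)
        new_X = self.x_x(X) + self.y_x(y).unsqueeze(1)
        new_y = self.y_y(y) + self.x_y(X.mean(dim=1))
        return new_X, new_y


class ToyModel(nn.Module):
    """Two toy layers followed by one output head per variable."""

    def __init__(self, dx: int = 4, dy: int = 3, out_x: int = 2, out_y: int = 1):
        super().__init__()
        self.layers = nn.ModuleList([ToyLayer(dx, dy) for _ in range(2)])
        self.mlp_out_X = nn.Linear(dx, out_x)
        self.mlp_out_y = nn.Linear(dy, out_y)

    def forward(self, X, y):
        for layer in self.layers:
            X, y = layer(X, y)
        return self.mlp_out_X(X), self.mlp_out_y(y)


def params_without_grad(model: nn.Module) -> list[str]:
    """Names of parameters that received no gradient in the last backward pass."""
    return sorted(n for n, p in model.named_parameters() if p.grad is None)


torch.manual_seed(0)
model = ToyModel()
X = torch.randn(5, 6, 4)  # 5 graphs, 6 nodes, 4 node features
y = torch.randn(5, 3)     # 5 graphs, 3 global features

pred_X, pred_y = model(X, y)
loss = pred_X.pow(2).mean()  # assumption: the loss ignores pred_y entirely
loss.backward()

untrained = params_without_grad(model)
print(untrained)
```

In this toy setup the parameters without gradients are exactly those of x_y and y_y in the last layer, plus mlp_out_y. The y path of the first layer does receive gradients, because its output still conditions X in the following layer.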