diffusion-extensions's Issues

About the use of so3_lerp and so3_scale

Thanks for sharing your marvelous project!
I have some questions about the use of so3_lerp and so3_scale.
According to the paper, it seems that the following code should use so3_scale rather than so3_lerp toward the identity rotation:

mean = so3_lerp(self.identity, x_start, extract(self.sqrt_alphas_cumprod, t, x_start.shape))

Is there anything I missed or misunderstood? Or does that mean so3_lerp could approximate so3_scale? Your response would help a lot.
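One way to examine the question numerically is sketched below. It assumes that so3_lerp performs geodesic interpolation, R0 · exp(w · log(R0ᵀ R1)), and that so3_scale scales the axis-angle vector, exp(w · log(R)); the repository's actual implementations may differ. The names `lerp_sketch`, `scale_sketch`, `exp_so3`, and `log_so3` are hypothetical helpers written for this illustration, not functions from the project.

```python
import torch


def skew(v: torch.Tensor) -> torch.Tensor:
    """Cross-product matrix of a single 3-vector."""
    x, y, z = v.tolist()
    return torch.tensor([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]], dtype=v.dtype)


def exp_so3(v: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle vector -> rotation matrix."""
    theta = v.norm()
    eye = torch.eye(3, dtype=v.dtype)
    if theta < 1e-8:
        return eye
    K = skew(v / theta)
    return eye + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)


def log_so3(R: torch.Tensor) -> torch.Tensor:
    """Rotation matrix -> axis-angle vector (angles strictly between 0 and pi)."""
    cos_theta = ((torch.trace(R) - 1) / 2).clamp(-1.0, 1.0)
    theta = torch.acos(cos_theta)
    if theta < 1e-8:
        return torch.zeros(3, dtype=R.dtype)
    axis = torch.stack([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta * axis / (2 * torch.sin(theta))


def scale_sketch(R: torch.Tensor, w: float) -> torch.Tensor:
    """Assumed meaning of so3_scale: scale the rotation angle by w."""
    return exp_so3(w * log_so3(R))


def lerp_sketch(R0: torch.Tensor, R1: torch.Tensor, w: float) -> torch.Tensor:
    """Assumed meaning of so3_lerp: geodesic interpolation from R0 to R1."""
    return R0 @ exp_so3(w * log_so3(R0.T @ R1))


identity = torch.eye(3, dtype=torch.float64)
R = exp_so3(torch.tensor([0.3, -0.5, 0.8], dtype=torch.float64))
w = 0.37
same = torch.allclose(lerp_sketch(identity, R, w), scale_sketch(R, w), atol=1e-10)
```

Under these assumptions the two operations coincide when the starting point is the identity, because identityᵀ · R = R. Whether that holds for the repository's own so3_lerp depends on how it interpolates.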

About the conversion from axis-angle representation to skew-symmetric matrix

Hi, I have a question about the conversion from the axis-angle representation vector to a skew-symmetric matrix, as implemented here:

def vec2skew(vec: torch.Tensor) -> torch.Tensor:

In the above implementation, the skew-symmetric matrix $S(v)$ for $v = \left[ x, y, z \right]$ is defined as:

$$S(v) = \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix}$$

which is different from the definition in your ICLR paper (Denoising Diffusion Probabilistic Models on SO(3) for Rotational Alignment, https://openreview.net/forum?id=BY88eBbkpe5, page 3, six lines above Eq. 6).

[Image: screenshot of the definition from the paper]

Did I misunderstand something? Which definition should be used here for such conversion?
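For reference, the matrix written above is the usual cross-product convention, in which S(v) w = v × w. The sketch below checks that property for a batched version; `vec2skew_sketch` is a hypothetical re-implementation for illustration and is not the repository's vec2skew. It does not settle which convention the paper intends.

```python
import torch


def vec2skew_sketch(vec: torch.Tensor) -> torch.Tensor:
    """Map vectors of shape (..., 3) to skew-symmetric matrices of shape (..., 3, 3)."""
    x, y, z = vec.unbind(-1)
    zero = torch.zeros_like(x)
    return torch.stack(
        [
            torch.stack([zero, -z, y], dim=-1),
            torch.stack([z, zero, -x], dim=-1),
            torch.stack([-y, x, zero], dim=-1),
        ],
        dim=-2,
    )


v = torch.tensor([[1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]])
w = torch.tensor([[0.3, -1.0, 2.0], [1.0, 1.0, 1.0]])
S = vec2skew_sketch(v)

# With this sign convention, multiplying by S(v) equals the cross product with v.
matches_cross = torch.allclose((S @ w.unsqueeze(-1)).squeeze(-1), torch.cross(v, w, dim=-1))
# Skew-symmetry: the transpose is the negative.
is_skew = torch.allclose(S.transpose(-1, -2), -S)
```

Because S(v)ᵀ = -S(v) = S(-v), using the transposed matrix in the exponential map gives a rotation about the same axis by the opposite angle, so the two conventions are not interchangeable.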

About sampling axis-angles from isotropic Gaussian distributions

Hello, thanks for providing such a wonderful method and code that helped me improve our project.

However, while testing the open-sourced code, I found what may be a problem in the code that samples rotations in axis-angle form from an isotropic Gaussian distribution:

trap_start = torch.gather(self.trap, 0, idx_0[..., None])[..., 0]
trap_end = torch.gather(self.trap, 0, idx_1[..., None])[..., 0]

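In the code above, trap_start and trap_end should be the CDF entries just before and after the point where the CDF equals unif. However, because the index passed to torch.gather has a trailing dimension of size 1, both lookups appear to read the CDF of the first variance for every sample, rather than the CDF of each sample's own variance.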

I trained and tested the network on the task of generating $\pm90$-degree rotations about the z-axis and got the following result plot:

Figure_1.

The predicted rotation distribution differs greatly from the true distribution.

I changed this part of the code so that trap_start and trap_end are taken from the CDF of each sample's own variance, just before and after the point where that CDF equals unif. The core of the change is as follows:

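B = self.eps.shape[0]  # batch size: one variance per sample
bs = torch.arange(B, device=self.trap.device)  # column index of each sample
trap_start = self.trap[idx_0, bs]  # entry of sample b's own CDF at idx_0[b]
trap_end = self.trap[idx_1, bs]    # entry of sample b's own CDF at idx_1[b]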

In addition to this, I made several other changes to make the code runnable:

(1)
Original code:

R = process.p_sample(R, torch.full((1,), i, device=device, dtype=torch.long))

Modified code:

R = process.p_sample(R, torch.full((BATCH,), i, device=device, dtype=torch.long))

(2)
Original code:

sample = IsotropicGaussianSO3(model_stdev[0]).sample([b])

Modified code:

sample = IsotropicGaussianSO3(model_stdev).sample()

The result after modifying the code is shown below:

Figure_2

The predicted rotation distribution now closely matches the true distribution.

If I have misunderstood something, please point it out so that I can better understand your method. Thank you!
