prolific_dreamer2d's People

Contributors

yuanzhi-zhu

prolific_dreamer2d's Issues

Intuition behind `nerf_init`?

Hi there,

Thank you for providing good reference code for ProlificDreamer for the community. May I ask what the idea behind args.nerf_init is for SDS latent-NeRF?

    if args.nerf_init and args.rgb_as_latents and not args.use_mlp_particle:
        # current only support sds and experimental for only rgb_as_latents==True
        assert args.generation_mode == 'sds'
        with torch.no_grad():
            noise_pred = predict_noise0_diffuser(unet, particles, text_embeddings_vsd, t=999, guidance_scale=7.5, scheduler=scheduler)
        particles = scheduler.step(noise_pred, 999, particles).pred_original_sample
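As a reading aid: the snippet above appears to take a single denoising step from the noisiest timestep, predicting the noise at t=999 and converting it into an estimate of the clean latent. For an epsilon-prediction scheduler, `pred_original_sample` follows the standard DDPM relation x0 = (x_t - sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_bar_t). The sketch below illustrates that relation on plain Python lists; the helper name `predict_x0` and the `alpha_bar` value are illustrative assumptions, not taken from the repository.

```python
import math

def predict_x0(x_t, eps_pred, alpha_bar_t):
    """Estimate the clean sample from a noisy one (epsilon-prediction DDPM relation)."""
    sqrt_ab = math.sqrt(alpha_bar_t)
    sqrt_one_minus_ab = math.sqrt(1.0 - alpha_bar_t)
    return [(x - sqrt_one_minus_ab * e) / sqrt_ab for x, e in zip(x_t, eps_pred)]

# Toy forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise
alpha_bar = 0.0047  # illustrative small value, as at a very noisy timestep
x0 = [0.5, -1.0, 2.0]
noise = [0.3, -0.7, 1.1]
x_t = [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * n
       for a, n in zip(x0, noise)]

# With a perfect noise prediction, the clean sample is recovered.
recovered = predict_x0(x_t, noise, alpha_bar)
print(recovered)
```

In the issue's snippet the noise prediction comes from the UNet rather than from the true noise, so the result is only an estimate of a clean latent.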

DeepFloyd IF

Can you provide a version of DeepFloyd IF? Thank you very much!

Question: About the unet and unet_phi

In this implementation of VSD, the unet is the same as unet_phi. However, in threestudio's implementation, the unets are different.

According to the paper, shouldn't the unets be different? The phi model is initialized the same as unet, but after optimization, they should be quite different.

Question: About the calculation of x0

When save_x0=True, the following code is used to compute x0.

pred_latents = scheduler.step(noise_pred-noise_pred_phi+noise, t, noisy_latents).pred_original_sample.to(dtype).clone().detach()

May I ask the rationale behind this formula? To my understanding, to compute $\hat{x}_0$ via the pretrained model, one should use:

pred_latents = scheduler.step(noise_pred, t, noisy_latents).pred_original_sample.to(dtype).clone().detach()

What is the mathematical meaning of noise_pred-noise_pred_phi+noise?
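One possible reading, which the author does not confirm in this thread: SDS uses the update direction `noise_pred - noise`, while VSD uses `noise_pred - noise_pred_phi`. Feeding `noise_pred - noise_pred_phi + noise` into the x0 formula makes "effective prediction minus noise" equal to the VSD direction, so the resulting x0 estimate equals the current latents minus a scaled VSD direction. The scalar sketch below checks that identity under the epsilon-prediction relation; all numeric values and the helper `predict_x0` are illustrative assumptions.

```python
import math

def predict_x0(x_t, eps, alpha_bar_t):
    """Epsilon-prediction relation: x0 = (x_t - sqrt(1 - a) * eps) / sqrt(a)."""
    return (x_t - math.sqrt(1.0 - alpha_bar_t) * eps) / math.sqrt(alpha_bar_t)

alpha_bar = 0.5        # illustrative cumulative alpha at timestep t
latents = 1.2          # current particle (scalar stand-in)
noise = 0.4            # noise sampled for the forward process
noise_pred = 0.9       # stand-in for the pretrained model's prediction
noise_pred_phi = 0.6   # stand-in for the phi model's prediction

# Forward process, as used to build noisy_latents
noisy_latents = math.sqrt(alpha_bar) * latents + math.sqrt(1.0 - alpha_bar) * noise

# Effective prediction from the issue
eps_eff = noise_pred - noise_pred_phi + noise
x0_hat = predict_x0(noisy_latents, eps_eff, alpha_bar)

# Same quantity written as "latents minus scaled VSD direction"
scale = math.sqrt(1.0 - alpha_bar) / math.sqrt(alpha_bar)
expected = latents - scale * (noise_pred - noise_pred_phi)
print(x0_hat, expected)
```

Under this reading, the saved x0 visualizes where the VSD update is pushing the particle rather than the pretrained model's own denoised estimate.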

Thanks!

Strange artifacts after setting rgb_as_latents to false

Hi @yuanzhi-zhu,

thanks a lot for this awesome implementation of VSD!

I noticed that the optimization target (i.e. the “particle”) is initialized as Gaussian white noise in latent space (see: https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/prolific_dreamer2d.py#L274), and the final image is decoded by the VAE of the diffusion model. Comments in the code suggest that pure Gaussian noise in image space results in weird artifacts. I observed similar behavior after setting rgb_as_latents to false (see the attached figure).

[Figure: final_image_a_photograph_of_an_astronaut_riding_a_horse]

I'm wondering what could be the reason for this? Is this some trick from the original paper?

particle-based variational inference

Hello,

Thank you so much for your great work, it helps a lot in understanding the papers and concepts.

However, I am quite confused about the particle-based variational inference in ProlificDreamer. I hoped to get some insight from your code, but I could not find any reference to it. I am not sure I understand it correctly, but it seems theta is represented by the "latent" at line 227? Does that mean particle-based variational inference is not included in this code?

Thanks again for the great work and I look forward to your reply :)

What is the version of diffusers?

The version of diffusers in my environment is 0.14.0, and it does not contain the module diffusers.models.attention_processor. I also looked at the newest version, diffusers 0.17.0, and could not find this module there either.
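A quick way to check what is actually installed, independent of any particular release notes, is to ask Python directly. The sketch below uses only the standard library; the helper names `has_module` and `installed_version` are hypothetical, and the printed results depend entirely on the local environment.

```python
import importlib.util
from importlib import metadata

def has_module(dotted_name):
    """Return True if the dotted module path can be located in this environment."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of the dotted path is itself missing.
        return False

def installed_version(package):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print("diffusers version:", installed_version("diffusers"))
print("attention_processor available:",
      has_module("diffusers.models.attention_processor"))
```

If the second line prints `False`, the installed diffusers release does not ship that module and a different version is needed.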

Classifier-free guidance

Hi,

Thanks for your awesome implementation. I have one question regarding classifier-free guidance (CFG). Should this line be

noise_pred = noise_pred_text + guidance_scale * (noise_pred_text - noise_pred_uncond)
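For context, two conventions for classifier-free guidance are in common use, and they differ only by a shift of one in the guidance scale: anchoring on the text-conditioned prediction with scale s gives the same result as anchoring on the unconditional prediction with scale s + 1. The sketch below checks this on scalars; the function names are illustrative, and which convention the repository uses is not shown in this thread.

```python
def cfg_from_uncond(eps_uncond, eps_text, scale):
    """Guidance anchored on the unconditional prediction."""
    return eps_uncond + scale * (eps_text - eps_uncond)

def cfg_from_text(eps_uncond, eps_text, scale):
    """Guidance anchored on the text-conditioned prediction (form in the question)."""
    return eps_text + scale * (eps_text - eps_uncond)

eps_uncond, eps_text = 0.2, 0.5  # scalar stand-ins for the two noise predictions

a = cfg_from_text(eps_uncond, eps_text, 6.5)
b = cfg_from_uncond(eps_uncond, eps_text, 7.5)
print(a, b)  # the two values agree up to floating-point rounding
```

So a guidance scale of 7.5 in one convention corresponds to 6.5 in the other; the two formulas are not interchangeable at the same scale value.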

replacement of sds loss

I would like to try simply replacing DreamFusion's SDS loss with the VSD loss. Can I use the code in this project directly? Since this is a 2D ProlificDreamer, do I need to make any modifications? Any guidance would be appreciated.
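At the level of the update direction, the difference between the two losses is small: SDS subtracts the sampled noise from the pretrained prediction, while VSD subtracts the prediction of the fine-tuned phi model instead. The sketch below shows only that switch on scalars. It is a simplified illustration under stated assumptions: the timestep weighting w(t) is omitted, training the phi model is not shown, and the function and mode names are hypothetical rather than the repository's API.

```python
def distillation_direction(mode, noise_pred, noise, noise_pred_phi=None):
    """Return the unweighted update direction for SDS or VSD (scalar stand-ins)."""
    if mode == "sds":
        # SDS: pretrained prediction minus the noise that was actually added
        return noise_pred - noise
    if mode == "vsd":
        # VSD: pretrained prediction minus the phi model's prediction
        if noise_pred_phi is None:
            raise ValueError("vsd mode needs noise_pred_phi")
        return noise_pred - noise_pred_phi
    raise ValueError(f"unknown mode: {mode}")

sds_dir = distillation_direction("sds", noise_pred=0.9, noise=0.4)
vsd_dir = distillation_direction("vsd", noise_pred=0.9, noise=0.4, noise_pred_phi=0.6)
print(sds_dir, vsd_dir)
```

Moving from SDS to VSD in a 3D pipeline therefore also requires adding and training the phi model, which this sketch leaves out.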
