yuanzhi-zhu / prolific_dreamer2d
Unofficial implementation of 2D ProlificDreamer
Hi there,
Thank you for providing good reference code for ProlificDreamer for the community. May I ask what the idea behind args.nerf_init is for SDS latent-NeRF?
if args.nerf_init and args.rgb_as_latents and not args.use_mlp_particle:
    # current only support sds and experimental for only rgb_as_latents==True
    assert args.generation_mode == 'sds'
    with torch.no_grad():
        noise_pred = predict_noise0_diffuser(unet, particles, text_embeddings_vsd, t=999, guidance_scale=7.5, scheduler=scheduler)
        particles = scheduler.step(noise_pred, 999, particles).pred_original_sample
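For context on the snippet above: it treats the particles as a noisy sample at t=999 and replaces them with the scheduler's one-step estimate of the clean sample. A minimal sketch of that estimate in plain Python, assuming the standard forward process x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps and an epsilon-predicting model. The function name `predict_x0` and the toy values are illustrative and do not come from the repository.

```python
import math

def predict_x0(x_t, eps_pred, alpha_bar_t):
    """One-step estimate of the clean sample from a noisy sample and a noise prediction."""
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    scale_signal = math.sqrt(alpha_bar_t)
    return [(x - scale_noise * e) / scale_signal for x, e in zip(x_t, eps_pred)]

# Toy check: noise a known x0, then recover it using the true noise as the prediction.
x0 = [0.5, -1.0, 2.0]
eps = [0.1, 0.3, -0.2]
alpha_bar = 0.25  # hypothetical value; at t=999 it would be much closer to 0
x_t = [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e for a, e in zip(x0, eps)]
recovered = predict_x0(x_t, eps, alpha_bar)
```

When the noise prediction equals the true noise, the estimate returns x0 exactly. With a real model at t=999 the input is almost pure noise, so the result is only the model's rough guess at a clean latent.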
Can you provide a version of DeepFloyd IF? Thank you very much!
In this implementation of VSD, the unet is the same as unet_phi. However, in threestudio's implementation, the unets are different.
According to the paper, shouldn't the unets be different? The phi model is initialized the same as unet, but after optimization, they should be quite different.
When save_x0=True, the following code is used to compute x0:
pred_latents = scheduler.step(noise_pred-noise_pred_phi+noise, t, noisy_latents).pred_original_sample.to(dtype).clone().detach()
May I ask the rationale behind this formula? To my understanding, x0 would be computed as:
pred_latents = scheduler.step(noise_pred, t, noisy_latents).pred_original_sample.to(dtype).clone().detach()
The code I'm referring to is prolific_dreamer2d/prolific_dreamer2d.py, line 458 in e2ffc93.
What's the mathematical meaning of calculating noise_pred-noise_pred_phi+noise?
Thanks!
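One possible reading of noise_pred-noise_pred_phi+noise, offered as a sketch rather than the author's confirmed intent: assume noisy_latents was produced from the current latents x_0 and the sampled noise ε by the usual forward process, and write ε_pre for noise_pred and ε_φ for noise_pred_phi. Substituting the combined term into the one-step x0 estimate gives:

```latex
\begin{aligned}
x_t &= \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon \\
\hat{x}_0(\hat\epsilon) &= \frac{x_t - \sqrt{1-\bar\alpha_t}\,\hat\epsilon}{\sqrt{\bar\alpha_t}}
  = x_0 - \sqrt{\frac{1-\bar\alpha_t}{\bar\alpha_t}}\,\bigl(\hat\epsilon - \epsilon\bigr) \\
\hat\epsilon = \epsilon_{\text{pre}} - \epsilon_\phi + \epsilon
  \;&\Longrightarrow\;
  \hat{x}_0 = x_0 - \sqrt{\frac{1-\bar\alpha_t}{\bar\alpha_t}}\,\bigl(\epsilon_{\text{pre}} - \epsilon_\phi\bigr)
\end{aligned}
```

Under these assumptions the saved x0 is the current particle shifted along the negative of ε_pre − ε_φ, which is the VSD update direction up to the weighting w(t). In SDS the same role is played by ε_pre − ε, so plugging noise_pred alone into the estimate corresponds to the SDS case.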
Hi @yuanzhi-zhu,
thanks a lot for this awesome implementation of VSD!
I noticed that the optimization target (i.e. the "particle") is initialized as Gaussian white noise in latent space (see: https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/prolific_dreamer2d.py#L274), and the final image is decoded by the VAE of the diffusion model. Comments in the code suggest that a pure Gaussian in image space results in weird artifacts. I observed similar behavior after setting rgb_as_latents to false (see the attached figure).
I'm wondering what could be the reason for this? Is this some trick from the original paper?
Hello,
Thank you so much for your great work, it helps a lot in understanding the papers and concepts.
However, I am quite confused about the particle-based variational inference in ProlificDreamer and hoped to get some insight from your code, but I could not find any reference to it. I am not sure I understand it correctly, but it seems theta is represented by the "latent" at line 227? Does that mean particle-based variational inference is not included in this code?
Thanks again for the great work and I look forward to your reply :)
The version of diffusers in my environment is 0.14.0, and it does not contain the module diffusers.models.attention_processor. I also looked at the newest version, diffusers 0.17.0, and could not find this module there either.
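A quick way to confirm whether the module is visible to the interpreter actually being used is to ask the import system directly. This is a generic sketch using only the standard library; the helper name `has_module` is hypothetical, and the snippet makes no claim about which diffusers release ships the module.

```python
import importlib.util

def has_module(name):
    """Return True if `name` can be located by the import system, else False."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or is not a package.
        return False

# Check the module named in the report; the result depends on the local install.
print(has_module("diffusers.models.attention_processor"))
```

Running this inside the same environment that launches prolific_dreamer2d.py helps rule out the case where a different diffusers install is being picked up.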
First, thank you for the great implementation!
https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/model_utils.py#L242
Could you explain why the LoRA scale "0" depends on lora_v?
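For readers unfamiliar with the scale in question: in a generic LoRA layer, the output is the frozen base projection plus a scaled low-rank correction, so a scale of 0 reduces the layer to the base model. The sketch below illustrates only that generic behaviour in plain Python. All names and values are hypothetical, and it does not show how the linked line in model_utils.py ties the scale to lora_v.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_linear(x, W, A, B, scale):
    """Base output W x plus a scaled low-rank correction B (A x)."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

W = [[1.0, 2.0], [0.0, 1.0]]  # frozen base weight (2x2)
A = [[0.5, -0.5]]             # down-projection to rank 1
B = [[1.0], [2.0]]            # up-projection back to 2 dims
x = [3.0, 1.0]

base_only = lora_linear(x, W, A, B, scale=0.0)  # LoRA branch switched off
with_lora = lora_linear(x, W, A, B, scale=1.0)  # LoRA branch fully on
```

In a setup where the pretrained model and the phi model share base weights, evaluating with scale 0 versus a nonzero scale is one way to get both predictions from a single network. Whether that is the intent of the linked line is for the author to confirm.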
Hi, Thanks for uploading your work. I really enjoy this!
I have a question about your code.
In prolific_dreamer2d.py (https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/prolific_dreamer2d.py#L163), you use the DDIM scheduler.
Can I ask why you use the DDIM scheduler rather than DDPM or another one?
Thanks!
Thanks for your great implementation!
Hi,
Thanks for your awesome implementation. I have one question regarding the CFG guidance. Should this line instead be:
noise_pred = noise_pred_text + guidance_scale * (noise_pred_text - noise_pred_uncond)
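To make the comparison concrete, here is a small sketch contrasting the form proposed above with the convention commonly used in diffusers pipelines, which starts from the unconditional prediction. The function names and toy values are illustrative. The two forms differ only by a shift of one in the guidance scale.

```python
def cfg_standard(eps_uncond, eps_text, guidance_scale):
    """Common convention: uncond + s * (text - uncond)."""
    return [u + guidance_scale * (t - u) for u, t in zip(eps_uncond, eps_text)]

def cfg_proposed(eps_uncond, eps_text, guidance_scale):
    """Form proposed in the question: text + s * (text - uncond)."""
    return [t + guidance_scale * (t - u) for u, t in zip(eps_uncond, eps_text)]

eps_uncond = [0.0, 1.0]
eps_text = [1.0, 3.0]
s = 7.5

proposed = cfg_proposed(eps_uncond, eps_text, s)
# The proposed form with scale s matches the common form with scale s + 1.
standard_shifted = cfg_standard(eps_uncond, eps_text, s + 1.0)
```

So under the proposed form, a guidance_scale of 7.5 would behave like 8.5 under the common convention, and a scale of 0 would return the text-conditioned prediction rather than the unconditional one.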
Thanks for your nice work. I want to ask: what does "2D" mean?
I want to try simply replacing DreamFusion's loss with the VSD loss. Can I use the code in this project directly? Since this is a 2D ProlificDreamer, do I need to make any modifications? Please give me some guidance.