ide-3d's Introduction

IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis

ACM Transactions on Graphics (SIGGRAPH Asia 2022)

Teaser image

arXiv

IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis
Jingxiang Sun, Xuan Wang, Yichun Shi, Lizhen Wang, Jue Wang, Yebin Liu


https://mrtornado24.github.io/IDE-3D/

Abstract: Existing 3D-aware facial generation methods face a dilemma in quality versus editability: they either generate editable results in low resolution, or high-quality ones with no editing flexibility. In this work, we propose a new approach that brings the best of both worlds together. Our system consists of three major components: (1) a 3D-semantics-aware generative model that produces view-consistent, disentangled face images and semantic masks; (2) a hybrid GAN inversion approach that initializes the latent codes from the semantic and texture encoder and further optimizes them for faithful reconstruction; and (3) a canonical editor that enables efficient manipulation of semantic masks in canonical view and produces high-quality editing results. Our approach is competent for many applications, e.g. free-view face drawing, editing and style control. Both quantitative and qualitative results show that our method reaches the state of the art in terms of photorealism, faithfulness and efficiency.

Installation

  • git clone --recursive https://github.com/MrTornado24/IDE-3D.git
  • cd IDE-3D
  • conda env create -f environment.yml
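
After the environment is created, activate it before running any of the commands below (the environment name ide3d is taken from environment.yml; adjust if yours differs):

conda activate ide3d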

Getting started

Please download our pre-trained checkpoints from link and put them under pretrained_models/. The link mainly contains the pretrained generator ide3d-ffhq-64-512.pkl and the style encoder encoder-base-hybrid.pkl. More pretrained models will be released soon.
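
After downloading, the folder should contain at least the two checkpoints referenced by the commands below:

pretrained_models/
├── ide3d-ffhq-64-512.pkl
└── encoder-base-hybrid.pkl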

Semantic-aware image synthesis

# Generate videos using pre-trained model

python gen_videos.py --outdir=out --trunc=0.7 --seeds=0-3 --grid=2x2 \
    --network=pretrained_models/ide3d-ffhq-64-512.pkl --interpolate 1 --image_mode image_seg

# Generate the same 4 seeds in an interpolation sequence

python gen_videos.py --outdir=out --trunc=0.7 --seeds=0-3 --grid=1x1 \
    --network=pretrained_models/ide3d-ffhq-64-512.pkl --interpolate 1 --image_mode image_seg
# Generate images using pre-trained model

python gen_images.py --outdir=out --trunc=0.7 --seeds=0-3 \
    --network=pretrained_models/ide3d-ffhq-64-512.pkl
# Extract shapes (saved as .mrc and .npy) using pre-trained model

python extract_shapes.py --outdir out --trunc 0.7 --seeds 0-3 \
    --network pretrained_models/ide3d-ffhq-64-512.pkl --cube_size 1
    
# Render meshes to video

python render_mesh.py --fname out/0.npy --outdir out

We visualize our .mrc shape files with UCSF ChimeraX. Please refer to EG3D for detailed instructions on ChimeraX.

Interactive editing

UI

We provide an interactive tool that can be used for 3D-aware face drawing and editing in real time. Before using it, please install the required packages with pip install -r ./Painter/requirements.txt.

python Painter/run_ui.py \
    --g_ckpt pretrained_models/ide3d-ffhq-64-512.pkl \
    --e_ckpt pretrained_models/encoder-base-hybrid.pkl

Preparing datasets

FFHQ: Download and process the Flickr-Faces-HQ dataset following EG3D. Then, parse semantic masks for all processed images using a pretrained parsing model. You can download dataset.json for FFHQ here. The processed data should be organized as follows:

    ├── /path/to/dataset
    │   ├── masks512x512
    │   ├── maskscolor512x512
    │   ├── images512x512
    │   │   ├── 00000
    │   │   │   ├── img00000000.png
    │   │   ├── ...
    │   │   ├── dataset.json
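
If you need to verify dataset.json, it appears to follow the EG3D convention: a top-level "labels" list of [filename, camera label] pairs, where each camera label is 25 floats (16 extrinsic + 9 intrinsic values). Below is a minimal inspection sketch; the key names and label length are assumptions based on that convention.

import json

# Assumed EG3D-style label file: {"labels": [["00000/img00000000.png", [...25 floats...]], ...]}
with open('/path/to/dataset/images512x512/dataset.json') as f:
    labels = json.load(f)['labels']

print(len(labels), 'labeled images')
fname, cam = labels[0]
print(fname, len(cam))  # expect an image path and a 25-dim camera label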

Custom dataset: You can process your own dataset using the following commands. It would be useful for real portrait image editing.

cd dataset_preprocessing/ffhq
python preprocess_in_the_wild.py --indir=INPUT_IMAGE_FOLDER

Real portrait image editing

IDE-3D supports 3D-aware real portrait image editing using our interactive tool. Please run the following commands:

# infer latent code as initialization

python apps/infer_hybrid_encoder.py \
    --target_img /path/to/img_0.png \
    --g_ckpt pretrained_models/ide3d-ffhq-64-512.pkl \
    --e_ckpt pretrained_models/encoder-base-hybrid.pkl \
    --outdir out

The above command will save rec_ws.pt under out/img_0.
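
If you want to sanity-check the inverted code before running PTI, it can be loaded with PyTorch. This is a minimal sketch; the exact tensor layout is an assumption (a W+ code is typically of shape [1, num_ws, 512]):

import torch

ws = torch.load('out/img_0/rec_ws.pt')  # latent code predicted by the hybrid encoder
print(type(ws))
if torch.is_tensor(ws):
    print(ws.shape)  # assumed to be a W+ code, e.g. [1, num_ws, 512]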

# run pti

python inversion/scripts/run_pti.py \
    --run_name ide3d_plus_initial_code \
    --projector_type ide3d_plus \
    --pivotal_tuning \
    --viz_image \
    --viz_mesh \
    --viz_video \
    --label_path /path/to/dataset.json \
    --image_name img_0 \
    --initial_w out/img_0/rec_ws.pt

We adopt PTI for 3D inversion. Before running, please place the images into examples/. You can pass ide3d_plus or ide3d to --projector_type to choose between the 'w+' and 'w' inversion types. The --initial_w flag specifies the latent code obtained from the previous step; it helps recover a more reasonable shape, especially for images with steep viewing angles. The command will return the pose label label.pt, the reconstructed latent code latent.pt, the fine-tuned generator, and some visualizations.
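
For reference, below is a rough sketch of re-rendering the reconstruction (or a novel view, by feeding a different camera label) from these outputs. It assumes the StyleGAN3/EG3D pickle convention (a 'G_ema' entry and a synthesis(ws, c) call returning a dict with an 'image' key) and should be run from the repo root so that dnnlib and torch_utils are importable. The fine-tuned generator saved by run_pti.py may use a different format, so treat this only as a starting point and check gen_images.py for the authoritative usage.

import pickle
import torch

device = torch.device('cuda')

# Assumed StyleGAN3/EG3D-style network pickle containing a 'G_ema' module.
with open('pretrained_models/ide3d-ffhq-64-512.pkl', 'rb') as f:
    G = pickle.load(f)['G_ema'].to(device).eval()

ws = torch.load('/path/to/latent.pt').to(device)  # reconstructed latent code from PTI
c = torch.load('/path/to/label.pt').to(device)    # 25-dim pose label from PTI

with torch.no_grad():
    out = G.synthesis(ws, c)                      # assumed signature, mirroring EG3D
img = out['image'] if isinstance(out, dict) else out
print(img.shape)                                  # e.g. [1, 3, 512, 512]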

# (optional) finetune encoder

python apps/finetune_hybrid_encoder.py \
    --target_img /path/to/img_0.png \
    --target_code /path/to/latent.pt \
    --target_label /path/to/label.pt \
    --g_ckpt /path/to/finetuned_generator.pt \
    --e_ckpt pretrained_models/encoder-base-hybrid.pkl \
    --outdir out \
    --max-steps 1000

This step aligns the shapes reconstructed by the encoder and by PTI. The fine-tuned encoder will be saved as finetuned_encoder.pkl. In addition, a semantic mask mask.png will be saved in the same folder.

# run UI

python Painter/run_ui.py \
    --g_ckpt /path/to/finetuned_generator.pt \
    --e_ckpt /path/to/finetuned_encoder.pkl \
    --target_code /path/to/latent.pt \
    --target_label /path/to/label.pt \
    --inversion

Note that you should click Open Image and load the mask.png returned in the previous step.

3D-aware CLIP-guided domain adaptation

Please obtain the adapted generators following IDE3D-NADA. You can perform interactive editing in other domains by simply replacing the original generator with the adapted one:

python Painter/run_ui.py \
    --g_ckpt /path/to/adapted_generator.pt \
    --e_ckpt pretrained_models/encoder-base-hybrid.pkl

Semantic-guided style animation

Teaser image

IDE-3D supports animating stylized virtual faces through semantic masks. Please process a video clip and prepare a dataset.json. Then run:

python apps/infer_face_animation.py \
    --drive_root /path/to/images \
    --network pretrained_models/ide3d-ffhq-64-512.pkl \
    --encoder pretrained_models/encoder-base-hybrid.pkl \
    --grid 4x1 \
    --seeds 52,197,229 \
    --outdir out
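
If you are starting from a raw video clip, one possible way to prepare the driving frames is to dump them with ffmpeg and reuse the custom-dataset preprocessing above (paths and frame rate below are placeholders):

ffmpeg -i /path/to/clip.mp4 -vf fps=25 /path/to/frames/%06d.png
cd dataset_preprocessing/ffhq
python preprocess_in_the_wild.py --indir=/path/to/frames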

Training

Training scripts will be released soon.

Acknowledgments

Parts of the code are borrowed from StyleGAN3, PTI, EG3D and StyleGAN-nada.

Citation

If you use this code for your research, please cite the following works:


@article{sun2022ide,
  title = {IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis},
  author = {Sun, Jingxiang and Wang, Xuan and Shi, Yichun and Wang, Lizhen and Wang, Jue and Liu, Yebin},
  journal = {ACM Transactions on Graphics (TOG)},
  volume = {41},
  number = {6},
  articleno = {270},
  pages = {1--10},
  year = {2022},
  publisher = {ACM New York, NY, USA},
  doi = {10.1145/3550454.3555506},
}

@inproceedings{sun2022fenerf,
  title={Fenerf: Face editing in neural radiance fields},
  author={Sun, Jingxiang and Wang, Xuan and Zhang, Yong and Li, Xiaoyu and Zhang, Qi and Liu, Yebin and Wang, Jue},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7672--7682},
  year={2022}
}

ide-3d's People

Contributors

mrtornado24


ide-3d's Issues

Question about dataset.json for inversion

Hi there,

In apps/infer_hybrid_encoder.py, Line 141 reads fname = 'D:/projects/eg3d/data/FFHQ/images512x512/dataset.json' # label list path. Should we prepare dataset.json by ourselves? Is it possible you could share this file? Thank you so much in advance!

Painter/run_UI.py

D:/projects/IDE-3D/data/ffhq/images512x512/dataset.json
Can you please share this dataset.json file? When I try to run the run_UI.py script, it causes an error.

Pretrained checkpoints.

I see "More pretrained models will be released soon." Could you please tell me when the pretrained checkpoints will be released?

Request for network implementation details.

This work is quite interesting and amazing.
I want to learn more details to better understand how it works, and I'm looking forward to the network scripts.
Thanks a lot!

Export 3d object

The project looks great.
Is it possible to export a mesh, like an .obj or .ply?

thanks a lot

Pose for the real image inversion

Hi there, sorry for too many questions these days!

It looks like apps/infer_hybrid_encoder.py just reads the camera pose from dataset.json. If we want to invert a real image, should we get the pose using an off-the-shelf pose estimator? Thanks!

Error while running "python render_mesh.py --fname out/0.npy --outdir out": pyglet.gl.ContextException: Could not create GL context

(pytorch3d) student@CL502-07://mnt/c/Users/sit/Documents/IDE-3D$ Error
0%| | 0/240 [00:00<?, ?it/s]libGL error: MESA-LOADER: failed to open swrast: /home/student/anaconda3/envs/pytorch3d/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /lib/x86_64-linux-gnu/libLLVM-15.so.1) (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: swrast
0%| | 0/240 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/c/Users/sit/Documents/IDE-3D/render_mesh.py", line 85, in <module>
    render()
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/mnt/c/Users/sit/Documents/IDE-3D/render_mesh.py", line 60, in render
    r = pyrender.OffscreenRenderer(512, 512)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyrender/offscreen.py", line 149, in _create
    self._platform.init_context()
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyrender/platforms/pyglet_platform.py", line 50, in init_context
    self._window = pyglet.window.Window(config=conf, visible=False,
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyglet/window/xlib/__init__.py", line 133, in __init__
    super(XlibWindow, self).__init__(*args, **kwargs)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyglet/window/__init__.py", line 538, in __init__
    context = config.create_context(gl.current_context)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyglet/gl/xlib.py", line 105, in create_context
    return XlibContext(self, share)
  File "/home/student/anaconda3/envs/pytorch3d/lib/python3.9/site-packages/pyglet/gl/xlib.py", line 127, in __init__
    raise gl.ContextException('Could not create GL context')
pyglet.gl.ContextException: Could not create GL context

buggy codebase

The paper and results seem promising, but the code is too buggy to run. The messy imports are a nightmare.

Here are some tips for those who want to run this code:

  • Install PyTorch3D following the official instructions instead of using the install script here.
  • Fix the hard-coded sys paths.
  • Fix all kinds of import errors.

How to edit my own images

I noticed the advice in the section "Real portrait image editing", where the following command should be executed first:

python apps/infer_hybrid_encoder.py 
    --target_img /path/to/img_0.png
    --g_ckpt pretrained_models/ide3d-ffhq-64-512.pkl 
    --e_ckpt pretrained_models/encoder-base-hybrid.pkl
    --outdir out

But my image does not have a label like those in dataset.json, so an error occurred:

File "/home/dianxin/hys/IDE-3D/apps/infer_hybrid_encoder.py", line 147, in
c = [label_list[opts.target_img[-21:]]]
KeyError: 'test.png'

How can I generate the labels for my own images? Thank you!

Error during 'conda env create -f environment.yml'

command

git clone --recursive https://github.com/MrTornado24/IDE-3D.git
cd IDE-3D
conda env create -f environment.yml

result

(base) C:\t\IDE-3D>git clone --recursive https://github.com/MrTornado24/IDE-3D.git
Cloning into 'IDE-3D'...
remote: Enumerating objects: 320, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 320 (delta 0), reused 0 (delta 0), pack-reused 317
Receiving objects: 100% (320/320), 85.16 MiB | 5.92 MiB/s, done.
Resolving deltas: 100% (38/38), done.
Submodule 'dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch' ([email protected]:sicxu/Deep3DFaceRecon_pytorch.git) registered for path 'dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch'
Submodule 'ide3d-nada' ([email protected]:MrTornado24/ide3d-nada.git) registered for path 'ide3d-nada'
Cloning into 'C:/t/IDE-3D/IDE-3D/dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch'...
Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of '[email protected]:sicxu/Deep3DFaceRecon_pytorch.git' into submodule path 'C:/t/IDE-3D/IDE-3D/dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch' failed
Failed to clone 'dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch'. Retry scheduled
Cloning into 'C:/t/IDE-3D/IDE-3D/ide3d-nada'...
Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of '[email protected]:MrTornado24/ide3d-nada.git' into submodule path 'C:/t/IDE-3D/IDE-3D/ide3d-nada' failed
Failed to clone 'ide3d-nada'. Retry scheduled
Cloning into 'C:/t/IDE-3D/IDE-3D/dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch'...
Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of '[email protected]:sicxu/Deep3DFaceRecon_pytorch.git' into submodule path 'C:/t/IDE-3D/IDE-3D/dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch' failed
Failed to clone 'dataset_preprocessing/ffhq/Deep3DFaceRecon_pytorch' a second time, aborting

(base) C:\t\IDE-3D>cd IDE-3D

(base) C:\t\IDE-3D\IDE-3D>conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - imageio=2.13.5
  - pillow=9.0.0
  - ninja=1.10.2.3

About style transfer

Hi, thank you for sharing the code.
I have a question about the cartoon demo on the project page (github.io).

How did you achieve this style transfer: transfer learning with a cartoon dataset, followed by layer blending?
Hoping for your reply!

Configs for encoder training and canonical encoder

Hi,

Thank you for sharing your awesome work. I have some questions about how you train the encoder. In your released code, the script apps/train_hybrid_encoder.py shows how you do it, but there are two options (train_gen and train_real) and the default weight for each loss is 0; also, your paper and supplementary material do not include a description of the losses for this part. Could you release more details about it? I tried train_gen and train_real (on the CelebAMask dataset) with your pretrained generator, but I failed to get a good reconstruction encoder.

Also, you mentioned that you add a canonical encoder in this paper, but in apps/train_hybrid_encoder.py I cannot find the canonical encoder part: the generated or real images are not limited to canonical views, and there is only one encoder. I also cannot find it in your pretrained encoder; it seems that you directly use the pretrained encoder in Painter/run_UI.py.

When running the painter, there's an error with dnnlib

import dnnlib

ModuleNotFoundError: No module named 'dnnlib'

This is what I get. All dependencies installed fine, but on Windows some had to be lower versions.

name: ide3d
channels:
  - pytorch
  - nvidia
dependencies:
  - python=3.9
  - pip
  - numpy>=1.20
  - click>=8.0
  - pillow=8.3.1
  - scipy=1.7.1
  - pytorch=1.10.0
  - cudatoolkit=11.3
  - requests=2.27.1
  - tqdm=4.62.3
  - ninja=1.10.2
  - matplotlib=3.5.1
  - imageio=2.9.0
  - pip:
    - imgui==1.4.1
    - glfw==2.5.0
    - pyopengl==3.1.5
    - imageio-ffmpeg==0.4.5
    - pyspng
    - psutil
    - mrcfile
    - tensorboard
    - einops
    - pymcubes
    - pytorch3d

network_snapshot.pkl was not found

Hi, the model file was not found when I ran the following command. Could you please share the model file (network_snapshot.pkl)?

python extract_shapes.py --outdir out --trunc 0.7 --seeds 0-3
--network networks/network_snapshot.pkl --cube_size 1

Network architecture details, import issues, and novel view generations

Hi,
Thanks for the great work; we appreciate it. I have several questions/suggestions about it, though:

  • Do you plan to publish the explicit network architecture code soon? It would be beneficial for further research.
  • There are many absolute paths and importing issues throughout the codebase. Some paths are even hardcoded, which makes it more cumbersome to work with than needed. I suggest fixing those for easier usability.
  • Do you have an individual .py file to generate novel views from real-life images? As far as I understand, the inversion pipeline is as follows: first, apps/infer_hybrid_encoder.py generates a w, then inversion/scripts/run_pti.py fine-tunes the said w. Can I use that fine-tuned w and your hybrid encoder, along with different angles fed to the generator & renderer, for the novel views? Have you applied further adjustments for generating new views (to get the images in the last row of Fig. 7 of the SIGGRAPH paper)?

Best,
Batuhan

Questions about the generator implementation

Hi, thanks for releasing the code!

Querying the source code from the pickle file, I found that some modules look different from Fig. 2 of the paper.

  1. I could not find the three parallel branches sharing the 64x64x64 feature map that you mention in Appendix B. The StyleGAN backbone looks similar to EG3D's StyleGAN backbone, with an added toseg layer.
  2. In the code, the texture decoder outputs the sigma, whereas in the paper the shape decoder outputs the sigma.

Am I misunderstanding something, or is this exactly your design?

Please refer to the files below.
source_code.txt
generator_summary.txt

Is there network code?

Hello, thanks for your great work.

Now I am trying to understand your paper and model.

However, the code for the model is not available to me.
I think the code would be placed in "./training/*", but it isn't there.

Do you only plan to release the pretrained model as a .pkl?
Or will the network code be added soon?

Remove the ide3d-nada submodule

It doesn't exist publicly on GitHub, so it should probably be removed from your project.
Also, it would be nice to switch the submodules to https paths so that those of us without SSH keys set up don't have to deal with them.

Possible bug in render_mesh.py

Hi there, thanks for providing the code! I found some possible bugs when I used render_mesh.py.

  • Missing OpenGL platform setting. To solve this, I added os.environ['PYOPENGL_PLATFORM'] = 'egl' to the top of the file.
  • The renderer is never freed. I hit an error because the renderer is not deleted at each iteration. To solve this, I added r.delete() at the end of each iteration of the for loop.
  • Unsorted image list for video generation. On Line 70, the input image list is not sorted. To fix it, change Line 70 to img2video(sorted(glob.glob(f"tmp/{id}/*.png")), f"{outdir}/render.mp4"). A combined sketch of these fixes is shown below.
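
A minimal, self-contained sketch of how these fixes fit together (generic pyrender/trimesh usage, not the actual render_mesh.py code):

import glob
import os

# Fix 1: request the headless EGL platform before pyrender is imported.
os.environ['PYOPENGL_PLATFORM'] = 'egl'

import numpy as np
import pyrender
import trimesh

# A trivial scene just to exercise the offscreen renderer.
scene = pyrender.Scene()
scene.add(pyrender.Mesh.from_trimesh(trimesh.creation.icosphere()))
scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=np.eye(4))

for _ in range(3):
    r = pyrender.OffscreenRenderer(512, 512)
    color, depth = r.render(scene)
    # ... save the rendered frame ...
    r.delete()  # Fix 2: release the GL context at the end of every iteration.

# Fix 3: sort the dumped frames before stitching them into a video
# (in render_mesh.py, this corresponds to wrapping the glob on Line 70 in sorted()).
frame_paths = sorted(glob.glob('tmp/0/*.png'))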

Any plans for training code release?

Dear @MrTornado24

Congratulations on the amazing work and thank you for choosing to release the implementation.

I was wondering if there are any plans to release the training code in the near future?

Hoping for a positive reply.

Thank you.
