modnet's People

Contributors

manzke, nahidalam, yarkable, yzhou0919, zhkkke

modnet's Issues

Running the webcam demo fails

Hi,

I'm trying to run the WebCam-based video matting demo. After following your recommended command, it fails with a PyTorch error. I searched online but couldn't find anyone else reporting this error.
My environment is Ubuntu 16.04 with an RTX 2070 GPU, Python 3.6, and CUDA 10.

Could you help me solve this error?
Thanks

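Is there a Baidu Cloud link?

The Google Drive link for the pretrained model cannot be opened, because getting around the firewall is currently difficult. I hope the model can also be uploaded to Baidu Pan or placed directly on GitHub.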

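Poor results with the pretrained model

Hello, I used your pretrained model directly to predict a matte, then used the matte as a mask in an AND operation with the original image, and the result is very poor. Am I doing something wrong? (result image)
Here is my code, where matting is the matte predicted by the pretrained model modnet_photographic_portrait_matting.ckpt:
import cv2
img = cv2.imread(image)  # image: path to the original picture
matting = cv2.imread(matting, cv2.IMREAD_GRAYSCALE)  # matting: path to the predicted matte
masked = cv2.bitwise_and(img, img, mask=matting)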
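
For comparison, here is a minimal sketch of alpha compositing with NumPy, assuming the matte is an 8-bit grayscale image. `cv2.bitwise_and` treats `mask` as binary (any non-zero value keeps the pixel), so the soft matte values are lost; scaling the matte to [0, 1] and blending keeps them. The function name `composite` and the white default background are illustrative assumptions, not part of MODNet.

```python
import numpy as np

def composite(img, matte, background=255):
    """Blend img over a constant background, using matte as a soft alpha.

    img:   H x W x 3 uint8 image
    matte: H x W uint8 matte (0 = background, 255 = foreground)
    """
    alpha = matte.astype(np.float32)[..., None] / 255.0  # H x W x 1, values in [0, 1]
    out = alpha * img.astype(np.float32) + (1.0 - alpha) * background
    return np.clip(out + 0.5, 0, 255).astype(np.uint8)   # round to nearest integer
```

With the arrays from the snippet above, this would be called as `composite(img, matting)`.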

Question about the label of detail prediction

I would like to know how you obtain the label for the detail branch.
Based on Figure 2 in your paper, my code is as follows:
import cv2
import numpy as np
image = cv2.imread("/mnt/SegmentData/labels/00000.jpg")
kernel = np.ones((80, 80), np.uint8)
dil = cv2.dilate(image, kernel, iterations=1)
ero = cv2.erode(image, kernel, iterations=1)
dil_ero = dil - ero
out = dil_ero * image
But I cannot get a result similar to the details_dp shown in Figure 2.
It would be helpful if you could correct my code. Thanks
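
One common way to build a boundary (transition-region) mask from a ground-truth matte is sketched below in plain NumPy. It assumes the detail label is the ground-truth matte restricted to a band around the portrait boundary, which is how the paper appears to describe the detail supervision. The two thresholds, the kernel size `k`, and the names `_window_reduce`, `transition_region`, and `detail_label` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def _window_reduce(mask, k, reduce_fn):
    """k x k sliding max (dilation) or min (erosion) over a 2-D array; k is assumed odd."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="edge")
    h, w = mask.shape
    # One shifted view per kernel offset, reduced element-wise across offsets.
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)]
    return reduce_fn(np.stack(shifted), axis=0)

def transition_region(matte, k=3):
    """Binary band (1 inside the band, 0 elsewhere) around the matte boundary.

    matte: H x W uint8 ground-truth matte (0 = background, 255 = foreground).
    """
    maybe_fg = (matte > 0).astype(np.uint8)    # any foreground contribution
    sure_fg = (matte == 255).astype(np.uint8)  # fully opaque foreground
    dilated = _window_reduce(maybe_fg, k, np.max)
    eroded = _window_reduce(sure_fg, k, np.min)
    return dilated - eroded                    # dilated >= eroded, so no underflow

def detail_label(matte, k=3):
    """Ground-truth matte in [0, 1], kept only inside the transition band."""
    band = transition_region(matte, k).astype(np.float32)
    return band * (matte.astype(np.float32) / 255.0)
```

Two notes on the design: multiplying two uint8 arrays in NumPy wraps around modulo 256, so the sketch converts to float before multiplying; and for large kernels `cv2.dilate` and `cv2.erode` are much faster than this loop over offsets.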

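Questions about the SOC training stage

Hello, I have a few questions about the SOC training stage.

  1. For the loss function below, take the first L2 term as an example. Both inputs are network outputs, so during backpropagation, does the gradient flow back through both outputs, or is the second term held fixed?
    [image of the loss function]
  2. After M is copied to M', do the weights of M' stay fixed from then on?
  3. My understanding of SOC is that the Lcons loss, being self-supervised, pulls ap toward sp and dp. Since sp is blurry, ap would also become blurry, so Ldd is added to keep ap from drifting in that direction. But if the previously trained network M predicts poorly on video data from the new domain, could training move in a bad direction?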

About the compositional loss

Hello, I have a question.
Regarding the compositional loss, your paper says: It measures the absolute difference between the input image I and the composited image obtained from αp, the ground truth foreground, and the ground truth background.
What do the ground truth foreground and ground truth background refer to here? Do you use a trimap when computing the loss?
I'm new to matting, so I may have a lot of questions.
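
The quoted sentence can be read as the following minimal NumPy sketch. It assumes the training image was built by compositing a known foreground onto a known background, as is typical for synthetic matting datasets, so both are available without a trimap; that is an assumption here, not a statement about the author's pipeline. The function and argument names are hypothetical.

```python
import numpy as np

def compositional_loss(image, alpha_pred, fg_gt, bg_gt):
    """Mean absolute difference between the image and its re-composition.

    image, fg_gt, bg_gt: H x W x 3 float arrays in [0, 1]
    alpha_pred:          H x W float array in [0, 1] (the predicted matte)
    """
    a = alpha_pred[..., None]                   # broadcast alpha over the channels
    recomposed = a * fg_gt + (1.0 - a) * bg_gt  # composite with the predicted alpha
    return float(np.mean(np.abs(image - recomposed)))
```

When the predicted matte equals the matte used to build the image, the loss is zero; any deviation shows up as a colour difference in the re-composed image.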

Modnet Demo App not working

The Gradio demo app you created does not appear to be working. It says "Launching Your Gradio Interface". "You will receive an email notification when the process is complete. You may close this page."

I don't know what this means. It also wasn't saying this yesterday.

https://gradio.app/g/modnet

No module named ‘src.models’

When I tried to compile image_matting, I got this error. Does anyone know how to fix it? I'm fairly new to programming. Thanks.

Error occurred when running on Windows CPU

Hi, because there is no GPU on my Windows PC, I changed the GPU to CPU as follows:

[WeCom screenshot]
and changed the code in image_matting/inference.py, line 98:
_, _, matte = modnet(im.cpu(), inference=False)

but an error occurred:
[WeCom screenshot]

My torch (CPU) version is 1.6.0.
Thanks.

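Is mobilenetv2_human_seg.ckpt part of this project?

Hello,
I just tried the open-sourced model. The hair is matted very finely, but elsewhere it tends to keep extra regions, especially when the background is somewhat complex. Would adding human segmentation help? I noticed mobilenetv2_human_seg.ckpt in the Google Drive link you shared. Is it meant for this? How should mobilenetv2_human_seg.ckpt be used?

Thanks!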

Change License to a compatible one that would allow easier use in open source projects.

MODNet is impressive work on image and video matting.
It has a non-commercial license. Could you please provide a more permissive license, such as MIT or Apache, that allows commercial use? This would make it easier to use this work in open source projects, since non-commercial licenses are not compatible with them. You can read more about the incompatibility of non-commercial work here. I am unable to use this work in my open source project because of the non-commercial license.

License clarification

The README file says that this project is released under the CC-BY-NC-SA 4.0 license.

Does this affect the images produced with the network? Usually, the license of a tool does not affect the license of the output of the tool, except if specified in the license. For example, the license for Adobe's Deep Image Matting dataset explicitly states that models trained on their images can only be used for non-commercial purposes, so I just wanted to ask to make sure this is not the case here.

difference between the video Inference and single Image Inference in the code

Hi, thanks for your great work!
I am confused. What is the difference between video inference and single-image inference, apart from where the picture comes from? And what is the difference between the two ckpt files, modnet_webcam_portrait_matting.ckpt and modnet_photographic_portrait_matting.ckpt? Thank you!

Provide user video

The Colab notebook uses the user's webcam video. How can we provide a pre-recorded video file for inference instead?

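hd branch loss does not decrease

Sorry to bother you again.
During training, the value of hd_loss stays around 0.06. Following your answer to another issue, I changed the loss function:

num_mask = torch.sum(mask)
hd_loss = torch.sum((torch.abs(pred_detail - matte) + torch.abs(pred_detail.detach() - matte)) * mask) / num_mask

In subsequent training it still does not converge. Is there any other solution?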

Any way to run the model in Tensorflow / Tensorflow.js?

Hi!
I'm not very experienced with ML yet and was wondering if you could point me toward a way to run inference with the trained model using TensorFlow / TF.js. What would I need to do?

Thanks so much for the paper and the model!

Training data

In the paper you say:

For a fair comparison, we train all models on the same
dataset, which contains nearly 3000 annotated foregrounds.

What is this dataset?

Noisy output when no human present

Hi, thank you for open-sourcing such great work.
I tried the model, and it detects the person well in almost every case, but there is always some noise on real input images. I also tried it on an image with no person, and the model still masks a significant region.

Is it possible to raise the confidence threshold, or is there some pre- or post-processing method that would improve the results?
Thank you

Some questions about training

Hello, your work is excellent!
I would like to know how your pretrained model was trained and how the loss function is implemented in your code.
Could you publish more details of the training code?
Could you contact me by email?
[email protected]

Train on RGB or BGR?

Hello, was training for image matting done in BGR or RGB format?
I got better results when testing images in BGR format (OpenCV format),
but your inference code seems to use RGB (PIL format),
so I'm confused. Maybe you used RGB format in training.
Thanks
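
As background for this question: `cv2.imread` returns channels in BGR order, while PIL's `Image.open(...).convert('RGB')` gives RGB, so passing an OpenCV image to code that expects RGB requires reversing the channel axis. A minimal NumPy sketch follows; the helper name `swap_rb` is an illustrative assumption, and which order the released checkpoints expect is not stated in this thread.

```python
import numpy as np

def swap_rb(img):
    """Reverse the channel axis of an H x W x 3 array (BGR -> RGB or RGB -> BGR)."""
    # ascontiguousarray copies the reversed view so later code gets a plain array
    return np.ascontiguousarray(img[..., ::-1])
```

Running the same image through the model in both orders is a quick way to see which one a checkpoint handles better.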

Demo result is far from good

Thank you for this excellent work.
However, there is a big gap between the test results and the description. There is obvious jitter at the edges, and some regions are misclassified.
Do you have any plans to improve this?

Can we export pretrained model to TorchScript format?

This is great work, and it has some advantages compared to other models.

Is there a TorchScript version of the model, as the BackgroundMatting project provides? It would make checking and performance benchmarking on other devices, such as mobile, much easier. I think this project has broad prospects and huge potential.

@ZHKKKe thanks for your work; I'm looking forward to future progress.

I tried to use torch.jit.save, but it does not work:

import torch
import torch.nn as nn
from src.models.modnet import MODNet

modnet = MODNet(backbone_pretrained=False)
modnet = nn.DataParallel(modnet)  # the checkpoint keys carry the DataParallel "module." prefix
modnet.load_state_dict(torch.load(args.ckpt_path, map_location=torch.device('cpu')))
modnet.eval()
scriptmod = torch.jit.script(modnet)
torch.jit.save(scriptmod, "modnet.pt")

Transfer learning

Hi,

Do you plan to release transfer learning options to specify which layers should be trained?

Also, have you tried this network on objects other than portraits? If so, did you manage to get comparable results?

PPM100

Thanks for your great work. I'm very interested in the dataset you built. Can you tell me how you created it? Could you please release the dataset? Thanks a lot.

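Questions about SOC training

Hello, I have a few more questions about SOC training.
How many epochs are needed for SOC?
In my reproduction, I fine-tune with the Adam optimizer and a learning rate of 1e-4, as in the paper, and I set 30 epochs. However, during fine-tuning the loss rises quickly in the first epoch and then gradually decreases, but not by much. Meanwhile, the IoU on the validation set keeps dropping.
I currently don't know which step went wrong and would appreciate your guidance.
Thank you very much!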

Embedded lowcost GPU

Hi,

Cool project. We would like to test whether it can be ported to an ARM64 system with a 32GB embedded low-power GPU (NVIDIA AGX Xavier 32GB). This can be used for both training and inference. A first test with the pretrained model in Dec. 2020 would give us a first impression of how its performance compares with a desktop GPU.

SOC problem

Thanks for sharing. Nice work!
Here is a question about the SOC in your paper.
The self-supervised stage is applied to datasets from the new domain. Are these new (target) datasets the ones we will test on later?

Another question: when I try to train MODNet, the prediction dp is just a boundary, which is not the same as in your paper.

Exporting model to onnx format

How can I export the model to ONNX format?
I tried the following with the Colab demo code, but it raised an error:
TypeError: forward() missing 1 required positional argument: 'inference'

from torch.autograd import Variable

model = MODNet(backbone_pretrained=False)
model = nn.DataParallel(model).cuda()

state_dict = torch.load(pretrained_ckpt)
model.load_state_dict(state_dict)
model.eval()
dummy_input = Variable(torch.randn(1, 3, 512, 512))
torch.onnx.export(model.module, dummy_input, '/content/MODNet/modnet.onnx', export_params = True)

BTW, the test results look amazing!

Support for Python 3.8?

I think the version of torch you're using (1.0.0) requires Python 3.6. However, since most people are on 3.8 now, would it be possible to update the script to use a later version of torch?

Composition of two videos

Hi,

Thanks for this great project!
Are you planning to make a new release with composition of two videos? It would be good

Speed Related Issue

Hi,
Thank you for this great work. I really appreciate the accuracy. I wanted to run the segmentation at 30 FPS, but inference is comparatively slow. Is there any way to use this in a real-time application? This repo will rock in the future!

Dataset and Trimap

Hello! I used your trained model to test a large number of pictures and found a significant problem. When there is intermediate background, such as a seat or other people, on either side of the head and shoulders, the matting result is poor. Is this because the portrait semantic segmentation in the network is inaccurate? Or are there no such cases in your portrait dataset, which leads to poor segmentation? If that is the problem, could you add some such images to it?
You mentioned that MODNet can take a trimap as input. How is that done? If the trimap is not very accurate, can the network adapt to it to some extent?
The models you provided also include a portrait semantic segmentation model. How was it trained? Was it used in the first stage of the network?

train

When will you share the training code? I am eager to try it!

question on your code

in_x = self.inorm(x[:, self.inorm_channels:, ...].contiguous())

the original code is "in_x = self.inorm(x[:, self.inorm_channels:, ...].contiguous())"

Should this line use "self.bnorm_channels" instead of inorm_channels?

How to improve image matting accuracy

Hi,

Thanks for this great project. I tried your Colab for image matting, and the boundary does not look clear enough for some inputs (including the one in the GitHub README).

Is there any way to improve image matting accuracy?

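Question about training on Adobe data

@ZHKKKe
Hello,
I think this is an awesome project!
Of course, some unusual images are not matted very well, but ordinary images are matted very finely.
Naturally, I thought of training with better data, such as the Adobe Deep Image Matting Dataset.
Could you share the detailed training process? Thank you very much!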
