
piwise's Introduction

PiWiSe

Pixel-wise segmentation on the VOC2012 dataset using pytorch.

For a more complete implementation of segmentation networks, check out semseg.

Note:

  • FCN differs from the original implementation; see this issue.
  • SegNet does not match the performance reported in the original paper; see here.
  • PSPNet lacks "atrous convolution" (the conv layers of ResNet101 should be amended to preserve image size).

Keeping this in mind, feel free to open a PR. Thank you!

Setup

See dataset examples here.

Download

Download the image archive, extract it, and run:

mkdir data
mv VOCdevkit/VOC2012/JPEGImages data/images
mv VOCdevkit/VOC2012/SegmentationClass data/classes
rm -rf VOCdevkit

Install

We recommend using pyenv:

pyenv virtualenv 3.6.0 piwise
pyenv activate piwise

then install requirements with pip install -r requirements.txt.

Usage

For the latest documentation, run:

python main.py --help

Supported model parameters are fcn8, fcn16, fcn32, unet, segnet1, segnet2, pspnet.

Training

If you want visualization, start the visdom server in an extra terminal tab:

python -m visdom.server -port 5000

Train the SegNet model for 30 epochs with CUDA support, visualization, and checkpoints every 100 steps:

python main.py --cuda --model segnet2 train --datadir data \
    --num-epochs 30 --num-workers 4 --batch-size 4 \
    --steps-plot 50 --steps-save 100

Evaluation

To run semantic segmentation on foo.jpg:

python main.py --model segnet2 --state segnet2-30-0 eval foo.jpg foo.png

The segmented class image can now be found at foo.png.

Results

These are some results based on segnet after 40 epochs. Set

loss_weights[0] = 1 / 1

to deal gracefully with the class imbalance.

Input Output Ground Truth


piwise's Issues

No folder named "Labels", Tuple Index out of range, size mismatch.

First, I think there's a small error in the README tutorial: shouldn't the correct directory be data/labels instead of data/classes?

Anyways, I get the following traceback:

/home/nic/.conda/envs/piwise/lib/python3.6/site-packages/torch/nn/functional.py:1423: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.upsample instead.
  warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.upsample instead.")
/home/nic/machineLearning/Automatter/machine-learning/piwise/piwise/criterion.py:13: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  return self.loss(F.log_softmax(outputs), targets)
Traceback (most recent call last):
  File "main.py", line 163, in <module>
    main(parser.parse_args())
  File "main.py", line 138, in main
    train(args, model)
  File "main.py", line 85, in train
    board.image(color_transform(outputs[0].cpu().max(0)[1].data),
  File "/home/nic/machineLearning/Automatter/machine-learning/piwise/piwise/transform.py", line 48, in __call__
    color_image = torch.ByteTensor(3, size[1], size[2]).fill_(0)
IndexError: tuple index out of range

My first attempt at an obvious solution was to change it to

color_image = torch.ByteTensor(3, size[0], size[1]).fill_(0)

But I still get a size mismatch error: the tensor is 256 x 256 and the mask is 256.

SyntaxError: invalid syntax in main.py

I'm new to PyTorch. After downloading the data, I ran into this issue:

root@2d5d0934e049:/home/yuanshuai/code/piwise# python main.py 
  File "main.py", line 84
    f'input (epoch: {epoch}, step: {step})')
                                          ^
SyntaxError: invalid syntax
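
The f'...' literal on line 84 is an f-string, which only parses on Python 3.6 or newer (the README's pyenv setup uses 3.6.0), so this error usually means that python resolves to an older interpreter. A minimal sketch of a version guard and an equivalent str.format call follows; the names plot_title and supports_fstrings are hypothetical and not part of piwise:

```python
import sys


def plot_title(epoch, step):
    """Build the same title as the f-string, using syntax older Pythons also parse."""
    return 'input (epoch: {}, step: {})'.format(epoch, step)


def supports_fstrings(version_info=sys.version_info):
    """f-strings were added in Python 3.6."""
    return tuple(version_info[:2]) >= (3, 6)


if not supports_fstrings():
    print('Python 3.6+ is required for f-strings, found', sys.version.split()[0])
print(plot_title(1, 50))
```

Running python --version in the same shell shows which interpreter main.py is actually started with.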

Pretrained models

Is it possible to make the weights of the trained models available?

Blank output maps!

Hi there,

I spent a week working on your code and contributing to it, but all I get is blank prediction maps after convergence. I just found that you have had the same problem with your implementation, so I'm wondering if there is any update on this issue.
On the other hand, I have another Theano implementation of UNet which reproduces the claimed results very well. I tried to make the training procedure the same for both implementations (fixed data, fixed hyper-parameters, and so on), but again I get blank maps with PyTorch.

BTW, there appear to be some mistakes in your implementation of UNet, specifically in the concatenation. Even after correcting them, the problem persists.

improve network performance

These are the results of the current segnet implementation:
horse
horse-segnetfix
horse-segnet

As they are far from the paper results, there must be a systematic error in our implementation. Here I want to collect ideas on what to improve.

  • ignore background class in loss
  • ignore border class in loss
  • apply CRF to final result
  • finetune learning rate (extra lr for pretrained layers)
  • zero initialize model layers (in the first 4 epochs they converge to zero so why not init them that way)
  • use batch normalized vgg16 for segnet as proposed in the paper

About an inexplicable bug

This program runs fine on my own computer, but when I ran it on another computer, the following error occurred:
RuntimeError: cuda runtime error(59): device-side assert triggered when running transfer_learning_tutorial
This error seems to be caused by a problem with the labels. Can you tell me how to solve this bug?
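
A device-side assert in a classification loss is often triggered by target values outside the range [0, num_classes). One way to test that hypothesis is to scan the label images on the CPU before training. The sketch below uses NumPy; check_labels is a hypothetical helper, and the values 22 (from NUM_CLASSES in this project) and 255 (the VOC border label) are assumptions about the data being used:

```python
import numpy as np


def check_labels(targets, num_classes, ignore_index=None):
    """Return the sorted label values that fall outside [0, num_classes)."""
    values = np.unique(np.asarray(targets))
    if ignore_index is not None:
        # Values the loss is configured to skip are not counted as errors.
        values = values[values != ignore_index]
    bad = values[(values < 0) | (values >= num_classes)]
    return bad.tolist()


targets = np.array([[0, 1, 21], [255, 3, 22]])
print(check_labels(targets, num_classes=22))
print(check_labels(targets, num_classes=22, ignore_index=255))
```

An empty list means every label is a valid class index; anything else needs to be remapped or ignored before it reaches the loss.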

Is there some question in transform.py?

def colormap(n):
    cmap=np.zeros([n, 3]).astype(np.uint8)

    for i in np.arange(n):
        r, g, b = np.zeros(3)

        for j in np.arange(8):
            r = r + (1<<(7-j))*((i&(1<<(3*j))) >> (3*j))
            g = g + (1<<(7-j))*((i&(1<<(3*j+1))) >> (3*j+1))
            b = b + (1<<(7-j))*((i&(1<<(3*j+2))) >> (3*j+2))

        cmap[i,:] = np.array([r, g, b])

    return cmap

I do not understand this function, please help me.
I used your main code to run a new network, but I cannot decode an image from the output.
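
Read bit by bit, colormap spreads the bits of the class index over the three color channels: bits 0, 1, 2 of the index become the highest bit of r, g, b, bits 3, 4, 5 become the next lower bit, and so on. A plain-Python sketch of the same scheme follows; the helper names bit, label_color and decode are illustrative and not taken from transform.py:

```python
def bit(value, position):
    """Return bit `position` of `value` as 0 or 1."""
    return (value >> position) & 1


def label_color(label):
    """Color for one class index, following the same bit scheme as colormap(n)."""
    r = g = b = 0
    for j in range(8):
        # Bits 3j, 3j+1, 3j+2 of the label set bit (7 - j) of r, g, b.
        r |= bit(label, 3 * j) << (7 - j)
        g |= bit(label, 3 * j + 1) << (7 - j)
        b |= bit(label, 3 * j + 2) << (7 - j)
    return (r, g, b)


def decode(label_rows):
    """Map a 2-D grid of class indices to a grid of (r, g, b) tuples."""
    return [[label_color(label) for label in row] for row in label_rows]


for label in range(5):
    print(label, label_color(label))
```

Class 0 maps to black, class 1 to (128, 0, 0), class 2 to (0, 128, 0), so decoding a network output amounts to taking the per-pixel class index and looking up its color.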

Evaluation error

An error occurred during evaluation. I am not familiar with image segmentation; the error is:
RuntimeError: inconsistent tensor size, expected tensor [256 x 256] and mask [256] to have the same number of elements, but got 65536 and 256 elements respectively at d:\projects\pytorch\torch\lib\th\generic/THTensorMath.c:138
I think the problem is here:
for label in range(1, len(self.cmap)):
    mask = gray_image[0] == label
    color_image[0][mask] = self.cmap[label][0]
    color_image[1][mask] = self.cmap[label][1]
    color_image[2][mask] = self.cmap[label][2]
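
The message reports a [256 x 256] tensor indexed with a mask of only 256 elements, which is what happens when gray_image is already 2-D (H x W): gray_image[0] then selects a single row rather than the whole label map. A hedged sketch with NumPy, assuming that is the cause; colorize and the three-entry cmap are illustrative and not the repository's code:

```python
import numpy as np


def colorize(gray_image, cmap):
    """Turn a label map of shape (H, W) or (1, H, W) into a (3, H, W) uint8 image."""
    labels = np.asarray(gray_image)
    if labels.ndim == 3:
        labels = labels[0]  # drop the leading channel dimension
    if labels.ndim != 2:
        raise ValueError('expected (H, W) or (1, H, W), got %s' % (labels.shape,))

    color_image = np.zeros((3,) + labels.shape, dtype=np.uint8)
    for label in range(1, len(cmap)):
        mask = labels == label  # (H, W), same shape as each color channel
        for channel in range(3):
            color_image[channel][mask] = cmap[label][channel]
    return color_image


cmap = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)
labels_2d = np.array([[0, 1], [2, 1]])
print(colorize(labels_2d, cmap).shape)
print(colorize(labels_2d[None], cmap).shape)
```

Normalizing the input to two dimensions first keeps the mask and each color channel the same shape regardless of whether the caller passes a channel dimension.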

It differs from the standard SegNet

I think the main contribution of SegNet is max-pooling with indices. You can use F.max_unpool2d(x, idx, kernel_size) in place of the plain upsampling.
Plain upsampling cannot restore the location information that is lost during pooling.
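
To illustrate the idea without depending on a particular PyTorch version, here is a small NumPy sketch of 2x2 max pooling that remembers where each maximum came from, and of the matching unpooling step. It is a toy for a single (H, W) array with even H and W, not the network code; in PyTorch the corresponding pieces are nn.MaxPool2d(..., return_indices=True) and F.max_unpool2d:

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling, stride 2, on an (H, W) array; also returns flat argmax indices."""
    h, w = x.shape
    # windows[i, j] holds the four values of the 2x2 window at output position (i, j).
    windows = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    windows = windows.reshape(h // 2, w // 2, 4)
    local = windows.argmax(axis=2)  # 0..3, row-major inside the window
    rows = 2 * np.arange(h // 2)[:, None] + local // 2
    cols = 2 * np.arange(w // 2)[None, :] + local % 2
    return windows.max(axis=2), rows * w + cols


def max_unpool_2x2(pooled, indices, output_shape):
    """Place each pooled value back at the position it was taken from; zeros elsewhere."""
    out = np.zeros(output_shape, dtype=pooled.dtype)
    out.flat[indices.ravel()] = pooled.ravel()
    return out


x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 9],
              [5, 0, 1, 1],
              [0, 0, 1, 7]])
pooled, indices = max_pool_2x2(x)
print(pooled)
print(max_unpool_2x2(pooled, indices, x.shape))
```

Unlike interpolation, the unpooled map is sparse and puts every value back at its original pixel, which is the location information the decoder convolutions then build on.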

No model named segnet2?

The options for args.model in main.py only contain "segnet", but the commands in README.md#Usage all use "segnet2"?

A question about the size of the output image

The code requires an input size of (256, 256), and the output size is (256, 256). This limits its practical use, so I want to extend the network to arbitrary input sizes. If I do, does the code need adjustment? I do not understand the code in depth. Please help.
def colormap(n):
    cmap=np.zeros([n, 3]).astype(np.uint8)

    for i in np.arange(n):
        r, g, b = np.zeros(3)

        for j in np.arange(8):
            r = r + (1<<(7-j))*((i&(1<<(3*j))) >> (3*j))
            g = g + (1<<(7-j))*((i&(1<<(3*j+1))) >> (3*j+1))
            b = b + (1<<(7-j))*((i&(1<<(3*j+2))) >> (3*j+2))

        cmap[i,:] = np.array([r, g, b])

    return cmap

Some differences from original FCN

Hi, I'm trying to use FCN in PyTorch, and your work interests me very much. It really helps me a lot, but I found the segmentation results are not so satisfying (after 30 epochs of training on Pascal VOC 2012), and there are some differences from the original FCN implementation.

  • in the original FCN8s, the outputs of pool3, pool4, and pool5 are sent into subsequent convolution layers to produce segmentation results, however, in your version, it's the outputs of conv3, conv4, and conv5 that are sent into the subsequent convolution layers!
  • in the original FCN8s implementation, the output of pool4 (and it's the same with pool3) is multiplied with a small constant, before being sent into the subsequent convolution layers to produce segmentation maps.
  • in the original FCN8s implementation, the VGG convolution layers are also updated during training. But in your version, VGG convolution layers won't be updated.
  • the original implementation uses deconvolution to upsample.

Some questions about dataset

The data folder contains only two folders, images and classes. Where can I get the labels?

Out of Memory Issue

Has anyone else encountered this error: RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1511320797808/work/torch/lib/THC/generic/THCStorage.cu:66? I simply ran this model on a Titan X and on my own 1080 Ti, and the same issue appeared on both.

Does the implementation of SegNet in network.py match the SegNet paper?

Hello,
I read the implementation of SegNet in network.py and I have a question.
In the SegNet paper, the last layer (boxed in the attached picture) is one upsampling followed by two convolutions.

But in the implementation of SegNet in network.py (https://github.com/bodokaiser/piwise/blob/50f8be658693dd27bfa463cb00d1acec7e869fc4/piwise/network.py#L275), the layer mentioned above is one upsampling followed by one convolution, so one Conv layer is missing.

Thank you

Probability maps instead of Binary maps

Hello,
In the evaluation phase, your code outputs binary segmentation maps (for a binary segmentation task), while there are cases in which the real-valued probability map is desired. How can I change your code to get the probability map?

Thanks
Saeed
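
One common way to obtain such a map is to apply a softmax over the class dimension of the raw network output and keep the channel of the class of interest, instead of taking the per-pixel maximum. The NumPy sketch below assumes the output has shape (C, H, W) with the foreground in channel 1; both the shape and the channel order are assumptions about the setup, not facts from the repository:

```python
import numpy as np


def softmax(scores, axis=0):
    """Numerically stable softmax along `axis`."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Toy network output: 2 classes, 1 x 2 pixels.
scores = np.array([[[2.0, 0.0]],
                   [[0.0, 0.0]]])
probs = softmax(scores, axis=0)   # (C, H, W), sums to 1 over the class axis
foreground = probs[1]             # (H, W) probability map for class 1
binary = probs.argmax(axis=0)     # what a hard segmentation would output
print(foreground)
print(binary)
```

The probability map can then be thresholded at any value, whereas the hard map corresponds to always picking the most likely class.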

Object detection

Hi there and thank you for piwise!

My question is: how could piwise or any segnet implementation be used for object detection?

add semantic segmentation metrics

As in the FCN paper, which means:

1. Pixel Accuracy

Sum of all correctly classified pixels divided by the total number of pixels.

2. Mean Accuracy

Mean over classes of the fraction of correctly classified pixels per class.

3. Mean Intersection over Union
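
The three metrics can all be derived from a confusion matrix. The sketch below is one possible NumPy formulation and not code from this repository; the function names are hypothetical, and averaging only over classes that occur in the ground truth is an assumption made here to avoid division by zero:

```python
import numpy as np


def confusion_matrix(target, prediction, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    target = np.asarray(target).ravel()
    prediction = np.asarray(prediction).ravel()
    index = target * num_classes + prediction
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(matrix):
    """Return (pixel accuracy, mean accuracy, mean IoU) from a confusion matrix."""
    matrix = np.asarray(matrix, dtype=np.float64)
    correct = np.diag(matrix)             # correctly classified pixels per class
    per_class_total = matrix.sum(axis=1)  # ground-truth pixels per class
    per_class_pred = matrix.sum(axis=0)   # predicted pixels per class
    present = per_class_total > 0         # skip classes absent from the ground truth

    pixel_accuracy = correct.sum() / matrix.sum()
    mean_accuracy = np.mean(correct[present] / per_class_total[present])
    union = per_class_total + per_class_pred - correct
    mean_iou = np.mean(correct[present] / union[present])
    return pixel_accuracy, mean_accuracy, mean_iou


target = [[0, 0], [1, 1]]
prediction = [[0, 1], [1, 1]]
matrix = confusion_matrix(target, prediction, num_classes=2)
print(matrix)
print(segmentation_metrics(matrix))
```

For a whole dataset, the per-image confusion matrices can be summed first and the metrics computed once on the total.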

Softmax in the evaluation stage

Hi,

I've found that you explicitly apply log_softmax in your criterion.py file. Isn't it necessary to also apply the softmax function to the output of the network in the evaluation stage, that is, before taking the index of the maximum over classes?

Thanks
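
For the class decision itself the softmax is not required: softmax and log_softmax are strictly increasing in each score, so the index of the maximum is the same for the raw scores, the log-probabilities and the probabilities. A small plain-Python check of this for a single pixel (the score values are made up):

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax of a list of scores."""
    peak = max(scores)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


scores = [1.5, -0.3, 4.0, 0.2]               # raw network outputs for one pixel
log_probs = log_softmax(scores)
probs = [math.exp(v) for v in log_probs]
print(argmax(scores), argmax(log_probs), argmax(probs))
```

Softmax only matters at evaluation time when calibrated probabilities are needed, not for choosing the predicted class.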

Syntax Error in main.py

Throws a Syntax Error when running the main.py file

File "main.py", line 84
    f'input (epoch: {epoch}, step: {step})')
                                          ^
SyntaxError: invalid syntax

Semantic segmentation for binary masks (2 class: background/foreground object)

Hi @bodokaiser
Do you know how to adapt this code for binary masks (background = black, foreground = white)?
I changed the number of classes to 2:
#NUM_CLASSES = 22
NUM_CLASSES = 2
But some errors still occur:

Traceback (most recent call last):
  File "main2.py", line 164, in <module>
    main(parser.parse_args())
  File "main2.py", line 139, in main
    train(args, model)
  File "main2.py", line 74, in train
    loss = criterion(outputs, targets[:, 0])
  File "/home/ubuntu/src/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/pytorch/piwise/piwise/criterion.py", line 13, in forward
    return self.loss(F.log_softmax(outputs), targets)
  File "/home/ubuntu/src/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/src/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 132, in forward
    self.ignore_index)
  File "/home/ubuntu/src/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/functional.py", line 674, in nll_loss
    return _functions.thnn.NLLLoss2d.apply(input, target, weight, size_average, ignore_index)
  File "/home/ubuntu/src/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py", line 47, in forward
    output, *ctx.additional_args)
RuntimeError: weight tensor should be defined either for all or no classes at /pytorch/torch/lib/THCUNN/generic/SpatialClassNLLCriterion.cu:28
(pytorch) ubuntu@ip-172-31-85-122:~/pytorch/piwise$
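
The final RuntimeError suggests that the class-weight vector passed to the loss no longer has one entry per class after NUM_CLASSES was changed. A minimal plain-Python sketch of keeping the two in sync follows; make_class_weights and check_weights are hypothetical helpers, and treating index 0 as the background class is an assumption:

```python
def make_class_weights(num_classes, background_weight=1.0):
    """One weight per class; index 0 is assumed to be the background class."""
    if num_classes < 1:
        raise ValueError('num_classes must be positive')
    weights = [1.0] * num_classes
    weights[0] = background_weight
    return weights


def check_weights(weights, num_classes):
    """Raise early, on the CPU, if the weights do not cover every class."""
    if len(weights) != num_classes:
        raise ValueError('got %d weights for %d classes' % (len(weights), num_classes))
    return True


NUM_CLASSES = 2
weights = make_class_weights(NUM_CLASSES, background_weight=0.5)
print(weights, check_weights(weights, NUM_CLASSES))
```

The resulting list can be converted to a tensor and handed to the loss, so that its length always follows NUM_CLASSES instead of a hard-coded 22.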
