
InverseForm

This repository contains a version of the InverseForm module.

Shubhankar Borse, Ying Wang, Yizhe Zhang, Fatih Porikli, "InverseForm: A Loss Function for Structured Boundary-Aware Segmentation", CVPR 2021. [arXiv]

Qualcomm AI Research (Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.)

Reference

If you find our work useful for your research, please cite:

```
@inproceedings{borse2021inverseform,
  title={InverseForm: A Loss Function for Structured Boundary-Aware Segmentation},
  author={Borse, Shubhankar and Wang, Ying and Zhang, Yizhe and Porikli, Fatih},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021}
}
```

Method

InverseForm is a novel boundary-aware loss term for semantic segmentation, which efficiently learns the degree of parametric transformations between estimated and target boundaries.


This plug-in loss term complements the cross-entropy loss in capturing boundary transformations and allows consistent and significant performance improvement on segmentation backbone models without increasing their size and computational complexity.
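As a rough illustration of how such a plug-in term could be combined with cross-entropy (the names below are illustrative, not this repository's actual API):

```python
import torch.nn.functional as F

# Hypothetical sketch: combining a boundary-transform loss with cross-entropy.
# `inverseform_loss` is assumed to map (predicted, target) boundary maps to a
# scalar measuring the parametric transform between them.
def segmentation_loss(logits, labels, pred_boundary, gt_boundary,
                      inverseform_loss, lam=1.0):
    ce = F.cross_entropy(logits, labels, ignore_index=255)
    boundary = inverseform_loss(pred_boundary, gt_boundary)
    return ce + lam * boundary
```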

*(Example demo: output from our state-of-the-art model trained on the Cityscapes benchmark.)*

This repository contains the implementation of the InverseForm module presented in the paper. It can also run inference on the Cityscapes validation set with models trained using the InverseForm framework. The same models can be validated with the InverseForm block removed, so that no additional compute is added during inference. Here are some of the models you can run inference on, with and without the InverseForm block (checkpoints in the right-most column of the table below):

| Model | mIoU (trained w/o InverseForm) | mIoU (trained w/ InverseForm) | Checkpoint |
| --- | --- | --- | --- |
| HRNet-18 | 77.0% | 77.6% | hrnet18_IF_checkpoint.pth |
| HRNet-16-Slim | 76.1% | 77.8% | hr16s_4k_slim.pth |
| OCRNet-48 | 86.0% | 86.3% | hrnet48_OCR_IF_checkpoint.pth |
| OCRNet-48-HMS | 86.7% | 87.0% | hrnet48_OCR_HMS_IF_checkpoint.pth |

Setup environment

The code has been tested with PyTorch 1.3 and NVIDIA Apex. A Dockerfile is available under the docker/ folder.

Cityscapes path

utils/config.py contains the dataset/directory configuration. Please update CITYSCAPES_DIR to point to your Cityscapes directory. You can download the dataset from https://www.cityscapes-dataset.com/.
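For example, the relevant line in utils/config.py might look like this (the path shown is illustrative):

```python
# utils/config.py -- point this at your local Cityscapes root (illustrative path)
CITYSCAPES_DIR = '/data/datasets/cityscapes'
```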

Inference on Cityscapes

To run inference, this directory needs to be added to your PYTHONPATH:

```bash
export PYTHONPATH="${PYTHONPATH}:/path/to/this/dir"
```

Here are commands to run inference on the models shown above. These examples use 8 GPUs; you can also run with 1, 2, or 4 GPUs by updating the --nproc_per_node argument.

Our pretrained InverseForm module (distance_measures_regressor.pth) can be downloaded from here and should be placed inside the checkpoints/ directory. See usage below.

- HRNet-18-IF

  ```bash
  python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path "checkpoints/hrnet18_IF_checkpoint.pth" --has_edge True
  ```

- HRNet-16-Slim-IF

  ```bash
  python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path "checkpoints/hr16s_4k_slim.pth" --hrnet_base "16" --arch "lighthrnet.HRNet16" --has_edge True
  ```

- OCRNet-48-IF

  ```bash
  python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path "checkpoints/hrnet48_OCR_IF_checkpoint.pth" --arch "ocrnet.HRNet" --hrnet_base "48" --has_edge True
  ```

- HMS-OCRNet-48-IF

  ```bash
  python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path "checkpoints/hrnet48_OCR_HMS_IF_checkpoint.pth" --arch "ocrnet.HRNet_Mscale" --hrnet_base "48" --has_edge True
  ```

To remove the InverseForm operation during inference, simply run without the --has_edge flag. You will notice no drop in performance compared to running with the operation.

Acknowledgements

This repository shares code with other open-source repositories; we would like to acknowledge the researchers who made those repositories open-source.


Issues

I have trained the STN with ResNet on ImageNet; now how do I train the InverseForm net?

Hi! I have read your awesome paper and the appendices, and I have a few questions:

1. Which dataset is used to train the STN: ImageNet or MNIST? If I use ImageNet to train the STN, how many times should I downsample before flattening the feature, and what learning rate should I choose for the STN? Is it the same as for the classification net?

2. If I have trained the STN with ResNet on ImageNet, how do I then train the InverseForm net? In my opinion it looks like this (see the sketch below):
   a) First get the RGB input image from ImageNet, then get the gray edges using a Sobel filter.
   b) Freeze the STN, use the RGB input image to get theta and the affine matrix, and use the gray edges to get the affine edges via the affine matrix.
   c) Get theta hat from the InverseForm net using the gray edges and the affine gray edges, then compute the loss between theta and theta hat.
   Is that correct?

3. Does it matter if I use a different net (ResNet or MobileNet) to train the STN?

4. What's the loss function between the theta from the STN and the theta hat from InverseForm: L2 or L1?
Looking forward to your reply. Thank you!
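A minimal PyTorch sketch of the pipeline described in steps (a)-(c); `stn`, `inverse_net`, `sobel_edges`, and `loader` are hypothetical stand-ins, not part of this repository's API:

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of steps (a)-(c); stn, inverse_net, sobel_edges and
# loader are assumed to exist and are not part of this repository.
optimizer = torch.optim.Adam(inverse_net.parameters(), lr=1e-4)
for images in loader:
    edges = sobel_edges(images)                     # (a) gray edge maps, (B, 1, H, W)
    with torch.no_grad():                           # (b) the STN stays frozen
        theta = stn(images).view(-1, 2, 3)          # affine parameters
    grid = F.affine_grid(theta, edges.size(), align_corners=False)
    warped = F.grid_sample(edges, grid, align_corners=False)
    theta_hat = inverse_net(edges, warped)          # (c) predict the transform
    loss = F.mse_loss(theta_hat, theta.flatten(1))  # e.g. L2 between theta and theta hat
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```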

Design question

In your paper you describe InverseForm as being used to compare b_pred with b_gt. The latter is obtained by running a Sobel filter on the ground-truth segmentation. Is there any particular reason we don't do the same to obtain b_pred? Meaning, we could have just taken the predicted segmentation map, run a Sobel operation on that, and fed that in as b_pred.

Put formally in your notation:

`b_gt = sobel(y_gt)`

So why not also

`b_pred = sobel(y_pred)`?
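A minimal sketch of the Sobel-based boundary extraction being proposed, assuming a `(B, 1, H, W)` label tensor (the helper name is illustrative):

```python
import torch
import torch.nn.functional as F

def sobel_boundaries(label_map):
    """Illustrative: binary boundary map from a (B, 1, H, W) label tensor."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)                       # Sobel kernel for the y-axis
    x = label_map.float()
    gx = F.conv2d(x, kx, padding=1)
    gy = F.conv2d(x, ky, padding=1)
    return ((gx.abs() + gy.abs()) > 0).float()    # nonzero gradient = boundary
```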

How to visualize the result?

Thank you for your great work, but could you please tell me how to visualize the results and save them? Thank you.

Question about the code

Hi, I have a question about the loss code. In the ImageBasedCrossEntropyLoss2d function, the resulting nll_loss.weight is no different whether batch_weights is set to True or False. Was the original intent that the i-th target (target[i]) should be the input to the weight calculation when batch_weights is set to False?
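An illustrative sketch of the distinction the question is getting at, one weight vector for the whole batch versus weights recomputed per image (`calculate_weights` is a hypothetical stand-in, not the repository's function):

```python
import torch.nn.functional as F

def weighted_nll(log_probs, targets, calculate_weights, batch_weights=True):
    # batch_weights=True: one weight vector from the whole batch's histogram.
    if batch_weights:
        w = calculate_weights(targets)
        return F.nll_loss(log_probs, targets, weight=w)
    # batch_weights=False: recompute the weights from each image's own
    # histogram, i.e. target[i] feeds the weight calculation.
    loss = 0.0
    for i in range(targets.size(0)):
        w = calculate_weights(targets[i])
        loss = loss + F.nll_loss(log_probs[i:i + 1], targets[i:i + 1], weight=w)
    return loss / targets.size(0)
```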

Two confusions about the code

Hi, I have two points of confusion about the code:

1. According to Section 3.3 of your paper, the InverseNet should predict 8 or 6 values, but the InverseNet in your code only outputs 4 values. Why? What do these 4 elements represent?
2. Is the calculation of the InverseForm loss in your code based on Euclidean distance or geodesic distance?

A question about InverseTransform2D

Hi,
In your code, the class InverseTransform2D calculates the boundary-distance loss. You use `inputs = torch.ge(inputs, 0.5).float()` to get the boundary map, but this seems to break backpropagation. Please make sure this loss term can actually be backpropagated; I believe the requires_grad attribute of mean_square_inverse_loss is False.
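A quick way to check the concern: comparison ops such as `torch.ge` produce tensors that are detached from the autograd graph.

```python
import torch

x = torch.randn(4, requires_grad=True)
y = torch.ge(x, 0.5).float()   # hard threshold: bool tensor cast to float
print(y.requires_grad)         # False -- gradients cannot flow through the
                               # comparison, so anything downstream is cut off
```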

Usage in TensorFlow

How can I use this loss function in TensorFlow? Which portions need to be recoded?

Confusion about the predicted scales and shifts, and the loss

Hi, in another issue you said that InverseForm outputs 4 values which stand for scales and shifts, so if I want to minimize the InverseForm loss, the scales should be close to 1 and the shifts close to 0. But in your code the loss is `(((distance_coeffs*distance_coeffs).sum(dim=1))**0.5).mean()`, which pushes all 4 values toward 0. Why?
Looking forward to your reply. Thank you!

Training the Inverseform Net on custom dataset

Hi! Based on your paper, I understand that you have retrained the IFLoss separately for each dataset (as specified in your supplementary materials). I am presently using a custom aerial imagery dataset for my experiments and would like to follow the same protocol of retraining IFLoss on this dataset. I hope you can offer some insights on how I could go about doing this. Even better, I would appreciate it if you could share the training script, which would save a great deal of time for me.

I'm also curious why you retrained the IFLoss for each dataset separately. Since the InverseForm network is trained only on the GT seg masks, are the structure and shape of the GT seg masks so drastically different between datasets that IFLoss requires retraining on each one?

Looking forward to your reply. Thank you!

How can I use this work, trained for Cityscapes segmentation, on the KITTI dataset?

Thanks for your great work! Since you report excellent performance on the Cityscapes benchmark, I wonder whether I can use your pretrained checkpoints for semantic segmentation on the KITTI dataset. If that is supported, I'd greatly appreciate your advice on how to adjust the dataset config and so on.

Not able to run the given validation/test code

While using the command given in the README:

```bash
python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path checkpoints/hrnet48_OCR_IF_checkpoint.pth --arch "ocrnet.HRNet" --hrnet_base "48" --has_edge True
```

I get the error below:

```
usage: validation.py [-h] [--tag TAG] [--no_run] [--interactive] [--no_cooldir] [--farm FARM] exp_yml
```

All command-line arguments are reported as "unrecognized arguments".

Edge Head Weights

Hello,

I wanted to ask if you would be willing to release the weights of the edge-head that is added to the default semantic segmentation architecture.

Where can I find the appendix materials?

Dear authors,

Thanks for your great work! I am interested in how you conducted your tiling operations and selected your hyperparameters for training the net. You mentioned that these are discussed in the appendix. Could you kindly point me to where I can find the appendix materials for this paper? Thanks!

Question about visualization

Hi, from your code I guess the visualization code is:

```python
import os

import cv2
import numpy as np

# Map train IDs back to label IDs, then save the prediction as a PNG.
prediction = assets['predictions']
prediction_cpu = np.array(prediction)
label_out = np.zeros_like(prediction_cpu)
submit_fn = '{}.png'.format(img_name)
for label_id, train_id in cfg.DATASET.id_to_trainid.items():
    label_out[np.where(prediction_cpu == train_id)] = label_id
cv2.imwrite(os.path.join(self.save_dir, submit_fn), label_out)
```

But when I execute it, no picture is saved, even though the save path is correct. Could you help me? Thanks!
