
pytorch-loss's Introduction

pytorch-loss

My implementations of label-smooth, amsoftmax, partial-fc, focal-loss, dual-focal-loss, triplet-loss, giou/diou/ciou losses and functions, affinity-loss, pc_softmax_cross_entropy, ohem-loss (softmax-based online hard example mining loss), large-margin-softmax (BMVC 2019), lovasz-softmax-loss, and dice-loss (both generalized soft dice loss and batch soft dice loss). Maybe these will be useful in my future work.

I have also tried to implement the swish, hard-swish (hswish) and mish activation functions.

Additionally, a CUDA-based one-hot function is included (with support for label smoothing).
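
For reference, a label-smoothed one-hot target can be written with plain PyTorch ops; this is only a sketch of the semantics (using the uniform eps/num_classes smoothing convention), not the CUDA implementation in this repo:

import torch

def one_hot_label_smooth(labels, num_classes, eps=0.1):
    """Label-smoothed one-hot: the true class gets 1 - eps + eps/K, every other class gets eps/K."""
    smooth = torch.full((labels.size(0), num_classes), eps / num_classes, device=labels.device)
    smooth.scatter_(1, labels.unsqueeze(1), 1. - eps + eps / num_classes)
    return smooth

# e.g. one_hot_label_smooth(torch.tensor([0, 2, 1]), num_classes=3)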

An Exponential Moving Average (EMA) operator has also been added recently.
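
The usual weight-EMA scheme keeps a shadow copy of the model parameters and updates it as ema = alpha * ema + (1 - alpha) * param after every optimizer step; a minimal sketch of that idea (the class name and interface here are illustrative, not necessarily those of this repo's EMA operator):

import copy
import torch

class SimpleEMA:
    """Keeps an exponential moving average of a model's parameters for evaluation."""
    def __init__(self, model, alpha=0.999):
        self.alpha = alpha
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        for ema_p, p in zip(self.shadow.parameters(), model.parameters()):
            ema_p.mul_(self.alpha).add_(p, alpha=1. - self.alpha)

# typical use: call ema.update(model) right after each optimizer.step()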

There are also convolution ops, such as coord-conv2d and dynamic-conv2d (dy-conv2d).
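
CoordConv (Liu et al., 2018) simply concatenates normalized x/y coordinate channels to the input before an ordinary convolution; a minimal sketch of that idea, independent of this repo's CoordConv2d interface:

import torch
import torch.nn as nn

class NaiveCoordConv2d(nn.Module):
    """Conv2d applied to the input concatenated with normalized (x, y) coordinate maps."""
    def __init__(self, in_chan, out_chan, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_chan + 2, out_chan, **kwargs)

    def forward(self, x):
        n, _, h, w = x.shape
        ys = torch.linspace(-1., 1., h, device=x.device).view(1, 1, h, 1).expand(n, 1, h, w)
        xs = torch.linspace(-1., 1., w, device=x.device).view(1, 1, 1, w).expand(n, 1, h, w)
        return self.conv(torch.cat([x, xs, ys], dim=1))

# e.g. NaiveCoordConv2d(3, 16, kernel_size=3, padding=1)(torch.randn(2, 3, 32, 32))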

Some operators are implemented as PyTorch CUDA extensions, so you need to compile them first:

    $ python setup.py install

After installing, you can pick up what you need and use the losses or ops like one of these:

from pytorch_loss import SwishV1, SwishV2, SwishV3
from pytorch_loss import HSwishV1, HSwishV2, HSwishV3
from pytorch_loss import MishV1, MishV2, MishV3
from pytorch_loss import convert_to_one_hot, convert_to_one_hot_cu, OnehotEncoder
from pytorch_loss import EMA

from pytorch_loss import TripletLoss
from pytorch_loss import SoftDiceLossV1, SoftDiceLossV2, SoftDiceLossV3
from pytorch_loss import PCSoftmaxCrossEntropyV1, PCSoftmaxCrossEntropyV2
from pytorch_loss import LargeMarginSoftmaxV1, LargeMarginSoftmaxV2, LargeMarginSoftmaxV3
from pytorch_loss import LabelSmoothSoftmaxCEV1, LabelSmoothSoftmaxCEV2, LabelSmoothSoftmaxCEV3
from pytorch_loss import GIOULoss, DIOULoss, CIOULoss
from pytorch_loss import iou_func, giou_func, diou_func, ciou_func
from pytorch_loss import FocalLossV1, FocalLossV2, FocalLossV3
from pytorch_loss import Dual_Focal_loss
from pytorch_loss import GeneralizedSoftDiceLoss, BatchSoftDiceLoss
from pytorch_loss import AMSoftmax
from pytorch_loss import AffinityFieldLoss, AffinityLoss
from pytorch_loss import OhemCELoss, OhemLargeMarginLoss
from pytorch_loss import LovaszSoftmaxV1, LovaszSoftmaxV3
from pytorch_loss import TaylorCrossEntropyLossV1, TaylorCrossEntropyLossV3
from pytorch_loss import InfoNceDist
from pytorch_loss import PartialFCAMSoftmax

from pytorch_loss import TaylorSoftmaxV1, TaylorSoftmaxV3
from pytorch_loss import LogTaylorSoftmaxV1, LogTaylorSoftmaxV3

from pytorch_loss import CoordConv2d, DY_Conv2d

Note that some losses or ops have three versions, such as LabelSmoothSoftmaxCEV1, LabelSmoothSoftmaxCEV2 and LabelSmoothSoftmaxCEV3. Here V1 means an implementation with pure pytorch ops that uses torch.autograd for the backward computation, V2 means an implementation with pure pytorch ops but a self-derived formula for the backward computation, and V3 means an implementation with a cuda extension. Generally speaking, the V3 ops are faster and more memory efficient, since I have tried to squeeze everything into one cuda kernel function, which in most cases brings less overhead than a combination of pytorch ops.
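
As a usage sketch, most of the classification losses here follow the familiar (logits, target) calling convention of nn.CrossEntropyLoss, so swapping one into a training loop looks roughly like the following (the constructor argument shown is illustrative; check each loss's source for its exact parameters):

import torch
# V1: pure pytorch + autograd; V3: fused cuda kernel (requires the compiled extension)
from pytorch_loss import LabelSmoothSoftmaxCEV1, LabelSmoothSoftmaxCEV3

criterion = LabelSmoothSoftmaxCEV1(lb_smooth=0.1)                 # V3 should be a drop-in replacement on CUDA tensors
logits = torch.randn(8, 10, device='cuda', requires_grad=True)    # (N, num_classes)
labels = torch.randint(0, 10, (8,), device='cuda')                # integer class targets
loss = criterion(logits, labels)
loss.backward()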

For those who happen to find this repo, if you see errors in my code, feel free to open an issue to correct me.

pytorch-loss's People

Contributors

coincheung, zyddnys


pytorch-loss's Issues

about smooth label

If I choose label smoothing, what loss function should I choose?
In PyTorch, CrossEntropyLoss only supports target.long().

fatal error: math.h: No such file or directory

Hi,

I am trying to run Taylor Softmax.

(0)

I run python3 setup.py install and get:

root@7c09a3f30c39:/home/keras/notebook/nvme_raid/aveysov/pytorch-loss# python3 setup.py install
running install
running bdist_egg
running egg_info
creating pytorch_loss.egg-info
writing pytorch_loss.egg-info/PKG-INFO
writing dependency_links to pytorch_loss.egg-info/dependency_links.txt
writing top-level names to pytorch_loss.egg-info/top_level.txt
writing manifest file 'pytorch_loss.egg-info/SOURCES.txt'
reading manifest file 'pytorch_loss.egg-info/SOURCES.txt'
writing manifest file 'pytorch_loss.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/swish.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/frelu.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/generalized_iou_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/pc_softmax.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/focal_loss_old.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/focal_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/one_hot.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/soft_dice_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/amsoftmax.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/taylor_softmax.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/triplet_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/__init__.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/label_smooth.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/hswish.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/ema.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/test.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/dice_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/large_margin_softmax.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/lovasz_softmax.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/mish.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/conv_ops.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/ohem_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/affinity_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
copying pytorch_loss/dual_focal_loss.py -> build/lib.linux-x86_64-3.7/pytorch_loss
running build_ext
building 'focal_cpp' extension
creating /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7
creating /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7/csrc
Emitting ninja build file /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] /usr/local/cuda/bin/nvcc  -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.7m -c -c /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/csrc/focal_kernel.cu -o /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7/csrc/focal_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=focal_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
FAILED: /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7/csrc/focal_kernel.o
/usr/local/cuda/bin/nvcc  -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.7m -c -c /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/csrc/focal_kernel.cu -o /home/keras/notebook/nvme_raid/aveysov/pytorch-loss/build/temp.linux-x86_64-3.7/csrc/focal_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=focal_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
In file included from /usr/local/cuda/include/crt/math_functions.h:8958:0,
                 from /usr/local/cuda/include/crt/common_functions.h:295,
                 from /usr/local/cuda/include/cuda_runtime.h:115,
                 from <command-line>:0:
/usr/include/c++/7/cmath:45:15: fatal error: math.h: No such file or directory

compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1672, in _run_ninja_build
    env=env)
  File "/opt/conda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

I run the python3 setup.py install command in my dockerized research environment, which is derived from the official PyTorch GPU images:

ARG BASE_IMAGE=pytorch/pytorch:1.9.0-cuda11.1-cudnn8-devel
FROM $BASE_IMAGE

I remember when I faced similar problems in the past, I did something like this for compilation of some CUDA kernels, but then I removed these lines (it was a while ago!):

RUN apt-get install gcc-5 g++-5 g++-5-multilib gfortran-5 -y && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 60 --slave /usr/bin/g++ g++ /usr/bin/g++-5 --slave /usr/bin/gfortran gfortran /usr/bin/gfortran-5 && \
    update-alternatives --query gcc
RUN gcc --version

Could you maybe elaborate a bit here, since I am not very familiar with how the C++ ecosystem works?

(1)
As far as I can see, there is a standard autograd implementation and a custom CUDA implementation.
Since I am not very proficient with C++ and CUDA, may I ask what the reasoning was behind adding a custom CUDA kernel? Was the autograd version too slow or too memory intensive?

Many thanks for your advice and code!

Is your focal loss wrong? It seems a little different from others; can you explain your code?

import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    def __init__(self, alpha=1, gamma=2, logits=False, reduce=True):
        super(FocalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma
        self.logits = logits
        self.reduce = reduce

    def forward(self, inputs, targets):
        if self.logits:
            BCE_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction='none')
        else:
            BCE_loss = F.binary_cross_entropy(inputs, targets, reduction='none')
        pt = torch.exp(-BCE_loss)
        F_loss = self.alpha * (1 - pt) ** self.gamma * BCE_loss

        if self.reduce:
            return torch.mean(F_loss)
        else:
            return F_loss

AM-softmax implementation details

When using the AM-softmax loss, do we need to add another fully connected layer before it?

Such as avg_pooling --> fc --> amsoftmax?
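
For context, in the standard additive-margin softmax formulation the loss layer itself plays the role of the final fully connected classifier: features and per-class weights are L2-normalized, the margin is subtracted from the target-class cosine, and the result is scaled before a regular cross entropy. A schematic sketch of that formulation (not necessarily this repo's AMSoftmax signature):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxSketch(nn.Module):
    """Cosine similarities between normalized features and class weights, margin m on the target class, scale s."""
    def __init__(self, feat_dim, num_classes, m=0.35, s=30.):
        super().__init__()
        self.m, self.s = m, s
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, labels):
        cosine = F.linear(F.normalize(feats), F.normalize(self.weight))   # (N, num_classes)
        margin = torch.zeros_like(cosine).scatter_(1, labels.view(-1, 1), self.m)
        return F.cross_entropy(self.s * (cosine - margin), labels)

# so the usual stack is backbone -> pooling -> (optional embedding fc) -> AM-softmax layer;
# no extra plain-softmax fc is stacked on top of it.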

where is the mish_cpp?

When I looked at mish.py I saw MishV3, but it needs "import mish_cpp". Can I ask where mish_cpp is?

hard triplet convergence

I use triplet loss between data of two modalities to reduce the distance between different modalities of the same class and increase the distance between different modalities of different classes. But when I use the batch_all loss, the validation loss does not change; now, using the hard loss, the validation loss still does not change. What is the reason? I have found some answers saying that the triplet loss is difficult to converge. What do you do to deal with triplet loss convergence?

TPU support

Are these functions, especially the V2 and V3 versions, supported on GPU only, or are they supported on TPU as well?

installation failed

/usr/local/cuda/bin/nvcc -I/home/m/.local/lib/python3.7/site-packages/torch/include -I/home/m/.local/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/m/.local/lib/python3.7/site-packages/torch/include/TH -I/home/m/.local/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.7m -c csrc/focal_kernel.cu -o build/temp.linux-x86_64-3.7/csrc/focal_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=focal_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 -std=c++14
/home/m/.local/lib/python3.7/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

/home/m/.local/lib/python3.7/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

csrc/common.hpp(20): error: identifier "hexp" is undefined

csrc/common.hpp(35): error: identifier "hlog" is undefined

2 errors detected in the compilation of "csrc/focal_kernel.cu".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

not sure what other information to include... Driver Version: 465.19.01 CUDA Version: 11.3
let me know if you need any other information

use of TO_REMOVE in GIOU loss

Is there a reason to use "TO_REMOVE" while calculating the intersection between bounding boxes?

wh = (rb - lt + TO_REMOVE).clamp(min=0)

Because when I print the gt_area, pr_area, intersection and union, they come out as 4, 4, 4, 4 for the example given in the loss file. But from the example they should be 4, 4, 1, 7 respectively, or am I missing something?
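
For what it's worth, the numbers depend on whether the box coordinates are treated as inclusive pixel indices (the +1 / TO_REMOVE convention) or as continuous box edges. A small sketch with two hypothetical boxes shows how the same pair gives (gt_area, pr_area, intersection, union) = (4, 4, 1, 7) under the inclusive convention and very different values without it:

import torch

def box_stats(gt, pr, to_remove=1.):
    """Areas, intersection and union for two (x1, y1, x2, y2) boxes."""
    area = lambda b: (b[2] - b[0] + to_remove) * (b[3] - b[1] + to_remove)
    lt = torch.max(gt[:2], pr[:2])
    rb = torch.min(gt[2:], pr[2:])
    wh = (rb - lt + to_remove).clamp(min=0)
    inter = wh[0] * wh[1]
    union = area(gt) + area(pr) - inter
    return area(gt), area(pr), inter, union

gt = torch.tensor([0., 0., 1., 1.])      # hypothetical boxes in inclusive pixel coordinates
pr = torch.tensor([1., 1., 2., 2.])
print(box_stats(gt, pr, to_remove=1.))   # 4., 4., 1., 7.
print(box_stats(gt, pr, to_remove=0.))   # 1., 1., 0., 2. with continuous-edge coordinates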

Calculation about focal loss

Thanks for your great work!

As far as I know, the focal loss is $ FL = -y(1-p)^\gamma \log p - (1-y)\,p^\gamma \log(1-p) $, but the code in focal_loss.py is:

        coeff = torch.abs(label - probs).pow(self.gamma).neg()
        log_probs = torch.where(logits >= 0,
                F.softplus(logits, -1, 50),
                logits - F.softplus(logits, 1, 50))
        log_1_probs = torch.where(logits >= 0,
                -logits + F.softplus(logits, -1, 50),
                -F.softplus(logits, 1, 50))
        loss = label * self.alpha * log_probs + (1. - label) * (1. - self.alpha) * log_1_probs
        loss = loss * coeff

It seems like the $ \log p $ in the focal loss is implemented with F.softplus()? But F.softplus() computes $ y = \log(1+e^x) $, which is not $ \log x $.

How should I understand this difference? Or did I miss something about the focal loss?
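
One way to reconcile the two (a reading of the code, not an authoritative answer): this focal loss is the binary, per-logit variant with $ p = \sigma(x) $, and the log-probabilities of a sigmoid have an exact softplus form, which is what those torch.where branches evaluate in a numerically stable way:

$$ \log p = \log \sigma(x) = -\mathrm{softplus}(-x), \qquad \log(1-p) = \log \sigma(-x) = -\mathrm{softplus}(x) $$

Note that F.softplus(logits, -1, 50) passes beta = -1, so it computes $ -\log(1+e^{-x}) = \log\sigma(x) $ directly, and $ x - \mathrm{softplus}(x) = \log\sigma(x) $ as well; the plain $ \log(1+e^{x}) $ only appears, negated, in the $ \log(1-p) $ branches.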

LovaszSoftmax Loss test case not working

When I try to run your test case to check it out, it throws an error.

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  CUDA free failed: cudaErrorIllegalAddress: an illegal memory access was encountered

I'm using Python 3.7.10 and PyTorch 1.8.1. GPU is Nvidia RTX 3070.

Regarding Generalized IOU Loss

Can GIoU be used as a loss function? Actually, I am training a bounding-box prediction model and using GIoU as the loss function, but after a few iterations of the first epoch of training I am getting constant values.

Any plan to give a tutorial?

Hi, it's great work! However, I often find myself stuck on how to write a custom CUDA kernel / C++ extension. Do you have any plan to write a tutorial, or could you generously share some helpful links? I find that most tutorials on the internet, including PyTorch's official documentation on custom CUDA kernels, are not that welcoming to beginners.

BCEWithLogitsLoss already combines a Sigmoid layer and BCELoss in a single class, so why use torch.sigmoid again?

version 1: use torch.autograd

import torch
import torch.nn as nn

class FocalLossV1(nn.Module):

    def __init__(self,
                 alpha=0.25,
                 gamma=2,
                 reduction='mean',):
        super(FocalLossV1, self).__init__()
        self.alpha = alpha
        self.gamma = gamma
        self.reduction = reduction
        self.crit = nn.BCEWithLogitsLoss(reduction='none')

    def forward(self, logits, label):
        '''
        args:
            logits: tensor of shape (N, ...)
            label: tensor of shape(N, ...)
        '''

        # compute loss
        logits = logits.float() # use fp32 if logits is fp16
        with torch.no_grad():
            alpha = torch.empty_like(logits).fill_(1 - self.alpha)
            alpha[label == 1] = self.alpha

        probs = torch.sigmoid(logits)   # <-- the line in question
        pt = torch.where(label == 1, probs, 1 - probs)
        ce_loss = self.crit(logits, label.double())
        loss = (alpha * torch.pow(1 - pt, self.gamma) * ce_loss)
        if self.reduction == 'mean':
            loss = loss.mean()
        if self.reduction == 'sum':
            loss = loss.sum()
        return loss

torch.nn.BCEWithLogitsLoss(weight: Optional[torch.Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean', pos_weight: Optional[torch.Tensor] = None)
This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.

BCEWithLogitsLoss already combines a Sigmoid layer and BCELoss in one single class, so why use torch.sigmoid again?
Is anything wrong? Thanks.

How to call the triplet loss?

Hello, how can I call the triplet loss to train a CNN for image classification? How do I create triplets from the dataset? Thank you.
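
As a rough sketch (using PyTorch's built-in nn.TripletMarginLoss rather than this repo's TripletLoss, whose exact interface is not shown here): sample an anchor and a positive from the same class and a negative from a different class, embed all three with the CNN, and apply the margin loss to the embeddings:

import torch
import torch.nn as nn

# stand-in embedding network; any CNN with a pooled feature output works
embed_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))
criterion = nn.TripletMarginLoss(margin=0.3)

# anchor/positive come from the same class, negative from a different class
anchor   = torch.randn(8, 3, 32, 32)     # stand-ins for real image batches
positive = torch.randn(8, 3, 32, 32)
negative = torch.randn(8, 3, 32, 32)

loss = criterion(embed_net(anchor), embed_net(positive), embed_net(negative))
loss.backward()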

dual focal loss

What is the equivalent of the adaptive class weight layer from the original article? The output of that layer should be the input of the softmax here, pred = torch.softmax(logits, dim=1), but no such treatment is applied to this input.

About the triplet loss

Hello, I have a question about computing gradients for the triplet loss (even though PyTorch differentiates automatically). Online I have seen the loss differentiated separately with respect to the anchor, positive and negative embeddings, but I cannot see that from your loss function.
