
Comments (6)

HKervadec commented on September 18, 2024

Hey,

The function transforming the one-hot encoding to distance maps is here: https://github.com/LIVIAETS/surface-loss/blob/108bd9892adca476e6cdf424124bc6268707498e/utils.py#L198 It computes one distance map per class.
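For reference, here is a minimal sketch of what that function computes, using SciPy's Euclidean distance transform (the name one_hot2dist_sketch and the float32 dtype are mine; see utils.py for the real thing):

import numpy as np
from scipy.ndimage import distance_transform_edt as eucl_distance


def one_hot2dist_sketch(seg: np.ndarray) -> np.ndarray:
    # seg: one-hot encoded ground truth, shape (C, H, W)
    C = len(seg)
    res = np.zeros_like(seg, dtype=np.float32)
    for c in range(C):
        posmask = seg[c].astype(bool)
        if posmask.any():
            negmask = ~posmask
            # Positive distances outside the object, negative inside,
            # zero right on the boundary pixels
            res[c] = (eucl_distance(negmask) * negmask
                      - (eucl_distance(posmask) - 1) * posmask)
    return res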

To address your questions directly:

otherwise the distance map value may be 0 to 10 or 100. It is too high

Well, that is the point of this paper: to have a much higher loss for far-away pixels.

If I do it for 3-class segmentation, so C = 3, then I must compute the distance map of each class: background (class 0), class 1, and class 2, then store it in BxCxHxW. Is that right?

Yes. Note that we are still experimenting with the supervision of the background class; either treat it as another class, or simply ignore it as we do in the binary case. This is why my losses have the self.idc parameter, to select all or only a few classes to supervise. The proper choice will depend on the dataset you are using; I think it won't matter for balanced problems, but for highly unbalanced problems supervising the background class might be harmful.
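For illustration, a minimal sketch of how such an idc parameter can slice out the supervised classes (simplified from the losses in this repo; the real implementation takes an extra argument and does more checking):

import torch
from torch import Tensor, einsum


class SurfaceLossSketch:
    def __init__(self, idc):
        self.idc = idc  # indices of the classes to supervise, e.g. [1, 2]

    def __call__(self, probs: Tensor, dist_maps: Tensor) -> Tensor:
        # probs, dist_maps: bcwh; keep only the selected classes
        pc = probs[:, self.idc, ...].type(torch.float32)
        dc = dist_maps[:, self.idc, ...].type(torch.float32)
        # Element-wise product of probabilities and signed distances, then mean
        return einsum("bcwh,bcwh->bcwh", pc, dc).mean()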

This coding style does not look like PyTorch.

I use einsum extensively because it self-documents the code and the way the data is represented. This way I do not have to manually track what shape my tensors have, which is great when you come back to your code after a few months. It is also convenient when you need to multiply and sum at once, as for the intersection: Tensor = w * einsum("bcwh,bcwh->bc", pc, tc).
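As a toy example of that intersection line (random tensors, shapes only; the weight shape is my assumption):

import torch
from torch import einsum

b, c, w, h = 2, 3, 8, 8
pc = torch.rand(b, c, w, h)   # softmax probabilities
tc = torch.rand(b, c, w, h)   # one-hot ground truth (random here, for shape only)
weights = torch.rand(b, c)    # per-class weights, as in the GDL

# Multiply element-wise and sum over the spatial dims in a single call;
# the subscripts document the bcwh layout of both inputs:
intersection = weights * einsum("bcwh,bcwh->bc", pc, tc)  # shape (b, c)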

Type hints are also great when you keep switching between Tensors, NumPy arrays and floats, as they all behave differently.

Why do we need to one-hot the distance map?

I think you slightly misunderstood. I make sure the distance map is not a valid one-hot encoding (a one-hot encoding contains only 0s and 1s and sums to 1 across the C axis); it is just a way to check that I did not mix up the ground truth and the distance map.
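A minimal sketch of such a sanity check (is_one_hot is a hypothetical helper; the real assertions live in utils.py). The ground truth passes it; the distance map deliberately fails it:

import torch
from torch import Tensor


def is_one_hot(t: Tensor, axis: int = 1) -> bool:
    # A valid one-hot tensor contains only 0s and 1s
    # and sums to 1 across the class axis
    zeros_and_ones = ((t == 0) | (t == 1)).all()
    sums_to_one = (t.sum(axis) == 1).all()
    return bool(zeros_and_ones) and bool(sums_to_one)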


John1231983 commented on September 18, 2024

Thank you for your reply. I spent one day converting your code to the three-class problem. I have a segmented image of size BxHxW, with values from 0 to 2. First, I convert it to a one-hot vector using this code:


import torch
import torch.nn as nn


class One_Hot(nn.Module):
    def __init__(self, depth):
        super(One_Hot, self).__init__()
        self.depth = depth
        # Identity matrix: row i is the one-hot vector for class i
        self.ones = torch.eye(depth).cuda()

    def forward(self, X_in):
        # X_in: integer label map, e.g. B x H x W with values in [0, depth)
        n_dim = X_in.dim()
        output_size = X_in.size() + torch.Size([self.depth])
        num_element = X_in.numel()
        X_in = X_in.long().view(num_element)
        # Look up the one-hot row of every pixel, then restore the input shape
        out = self.ones.index_select(0, X_in).view(output_size)
        # Move the class axis next to the batch axis; the squeeze drops a
        # possible singleton dim when labels come in as B x 1 x H x W
        return out.permute(0, -1, *range(1, n_dim)).squeeze(dim=2).float()

    def __repr__(self):
        return self.__class__.__name__ + "({})".format(self.depth)

Then, I can use your function one_hot2dist directly. This is the boundary loss modified for the three-class problem:

import torch
import torch.nn as nn
import torch.nn.functional as F


class BoundaryLoss(nn.Module):
    def __init__(self, n_classes):
        super(BoundaryLoss, self).__init__()
        self.n_classes = n_classes
        self.one_hot_encoder = One_Hot(n_classes).forward

    def forward(self, input, target):
        # input: raw logits B x C x H x W; target: label map B x H x W
        batch_size = input.size(0)
        input = F.softmax(input, dim=1).view(batch_size, self.n_classes, -1)
        target = self.one_hot_encoder(target).contiguous().view(batch_size, self.n_classes, -1)
        # Distance maps are computed on CPU with NumPy, then moved back to GPU
        dist = one_hot2dist(target.cpu().numpy())
        dist = torch.from_numpy(dist).cuda()
        # Mean of the softmax probabilities weighted by the signed distances
        score = torch.mean(input * dist)
        score = score / (float(batch_size) * float(self.n_classes))

        return score

Could you check whether my implementation is correct? Thanks.


HKervadec commented on September 18, 2024

You are on track, but several things:

You can simply chain the class2one_hot and one_hot2dist functions from utils.py to get all the distance maps, one per class.
This part shows how I go from the loaded image of shape wh to C distance maps of shape cwh.

I usually put that part in the dataloader, as it takes some time, the data needs to come back to the CPU anyway, and it can be pre-computed. Then, during the main training loop, the distance maps are directly available and can be fed to my implementation; it already handles the multi-class case.
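Roughly, the dataloader-side pre-computation looks like this (a sketch assuming the class2one_hot and one_hot2dist signatures from utils.py; label_image is a placeholder for whatever your dataset loads):

import numpy as np
import torch

# Inside the dataset's __getitem__, for one label map of shape wh:
label = torch.from_numpy(np.asarray(label_image)).long()  # wh, values in {0, 1, 2}
one_hot = class2one_hot(label[None, ...], 3)[0]           # cwh (batch dim added then dropped)
dist_map = one_hot2dist(one_hot.cpu().numpy())            # cwh, signed distance per class
dist_map = torch.from_numpy(dist_map).float()             # ready to be returned by the loader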

For you, you would need to do something like this:

surface_loss = SurfaceLoss(idc=[0, 1, 2])
# Or idc=[1, 2] if you do not care about supervising the background

for input_image, dist_maps in dataloader:
    # input_image: bwh
    # dist_maps: bcwh
    optimizer.zero_grad()

    output_logits = net(input_image)  # bcwh
    output_softmaxes = F.softmax(output_logits, dim=1)  # bcwh

    loss = surface_loss(output_softmaxes, dist_maps, None)
    # The extra parameter is a left-over for compatibility reasons; some of my
    # other works take an extra parameter there
    loss.backward()
    optimizer.step()

Concerning your implementation: you do not need to divide by b * c after taking the mean of the products. Apart from that I think it is correct, but as stated before, much of the work can be moved into the dataloader to speed things up.

Let me know if you have other questions or if I was not clear, I will be happy to help.


John1231983 commented on September 18, 2024

@HKervadec: It works now. However, it hurts my performance, so I think I misunderstood your intent. Your purpose is to use the distance map to guide the segmentation loss, and the distance map is computed from the boundary of the object. For example, a pixel that belongs to the foreground region but lies in its center is far from the boundary, so it has a high-value weight (large distance). Under the surface loss that pixel will tend to a high loss and will be pushed towards background; only pixels near the boundary will have a low surface loss. Hence I guess the output of your network must be the boundary instead of whole regions like UNet. Am I right?


HKervadec commented on September 18, 2024

Hence I guess the output of your network must be the boundary instead of whole regions like UNet. Am I right?

No. If you go back to the paper, Equation (5), you will see that we use the softmax probabilities (s_\theta) from the network and the level set \phi_G. I invite you to re-read Section 2 thoroughly, as it explains in detail how we ended up with this formulation. Some papers use two branches: one for region segmentation, while the other has a regression layer to predict the distance to the boundary; then both losses are combined. But with our formulation we do not need that, as the distance information is already included.
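For reference, the discrete form of that loss is simply the distance-weighted sum of the softmax outputs over the image domain (notation paraphrased from the paper):

\mathcal{L}_B(\theta) = \sum_{q \in \Omega} \phi_G(q)\, s_\theta(q)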

However, it hurts my performance.

There are several possible explanations for that:

  • We evaluate on highly unbalanced datasets (very few foreground pixels per image), where the cross-entropy, GDL or Dice loss do not get good results. The boundary loss is there to overcome their limitations, not to replace them. When you apply it to balanced datasets, those losses will have different limitations (or will just work), so the effect of the boundary loss will be different.
  • Also, the optimal way to balance the two losses (GDL + boundary loss) might be different for balanced datasets; see the sketch after this list.
  • Or there might be a way to make the boundary loss work by itself.
  • As I stated before, we have not addressed how to handle the background class in multi-class settings. I suspect that if you treat it as just another class, it can create trivial local minima during optimization, so it might be better to ignore it (i.e. use self.idc = [1, 2] instead of self.idc = [0, 1, 2]).
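To make the balancing point concrete, here is a sketch of how the two losses can be combined with a growing weight on the boundary term (the schedule and values are illustrative, not the paper's exact ones; gdl, net, optimizer and dataloader are assumed to exist):

alpha = 0.01  # initial weight of the boundary term (illustrative value)
for epoch in range(num_epochs):
    for input_image, target_one_hot, dist_maps in dataloader:
        optimizer.zero_grad()
        probs = F.softmax(net(input_image), dim=1)          # bcwh
        loss = gdl(probs, target_one_hot) \
            + alpha * surface_loss(probs, dist_maps, None)  # GDL + boundary
        loss.backward()
        optimizer.step()
    alpha = min(alpha + 0.01, 1.0)  # grow the boundary weight each epoch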

All those points are still active research problems; the paper introduces the formulation and shows that it has a positive effect on highly unbalanced segmentation problems, but there is still a lot of unexplored territory.

Since this is not a code problem anymore, I think we could close this issue and continue the collaboration by e-mail, to apply this boundary loss to other settings.


John1231983 commented on September 18, 2024

Some papers use two branches: one for region segmentation, while the other has a regression layer to predict the distance to the boundary; then both losses are combined.

Thanks. Could you provide some paper links or titles? I guess my problem is that the object is not so highly imbalanced.

