
Comments (10)

tano297 commented on June 19, 2024

Hi.

nn.CrossEntropyLoss acts on the raw activation logits, applying the softmax and the logarithm for you. In my code organization, I do the softmax inside the "segmentator" class and take the logarithm right before calling the loss function, so the behavior is the same as using nn.NLLLoss.

Let me know if this is clear, or I'll send you links to the relevant lines when I'm at my computer
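The equivalence between the two approaches can be checked with a small, self-contained sketch (the tensor shapes here are illustrative, not taken from the repo):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)           # batch of 4 samples, 3 classes
target = torch.tensor([0, 2, 1, 2])  # ground-truth class per sample

# Option A: CrossEntropyLoss on raw logits (softmax + log done internally)
loss_ce = nn.CrossEntropyLoss()(logits, target)

# Option B: softmax inside the model, log right before NLLLoss
probs = nn.Softmax(dim=1)(logits)
loss_nll = nn.NLLLoss()(torch.log(probs), target)

assert torch.allclose(loss_ce, loss_nll)  # same loss either way
```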

from lidar-bonnetal.

BenjaminYoung29 commented on June 19, 2024

Yes it's very clear to me! Thanks!


BenjaminYoung29 commented on June 19, 2024

> Hi.
>
> nn.CrossEntropyLoss acts on the activation logits, doing the logarithm and the softmax for you. In my organization, I do the softmax inside the "segmentator" class, and the logarithm right before calling the loss function, so the behavior is the same using nn.NLLLoss.
>
> Let me know if this is clear or I'll send you links to the lines when I'm in my computer

Hi.

I need to count the FLOPs of my network. Do you have any good ideas for doing this?

Thanks.


tano297 commented on June 19, 2024

I would try with something like this https://github.com/Lyken17/pytorch-OpCounter


BenjaminYoung29 commented on June 19, 2024

> I would try with something like this https://github.com/Lyken17/pytorch-OpCounter

I have tried it out, and it works fine. Here is what to note.

First, `pip install thop`.

Then, in segmentator.py:

```python
from thop import profile, clever_format

# change 512 to the input width set in your yaml config
input = torch.randn(1, 5, 64, 512)
device = torch.device("cuda")
input = input.to(device)
self.decoder.cuda()
self.head.cuda()
flops, params = profile(self, inputs=(input,))
flops, params = clever_format([flops, params], "%.3f")
print("FLOPS: ", flops)
print("Total params: ", params)
```

And I noticed a question while doing this: in your code you move self.backbone to cuda, but you don't do the same for self.decoder or self.head. Why is that?


tano297 commented on June 19, 2024

Hi,

The backbone is moved to cuda there because it is being profiled with a fake input to get the shape of the skip connections (which the decoder needs in order to define its internal structure). Since backbone, decoder, and head are nn.Module members of the 'Segmentator' class, they all go to cuda when I do self.model.cuda here. This is standard pytorch behavior: when you call functions such as .train(), .eval(), .cuda(), .cpu(), etc. on a module, the call is applied to all of its children.
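This propagation to children can be verified on CPU with .eval(), which follows the same recursion as .cuda() (the class below is a minimal stand-in for the structure described above, not the repo's actual code):

```python
import torch.nn as nn

class Segmentator(nn.Module):
    # illustrative sketch: three child modules, like backbone/decoder/head
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(5, 32, 3, padding=1)
        self.decoder = nn.Conv2d(32, 32, 3, padding=1)
        self.head = nn.Conv2d(32, 20, 1)

model = Segmentator()
model.eval()  # recurses into all children, just like .cuda()/.train()
assert not model.backbone.training
assert not model.decoder.training
assert not model.head.training
```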


BenjaminYoung29 commented on June 19, 2024

Hi,

Do you know what affects the inference time of the network? I find it puzzling that when I set the batch size to 48 (trained on two 1080Tis), the inference fps of my own network was 96, and when I set it to 8, the fps was 60. I also ran your DarkNet53-512px, and its inference fps was 50. DarkNet53 has 50M parameters, while mine has 4M, yet our inference times differ very little, which seems counterintuitive.


tano297 commented on June 19, 2024

Hi,

Batching always helps the fps; here is a good post from NVIDIA with an fps vs. batch size analysis (click on "inference"). This is because each kernel launches once for the same layer across all images in the batch, rather than once per image, which keeps GPU utilization closer to 100%.

In terms of the number of parameters, fewer is not always better. The darknet backbone was specifically designed in the YOLO paper to maximize GPU utilization, using simple operators with very fast implementations and not too many sequential layers, since long sequential chains add dead time that hurts GPU utilization. Therefore the relationship between flops and runtime, or parameters and runtime, is NEVER linear. This holds not just for this framework, but for every GPU-based application (especially deep CNNs).
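The batching effect can be observed even on CPU with a rough timing sketch like the one below (the network and sizes are illustrative; on a GPU the per-image gap is typically much larger, and you would need torch.cuda.synchronize() around the timers for accurate measurements):

```python
import time
import torch
import torch.nn as nn

# small stand-in network, same input channel count as above (5)
net = nn.Sequential(nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 32, 3, padding=1))

def per_image_ms(batch_size, iters=5):
    """Rough per-image forward-pass latency in milliseconds."""
    x = torch.randn(batch_size, 5, 64, 512)
    with torch.no_grad():
        net(x)  # warm-up run, excluded from timing
        t0 = time.perf_counter()
        for _ in range(iters):
            net(x)
        elapsed = time.perf_counter() - t0
    return 1000.0 * elapsed / (iters * batch_size)

for bs in (1, 8):
    print(f"batch {bs}: {per_image_ms(bs):.2f} ms/image")
```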


BenjaminYoung29 commented on June 19, 2024

Hi,

Thanks a lot for all your enlightening replies; they helped me a lot. I realized that the validation was run with a batch size bigger than 1, which makes the result reasonable. But in real life a LiDAR collects point clouds at about 10 Hz, and the embedded device on a car processes the inputs one by one, so it would be more reasonable to compare fps at batch size 1.

Thanks again for your help these days.


tano297 commented on June 19, 2024

Yes, a batch-1 comparison makes the most sense. For the paper, all experiments were run with batch 1, using the inference script and adding traces to calculate means and standard deviations of the runtimes. That code is not in the repo since it was only needed to generate results for the paper
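A minimal sketch of that kind of batch-1 runtime trace (not the paper's actual code, which is not in the repo; the model here is a stand-in):

```python
import time
import statistics
import torch
import torch.nn as nn

net = nn.Conv2d(5, 32, 3, padding=1)  # stand-in for the full model
runtimes = []
with torch.no_grad():
    for _ in range(10):
        x = torch.randn(1, 5, 64, 512)  # batch size 1, as in the paper
        t0 = time.perf_counter()
        net(x)
        # on GPU, call torch.cuda.synchronize() before reading the clock
        runtimes.append(time.perf_counter() - t0)

print(f"mean {statistics.mean(runtimes) * 1000:.2f} ms, "
      f"std {statistics.stdev(runtimes) * 1000:.2f} ms")
```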

