
Comments (8)

frgfm avatar frgfm commented on August 25, 2024

Hi @PengtaoJiang 👋

Thanks for reaching out! Congrats on the benchmark results in your paper. I'll take a look and try to come up with a good implementation 👌

from torch-cam.

frgfm avatar frgfm commented on August 25, 2024

Hey again @PengtaoJiang !

For a clean integration, I'll proceed in two PRs:

  • one to integrate LayerCAM without stage fusion (#77)
  • another to add stage fusion

The first one is already merged. I'll think about how to properly integrate stage fusion 👍


PengtaoJiang avatar PengtaoJiang commented on August 25, 2024

Hi @frgfm,
I just use a simple element-wise maximum to fuse the CAMs from multiple layers. Also, please note that for layers that are followed by a downsampling layer (max pooling in VGG, or a conv with stride > 1 in ResNet), the CAM visualizations usually show a grid effect. This issue comes from the backward pass of the gradient.
I usually choose a nearby layer for visualization instead, for example pool4 instead of conv3_3 in VGG16, or model.layer3[-2] instead of model.layer3[-1]. Another option is to upsample the following layer's gradient, for example Up(pool4's gradient) * conv3_3's activation. VGG16 usually gives clearer visualizations than ResNet. ResNet tends to downsample too much in the first few layers. Besides, we also found that a larger input yields more fine-grained CAM visualizations for lower layers.
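As an illustration of the element-wise maximum fusion described above, here is a minimal, dependency-free sketch. Plain nested lists stand in for CAM tensors, and the helper names `upsample_nearest` and `fuse_max` are hypothetical (they are not part of torch-cam or of the LayerCAM code). It also assumes nearest-neighbour resizing to the largest map, which the comment does not specify.

```python
def upsample_nearest(cam, out_h, out_w):
    """Resize a 2D map (list of rows) to (out_h, out_w) by nearest-neighbour lookup."""
    in_h, in_w = len(cam), len(cam[0])
    return [
        [cam[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def fuse_max(cams):
    """Fuse 2D maps of possibly different sizes with an element-wise maximum."""
    # Assumption: every map is brought to the largest spatial size first.
    out_h = max(len(c) for c in cams)
    out_w = max(len(c[0]) for c in cams)
    resized = [upsample_nearest(c, out_h, out_w) for c in cams]
    return [
        [max(r[i][j] for r in resized) for j in range(out_w)]
        for i in range(out_h)
    ]


# Toy maps: a fine 4x4 map from a shallow layer, a coarse 2x2 map from a deep one.
shallow = [
    [0.1, 0.9, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.4],
]
deep = [
    [0.5, 0.0],
    [0.0, 0.6],
]
fused = fuse_max([shallow, deep])
```

Each output location keeps the strongest response across layers, so the fine detail of the shallow map survives wherever it exceeds the coarse deep map.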


frgfm avatar frgfm commented on August 25, 2024

Hey again @PengtaoJiang 👋

Regarding stage fusion, how to actually perform it isn't an issue. My current design only allows a single target_layer for the CAMs, with the corresponding hooking scheme. I just need to find an elegant and robust way to specify several target layers and to make sure the hooks report the correct information.

One question though, since I cannot find the answer in the paper: when fusing the CAMs from multiple stages, do you perform ReLU + normalization before fusing, or after? I intend to do it after so that, with the normalization, the importance of each layer is equal, but I wanted to have your take on this :)

Regarding the layer selection, with this repo, the user is free to select any layer she/he wants so that won't be a problem!


PengtaoJiang avatar PengtaoJiang commented on August 25, 2024

Hi @frgfm,
I perform ReLU + normalization before fusion.
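To make the ordering concrete, here is a small sketch of "ReLU + normalization, then fusion". It assumes min-max normalization to [0, 1] and same-sized maps, neither of which is stated in the thread, and the function names are hypothetical. Plain lists stand in for tensors.

```python
def relu_normalize(cam, eps=1e-8):
    """Clamp negatives to zero, then min-max rescale the map to roughly [0, 1]."""
    rectified = [[max(v, 0.0) for v in row] for row in cam]
    lo = min(min(row) for row in rectified)
    hi = max(max(row) for row in rectified)
    scale = (hi - lo) + eps  # eps avoids a division by zero on constant maps
    return [[(v - lo) / scale for v in row] for row in rectified]


def fuse_after_normalizing(cams):
    """Normalize each map on its own, then take the element-wise maximum."""
    normed = [relu_normalize(c) for c in cams]
    height, width = len(normed[0]), len(normed[0][0])
    return [
        [max(n[i][j] for n in normed) for j in range(width)]
        for i in range(height)
    ]


# Two raw maps on very different scales (values chosen for illustration only).
raw_a = [[-2.0, 4.0], [0.0, 2.0]]
raw_b = [[40.0, 0.0], [-5.0, 20.0]]
fused = fuse_after_normalizing([raw_a, raw_b])
```

Because each map is rescaled before the maximum is taken, the peak of `raw_a` (4.0) and the peak of `raw_b` (40.0) both end up close to 1 in the fused map. Taking the maximum of the raw maps instead would let `raw_b` dominate at every location where it is positive.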


frgfm avatar frgfm commented on August 25, 2024

Hi @PengtaoJiang 👋

Small update, I finally found time to handle this in a clean way:

  • adding support to compute CAMs for multiple layers at the same time (#89)
  • adding the custom fusion method to LayerCAM (#93)

This should be dealt with very shortly 👌


frgfm avatar frgfm commented on August 25, 2024

FYI @PengtaoJiang, I added a speed benchmark and a Colab tutorial that includes CAM fusion with LayerCAM, if you're interested: https://github.com/frgfm/notebooks/blob/main/torch-cam/quicktour.ipynb


frgfm avatar frgfm commented on August 25, 2024

Another update @PengtaoJiang: I updated the demo in #105 so that you can retrieve CAMs from multiple layers at once and fuse them automatically. You only have to put a plus sign "+" between the layer names (e.g. "layer2+layer3+layer4").
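For readers wondering how a "+"-separated layer string could be turned into a list of target layers, a minimal sketch follows. `parse_target_layers` is a hypothetical helper written for illustration and is not taken from the demo's source.

```python
def parse_target_layers(spec):
    """Split a "+"-separated string such as "layer2+layer3" into layer names."""
    names = [name.strip() for name in spec.split("+")]
    names = [name for name in names if name]  # drop empty pieces like "a++b"
    if not names:
        raise ValueError("expected at least one layer name")
    return names


layers = parse_target_layers("layer2+layer3+layer4")
```

The resulting list could then be passed wherever several target layers are accepted, with one CAM computed per name before fusion.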

You can find the deployed version over here 👉 https://huggingface.co/spaces/frgfm/torch-cam
(select LayerCAM as CAM method, enter the layers' names and enjoy the fused CAM)
