
Active Visual Exploration Based on Attention-Map Entropy (IJCAI 2023)

Official PyTorch implementation of the paper: "Active Visual Exploration Based on Attention-Map Entropy"

Adam Pardyl, Grzegorz Rypeść, Grzegorz Kurzejamski, Bartosz Zieliński, Tomasz Trzciński
https://doi.org/10.24963/ijcai.2023/145

Abstract: Active visual exploration addresses the issue of limited sensor capabilities in real-world scenarios, where successive observations are actively chosen based on the environment. To tackle this problem, we introduce a new technique called Attention-Map Entropy (AME). It leverages the internal uncertainty of the transformer-based model to determine the most informative observations. In contrast to existing solutions, it does not require additional loss components, which simplifies the training. Through experiments, which also mimic retina-like sensors, we show that such simplified training significantly improves the performance of reconstruction, segmentation and classification on publicly available datasets.
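
To make the idea in the abstract concrete, here is a minimal sketch of entropy-based glimpse selection. It is not the implementation from this repository: it assumes each candidate location comes with one row of non-negative attention weights, and the names `attention_entropy` and `select_next_glimpse` are hypothetical. The real model works on transformer attention maps in PyTorch; this sketch uses plain Python lists only to show the selection rule.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of one row of attention weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    # Normalise so the row is a probability distribution.
    probs = [w / total for w in weights]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_next_glimpse(attention_rows, seen):
    """Return the index of the not-yet-seen candidate with the highest entropy.

    attention_rows: list of attention-weight rows, one per candidate location
                    (an assumption made for this sketch).
    seen:           set of candidate indices that were already observed.
    """
    candidates = [i for i in range(len(attention_rows)) if i not in seen]
    if not candidates:
        raise ValueError("no unseen candidates left")
    return max(candidates, key=lambda i: attention_entropy(attention_rows[i]))


rows = [
    [0.97, 0.01, 0.01, 0.01],  # sharply peaked attention: low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform attention: highest possible entropy
    [0.70, 0.10, 0.10, 0.10],  # in between
]
next_index = select_next_glimpse(rows, seen={1})
```

High entropy here stands in for "the model is uncertain about this location", so the most uncertain unseen location is observed next. No extra loss term is involved, which matches the claim in the abstract.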


Setup

git clone https://github.com/apardyl/AME.git && cd AME
conda env create -f environment.yml -n wtln # we recommend using mamba instead of conda (better performance)
conda activate wtln

Train

  • download and extract the requested dataset
  • run training with:
python train.py <dataset> <model> [params]

where dataset is one of:

  • Reconstruction task:
    • ADE20KReconstruction
    • Coco2014Reconstruction
    • Sun360Reconstruction
    • TestImageDirReconstruction
  • Segmentation task:
    • ADE20KSegmentation
  • Classification task:
    • Sun360Classification (for the train-all configuration)
    • EmbedClassification (for the head-only configuration; prepare embeddings with predict.py first)

and model is one of:

  • {Attention/Random/Checkerboard}Mae for reconstruction
  • {Attention/Random/Checkerboard}SegMae for segmentation
  • {Attention/Random/Checkerboard}ClsMae for train-all classification
  • EmbedClassifier for head-only classification

Example: train AttentionMae on MS COCO 2014 for the reconstruction task

python train.py Coco2014Reconstruction AttentionMae --data-dir DATASET_DIR

Run python train.py <dataset> <model> --help for available training params.
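
The dataset and model pairings listed above can be checked before launching a run. The helper below is a sketch, not part of the repository: `DATASET_TASK`, `models_for` and `build_train_command` are hypothetical names, and the pairing rules are read off the lists in this README. Whether train.py itself rejects other combinations is not stated here.

```python
# Dataset name -> task, copied from the lists in this README.
DATASET_TASK = {
    "ADE20KReconstruction": "reconstruction",
    "Coco2014Reconstruction": "reconstruction",
    "Sun360Reconstruction": "reconstruction",
    "TestImageDirReconstruction": "reconstruction",
    "ADE20KSegmentation": "segmentation",
    "Sun360Classification": "train-all classification",
    "EmbedClassification": "head-only classification",
}

# Glimpse selection variants and per-task model suffixes from the model list.
GLIMPSE_POLICIES = ("Attention", "Random", "Checkerboard")
MODEL_SUFFIX = {
    "reconstruction": "Mae",
    "segmentation": "SegMae",
    "train-all classification": "ClsMae",
}


def models_for(dataset):
    """Return the model names this README lists for the dataset's task."""
    if dataset not in DATASET_TASK:
        raise ValueError(f"unknown dataset: {dataset}")
    task = DATASET_TASK[dataset]
    if task == "head-only classification":
        return ["EmbedClassifier"]
    return [policy + MODEL_SUFFIX[task] for policy in GLIMPSE_POLICIES]


def build_train_command(dataset, model, data_dir):
    """Build the argument list for `python train.py <dataset> <model> --data-dir ...`."""
    allowed = models_for(dataset)
    if model not in allowed:
        raise ValueError(f"{model} is not listed for {dataset}; expected one of {allowed}")
    return ["python", "train.py", dataset, model, "--data-dir", data_dir]


command = build_train_command("Coco2014Reconstruction", "AttentionMae", "DATASET_DIR")
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root, with the `wtln` environment active.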

Visualizations from the paper can be generated using predict.py (use the --visualization-path param).

Embeddings for head-only classification are generated with predict.py (--dump-path).

Average glimpse selection maps and an evaluation of a trained model can be obtained from predict.py with the --avg-glimpse-path and --test params, respectively.

Trained models

Models trained by us are available at: https://ideasncbr-my.sharepoint.com/:f:/g/personal/adam_pardyl_ideas-ncbr_pl/EljEH2vCtdFGj-LcYleHIhYB1ANXWmZhoH0DnjejY-IjiA?e=HVsQjp

Citation

If you find this useful for your research, please cite the following.

@inproceedings{pardyl2023active,
  title     = {Active Visual Exploration Based on Attention-Map Entropy},
  author    = {Pardyl, Adam and Rypeść, Grzegorz and Kurzejamski, Grzegorz and Zieliński, Bartosz and Trzciński, Tomasz},
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on
               Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Edith Elkind},
  pages     = {1303--1311},
  year      = {2023},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2023/145},
  url       = {https://doi.org/10.24963/ijcai.2023/145}
}
