
aws-cv-unique-information's Introduction

Information content of samples

This repository contains the authors' implementation of smooth unique information, proposed in the paper "Estimating informativeness of samples with Smooth Unique Information" by Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, and Stefano Soatto. The work defines and estimates the smooth unique information of samples with respect to classifier weights and predictions. These quantities are computed for linearized neural networks.

To cite the paper please use the following BibTeX:

@inproceedings{harutyunyan2021estimating,
  title={Estimating informativeness of samples with Smooth Unique Information},
  author={Hrayr Harutyunyan and Alessandro Achille and Giovanni Paolini and Orchid Majumder and Avinash Ravichandran and Rahul Bhotika and Stefano Soatto},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=kEnBH98BGs5}
}

About the repository

  • nnlib contains useful general tools for training and working with neural networks. Source.
  • sample_info is the main directory to look at.
  • sample_info/methods/ implements standard and linearized classifiers. The latter are mainly used for testing and debugging and are never used in the main experiments.
  • sample_info/configs/ lists neural network architectures.
  • sample_info/modules is the most important subdirectory:
    • data_utils.py contains simple datasets and tools for creating datasets.
    • influence_functions.py implements influence functions.
    • misc.py contains a few tools.
    • nn_utils.py extends the corresponding file from nnlib and is used to parse neural networks from architecture configs.
    • ntk.py is one of the most important files. It implements the functions needed for working with linearized neural networks, such as computing Jacobians and predicting weights, training predictions, and test predictions at custom times.
    • sgd.py implements computation of the SGD noise covariance matrix and its diagonal.
    • stability.py implements the proposed methods: computing information with weights or activations.
  • sample_info/notebooks contains some Jupyter notebooks. Their content can be ignored, as most of them are not up to date and were used only for initial experiments.
  • sample_info/scripts contains experiment scripts and code for generating commands.
  • tests implements some unit tests, which mainly test the NTK tools and stability measures.
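
To make the term "linearized neural network" concrete, here is a minimal sketch of the idea: a model is replaced by its first-order Taylor expansion in the weights around a reference point. The toy one-unit model and the names `model`, `jacobian`, and `linearized_model` are assumptions made for illustration; they are not taken from ntk.py, which works with full networks.

```python
import math

def model(x, w):
    """Toy one-hidden-unit network: f(x; w) = w[2] * tanh(w[0] * x + w[1])."""
    return w[2] * math.tanh(w[0] * x + w[1])

def jacobian(x, w):
    """Analytic gradient of model(x, w) with respect to the three weights."""
    h = math.tanh(w[0] * x + w[1])
    dh = 1.0 - h * h  # derivative of tanh at the pre-activation
    return [w[2] * dh * x, w[2] * dh, h]

def linearized_model(x, w, w0):
    """First-order Taylor expansion of the model around the weights w0."""
    j = jacobian(x, w0)
    delta = sum(ji * (wi - w0i) for ji, wi, w0i in zip(j, w, w0))
    return model(x, w0) + delta

# Hypothetical reference weights and a small perturbation of them.
w0 = [0.5, -0.1, 1.2]
w = [0.51, -0.09, 1.21]
x = 0.7
print(model(x, w), linearized_model(x, w, w0))
```

Near the reference weights the two printed values should agree closely, and the gap grows as the weights move further from w0.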

Requirements

  • Standard libraries like numpy, scipy, tqdm, scikit-learn, matplotlib
  • PyTorch 1.4
  • TensorBoard

To run the tests:

nosetests tests

Additionally, the $DATA_DIR environment variable must be set and must point to the directory where all data is stored.
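
As an illustration of how code can depend on this variable, the sketch below reads $DATA_DIR and fails early with a clear message when it is missing. The helper name `get_data_dir`, the placeholder path, and the `mnist` subdirectory are hypothetical; the repository may read the variable differently.

```python
import os

def get_data_dir():
    """Return the directory named by $DATA_DIR, failing early if it is unset."""
    data_dir = os.environ.get("DATA_DIR")
    if not data_dir:
        raise RuntimeError("Set the DATA_DIR environment variable to the data directory.")
    return data_dir

# Placeholder value so the demo runs; a real setup would export DATA_DIR in the shell.
os.environ.setdefault("DATA_DIR", "/tmp/data")
print(os.path.join(get_data_dir(), "mnist"))  # "mnist" is a hypothetical subdirectory
```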

Experiments

Most experiment commands can be generated with the script sample_info/scripts/generate_commands.py. Here is an example of how to run the experiment that computes correlations of informativeness scores with ground truth for MNIST 4 vs 9 classification with an MLP.

MNIST 4 vs 9, full batch, mlp 1024

  • Generate and run the commands for getting the ground truth effects, informativeness scores using our method, and influence functions:
python -um sample_info.scripts.generate_commands --exp_names mnist4vs9_fullbatch_noreg_small_cnn_ground_truth \
    mnist4vs9_fullbatch_noreg_small_cnn_informativeness \
    mnist4vs9_fullbatch_noreg_small_cnn_influence_functions
  • Aggregate the results:
python -um sample_info.scripts.aggregate_ground_truth_results --exp_name mnist4vs9_fullbatch_noreg_small_cnn -n 1000
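
The experiment above compares informativeness scores with ground-truth effects through a correlation. The exact statistic used by the aggregation script is not stated here, so the following is only a sketch that assumes a Pearson correlation; the function name `pearson` and the toy numbers are made up for illustration and are not results from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy values only: one score and one ground-truth effect per training sample.
informativeness = [0.10, 0.40, 0.35, 0.80, 0.05]
ground_truth = [0.12, 0.30, 0.45, 0.70, 0.02]
print(pearson(informativeness, ground_truth))
```

A value close to 1 would indicate that the scores rank samples similarly to the ground-truth effects.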

License

This project is licensed under the Apache-2.0 License.
