grae's Introduction

Geometry Regularized Autoencoders (GRAE)

DOI:10.1109/TPAMI.2022.3222104

Source code for the paper Geometry Regularized Autoencoders (see the DOI above and the reference below). The traditional autoencoder objective is augmented with a term that regularizes the latent space towards a manifold learning embedding, e.g., PHATE. A more detailed explanation of the method can be found in GRAE_poster.pdf.
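
As a rough illustration of the idea, the regularized objective can be sketched as a reconstruction term plus a penalty that pulls each latent code toward its precomputed embedding coordinate. This is a minimal NumPy sketch and not the repository's implementation: the function name grae_style_loss, the weight lam, and the use of mean squared error for both terms are assumptions made for illustration.

```python
import numpy as np


def grae_style_loss(x, x_hat, z, target, lam=1.0):
    """Reconstruction error plus a latent-space penalty (illustrative only).

    x, x_hat : (n, d) inputs and their reconstructions
    z        : (n, k) latent codes produced by the encoder
    target   : (n, k) precomputed manifold learning embedding (e.g., from PHATE)
    lam      : hypothetical weight on the geometric term
    """
    reconstruction = np.mean((x - x_hat) ** 2)
    geometry = np.mean((z - target) ** 2)
    return reconstruction + lam * geometry


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
z = rng.normal(size=(8, 2))

# A perfect reconstruction with latent codes equal to the target has zero loss
print(grae_style_loss(x, x, z, z))
```

With lam set to 0 the sketch reduces to a plain autoencoder reconstruction loss, which is one way to see the geometric term as an add-on to the traditional objective.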

Reference

If you find this work useful, please cite:

@article{duque2022geometry,
  title={Geometry Regularized Autoencoders},
  author={Duque, Andres F and Morin, Sacha and Wolf, Guy and Moon, Kevin R},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022},
  publisher={IEEE}
}

Install

You can install this repo directly with pip, preferably in a virtual environment:

pip install --upgrade git+https://github.com/KevinMoonLab/GRAE.git

Usage

The code largely follows the scikit-learn API to implement different autoencoders and dimensionality reduction tools. You can change basic autoencoder hyperparameters and manifold learning hyperparameters through the model interface. For example, to reproduce some Rotated Digits results:

from grae.models import GRAE
from grae.data import RotatedDigits

# Various autoencoder parameters can be changed
# t and knn are PHATE parameters, which are used to compute a target embedding
m = GRAE(epochs=100, n_components=2, lr=.0001, batch_size=128, t=50, knn=10)

# Input data should be an instance of grae.data.BaseDataset
# We already have subclasses for datasets in the paper
data = RotatedDigits(data_path='data', split='train')

# Fit model
m.fit(data)

# Get 2D latent coordinates
z = m.transform(data)

# Compute some image reconstructions
imgs = m.inverse_transform(z)

Some utility functions are available for visualization:

# Fit, transform and plot data
m.fit_plot(data)

# Transform and plot data
m.plot(data)

# Transform, inverse transform and visualize reconstructions
m.view_img_rec(data)

Most of our benchmarks are implemented with similar estimators. Implemented models include:

  • GRAE: Autoencoder with a PHATE latent target;
  • GRAE (UMAP): Autoencoder with a UMAP latent target;
  • AE: Vanilla Autoencoder;
  • DAE: Denoising Autoencoder;
  • CAE: Contractive Autoencoder;
  • VAE: β-VAE;
  • TAE: Topological Autoencoder;
  • DiffusionNet: Diffusion Nets.

And many more!

Adding a new model or a new dataset

New models should subclass grae.models.BaseModel or grae.models.AE if autoencoder-based. New datasets should follow the grae.data.BaseDataset interface.
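
To give a feel for the estimator-style interface a new model would expose, here is a self-contained sketch. It is a hypothetical stand-in: LinearAE does not subclass grae.models.BaseModel (the methods that base class requires are not documented in this README), it takes NumPy arrays rather than grae.data.BaseDataset instances, and it only mirrors the fit / transform / inverse_transform calls shown in the usage section.

```python
import numpy as np


class LinearAE:
    """Hypothetical stand-in: a linear autoencoder solved in closed form (PCA).

    Not a grae model. It only mirrors the scikit-learn style calls
    (fit, transform, inverse_transform) used in the usage example.
    """

    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit(self, x):
        self.mean_ = x.mean(axis=0)
        # Rows of vt are principal directions, ordered by explained variance
        _, _, vt = np.linalg.svd(x - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def transform(self, x):
        # "Encoder": project centered data onto the leading directions
        return (x - self.mean_) @ self.components_.T

    def inverse_transform(self, z):
        # "Decoder": map latent coordinates back to the input space
        return z @ self.components_ + self.mean_


rng = np.random.default_rng(0)
# Toy data lying on a 2D plane embedded in 5 dimensions
x = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5))

m = LinearAE(n_components=2).fit(x)
z = m.transform(x)
x_hat = m.inverse_transform(z)
print(z.shape, np.allclose(x, x_hat))
```

A real addition to the repository would replace the closed-form fit with whatever training logic the base class expects.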

grae's People

Contributors

andresd45, kmoon3, sachamorin

grae's Issues

Empty show metrics

The show metrics function should check whether df is empty before processing. This happens when working on a single dataset with no supervised metrics.
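
A possible shape for such a guard, sketched with pandas. The function below is a hypothetical stand-in, not the repository's function: its name, the model and score columns, and the groupby aggregation are all assumptions for illustration.

```python
import pandas as pd


def show_metrics_sketch(df):
    """Hypothetical stand-in for the metrics display helper."""
    if df.empty:
        # Nothing to aggregate, e.g. a single dataset with no supervised metrics
        print("No metrics to show.")
        return None
    # Assumed layout: one row per run, a 'model' column and numeric metric columns
    summary = df.groupby("model").mean(numeric_only=True)
    print(summary)
    return summary


show_metrics_sketch(pd.DataFrame())
show_metrics_sketch(pd.DataFrame({"model": ["AE", "AE"], "score": [0.4, 0.6]}))
```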

UMAP train test split ordering

Running the UMAP score method on the train split before running it on the test split leads to obviously wrong test embeddings.

(This was hotfixed for the paper by scoring test before train, and the results are valid, but the bug should still be investigated.)

License for code re-use and modification?

Thank you for your work and for this beautifully constructed software!

I'm looking forward to adapting some of the source code for another project. Naturally, I would like to pay my respects to the team who developed the code, and intended to do so by attaching the project license alongside a reference to this repository and to your publication at the top of the script. However, the project does not specify any license.

How should someone interested in adapting the code available in this repository proceed?

Cheers,
Davi

Roll class bug

The Roll class is buggy when used alone. SwissRoll (the child class we use in the paper) is fine.

Changes in Rotated Digits embedding

Embedding plots for Rotated Digits are now a bit different from the paper version. Metrics are still good. Maybe something is not seeded properly.
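
If seeding is indeed the cause, a helper along these lines is a common starting point. This is a generic sketch and not code from the repository: the name seed_everything is hypothetical, PyTorch is seeded only if it happens to be installed, and any other source of randomness in the pipeline (for instance, inside the embedding method) would need its own seed.

```python
import random

import numpy as np


def seed_everything(seed=42):
    """Seed the Python and NumPy global generators, plus PyTorch if available."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
    except ImportError:
        return
    torch.manual_seed(seed)


seed_everything(0)
first = np.random.rand(3)
seed_everything(0)
second = np.random.rand(3)
print(np.array_equal(first, second))
```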

Section metrics don't work on EB data

Partitioning the manifold leads to locally constant time labels because of the lack of time granularity. This raises errors when computing correlations and MI.
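
One possible workaround is to detect constant inputs before computing a correlation and report NaN for that section instead of failing. The sketch below assumes that skipping such sections is acceptable; safe_pearson is a hypothetical helper, not a function from the repository, and it covers only the correlation case, not MI.

```python
import numpy as np


def safe_pearson(a, b):
    """Pearson correlation that returns NaN when either input is constant."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # A zero range means zero variance, so the correlation is undefined
    if a.size < 2 or np.ptp(a) == 0 or np.ptp(b) == 0:
        return float("nan")
    return float(np.corrcoef(a, b)[0, 1])


# Locally constant time labels, as in a small section of the manifold
print(safe_pearson([0.1, 0.5, 0.9], [3.0, 3.0, 3.0]))
print(safe_pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```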

comet-ml==3.1.14 is not installed

I tried to install the repository from a git clone and got an error while running it. I traced the error to the installation of comet-ml==3.1.14. The error message indicated that it is an error of the repository.

Does anyone have the same issue as me? How should I fix it? Can you please guide me on this?

Duplicates in IPSC data

Many pairs have zero distance but aren't exactly equal. Duplicates need to be removed within a small tolerance.
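
A simple way to do this is a greedy filter that keeps a row only if it is farther than the tolerance from every row already kept. The sketch below is an assumption about how it could be done, not the repository's code: drop_near_duplicates and the default tol are hypothetical, and the loop is quadratic in the number of samples, so a large dataset would call for a nearest-neighbor index instead.

```python
import numpy as np


def drop_near_duplicates(x, tol=1e-8):
    """Greedily keep rows farther than `tol` (Euclidean) from every kept row.

    Returns the filtered array and the indices of the rows that were kept.
    """
    x = np.asarray(x, dtype=float)
    kept = []
    for i, row in enumerate(x):
        if not kept or np.min(np.linalg.norm(x[kept] - row, axis=1)) > tol:
            kept.append(i)
    return x[kept], np.array(kept)


points = np.array([[0.0, 0.0], [0.0, 1e-12], [1.0, 1.0]])
unique_points, kept_idx = drop_near_duplicates(points)
print(kept_idx)
```

Returning the kept indices makes it possible to filter labels or other per-sample arrays consistently.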
