Giter VIP home page Giter VIP logo

nsf-paper's Introduction

Nonnegative spatial factorization for multivariate count data

This repository contains supporting code to facilitate reproducible analysis. For details see the preprint. If you find bugs please create a github issue.

Authors

Will Townes and Barbara Engelhardt

Abstract

Gaussian processes are widely used for the analysis of spatial data due to their nonparametric flexibility and ability to quantify uncertainty, and recently developed scalable approximations have facilitated application to massive datasets. For multivariate outcomes, linear models of coregionalization combine dimension reduction with spatial correlation. However, their real-valued latent factors and loadings are difficult to interpret because, unlike nonnegative models, they do not recover a parts-based representation. We present nonnegative spatial factorization (NSF), a spatially-aware probabilistic dimension reduction model that naturally encourages sparsity. We compare NSF to real-valued spatial factorizations such as MEFISTO and nonspatial dimension reduction methods using simulations and high-dimensional spatial transcriptomics data. NSF identifies generalizable spatial patterns of gene expression. Since not all patterns of gene expression are spatial, we also propose a hybrid extension of NSF that combines spatial and nonspatial components, enabling quantification of spatial importance for both observations and features.

Description of Repository Contents

models

TensorFlow implementations of probabilistic factor models

  • cf.py - nonspatial models (factor analysis and probabilistic nonnegative matrix factorization).
  • mefisto.py - wrapper around the MEFISTO implementation in the mofapy2 python package.
  • pf.py - nonnegative and real-valued spatial process factorization (NSF and RSF).
  • pfh.py - NSF hybrid model, includes both spatial and nonspatial components.

scrna

Analysis of spatial transcriptomics data

  • sshippo - Slide-seqV2 mouse hippocampus
  • visium_brain_sagittal - Visium mouse brain (anterior sagittal section)
  • xyzeq_liver - XYZeq mouse liver/tumor

simulations

Data generation and model fitting for the ggblocks and quilt simulations

utils

Python modules containing functions and classes needed by scripts and model implementation classes.

  • benchmark.py - functions used in fitting models to datasets and pickling the objects for later evaluation. Can be called as a command line script to facilitate automation.
  • benchmark_gof.py - script with basic command line interface for computing goodness-of-fit, sparsity, and timing statistics on large numbers of fitted model objects
  • misc.py - miscellaneous convenience functions useful in preprocessing (normalization and reversing normalization), postprocessing, computing benchmarking statistics, parameter manipulation, and reading and writing pickle and CSV files.
  • nnfu.py - nonnegative factor model utility functions for rescaling and regularization. Useful in initialization and postprocessing.
  • postprocess.py - postprocessing functions to facilitate interpretation of nonnegative factor models.
  • preprocess.py - data loading and preprocessing functions. Normalization of count data, rescaling spatial coordinates for numerical stability, deviance functions for feature selection (analogous to scry), conversions between AnnData and TensorFlow objects.
  • training.py - classes for fitting TensorFlow models to data, including caching with checkpoints, automatic handling of numeric instabilities, and ConvergenceChecker, which uses a cubic spline to detect convergence of a stochastic optimizer trace.
  • visualize.py - plotting functions for making heatmaps to visualize spatial and nonspatial factors, as well as some goodness-of-fit metrics.

nsf-paper's People

Contributors

willtownes avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.