
hw07-group14's Introduction

Single Cell Sequencing Analysis Project

Jennefer, Claudea; Kim, Wendy; Tsai, Gordon; Villouta, Catalina

Project Scope

In this project we work directly with publicly available single-cell RNA-seq data, with the aim of classifying Mus musculus (house mouse) cells by the organ they came from. Given limited computational resources, we decided to work with cells from kidney and liver only; however, the approach presented here generalizes to as many organs as needed.

Project Goals

  • Implement an autoencoder to find a low-dimensional latent representation of the cells.
  • Show that the latent representation is more useful than PCA.
  • Implement a model built on top of the encoder for classifying cells into kidney or liver.
  • Obtain high out-of-sample performance for the classifier.
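
As an illustration of how these goals fit together, here is a minimal NumPy sketch. It is not the project's implementation: the data are synthetic stand-ins for a cells × genes matrix, the network is a single tanh encoder layer with a linear decoder, and all names (`train_autoencoder`, `encode`, `train_logistic`) are hypothetical. It trains the autoencoder on a training split, fits a logistic classifier on the latent codes, and reports held-out accuracy alongside a PCA baseline with the same number of components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a cells x genes matrix (NOT the Tabula Muris counts):
# two groups of cells that differ in the mean of the first five "genes".
n_per_class, n_genes, latent_dim = 100, 20, 2
X = rng.normal(size=(2 * n_per_class, n_genes))
y = np.repeat([0, 1], n_per_class)            # 0 = "kidney", 1 = "liver"
X[y == 1, :5] += 3.0

# Hold out 30% of the cells for out-of-sample evaluation.
order = rng.permutation(len(y))
n_train = (7 * len(y)) // 10
train, test = order[:n_train], order[n_train:]

# Standardize genes using training statistics only.
mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
X_train, X_test = (X[train] - mu) / sd, (X[test] - mu) / sd


def encode(X, W_enc):
    """Encoder: one tanh layer mapping genes to the latent space."""
    return np.tanh(X @ W_enc)


def train_autoencoder(X, latent_dim, lr=0.1, steps=2000):
    """Fit a tanh encoder and a linear decoder by full-batch gradient descent."""
    d = X.shape[1]
    W_enc = 0.1 * rng.normal(size=(d, latent_dim))
    W_dec = 0.1 * rng.normal(size=(latent_dim, d))
    losses = []
    for _ in range(steps):
        Z = encode(X, W_enc)
        R = Z @ W_dec - X                      # reconstruction residual
        losses.append(float(np.mean(R ** 2)))  # mean squared error
        grad_out = 2.0 * R / R.size            # d(loss) / d(reconstruction)
        grad_dec = Z.T @ grad_out
        grad_pre = (grad_out @ W_dec.T) * (1.0 - Z ** 2)  # back through tanh
        grad_enc = X.T @ grad_pre
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses


def train_logistic(Z, y, lr=0.5, steps=1000):
    """Logistic regression on the latent codes (the classifier 'head')."""
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * Z.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b


W_enc, W_dec, losses = train_autoencoder(X_train, latent_dim)
Z_train, Z_test = encode(X_train, W_enc), encode(X_test, W_enc)
w, b = train_logistic(Z_train, y[train])
accuracy = float(np.mean((Z_test @ w + b > 0) == (y[test] == 1)))

# PCA baseline with the same number of components, fitted on the training split.
_, _, Vt = np.linalg.svd(X_train, full_matrices=False)
components = Vt[:latent_dim]
pca_mse = float(np.mean((X_test @ components.T @ components - X_test) ** 2))
ae_mse = float(np.mean((Z_test @ W_dec - X_test) ** 2))

print(f"autoencoder test MSE: {ae_mse:.3f}")
print(f"PCA test MSE:         {pca_mse:.3f}")
print(f"held-out accuracy:    {accuracy:.2f}")
```

On toy data like this, a shallow autoencoder should not be expected to beat PCA on reconstruction error; the sketch only shows the shape of the comparison and of the encoder-plus-classifier pipeline.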

Dependencies

All dependencies are listed in environment.yml and book-requirements.txt.

Dataset

The datasets relevant to this project contain cell information from the kidney and liver. We obtained them from the Tabula Muris project, a compendium of single-cell transcriptome data from Mus musculus (specific links are given in the Reference section). The data used in this project can be found in the data folder:

  • Kidney-counts.csv
  • Liver-counts.csv
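
The following is a hedged sketch of how the two count files could be turned into one labeled matrix with pandas. It assumes each CSV stores genes as rows and cells as columns, with gene names in the first column; check this against the actual files. The helpers `load_counts` and `combine` are hypothetical and are not part of the genetools package. The demo uses tiny in-memory CSVs so that it runs without the data folder.

```python
import io

import pandas as pd


def load_counts(source, organ):
    """Read one counts CSV and return (cells x genes matrix, organ labels).

    Assumes rows are genes and columns are cells, so the table is transposed.
    """
    counts = pd.read_csv(source, index_col=0)
    cells = counts.T
    cells.index.name = "cell"
    labels = pd.Series(organ, index=cells.index, name="organ")
    return cells, labels


def combine(parts):
    """Stack several (matrix, labels) pairs, keeping genes shared by all of them."""
    matrices, labels = zip(*parts)
    X = pd.concat(matrices, axis=0, join="inner")
    y = pd.concat(labels)
    return X, y


# Tiny made-up tables standing in for the real files.
kidney_csv = io.StringIO(",cellA,cellB\nGene1,0,5\nGene2,3,0\n")
liver_csv = io.StringIO(",cellC\nGene1,2\nGene2,7\n")

X, y = combine([
    load_counts(kidney_csv, "kidney"),
    load_counts(liver_csv, "liver"),
])
print(X)
print(y)
```

With the real data, the same helpers would be called with file paths instead, for example `load_counts("data/Kidney-counts.csv", "kidney")`.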

Setup

  • Create Virtual Environment:

    • using environment.yml in terminal
      • mamba env create -f environment.yml -p ~/envs/genes
      • python -m ipykernel install --user --name genes --display-name "IPython - genes"
    • using Makefile in terminal: make env
  • Activate Virtual Environment:

    • conda activate genes
  • Install genetools package from source, via:

    • pip install .
  • Run analysis using Makefile in terminal:

    • make all

Run tests

  • Run pytest genetools in the terminal to run all the tests.

License

This project is released under the terms of the BSD 3-clause License.

Reference

We obtained the data from the Tabula Muris project, released in 2017 by the Chan Zuckerberg Biohub. All matrices of gene-cell counts and metadata are available as CSVs on Figshare. We specifically used the data for kidney and liver cells from the FACS-based full-length transcript analysis released in 2018.

  • Consortium, Tabula Muris; Webber, James; Batson, Joshua; Pisco, Angela (2018): Single-cell RNA-seq data from Smart-seq2 sequencing of FACS sorted cells (v2). figshare. Dataset. DOI
