
wild_mouse_genetic_survey

Overview

This repository contains code pertaining to analysis of population structure and inbreeding in wild house mice, as described in this manuscript:

Morgan AP, Hughes JJ et al. Population structure and inbreeding in wild house mice (Mus musculus) at different geographic scales. bioRxiv. doi:10.1101/2022.02.17.478179

Large data files (e.g., the genotype matrix in VCF format) and large intermediate files are not included in this repository. As a result, the analysis scripts will not run "out of the box"; file paths will need to be adjusted to match the user's local directory structure. However, all information needed to verify the conduct of the key analyses in the manuscript is available in the code as written.

Dependencies

For alignment and SNV calling from WGS data:

  • bwa
  • picard
  • samblaster
  • samtools
  • bcftools
  • GATK v.4.1.0.0
  • python >= 3.7
  • snakemake (any recent version)

See this repository for a detailed walkthrough of the alignment pipeline, and this repository for SNV calling.

For post-processing and analyses:

  • python >= 3.7
  • R v.4.1.1, with these packages plus dependencies therein:
    • tidyverse
    • ggplot2
    • cowplot
    • viridis
    • ggbeeswarm
    • ape
    • BEDMatrix
    • raster
    • rgeos
    • maps
    • mouser (non-CRAN)
    • popcorn (non-CRAN)
  • bcftools v.1.9
  • plink v.2.00a3
  • akt v.3beb346 (NB: later versions introduced breaking changes to pedigree inference, so version matters)
  • TreeMix v.1.12
  • ADMIXTURE v.1.3.0
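
Before adjusting file paths, it may help to confirm that the command-line tools listed above are installed. The following is a minimal sketch, not part of the repository: the executable names are assumptions (for example, PLINK 2 is often installed as `plink2`, and GATK and picard may be reached through wrappers), and it checks only for presence on `PATH`, not for the specific versions listed.

```python
import shutil

# Executable names are assumptions, not taken from the repository;
# adjust them to match the local installation.
ALIGNMENT_TOOLS = ["bwa", "picard", "samblaster", "samtools",
                   "bcftools", "gatk", "snakemake"]
ANALYSIS_TOOLS = ["bcftools", "plink", "akt", "treemix", "admixture", "Rscript"]


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    groups = [("alignment and SNV calling", ALIGNMENT_TOOLS),
              ("post-processing and analyses", ANALYSIS_TOOLS)]
    for label, tools in groups:
        missing = find_missing(tools)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

Because akt in particular is pinned to a specific commit, a presence check like this does not replace verifying versions by hand.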

Files

  • Batch calculations, in bash_scripts
    • calc_kinship.sh: estimate kinship coefficients within taxa, generate lists of putatively unrelated individuals
    • calc_kinship_subsets.sh: same as above, but randomize input so that different sets of unrelateds are retained (for robustness checks)
    • calc_freqs.sh: calculate allele frequencies within taxa, using unrelateds only
    • calc_roh.sh: identify long runs of homozygosity (ROH) using taxon-specific allele frequencies as input
  • Python utilities
    • pyadmix: wrapper script for ADMIXTURE
  • Data management
    • make_sample_lists.R: using the master sample table and the list of putatively unrelated individuals, write text files listing sample IDs by taxon, etc. Also includes code for creating random subsets of representative individuals for robustness checks.
    • make_sample_manifest.R: make Table S1
  • Figures
    • draw_maps.R: Figure 1
    • draw_PCA.R: Figures 2, 3
    • draw_admix_plots.R: Figure 4
    • analyze_inbreeding.R: Figure 6
    • analyze_ROH.R: Figure 7
    • analyze_demes.R: Figure 8
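
The idea behind calc_kinship_subsets.sh, randomizing input so that different sets of unrelated individuals are retained, can be illustrated with a small sketch. This is not the repository's code and not how akt selects unrelated individuals. It is a simple greedy filter under stated assumptions: pairwise kinship estimates are already available as a dictionary, and the function name `pick_unrelated`, the cutoff value, and the sample IDs are all hypothetical.

```python
import random


def pick_unrelated(samples, kinship, threshold, seed=None):
    """Greedily keep samples so that no retained pair has kinship >= threshold.

    samples   : list of unique sample IDs
    kinship   : dict mapping frozenset({id1, id2}) -> kinship estimate;
                pairs absent from the dict are treated as unrelated
    threshold : pairs at or above this value count as related (hypothetical cutoff)
    seed      : controls the shuffle, so different seeds can retain different sets
    """
    order = list(samples)
    random.Random(seed).shuffle(order)  # randomize the order in which samples are visited
    kept = []
    for candidate in order:
        related = any(
            kinship.get(frozenset((candidate, other)), 0.0) >= threshold
            for other in kept
        )
        if not related:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    # Toy data (hypothetical): two related pairs, A-B and C-D.
    toy_samples = ["A", "B", "C", "D"]
    toy_kinship = {frozenset(("A", "B")): 0.25, frozenset(("C", "D")): 0.25}
    for seed in range(3):
        print(seed, sorted(pick_unrelated(toy_samples, toy_kinship, 0.1, seed=seed)))
```

Each seed yields a set with one member of each related pair. Which member is retained depends on the shuffle, and that dependence is what makes repeated runs useful as a robustness check.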
