cgps-genotyphi

CGPS implementation of Genotyphi by Kat Holt et al for assembled genomes. Genotyphi is the implementation of the genotyping framework for Salmonella Typhi by Wong et al, which uses a curated set of mutations to assign strains to a particular labelled clade or subclade.

For a full description of Genotyphi and the schema please visit the above links.

Getting Started

With Docker

Without Docker

Output Format

Getting Started

CGPS-Genotyphi can be run as a Java programme (Linux/macOS) or using Docker (all platforms).

The simplest way to install and run CGPS-Genotyphi is via Docker with the following command:

docker run --rm -v $PWD:/data cgps/genotyphi -i [my_typhi_assembly.fasta] -o

If the latest version is not installed, Docker will pull it down from the central DockerHub repository before running it. If you want to use a specific version of genotyphi add the 'version tag' to the command as cgps/genotyphi:v1.0.1.

Otherwise to install the programme follow either the Docker-based or Maven-based build instructions below.

Docker-based Build (recommended)

Requires:

  • Docker (Optional: Git for building from master with version tags)
  • Runs on any OS supported by Docker.
  1. Download the code as a zip bundle, e.g. for the latest code use the example below. Alternatively, pick a specific release from the "Releases" page, or git clone https://github.com/ImperialCollegeLondon/cgps-genotyphi.git.
wget https://github.com/ImperialCollegeLondon/cgps-genotyphi/archive/master.zip
unzip master.zip
  2. Installation
# The zip bundle unpacks to cgps-genotyphi-master (use cgps-genotyphi if you cloned with git)
cd cgps-genotyphi-master
docker build -t genotyphi-builder -f Dockerfile .
# The next command actually builds genotyphi as a JAR and as a container
docker run -it --rm --name genotyphi -v /var/run/docker.sock:/var/run/docker.sock -v "$(pwd)":/usr/src/mymaven -v ~/.docker:/root/.docker -w /usr/src/mymaven genotyphi-builder mvn package

Or, for faster future builds, create a Docker volume (second command) and reuse it in later builds (third command):

docker build -t genotyphi-builder -f Dockerfile .
docker volume create --name maven-repo
# Use this command for faster future builds.
docker run -it --rm --name genotyphi -v /var/run/docker.sock:/var/run/docker.sock -v "$(pwd)":/usr/src/mymaven -v maven-repo:/root/.m2 -v ~/.docker:/root/.docker -w /usr/src/mymaven genotyphi-builder mvn package

At this point you can use Docker or run it directly from the terminal (running directly also requires Java 8 and blastn to be installed).

Maven Build

Requires:

  • git, maven, java 8, makeblastdb (on $PATH)

Optional:

  • blastn on $PATH (for running the unit tests)
git clone https://github.com/ImperialCollegeLondon/cgps-genotyphi.git
cd cgps-genotyphi
mvn -Dmaven.test.skip=true install
# (or leave out -Dmaven.test.skip=true if blastn is available)

This will configure the BLAST databases and resources that Genotyphi needs.

At this point you can use Docker or run it directly from the terminal.

To create the Genotyphi container, run:

  1. cd build
  2. docker build -t genotyphi -f DockerFile .

Running with Docker

Usage

To run genotyphi on a single Salmonella Typhi FASTA file in the local directory using the container, use the command below. An output file {assembly}_genotyphi.jsn is created.

NB If you used the recommended docker build process, use the image name registry.gitlab.com/cgps/cgps-genotyphi in place of genotyphi in the commands below.

docker run --rm -v $PWD:/data genotyphi -i assembly.fa

To run genotyphi on all FASTA files in the local directory, with an output file for each one:

docker run --rm -v $PWD:/data genotyphi -i .

If the FASTA files are in a different directory use

docker run --rm -v /full/path/to/FASTAS/:/data registry.gitlab.com/cgps/cgps-genotyphi -i .

NB "/data" is a protected folder for genotyphi, and is normally used to mount the local drive.

To get the results to STDOUT rather than file:

docker run --rm -v $PWD:/data genotyphi -i assembly.fa -o

NB The output is not pretty-printed; there is one record per line.
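The Docker invocations above differ only in the mounted directory, the image name, and the flags, so they can be assembled programmatically. The following is a minimal sketch, not part of CGPS-Genotyphi: the function name genotyphi_docker_command and its parameters are hypothetical, and only the -i, -o and -f options shown in this README are used.

```python
import os
import shlex


def genotyphi_docker_command(fasta_dir, input_name=".", image="genotyphi",
                             to_stdout=False, output_format=None):
    """Build the argument list for a `docker run` call like the ones above.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # Mount the directory holding the FASTA files at /data inside the container.
    mount = f"{os.path.abspath(fasta_dir)}:/data"
    cmd = ["docker", "run", "--rm", "-v", mount, image, "-i", input_name]
    if to_stdout:
        cmd.append("-o")  # send results to STDOUT instead of per-assembly files
    if output_format is not None:
        cmd += ["-f", output_format]  # output format option described in this README
    return cmd


if __name__ == "__main__":
    cmd = genotyphi_docker_command("/full/path/to/FASTAS", to_stdout=True)
    # Print a shell-quoted version that can be pasted into a terminal.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run on a machine where Docker is installed; building a list rather than a single string avoids shell quoting problems with unusual directory names.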

Running Directly

  • The JAR file is build/genotyphi.jar and can be moved anywhere. It assumes the database directory is in the same directory, but this can be specified with the -d command line option.
  • Get options and help: java -jar genotyphi.jar
  • Run on a single assembly, e.g.: java -jar genotyphi.jar -i salty_assembly.fa

Output Format

The output format can be selected using the -f/-format option. It defaults to Text.

  1. Text
  2. CSV
  3. JSON
  4. Pretty JSON
  5. Simple JSON

Text Format

The text format contains three lines:

  1. The assembly ID
  2. The genotype
  3. The determining mutations: {geneName}_{location}{variant}_({associated genotype})
Name: 007898
Genotype: 4.3.1
Mutations: STY2513_1047T_(4.3.1), STY2867_515C_(2), STY3196_989A_(3)
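For downstream scripting, the three-line text record can be split into its fields. This is a minimal sketch that assumes the layout matches the example above exactly (Name/Genotype/Mutations labels, comma-separated mutation entries); parse_text_record and MUTATION_RE are hypothetical names, not part of CGPS-Genotyphi.

```python
import re

# One mutation entry, e.g. "STY2513_1047T_(4.3.1)":
# gene name, location, variant base(s), then the associated genotype in brackets.
MUTATION_RE = re.compile(
    r"^(?P<gene>[^_]+)_(?P<location>\d+)(?P<variant>[A-Za-z]+)_\((?P<genotype>[^)]+)\)$"
)


def parse_text_record(text):
    """Parse one text-format record into a dictionary (assumed layout)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()

    mutations = []
    for entry in fields.get("Mutations", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        match = MUTATION_RE.match(entry)
        if match is None:
            raise ValueError(f"Unrecognised mutation entry: {entry!r}")
        mutations.append({
            "gene": match["gene"],
            "location": int(match["location"]),
            "variant": match["variant"],
            "genotype": match["genotype"],
        })
    # The name is kept as a string so that leading zeros are preserved.
    return {
        "name": fields.get("Name"),
        "genotype": fields.get("Genotype"),
        "mutations": mutations,
    }


if __name__ == "__main__":
    example = (
        "Name: 007898\n"
        "Genotype: 4.3.1\n"
        "Mutations: STY2513_1047T_(4.3.1), STY2867_515C_(2), STY3196_989A_(3)\n"
    )
    print(parse_text_record(example))
```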

CSV Format

The CSV format contains the same fields as the text format, but in columns instead. In default mode one file per assembly is written. If you want a single CSV file for all assemblies use the -o option and write the STDOUT to file, e.g:

docker run --rm -v $PWD:/data registry.gitlab.com/cgps/cgps-genotyphi -i . -o -f csv > genotyphi.csv

10071_8#7.contigs_velvet,3.5.4,"STY0176_969T_(3.5.4); STY2867_515C_(2); STY3196_989A_(3); STY4063_411T_(3.5)"
13566_1#53.contigs_velvet,3.1.1,"STY3203_9C_(3.1); STY2863_154T_(3.1.1); STY2867_515C_(2); STY3196_989A_(3)"
9870_8#7.contigs_velvet,4.3.1,"STY2513_1047T_(4.3.1); STY2867_515C_(2); STY3196_989A_(3)"
ERR1079262_paired.contigs_spades,3.2.2,"STY4741_444T_(3.2.2); STY3196_989A_(3)"
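A combined CSV file like the one above can be loaded with Python's standard csv module. This is a sketch under two assumptions drawn only from the sample rows: there is no header row, and each row has exactly three columns (assembly name, genotype, semicolon-separated mutations). The function name read_genotyphi_csv is hypothetical.

```python
import csv
import io
from collections import Counter


def read_genotyphi_csv(handle):
    """Read genotyphi CSV rows from an open text handle (assumed 3 columns, no header)."""
    records = []
    for row in csv.reader(handle):
        if len(row) != 3:
            continue  # skip blank or unexpected lines
        name, genotype, mutations = row
        records.append({
            "name": name,
            "genotype": genotype,
            # Mutations are separated by "; " inside the quoted third column.
            "mutations": [m.strip() for m in mutations.split(";") if m.strip()],
        })
    return records


if __name__ == "__main__":
    example = io.StringIO(
        '10071_8#7.contigs_velvet,3.5.4,"STY0176_969T_(3.5.4); STY2867_515C_(2); '
        'STY3196_989A_(3); STY4063_411T_(3.5)"\n'
        '9870_8#7.contigs_velvet,4.3.1,"STY2513_1047T_(4.3.1); STY2867_515C_(2); '
        'STY3196_989A_(3)"\n'
    )
    records = read_genotyphi_csv(example)
    # Tally how many assemblies were assigned to each genotype.
    print(Counter(record["genotype"] for record in records))
```

For a real run, replace the StringIO object with open("genotyphi.csv", newline="").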

JSON Format

A complete example of the JSON format can be found here. The example below is "pretty" formatted. By default it is printed on a single line with no spaces.

{
  "assemblyId" : "my_assembly",
  "genotype" : "4.3.1",
  "foundLoci" : 68.0,
  "aggregatedAssignments" : {
    "primaryGroups" : [ {
      "depth" : "PRIMARY",
      "code" : [ "3" ]
    } ],
    "cladeGroups" : [ ],
    "subcladeGroups" : [ {
      "depth" : "SUBCLADE",
      "code" : [ "4", "3", "1" ]
    } ]
  },
  "genotyphiMutations" : {
    "STY2513" : [ {
      "variant" : "T",
      "genotyphiGroup" : {
        "depth" : "SUBCLADE",
        "code" : [ "4", "3", "1" ]
      },
      "location" : 1047
    } ],
    "STY2867" : [ {
      "variant" : "C",
      "genotyphiGroup" : {
        "depth" : "PRIMARY",
        "code" : [ "2" ]
      },
      "location" : 515
    } ],
    "STY3196" : [ {
      "variant" : "A",
      "genotyphiGroup" : {
        "depth" : "PRIMARY",
        "code" : [ "3" ]
      },
      "location" : 989
    } ]
  },
  "blastResults" : [ {
    "blastSearchStatistics" : {
      "librarySequenceId" : "STY3940",
      "librarySequenceStart" : 1,
      "querySequenceId" : ".12045_3_90.22",
      "querySequenceStart" : 55709,
      "percentIdentity" : 100.0,
      "evalue" : 0.0,
      "reversed" : false,
      "librarySequenceStop" : 1401,
      "querySequenceStop" : 57109,
      "librarySequenceLength" : 1401
    },
    "mutations" : [ ],
    "queryMatchSequence" : "GTGTCA...",
    "referenceMatchSequence" : "GTGTCA..."
  },
  ...
  ]
}
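Because the default JSON output is one record per line, each line can be parsed independently. The sketch below assumes only the field names visible in the example above (assemblyId, genotype, genotyphiMutations, genotyphiGroup.code, location, variant) and rebuilds the mutation labels used by the text format; summarise_record is a hypothetical name.

```python
import json


def summarise_record(line):
    """Reduce one JSON record (a single line of output) to its key fields."""
    record = json.loads(line)
    mutations = []
    for gene, hits in record.get("genotyphiMutations", {}).items():
        for hit in hits:
            # The genotype code is stored as a list, e.g. ["4", "3", "1"] for 4.3.1.
            group = ".".join(hit["genotyphiGroup"]["code"])
            mutations.append(f'{gene}_{hit["location"]}{hit["variant"]}_({group})')
    return {
        "assemblyId": record["assemblyId"],
        "genotype": record["genotype"],
        "mutations": sorted(mutations),
    }


if __name__ == "__main__":
    # A trimmed record containing only the fields this sketch reads.
    line = json.dumps({
        "assemblyId": "my_assembly",
        "genotype": "4.3.1",
        "genotyphiMutations": {
            "STY2513": [{
                "variant": "T",
                "genotyphiGroup": {"depth": "SUBCLADE", "code": ["4", "3", "1"]},
                "location": 1047,
            }],
        },
    })
    print(summarise_record(line))
```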

Pretty JSON Format

This formats the JSON nicely as in the example given above.

Simple JSON Format

The same as the above JSON format, but without the BLAST results or aggregation result details.

Naming Docker Builds

Container tags are automatically generated during the build phase by Maven using jgitver.

To create a "release tag" (i.e. not appended with "-SNAPSHOT") and push the resulting container to a remote Docker repository:

git tag -a -m "My message" v1.0.0-rc4
docker run -it --rm --name genotyphi -v /var/run/docker.sock:/var/run/docker.sock -v "$(pwd)":/usr/src/mymaven -v maven-repo:/root/.m2 -v ~/.docker:/root/.docker -w /usr/src/mymaven genotyphi-builder mvn install

The Docker repository can be changed from the CGPS default by editing the <genotyphi.docker-repository> property in the top level pom.xml.

Acknowledgments

This software was developed by the Centre for Genomic Pathogen Surveillance (CGPS) and funded by the Wellcome Trust.
