wangpanqiao / speciesprimer

This project forked from biologger/speciesprimer

The SpeciesPrimer pipeline is intended to help researchers find specific primer pairs for the detection and quantification of bacterial species in complex ecosystems.

License: GNU General Public License v3.0

SpeciesPrimer

Minimum system requirements

  • Quad core processor
  • 16 GB RAM
  • SSD / fast hard disk (recommended)
  • 60 GB free space for nt database
  • 4.5 GB for the docker image
  • 5 - 20 GB for each analysis
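
The CPU and disk figures above can be checked before pulling the image. The following is a minimal sketch, not part of SpeciesPrimer: the function name `check_requirements` and its parameters are made up for illustration, and RAM is not checked because the Python standard library has no portable call for it.

```python
import os
import shutil

GB = 10 ** 9  # decimal gigabyte, assumed to match the figures in the list above


def check_requirements(blastdb_dir=".", min_cores=4, min_free_gb=60):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper: only CPU cores and free disk space are checked.
    """
    problems = []

    cores = os.cpu_count() or 1  # cpu_count() can return None
    if cores < min_cores:
        problems.append(f"{cores} CPU core(s) found, {min_cores} required")

    free_gb = shutil.disk_usage(blastdb_dir).free / GB
    if free_gb < min_free_gb:
        problems.append(
            f"{free_gb:.1f} GB free in {blastdb_dir}, "
            f"{min_free_gb} GB required for the nt database"
        )
    return problems
```

Calling `check_requirements("/path/to/blastdb")` on the directory intended for the BLAST database returns the list of problems, which can be printed as warnings.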

Quick start (Ubuntu 16.04)

  • Download and install docker

      $ sudo docker pull biologger/speciesprimer
      $ mkdir $HOME/primerdesign
      $ mkdir $HOME/blastdb
      $ sudo docker run \
      -v $HOME/blastdb:/home/blastdb \
      -v $HOME/primerdesign:/home/primerdesign \
      -p 5000:5000 -p 9001:9001 \
      --name speciesprimer biologger/speciesprimer
    
  • Open the address [http://localhost:5000] or [http://127.0.0.1:5000] in your favorite web browser

  • Enter your E-mail address (required for the biopython NCBI Entrez module)

  • Navigate to SpeciesPrimer settings [http://localhost:5000/pipelineconfig]

  • Download the nt BLAST DB

  • Customize the species list and other parameters if required

  • Navigate to Primer design [http://localhost:5000/primerdesign] and start primer design for new targets

Use the pipeline with the command line

  • After the docker run command open a new terminal

      # open an interactive terminal in the docker container
      $ sudo docker exec -it speciesprimer bash
    
  • Download the nt BLAST DB:

      $ getblastdb.py -dbpath /home/blastdb --delete
    
  • or alternatively

      $ cd /home/blastdb
      $ update_blastdb.pl --passive --decompress --blastdb_version 5 nt_v5
      $ cd /home/primerdesign
    
  • Customize the species list and other parameters if required (see docs/pipelinesetup.md for more info):

      $ nano /home/pipeline/dictionaries/species_list.txt
      $ nano /home/pipeline/p3parameters
      $ nano /home/pipeline/NO_Blast/NO_BLAST.gi
    
  • Start primer design

      $ speciesprimer.py
    

Introduction

The SpeciesPrimer pipeline is intended to help researchers find specific primer pairs for the detection and quantification of bacterial species in complex ecosystems. The pipeline uses genome assemblies of the target species to identify core genes (genes that are present in all assemblies) and checks the specificity for the target species using BLAST. Primer design is performed by Primer3, followed by stringent primer quality control. To make the evaluation of primer specificity faster and simpler, not all sequences of all bacterial species in the BLAST database are considered. Instead, the user provides a list of organisms that are expected to be present in the investigated ecosystem and should not be detected by the primer pair. The output of the pipeline is a comma-separated file with candidate primer pairs for the target species, which the user can test and evaluate further.
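
The two ideas in this paragraph, core genes and specificity against a user-provided species list, can be illustrated with plain sets. This is a toy sketch under stated assumptions, not the pipeline's implementation: the real pipeline relies on Roary and BLAST for these steps, and the function names and the example data below are hypothetical.

```python
def core_genes(assemblies):
    """Genes present in every assembly (intersection of all gene sets).

    `assemblies` maps an assembly name to the gene names it contains.
    """
    gene_sets = [set(genes) for genes in assemblies.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def is_specific(hit_species, species_list, target):
    """True if no hit belongs to a listed non-target species.

    Hits from species that are not in `species_list` are ignored,
    mirroring the idea that only organisms expected in the ecosystem
    are considered.
    """
    listed = {name.lower() for name in species_list}
    target = target.lower()
    for species in hit_species:
        name = species.lower()
        if name != target and name in listed:
            return False
    return True
```

For example, with assemblies `{"a1": {"recA", "tuf", "dnaK"}, "a2": {"recA", "tuf"}}` the core genes are `recA` and `tuf`, because `dnaK` is missing from one assembly.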

Pipeline workflow and tools

| Pipeline workflow | Tools | Reference |
|---|---|---|
| Input genome assemblies | | |
| - download | NCBI Entrez (Biopython) | Cock et al. 2009; Sayers 2009 |
| - annotation | Prokka | Seemann 2014 |
| - quality control | BLAST+ | Altschul et al. 1990 |
| Core gene sequences | | |
| - identification | Roary | Page et al. 2015 |
| - phylogeny | FastTree 2 | Price et al. 2010 |
| - selection of conserved sequences | SQLite3, Prank, consambig (EMBOSS), GNU parallel, DBGenerator.py | Löytynoja 2014; Rice et al. 2000; Tange 2011; microgenomics |
| - evaluation of specificity | BLAST+ | Altschul et al. 1990 |
| Primer | | |
| - design | Primer3 | Untergasser et al. 2012 |
| - quality control | BLAST+, Mfold, MFEPrimer 2.0, MPprimer | Altschul et al. 1990; Zuker et al. 1999; Qu et al. 2012; Shen et al. 2010 |

The DBGenerator.py script from the Microbial Genomics Lab at CBIB is used to create an SQL database from the Roary output.

Python modules and software used for the GUI:

flask

flask-wtf

gunicorn

MyDaemon

Run settings

| Section | Command line option [Input] | Description | Default |
|---|---|---|---|
| General | target [str] | Name of the target species | None (required) |
| | exception [str] | Name of a non-target bacterial species for which primer binding is tolerated | None |
| | path [str] | Absolute path of the working directory | Current working directory |
| | offline | Work offline with local genome assemblies | False |
| | skip_download | Skips download of genome assemblies from NCBI RefSeq FTP server | False |
| | assemblylevel [all, complete, chromosome, scaffold, contig] | Only genome assemblies with the selected assembly status will be downloaded from the NCBI RefSeq FTP server | ['all'] |
| | remote | Use the BLAST+ remote flag for BLAST searches | False |
| | blastseqs [100, 500, 1000, 2000, 5000] | Set the number of sequences per BLAST search. Decreasing the number of sequences requires less memory | 1000 |
| | blastdbv5 | Uses the nt_v5 database and limits all BLAST searches to taxid:2 (bacteria). Increases speed. | False |
| Quality control | qc_gene [rRNA, recA, dnaK, pheS, tuf] | Selection of housekeeping genes for BLAST search to determine the species of input genome assemblies | ['rRNA'] |
| | ignore_qc | Keep genome assemblies that fail to meet the criteria of the quality control step | False |
| Pan-genome analysis | skip_tree | Skips core gene alignment (Roary) and core gene phylogeny (FastTree) | False |
| Primer design | minsize [int] | Minimal accepted amplicon size of PCR primer pairs | 70 |
| | maxsize [int] | Maximal accepted amplicon size of PCR primer pairs | 200 |
| Primer quality control | mfold [float] | Set the deltaG threshold (max. deltaG) for the secondary structures at 60 °C in the PCR product, calculated by Mfold | -3.5 |
| | mpprimer [float] | Set the deltaG threshold (max. deltaG) for the primer-primer 3’-end binding, calculated by MPprimer | -3.0 |
| | mfethreshold [int] | Threshold for MFEprimer primer pair coverage (PPC) score. Higher values select for better coverage of target sequences and lower coverage of non-target sequences (recommended range 80 - 100). | 90 |
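
The defaults and allowed values from the table can be written down as data and checked before a run. The sketch below is only an illustration: the dictionary layout and the `validate` function are assumptions and do not reflect how SpeciesPrimer stores or parses its settings. The rule that `minsize` must be smaller than `maxsize` is likewise an assumption; the table does not state it.

```python
# Defaults copied from the run settings table; the layout is hypothetical.
DEFAULTS = {
    "target": None,            # required, no default
    "exception": None,
    "path": None,              # table default: current working directory
    "offline": False,
    "skip_download": False,
    "assemblylevel": ["all"],
    "remote": False,
    "blastseqs": 1000,
    "blastdbv5": False,
    "qc_gene": ["rRNA"],
    "ignore_qc": False,
    "skip_tree": False,
    "minsize": 70,
    "maxsize": 200,
    "mfold": -3.5,
    "mpprimer": -3.0,
    "mfethreshold": 90,
}

# Allowed values as listed in the [Input] column of the table.
ALLOWED = {
    "assemblylevel": {"all", "complete", "chromosome", "scaffold", "contig"},
    "blastseqs": {100, 500, 1000, 2000, 5000},
    "qc_gene": {"rRNA", "recA", "dnaK", "pheS", "tuf"},
}


def validate(settings):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    if not settings.get("target"):
        errors.append("target is required")
    if settings["blastseqs"] not in ALLOWED["blastseqs"]:
        errors.append(f"blastseqs: unsupported value {settings['blastseqs']}")
    for key in ("assemblylevel", "qc_gene"):
        bad = set(settings[key]) - ALLOWED[key]
        if bad:
            errors.append(f"{key}: unsupported value(s) {sorted(bad)}")
    # Assumption: an amplicon size range needs minsize < maxsize.
    if settings["minsize"] >= settings["maxsize"]:
        errors.append("minsize must be smaller than maxsize")
    return errors
```

A run configuration is then `dict(DEFAULTS, target="Lactobacillus_curvatus")` (an example value only), and `validate` reports any setting that falls outside the table.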

Tutorial (Ubuntu 16.04):

Docker setup

Download and install docker

Download from https://www.docker.com/get-docker and see the docs for installation instructions https://docs.docker.com/

Download images from docker hub

  1. Open a terminal:

    • HOST:

        $ sudo docker pull biologger/speciesprimer
      
  2. Now that you have the image, you can list it with

    • HOST:

        $ sudo docker images
      
  3. If there is more than one image from the repository biologger/speciesprimer, you can remove the image with the <none> tag

    • HOST:

        $ sudo docker rmi {image_id}
      

Choose directories

  1. Decide which directories (on the host) should be used by the container
  • If the pre-formatted nucleotide (nt) database from NCBI is already downloaded and unpacked on your computer, just add the path to the directory in the docker run command (-v path_to_host_blastdb_dir:/home/blastdb)

  • Create a directory for primerdesign and one for the BLAST database

  • Example:

    • Create two new directories in the home directory

    • HOST:

        # one for the primer design files
        $ mkdir /home/biologger/primerdesign
        # one for the nucleotide blast database
        $ mkdir /home/biologger/blastdb
      

Run a container instance

Create the container instance using the host directories as volumes for the docker container. In the container these directories are located at /home/blastdb and /home/primerdesign. The name of the container can be changed (--name). The -p option defines the ports that are published for the container, so you can access the container app at http://127.0.0.1:{hostport1/2} or http://localhost:{hostport1/2}. The host port is given on the left side and the container port on the right side. The container port is fixed and cannot be changed; if the host port is already in use, another host port can be selected. The link on the page where you control the runs is, however, fixed to port 9001. If you selected a different host port, you can still open the log file stream at http://localhost:{hostport2} in your browser.
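
Whether a host port is already in use can be tested by trying to bind it. The following is a minimal sketch with hypothetical helper names (`port_is_free`, `first_free_port`). It assumes Docker publishes the port on all interfaces, hence the default host `0.0.0.0`, and it only reflects the state at the moment of the check.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(preferred, tries=20, host="0.0.0.0"):
    """Return the first free port at or above `preferred`."""
    last = min(preferred + tries, 65536)  # stay inside the valid port range
    for port in range(preferred, last):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port in range {preferred}-{last - 1}")
```

For example, `first_free_port(5000)` gives a candidate for {hostport1} and `first_free_port(9001)` a candidate for {hostport2}.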

docker run

  1. HOST:

     $ sudo docker run \
     -v path_to_host_blastdb_dir:/home/blastdb \
     -v path_to_host_primerdesign_dir:/home/primerdesign \
     -p {hostport1}:5000 -p {hostport2}:9001 \
     --name speciesprimer_pipeline -it biologger/speciesprimer
    

Example:

  • HOST:

      $ sudo docker run \
      -v /home/biologger/blastdb:/home/blastdb \
      -v /home/biologger/primerdesign:/home/primerdesign \
      -p 5000:5000 -p 9001:9001 \
      --name speciesprimer_pipeline -it biologger/speciesprimer
    

In the terminal you see that the server in the container was started. Afterwards you can open the address [http://localhost:5000], or whichever port you have chosen for {hostport1}, in your web browser.
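
The docker run command shown above can also be assembled programmatically, which makes the host:container pairing of volumes and ports explicit. This is a sketch only: the function `docker_run_command` is hypothetical, it leaves out `sudo`, and it rejects relative host paths because Docker treats a non-absolute `-v` source as a volume name rather than a directory.

```python
import shlex
from pathlib import Path


def docker_run_command(blastdb_dir, primerdesign_dir,
                       hostport1=5000, hostport2=9001,
                       name="speciesprimer_pipeline",
                       image="biologger/speciesprimer"):
    """Build the argument list for the docker run command shown above."""
    for host_dir in (blastdb_dir, primerdesign_dir):
        if not Path(host_dir).is_absolute():
            raise ValueError(f"host path must be absolute: {host_dir}")
    return [
        "docker", "run",
        # host directory on the left, fixed container directory on the right
        "-v", f"{blastdb_dir}:/home/blastdb",
        "-v", f"{primerdesign_dir}:/home/primerdesign",
        # host port on the left, fixed container port on the right
        "-p", f"{hostport1}:5000",
        "-p", f"{hostport2}:9001",
        "--name", name,
        "-it", image,
    ]


def as_shell(command):
    """Render the argument list as a single shell-quoted string."""
    return shlex.join(command)
```

The list form can be passed to `subprocess.run`, while `as_shell` gives a string to paste into a terminal.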

docker stop

You can shut down the container by opening a terminal and running the command

    $ sudo docker stop {containername/id}

Example:

  • HOST:

      $ sudo docker stop speciesprimer_pipeline
    

docker start

The next time, you do not have to repeat the docker run command (this would create a new container without your modified settings). Instead, simply start the container with the command

    $ sudo docker start {containername/id}

Example:

  • HOST:

      $ sudo docker start speciesprimer_pipeline
    

Afterwards you can open the address [http://localhost:5000], or whichever port you have chosen for {hostport1}, in your web browser.

docker attach

If you want to see the status of the webserver in the container in your terminal (as after the docker run command), attach to the container with

    $ sudo docker attach {containername/id}

Example:

  • HOST:

      $ sudo docker attach speciesprimer_pipeline
    

docker exec

If you want to access the container with the terminal, you can use the following command (the -it option is for the interactive terminal)

    $ sudo docker exec -it {containername/id} bash

Example:

  • HOST:

      $ sudo docker exec -it speciesprimer_pipeline bash
    

Leave the container terminal

You can leave the docker container by typing exit

Example:

  • CONTAINER:

      $ exit
    

Test the container

If the container is not already running, start it, then open a terminal in it:

    $ sudo docker start {containername/id}

    $ sudo docker exec -it {containername/id} bash

If you see root@{containerID}:/home/primerdesign# in the terminal, you now have access to the terminal of the container.

Test if you have mounted the volumes correctly

  • CONTAINER:

      $ echo test > test.txt
    
    • Check if you find test.txt

        $ ls -l
      
  • HOST:

    • Check if you find test.txt on the host

        $ ls -l /home/{linux_username}/primerdesign
      

If you want to delete this test.txt file there are two options

  1. Do it in the container

    • CONTAINER:

        $ rm test.txt
      
  2. Do it on the host

    • Change the owner of the files in the primerdesign directory on the host (recursively).

    • HOST:

        $ sudo chown -R {linux_username} {path_to_primerdesign_dir}
      
    • Now you can move and delete the files and directories.

    Example:

    • HOST:

        $ sudo chown -R biologger /home/biologger/primerdesign
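
Before or after the chown step, it can help to see which files in the primerdesign directory belong to another user (files created in the container are owned by root, as the root@ prompt above suggests). A minimal sketch with a hypothetical helper name, for Linux hosts:

```python
import os
from pathlib import Path


def paths_not_owned_by(directory, uid=None):
    """List paths under `directory` (inclusive) not owned by `uid`.

    `uid` defaults to the user running the script.
    """
    if uid is None:
        uid = os.getuid()
    root = Path(directory)
    foreign = []
    for path in [root, *root.rglob("*")]:
        # lstat() inspects symlinks themselves instead of following them
        if path.lstat().st_uid != uid:
            foreign.append(path)
    return sorted(foreign)
```

An empty result for the primerdesign directory means the files can be moved and deleted on the host without changing the owner.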
      

Troubleshooting Docker

Conflict with the container name

"docker: Error response from daemon: Conflict. The container name "/speciesprimer_pipeline" is already in use by container "e9d0de003ce8eff06b34f8f46e4934797052e16dcdbd7e60214d05ea3828a70". You have to remove (or rename) that container to be able to reuse that name."

  1. Display your containers

    • HOST:

        $ sudo docker ps -a
      
  2. Stop the container with the container ID or the container name

        $ sudo docker stop {ContainerID/containername}

  3. Delete the container

        $ sudo docker rm {ContainerID/containername}
    

Now you can try again to create a container with the sudo docker run command

  • HOST:

      $ sudo docker run \
      -v /home/{linux_username}/blastdb:/home/blastdb \
      -v /home/{linux_username}/primerdesign:/home/primerdesign \
      -p {hostport1}:5000 -p {hostport2}:9001 \
      --name speciesprimer_pipeline -it biologger/speciesprimer
    

Example:

  • HOST:

      $ sudo docker run \
      -v /home/biologger/blastdb:/home/blastdb \
      -v /home/biologger/primerdesign:/home/primerdesign \
      -p 5000:5000 -p 9001:9001 \
      --name speciesprimer_pipeline -it biologger/speciesprimer
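
The name conflict described in this section can be detected before calling docker run by listing the existing container names. The sketch below is an assumption-laden illustration: the helper names are hypothetical, `container_names` needs a working docker CLI and sufficient permissions, and only the parsing and comparison are independent of Docker.

```python
import subprocess


def parse_names(ps_output):
    """Split the output of `docker ps -a --format {{.Names}}` into names."""
    return [line.strip() for line in ps_output.splitlines() if line.strip()]


def container_names():
    """Names of all containers, running or stopped (requires docker)."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return parse_names(result.stdout)


def name_conflicts(name, existing):
    """True if `name` is already taken; a leading '/' is ignored."""
    return name.lstrip("/") in {n.lstrip("/") for n in existing}
```

If `name_conflicts("speciesprimer_pipeline", container_names())` is true, either remove the old container as described above or pick another value for --name.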
    
