FindSV

FindSV is a pipeline for finding structural variations (SVs). It is designed to run on the UPPMAX system but should work on other servers that use Slurm to queue jobs.

FindSV can run FindTranslocations, CNVnator and fermikit simultaneously on a large number of samples. The pipeline also creates a database of the events found and counts the occurrences of each event.

FindSV accepts the path to a folder containing multiple project folders. Each project folder should store the bam files used as input for the various tools and scripts that analyse structural variants.
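The folder layout described above can be illustrated with a short Python sketch. It is not taken from the FindSV source: the function names `find_bam_files` and `bams_per_project` are hypothetical, and the `recursive` flag only mirrors the recursive option described in the config file section. The demo builds a throwaway folder tree so that the example is self-contained.

```python
import os
import tempfile

def find_bam_files(project_dir, recursive=True):
    """Return the sorted paths of all .bam files in one project folder."""
    bams = []
    if recursive:
        # Walk every subfolder of the project folder.
        for root, _dirs, files in os.walk(project_dir):
            bams.extend(os.path.join(root, f) for f in files if f.endswith(".bam"))
    else:
        # Only look at files directly inside the project folder.
        for f in os.listdir(project_dir):
            path = os.path.join(project_dir, f)
            if f.endswith(".bam") and os.path.isfile(path):
                bams.append(path)
    return sorted(bams)

def bams_per_project(root_dir, recursive=True):
    """Map each project folder name under root_dir to its bam files."""
    result = {}
    for name in sorted(os.listdir(root_dir)):
        project_dir = os.path.join(root_dir, name)
        if os.path.isdir(project_dir):
            result[name] = find_bam_files(project_dir, recursive)
    return result

# Demo on a temporary tree: two project folders, one with a nested bam file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "project1", "sub"))
os.makedirs(os.path.join(root, "project2"))
for parts in (("project1", "a.bam"), ("project1", "sub", "b.bam"), ("project2", "c.bam")):
    open(os.path.join(root, *parts), "w").close()

found = bams_per_project(root)
for project, bams in sorted(found.items()):
    print(project, len(bams))
```

With `recursive=False`, only bam files placed directly in each project folder are returned, so `project1` in the demo would yield one file instead of two.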

Install

Download and build the pipeline using the following commands:

git clone https://github.com/J35P312/FindSV.git
cd FindSV/programFiles
bash InstallScript

Running

First load python 2.7:

module load python/2.7

Before a project can be analysed, a project file must be created. An example file is found in the projects folder inside the FindSV folder. The extension of a project file must be txt. If no project is specified with the --project option, all projects that have a project file in the projects folder will be analysed. By copying the example.txt file and changing its fields, the user can customise the analysis of different projects.

The pipeline is started using the following command:

python FindSV --run

Alternatively, run it every nth hour using this command:

python FindSV --run --cycle n

To analyse only one of the projects, run the pipeline using the following arguments:

python FindSV --run --project runthisproject

where runthisproject is the path to a project file.

The available tools of the pipeline may be listed using the --list option:

python FindSV --list

At the moment the available variant callers are:

  • Fermikit
  • FindTranslocations
  • CNVnator
  • delly

The config file

The config file (config.txt) is located in the FindSV folder. It contains the following fields.

-working_dir= sets the output directory of the pipeline. The output is stored in subfolders for project, tool and sample: working_dir/project/tool/sample.

If working_dir is set to default, a folder named output will be created in the FindSV directory.
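As a minimal sketch of the working_dir/project/tool/sample layout, the following Python function builds the output folder for one sample. The function name `sample_output_dir` and the `findsv_dir` parameter are hypothetical and only illustrate the rule stated above, including the assumption that the literal value default maps to an output folder inside the FindSV directory.

```python
import os

def sample_output_dir(working_dir, project, tool, sample, findsv_dir="FindSV"):
    """Return the output folder for one sample: working_dir/project/tool/sample."""
    # Assumption: the value "default" means <FindSV directory>/output.
    if working_dir == "default":
        working_dir = os.path.join(findsv_dir, "output")
    return os.path.join(working_dir, project, tool, sample)

explicit = sample_output_dir("results", "project1", "CNVnator", "sampleA")
fallback = sample_output_dir("default", "project1", "CNVnator", "sampleA")
print(explicit)
print(fallback)
```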

Set bamfile_location=path/to/bam and working_dir=path/to/project/folders.

-available_tools=

the tools available for analysing the bam files. Each has a function within the script.py file found inside the module folder.

-excluded_tools=

tools that are not to be used. To exclude CNVnator:

excluded_tools=CNVnator

If more than one tool is to be excluded, separate the tools with a semicolon (;):

excluded_tools=CNVnator;FindTranslocations

-account=

the slurm project account, example:

account=b2011162

Projects may be excluded by adding them to the excluded projects list:

excluded_projects=

Examples:

excluded_projects=project1
excluded_projects=project1;project2
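The key=value lines and the semicolon-separated lists used by excluded_tools and excluded_projects can be read with a few lines of Python. This is a sketch rather than the parser FindSV itself uses: the helper names are hypothetical, and treating available_tools as a semicolon-separated list is an assumption, since only the excluded lists are documented with that separator.

```python
def parse_config(text):
    """Parse key=value lines into a dict, ignoring lines without '='."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config

def split_list(value):
    """Split a semicolon-separated value, dropping empty entries."""
    return [item.strip() for item in value.split(";") if item.strip()]

def active_tools(config):
    """Return the available tools that are not listed in excluded_tools."""
    excluded = set(split_list(config.get("excluded_tools", "")))
    # Assumption: available_tools uses the same ';' separator.
    available = split_list(config.get("available_tools", ""))
    return [tool for tool in available if tool not in excluded]

example = """available_tools=Fermikit;FindTranslocations;CNVnator
excluded_tools=CNVnator;FindTranslocations
excluded_projects=project1;project2
"""

config = parse_config(example)
tools = active_tools(config)
skipped_projects = split_list(config["excluded_projects"])
print(tools)
print(skipped_projects)
```

An empty value such as `excluded_tools=` yields an empty list, so nothing is excluded.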

-uppmax= Used to enable the UPPMAX module system (default).

-recursive= If recursive is set to yes, FindSV will perform a recursive search for bam files within the project folders (default).

The reference file

The reference file (references.txt) contains the paths to the references used by fermikit and CNVnator. It is found in the FindSV directory and contains the following:

chromosomes=/proj/b2014152/nobackup/jesperei/references/chromosomes
bwa-indexed-ref=/sw/data/uppnex/reference/Homo_sapiens/hg19/program_files/bwa/concat.fa

chromosomes is used by CNVnator and should contain a reference fasta file for each chromosome. Files should be named in the following manner:

chr1.fa
chr2.fa
...
chrN.fa
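Before starting a run it may help to check that the chromosomes folder follows this naming scheme. The sketch below is an illustration only: `missing_chromosome_files` is a hypothetical helper, and the chromosome names passed to it are whatever the user expects for the reference in use. The demo works on a temporary folder so that it does not depend on any real reference path.

```python
import os
import tempfile

def missing_chromosome_files(chrom_dir, names):
    """Return the expected chr<name>.fa files that are absent from chrom_dir."""
    expected = ["chr{0}.fa".format(name) for name in names]
    present = set(os.listdir(chrom_dir))
    return [filename for filename in expected if filename not in present]

# Demo: a folder that holds chr1.fa and chr2.fa but lacks chrX.fa.
demo_dir = tempfile.mkdtemp()
for filename in ("chr1.fa", "chr2.fa"):
    open(os.path.join(demo_dir, filename), "w").close()

missing = missing_chromosome_files(demo_dir, [1, 2, "X"])
print(missing)
```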

bwa-indexed-ref is used by fermikit; it should contain the path to a bwa-indexed reference file.

The path file

The path file contains the paths to CNVnator, FindTranslocations and vep. If no path is given, FindSV will try the default UPPMAX configuration. Example:

FindTranslocations=/home/jesperei/FindTranslocations/bin/FindTranslocations
CNVnator=
cnvnator2VCF=
vep=/home/jesperei/FindSV/programFiles/ensembl-tools-release-81/scripts/variant_effect_predictor/variant_effect_predictor.pl
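The following Python sketch shows one way to read such a path file, treating an empty entry as "no path given". It is not the FindSV implementation: `parse_paths` is a hypothetical helper, and returning None for an empty value is simply a convenient way to signal that the default UPPMAX configuration should be tried.

```python
def parse_paths(text):
    """Parse name=path lines; an empty path is stored as None."""
    paths = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        # None marks an entry without a path (fall back to the default setup).
        paths[name.strip()] = value.strip() or None
    return paths

example = """FindTranslocations=/home/jesperei/FindTranslocations/bin/FindTranslocations
CNVnator=
cnvnator2VCF=
"""

paths = parse_paths(example)
for name in sorted(paths):
    status = paths[name] if paths[name] is not None else "default configuration"
    print(name, "->", status)
```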

Contributors

dnil, j35p312
