QuickBLAST v1.2.2

The current build is still being tested and is not guaranteed to work. Binaries of older versions are available here.

Requires

  • OpenMP support (-fopenmp)
  • R > 4.4.1
  • Rtools >= 4.4 (Windows)

??QuickBLAST

QuickBLAST is written in C++ and interfaced with R through Rcpp. It wraps the CBl2Seq class of the NCBI C++ Toolkit (and Apache Arrow in the same way) and exposes the functions to R with C linkage. I use getlogin() (in ArrowWrapper.cpp) to store the username in the output metadata, which might raise red flags. QuickBLAST provides better interoperability between R and NCBI BLAST. After much poking around, the dependent libraries (Apache Arrow and the NCBI C++ Toolkit) are now compiled from scratch; on Windows they are built without Windows APIs, using the MSYS2 and MinGW toolchain provided with Rtools 4.4.
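For readers wondering what the getlogin() call involves, here is a minimal sketch of reading the login name with a safe fallback. It is not the code in ArrowWrapper.cpp; the helper name current_user_or and the fallback value are assumptions made for illustration. getlogin() is a POSIX call that may return a null pointer (for example when the process has no controlling terminal), so the result should be checked before use.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>  // getlogin (POSIX)

// Hypothetical helper: returns the login name, or `fallback` when
// getlogin() cannot determine it (it returns nullptr in that case).
std::string current_user_or(const std::string& fallback) {
    // A non-empty fallback guarantees the metadata field is never blank.
    assert(!fallback.empty());
    const char* name = getlogin();
    if (name != nullptr && *name != '\0') {
        return std::string(name);
    }
    return fallback;
}
```

Calling current_user_or("unknown") yields either the login name or the literal "unknown", and nothing beyond that string is read from the system.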

The main differences between this package and the rest are:

  • QuickBLAST is multi-threaded: { reading files in chunks, BLASTing, wrapping hits into Arrow data structures } and { writing Arrow RecordBatches to the output file in batches } run in separate threads. Hits are also converted into an Rcpp::List if you want values returned to R.
  • QuickBLAST does not use system calls to invoke BLAST executables, so you do not need the BLAST programs installed on your system.
  • BLAST DBs are not explicitly created
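The threading split described in the list above can be sketched as a producer/consumer pipeline. This is only an illustration under stated assumptions, not QuickBLAST's actual implementation: HitBatch stands in for an Arrow RecordBatch, process_chunk stands in for the BLAST step, and the writer appends to an in-memory vector instead of an output file. All names here are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a batch of hits (an Arrow RecordBatch in QuickBLAST).
using HitBatch = std::vector<std::string>;

// Minimal thread-safe queue connecting the worker threads to the writer thread.
class BatchQueue {
public:
    void push(HitBatch batch) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(batch));
        }
        ready_.notify_one();
    }

    // Signals that no more batches will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a batch is available; returns false once closed and drained.
    bool pop(HitBatch& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<HitBatch> queue_;
    bool closed_ = false;
};

// Placeholder for "BLAST one chunk": it only tags each sequence.
HitBatch process_chunk(const std::vector<std::string>& chunk) {
    HitBatch hits;
    for (const auto& seq : chunk) hits.push_back("hit:" + seq);
    return hits;
}

// Workers process chunks while a single writer thread collects the batches.
// Returns everything the writer collected (order across batches is not fixed).
std::vector<std::string> run_pipeline(
        const std::vector<std::vector<std::string>>& chunks,
        std::size_t num_workers) {
    assert(num_workers > 0);
    BatchQueue queue;
    std::vector<std::string> written;  // touched only by the writer until join

    std::thread writer([&] {
        HitBatch batch;
        while (queue.pop(batch)) {
            written.insert(written.end(), batch.begin(), batch.end());
        }
    });

    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < num_workers; ++w) {
        workers.emplace_back([&, w] {
            // Static round-robin split of the chunks across workers.
            for (std::size_t i = w; i < chunks.size(); i += num_workers) {
                queue.push(process_chunk(chunks[i]));
            }
        });
    }
    for (auto& t : workers) t.join();
    queue.close();  // all producers are done; let the writer drain and exit
    writer.join();
    return written;
}
```

Keeping the writer in its own thread means slow output never stalls the compute threads, at the cost of batches arriving in a nondeterministic order.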

Cons :

  • Limited score attributes

Let me know if you want more information, and please report bugs to me on GitHub.

Installation (under construction)

devtools::install_github("https://github.com/vizkidd/QuickBLAST", force=TRUE)

Usage

# Install the package from a local tarball
remotes::install_local("QuickBLAST_1.0_R_x86_64-pc-linux-gnu.tar.gz", build=FALSE)
# Create BLAST instances; unset options fall back to the BLAST defaults
tblastx_ptr <- QuickBLAST::CreateNewBLASTInstance(seq_info = list(0,0,FALSE), program = "tblastx", options = list("evalue"=1e-05, "pident"=0.75, "qcovhsp_perc"=0.75))
blastn_ptr <- QuickBLAST::CreateNewBLASTInstance(seq_info = list(0,0,FALSE), program = "blastn", options = "")
# BLAST a query file against a subject file
QuickBLAST::BLAST2Files(ptr=tblastx_ptr, query="ungrouped.cds", subject="ungrouped.cds", out_file="out.tmp", seq_limit=1000, show_progress=TRUE, return_values=FALSE, num_threads=5)
QuickBLAST::BLAST2Seqs(ptr=tblastx_ptr, query="AAAAAAAAAAAAAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGG", subject="TTTTTTTTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC")
QuickBLAST::BLAST2Seqs(ptr=blastn_ptr, query=">i11166\nTGGCACGTCTGGTAGCAGTTTGCAGGGAAGGGGAAGAGGAATACCCGTTTCTCGCCAGACAGATCC", subject=">i11167\nATGGCACGTCTGGTAGCAGTTTGCAGGGAAGGGGAAGAGGAATACCCGTTTCTCGCCAGACAGATCCCCCTCTTCATCGATGACACTCTCACGATGGTGATGGAGTTTTCCGATAGCGTCATGG")
QuickBLAST::BLAST1Folder(ptr=tblastx_ptr, input_folder="test", extension=".cds", out_folder="test_out", num_threads=7, reciprocal_hits=FALSE)
QuickBLAST::BLAST2Folders(ptr=blastn_ptr, query="query", subject="subject", extension=".cds", out_folder="test2_out", num_threads=8, reciprocal_hits=FALSE)

The options are the same as in BLAST, except that the DB and output format options are not available. The list of available options can be checked with QuickBLAST::GetAvailableBLASTOptions() (empty elements are removed from the list and BLAST defaults are set on the C++ side). Inputs and outputs are passed as parameters, and the sequence specification (strand, sequence type) can be provided when the QuickBLAST object is created with QuickBLAST::GetQuickBLASTInstance() (or use the QuickBLAST::BLAST*() functions in R). The enums used by QuickBLAST in C++ are not exposed in R, so only integers are used; check QuickBLAST::GetQuickBLASTEnums() for their values.
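The "empty elements are removed and defaults are set" behaviour can be pictured with a small sketch. It is an assumption about the general approach, not QuickBLAST's actual code: the Options alias, the function names and the default values below are all hypothetical, and options are kept as strings purely for simplicity.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical representation: option name -> value, both as strings.
using Options = std::map<std::string, std::string>;

// Illustrative defaults only; the real defaults come from BLAST itself.
Options example_defaults() {
    return Options{{"evalue", "10"}, {"word_size", "11"}};
}

// Starts from the defaults, then applies every non-empty user option.
// Empty user entries are dropped, so the default for that key survives.
Options resolve_options(const Options& user, const Options& defaults) {
    Options resolved = defaults;
    for (const auto& entry : user) {
        assert(!entry.first.empty());        // option names must be non-empty
        if (entry.second.empty()) continue;  // empty element: keep the default
        resolved[entry.first] = entry.second;
    }
    return resolved;
}
```

With this scheme, passing an empty value for an option behaves the same as not passing it at all, which matches the way options = "" is used for the blastn instance in the usage section.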

BLAST Scores :

Currently supported scores

Future : (Looking for suggestions)

  • Implement more scores and filtering options
  • Include function for reading the arrow output files
  • Convert from arrow to GRanges (maybe with the use of arrow::Visit() functions)

Disclaimers for disclaimers, legal stuff for legal stuff and respect for respect, wherever it should go.
