primerminer's People

Contributors

vascoelbrecht

primerminer's Issues

Automatically cluster mitogenomes as well, to provide a backbone to map against

Currently, mitogenome sequences are provided to build an alignment, and the degenerate consensus of that alignment is used as a backbone against which all reads are mapped. It would be better to cluster the mitogenome sequences first to prevent overrepresentation of certain taxa. (This will rarely be a problem and does not matter much, but it would not hurt to implement at some point.)

If no sequences are obtained, a readLines error occurs

"When no COI sequences are available for a given order (and I assume, family), the error "Error in readLines(file) : 'con' is not a connection" is given and the download terminates. Manual removal of the taxon in question is then required, although this isn't obvious to a less-seasoned R user. A potential consideration would be somehow bypassing this error and instead noting that 0 sequences were available for the taxon. The download could then continue without need for regular surveillance. "

Thanks to Jordan Cuff for submitting this
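
The bypass suggested above could look roughly like the sketch below: each per-taxon download is wrapped in tryCatch(), so a failed read is recorded as 0 sequences and the loop continues. This is only an illustration of the pattern. download_taxon() is a hypothetical stand-in for the real download step, not a PrimerMiner function, and the taxon names are toy data.

```r
# Hypothetical stand-in for the per-taxon download step (not PrimerMiner's
# own code): returns the downloaded lines, or fails with the same message
# that readLines() gives when it is handed something that is not a connection.
download_taxon <- function(taxon, available) {
  if (!taxon %in% names(available)) {
    stop("'con' is not a connection")
  }
  available[[taxon]]
}

# Wrap each download in tryCatch() so a failing taxon is logged with
# 0 sequences instead of terminating the whole batch.
safe_download <- function(taxa, available) {
  counts <- integer(length(taxa))
  names(counts) <- taxa
  for (taxon in taxa) {
    seqs <- tryCatch(
      download_taxon(taxon, available),
      error = function(e) {
        message("No sequences for ", taxon, " (", conditionMessage(e), ")")
        character(0)
      }
    )
    counts[taxon] <- length(seqs)
  }
  counts
}

# Toy data: one taxon with two records, one taxon with none.
available <- list(Baetidae = c(">seq1", ">seq2"))
result <- safe_download(c("Baetidae", "Missingidae"), available)
print(result)
```

With this pattern the taxon without sequences appears in the summary with a count of 0, and no manual removal is needed.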

Any option to get the species in each cluster?

Is there any option to obtain the species in each cluster?

I know this is related to vsearch; its documentation gives an example that includes the taxonomy and the species name after the "s" identifier: ">X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,g:Escherichia/Shigella,s:Escherichia_coli"

Maybe I'm not reading the output correctly. For example, one of my groups, Bathyraja, has 8 species:

Bathyraja,
,"Bathyraja brachyurops"
,"Bathyraja cousseauae"
,"Bathyraja scaphiops"
,"Bathyraja griseocauda"
,"Bathyraja magellanica"
,"Bathyraja albomaculata"
,"Bathyraja macloviana"
,"Bathyraja multispinis"

The results report that they were grouped into 5 clusters, but I couldn't find which species were placed in each cluster.

cluster_file:

Reading file Bathyraja/Vsearch/Bathyraja_all_drep+1.fasta 100%
35182 nt in 45 seqs, min 591, max 1757, avg 782
Masking 100%
Sorting by length 100%
Counting unique k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 5 Size min 1, max 37, avg 9.0
Singletons: 2, 4.4% of seqs, 40.0% of clusters
Multiple alignments 100%
vsearch v1.10.2_osx_x86_64, 8.0GB RAM, 4 cores
https://github.com/torognes/vsearch

The log.txt output is:

2020-08-18 00:13:05 - Downloading BOLD sequence data

#Bold_data: Folder Bathyraja/Bathyraja
Taxon	Sequences	downl_time
Bathyraja brachyurops	6	0.31 secs
Bathyraja cousseauae	3	0.31 secs
Bathyraja scaphiops	8	0.36 secs
Bathyraja griseocauda	12	0.34 secs
Bathyraja magellanica	4	0.3 secs
Bathyraja albomaculata	8	0.86 secs
Bathyraja macloviana	4	0.95 secs
Bathyraja multispinis	10	0.32 secs
#Bold_data_end

2020-08-18 00:13:09 - Downloading GenBank sequence data

Search query: REPLACE_WITH_TAXA[Organism] AND (COi OR CO1 OR COXi OR COX1) AND 1:2000[Sequence Length]

#GB_data: Folder Bathyraja/Bathyraja
Taxon	Sequences	downl_time
Bathyraja brachyurops	5	5.8 secs
Bathyraja cousseauae	3	5.3 secs
Bathyraja scaphiops	7	5.4 secs
Bathyraja griseocauda	10	5.4 secs
Bathyraja magellanica	4	5.1 secs
Bathyraja albomaculata	6	5.5 secs
Bathyraja macloviana	4	5.3 secs
Bathyraja multispinis	8	5.2 secs
#GB_data_end

2020-08-18 00:13:52  - Downloading Miochondrial Genomes from GenBank

Search query: REPLACE_WITH_TAXA[Organism] AND mitochondrion[filter] AND genome AND 2001:80000[Sequence Length]

#mito_data: Folder Bathyraja/Bathyraja
Taxon	Sequences	downl_time
Bathyraja brachyurops	0	0.83 secs
Bathyraja cousseauae	0	0.92 secs
Bathyraja scaphiops	0	0.84 secs
Bathyraja griseocauda	2	4.8 secs
Bathyraja magellanica	0	1.4 secs
Bathyraja albomaculata	2	4.5 secs
Bathyraja macloviana	0	1.4 secs
Bathyraja multispinis	2	4.7 secs
#mito_data_end

2020-08-18 00:14:11  - Converting Mito Genbank to fasta

#mito_gb2fasta
GBfile	noMito	unique
Bathyraja/Bathyraja/Bathyraja albomaculata_mito.gb	2	1
Bathyraja/Bathyraja/Bathyraja griseocauda_mito.gb	2	1
Bathyraja/Bathyraja/Bathyraja multispinis_mito.gb	2	1
#mito_gb2fasta_end

2020-08-18 00:14:11 - Merging fasta files

Reading in files matching BOLD\.fasta$:
Folders: Bathyraja/Bathyraja
Files: 
Clipping: Left 0 bp, Right 0 bp

Matching files which were written into  Bathyraja/Bathyraja_Bold.fasta : 
 Bathyraja/Bathyraja/Bathyraja albomaculata_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja brachyurops_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja cousseauae_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja griseocauda_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja macloviana_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja magellanica_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja multispinis_BOLD.fasta, Bathyraja/Bathyraja/Bathyraja scaphiops_BOLD.fasta 

2020-08-18 00:14:11 - Merging fasta files

Reading in files matching GB\.fasta$:
Folders: Bathyraja/Bathyraja
Files: 
Clipping: Left 0 bp, Right 0 bp

Matching files which were written into  Bathyraja/Bathyraja_GB.fasta : 
 Bathyraja/Bathyraja/Bathyraja albomaculata_GB.fasta, Bathyraja/Bathyraja/Bathyraja brachyurops_GB.fasta, Bathyraja/Bathyraja/Bathyraja cousseauae_GB.fasta, Bathyraja/Bathyraja/Bathyraja griseocauda_GB.fasta, Bathyraja/Bathyraja/Bathyraja macloviana_GB.fasta, Bathyraja/Bathyraja/Bathyraja magellanica_GB.fasta, Bathyraja/Bathyraja/Bathyraja multispinis_GB.fasta, Bathyraja/Bathyraja/Bathyraja scaphiops_GB.fasta 

2020-08-18 00:14:11 - Merging fasta files

Reading in files matching [mito]\.fasta$:
Folders: Bathyraja/Bathyraja
Files: 
Clipping: Left 0 bp, Right 0 bp

Matching files which were written into  Bathyraja/Bathyraja_mito.fasta : 
 Bathyraja/Bathyraja/Bathyraja albomaculata_mito.fasta, Bathyraja/Bathyraja/Bathyraja griseocauda_mito.fasta, Bathyraja/Bathyraja/Bathyraja multispinis_mito.fasta 

2020-08-18 00:14:11 - Merging fasta files

Reading in files matching \.fasta$:
Folders: 
Files: Bathyraja/Bathyraja_Bold.fasta, Bathyraja/Bathyraja_GB.fasta, Bathyraja/Bathyraja_mito.fasta
Clipping: Left 0 bp, Right 0 bp

Matching files which were written into  Bathyraja/Bathyraja_all.fasta : 
 Bathyraja/Bathyraja_Bold.fasta, Bathyraja/Bathyraja_GB.fasta, Bathyraja/Bathyraja_mito.fasta 

2020-08-18 00:14:11 - Clustering reads with Vsearch
vsearch v1.10.2_osx_x86_64, 8.0GB RAM, 4 cores

Used fasta file: Bathyraja_all.fasta
Number of imput sequences: 105
Dereplicated: 45
Cluster: 5

VSEARCH comands:

/usr/local/lib/R/site-library/PrimerMiner/vsearch-1.10.2_osx_x86_64 -derep_fulllength Bathyraja/Bathyraja_all.fasta -output Bathyraja/Vsearch/Bathyraja_all_drep.fasta >Bathyraja/Vsearch/temp.txt
/usr/local/lib/R/site-library/PrimerMiner/vsearch-1.10.2_osx_x86_64 -derep_fulllength Bathyraja/Vsearch/Bathyraja_all_drep.fasta -output Bathyraja/Vsearch/Bathyraja_all_drep+1.fasta -sizeout
/usr/local/lib/R/site-library/PrimerMiner/vsearch-1.10.2_osx_x86_64 -cluster_fast Bathyraja/Vsearch/Bathyraja_all_drep+1.fasta -strand both -id 0.97 -msaout Bathyraja/Vsearch/cluster_file  >Bathyraja/Vsearch/temp.txt
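
The cluster_fast command logged above only writes the multiple alignments (-msaout). vsearch can also write a tab-separated cluster membership table with its --uc option, so one possible route is to rerun the clustering step by hand with --uc added and then summarise that table in R. The sketch below assumes such a file exists and that the sequence labels end in the species name followed by "_<number>"; both the labels and the toy records are invented for illustration, and the real PrimerMiner headers may look different. Note also that dereplication merges identical sequences, so each label stands for one representative only.

```r
# Toy example of vsearch's tab-separated .uc output (10 columns).
# Record types: S = cluster centroid, H = hit assigned to a centroid,
# C = per-cluster summary. Labels are invented for illustration.
uc_text <- c(
  "S\t0\t650\t*\t*\t*\t*\t*\tBathyraja_griseocauda_1\t*",
  "H\t0\t640\t98.5\t+\t0\t0\t640M\tBathyraja_scaphiops_1\tBathyraja_griseocauda_1",
  "S\t1\t655\t*\t*\t*\t*\t*\tBathyraja_multispinis_1\t*",
  "C\t0\t2\t*\t*\t*\t*\t*\tBathyraja_griseocauda_1\t*",
  "C\t1\t1\t*\t*\t*\t*\t*\tBathyraja_multispinis_1\t*"
)

uc <- read.table(text = uc_text, sep = "\t", stringsAsFactors = FALSE,
                 comment.char = "", quote = "")
names(uc)[c(1, 2, 9)] <- c("type", "cluster", "label")

# Keep centroids and hits only; the C rows just repeat the centroid label.
members <- uc[uc$type %in% c("S", "H"), ]

# Strip the trailing "_<number>" to recover a species name
# (assumed label format, adjust to the real headers).
members$species <- sub("_[0-9]+$", "", members$label)

# List the distinct species found in each cluster.
species_by_cluster <- lapply(split(members$species, members$cluster), unique)
print(species_by_cluster)
```

For a real run, the toy text would be replaced by read.table() on the .uc file written by vsearch.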

Install issue

Hi Vasco,

I'm keen to give PrimerMiner a try but have run into an issue. When installing, I get the following Warnings and Error:

> install.packages("~/Downloads/PrimerMiner-0.12.tar.gz", repos = NULL, type = "source")
Warning in untar2(tarfile, files, list, exdir, restore_times) :
  skipping pax global extended headers
ERROR: cannot extract package from ‘/Users/sebastian/Downloads/PrimerMiner-0.12.tar.gz’
Warning in install.packages :
  installation of package ‘/Users/sebastian/Downloads/PrimerMiner-0.12.tar.gz’ had non-zero exit status

Any idea what I might be doing wrong?

Many thanks,
Sebastian

Design Primers

Maybe I got it wrong but does PrimerMiner design primers as well? I couldn't find any tutorial in the wiki...

Check version in config file against installed PrimerMiner file

Stop execution with an error if there are mismatches.

Reason: The config file variables might change over time (spelling, additional features, features becoming obsolete). Backwards compatibility can't be guaranteed, as PrimerMiner is in very active development, and a new config file is quickly made!
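
A minimal sketch of such a check, assuming the config file stores the version it was written for as a string (how the version would actually be stored in the config file is not specified here, so the function and its arguments are hypothetical):

```r
# Compare the version a config file was written for against the installed
# package version and stop with an error on a mismatch.
check_config_version <- function(config_version, installed_version) {
  cfg <- package_version(config_version)
  inst <- package_version(installed_version)
  if (cfg != inst) {
    stop("Config file was written for version ", format(cfg),
         " but version ", format(inst), " is installed. ",
         "Please generate a new config file.", call. = FALSE)
  }
  invisible(TRUE)
}

# In real use the installed version could come from
# utils::packageVersion("PrimerMiner"); plain strings are used here so the
# example runs without the package.
check_config_version("0.12", "0.12")  # matching versions pass silently

outcome <- tryCatch(
  check_config_version("0.12", "0.21"),
  error = function(e) conditionMessage(e)
)
print(outcome)
```

Using package_version() rather than comparing raw strings keeps the comparison robust if a looser rule (for example, same minor version) is wanted later.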

include product annotation in mitogenome extractions!

"that for many of the mitogenomes the 12S or 16S rRNA gene is not annotated with a gene ID in genbank. Instead the 12S and 16S regions are often annotated as product="12S ribosomal RNA" and product="16S ribosomal RNA". I have tried specifying these product annotations in the config file but primerminer does not retrieve these sequences." - suggested by Jonas Bylemans
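
To illustrate what matching on the product qualifier could involve, here is a rough sketch that scans a GenBank feature table for /product= values and returns the locations of the matching features. It is not PrimerMiner's extraction code; the feature table and coordinates are toy values, and find_by_product() is a hypothetical helper that only handles simple single-line locations.

```r
# Toy GenBank feature table; coordinates are invented for illustration.
gb_lines <- c(
  "FEATURES             Location/Qualifiers",
  "     tRNA            1..69",
  "                     /product=\"tRNA-Phe\"",
  "     rRNA            70..1024",
  "                     /product=\"12S ribosomal RNA\"",
  "     rRNA            1097..2770",
  "                     /product=\"16S ribosomal RNA\"",
  "ORIGIN"
)

find_by_product <- function(lines, products) {
  # Feature lines start with exactly five spaces followed by the feature key;
  # qualifier lines are indented further and are skipped by this pattern.
  feature_idx <- grep("^ {5}[^ ]", lines)
  hits <- list()
  for (i in seq_along(feature_idx)) {
    start <- feature_idx[i]
    end <- if (i < length(feature_idx)) feature_idx[i + 1] - 1 else length(lines)
    block <- lines[start:end]
    # Pull the quoted value out of any /product= qualifier in this feature.
    product <- sub('.*/product="([^"]*)".*', "\\1",
                   grep("/product=", block, value = TRUE))
    if (length(product) == 1 && product %in% products) {
      # Drop the indentation and feature key, keeping only the location.
      hits[[product]] <- sub("^ {5}[^ ]+ +", "", lines[start])
    }
  }
  hits
}

hits <- find_by_product(gb_lines, c("12S ribosomal RNA", "16S ribosomal RNA"))
print(hits)
```

Real records also contain joined or complemented locations and multi-line qualifiers, which this sketch does not cover.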

primer_evaluation

An error occurs when the fasta file contains only one sequence.

Workaround: copy the sequence so that the file contains 2 ;)
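
The workaround can be scripted. The sketch below appends a copy of the record under a new header when a FASTA file holds exactly one sequence; duplicate_single_record() is a hypothetical helper written for this example, and the "_copy" suffix is an arbitrary choice.

```r
# If a FASTA file holds exactly one record, append a copy of it under a
# new header so that downstream code sees two sequences.
duplicate_single_record <- function(path) {
  lines <- readLines(path)
  headers <- grep("^>", lines)
  if (length(headers) == 1) {
    copy <- lines[headers:length(lines)]
    copy[1] <- paste0(copy[1], "_copy")
    writeLines(c(lines, copy), path)
  }
  # Return the number of records now in the file.
  invisible(sum(grepl("^>", readLines(path))))
}

# Toy file with a single record spread over two sequence lines.
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGTACGT", "ACGT"), fasta)
n_records <- duplicate_single_record(fasta)
print(n_records)
```

Files that already contain two or more records are left unchanged.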

Error installing JAMP locally

Hi Vasco,
We are following package_tutorial.R, and when we run the following line:

install_github("VascoElbrecht/JAMP", subdir="JAMP")

We obtain the following error:

Downloading GitHub repo VascoElbrecht/JAMP@master
tar: Failed to set default locale
tar: Failed to set default locale
Skipping 1 packages not available: PrimerMiner
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_TIME failed, using "C"
3: Setting LC_MESSAGES failed, using "C"
4: Setting LC_MONETARY failed, using "C"
v checking for file '/private/var/folders/l0/636zb7xd0fg015xw9v4mgt380000gn/T/Rtmpymi7u5/remotes21c51e830429/VascoElbrecht-JAMP-54b0c90/JAMP/DESCRIPTION' ...

  • preparing 'JAMP':
    v checking DESCRIPTION meta-information ...
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
  • building 'JAMP_0.67.tar.gz'

Error: (converted from warning) Setting LC_CTYPE failed, using "C"
Execution halted
Error in i.p(...) :
(converted from warning) installation of package '/var/folders/l0/636zb7xd0fg015xw9v4mgt380000gn/T//Rtmpymi7u5/file21c554371e63/JAMP_0.67.tar.gz' had non-zero exit status

And when we run
install_github("VascoElbrecht/PrimerMiner", subdir="PrimerMiner")
We obtain:
Downloading GitHub repo VascoElbrecht/PrimerMiner@master
tar: Failed to set default locale
tar: Failed to set default locale
These packages have more recent versions available.
Which would you like to update?

1: XML (3.98-1.17 -> 3.98-1.18) [CRAN]

Enter one or more numbers separated by spaces, or an empty line to cancel
1: 1
XML (3.98-1.17 -> 3.98-1.18) [CRAN]
Installing 1 packages: XML

There is a binary version available but the source version is later:
binary source needs_compilation
XML 3.98-1.17 3.98-1.18 TRUE

Do you want to install from sources the package which needs compilation? (Yes/no/cancel) Yes
installing the source package 'XML'

trying URL 'https://cran.rstudio.com/src/contrib/XML_3.98-1.18.tar.gz'
Content type 'application/x-gzip' length 1601173 bytes (1.5 MB)

downloaded 1.5 MB

Error: (converted from warning) Setting LC_CTYPE failed, using "C"
Execution halted
Error in i.p(...) :
(converted from warning) installation of package 'XML' had non-zero exit status

And the package is not installed.

Any suggestions? Thanks

María

not available for this version of R

Hello,

I'm trying to install PrimerMiner, but I got the following error:

Warning in install.packages :
package ‘PrimerMiner-0.21.tar.gz’ is not available for this version of R

What's the problem?

plot_alignment step restrictions

Hi @VascoElbrecht

I hope everything's going well!

I have been using PrimerMiner to develop some specific and metabarcoding primers, so I've been using some Geneious fasta files that I already have (but complying with your recommendations in the YT tutorial) instead of following the batch download steps.

The problem arises when I try to plot more than 4 individual fasta files with plot_alignments. Using 5 or more files results in an incomplete plot (without the consensus and nucleotide letter part), and the console returns this error:

> plot_alignments(alignments, Order_names=gsub(".*./._(.*)_.*", "\\1", alignments))
Error in list[[k]][start:end, 2:5] : subscript out of bounds

I have tried changing the height and width, but this did not help. Any recommendations?

PrimerMiner

Hi Dears

I got the following message when trying to install PrimerMiner with this code:

install.packages("PrimerMiner", repos = NULL, type="source", dependencies=T)

Warning: invalid package 'PrimerMiner'
Error: ERROR: no packages specified
Warning in install.packages :
installation of package ‘PrimerMiner’ had non-zero exit status

Any clue how to solve this issue?

Some errors when trying to batch download with 12S and 16S markers

Hello,

I am finding two errors when batch downloading using PrimerMiner...

I have set the marker in the config file to:

Marker = c("16s", "16S", "16S ribosomal RNA", "16s Ribosomal RNA") # specify target gene

and I have turned off the downloads from BOLD as specified in the wiki.
But then, when running the batch download script, I run into this problem (for Coleoptera families there is no problem at all, only when moving on to other insect orders and families):

Error in download.file(paste("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=", :
'curl' call had nonzero exit status
curl: (16) Error in the HTTP2 framing layer

Also, even when I check the Coleoptera output, there is no Groups_mito.fasta file to work with for the alignment step in Geneious.
