
sift4g's People

Contributors

pauline-ng, rvaser


sift4g's Issues

Use of uninitialized value in string ne at check_genes.pl-populating databases

Dear rvaser,
I just tried to run the test_file [homo_sapiens_small] to build a SIFT_db. Everything went fine until the "populating databases" stage, where warnings like these appeared:
Use of uninitialized value $siftscore in numeric ge (>=) at check_genes.pl line 53, <DB_IN> chunk 601612175.
Use of uninitialized value $siftscore in string ne at check_genes.pl line 53, <DB_IN> chunk 601612175.
Use of uninitialized value in string ne at check_genes.pl line 56, <DB_IN> chunk 601612175.
Use of uninitialized value in string ne at check_genes.pl line 56, <DB_IN> chunk 601612175
There were .SIFTpredictions/.aligned.fasta files in SIFT_predictions, and 21.gz/MT.gz in GRCh38.83.
Do these warnings affect the results? (The warnings fill the screen, and "all done" never appears.)
When I interrupted the command, it showed:
File "check_SIFTDB.py", line 301, in <module>
process_chr_file (infile, DNA_base_change_syn_nonsyn_counts, DNA_base_change_aa_pred, all_nonsyn_changes, dbSNP_predictions, novel_predictions, ref_predictions, aa_change_pred)
File "check_SIFTDB.py", line 202, in process_chr_file
for line in siftdb_fp.readlines():
File "/usr/lib/python3.8/codecs.py", line 319, in decode
def decode(self, input, final=False):
KeyboardInterrupt

I installed sift4g and SIFT4G_Create_Genomic_DB on Ubuntu 20.04 with the following packages:
Perl -- 5.30.0
DBI -- 1.643
BioPerl -- 1.7.8
LWP -- 6.67
Switch -- 2.17
Python 3.8.10

Best regards.

regex_error

Hi,
I'm not sure why sift4g fails with this error when I run it:
"terminate called after throwing an instance of 'std::regex_error'
what(): regex_error"
Could you please tell me what causes it?
Best wishes,
Liu

Alignment bug

Hi Robert,

I am attempting to create a SIFT database by using Pauline's SIFT4G_Create_Genomic_DB method. I am working with a genome assembly and annotation from RefSeq (fasta: GCF_008728515.1_Panubis1.0_genomic.fna.gz, annotation: GCF_008728515.1_Panubis1.0_genomic.gtf.gz, both available at https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/008/728/515/GCF_008728515.1_Panubis1.0). A failure is occurring during the SIFT4G alignment step:

start siftsharp, getting the alignments
~/programs/sift/sift4g/bin/sift4g -d ~/programs/sift/SIFT_databases/uniref90.fasta -q ~/programs/sift/SIFT_databases/GCF_008728515.1_Panubis1.0/all_prot.fasta --subst ~/programs/sift/SIFT_databases/GCF_008728515.1_Panubis1.0/subst --out ~/programs/sift/SIFT_databases/GCF_008728515.1_Panubis1.0/SIFT_predictions --sub-results
** Checking query data and substitutions files **

  • processing queries: 100.00/100.00% *

** Searching database for candidate sequences **

  • processing database part 202 (size ~0.25 GB): 100.00/100.00% *

** Aligning queries with candidate sequences **
Alignment score and position are not consensus.72.50/100.00% **

The error looks to be the same as pauline-ng/SIFT4G_Create_Genomic_DB#15, and is perhaps the same bug that you raised an issue for here: mkorpar/swsharp#1.

Have you found a work-around for this problem? Is it possible, for instance, to alter the GTF file to remove the sequence(s) triggering this error?

Thanks,
Jacqueline

P.S. I have successfully run the test examples for both sift4g and SIFT4G_Create_Genomic_DB, verifying that everything is installed correctly and apparently functional.
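
One possible workaround for the question above (removing the sequence or sequences that trigger the error) could be sketched as follows. This is not a fix confirmed by the maintainers. It filters a protein FASTA rather than the GTF, it assumes the failing query IDs are already known (the log above does not name them), and the IDs used in the demonstration are made up. Whether the pipeline accepts a hand-filtered all_prot.fasta is also an assumption.

```python
from typing import Iterable, Iterator, Set


def drop_fasta_records(lines: Iterable[str], skip_ids: Set[str]) -> Iterator[str]:
    """Yield FASTA lines, omitting every record whose ID is in skip_ids.

    The ID is taken to be the first whitespace-delimited token after '>'.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            record_id = tokens[0] if tokens else ""
            keep = record_id not in skip_ids
        if keep:
            yield line


# Tiny in-memory demonstration with made-up records (hypothetical IDs).
example = [">protA desc\n", "MKV\n", ">protB\n", "MLL\n", "QQ\n", ">protC\n", "MSG\n"]
filtered = list(drop_fasta_records(example, {"protB"}))
```

Because the function works on any iterable of lines, it can be fed an open file handle and written back out line by line without loading the whole FASTA into memory.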

About the number of DELETERIOUS from SIFT annotation

Hello Robert,
I used the genome FASTA file, the protein FASTA files, and the genome annotation file to build the database. Despite some warnings, the database was built successfully. However, the SIFT numeric score columns (10-12) contain "NA" in many rows, and the values in columns 2-4 of the CHECK_GENES.LOG file are very low. Of the 85675 missense mutations in 42554 genes that I ran through SIFT, only 15777 sites were identified as NONSYNONYMOUS, 4141 as DELETERIOUS, and 1160 as TOLERATED. I don't know why so many missense mutation sites cannot be identified.

Here are my files and command lines:
gene-annotation-src:
D.gtf.gz 6.27MB
D.pep.all.fa.gz 10.0MB
less -S D.gtf.gz

pseudo-chr0 EVM transcript 61149 61566 . - . transcript_id "D15542.t1"; gene_id "D15542"
pseudo-chr0 EVM exon 61149 61417 . - . transcript_id "D15542.t1"; gene_id "D15542";
pseudo-chr0 EVM exon 61485 61566 . - . transcript_id "D15542.t1"; gene_id "D15542";
pseudo-chr0 EVM CDS 61149 61417 . - 2 transcript_id "D15542.t1"; gene_id "D15542";
pseudo-chr0 EVM CDS 61485 61566 . - 0 transcript_id "D15542.t1"; gene_id "D15542";

less -S D.pep.all.fa.gz

>protein|D19569.t1 ID=D19569.t1|Parent=D19569|Name=
MSGPGYMDHVFKANANSNSNPPCESKKMDPATKIGATSTTNTCPLPSPKIFANSELLTTSTEAAVTIPVNTKTSDAKSCQSVTNYGRVTGNSTRSDSLESSSAPLKPHTGGDVRWDAINSVCSKDSPLGLSHFRLLKRLGYGDIGSVYLVELRGTNTYFAMKVMDRGS
>protein|D19568.t1 ID=D19568.t1|Parent=D19568|Name=
MLLQNWSFSGGLLSHSLLLHSRNPSNIISSALFPRRKAKAATTFLCLRLGIDEIAEIAHNKVLIAAVVSAAIGQLSKPFTSAILYGNKNNFDFRAAFQAGGFPSTHSSAVVATATSLGLERGFSDTIFGLAVVYAGLIMYDAQGVRREVGTHAKALNSVLLKNQLNSI
>protein|D19566.t1 ID=D19566.t1|Parent=D19566|Name=
MVSPLLLQIPINTTSNGVTSAKANTNYPLPSPPVTSQSRELEEAEVEADDEVVGNGVLNQNVLSEDEYIRIFPRGIGPKPTEFKCEASRETPVVIMNHINLVEILMDLVEILMDVGSK*

chr-src:
reference.fa.gz 305MB
less -S reference.fa.gz

>pseudo-chr0
TGGACTGACTGGACTGGGAAAATTTGTATTGTATTGGAGTATTGGATTGTTCTGGGAGTATTGACTTGTATTGTACATGTAAAGTATGAAAAGCTTGTAAAAGGACAAAACCTGGTCCCACATGGGGTGAAGTTTAAGTCGTCCTTTGAAAGATAGGATAGTCCTTAT.....
......
.....
.....
>pseudo-chr9
TTATATATTAATATATAATAATAAATAATGAGAAATAAAATACCAAAAATACTATAATTACCTAGAACACCTAAAAATACCTAAACTTAATCAAAACTTATTTTTATGGATTTTTAAGGATTTTCTAAAGAGTAAAAACTAATTAACACTAAAAATACCAAAAAAATA....
....
>scaffold808_size847463_ERROPOS700000
ATGATCTTCCAAAACTTGGAAAACCATCTTCAAGTGTTTGTGGACCATGTCAGCTAGGAAAGCAAATTAAGAGTCCTCACAAGAAGTCAAAATTCATTCACACCTCTCGTGTCTTAGAGTTGATTAATATGGATCTCATGGGTCCAATGAGAACAGAAAGCGTAGGGG.....
.....

Configuration File :
D.txt

GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=11
MITO_GENETIC_CODE_TABLENAME=Bacterial, Archaeal and Plant Plastid

PARENT_DIR=/loc/d01/user009/D/third_build_sift
ORG=D
ORG_VERSION=D_v1
DBSNP_VCF_FILE=

#Running SIFT 4G
SIFT4G_PATH=/loc/d01/user009/software/sift4g/bin/sift4g
PROTEIN_DB=/loc/d01/user009/D/uniref90.fasta
COMPUTER=GIS-KATNISS

#Sub-directories, don't need to change
GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
VDBSNP_DIR=dbSNP

#Doesn't need to change
FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord

perl make-SIFT-db-all.pl -config D.txt

sift4g -t 48 -d ./uniref90.fasta -q /loc/d01/user009/D/third_build_sift/all_prot.fasta --subst /loc/d01/user009/D/third_build_sift/subst --out /loc/d01/user009/D/third_build_sift/SIFT_predictions --sub-results

Use of uninitialized value $fasta_subseq in concatenation (.) or string at generate-fasta-subst-files-BIOPERL.pl line 446, <IN_TX> line 20913.
Argument "" isn't numeric in numeric eq (==) at generate-fasta-subst-files-BIOPERL.pl line 465, <IN_TX> line 21729.
Use of uninitialized value $mutated_aa in string eq at generate-fasta-subst-files-BIOPERL.pl line 894.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Argument "" isn't numeric in addition (+) at generate-fasta-subst-files-BIOPERL.pl line 866.
Use of uninitialized value $coord in concatenation (.) or string at generate-fasta-subst-files-BIOPERL.pl line 918.
Use of uninitialized value $exon_num in concatenation (.) or string at generate-fasta-subst-files-BIOPERL.pl line 918.
....

  • skipping protein [ D29247.t1 ]: substitution list has a position out of bounds (line: D252E, query length = 251) *
  • ....
  • skipping protein [ D44925.t1 ]: substitution list has a position out of bounds (line: I525I, query length = 524) *
  • skipping protein [ D44931.t1 ]: substitution list has a position out of bounds (line: Q67Q, query length = 66) *
  • processing queries: 100.00/100.00% *
    ** Aligning queries with candidate sequences **
  • processing database part 64 (size ~1.00 GB): 100.00/100.00% *

** Selecting alignments with median threshold: 2.75 **

  • processing queries: 100.00/100.00% *

** Generating SIFT predictions with sequence identity: 100.00% **

  • processing queries: 100.00/100.00% *
    .....
    All done!

CHECK_GENES.LOG file:
Chr           Genes with SIFT Scores   Pos with SIFT scores     Pos with Confident Scores
pseudo-chr0   10 (227/2287)            14 (637725/4718761)      2 (11664/637725)
pseudo-chr1   26 (1338/5214)           35 (4911882/13886195)    12 (595766/4911882)
pseudo-chr10  24 (664/2724)            34 (2309132/6785718)     17 (384198/2309132)
pseudo-chr11  24 (576/2441)            32 (1872042/5869154)     14 (266614/1872042)
.....

My directory:

4.0K ./SIFT_alignments
3.8G ./SIFT_predictions
1.2G ./subst
83G ./singleRecords
220K ./singleRecords_with_scores
306M ./chr-src
4.0K ./dbSNP
37M ./gene-annotation-src
220M ./fasta
2.3G ./D_v1
4.0K ./tmp
91G .

I would be thankful for all the help.
Best regards,
Enter
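
In all three "position out of bounds" messages visible in the log above, the rejected position is exactly one past the query length (D252E vs. 251, I525I vs. 524, Q67Q vs. 66). A minimal sketch for checking a substitution list against its protein sequence before running sift4g is given below. It assumes substitutions are written as reference residue, 1-based position, and new residue, as in the log; the function name and the demonstration sequence are made up for illustration and are not part of sift4g.

```python
import re
from typing import Iterable, List, Tuple

# Reference residue, 1-based position, substituted residue (e.g. "D252E").
SUBST_RE = re.compile(r"^([A-Za-z*])(\d+)([A-Za-z*])$")


def check_substitutions(substitutions: Iterable[str], sequence: str) -> List[Tuple[str, str]]:
    """Return (substitution, reason) pairs that cannot apply to sequence."""
    length = len(sequence)
    problems = []
    for raw in substitutions:
        sub = raw.strip()
        match = SUBST_RE.match(sub)
        if match is None:
            problems.append((sub, "unparseable"))
            continue
        ref, pos = match.group(1).upper(), int(match.group(2))
        if not 1 <= pos <= length:
            problems.append((sub, f"position {pos} outside 1..{length}"))
        elif sequence[pos - 1].upper() != ref:
            problems.append((sub, f"expected {ref} at {pos}, found {sequence[pos - 1]}"))
    return problems


# Made-up three-residue protein: the second and fourth entries are invalid.
demo_problems = check_substitutions(["D3E", "D4E", "K2K", "A1G"], "MKD")
```

Running a check like this over each protein and its substitution file would show whether the skipped proteins all share the same off-by-one pattern, which may help narrow down where the extra position comes from.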

Could not make on Linux (Ubuntu 18.04.2 LTS)

Hi,
I ran into a problem when running make, and I don't know how to fix it.
The error is "recipe for target 'obj/database_search.o' failed".
image
Could you please help me?
Best wishes,
Liu

Use of uninitialized value in creating SIFT Database

Hello,
I got the following warnings while building a database with sift4g. sift4g can produce predictions, but there are still many errors in the log file:

`Use of uninitialized value $aa in string eq at make-single-records-BIOPERL.pl line 274.
Use of uninitialized value $aa in concatenation (.) or string at make-single-records-BIOPERL.pl line 280.
Use of uninitialized value $aa in string eq at make-single-records-BIOPERL.pl line 274.
Use of uninitialized value $aa in concatenation (.) or string at make-single-records-BIOPERL.pl line 280.
Use of uninitialized value $orig_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $mutated_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $orig_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $mutated_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $orig_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $mutated_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $orig_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value $mutated_aa in string eq at make-single-records-BIOPERL.pl line 551.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 305.
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
Use of uninitialized value $mutated_aa in string eq at generate-fasta-subst-files-BIOPERL.pl line 894.
Use of uninitialized value $mutated_aa in string eq at generate-fasta-subst-files-BIOPERL.pl line 894.
Use of uninitialized value $mutated_aa in string eq at generate-fasta-subst-files-BIOPERL.pl line 894.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string ne at generate-fasta-subst-files-BIOPERL.pl line 626.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string ne at generate-fasta-subst-files-BIOPERL.pl line 626.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 621.
Use of uninitialized value $aa2 in string ne at generate-fasta-subst-files-BIOPERL.pl line 626.
done making the fasta sequences
start siftsharp, getting the alignments
/home/huangzr/sift4g/bin/sift4g -d /disk1/huangzr/nr/nr -q /vloume01/huangzr/database/sift/nr/hongyan/all_prot.fasta --subst /vloume01/huangzr/database/sift/nr/hongyan/subst --out /vloume01/huangzr/database/sift/nr/hongyan/SIFT_predictions --sub-results
** Checking query data and substitutions files **

  • processing queries: 0.01/100.00% *
  • processing queries: 0.01/100.00% *
    `

Any suggestions?
Thanks,
rey

Problem annotating a VCF

SIFT
Hello, I ran sift4g after creating the database, but the results contain output only for chr1, and all values in the chr1 output are NA.
1647733181(1)
How can I fix this?
Best wishes

[ERROR:src/chain.c:70]: invalid chain data

We are attempting to build a SIFT database (https://sift.bii.a-star.edu.sg/sift4g/SIFT4G_codes.html) using the sift4g algorithm on our Linux supercomputing cluster. However, we have hit the following difficult-to-parse error once it reaches what appears to be the sift4g portion of database building. Could you please advise us on the best way forward? Let us know what additional information you need.

done making the fasta sequences
start siftsharp, getting the alignments
/home/staff/chiroTester/tools/SIFT/sift4g/bin/sift4g -d /home/staff/chiroTester/ref_files/uniref90.fasta.gz -q ./test_files/homo_sapiens_small/all_prot.fasta --subst ./test_files/homo_sapiens_small/subst --out ./test_files/homo_sapiens_small/SIFT_predictions --sub-results
** Checking query data and substitutions files **

  • processing queries: 100.00/100.00% *

** Searching database for candidate sequences **
[ERROR:src/chain.c:70]: invalid chain data
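
The command above passes uniref90.fasta.gz to -d. Whether a compressed database is what triggers "invalid chain data" is not established in this thread; it is only an assumption that may be worth ruling out. A minimal sketch for testing whether a file is gzip-compressed, regardless of its extension, follows. The file names in the demonstration are made up.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(path: str) -> bool:
    """Return True if the file at path starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


# Demonstration on two throwaway files with identical FASTA content.
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.fasta")
    packed_path = os.path.join(tmp, "packed.fasta.gz")
    with open(plain_path, "w") as handle:
        handle.write(">seq1\nMKV\n")
    with gzip.open(packed_path, "wt") as handle:
        handle.write(">seq1\nMKV\n")
    plain_is_gz = looks_gzipped(plain_path)
    packed_is_gz = looks_gzipped(packed_path)
```

If the database does turn out to be compressed, decompressing it and pointing -d at the plain FASTA is the obvious experiment to try.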

Questions on test_data

Hi,
I have a question about the test data. I compiled sift4g on Ubuntu and ran it on the test data.
The command "./bin/sift4g -q ./test_files/query.fasta --subst ./test_files/ -d ./test_files/sample_protein_database.fa" always prints:

** Checking query data and substitutions files **
terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)

But the command "./bin/sift4g -q ./test_files/query.fasta -d ./test_files/sample_protein_database.fa" seems to work.
Could you help me?
I also talked with @pauline-ng, because I tried to build a database using SIFT4G_Create_Genomic_DB and the same error seems to occur there.

Best,
Nan

No result, no error

Hello,

I am trying to run SIFT4G with my own database. I have checked that the chromosome names in the VCF match those in the database, but I get no results at all, and no error either. Could you please give me any suggestions to fix it?
Below is an example of my VCF file.
1 103 . C T 45.17 PASS .
1 112 . C T 40.13 PASS .
1 405 . C T 209 PASS .
1 670 . G A 77 PASS .

This is what I got when I ran sift4G

Started Running .......
Running in Multitranscripts mode

Chromosome WithSIFT4GAnnotations WithoutSIFT4GAnnotations Progress
1 0 0 Completed : 1/1

Merging temp files....
SIFT4G Annotation completed !
Output directory:.

Sift4g usage question

Hi, many thanks for developing such useful software.

I have a question about how to use sift4g properly. Normally, we would build a SIFT4G database and annotate a VCF file to find the deleterious mutations within genes. However, we think this approach introduces reference bias, since it relies on a VCF file called against a reference.
So, in our case, we assembled 4 haplotype genomes for an autotetraploid and built a SIFT4G database for each haplotype genome. We then mark deleterious mutations based on the chr1_gz file in the name_sift_database directory, using the command zcat chr/$i/${i}_sift_database/chr${chr}_${hap}.gz |grep 'ref' |awk '$11<0.05 {print}' . Does this work for detecting deleterious mutations?

As a second question, we compared the deleterious mutations of syntenic genes across the 4 genomes using the method above. We have 23086 syntenic genes across the 4 genomes:

None of the 4 genes has a deleterious mutation: 9167
Only 1 of the 4 genes has a deleterious mutation: 1172
Only 2 of the 4 genes have a deleterious mutation: 1079
Only 3 of the 4 genes have a deleterious mutation: 1565
All 4 genes have a deleterious mutation: 10103

As you can see, the case where all 4 genes carry a deleterious mutation is the most common. We expected that, for an autotetraploid, the cases where 1, 2, or 3 of the 4 genes carry a deleterious mutation would dominate. Do you have any suggestions for comparing deleterious mutations within an autotetraploid?

Many thanks for your help.
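
The zcat/grep/awk filter in this post can be sketched in Python as follows. The reading of column 11 as the SIFT score and the 'ref' tag are taken from the poster's command, not verified against the database format, and the demonstration rows are made up. Unlike the awk one-liner, this version explicitly skips rows whose score is missing or non-numeric (such as NA) instead of relying on awk's comparison rules.

```python
import gzip
import io
from typing import Iterable, Iterator, List


def low_score_rows(lines: Iterable[str], score_col: int = 11,
                   threshold: float = 0.05, tag: str = "ref") -> Iterator[List[str]]:
    """Yield whitespace-split rows containing tag whose score is below threshold."""
    for line in lines:
        if tag not in line:  # same substring test as grep 'ref'
            continue
        fields = line.split()
        if len(fields) < score_col:
            continue  # no score column at all
        try:
            score = float(fields[score_col - 1])
        except ValueError:
            continue  # e.g. "NA"
        if score < threshold:
            yield fields


# Made-up rows; only the 11th column and the 'ref' tag matter here.
demo_rows = [
    "\t".join(["1", "100", "A", "T", "TX1", "G1", "GENE1", "CDS", "ref", "M/V", "0.01", "3.10", "25"]),
    "\t".join(["1", "200", "A", "T", "TX1", "G1", "GENE1", "CDS", "ref", "M/V", "NA", "NA", "NA"]),
    "\t".join(["1", "300", "A", "T", "TX1", "G1", "GENE1", "CDS", "ref", "M/V", "0.50", "3.10", "25"]),
    "\t".join(["1", "400", "A", "T", "TX1", "G1", "GENE1", "CDS", "novel", "M/V", "0.00", "3.10", "25"]),
]
demo_bytes = gzip.compress(("\n".join(demo_rows) + "\n").encode())
with gzip.open(io.BytesIO(demo_bytes), "rt") as handle:
    demo_hits = list(low_score_rows(handle))
```

Streaming the gzip file line by line keeps memory use flat, which matters for per-chromosome files of the size reported elsewhere on this page.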

GPU was not used in the analysis

Hi, I'm working on identifying deleterious mutations using sift4g in model and non-model amphibian species.
To reduce computation time, I'd like to use a GPU because the data set is huge. However, the GPU was not used even though we compiled sift4g with 'make gpu'.
How can I utilize my GPU in sift4g?

I'm using an NVIDIA GTX 1650 with CUDA 11.5, and sift4g was compiled with 'make gpu' after deleting https://github.com/mkorpar/swsharp/blob/master/swsharp/Makefile#L28-L46 from vendor/swsharp.

Problem creating genomic database for new organism

Dear SIFT 4G team

I followed the instructions from "https://github.com/pauline-ng/SIFT4G_Create_Genomic_DB" to construct the database for a new organism. In my case, I am constructing the database for Cicer arietinum. I am getting the following error while running the script "make-SIFT-db-all.pl" using the following command:

Command:
perl make-SIFT-db-all.pl -config test_files/cicer_arietinum_config.txt

Log:
perl make-SIFT-db-all.pl -config test_files/cicer_arietinum_config.txt
converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
cat: ./test_files/cicer_arietinum_genome/fasta/*.fasta: No such file or directory
/data/ngs/Programs_latest/SIFT4G_v2.0.0/bin/sift4g -d ./test_files/protein_db/uniref90.fasta -q ./test_files/cicer_arietinum_genome/all_prot.fasta --subst ./test_files/cicer_arietinum_genome/subst --out ./test_files/cicer_arietinum_genome/SIFT_predictions --sub-results
** Checking query data and substitutions files **

** EXITING! No valid queries to process. **

I also tried running the same script for the test human dataset provided with the package, but I am observing a different error:

Command:
perl make-SIFT-db-all.pl -config test_files/homo_sapiens-test.txt

Log:
converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
/data/ngs/Programs_latest/SIFT4G_v2.0.0/bin/sift4g -d ./test_files/protein_db/uniref90.fasta -q ./test_files/homo_sapiens_small/all_prot.fasta --subst ./test_files/homo_sapiens_small/subst --out ./test_files/homo_sapiens_small/SIFT_predictions --sub-results
** Checking query data and substitutions files **
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error

My config file for the Cicer arietinum looks like:
GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=2
MITO_GENETIC_CODE_TABLENAME=Vertebrate Mitochondrial

PARENT_DIR=./test_files/cicer_arietinum_genome
ORG=cicer_arietinum
ORG_VERSION=v1.0
DBSNP_VCF_FILE=

#Running SIFT 4G
SIFT4G_PATH=/data/ngs/Programs_latest/SIFT4G_v2.0.0/bin/sift4g
PROTEIN_DB=./test_files/protein_db/uniref90.fasta
COMPUTER=mrna

GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP

FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord

I also pulled the latest sift4g code, recompiled it, and reran the Perl script above, but observed the same set of errors.

I need help resolving this error and constructing the database for my organism.

I would be thankful for all the help.

Best regards
Aamir
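
Since the first log ends with "No valid queries to process" and an empty fasta directory, one thing that could be checked up front is whether the input directories named in the config actually exist and contain files. The sketch below parses a KEY=VALUE config like the one above and reports missing or empty input directories. It assumes the inputs live under PARENT_DIR/GENE_DOWNLOAD_DEST and PARENT_DIR/CHR_DOWNLOAD_DEST, as the directory listings elsewhere on this page suggest; the function names and the demo file name are made up and are not part of SIFT4G_Create_Genomic_DB.

```python
import os
import tempfile
from typing import Dict, List


def parse_config(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def missing_inputs(config: Dict[str, str]) -> List[str]:
    """Return the expected input directories that are absent or empty."""
    parent = config.get("PARENT_DIR", "")
    problems = []
    for key in ("GENE_DOWNLOAD_DEST", "CHR_DOWNLOAD_DEST"):
        path = os.path.join(parent, config.get(key, ""))
        if not os.path.isdir(path) or not os.listdir(path):
            problems.append(path)
    return problems


# Demonstration: the annotation directory has a file, the chromosome one is empty.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "gene-annotation-src"))
    os.makedirs(os.path.join(tmp, "chr-src"))
    with open(os.path.join(tmp, "gene-annotation-src", "example.gtf.gz"), "wb"):
        pass
    demo_config = parse_config(
        f"PARENT_DIR={tmp}\n#comment\n"
        "GENE_DOWNLOAD_DEST=gene-annotation-src\nCHR_DOWNLOAD_DEST=chr-src\n"
    )
    demo_missing = missing_inputs(demo_config)
    demo_expected = [os.path.join(tmp, "chr-src")]
```

Relative PARENT_DIR values such as ./test_files/... resolve against the current working directory, so the check should be run from the same directory as make-SIFT-db-all.pl.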

Not able to generate SIFT database

all_prot.zip
subst.tar.gz

Dear Robert,
I am unable to create a SIFT4G database. Below are my SIFT4G command and the screen output:

sift123@sift123-Precision-T5600:/mnt1/scripts_to_build_SIFT_db$ /mnt1/SIFT4G_2.0.0/bin/sift4g -d /mnt1/protein_db/uniprot90_Jul2016/uniref90.fasta -q /mnt1/SIFT_databases//goat/all_prot.fasta --subst /mnt1/SIFT_databases//goat/subst --out /mnt1/SIFT_databases//goat/SIFT_predictions --sub-results
** Checking query data and substitutions files **

  • processing queries: 100.00/100.00% *

** Searching database for candidate sequences **
Killedessing database part 1 (size ~0.25 GB): 97.50/100.00% *

The all_prot.fasta and subst.tar.gz files are attached for debugging purposes. Please let me know if you need any additional information.

Thank you,
-Nilesh

Can SIFT predict synonymous variants?

Hi,
I wonder whether SIFT can predict deleterious synonymous variants. I found some novel mutations that don't change the amino acid but have a SIFT score < 0.05. How can this happen, given that SIFT is an algorithm that predicts whether an amino acid substitution is deleterious to protein function?
Thanks very much!

Missing swsharp dependency?

Hello Robert,

I've tried to compile the sift4g code, and it seems to fail because swsharp headers are missing. Probably got lost somewhere along the way? ;)

[CP] src/sift_prediction.cpp
In file included from src/sift_prediction.cpp:16:
src/sift_scores.hpp:17:10: fatal error: swsharp/swsharp.h: No such file or directory
#include "swsharp/swsharp.h"
^~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:41: obj/sift_prediction.o] Error 1

Make error: src/swimd/Swimd.cpp:1:18: fatal error: cstdio: No such file or directory

>>> vendor/swsharp <<<
[CP] src/swimd/Swimd.cpp
src/swimd/Swimd.cpp:1:18: fatal error: cstdio: No such file or directory
 #include <cstdio>
                  ^
compilation terminated.
make[1]: *** [obj/swimd/Swimd.o] Error 1
make: *** [vendor/swsharp] Error 2

I get this error when I run make for sift4g.

I used git clone --recursive https://github.com/rvaser/sift4g.git sift4g to download sift4g.

My gcc and g++ are both version 4.9.3; the version info is:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/share/softwares/sift4g/gcc/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.9.3/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/share/softwares/sift4g/gcc/gcc-4.9.3 --exec-prefix=/share/softwares/sift4g/gcc/gcc --with-mpfr=/share/softwares/sift4g/gcc/mpfr --with-gmp=/share/softwares/sift4g/gcc/gmp --with-mpc=/share/softwares/sift4g/gcc/mpc --disable-multilib
Thread model: posix
gcc version 4.9.3 (GCC)
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/share/softwares/sift4g/gcc/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.9.3/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/share/softwares/sift4g/gcc/gcc-4.9.3 --exec-prefix=/share/softwares/sift4g/gcc/gcc --with-mpfr=/share/softwares/sift4g/gcc/mpfr --with-gmp=/share/softwares/sift4g/gcc/gmp --with-mpc=/share/softwares/sift4g/gcc/mpc --disable-multilib
Thread model: posix
gcc version 4.9.3 (GCC)

My system is CentOS 7.

sift4g hangs without an error report

Dear Robert,
I ran into a weird problem when using sift4g.
The program hangs when I run a command like
"home/sift4g/bin/sift4g -d PATH1/uniref90.fasta -q PATH2/all_prot.fasta --subst PATH3/subst --out PATH4/SIFT_predictions --sub-results -t 32" as part of the "make-SIFT-db" pipeline.
No CPU time is assigned to sift4g once the program reaches the "Selecting alignments" sub-step. It has been in this state for almost two days.

The log file and program state are as follows.
image
image

Please let me know if there is any solution for this issue. Thanks for your time.

Best wishes,
Huang Liang

Processing database part 975 then finishes, and nothing is generated

Hi rvaser,
Thank you for creating great software. I would like to use it to annotate an animal genome. I installed sift4g with conda, but when I run sift4g on the test files it does not generate anything, and I don't know why. This is the issue I see:
image

I also want to build a database for my species, so I ran make-SIFT-db-all.pl.
But it always stops at "processing database part 975" and does not proceed to the next step. What is happening? I have not been able to resolve this.

converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
/home/Wangpengfei/anaconda3/envs/sift4g/bin/sift4g -d /home/Wangpengfei/sift4g/SIFT4G_Create_Genomic_DB-master/nr -q /home/Wangpengfei/sift4g/SIFT4G_Create_Genomic_DB-master/test_files/bovine_ncbi/all_prot.fasta --subst /home/Wangpengfei/sift4g/SIFT4G_Create_Genomic_DB-master/test_files/bovine_ncbi/subst --out /home/Wangpengfei/sift4g/SIFT4G_Create_Genomic_DB
** Checking query data and substitutions files **

processing queries: 0.00/100.00% ^M processing queries: 0.01/100.00% ^M processing queries: 0.01/100.00% ^M processing queries: 0.01/100.00% ^M processing queries: 0.01/100.00% ^M processing queries: 0.02/100.00% ^M processing queries: 0.02/100.00% ^M processing queries: 0.02/100.00% ^M processing queries: 0.02/100.00% ^M processing querie..............................................................................** Searching database for candidate sequences **
processing database part 1 (size ~0.25 GB): 0.00/100.00% ^M processing database part 1 (size ~0.25 GB): 2.50/100.00%processing database part 975 (size ~0.25 GB): 92.50/100.00% ^M processing database part 975 (size ~0.25 GB): 95.00/100.00% ^M processing database part 975 (size ~0.25 GB): 97.50/100.00% ^M processing database part 975 (size ~0.25 GB): 100.00/100.00%

Questions about the protein database

Dear Robert,

I apologize in advance if this is not a good place to ask these questions.

I was wondering which protein database is most appropriate for annotating missense variants in a bacterium. I've run variant effect prediction on approx. 30k missense mutations in 4k genes. I've tried Swissprot and UniRef 90 as reference databases. The final stats are:

Swissprot: 20574 DELETERIOUS 8453 TOLERATED
UniRef 90: 13215 DELETERIOUS 15810 TOLERATED

I am somewhat unsure what the numbers in the sift4g output mean. They seem a bit different from classical SIFT?

P95S DELETERIOUS 0.00 3.74 7 8
A85T DELETERIOUS 0.03 3.74 7 8

From what I understand, the last one or two numbers are the number of reference sequences the prediction is based on. Is the number capped at 400, and how are the 400 selected? I've also seen quite a few predictions that are based on just 1 sequence - is that OK?

Thank you!
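
For working with output lines like the two quoted above, a small parser can make the columns explicit. The field names below are an assumption based on the poster's reading (substitution, prediction, score, a median value, then two sequence counts); they are labels chosen for illustration, not names confirmed by the sift4g documentation in this thread.

```python
from typing import NamedTuple


class SiftPrediction(NamedTuple):
    # All field names are hypothetical labels for the six columns shown above.
    substitution: str
    prediction: str
    score: float
    median: float
    seqs_at_position: int
    seqs_total: int


def parse_prediction(line: str) -> SiftPrediction:
    """Parse one six-column, whitespace-separated prediction line."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    return SiftPrediction(
        substitution=fields[0],
        prediction=fields[1],
        score=float(fields[2]),
        median=float(fields[3]),
        seqs_at_position=int(fields[4]),
        seqs_total=int(fields[5]),
    )


demo_prediction = parse_prediction("P95S DELETERIOUS 0.00 3.74 7 8")
```

With the rows parsed, it becomes straightforward to tally how many predictions rest on very few sequences, which is the concern raised in the question.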

#Chr17_scores.Srecords': No such file or directory

Hi,

I have tried many times to use SIFT4G to build my own SIFT database, but without success. Can you help me check it? The details are below:

** Searching database for candidate sequences **

  • processing database part 1 (size ~0.25 GB): 100.00/100.00% *

** Aligning queries with candidate sequences **

  • processing database part 1 (size ~1.00 GB): 100.00/100.00% *

** Selecting alignments with median threshold: 2.75 **

  • processing queries: 100.00/100.00% *

** Generating SIFT predictions with sequence identity: 100.00% **

  • processing queries: 100.00/100.00% *

done getting all the scores
populating databases
cat: /home/scripts_to_build_SIFT_db/test_files/singleRecords/Chr1.singleRecords: No such file or directory
can't open /home/scripts_to_build_SIFT_db/test_files/singleRecords/Chr1.singleRecords at map-scores-back-to-records.pl line 122.
Unable to read from /home/scripts_to_build_SIFT_db/test_files/singleRecords_with_scores/Chr1_scores.Srecords
cat: /home/scripts_to_build_SIFT_db/test_files/singleRecords/Chr1.singleRecords_noncoding.with_dbSNPid: No such file or directory
Traceback (most recent call last):
File "make_regions_file.py", line 68, in <module>
get_regions (chrom_file, out_file)
File "make_regions_file.py", line 31, in get_regions
pos = get_pos (first_line)
File "make_regions_file.py", line 8, in get_pos
return int (fields[0])
ValueError: invalid literal for int() with base 10: ''
rm: cannot remove '/home/scripts_to_build_SIFT_db/test_files/singleRecords_with_scores/Chr1_scores.Srecords': No such file or directory

Many thanks.

Error while make gpu

Hi, I am facing an error when compiling this code on GPU (make gpu).

>>> vendor/swsharp <<<
[CORE] swsharp
[CU] src/evalue.cu
nvcc fatal   : Unsupported gpu architecture 'compute_30'
Makefile:111: recipe for target 'obj/evalue.o' failed
make[2]: *** [obj/evalue.o] Error 1
Makefile:51: recipe for target 'swsharp' failed
make[1]: *** [swsharp] Error 2
Makefile:16: recipe for target 'vendor/swsharp' failed
make: *** [vendor/swsharp] Error 2

Could you please help me fix this error? Thank you so much.
My system info:

# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
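
CUDA 11 dropped support for the compute_30 architecture, which matches the nvcc message above. A minimal sketch for listing the architectures a Makefile requests, so that unsupported ones can be spotted before building, is shown below. The flag string in the demonstration is a made-up example of nvcc's -gencode syntax, not the actual content of the swsharp Makefile, and the minimum of 35 is an assumption about the CUDA 11.2 toolkit that should be confirmed against its release notes.

```python
import re
from typing import List

# Matches architecture names such as compute_30 or sm_50 and captures the number.
ARCH_RE = re.compile(r"\b(?:compute|sm)_(\d+)\b")


def unsupported_archs(makefile_text: str, minimum: int) -> List[int]:
    """Return the sorted architecture numbers in the text that are below minimum."""
    found = {int(number) for number in ARCH_RE.findall(makefile_text)}
    return sorted(number for number in found if number < minimum)


# Hypothetical flags for illustration only.
demo_flags = "-gencode arch=compute_30,code=sm_30 -gencode arch=compute_50,code=sm_50"
demo_unsupported = unsupported_archs(demo_flags, minimum=35)
```

Any architecture reported this way would need to be removed or replaced in the build flags for the installed toolkit to accept them.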

SIFT4G_Create_Genomic_DB

Hello,

I tried using the full path in PARENT_DIR, but it still reports that it can't access the gene-annotation-src file. Can you please help me?

I have attached my config file,

Thanks,
Princy
peas_config.txt

regex_error

Dear Robert
I ran into an error when running sift4g on the test data:
"./bin/sift4g -q ./test_files/query.fasta --subst ./test_files/ -d ./test_files/sample_protein_database.fa"
** Checking query data and substitutions files **
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
Aborted (core dumped).
My gcc is gcc-7.3.0, which meets the gcc 4.9+ requirement. Do you know why this happens?

When I construct the database using "perl make-SIFT-db-all.pl -config test_files/candidatus_carsonella_ruddii_pv_config.txt --ensembl_download"

it prints: Possible precedence issue with control flow operator at /usr/local/lib/perl5/site_perl/5.22.0/Bio/DB/IndexedBase.pm line 791.
done making single records template
making noncoding records file

start siftsharp, getting the alignments
/workdir/yw2326/have_check/SIFT/sift4g/bin/sift4g -d /workdir/yw2326/have_check/SIFT/PROTEIN_DB/uniref90.fasta -q /local/workdir/yw2326/have_check/SIFT/SIFT4G_Create_Genomic_DB/database_output/all_prot.fasta --subst /local/workdir/yw2326/have_check/SIFT/SIFT4G_Create_Genomic_DB/database_output/subst --out /local/workdir/yw2326/have_check/SIFT/SIFT4G_Create_Genomic_DB/database_output/SIFT_predictions --sub-results
** Checking query data and substitutions files **
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error

Make errors

Hi Robert,

I updated my gcc and g++ to "gcc (conda-forge gcc 13.2.0-2) 13.2.0" and changed "CP = g++" to "CP = ~/.conda/envs/biosoft/bin/g++". However, I got errors when running make. Could you help me fix them? Thanks!
Screenshot 2023-09-28 at 21 06 41
Screenshot 2023-09-28 at 21 06 55

VCF File input error

Hello,

I am looking to use SIFT on some E coli genomes that I've aligned but I keep getting the error:

Chromosome WithSIFT4GAnnotations WithoutSIFT4GAnnotations Progress
ERROR! Input VCF file should contain at least 8 columns. See line:
Chromosome,70289,.,C,A,100,0,*,TYPE=SUBSTITUTE

This is even though my vcf file has 8 columns (I added in a dummy column so there are actually 9, 8 of the original vcf file) and follows the layout of the example vcf file in the supplementary of the nature protocols paper. I get this error when using either the GUI or running it in terminal.

Here's a head of my vcf file:
#CHROM,POS,ID,REF,ALT,QUAL,Y,FILTER,INFO
Chromosome,70289,.,C,A,100,0,*,TYPE=SUBSTITUTE
Chromosome,366519,.,G,A,100,1,*,TYPE=SUBSTITUTE
Chromosome,503429,.,C,T,100,2,*,TYPE=SUBSTITUTE
Chromosome,705013,.,A,G,100,3,*,TYPE=SUBSTITUTE
Chromosome,1633629,.,T,C,100,4,*,TYPE=SUBSTITUTE
Chromosome,1633924,.,C,A,100,5,*,TYPE=SUBSTITUTE
Chromosome,1652331,.,T,C,100,6,*,TYPE=SUBSTITUTE
Chromosome,2173362,.,CCC,C,100,7,*,TYPE=DELETE
Chromosome,2731078,.,C,A,100,8,*,TYPE=SUBSTITUTE

Thanks for your help!
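
A note on the format, with the caveat that the commas in the listing may only be an artifact of how the file was pasted: the VCF specification requires tab-delimited fields in the fixed order CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO. If the file really is comma-separated, a parser that splits on tabs would see a single column per line, which would match the error. The sketch below converts one comma-separated row to a tab-separated one and can drop an extra column such as the dummy `Y` field. The names `split_fields`, `csv_row_to_vcf` and `kKeepAll` are hypothetical and not part of any SIFT tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sentinel meaning "do not drop any column".
const std::size_t kKeepAll = static_cast<std::size_t>(-1);

// Split `s` on `delim`, keeping empty fields (including a trailing one).
std::vector<std::string> split_fields(const std::string& s, char delim) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = s.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(s.substr(start));
            return fields;
        }
        fields.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// Hypothetical helper: convert one comma-separated row into a tab-separated
// row, optionally omitting the zero-based column `drop_column`.
std::string csv_row_to_vcf(const std::string& row,
                           std::size_t drop_column = kKeepAll) {
    const std::vector<std::string> fields = split_fields(row, ',');
    assert(drop_column == kKeepAll || drop_column < fields.size());
    std::string out;
    bool first = true;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i == drop_column) continue;   // skip the unwanted column
        if (!first) out += '\t';
        out += fields[i];
        first = false;
    }
    return out;
}
```

For the rows shown above, `csv_row_to_vcf(line, 6)` would remove the seventh column and leave the eight standard fields. This simple split assumes no field contains a comma of its own, which holds for the `TYPE=...` INFO values in this file but not for VCF files in general.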

DELETERIOUS (*WARNING! Low confidence)

When I use sift4g, some positions (114,661/145,218) are predicted to be DELETERIOUS (*WARNING! Low confidence). Can we consider these positions deleterious?
Why does this happen? Is it caused by how SIFT4G_Create_Genomic_DB produced the database? In my CHECK_GENES.LOG, the positions with confident scores are around 60% to 70%.
I used the taurine reference genome ARS_UCD1.2 and UniRef90 to create the SIFT4G genomic database, with ORG set to bovine. Is that right? The VCF file I want to annotate contains both taurine and indicine samples, but the ARS_UCD1.2 reference genome is taurine and ORG is set to bovine rather than taurine or indicine. Is that a problem? Do I need to consider reference genome bias?

Precision of output tables

Hi Robert,

The output .SIFTprediction tables contain scores rounded to 2 decimal places and it would be useful for my work to have more precision. I only occasionally use C++ but as far as I can tell this rounding is done just before writing the output table (e.g. the print_double functions at line 303 of sift_scores.cpp) with no option to change it.

Is there an important reason to round like this? or would I be able to simply change the precision when writing the output table and get more detailed scores?

Thanks,
Ally
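
For illustration only, this is how a configurable number of decimals is usually written with standard streams. It is a sketch under the assumption that the rounding happens purely at output time, as described above; `format_score` is a hypothetical name and not the actual function in `sift_scores.cpp`.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper, not the sift4g implementation: format a score with a
// caller-chosen number of digits after the decimal point.
std::string format_score(double value, int decimals) {
    assert(decimals >= 0);
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

With this helper, `format_score(0.123456, 2)` gives `"0.12"` and `format_score(0.123456, 4)` gives `"0.1235"`. Whether the extra digits are meaningful for SIFT scores is a separate question for the maintainers.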

Compile issue

Hello,

I'm getting an error when trying to compile sift4g on a Mac running Catalina. Any ideas what could be going on? See below the result of 'gcc -v' and 'make' below that

Thanks!

jah1$ gcc -v
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin


h80ad4bb7:sift4g jah1$ make

>>> vendor/swsharp <<<
[CORE] swsharp
[CP] src/nw_find_score_gpu.cu
[CC] src/thread.c
[CC] src/constants.c
[CP] src/ov_end_data_gpu.cu
[CC] src/reconstruct.c
[CC] src/ssw/ssw.c
[CC] src/sse_module.c
[CC] src/database.c
[CC] src/post_proc.c
[CP] src/score_database_gpu.cu
[CP] src/score_database_gpu_short.cu
[CP] src/nw_linear_data_gpu.cu
[CC] src/align.c
[CC] src/utils.c
[CC] src/chain.c
[CP] src/hw_end_data_gpu.cu
[CC] src/threadpool.c
[CC] src/alignment.c
[CC] src/db_alignment.c
[CP] src/sw_end_data_gpu.cu
[CP] src/evalue.cu
[CP] src/score_database_gpu_long.cu
[CP] src/cuda_utils.cu
[CP] src/ov_find_score_gpu.cu
[CP] src/swimd/Swimd.cpp
[CC] src/pre_proc.c
[CC] src/scorer.c
[CP] src/gpu_module.cu
[CC] src/cpu_module.c
[AR] ../lib/libswsharp.a
[CP] ../include/swsharp/align.h
[CP] ../include/swsharp/alignment.h
[CP] ../include/swsharp/chain.h
[CP] ../include/swsharp/constants.h
[CP] ../include/swsharp/cpu_module.h
[CP] ../include/swsharp/cuda_utils.h
[CP] ../include/swsharp/database.h
[CP] ../include/swsharp/db_alignment.h
[CP] ../include/swsharp/evalue.h
[CP] ../include/swsharp/gpu_module.h
[CP] ../include/swsharp/post_proc.h
[CP] ../include/swsharp/pre_proc.h
[CP] ../include/swsharp/reconstruct.h
[CP] ../include/swsharp/scorer.h
[CP] ../include/swsharp/swsharp.h
[CP] ../include/swsharp/thread.h
[CP] ../include/swsharp/threadpool.h
[CP] ../swsharpwin/swsharp/chain.h
[CP] ../swsharpwin/swsharp/error.h
[CP] ../swsharpwin/swsharp/utils.h
[CP] ../swsharpwin/swsharp/score_database_gpu_short.h
[CP] ../swsharpwin/swsharp/ssw/ssw.h
[CP] ../swsharpwin/swsharp/align.h
[CP] ../swsharpwin/swsharp/score_database_gpu_long.h
[CP] ../swsharpwin/swsharp/db_alignment.h
[CP] ../swsharpwin/swsharp/threadpool.h
[CP] ../swsharpwin/swsharp/alignment.h
[CP] ../swsharpwin/swsharp/swsharp.h
[CP] ../swsharpwin/swsharp/cpu_module.h
[CP] ../swsharpwin/swsharp/scorer.h
[CP] ../swsharpwin/swsharp/pre_proc.h
[CP] ../swsharpwin/swsharp/reconstruct.h
[CP] ../swsharpwin/swsharp/constants.h
[CP] ../swsharpwin/swsharp/thread.h
[CP] ../swsharpwin/swsharp/database.h
[CP] ../swsharpwin/swsharp/sse_module.h
[CP] ../swsharpwin/swsharp/cuda_utils.h
[CP] ../swsharpwin/swsharp/swimd/Swimd.h
[CP] ../swsharpwin/swsharp/evalue.h
[CP] ../swsharpwin/swsharp/post_proc.h
[CP] ../swsharpwin/swsharp/gpu_module.h
[CP] ../swsharpwin/swsharp/nw_find_score_gpu.cu
[CP] ../swsharpwin/swsharp/thread.c
[CP] ../swsharpwin/swsharp/constants.c
[CP] ../swsharpwin/swsharp/ov_end_data_gpu.cu
[CP] ../swsharpwin/swsharp/reconstruct.c
[CP] ../swsharpwin/swsharp/ssw/ssw.c
[CP] ../swsharpwin/swsharp/sse_module.c
[CP] ../swsharpwin/swsharp/database.c
[CP] ../swsharpwin/swsharp/post_proc.c
[CP] ../swsharpwin/swsharp/score_database_gpu.cu
[CP] ../swsharpwin/swsharp/score_database_gpu_short.cu
[CP] ../swsharpwin/swsharp/nw_linear_data_gpu.cu
[CP] ../swsharpwin/swsharp/align.c
[CP] ../swsharpwin/swsharp/utils.c
[CP] ../swsharpwin/swsharp/chain.c
[CP] ../swsharpwin/swsharp/hw_end_data_gpu.cu
[CP] ../swsharpwin/swsharp/threadpool.c
[CP] ../swsharpwin/swsharp/alignment.c
[CP] ../swsharpwin/swsharp/db_alignment.c
[CP] ../swsharpwin/swsharp/sw_end_data_gpu.cu
[CP] ../swsharpwin/swsharp/evalue.cu
[CP] ../swsharpwin/swsharp/score_database_gpu_long.cu
[CP] ../swsharpwin/swsharp/cuda_utils.cu
[CP] ../swsharpwin/swsharp/ov_find_score_gpu.cu
[CP] ../swsharpwin/swsharp/swimd/Swimd.cpp
[CP] ../swsharpwin/swsharp/pre_proc.c
[CP] ../swsharpwin/swsharp/scorer.c
[CP] ../swsharpwin/swsharp/gpu_module.cu
[CP] ../swsharpwin/swsharp/cpu_module.c
[MOD] swsharpn
[CC] src/main.c
[LD] swsharpn
[CP] ../bin/swsharpn
[CP] ../swsharpwin/swsharpn/main.c
[MOD] swsharpp
[CC] src/main.c
[LD] swsharpp
[CP] ../bin/swsharpp
[CP] ../swsharpwin/swsharpp/main.c
[MOD] swsharpnc
[CC] src/main.c
[LD] swsharpnc
[CP] ../bin/swsharpnc
[CP] ../swsharpwin/swsharpnc/main.c
[MOD] swsharpdb
[CC] src/main.c
[LD] swsharpdb
[CP] ../bin/swsharpdb
[CP] ../swsharpwin/swsharpdb/main.c
[MOD] swsharpout
[CC] src/main.c
[LD] swsharpout
[CP] ../bin/swsharpout
[CP] ../swsharpwin/swsharpout/main.c
>>> sift4g <<<
[CP] src/database_search.cpp
[CP] src/utils.cpp
[CP] src/sift_prediction.cpp
[CP] src/select_alignments.cpp
[CP] src/database_alignment.cpp
[CP] src/sift_scores.cpp
src/sift_scores.cpp:242:23: error: implicit instantiation of undefined template
'std::__1::basic_stringstream<char, std::__1::char_traits<char>,
std::__1::allocator<char> >'
std::stringstream stream;
^
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iosfwd:139:32: note:
template is declared here
class _LIBCPP_TEMPLATE_VIS basic_stringstream;
^
src/sift_scores.cpp:280:35: error: implicit instantiation of undefined template
'std::__1::basic_stringstream<char, std::__1::char_traits<char>,
std::__1::allocator<char> >'
std::stringstream ss(std::stringstream::in | std::stringstream::out);
^
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iosfwd:139:32: note:
template is declared here
class _LIBCPP_TEMPLATE_VIS basic_stringstream;
^
src/sift_scores.cpp:280:59: error: implicit instantiation of undefined template
'std::__1::basic_stringstream<char, std::__1::char_traits<char>,
std::__1::allocator<char> >'
std::stringstream ss(std::stringstream::in | std::stringstream::out);
^
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iosfwd:139:32: note:
template is declared here
class _LIBCPP_TEMPLATE_VIS basic_stringstream;
^
src/sift_scores.cpp:280:27: error: implicit instantiation of undefined template
'std::__1::basic_stringstream<char, std::__1::char_traits<char>,
std::__1::allocator<char> >'
std::stringstream ss(std::stringstream::in | std::stringstream::out);
^
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iosfwd:139:32: note:
template is declared here
class _LIBCPP_TEMPLATE_VIS basic_stringstream;
^
4 errors generated.
make[1]: *** [obj/sift_scores.o] Error 1
make: *** [sift4g] Error 2
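
The clang diagnostic "implicit instantiation of undefined template ... basic_stringstream" generally means that only the forward declaration from `<iosfwd>` is visible, in other words that `<sstream>` was not included in that translation unit. Other standard libraries may pull it in indirectly, which would explain why the same source builds elsewhere. Assuming that is what happens in `src/sift_scores.cpp`, adding `#include <sstream>` next to its other includes would be the usual fix. The function below is a hypothetical stand-alone example, not code from sift4g, that uses the same constructor as the failing line.

```cpp
#include <cassert>   // only needed for the optional self-check in the usage note
#include <sstream>   // defines std::stringstream; <iosfwd> alone only declares it
#include <string>
#include <vector>

// Hypothetical example: parse whitespace-separated numbers from one line.
std::vector<double> parse_doubles(const std::string& line) {
    std::stringstream ss(std::stringstream::in | std::stringstream::out);
    ss << line;
    std::vector<double> values;
    double v = 0.0;
    while (ss >> v) {
        values.push_back(v);
    }
    return values;
}
```

A quick check such as `assert(parse_doubles("0.5 1.25 3").size() == 3);` compiles with Apple clang once the header is present.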

Errors in creating SIFT Database for rheMac10

Hello,

I am continuing an issue here that began with this thread: pauline-ng/SIFT4G_Create_Genomic_DB#70

The current issue is that I run into a segmentation fault when running the sift4g command to generate SIFT_predictions.

This is the most recent output I get:

** Aligning queries with candidate sequences **

processing database part 1 (size ~1.00 GB): 0.00/100.00% *
processing database part 1 (size ~1.00 GB): 2.50/100.00% *
processing database part 1 (size ~1.00 GB): 5.00/100.00% *
processing database part 1 (size ~1.00 GB): 7.50/100.00% *
processing database part 1 (size ~1.00 GB): 10.00/100.00% *
processing database part 1 (size ~1.00 GB): 12.50/100.00% *
/cm/local/apps/slurm/var/spool/job391367/slurm_script: line 16: 18084 Segmentation fault sift4g -t 20 -q /home/npb0015/reinforcement_project/all_prot.fasta -d /home/npb0015/reinforcement_project/Reference_files/uniprot_sprot.fasta --subst /home/npb0015/reinforcement_project/subst --out /home/npb0015/reinforcement_project/SIFT_predictions/ --sub-results

This error comes from running the sift4g command given in the error, which is essentially the same as that suggested on October 16th in the original thread (I get the same error running that command exactly as is). Since a segmentation fault usually means an inability to access the memory required I've tried runs as high as 248GB RAM but still receive the same error, so I assume that is not the issue.

Here I will also provide a link to dropbox for the all_prot.fasta file that has been generated by earlier steps in the database creation.

https://www.dropbox.com/s/ggfgs7senvh3vaf/all_prot.fasta?dl=0

regex_error

Hi @rvaser and @pauline-ng ,
I'm installing sift4g using the command 'make' as described in the README file, and it built successfully with gcc-4.8.5.

>>> sift4g <<<
[CP] src/database_alignment.cpp
[CP] src/database_search.cpp
[CP] src/hash.cpp
[CP] src/main.cpp
[CP] src/select_alignments.cpp
[CP] src/sift_prediction.cpp
[CP] src/sift_scores.cpp
[CP] src/utils.cpp
[LD] ../bin/sift4g

However, the test command failed with:

./bin/sift4g -q ./test_files/query.fasta --subst ./test_files/ -d ./test_files/sample_protein_database.fa
** Checking query data and substitutions files **
terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)

According to answers I found via Google, this is related to the gcc version (although the README file suggests 4.8+). I also tried other gcc versions, but make failed with gcc-5.4 and gcc-6.2 as shown below:

>>> sift4g <<<
[CP] src/database_alignment.cpp
[CP] src/database_search.cpp
src/database_search.cpp: In function ‘int32_t longestIncreasingSubsequence(const std::vector<int>&)’:
src/database_search.cpp:264:40: error: ‘floor’ was not declared in this scope
             temp = floor((l + u) / 2.0f);
                                        ^
make[1]: *** [obj/database_search.o] Error 1
make: *** [sift4g] Error 2
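
The message "'floor' was not declared in this scope" typically means that `<cmath>` is not included in the file, and that older GCC headers happened to provide it indirectly. Assuming that is the case for `src/database_search.cpp`, adding `#include <cmath>` near its other includes should let the line compile. The sketch below shows the same midpoint computation with the header in place, plus an integer-only alternative. Both function names are hypothetical and not taken from sift4g.

```cpp
#include <cassert>
#include <cmath>     // declares std::floor
#include <cstdint>

// Midpoint written like the failing line, with the header it needs.
std::int32_t midpoint_floor(std::int32_t l, std::int32_t u) {
    return static_cast<std::int32_t>(std::floor((l + u) / 2.0f));
}

// Integer-only alternative that needs no <cmath>; valid for 0 <= l <= u.
std::int32_t midpoint_int(std::int32_t l, std::int32_t u) {
    assert(0 <= l && l <= u);
    return l + (u - l) / 2;
}
```

For non-negative bounds both functions return the same value, for example 5 for the pair (3, 8).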

I sincerely hope to get some advice from you. Thanks in advance.

Installation Error

Hi @rvaser,

I'm trying to make a SIFT database using @pauline-ng 's SIFT4G_Create_Genomic_DB (which requires sift4g). When I'm trying to install sift4g I'm getting a slew of error messages. Here is what I did to get the repository:

git clone --recursive https://github.com/rvaser/sift4g.git sift4g
cd sift4g/
make

I have GNU Make 4.1 and g++ (GCC) 4.9.3

Any ideas on what might be causing this issue and suggestions on how to fix it?

Thanks,
Randy
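
A possible reading of the assembler messages in this report, stated as an assumption: mnemonics such as `vpbroadcastb` and `vinserti128` are AVX2 instructions, and "no such instruction" coming from the assembler usually means the compiler is emitting AVX2 code while the installed binutils `as` is too old to assemble it. The hypothetical helper below (not part of sift4g or swsharp) reports which x86 SIMD level the compiler is targeting, based on predefined macros that flags such as `-mavx2` set.

```cpp
#include <cassert>   // only needed for the optional self-check in the usage note
#include <string>

// Hypothetical diagnostic: name the highest x86 SIMD level enabled for this
// translation unit, as indicated by the compiler's predefined macros.
std::string simd_level() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_1__)
    return "SSE4.1";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none";
#endif
}
```

Compiling this with the same flags the Makefile passes for `Swimd.cpp` and printing the result would show whether AVX2 code generation is switched on. A self-check like `assert(!simd_level().empty());` always holds, since every branch returns a label.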

Here is the error message:

>>> vendor/swsharp <<<
[CORE] swsharp
[CP] src/swimd/Swimd.cpp
/tmp/cclpPiBu.s: Assembler messages:
/tmp/cclpPiBu.s:185: Error: suffix or operands invalid for vbroadcastss'
/tmp/cclpPiBu.s:188: Error: suffix or operands invalid for vbroadcastss' /tmp/cclpPiBu.s:189: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:210: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:219: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:401: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:404: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:417: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:421: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:422: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:423: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:426: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:428: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:429: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:431: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:432: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:724: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:731: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:732: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:740: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:744: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:749: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:750: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:751: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:759: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:760: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:762: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:811: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:1012: Error: no such instruction: vpbroadcastb %xmm7,%xmm7'
/tmp/cclpPiBu.s:1013: Error: no such instruction: vpbroadcastb %xmm5,%xmm5' /tmp/cclpPiBu.s:1015: Error: no such instruction: vinserti128 $1,%xmm7,%ymm7,%ymm7'
/tmp/cclpPiBu.s:1017: Error: no such instruction: vinserti128 $1,%xmm5,%ymm5,%ymm5' /tmp/cclpPiBu.s:1105: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1109: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:1110: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1112: Error: suffix or operands invalid for vpmaxub' /tmp/cclpPiBu.s:1113: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1114: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:1115: Error: suffix or operands invalid for vpmaxub'
/tmp/cclpPiBu.s:1118: Error: suffix or operands invalid for vpmaxub' /tmp/cclpPiBu.s:1119: Error: suffix or operands invalid for vpmaxub'
/tmp/cclpPiBu.s:1120: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:1125: Error: suffix or operands invalid for vpmaxub'
/tmp/cclpPiBu.s:1251: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:1261: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:1268: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:1496: Error: no such instruction: vpbroadcastb %xmm0,%xmm0'
/tmp/cclpPiBu.s:1497: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:1580: Error: no such instruction: vpbroadcastb %xmm9,%xmm9'
/tmp/cclpPiBu.s:1581: Error: no such instruction: vpbroadcastb %xmm4,%xmm4' /tmp/cclpPiBu.s:1582: Error: no such instruction: vpbroadcastb %xmm5,%xmm5'
/tmp/cclpPiBu.s:1583: Error: no such instruction: vinserti128 $1,%xmm9,%ymm9,%ymm9' /tmp/cclpPiBu.s:1586: Error: no such instruction: vinserti128 $1,%xmm4,%ymm4,%ymm4'
/tmp/cclpPiBu.s:1588: Error: no such instruction: vinserti128 $1,%xmm5,%ymm5,%ymm5' /tmp/cclpPiBu.s:1652: Error: no such instruction: vpbroadcastb %xmm6,%xmm6'
/tmp/cclpPiBu.s:1653: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:1670: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1674: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:1675: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1676: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:1679: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:1681: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:1682: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:1684: Error: suffix or operands invalid for vpminsb' /tmp/cclpPiBu.s:1685: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:1686: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:1690: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:1695: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:1702: Error: suffix or operands invalid for vpminsb'
/tmp/cclpPiBu.s:1707: Error: suffix or operands invalid for vpminsb' /tmp/cclpPiBu.s:1754: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:1761: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:1762: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:1772: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:1778: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:1779: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:1781: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:1902: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:2029: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2166: Error: no such instruction: vpbroadcastb %xmm0,%xmm0' /tmp/cclpPiBu.s:2167: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0'
/tmp/cclpPiBu.s:2204: Error: no such instruction: vpbroadcastb %xmm5,%xmm5' /tmp/cclpPiBu.s:2205: Error: no such instruction: vpbroadcastb %xmm6,%xmm6'
/tmp/cclpPiBu.s:2207: Error: no such instruction: vinserti128 $1,%xmm5,%ymm5,%ymm5' /tmp/cclpPiBu.s:2208: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6'
/tmp/cclpPiBu.s:2225: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:2230: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2245: Error: no such instruction: vpbroadcastb %xmm4,%xmm4' /tmp/cclpPiBu.s:2246: Error: no such instruction: vinserti128 $1,%xmm4,%ymm4,%ymm4'
/tmp/cclpPiBu.s:2309: Error: no such instruction: vpbroadcastb %xmm7,%xmm7' /tmp/cclpPiBu.s:2310: Error: no such instruction: vinserti128 $1,%xmm7,%ymm7,%ymm7'
/tmp/cclpPiBu.s:2327: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:2332: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2334: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:2335: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2336: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:2337: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2338: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:2340: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2342: Error: suffix or operands invalid for vpminsb' /tmp/cclpPiBu.s:2343: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2347: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:2352: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2358: Error: suffix or operands invalid for vpminsb' /tmp/cclpPiBu.s:2363: Error: suffix or operands invalid for vpminsb'
/tmp/cclpPiBu.s:2410: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:2417: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:2418: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:2425: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:2431: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:2436: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2437: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:2438: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:2446: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:2447: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:2449: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:2690: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:2700: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:2825: Error: no such instruction: vpbroadcastb %xmm0,%xmm0'
/tmp/cclpPiBu.s:2829: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:2837: Error: no such instruction: vpbroadcastb %xmm0,%xmm0'
/tmp/cclpPiBu.s:2838: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:2875: Error: no such instruction: vpbroadcastb %xmm6,%xmm6'
/tmp/cclpPiBu.s:2876: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:2881: Error: no such instruction: vpbroadcastb %xmm7,%xmm7'
/tmp/cclpPiBu.s:2883: Error: no such instruction: vinserti128 $1,%xmm7,%ymm7,%ymm7' /tmp/cclpPiBu.s:2897: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2902: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:2911: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:2996: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:2999: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:3003: Error: no such instruction: vpbroadcastb %xmm3,%xmm3' /tmp/cclpPiBu.s:3004: Error: no such instruction: vinserti128 $1,%xmm3,%ymm3,%ymm3'
/tmp/cclpPiBu.s:3020: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:3024: Error: suffix or operands invalid for vpsubsb'
/tmp/cclpPiBu.s:3025: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:3027: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:3028: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:3029: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:3030: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:3033: Error: suffix or operands invalid for vpminsb'
/tmp/cclpPiBu.s:3034: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:3035: Error: suffix or operands invalid for vpmaxsb'
/tmp/cclpPiBu.s:3040: Error: suffix or operands invalid for vpmaxsb' /tmp/cclpPiBu.s:3049: Error: suffix or operands invalid for vpminsb'
/tmp/cclpPiBu.s:3054: Error: suffix or operands invalid for vpminsb' /tmp/cclpPiBu.s:3342: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:3349: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:3350: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:3357: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:3363: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:3368: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:3369: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:3370: Error: suffix or operands invalid for vpaddsb' /tmp/cclpPiBu.s:3378: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:3379: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:3381: Error: suffix or operands invalid for vpaddsb'
/tmp/cclpPiBu.s:3404: Error: suffix or operands invalid for vpsubsb' /tmp/cclpPiBu.s:3574: Error: no such instruction: vpbroadcastw %xmm9,%xmm9'
/tmp/cclpPiBu.s:3575: Error: no such instruction: vpbroadcastw %xmm6,%xmm6' /tmp/cclpPiBu.s:3579: Error: no such instruction: vinserti128 $1,%xmm9,%ymm9,%ymm9'
/tmp/cclpPiBu.s:3580: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:3846: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:3847: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:3850: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:3853: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:3854: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:3856: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:3857: Error: suffix or operands invalid for vpaddsw'
/tmp/cclpPiBu.s:3858: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:3860: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:3861: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:3866: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:4227: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:4237: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:4244: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:4510: Error: no such instruction: vpbroadcastw %xmm9,%xmm9'
/tmp/cclpPiBu.s:4513: Error: no such instruction: vinserti128 $1,%xmm9,%ymm9,%ymm9' /tmp/cclpPiBu.s:4518: Error: no such instruction: vpbroadcastw %xmm0,%xmm0'
/tmp/cclpPiBu.s:4519: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:4559: Error: no such instruction: vpbroadcastw %xmm4,%xmm4'
/tmp/cclpPiBu.s:4560: Error: no such instruction: vpbroadcastw %xmm5,%xmm5' /tmp/cclpPiBu.s:4563: Error: no such instruction: vinserti128 $1,%xmm4,%ymm4,%ymm4'
/tmp/cclpPiBu.s:4564: Error: no such instruction: vinserti128 $1,%xmm5,%ymm5,%ymm5' /tmp/cclpPiBu.s:4838: Error: no such instruction: vpbroadcastw %xmm6,%xmm6'
/tmp/cclpPiBu.s:4839: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:4854: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:4858: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:4859: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:4860: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:4863: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:4865: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:4866: Error: suffix or operands invalid for vpaddsw'
/tmp/cclpPiBu.s:4868: Error: suffix or operands invalid for vpminsw' /tmp/cclpPiBu.s:4869: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:4870: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:4874: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:4879: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:4885: Error: suffix or operands invalid for vpminsw'
/tmp/cclpPiBu.s:4890: Error: suffix or operands invalid for vpminsw' /tmp/cclpPiBu.s:5136: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5359: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:5366: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:5367: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:5377: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:5383: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:5384: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:5386: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:5396: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5528: Error: no such instruction: vpbroadcastw %xmm4,%xmm4' /tmp/cclpPiBu.s:5531: Error: no such instruction: vinserti128 $1,%xmm4,%ymm4,%ymm4'
/tmp/cclpPiBu.s:5536: Error: no such instruction: vpbroadcastw %xmm0,%xmm0' /tmp/cclpPiBu.s:5537: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0'
/tmp/cclpPiBu.s:5577: Error: no such instruction: vpbroadcastw %xmm6,%xmm6' /tmp/cclpPiBu.s:5578: Error: no such instruction: vpbroadcastw %xmm5,%xmm5'
/tmp/cclpPiBu.s:5580: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:5581: Error: no such instruction: vinserti128 $1,%xmm5,%ymm5,%ymm5'
/tmp/cclpPiBu.s:5602: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:5609: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:5839: Error: no such instruction: vpbroadcastw %xmm7,%xmm7' /tmp/cclpPiBu.s:5840: Error: no such instruction: vinserti128 $1,%xmm7,%ymm7,%ymm7'
/tmp/cclpPiBu.s:5855: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:5860: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:5862: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:5863: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:5864: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:5865: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5866: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:5868: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5870: Error: suffix or operands invalid for vpminsw' /tmp/cclpPiBu.s:5871: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5875: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:5880: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:5885: Error: suffix or operands invalid for vpminsw' /tmp/cclpPiBu.s:5890: Error: suffix or operands invalid for vpminsw'
/tmp/cclpPiBu.s:6358: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:6365: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:6366: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:6374: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:6378: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:6383: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:6384: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:6385: Error: suffix or operands invalid for vpaddsw'
/tmp/cclpPiBu.s:6393: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:6394: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:6396: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:6406: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:6419: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:6545: Error: no such instruction: vpbroadcastw %xmm0,%xmm0'
/tmp/cclpPiBu.s:6548: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:6554: Error: no such instruction: vpbroadcastw %xmm0,%xmm0'
/tmp/cclpPiBu.s:6555: Error: no such instruction: vinserti128 $1,%xmm0,%ymm0,%ymm0' /tmp/cclpPiBu.s:6595: Error: no such instruction: vpbroadcastw %xmm7,%xmm7'
/tmp/cclpPiBu.s:6596: Error: no such instruction: vpbroadcastw %xmm6,%xmm6' /tmp/cclpPiBu.s:6598: Error: no such instruction: vinserti128 $1,%xmm7,%ymm7,%ymm7'
/tmp/cclpPiBu.s:6599: Error: no such instruction: vinserti128 $1,%xmm6,%ymm6,%ymm6' /tmp/cclpPiBu.s:6620: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:6627: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:6636: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:6943: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:6946: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:6950: Error: no such instruction: vpbroadcastw %xmm3,%xmm3' /tmp/cclpPiBu.s:6951: Error: no such instruction: vinserti128 $1,%xmm3,%ymm3,%ymm3'
/tmp/cclpPiBu.s:6965: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:6969: Error: suffix or operands invalid for vpsubsw'
/tmp/cclpPiBu.s:6970: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:6972: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:6973: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:6974: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:6975: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:6978: Error: suffix or operands invalid for vpminsw'
/tmp/cclpPiBu.s:6979: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:6980: Error: suffix or operands invalid for vpmaxsw'
/tmp/cclpPiBu.s:6985: Error: suffix or operands invalid for vpmaxsw' /tmp/cclpPiBu.s:6993: Error: suffix or operands invalid for vpminsw'
/tmp/cclpPiBu.s:6998: Error: suffix or operands invalid for vpminsw' /tmp/cclpPiBu.s:7474: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:7481: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:7482: Error: suffix or operands invalid for vpaddsw'
/tmp/cclpPiBu.s:7490: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:7494: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:7499: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:7500: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:7501: Error: suffix or operands invalid for vpaddsw' /tmp/cclpPiBu.s:7509: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:7510: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:7512: Error: suffix or operands invalid for vpaddsw'
/tmp/cclpPiBu.s:7545: Error: suffix or operands invalid for vpsubsw' /tmp/cclpPiBu.s:7708: Error: suffix or operands invalid for vbroadcastss'
/tmp/cclpPiBu.s:7890: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:7894: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:7895: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:7896: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:7897: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:7898: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:7902: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:7903: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:7904: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:7905: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:7906: Error: suffix or operands invalid for vpminsd' /tmp/cclpPiBu.s:7911: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8152: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:8162: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:8169: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:8424: Error: suffix or operands invalid for vbroadcastss'
/tmp/cclpPiBu.s:8468: Error: suffix or operands invalid for vbroadcastss' /tmp/cclpPiBu.s:8469: Error: suffix or operands invalid for vbroadcastss'
/tmp/cclpPiBu.s:8668: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:8672: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:8673: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:8674: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8677: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:8679: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8680: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:8682: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8683: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:8687: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8691: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:8832: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:8975: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:8982: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:8983: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:8993: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:8999: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:9000: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:9002: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:9026: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:9198: Error: suffix or operands invalid for vbroadcastss' /tmp/cclpPiBu.s:9241: Error: suffix or operands invalid for vbroadcastss'
/tmp/cclpPiBu.s:9243: Error: suffix or operands invalid for vbroadcastss' /tmp/cclpPiBu.s:9245: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:9266: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:9425: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:9430: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:9432: Error: suffix or operands invalid for vpsubd'
/tmp/cclpPiBu.s:9433: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:9434: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:9435: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:9436: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:9438: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:9440: Error: suffix or operands invalid for vpmaxsd'
/tmp/cclpPiBu.s:9447: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:9729: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:9736: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:9737: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:9745: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:9749: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:9754: Error: suffix or operands invalid for vpsubd' /tmp/cclpPiBu.s:9755: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:9756: Error: suffix or operands invalid for vpaddd' /tmp/cclpPiBu.s:9764: Error: suffix or operands invalid for vpand'
/tmp/cclpPiBu.s:9765: Error: suffix or operands invalid for vpand' /tmp/cclpPiBu.s:9767: Error: suffix or operands invalid for vpaddd'
/tmp/cclpPiBu.s:9791: Error: suffix or operands invalid for vpmaxsd' /tmp/cclpPiBu.s:9814: Error: suffix or operands invalid for vpsubd'
Makefile:106: recipe for target 'obj/swimd/Swimd.o' failed
make[2]: *** [obj/swimd/Swimd.o] Error 1
Makefile:51: recipe for target 'swsharp' failed
make[1]: *** [swsharp] Error 2
Makefile:16: recipe for target 'vendor/swsharp' failed
make: *** [vendor/swsharp] Error 2`
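
A log this long is easier to triage once it is reduced to the distinct instructions the assembler rejected. The sketch below is a hypothetical helper, not part of sift4g or swsharp, and it assumes only the two message shapes visible in the log above. `vpbroadcastw` and `vinserti128` are AVX2 instructions, so a "no such instruction" message for them suggests (but does not prove) that the installed assembler predates AVX2 support.

```python
import re
from collections import Counter

# Matches the two message shapes seen in the log above:
#   "Error: suffix or operands invalid for vpand'"
#   "Error: no such instruction: vpbroadcastw %xmm0,%xmm0'"
# The optional quote handles logs where the opening quote survived.
PATTERN = re.compile(
    r"Error: (?:suffix or operands invalid for|no such instruction:)\s+[`']?(\w+)"
)


def tally_mnemonics(log_text: str) -> Counter:
    """Count how often each rejected instruction mnemonic appears."""
    return Counter(PATTERN.findall(log_text))


# Three entries copied from the log above (two of them share a line).
sample = (
    "/tmp/cclpPiBu.s:6384: Error: suffix or operands invalid for vpand' "
    "/tmp/cclpPiBu.s:6385: Error: suffix or operands invalid for vpaddsw'\n"
    "/tmp/cclpPiBu.s:6545: Error: no such instruction: vpbroadcastw %xmm0,%xmm0'\n"
)

counts = tally_mnemonics(sample)
for mnemonic, n in sorted(counts.items()):
    print(f"{mnemonic}: {n}")
```

Running it on the full build output (for example, saved with `make 2> build.log`) gives a short list of mnemonics to look up instead of hundreds of lines.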

missing sift4g prediction

Hi, Robert
With Pauline's help, sift4g now produces predictions, but the log file still contains many errors like these:
Use of uninitialized value $aa in string eq at make-single-records-BIOPERL.pl line 270
Use of uninitialized value $aa in concatenation (.) or string at make-single-records-BIOPERL.pl line 276
Use of uninitialized value $mutated_aa in string eq at make-single-records-BIOPERL.pl line 547.
Use of uninitialized value in concatenation (.) or string at make-single-records-BIOPERL.pl line 301.
Use of uninitialized value $aa2 in string eq at generate-fasta-subst-files-BIOPERL.pl line 615.
Use of uninitialized value $aa2 in string ne at generate-fasta-subst-files-BIOPERL.pl line 620.
Use of uninitialized value $mutated_aa in string eq at generate-fasta-subst-files-BIOPERL.pl line 888.

It seems sift4g breaks when it deals with a CDS that contains "NNNN". For example:
Picture1

Picture2

Maybe I should ignore this transcript. Is there any way to resolve these errors?

Thanks,
Lipeng
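
If ignoring such transcripts turns out to be acceptable, one way to do it is to filter them out before the database scripts see them. This is only a sketch under that assumption: `cds_has_ambiguous_bases` and `split_transcripts` are hypothetical names, not functions from sift4g or SIFT4G_Create_Genomic_DB, and the input is assumed to be a plain mapping from transcript ID to CDS string.

```python
def cds_has_ambiguous_bases(cds: str) -> bool:
    """Return True if the CDS contains anything other than A, C, G or T."""
    return not set(cds.upper()) <= set("ACGT")


def split_transcripts(cds_by_id: dict) -> tuple:
    """Separate transcripts with a clean CDS from those to be skipped.

    Returns (kept, skipped): kept maps transcript ID to CDS,
    skipped lists the IDs whose CDS contains ambiguous bases such as N.
    """
    kept, skipped = {}, []
    for transcript_id, cds in cds_by_id.items():
        if cds_has_ambiguous_bases(cds):
            skipped.append(transcript_id)
        else:
            kept[transcript_id] = cds
    return kept, skipped


# Made-up example records; "tx2" contains a run of N like the case above.
example = {"tx1": "ATGGCCTAA", "tx2": "ATGNNNNGCCTAA"}
kept, skipped = split_transcripts(example)
print("kept:", sorted(kept))
print("skipped:", skipped)
```

Logging the skipped IDs keeps a record of which transcripts have no prediction and why.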

access problem for gene-annotation-src

Hello,

I am trying to create a SIFT database for peas. I have downloaded my files, but I get an error saying it cannot access my gene-annotation-src directory.

Error:
entered mkdir ./newpeas
/Pisum_sativum_v1awpeas
converting gene format to use-able input
ls: cannot access '/gene-annotation-src': No such file or directory
Unable to open for reading
done converting gene format
/*.gz: No such file or directory
DNA files do not exist or did not unzip properly

Config file:

GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=1
MITO_GENETIC_CODE_TABLENAME=Plant Plastid Code

PARENT_DIR=./newpeas/peas
ORG=Pisum_sativum
ORG_VERSION=Pisum_sativum_v1a
DBSNP_VCF_FILE=SNPD4.vcf.gz

#Running SIFT 4G
SIFT4G_PATH=./sift4g/bin/sift4g
PROTEIN_DB=./scripts_to_build_SIFT_db/test_files/fastafiles/uniprot-pisum.fasta

# Sub-directories, don't need to change

GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP

# Doesn't need to change

FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord
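
The error output shows paths such as '/gene-annotation-src' with nothing in front of the slash, which may mean PARENT_DIR was empty or not resolved when the script ran. A quick way to check a config like the one above before starting the pipeline is sketched here. `parse_config` and `missing_dirs` are hypothetical helpers, not part of SIFT4G_Create_Genomic_DB, and the assumption that both source sub-directories must already exist under PARENT_DIR is mine.

```python
import os
import tempfile


def parse_config(text: str) -> dict:
    """Read KEY=VALUE lines, skipping blanks, # comments and other text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_dirs(settings, keys=("GENE_DOWNLOAD_DEST", "CHR_DOWNLOAD_DEST")):
    """List expected source directories that do not exist on disk."""
    parent = settings.get("PARENT_DIR")
    if not parent:
        return ["PARENT_DIR is not set"]
    parent = os.path.abspath(parent)  # relative paths depend on the cwd
    missing = []
    for key in keys:
        sub = settings.get(key)
        if not sub:
            missing.append(f"{key} is not set")
            continue
        path = os.path.join(parent, sub)
        if not os.path.isdir(path):
            missing.append(path)
    return missing


# Demo in a temporary directory: only gene-annotation-src is created.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "gene-annotation-src"))
    demo_config = (
        f"PARENT_DIR={tmp}\n"
        "# Sub-directories, don't need to change\n"
        "GENE_DOWNLOAD_DEST=gene-annotation-src\n"
        "CHR_DOWNLOAD_DEST=chr-src\n"
    )
    for problem in missing_dirs(parse_config(demo_config)):
        print("missing:", problem)
```

Because PARENT_DIR=./newpeas/peas is relative, the result depends on the directory the script is launched from; printing the absolute path makes that visible.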
