Giter VIP home page Giter VIP logo

Comments (5)

Dharmendra-G-1 avatar Dharmendra-G-1 commented on August 19, 2024 1

Hello Lauren,
Thanks for pointing to the bedtools Version: v2.27.1, after updating from Version: v2.25.0 to this newer version (v2.27.1) now I have the .fa and .bed output after running tigmint-cut.
Thanks for your help !!

from arcs.

lcoombe avatar lcoombe commented on August 19, 2024

Hi @Dharmendra-G-1 -
Could you please enclose copy-and-paste blocks in triple back ticks? Using that formatting will make your pasted commands much easier to read.
(https://help.github.com/articles/basic-writing-and-formatting-syntax/#quoting-code)

Also, can you include the standard output from your the tigmint command? It looks like something went wrong in the tigmint-cut command.

from arcs.

Dharmendra-G-1 avatar Dharmendra-G-1 commented on August 19, 2024

Hello Lauren,
Thanks for your response, Sorry for confusion with format: As per your request here are details:

I am using
draft.fasta and rename linked reads as reads.fa.gz

The main error I am getting is from tigmint-cut:


ubuntu@ip-10-68-102-172:/mid_large_2t/running_arcs$ python3 tigmint/bin/tigmint-cut -p 40 -w 1000 -n 20 -t 0 draft.fa draft.reads.as0.65.nm5.molecule.size2000.bed --fastaout draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa
Started at: 2019-01-06 18:04:53.506702
Reading contig lengths...
Finding breakpoints...
Cutting assembly at breakpoints...

Tool:    bedtools getfasta (aka fastaFromBed)
Version: v2.25.0
Summary: Extract DNA sequences into a fasta file based on feature coordinates.

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf> -fo <fasta> 

Options: 
	-fi	Input FASTA file
	-bed	BED/GFF/VCF file of ranges to extract from -fi
	-fo	Output file (can be FASTA or TAB-delimited)
	-name	Use the name field for the FASTA header
	-split	given BED12 fmt., extract and concatenate the sequencesfrom the BED "blocks" (e.g., exons)
	-tab	Write output in TAB delimited format.
		- Default is FASTA format.

	-s	Force strandedness. If the feature occupies the antisense,
		strand, the sequence will be reverse complemented.
		- By default, strand information is ignored.

	-fullHeader	Use full fasta header.
		- By default, only the word before the first space or tab is used.

DONE!
Ended at: 2019-01-06 18:16:46.167144

The initial command I ran are as follows:

./arcs-master/Examples/arcs-make arcs-tigmint draft=draft reads=reads m=40

which ran this 
./arcs-master/Examples/arcs-make arcs-tigmint -n draft=draft reads=reads m=40
ln -s draft.fasta draft.fa
./tigmint/bin/tigmint tigmint draft=draft reads=reads minsize=2000 as=0.65 nm=5 dist=50000 mapq=0 trim=0 span=20 window=1000 t=40 
touch empty.fof
perl -ne 'chomp; if(/>/){$ct+=1; print ">$ct\n";}else{print "$_\n";} ' < draft.tigmint.fa > draft.tigmint.renamed.fa 
bwa index draft.tigmint.renamed.fa 
sh -c 'bwa mem -t40 -C -p draft.tigmint.renamed.fa reads.fq.gz | /mid_large_2t/important/samtools-1.6/samtools view -Sb - | /mid_large_2t/important/samtools-1.6/samtools sort -@40 -n - -o draft.tigmint.sorted.bam' 
echo draft.tigmint.sorted.bam > draft.tigmint_bamfiles.fof
./arcs-master/Arcs/arcs --bx -v -f draft.tigmint.renamed.fa -a draft.tigmint_bamfiles.fof -c 5 -m 40 -s 98 -r 0.05 -e 30000 -z 500 -d 0 --gap 100 -b draft.tigmint_c5_m40_s98_r0.05_e30000_z500
python /mid_large_2t/running_arcs/arcs-master/Examples/makeTSVfile.py draft.tigmint_c5_m40_s98_r0.05_e30000_z500_original.gv draft.tigmint_c5_m40_s98_r0.05_e30000_z500.tigpair_checkpoint.tsv draft.tigmint.renamed.fa
ln -s draft.tigmint_c5_m40_s98_r0.05_e30000_z500.tigpair_checkpoint.tsv draft.tigmint_c5_m40_s98_r0.05_e30000_z500_l5_a0.3.tigpair_checkpoint.tsv
/opt/links_v1.8.6/LINKS -f draft.tigmint.renamed.fa -s empty.fof -b draft.tigmint_c5_m40_s98_r0.05_e30000_z500_l5_a0.3 -l 5 -a 0.3 -z 500 
rm draft.tigmint_c5_m40_s98_r0.05_e30000_z500_l5_a0.3.tigpair_checkpoint.tsv
I believe the Tigmint tigmint-make will run as follows from the starting part of arcs-tigment
./tigmint tigmint draft=draft reads=reads minsize=2000 as=0.65 nm=5 dist=50000 mapq=0 trim=0 span=20 window=1000 t=40
./tigmint/bin/tigmint tigmint -n draft=draft reads=reads minsize=2000 as=0.65 nm=5 dist=50000 mapq=0 trim=0 span=20 window=1000 t=40
bwa index draft.fa
bwa mem -t40 -pC draft.fa reads.fq.gz | /mid_large_2t/important/samtools-1.6/samtools view -u -F4 | /mid_large_2t/important/samtools-1.6/samtools sort -@40 -tBX -m 2G -T tmp/ -o draft.reads.sortbx.bam
tigmint/bin/tigmint-molecule -a0.65 -n5 -q0 -d50000 -s2000 draft.reads.sortbx.bam | sort -k1,1 -k2,2n -k3,3n >draft.reads.as0.65.nm5.molecule.size2000.bed
/mid_large_2t/important/samtools-1.6/samtools faidx draft.fa
tigmint/bin/tigmint-cut -p40 -w1000 -n20 -t0 -o draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa draft.fa draft.reads.as0.65.nm5.molecule.size2000.bed
ln -sf draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa draft.tigmint.fa

The tail end of error message I got is as follows:

[M::mem_pestat] mean and std.dev: (242.76, 127.77)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 849)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 2619274 reads in 1927.504 CPU sec, 50.692 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t40 -pC draft.fa reads.fq.gz
[main] Real time: 52106.558 sec; CPU: 1897517.768 sec
[bam_sort_core] merging from 440 files and 40 in-memory blocks...
tigmint/bin/tigmint-molecule -a0.65 -n5 -q0 -d50000 -s2000 draft.reads.sortbx.bam | sort -k1,1 -k2,2n -k3,3n >draft.reads.as0.65.nm5.molecule.size2000.bed
/mid_large_2t/important/samtools-1.6/samtools faidx draft.fa
tigmint/bin/tigmint-cut -p40 -w1000 -n20 -t0 -o draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa draft.fa draft.reads.as0.65.nm5.molecule.size2000.bed
Started at: 2019-01-05 19:06:03.814077
Reading contig lengths...
Finding breakpoints...

Tool:    bedtools getfasta (aka fastaFromBed)
Version: v2.25.0
Summary: Extract DNA sequences into a fasta file based on feature coordinates.

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf> -fo <fasta> 

Options: 
	-fi	Input FASTA file
	-bed	BED/GFF/VCF file of ranges to extract from -fi
	-fo	Output file (can be FASTA or TAB-delimited)
	-name	Use the name field for the FASTA header
	-split	given BED12 fmt., extract and concatenate the sequencesfrom the BED "blocks" (e.g., exons)
	-tab	Write output in TAB delimited format.
		- Default is FASTA format.

	-s	Force strandedness. If the feature occupies the antisense,
		strand, the sequence will be reverse complemented.
		- By default, strand information is ignored.

	-fullHeader	Use full fasta header.
		- By default, only the word before the first space or tab is used.

Cutting assembly at breakpoints...
DONE!
Ended at: 2019-01-05 19:17:49.508013
ln -sf draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa draft.tigmint.fa
make[1]: Leaving directory '/mid_large_2t/running_arcs'
touch empty.fof
perl -ne 'chomp; if(/>/){$ct+=1; print ">$ct\n";}else{print "$_\n";} ' < draft.tigmint.fa > draft.tigmint.renamed.fa 
bwa index draft.tigmint.renamed.fa 
[bwa_index] Pack FASTA... [bns_fasta2bntseq] Failed to allocate 0 bytes at bntseq.c line 303: Success
arcs-master/Examples/arcs-make:157: recipe for target 'draft.tigmint.renamed.fa.bwt' failed
make: *** [draft.tigmint.renamed.fa.bwt] Error 1

the output file I am getting are as follows:

wxrwxrwx  1 ubuntu ubuntu           11 Jan  4 19:03 draft.fa -> draft.fasta
-rw-rw-r--  1 ubuntu ubuntu       406757 Jan  4 19:40 draft.fa.amb
-rw-rw-r--  1 ubuntu ubuntu      1411478 Jan  4 19:40 draft.fa.ann
-rw-rw-r--  1 ubuntu ubuntu   2494337664 Jan  4 19:40 draft.fa.bwt
-rw-rw-r--  1 ubuntu ubuntu       440254 Jan  5 19:06 draft.fa.fai
-rw-rw-r--  1 ubuntu ubuntu    623584396 Jan  4 19:40 draft.fa.pac
-rw-rw-r--  1 ubuntu ubuntu   1247168840 Jan  4 19:51 draft.fa.sa
-rw-rw-r--  1 ubuntu ubuntu   2526643067 Jan  3 04:49 draft.fasta
-rw-rw-r--  1 ubuntu ubuntu    697296771 Jan  7 03:20 draft.reads.as0.65.nm5.molecule.size2000.bed
-rw-rw-r--  1 ubuntu ubuntu            0 Jan  7 04:00 draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa
-rw-rw-r--  1 ubuntu ubuntu       342869 Jan  7 04:00 draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa.bed
-rw-rw-r--  1 ubuntu ubuntu 261106244339 Jan  5 12:02 draft.reads.sortbx.bam
lrwxrwxrwx  1 ubuntu ubuntu           77 Jan  5 19:17 draft.tigmint.fa -> draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa
-rw-rw-r--  1 ubuntu ubuntu            0 Jan  5 19:17 draft.tigmint.renamed.fa
-rw-rw-r--  1 ubuntu ubuntu            0 Jan  5 19:17 draft.tigmint.renamed.fa.pac
-rw-rw-r--  1 ubuntu ubuntu            0 Jan  5 19:17 empty.fof

So after the tigmint-molecule I get the bed file " draft.reads.as0.65.nm5.molecule.size2000.bed" which has 16813449 lines but when it ran the tigmint-cut we get .bed file but no .fa file. draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa file get created but completely empty, so no problem with permission as such. The bed file output are as follows :


head draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa.bed
0	0	15517657	0-1
0	15517657	15517670	0-2
0	15517670	17147778	0-3
0	17147778	17147785	0-4
0	17147785	23376074	0-5
0	23376074	23376100	0-6
0	23376100	29696924	0-7
0	29696924	29697326	0-8
0	29697326	29697665	0-9
0	29697665	29697943	0-10

tail draft.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa.bed
204375	0	33106	204375
204415	0	57665	204415
204459	0	25333	204459
204499	0	64113	204499
204541	0	58322	204541
204581	0	3400	204581
204621	0	53185	204621
204661	0	20990	204661
204701	0	60883	204701
204741	0	113995	204741

Thanks

from arcs.

lcoombe avatar lcoombe commented on August 19, 2024

Thanks @Dharmendra-G-1 -
Please try using a newer version of bedtools. The issue is that the more recent versions (I am using 2.27.1) output the fasta to standard out by default, whereas the older version you are using requires an output fasta file specified by -fo. You can see that the help page is being printed in standard out, indicating that the bedtools getfasta command isn't running properly as a result.

Take a look at the different help pages:

Tool:    bedtools getfasta (aka fastaFromBed)
Version: v2.25.0
Summary: Extract DNA sequences into a fasta file based on feature coordinates.

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf> -fo <fasta> 

Options: 
	-fi	Input FASTA file
	-bed	BED/GFF/VCF file of ranges to extract from -fi
	-fo	Output file (can be FASTA or TAB-delimited)
	-name	Use the name field for the FASTA header
	-split	given BED12 fmt., extract and concatenate the sequencesfrom the BED "blocks" (e.g., exons)
	-tab	Write output in TAB delimited format.
		- Default is FASTA format.

	-s	Force strandedness. If the feature occupies the antisense,
		strand, the sequence will be reverse complemented.
		- By default, strand information is ignored.

	-fullHeader	Use full fasta header.
		- By default, only the word before the first space or tab is used.

Tool:    bedtools getfasta (aka fastaFromBed)
Version: v2.27.1
Summary: Extract DNA sequences from a fasta file based on feature coordinates.

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>

Options: 
	-fi	Input FASTA file
	-fo	Output file (opt., default is STDOUT
	-bed	BED/GFF/VCF file of ranges to extract from -fi
	-name	Use the name field for the FASTA header
	-name+	Use the name field and coordinates for the FASTA header
	-split	given BED12 fmt., extract and concatenate the sequences
		from the BED "blocks" (e.g., exons)
	-tab	Write output in TAB delimited format.
		- Default is FASTA format.

	-s	Force strandedness. If the feature occupies the antisense,
		strand, the sequence will be reverse complemented.
		- By default, strand information is ignored.

	-fullHeader	Use full fasta header.
		- By default, only the word before the first space or tab 
		is used.

Hope that helps!
Lauren

from arcs.

lcoombe avatar lcoombe commented on August 19, 2024

Excellent - I'm glad you got it working!

from arcs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.