valet's Issues

Feature request: gz as input, bam as input

I have been looking at VALET and think it is a nice piece of code for checking a metagenome assembly. For me, at least, it would help if VALET could:

  • accept already-mapped data as input (BAM files)
  • accept gzipped input
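
As a rough illustration of the gzipped-input request, here is a minimal sketch of opening a reads file that may or may not be compressed. The helper name `open_maybe_gzip` is hypothetical and is not part of VALET; the sketch assumes Python 3 and detects gzip by its two magic bytes rather than by file extension.

```python
import gzip


def open_maybe_gzip(path, mode="rt"):
    """Open a plain or gzip-compressed file for text reading.

    Hypothetical helper, not part of VALET. The file is treated as gzip
    when it starts with the gzip magic bytes 0x1f 0x8b.
    """
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, mode)
    return open(path, mode)
```

Code that reads line by line could call this helper in place of `open()` without further changes. External tools that receive the file path directly would still need their own handling.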

wrong repo name in readme

should be git clone https://github.com/jgluck/VALET.git not git clone https://github.com/jgluck/VALET.git

generate summary table fails

I am validating a metagenomic assembly with a single FASTA file of reads and keep getting this error. The contigs are numbered sequentially in the final assembly, e.g.:

>951_contig1
>951_contig2
...
>951_contig10682

Assembly here

The pipeline keeps failing when trying to generate the summary table. Full command and STDOUT are copied below. I am running (on a cluster):

Python 2.7.9
bowtie2 2.2.4
samtools 1.2
numpy 1.9.3
bedtools 2.24

$ python ~/Software/cmhill_VALET/src/py/valet.py -a 951.final.renamed.headers.fasta -r ../02-pear/951pear.assembled.fasta
###########################################################################
PROCESSING ASSEMBLY: asm_0 (951.final.renamed.headers.fasta)
###########################################################################
COMMAND:     reapr facheck 951.final.renamed.headers.fasta output/asm_0/assembly_facheck
RESULTS:     output/asm_0/assembly_facheck
---------------------------------------------------------------------------
STEP:    FILTERING ASSEMBLY CONTIGS LESS THAN 1000 BPs
RESULTS:     output/asm_0/filtered_assembly.fasta
---------------------------------------------------------------------------
STEP:    ALIGNING READS
COMMAND:     bowtie2-build /panfs/roc/groups/1/bonddr/badalame/CoDL/DDH951/14-VALET/output/asm_0/filtered_assembly.fasta /panfs/roc/groups/1/bonddr/badalame/CoDL/DDH951/14-VALET/output/asm_0/indexes/temp_2ucgDE
COMMAND:     bowtie2 -a -x /panfs/roc/groups/1/bonddr/badalame/CoDL/DDH951/14-VALET/output/asm_0/indexes/temp_2ucgDE -q -U ../02-pear/951pear.assembled.fasta --very-sensitive -a --reorder -p 8 --un /panfs/roc/groups/1/bonddr/badalame/CoDL/DDH951/14-VALET/output/asm_0/unaligned_reads/unaligned.reads -S output/asm_0/sam/library.sam
---------------------------------------------------------------------------
STEP:    RUNNING SAMTOOLS
COMMAND:     samtools view -F 0x100 -bS output/asm_0/sam/library.sam
COMMAND:     samtools sort output/asm_0/bam/library.bam output/asm_0/bam/sorted_library
COMMAND:     samtools mpileup -C50 -A -f output/asm_0/filtered_assembly.fasta output/asm_0/bam/sorted_library.bam
RESULTS:     output/asm_0/coverage/mpileup_output.out
COMMAND:     samtools index output/asm_0/bam/sorted_library.bam
---------------------------------------------------------------------------
STEP:    CALCULATING CONTIG COVERAGE
RESULTS:     output/asm_0/coverage/temp.cvg
---------------------------------------------------------------------------
STEP:    PARTITIONING COVERAGE FILE
COMMAND:     /home/bonddr/badalame/Software/cmhill_VALET/src/py/split_pileup.py -p output/asm_0/coverage/mpileup_output.out -c 8
---------------------------------------------------------------------------
STEP:    DEPTH OF COVERAGE
COMMAND:     /home/bonddr/badalame/Software/cmhill_VALET/src/py/depth_of_coverage.py -m output/asm_0/coverage/mpileup_output.out -w 501 -o output/asm_0/coverage/errors_cov.bed -g -e -c 8
COMMAND:     bedtools sort -i output/asm_0/coverage/errors_cov.bed
Error: The requested file (output/asm_0/coverage/errors_cov.bed) could not be opened. Error message: (No such file or directory). Exiting!
RESULTS:     output/asm_0/coverage.bed
---------------------------------------------------------------------------
STEP:    BREAKPOINT
COMMAND:     /home/bonddr/badalame/Software/cmhill_VALET/src/py/breakpoint_splitter.py -u /panfs/roc/groups/1/bonddr/badalame/CoDL/DDH951/14-VALET/output/asm_0/unaligned_reads/ -o output/asm_0/breakpoint/split_reads/
COMMAND:     /home/bonddr/badalame/Software/cmhill_VALET/src/py/breakpoint_finder.py -a output/asm_0/filtered_assembly.fasta -r output/asm_0/breakpoint/split_reads/ -b 50 -o output/asm_0/breakpoint/ -c output/asm_0/coverage/temp.cvg -p 8
COMMAND:     bedtools sort -i output/asm_0/breakpoint/interesting_bins.bed
Error: The requested file (output/asm_0/breakpoint/interesting_bins.bed) could not be opened. Error message: (No such file or directory). Exiting!
RESULTS:     output/asm_0/breakpoint/../breakpoints.bed
---------------------------------------------------------------------------
STEP:    SUMMARY
RESULTS:     output/asm_0/summary.bed
RESULTS:     output/asm_0/suspicious.bed
Traceback (most recent call last):
  File "/home/bonddr/badalame/Software/cmhill_VALET/src/py/valet.py", line 1169, in <module>
    main()
  File "/home/bonddr/badalame/Software/cmhill_VALET/src/py/valet.py", line 239, in main
    contig_lengths, contig_abundances, final_misassemblies)
  File "/home/bonddr/badalame/Software/cmhill_VALET/src/py/valet.py", line 1008, in generate_summary_table
    table_file.write(contig + '\t' + str(filtered_contig_lengths[contig]) + '\t' + str(contig_abundances[contig]) + '\t' + \
KeyError: '951_contig6066'
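
The traceback shows that `contig_abundances` has no entry for a contig that is present in `filtered_contig_lengths`. The log does not say why; note that the earlier `bedtools sort` steps had already failed on missing files. As a sketch only, and not the project's actual fix, a summary writer could tolerate missing abundances instead of raising. The function name `summary_rows` and the `"NA"` placeholder are assumptions made for illustration, and the row is simplified to three columns.

```python
def summary_rows(filtered_contig_lengths, contig_abundances, missing="NA"):
    """Yield one tab-separated row per contig.

    Hypothetical helper: contigs without an abundance entry get the
    `missing` placeholder instead of triggering a KeyError.
    """
    for contig in sorted(filtered_contig_lengths):
        length = filtered_contig_lengths[contig]
        # dict.get returns the placeholder when the contig is absent.
        abundance = contig_abundances.get(contig, missing)
        yield "\t".join([contig, str(length), str(abundance)])
```

This hides the symptom rather than explaining it; listing the contigs that fall back to the placeholder would help track down the upstream step that skipped them.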

Release?

Hi,

I would like to create a Bioconda recipe for your tool. It would be easier if there were a release of the tool. Would it be possible for you to create one?

Thanks a lot,

Bérénice

parameter -q for reads as fastq bug

The -q parameter is meant to indicate that the reads are in FASTQ format, with FASTA as the default. I believe VALET actually treats the reads as FASTQ by default, and likely requires reads in FASTQ format. Using the -q parameter causes an error in the breakpoint finder. Proposed solution: remove the -q parameter from the options and require a FASTQ file as input.
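
An alternative to relying on a flag would be to inspect the reads file itself. The sketch below is an assumption about how that could look, not VALET code: the hypothetical `sniff_read_format` checks the first non-empty line, which starts with `>` in FASTA and `@` in FASTQ.

```python
def sniff_read_format(path):
    """Return "fasta" or "fastq" based on the first non-empty line.

    Hypothetical helper, not part of VALET.
    """
    with open(path, "r") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            break  # first record line matches neither format
    raise ValueError("Cannot determine read format of %s" % path)
```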

Assembly correction

Hello,

Are you continuing to develop VALET? I haven't seen any publication for it yet, only the thesis.
Do you think it is possible to use VALET to automatically correct an assembly? This would mostly mean splitting an assembly at breakpoints, and maybe at the edges of low-density regions.
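
To make the splitting idea concrete, here is a minimal sketch of cutting one contig at a list of breakpoint positions. It is an illustration only: the function `split_contig`, the `_partN` naming scheme, and the use of 0-based cut positions are all assumptions, and VALET is not known to provide such a feature.

```python
def split_contig(name, sequence, breakpoints, min_length=1):
    """Split `sequence` at 0-based cut positions.

    Hypothetical helper. Returns a list of (piece_name, piece_sequence)
    tuples; positions outside the contig are ignored, and pieces shorter
    than `min_length` are dropped.
    """
    cuts = sorted({p for p in breakpoints if 0 < p < len(sequence)})
    bounds = [0] + cuts + [len(sequence)]
    pieces = []
    for index, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
        if end - start >= min_length:
            pieces.append(("%s_part%d" % (name, index), sequence[start:end]))
    return pieces
```

In practice the cut positions would have to be derived from the flagged intervals, for example their midpoints, and that choice would need validation.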

Did you do a systematic review of the misassemblies detected by quast and VALET?
I would like to contribute.

Kind regards
Silas Kieser

Metagenome assembly?

Hi

Can VALET be used to evaluate metagenome assemblies i.e. just a contigs file (without binning)?

Regards,
Aditya

SyntaxError: invalid syntax

After installing the program according to the instructions provided here, I tried a run with the test data and got the following error:

src/py/valet.py --skip-reapr -a test/c_rudii_reference.fna,test/c_rudii_dup.fna,test/c_rudii_relocation.fna,test/c_rudii_reloc_dup.fna -1 test/lib1.1.fastq -2 test/lib1.2.fastq --assembly-names reference,duplication,relocation,reloc-dup
  File "src/py/valet.py", line 854
    with open(bin_path + '/reapr/03.score.errors.gff', 'r') as reapr_gff, \
                                                                        ^
SyntaxError: invalid syntax

I am not sure why this is related to REAPR, since I added "--skip-reapr" (I am not able to install REAPR on my machine).

Do you have any idea what the problem is here?

Thank you in advance
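
One possible explanation, not confirmed in this thread: a `with` statement that lists several context managers separated by commas is only valid from Python 2.7 and 3.1 onward, so an older interpreter would reject line 854 at the comma. A SyntaxError is raised when the file is compiled, before any option is parsed, which would explain why `--skip-reapr` makes no difference. The sketch below prints the interpreter version and shows the nested form that older interpreters accept; `copy_lines` is a hypothetical example function, not VALET code.

```python
import sys


def copy_lines(source_path, destination_path):
    """Copy a text file line by line using nested `with` statements.

    Equivalent to `with open(a) as x, open(b, "w") as y:` but also
    accepted by interpreters that lack the multi-manager form.
    """
    with open(source_path, "r") as source:
        with open(destination_path, "w") as destination:
            for line in source:
                destination.write(line)


# Report which interpreter is actually running the script.
print("Running Python %d.%d" % sys.version_info[:2])
```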

VALET with Long reads

Hi @bebatut,

Mine is a general question. Is VALET suited for long-read data such as PacBio or Nanopore? I would like a tool that works like REAPR but for long reads.

Thanks
Best,
F
