
Comments (7)

tseemann avatar tseemann commented on July 30, 2024

@chrisgulvik ENA stores two sets of FASTQ files: the original submitted ones and the official fastq ones. I've discovered that many records do not HAVE the submitted ones. If you choose fastq (which I think should be the default because of this problem), you will get all the records.

from enabrowsertools.

chrisgulvik avatar chrisgulvik commented on July 30, 2024


nicsilvester avatar nicsilvester commented on July 30, 2024

Not finding files for a requested format isn't an error. Hence the information (stdout) messages and no stderr, plus the exit code.
Not all formats of submitted files are converted into FASTQ, which is why this isn't the default. All files should have submitted files though. If these are not showing it might be due to forking off the SRA format from the submitted files, which I still need to add into the scripts. I'll make this a priority.


chrisgulvik avatar chrisgulvik commented on July 30, 2024

Ah, so the "submitted" file format, which is default, actually just means None in this case, but not all cases. I thought it meant it would fetch whatever file format(s) were submitted under the provided accession, but I suppose that's for something like --format all. That might be an even safer default than FastQ. I was testing a BioProject that I created, so I know they were single interleaved FastQ files submitted per isolate (to SRA not ENA).

I see here (Line 96) that because the var is set to None by default, it's just empty space printed to stdout, which explains the message "No files of format submitted for SRR5125719" rather than "No files of format None submitted for SRR5125719". Perhaps a more descriptive message could print something like "No files were downloaded due to weird issues with --format submitted <Default> not being able to detect which data files are available to fetch." until there's a way to fetch --format all.

print 'No files of format ' + format + ' for ' + data_accession


nicsilvester avatar nicsilvester commented on July 30, 2024

I'll have a look into this. It is supposed to have a default format set:

if output_format is None:
    output_format = utils.SUBMITTED_FORMAT

Note that the name is different to what is currently in the live code as we needed to rename the variable to not use a keyword.
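For readers following along, here is a minimal sketch of the defaulting pattern quoted above. It is not the actual enaBrowserTools code: the module-level SUBMITTED_FORMAT constant stands in for utils.SUBMITTED_FORMAT, and the resolve_format and no_files_message helpers are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for utils.SUBMITTED_FORMAT (assumed value).
SUBMITTED_FORMAT = "submitted"


def resolve_format(output_format=None):
    """Return the requested format, falling back to the submitted format."""
    if output_format is None:
        output_format = SUBMITTED_FORMAT
    return output_format


def no_files_message(output_format, data_accession):
    """Build the informational message shown when no files are found."""
    return "No files of format {0} for {1}".format(
        resolve_format(output_format), data_accession
    )
```

With the default applied before the message is built, the format name in the message is always a real format string rather than None.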

Submitted format does mean the format submitted by users, but this is only in the case of files submitted to ENA. All data submitted to NCBI and DDBJ are fetched from NCBI in the SRA format. We then make FASTQ versions of each of these. This is why you will see no file in the "submitted" option.

I'm currently working on an update that will handle the COMPLETEGENOMICS_NATIVE submission format (whole directory download). I also plan to include better handling of the default format, an "all" option, and the SRA format files in this update. I'm aiming to get this next release out by the end of next week.


chrisgulvik avatar chrisgulvik commented on July 30, 2024

Terrific, thanks for the update!


nicsilvester avatar nicsilvester commented on July 30, 2024

Included in v1.4 release. Will look in the following order: submitted, sra, fastq. Note that in this case, for your examples, it will download the NCBI SRA format.

COMPLETEGENOMICS_NATIVE submission format and "all" option have been held over for the next release as I thought this change was deserving of going out without waiting for those changes.
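The lookup order described above (submitted, sra, fastq) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the v1.4 implementation: pick_format, FORMAT_ORDER, and the available_formats argument are hypothetical, and the real tool determines which files exist by querying ENA.

```python
# Order named in the comment above; the tuple itself is an assumption.
FORMAT_ORDER = ("submitted", "sra", "fastq")


def pick_format(available_formats, requested=None):
    """Return the format to download, or None if nothing matches.

    available_formats: set of format names that have files for a record
    (hypothetical input for this sketch).
    requested: an explicitly requested format, or None to use the default order.
    """
    if requested is not None:
        # An explicit request is honoured only if files exist in that format.
        return requested if requested in available_formats else None
    # No explicit request: take the first format in the order that has files.
    for fmt in FORMAT_ORDER:
        if fmt in available_formats:
            return fmt
    return None
```

For a record submitted to NCBI, which has SRA and FASTQ files but no submitted files, pick_format({"sra", "fastq"}) selects "sra", matching the behaviour described for the examples in this thread.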

