Comments (7)
@chrisgulvik ENA stores two sets of FASTQ files: the original submitted ones and the official fastq ones. I've discovered that many records do not have the submitted ones. If you choose fastq (which I think should be the default because of this problem), you will get all the records.
from enabrowsertools.
Not finding files for a requested format isn't an error. Hence the informational (stdout) messages, the absence of stderr output, and the exit code.
Not all formats of submitted files are converted into FASTQ, which is why this isn't the default. All files should have submitted files though. If these are not showing it might be due to forking off the SRA format from the submitted files, which I still need to add into the scripts. I'll make this a priority.
from enabrowsertools.
Ah, so the "submitted" file format, which is the default, actually just means None in this case, but not in all cases. I thought it meant it would fetch whatever file format(s) were submitted under the provided accession, but I suppose that's for something like --format all. That might be an even safer default than FASTQ. I was testing a BioProject that I created, so I know they were single interleaved FASTQ files submitted per isolate (to SRA, not ENA).
I see here (Line 96) that because the variable is set to None by default, it's just empty space printed in the stdout, which explains the message "No files of format submitted for SRR5125719" rather than "No files of format None submitted for SRR5125719". Perhaps a more descriptive message could print something like "No files were downloaded due to weird issues with --format submitted <Default> not being able to detect which data files are available to fetch." until there's a way to fetch --format all.
enaBrowserTools/python/readGet.py
Line 96 in c0911de
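To illustrate the message problem described above, here is a minimal sketch of one way to avoid a blank or "None" format in the output. This is not the actual readGet.py code: the function name no_files_message and both message templates are hypothetical, chosen only for illustration.

```python
def no_files_message(accession, output_format=None):
    """Build a 'no files' message (hypothetical helper, not from readGet.py)."""
    if output_format is None:
        # State explicitly that the default was used, instead of printing
        # an empty string or the literal text "None" in the message.
        return ("No files found for {0} using the default format (submitted); "
                "try --format fastq".format(accession))
    return "No files of format {0} for {1}".format(output_format, accession)


print(no_files_message("SRR5125719"))
print(no_files_message("SRR5125719", "fastq"))
```

The point is only that the unset case gets its own branch, so the message never depends on how None happens to be rendered.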
from enabrowsertools.
I'll have a look into this. It is supposed to have a default format set:
    if output_format is None:
        output_format = utils.SUBMITTED_FORMAT
Note that the name is different to what is currently in the live code as we needed to rename the variable to not use a keyword.
Submitted format does mean the format submitted by users, but this is only in the case of files submitted to ENA. All data submitted to NCBI and DDBJ are fetched from NCBI in the SRA format. We then make FASTQ versions of each of these. This is why you will see no file in the "submitted" option.
I'm currently working on an update that will handle the COMPLETEGENOMICS_NATIVE submission format (whole directory download). I also plan to include better handling of the default format, an "all" option, and the SRA format files in this update. I'm aiming to get this next release out by the end of next week.
from enabrowsertools.
Terrific, thanks for the update!
from enabrowsertools.
Included in the v1.4 release. When no format is given, it will look for files in the following order: submitted, sra, fastq. Note that in this case, for your examples, it will download the NCBI SRA format.
COMPLETEGENOMICS_NATIVE submission format and "all" option have been held over for the next release as I thought this change was deserving of going out without waiting for those changes.
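The lookup order mentioned for v1.4 can be sketched roughly as follows. This is an assumption-laden illustration rather than the released code: FORMAT_PRIORITY, pick_format, and the available_formats input are hypothetical names, and how the tool actually discovers which formats exist for an accession is not shown.

```python
# Order described in the comment above: submitted, then sra, then fastq.
FORMAT_PRIORITY = ["submitted", "sra", "fastq"]


def pick_format(available_formats, requested=None):
    """Return the format to download, or None if nothing suitable exists.

    available_formats is a set of format names that have files for the
    accession (hypothetical input; the real lookup is not modelled here).
    """
    if requested is not None:
        # An explicit request is honoured only if files exist in that format.
        return requested if requested in available_formats else None
    # No explicit request: fall through the priority list in order.
    for fmt in FORMAT_PRIORITY:
        if fmt in available_formats:
            return fmt
    return None


# A run submitted to NCBI has no "submitted" files, so the SRA format is chosen.
print(pick_format({"sra", "fastq"}))
```

With this ordering, a record that only has SRA and FASTQ files resolves to sra, which matches the behaviour described for the examples in this thread.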
from enabrowsertools.