ena-ftp-downloader's Introduction

/*******************************************************************************
 * Copyright 2021 EMBL-EBI, Hinxton outstation
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 * http://www.apache.org/licenses/LICENSE-2.0
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 ******************************************************************************/

version:unspecified

Ena File Downloader

Copyright © EMBL 2021 | EMBL-EBI is part of the European Molecular Biology Laboratory

For support/issues, please contact us at https://www.ebi.ac.uk/ena/browser/support

There are two ways to run the tool:

  1. Interactive: Use this the first time you run the tool to navigate through the available options and create a simple script that can be invoked directly later:

     java -jar ena-file-downloader.jar

OR call one of the convenience scripts provided.

Linux/Unix: ./run.sh
You may need to make the script executable as follows: chmod +x run.sh

On Windows: run.bat

Download the latest version from

https://ftp.ebi.ac.uk/pub/databases/ena/tools/ena-file-downloader.zip

Getting Started

Summary

A user-runnable tool for downloading data files from ENA idempotently and resiliently.

Application logs are written to logs\app.log

Build

Install JDK 8 and use the Gradle wrapper to build the project:

./gradlew build

The project jar file will be available at build/libs/.

Run:

There are two ways to run the tool:

  1. Run the jar file from the console without providing inputs:

     java -jar ena-file-downloader.jar

The user will be prompted to provide inputs (e.g. accessions/query, format, location, protocol, asperaLocation, email) and will finally be offered the options below:

  • To start downloading right now, and also create a script that can be invoked directly, please enter 1
  • To create a script that can be invoked directly (e.g. by a pipeline or a script), enter 2

If the user selects 1, then a script file will be created with the provided arguments, that can be invoked directly, and download will also be started. If the user selects 2, then only the script file will be created.

  2. Run the jar file from the console, providing the inputs as arguments. With accessions:

     java -jar ena-file-downloader.jar --accessions=SAMEA3231268,SAMEA3231287 --format=READS_FASTQ --location=C:\Users\Documents\ena --protocol=FTP --asperaLocation=null [email protected]

     OR with a query:

     java -jar ena-file-downloader.jar --query=result=read_run&query=country=%22Japan%22AND%20depth=168 --format=READS_FASTQ --location="C:\Users\Documents\ena ebi" --protocol=FTP --asperaLocation=null [email protected]

  3. Run the jar file from the console to download files from a data hub:

     java -jar ena-file-downloader.jar --accessions=SAMEA3231268,SAMEA3231287 --format=READS_FASTQ --location=C:\Users\Documents\ena --protocol=FTP --asperaLocation=null [email protected] --dataHubUsername=dcc_abc --dataHubPassword=*****

The arguments are:

  • --query The search query for the download. It contains the result and the query string (e.g. result=read_run&query=country=%22Japan%22AND%20depth=168).
  • --accessions A comma-separated list of accessions, or the path to a file containing the accession list. The file should be plain text in TSV (tab-separated values) format. If there is more than one column, the first column must contain the accessions. A header row is optional and will be ignored. Values may be enclosed in double quotes.
  • --format The format for the download (e.g. READS_FASTQ, READS_SUBMITTED, READS_BAM, ANALYSIS_SUBMITTED, ANALYSIS_GENERATED).
  • --location The location for the download.
  • --protocol The protocol to be used for the download (e.g. FTP, ASPERA). Default is FTP.
  • --asperaLocation The location of the local Aspera Connect/CLI folder. Required if the protocol is ASPERA.
  • --email The email address at which to receive the alert (optional).
  • --dataHubUsername Data hub username. Required only if you want to download data from a data hub (dcc).
  • --dataHubPassword Data hub password. Required only if you want to download data from a data hub (dcc).
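
The values for --query and --accessions can also be prepared programmatically. The following is a minimal Python sketch, not part of the tool: the helper names build_query and write_accession_list are hypothetical, and it assumes that the query condition is percent-encoded in the same way as the example above and that a single-column, header-less TSV file is an acceptable accession list.

```python
import csv
from typing import Iterable
from urllib.parse import quote


def build_query(result: str, condition: str) -> str:
    """Return a --query value of the form result=<result>&query=<encoded condition>.

    Hypothetical helper: '=' is left readable, as in the README example,
    while double quotes and spaces are percent-encoded.
    """
    return f"result={result}&query={quote(condition, safe='=')}"


def write_accession_list(path: str, accessions: Iterable[str]) -> None:
    """Write one accession per row as tab-separated text, without a header row.

    Hypothetical helper: with a single column, each row is just the accession.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for accession in accessions:
            writer.writerow([accession])


if __name__ == "__main__":
    # Reproduces the query string used in the README example.
    print(build_query("read_run", 'country="Japan"AND depth=168'))
```

Calling write_accession_list("accessions.tsv", ["SAMEA3231268", "SAMEA3231287"]) would produce a file whose path can then be passed to --accessions instead of the comma-separated list.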

Please enclose an input in double quotes if it contains spaces. For example:

java -jar ena-file-downloader.jar --accessions=SAMEA3231268,SAMEA3231287 --format=READS_FASTQ --location="C:\Users\Documents\ena ebi" --protocol=FTP --asperaLocation=null [email protected]
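
When the tool is invoked from a pipeline rather than typed into a console, the quoting question can be avoided by passing the arguments as a list. The sketch below is an illustration only: build_command and run_downloader are hypothetical helper names, the option names are the ones listed above, and it assumes that java is on the PATH and that the jar is in the working directory.

```python
import subprocess
from typing import Dict, List


def build_command(jar: str, options: Dict[str, str]) -> List[str]:
    """Return the argument list for: java -jar <jar> --name=value ..."""
    return ["java", "-jar", jar] + [
        f"--{name}={value}" for name, value in options.items()
    ]


def run_downloader(jar: str, options: Dict[str, str]) -> int:
    """Run the downloader and return its exit code (needs Java and the jar)."""
    # A list is passed and no shell is involved, so a value containing
    # spaces reaches the tool as a single argument without extra quotes.
    return subprocess.run(build_command(jar, options)).returncode


if __name__ == "__main__":
    command = build_command(
        "ena-file-downloader.jar",
        {
            "accessions": "SAMEA3231268,SAMEA3231287",
            "format": "READS_FASTQ",
            "location": r"C:\Users\Documents\ena ebi",
            "protocol": "FTP",
        },
    )
    print(command)
```

Only build_command is exercised here; run_downloader would start an actual download, so it is left for the reader to call.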

Privacy Notice

The execution of this tool may require limited processing of your personal data to function. By using this tool you are agreeing to this as outlined in our Privacy Notice: https://www.ebi.ac.uk/data-protection/privacy-notice/ena-presentation and Terms of Use: https://www.ebi.ac.uk/about/terms-of-use.

This software is authored by EMBL-EBI and distributed as is. License: https://www.apache.org/licenses/LICENSE-2.0

ena-ftp-downloader's Issues

downloads fail

Hi!

I've often used the ENA-FTP-Downloader and just wanted to download a new data set today. However, all downloads fail. I also tried the example from the README file of this GitHub-Repo:

java -jar ena-file-downloader.jar --accessions=SAMEA3231268,SAMEA3231287 --format=READS_FASTQ --location=fastq --protocol=FTP

The output log:

[...] 
(Copyright © EMBL 2022)
 version:1.1.5

Welcome to the ENA File Downloader utility!
----------------------------------------------
Provided parameters:
--accessions=SAMEA3231268,SAMEA3231287
--format=READS_FASTQ
--location=fastq
--protocol=FTP
16:07:49.867 [main ] INFO  - Starting download for format: READS_FASTQ at download location:fastq,protocol:FTP, asperaLoc:null, emailId:null, data hub:null
16:07:49.871 [main ] INFO  - Total 2 sample_accession READS_FASTQ records found
Getting file details from ENA Portal API 100% [===================================================================================================================================================================] 1/1 (0:00:00 / 0:00:00) 
16:07:50.638 [main ] INFO  - Downloading 4 files in total

Starting set 2 with 2 files.

Starting set 1 with 2 files.
Failed to download ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR949/ERR949836/ERR949836_1.fastq.gz
Failed to download ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR949/ERR949855/ERR949855_1.fastq.gz
Failed to download ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR949/ERR949836/ERR949836_2.fastq.gz
Failed to download ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR949/ERR949855/ERR949855_2.fastq.gz
Some files failed to download due to possible network issues. Please re-run the same script=/home/silvio/Software/ENA/download_sample-READS_FASTQ.sh to re-attempt to download those files
Automatically retrying failed downloads...

[...]

Using an FTP client directly works smoothly; thus, the issue does not seem to be due to a problem with the FTP server:

ftp ftp.sra.ebi.ac.uk
ftp> cd /vol1/fastq/ERR949/ERR949836/
ftp> get ERR949836_1.fastq.gz
ftp> get ERR949836_2.fastq.gz

Thank you for the help!

Error: Could not find or load main class uk.ac.ebi.ena.ftp.gui.Main

I keep getting this error when I try to run the JAR.

I suspect I am doing something wrong, but I followed the instructions.

% java -showversion
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)

% java -jar ena-ftp-downloader-v1.0.jar
Error: Could not find or load main class uk.ac.ebi.ena.ftp.gui.Main

% java -jar $PWD/ena-ftp-downloader-v1.0.jar
Error: Could not find or load main class uk.ac.ebi.ena.ftp.gui.Main

Which JAR file?

Download the latest release (.zip file) from the Releases section of the GitHub project and extract it to a location of your choice. The archive contains an executable jar which contains all dependencies.

I downloaded the zip file but couldn't locate any jar file in it?

ena-ftp-downloader is very very slow

Hi, I am using ena-ftp-downloader to download what I think is a small to medium dataset (~135G). However, it has taken me over 3 days and the download is still not complete! I've downloaded the equivalent dataset from NCBI using command-line sratools in less than 3 hours, with minimum hands-on interaction time. I know it is not an issue with my internet upload or download speed. Any suggestions on improving this?

Cannot run .jar file

This is similar to two already-closed issues, but the solutions offered to them don't apply to me.

I'm running Windows 8.1 and have installed Oracle JDK 12.0.1. After downloading the .jar file, double-clicking it does nothing, and if I try to run via the command line, I get the same message as reported in earlier issues: Error: Could not find or load main class uk.ac.ebi.ena.downloader.gui.Main

The .jar file in the .zip with runtime also does not run when it is double-clicked.

provide proper --help option

The CLI is really weird. When I run:

java -jar ena-file-downloader.jar --help 

to learn the supported parameters, it asks me multiple-choice questions, which is the total opposite of the expected CLI experience.
