espri-mod / synda
ESGF Downloader (this is a deprecated repository, the tool has now moved to https://github.com/ESGF/esgf-download)
Home Page: https://espri-mod.github.io/synda/
Modify SDW to start the worker in non-privileged mode when running in non-daemon mode
In daemon mode, the daemon stops when a certificate error occurs
Example
traceback file is:
Traceback (most recent call last):
File "/home/synda/sdt/lib/sd/sdworkerutils.py", line 33, in run
self._service.run(self._instance)
File "/home/synda/sdt/lib/sd/sddmdefault.py", line 40, in run
cls.start_transfer_script(tr)
File "/home/synda/sdt/lib/sd/sddmdefault.py", line 48, in start_transfer_script
sdlogon.renew_certificate(False)
File "/home/synda/sdt/lib/sd/sdlogon.py", line 87, in renew_certificate
(hostname,port,username)=sdopenid.extract_info_from_openid(openid)
File "/home/synda/sdt/lib/sd/sdopenid.py", line 42, in extract_info_from_openid
raise OpenIDProcessingException('SDOPENID-002','Error occured while processing OpenID')
OpenIDProcessingException: code=SDOPENID-002,message=Error occured while processing OpenID
transfer.log is
2016-06-06 00:36:04,131 DEBUG SYDEVENT-001 'file_complete_event' triggered (cmip5.output1.MIROC.MIROC-ESM.piControl.mon.ocean.Omon.r1i1p1.v20120608.thkcello_Omon_MIROC-ESM_piControl_r1i1p1_206001-206912.nc)
2016-06-06 00:36:12,140 ERROR SDOPENID-200 Error occured while processing OpenID (code=SDNETUTI-002,message=HTTP query failed (url=https://pcmdi.llnl.gov/esgf-idp/openid/elidart,exception=<urlopen error timed out>,timeout=10))
2016-06-06 00:36:12,140 ERROR SDDMDEFA-502 Exception occured while retrieving certificate (code=SDOPENID-002,message=Error occured while processing OpenID)
2016-06-06 00:36:12,140 ERROR SDWUTILS-002 Thread didn't complete successfully
2016-06-06 00:36:13,142 ERROR SDOPENID-200 Error occured while processing OpenID (code=SDNETUTI-002,message=HTTP query failed (url=https://pcmdi.llnl.gov/esgf-idp/openid/elidart,exception=<urlopen error timed out>,timeout=10))
2016-06-06 00:36:13,142 ERROR SDDMDEFA-502 Exception occured while retrieving certificate (code=SDOPENID-002,message=Error occured while processing OpenID)
2016-06-06 00:36:13,142 ERROR SDWUTILS-002 Thread didn't complete successfully
2016-06-06 00:36:14,155 ERROR SDTSCHED-002 Fatal exception occured during download
2016-06-06 00:36:14,155 INFO SDTSCHED-901 Scheduler successfully stopped
2016-06-06 00:36:14,155 INFO SDDAEMON-034 Daemon stopped
The selection file is
#######################
project="CMIP5"
institute="MIROC"
model="MIROC-ESM"
experiment="piControl"
data_node="aims3.llnl.gov"
protocol=gridftp
time_frequency="mon"
latest=true
replica=false
#######################
A better behaviour would be to keep the daemon running, stop starting new downloads, and retry certificate renewal at a given interval.
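The retry behaviour proposed here could look roughly like the sketch below. It is only an illustration, not synda's code: the renewal function is passed in as a callable, and `CertificateError`, `retry_interval` and `max_attempts` are hypothetical names.

```python
import time


class CertificateError(Exception):
    """Hypothetical stand-in for the exception raised when renewal fails."""


def renew_with_retry(renew_certificate, retry_interval=300, max_attempts=None,
                     sleep=time.sleep):
    """Call renew_certificate until it succeeds.

    While renewal keeps failing, the caller is expected not to start new
    downloads; the daemon itself keeps running. Returns the number of
    attempts made, or re-raises the last error once max_attempts is reached.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            renew_certificate()
            return attempts
        except CertificateError:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            sleep(retry_interval)  # wait before the next renewal attempt
```

Leaving `max_attempts` at None retries forever, which matches the idea of never stopping the daemon on a transient IdP timeout.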
In a system install, the SDW location must be set to '/var/log/synda/sdw' instead of using $SP_HOME.
In a source install, $SP_HOME must still be used.
The "synda search" command fails when a URL is used as the search key
e.g.
synda search http://aims3.llnl.gov/thredds/fileServer/cmip5_css02_data/cmip5/output1/CCCma/CanESM2/esmFdbk2/mon/ocean/Omon/r1i1p1/zostoga/1/zostoga_Omon_CanESM2_esmFdbk2_r1i1p1_200601-210012.nc -f
File "/home/jerome/sdt/lib/sd/sdremotequtils.py", line 57, in serialize_parameter__ovpp
assert isinstance(values,list)
AssertionError
I have installed synda following the deb instructions and updated my credentials in /etc/synda/sdt/credentials.conf.
Running
synda get cmip5.output1.MPI-M.MPI-ESM-LR.historical.day.atmos.day.r3i1p1.tasmax.v20111006
Results in:
Traceback (most recent call last):
File "/usr/bin/synda", line 162, in <module>
status=sdtiaction.actions[args.subcommand](args)
File "/usr/share/python/synda/sdt/bin/sdtiaction.py", line 119, in get
import sdlogon, sdrfile, sddeferredafter, sddirectdownload, syndautils, humanize, sdconfig, os, sdconst
File "/usr/share/python/synda/sdt/bin/sdlogon.py", line 122, in <module>
openid=sdconfig.config.get('esgf_credential','openid')
File "/usr/lib/python2.7/ConfigParser.py", line 607, in get
raise NoSectionError(section)
NoSectionError: No section: 'esgf_credential'
Are additional configuration steps needed?
Hi there, another question! When processing a fairly large download queue, I get an unhandled exception within 5-60 minutes of starting the daemon. I'm not sure if this is a problem with ESGF nodes or within Synda. The stack trace is:
Traceback (most recent call last):
File "/home/azureuser/sdt/lib/sd/sdworkerutils.py", line 32, in run
self._service.run(self._instance)
File "/home/azureuser/sdt/lib/sd/sddownload.py", line 31, in run
cls.start_transfer_script(tr)
File "/home/azureuser/sdt/lib/sd/sddownload.py", line 39, in start_transfer_script
sdlogon.renew_certificate(False)
File "/home/azureuser/sdt/lib/sd/sdlogon.py", line 46, in renew_certificate
(hostname,port,username)=sdopenid.extract_info_from_openid(openid)
File "/home/azureuser/sdt/lib/sd/sdopenid.py", line 33, in extract_info_from_openid
xrds_buf=sdnetutils.HTTP_GET(openid)
File "/home/azureuser/sdt/lib/sd/sdnetutils.py", line 129, in HTTP_GET
raise SDException(errcode,errmsg)
SDException: code=SDNETUTI-002,message=HTTP query failed (url=https://esgf-node.ipsl.fr/esgf-idp/openid/XXXXXXX,exception=<urlopen error [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:581)
I also tried an OpenID registered at pcmdi9.llnl.gov/ but also get an exception every so often:
Traceback (most recent call last):
File "/home/azureuser/sdt/lib/sd/sdworkerutils.py", line 32, in run
self._service.run(self._instance)
File "/home/azureuser/sdt/lib/sd/sddownload.py", line 31, in run
cls.start_transfer_script(tr)
File "/home/azureuser/sdt/lib/sd/sddownload.py", line 39, in start_transfer_script
sdlogon.renew_certificate(False)
File "/home/azureuser/sdt/lib/sd/sdlogon.py", line 46, in renew_certificate
(hostname,port,username)=sdopenid.extract_info_from_openid(openid)
File "/home/azureuser/sdt/lib/sd/sdopenid.py", line 33, in extract_info_from_openid
xrds_buf=sdnetutils.HTTP_GET(openid)
File "/home/azureuser/sdt/lib/sd/sdnetutils.py", line 129, in HTTP_GET
raise SDException(errcode,errmsg)
SDException: code=SDNETUTI-002,message=HTTP query failed (url=https://pcmdi9.llnl.gov/esgf-idp/openid/XXXXX,exception=('The read operation timed out',))
Feature added in 3.3
'ImportError: No module named sysconfig' bug.
When processing only one job, the worker should report whether the job was processed successfully.
When running
synda install cmip5.output1.MOHC.HadGEM2-A.amip4xCO2.mon.atmos.Amon.r1i1p1.v20131108
the project default file is not loaded.
As a consequence, the parameter below is not loaded:
local_path_format=treevar
This causes files from different variables to be stored together in the version folder.
Can we have a project specific drs_path_template rather than building the drs from dataset_id?
Maybe we could add project specific sections to the config file and read the drs_path_template from it? That would be similar to ESGF's esg.ini, but unlike esg.ini specifying the drs_path_template should be optional, i.e. use the default (dataset_id) if it's not specified.
E.g. for CORDEX:
drs_path_template=cordex/%(product)s/%(domain)s/%(institute)s/%(driving_model)s/%(experiment)s/%(ensemble)s/%(institute)s-%(rcm_name)s/%(rcm_version)
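A minimal sketch of the optional-template idea, assuming the facets are available as a dict. `build_local_path` is a hypothetical name, and the fallback (replacing dots in dataset_id with slashes) is only an assumption about how the default path is derived; it would mis-split facet values that themselves contain dots.

```python
def build_local_path(facets, drs_path_template=None):
    """Return the local DRS path for a dataset.

    facets: dict of facet name -> value; must contain 'dataset_id'.
    drs_path_template: optional %-style template; when None, the path is
    derived from dataset_id (assumed default behaviour).
    """
    if drs_path_template is None:
        # Assumed default: turn the dotted dataset_id into a directory path.
        return facets["dataset_id"].replace(".", "/")
    # Project-specific template: each %(name)s is filled from the facets.
    return drs_path_template % facets
```

For example, with `{"dataset_id": "...", "product": "output", "domain": "EUR-11"}` and the template `cordex/%(product)s/%(domain)s`, the result is `cordex/output/EUR-11`.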
When installed as a system package, the 'synda get' command must be run with sudo.
It would be useful to also be able to run 'synda get' as a normal user (using a per-user credential file), as one would run wget.
The worker script must switch to unprivileged mode when run as root, even in interactive mode
Search fails due to incorrect suffix added to url.
|application/netcdf|HTTPServer
instead of
|application/gridftp|GridFTP
Example:
synda search gsiftp://esgf1.dkrz.de:2811//cmip5/cmip5/output2/MPI-M/MPI-ESM-P/past1000/mon/ocean/Omon/r1i1p1/v20131203/umo/umo_Omon_MPI-ESM-P_past1000_r1i1p1_112001-112912.nc -z
http://esgf-data.dkrz.de/esg-search/search?url=gsiftp://esgf1.dkrz.de:2811//cmip5/cmip5/output2/MPI-M/MPI-ESM-P/past1000/mon/ocean/Omon/r1i1p1/v20131203/umo/umo_Omon_MPI-ESM-P_past1000_r1i1p1_112001-112912.nc|application/netcdf|HTTPServer&fields=*&project=CMIP5&limit=100&distrib=true&type=File&format=application%2Fsolr%2Bxml&offset=0
I commonly use a pdsh or rsh command to start init scripts on a remote host, especially when I need to operate on multiple nodes at once. When trying to start sdt remotely, however, the start command doesn't return. It appears to start the sdt daemon on the remote host, but the start command is hung and a ^C kills the sdt daemon on that remote host. Some basic printline debugging seems to imply sddaemon.py isn't getting called correctly as print statements in init aren't called.
When pexec creates the CDF pipeline and the IPSL pipeline is not done yet, CDF is set to PAUSE.
Currently, when the IPSL pipeline completes, the CDF pipeline remains untouched.
Instead, CDF must be set to WAITING when the IPSL pipeline completes.
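The requested transition could be sketched as below. This is not the actual pexec code: the in-memory dicts, the `DONE` status name and `complete_pipeline` are hypothetical; only PAUSE and WAITING come from the description.

```python
PAUSE = "PAUSE"
WAITING = "WAITING"
DONE = "DONE"  # hypothetical name for the completed state


def complete_pipeline(statuses, name, dependents):
    """Mark pipeline `name` as done and release the pipelines paused on it.

    statuses: dict pipeline name -> status
    dependents: dict pipeline name -> list of pipelines that depend on it
    """
    statuses[name] = DONE
    for child in dependents.get(name, []):
        # Only paused dependents are released; other states are left alone.
        if statuses.get(child) == PAUSE:
            statuses[child] = WAITING
    return statuses
```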
synda search project=CORDEX
loads default value stored in "default_CORDEX.txt" file,
but
synda search CORDEX
doesn't
Paths formatted with get_full_local_path contain duplicate slashes
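One possible way to normalise such paths, as a sketch for local POSIX-style paths only. `collapse_slashes` is a hypothetical helper, not part of synda. Note that `os.path.normpath` keeps a leading `//` on POSIX, which is why a regular expression is used here.

```python
import re


def collapse_slashes(path):
    """Collapse every run of '/' in a local path into a single '/'."""
    return re.sub(r"/{2,}", "/", path)
```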
Currently, file date/time attributes are set to the download date/time (or to the "Last-Modified" attribute from the HTTP header if present).
A flag is needed to set file date/time attributes to the metadata timestamp instead (when present).
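A sketch of what such a flag could trigger after a download completes. The function name and the timestamp format (`YYYY-MM-DDTHH:MM:SSZ`, UTC) are assumptions; the metadata may use a different layout.

```python
import calendar
import os
import time


def apply_metadata_timestamp(path, timestamp):
    """Set atime and mtime of path from a UTC timestamp string.

    timestamp: string such as '2012-06-08T10:15:00Z' (assumed format).
    Returns False and leaves the file untouched when timestamp is empty.
    """
    if not timestamp:
        return False
    parsed = time.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    epoch = calendar.timegm(parsed)  # interpret the parsed time as UTC
    os.utime(path, (epoch, epoch))
    return True
```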
$ synda queue
status count size
done 132 343.4 GB
running 10 15.2 GB
waiting 174 464.0 GB
$ synda metric
....
A 'No such file or directory' error occurs when the .esg dir is missing.
AttributeError: 'Namespace' object has no attribute 'type_'
In sdp.db, the model column is empty for CORDEX in the ppprun and event tables.
It would be useful to set it from the rcm_name or rcm_model facets of the CORDEX DRS instead.
Is there an easy way to have synda save data strictly into an "output" directory? As I have it running now, the majority of the data ends up in an "output1" directory, with some ending up in "output2".
Traceback (most recent call last):
File "/usr/share/python/synda/sdt/bin/myproxyclient", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/share/python/synda/sdt/lib/python2.6/site-packages/pkg_resources.py", line 3025, in <module>
parse_requirements(__requires__), Environment()
File "/usr/share/python/synda/sdt/lib/python2.6/site-packages/pkg_resources.py", line 628, in resolve
raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: setuptools>=1.0
Source
https://github.com/ESGF/esgf.github.io/wiki/ESGF-Index-and-IdP-nodes
[index]
indexes=esgf-node.ipsl.upmc.fr
default_index=esgf-node.ipsl.upmc.fr
When running 'synda get' as a normal user with a system package installation, the error below occurs
File "/usr/lib/python2.7/ConfigParser.py", line 607, in get
raise NoSectionError(section)
NoSectionError: No section: 'esgf_credential'
Hi there, I get the following issue with the current install script, on a fresh Ubuntu 14.04 instance where the only modification is the installation of the prerequisites:
running install_egg_info
Writing /home/x/sdt/lib/python2.7/site-packages/sdt-3.0-py2.7.egg-info
/home/azureuser/garbage/sdt-3.0
Traceback (most recent call last):
File "/home/x/sdt/bin/synda", line 21, in <module>
import sdconst
File "/home/x/sdt/lib/sd/sdconst.py", line 12, in <module>
import sdapp
File "/home/x/sdt/lib/sd/sdapp.py", line 22, in <module>
import sdconfig
File "/home/x/sdt/lib/sd/sdconfig.py", line 65, in <module>
raise SDException('SDCONFIG-010',"'ST_HOME' is not set")
sdexception.SDException: code=SDCONFIG-010,message='ST_HOME' is not set
Adding ST_HOME manually fixes the problem.
HPSS systems need some time to load files from tape to disk cache, which causes download errors.
To fix that, the download scheduler must be modified to retry failed downloads after a few minutes.
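As a rough illustration, the scheduler could periodically pick the failed transfers that are old enough to retry. The dict keys (`status`, `error_time`) and the `retry_delay` parameter are hypothetical and do not reflect synda's actual data model.

```python
def transfers_to_retry(transfers, now, retry_delay=300):
    """Return transfers in 'error' state whose failure is at least retry_delay seconds old.

    transfers: iterable of dicts with 'status' and 'error_time' (epoch seconds)
    now: current time in epoch seconds
    """
    return [t for t in transfers
            if t["status"] == "error" and now - t["error_time"] >= retry_delay]
```

A delay of a few minutes gives the tape system time to stage the file before the next attempt.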
$ echo r1i1p8 >> ../../conf/default/default_CMIP5.txt
$ synda search -z
http://esgf-data.dkrz.de/esg-search/search?distrib=true&fields=instance_id,id,type,data_node,variable&limit=100&ensemble=r1i1p8&type=Dataset&format=application%2Fsolr%2Bxml&offset=0
When no project is set, project level default file should not be loaded
There may be some user error here, but I tried to install a sample dataset. It created the directory tree in /srv/synda/sdt/cmip5/... but failed to transfer the file (error 27). I later tried the transfer again, but it said the files were already installed:
[root@aimsdtn5:sdt]# sudo synda install cmip5.output1.MPI-M.MPI-ESM-LR.decadal1995.mon.land.Lmon.r2i1p1.v20120529 baresoilFrac
http://esgf-data1.ceda.ac.uk/thredds/fileServer/esg_dataroot/cmip5/output1/MPI-M/MPI-ESM-LR/decadal1995/mon/land/Lmon/r2i1p1/v20120529/baresoilFrac/baresoilFrac_Lmon_MPI-ESM-LR_decadal1995_r2i1p1_199601-200512.nc
http://esgf-data1.ceda.ac.uk/thredds/fileServer/esg_dataroot/cmip5/output1/MPI-M/MPI-ESM-LR/decadal1995/mon/land/Lmon/r2i1p1/v20120529/baresoilFrac/baresoilFrac_Lmon_MPI-ESM-LR_decadal1995_r2i1p1_199601-200512.nc
Nothing to install (files already installed).
The queue still showed the error state:
[root@aimsdtn5:sdt]# synda queue
status count size
error 1 8.9 MB
I then tried a 'synda remove' to see if that would clear the installed state. I removed it, then tried re-installing the dataset. That command showed some errors and a subsequent attempt to install the dataset complained the directory tree was missing, which 'ls' confirmed.
[root@aimsdtn5:sdt]# synda remove cmip5.output1.MPI-M.MPI-ESM-LR.decadal1995.mon.land.Lmon.r2i1p1.v20120529 baresoilFrac
http://esgf-data1.ceda.ac.uk/thredds/fileServer/esg_dataroot/cmip5/output1/MPI-M/MPI-ESM-LR/decadal1995/mon/land/Lmon/r2i1p1/v20120529/baresoilFrac/baresoilFrac_Lmon_MPI-ESM-LR_decadal1995_r2i1p1_199601-200512.nc
http://esgf-data1.ceda.ac.uk/thredds/fileServer/esg_dataroot/cmip5/output1/MPI-M/MPI-ESM-LR/decadal1995/mon/land/Lmon/r2i1p1/v20120529/baresoilFrac/baresoilFrac_Lmon_MPI-ESM-LR_decadal1995_r2i1p1_199601-200512.nc
1 file(s) will be removed.
After this operation, 0 Bytes of disk space will be freed.
Do you want to continue? [y/N] y
1 file(s) removed
*** Error occured at 2016-02-19 14:09:48.902619 ***
==================
* Error code *
==================
SDOPERAT-001
=====================
* Error message *
=====================
Error occurs during tree cleanup
[root@aimsdtn5:sdt]# synda install cmip5.output1.MPI-M.MPI-ESM-LR.decadal1995.mon.land.Lmon.r2i1p1.v20120529 baresoilFrac
Traceback (most recent call last):
File "/usr/bin/synda", line 25, in <module>
import sdapp
File "/usr/share/python/synda/sdt/bin/sdapp.py", line 21, in <module>
import sdconfig
File "/usr/share/python/synda/sdt/bin/sdconfig.py", line 133, in <module>
check_path(data_folder)
File "/usr/share/python/synda/sdt/bin/sdconfig.py", line 50, in check_path
raise SDException("SDATYPES-101","Path not found (%s)"%path)
sdexception.SDException: code=SDATYPES-101,message=Path not found (/srv/synda/sdt)
sdlogon.sh script failed if password contains '=' character
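The report does not say why the script fails. One plausible cause, stated here purely as an assumption, is that a `key=value` line is split on every '=' rather than only the first. The sketch below shows the first-'=' approach in Python (the actual script is a shell script); `parse_credential_line` is a hypothetical helper.

```python
def parse_credential_line(line):
    """Split 'key=value' on the first '=' only, so the value may contain '='."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("missing '=' in %r" % line)
    # Only the trailing newline is removed; the password is otherwise untouched.
    return key.strip(), value.rstrip("\n")
```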
How to check or control the transfer rate of http transfers done by synda?
Thanks
Hans
When the daemon fails to start, an error message should be printed to inform user
'synda search' returns incorrect results
$ synda search CORDEX
new cmip5.output2.IPSL.IPSL-CM5B-LR.piControl.yr.ocnBgchem.Oyr.r1i1p1.v20120430
new cmip5.output1.IPSL.IPSL-CM5B-LR.rcp45.yr.ocnBgchem.Oyr.r1i1p1.v20120430
new cmip5.output1.IPSL.IPSL-CM5A-LR.decadal1995.yr.ocnBgchem.Oyr.r1i1p1.v20120430
Corresponding query is
It should try another index node instead.
Hi,
I want to download a batch of CMIP5 model runs using the template file.
This worked great two weeks ago.
Now I need more output, but for any selection.txt (also the example files in selection/sample/) I do get the following output:
synda install -s selection.txt
Nothing to install (0 file found).
I use version 3.2 installed locally on Ubuntu 12.04 from sources.
Any idea?
Thanks,
Here is my selection file:
project=CMIP5
model=ACCESS1.0 ACCESS1.3 BCC-CSM1.1 BCC-CSM1.1(m) CCSM4 CNRM-CM5 CNRM-CM5-2 CanESM2 GFDL-CM3 GFDL-ESM2G GFDL-ESM2M GISS-E2-H GISS-E2-R HadGEM2-ES INM-CM4 IPSL-CM5A-LR IPSL-CM5A-MR IPSL-CM5B-LR MIROC-ESM MIROC5 MPI-ESM-LR MPI-ESM-MR MPI-ESM-P MRI-CGCM3 MRI-ESM1 NorESM1-M
experiment=piControl
ensemble=r1i1p1
variable[atmos][mon]=hfls hfss rsds rsus rlds rlus rlut prc ts tas
Improve error display in this case (remove stacktrace):
$ sudo /usr/share/python/synda/sdt/bin/sdppproxy.py
*** Error occured at 2016-07-05 10:51:15.909778 ***
Traceback (most recent call last):
File "/usr/share/python/synda/sdt/bin/sdppproxy.py", line 76, in <module>
print get_service().test1(1,2)
File "/usr/share/python/synda/sdt/lib/python2.6/site-packages/pyjsonrpc/http.py", line 162, in __call__
return self.http_client_instance.call(self.method, *args, **kwargs)
File "/usr/share/python/synda/sdt/lib/python2.6/site-packages/pyjsonrpc/http.py", line 253, in call
debug = self.debug
File "/usr/share/python/synda/sdt/lib/python2.6/site-packages/pyjsonrpc/http.py", line 131, in http_request
response = urllib2.urlopen(request, timeout = timeout)
File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib64/python2.6/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
When the machine halts (e.g. kernel panic), the SDP daemon pidfile is not removed.
This prevents the service from being started the next time.
A modification should be made to remove the pid file if the corresponding process doesn't exist.
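A sketch of such a check for Unix-like systems, where sending signal 0 with `os.kill` tests for a process without affecting it. The function names are hypothetical, and the sketch does not verify that the pid really belongs to the SDP daemon (a pid reused by another process would keep the file in place).

```python
import errno
import os


def process_exists(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True


def remove_stale_pidfile(pidfile):
    """Delete pidfile if it names a process that no longer exists.

    Returns True when a stale file was removed, False otherwise.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        return False  # missing or unreadable pidfile: nothing to clean up
    if process_exists(pid):
        return False
    os.remove(pidfile)
    return True
```

Calling `remove_stale_pidfile` before the usual "already running" check would let the service start again after a crash.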
$ synda search rcp85 20050101-21001231 3hr
Dataset not found
http://esgf-data.dkrz.de/esg-search/search?fields=instance_id,id,type,data_node,variable&project=CMIP5&time_frequency=3hr&experiment=rcp85&limit=100&distrib=true&query=20050101-21001231&type=Dataset&format=application%2Fsolr%2Bxml&offset=0
Ubuntu 15.10 here.
I've installed synda from the repository, set up my ESGF credentials, and when I tried to perform a search (copying one of the examples in the README) I get:
matteo@matteo-dell:~/work/ECEMWP3$ synda search cmip5 output1 MOHC HadGEM2-A amip4xCO2 mon atmos Amon r1i1p1
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/share/python/synda/sdt/bin/sdapputils.py", line 27, in print_exception
import os,sys,traceback,datetime # note: those imports must be done here (i.e. it doesn't work if done top of the file)
ImportError: No module named datetime
Original exception was:
Traceback (most recent call last):
File "/usr/bin/synda", line 123, in <module>
stream=syndautils.get_stream(args)
File "/usr/share/python/synda/sdt/bin/syndautils.py", line 32, in get_stream
import sdbuffer, sdparse, sdstream, sdconfig, sddeferredbefore, sdexception
File "/usr/share/python/synda/sdt/bin/sdstream.py", line 18, in <module>
import sddquery
File "/usr/share/python/synda/sdt/bin/sddquery.py", line 20, in <module>
import sdutils
File "/usr/share/python/synda/sdt/bin/sdutils.py", line 20, in <module>
import datetime
ImportError: No module named datetime
e.g. set "limit" parameter default value from 100 to 500
synda remove doesn't delete files as expected
Hi,
The regular expression contained in ESGF_OPENID_REXP in lib/sd/sdopenid.py doesn't seem to accommodate (at least some) OpenIDs. I suggest that the documentation be updated to indicate that users may have to change this regular expression.
I recently created a new user account on the Earth System that gave me the OpenID https://www.earthsystemgrid.org/myopenid/my_user_name.
When trying the test command synda get GeoMIP IPSL Tas mon, I got error SDOPENID-002. Searching with grep suggested that this error message came from lib/sd/sdopenid.py.
I modified the line containing ESGF_OPENID_REXP
from
ESGF_OPENID_REXP = r'https://.*/esgf-idp/openid/(.*)'
to
ESGF_OPENID_REXP = r'https://www.earthsystemgrid.org/myopenid/(.*)'
which seemed to give better results.
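Rather than editing the single expression, both layouts could be accepted at once. A sketch under stated assumptions: `OPENID_PATTERNS` and `extract_username` are hypothetical names, the first pattern assumes the usual `/esgf-idp/openid/<user>` layout, and the real function in sdopenid.py also retrieves the hostname and port, which is not covered here.

```python
import re

# Hypothetical list of accepted OpenID layouts.
OPENID_PATTERNS = [
    re.compile(r"https://[^/]+/esgf-idp/openid/(.+)"),
    re.compile(r"https://www\.earthsystemgrid\.org/myopenid/(.+)"),
]


def extract_username(openid):
    """Return the user name embedded in an OpenID URL, or None if no pattern matches."""
    for pattern in OPENID_PATTERNS:
        match = pattern.match(openid)
        if match:
            return match.group(1)
    return None
```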