das-developers / das2py-server
A CGI based data stream reduction and caching service for das2 readers
License: GNU General Public License v3.0
The error (see below) looks like a missing .decode() to convert <bytes> to <str>.
dataset request was
https://das2server.obs-nancay.fr/das2_new/server?server=dataset&start_time=2019-11-20T11%3A00%3A00.000Z&end_time=2019-11-20T11%3A00%3A10.000Z&resolution=0.017761989342806393&dataset=Nancay%2FNenuFAR%2FSpectra%2F20191120_095800_20191120_125900_SUN_TRACKING%2FStokes_I
Technical Contact: [email protected]
error:
Exception in handler: das2server.defhandlers.dsdfDataset
Traceback (most recent call last):
  File "/var/www/html/das2_new/server", line 455, in main
    nRet = H.handleReq(U, sReqType, dConf, fLog, form, sPathInfo)
  File "/usr/local/das2server_new/lib/python3/das2server/defhandlers/dsdfDataset.py", line 138, in handleReq
    nRet = U.auth.authorize(dConf, fLog, form, sDsdf, dsdf['readAccess'])
  File "/usr/local/das2server_new/lib/python3/das2server/util/auth.py", line 334, in authorize
    nRet = checkGroupAccess(dConf, fLog, form, sResource, sValue)
  File "/usr/local/das2server_new/lib/python3/das2server/util/auth.py", line 225, in checkGroupAccess
    (sUser, sPasswd) = _getUserPasswd(fLog)
  File "/usr/local/das2server_new/lib/python3/das2server/util/auth.py", line 74, in _getUserPasswd
    lAuth = sAuthPlain.split(':')
TypeError: a bytes-like object is required, not 'str'
Darrelle and I were trying to get a script working to serve Jade data. While the script worked when called directly from Autoplot, the Das2Server would clip the data at the year boundary. The cause was that validRange was set, so the server clipped the call to the Autoplot source.
It would be helpful if the error message made it clear that the requested interval was outside the bounds of validRange. https://github.com/das-developers/das2-pyserver/blob/5c2d4f75e0a5ad82303ef69554b4128294aee9a9/das2server/defhandlers/dsdfDataset.py#L117 is where the error is thrown; perhaps the message could indicate whether the requested interval was before or after the validRange.
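A sketch of what the improved message might look like. The helper name and signature are hypothetical, not the current code; ISO-8601 time strings are assumed so that plain string comparison orders them correctly.

```python
def validRangeError(sReqBeg, sReqEnd, sValidBeg, sValidEnd):
    # Hypothetical helper: report *which side* of validRange the
    # requested interval missed.  All arguments are ISO-8601 strings,
    # which compare correctly as plain strings.
    if sReqEnd <= sValidBeg:
        return ("Requested interval %s to %s ends before the start of "
                "validRange (%s)" % (sReqBeg, sReqEnd, sValidBeg))
    if sReqBeg >= sValidEnd:
        return ("Requested interval %s to %s begins after the end of "
                "validRange (%s)" % (sReqBeg, sReqEnd, sValidEnd))
    return ("Requested interval %s to %s is partially outside validRange "
            "(%s to %s)" % (sReqBeg, sReqEnd, sValidBeg, sValidEnd))
```

In the year-boundary case above, the "begins after the end" branch would have pointed straight at the misconfigured validRange.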
We need a way to have at least one worker automatically start at system boot.
The most common ways to handle this are systemd unit files and System V init scripts. Perhaps examples of each should be placed in PREFIX/etc by the install command.
Furthermore, the worker program must be tested to confirm it behaves as a proper daemon. Relying on the python-daemon package may be the way to go.
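A minimal sketch of the systemd unit that could be staged in PREFIX/etc. The service name, install prefix, worker program, and user account here are all placeholders, not the project's actual layout:

```ini
# PREFIX/etc/das2-worker.service  (hypothetical staging location;
# copy to /etc/systemd/system/ and adjust paths to enable)
[Unit]
Description=das2-pyserver background worker
After=network.target

[Service]
Type=simple
User=apache
ExecStart=/usr/local/das2srv/bin/das2_svr_arbiter
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After copying it into place, `systemctl daemon-reload` followed by `systemctl enable --now das2-worker` would start the worker at boot.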
I'm thinking about the fact that we develop code for both python2.x and python3.x. There are a few issues directly linked to this (#3, #4, #5...). The extra work required to keep a single codebase working under both versions is probably not worth it.
Since Python 2.7 will no longer be maintained in less than 40 days, I think we should take a hard turn. I propose a dedicated branch for the python3.x version. Let's also decide which version of Python 3.x we support; I'd personally advocate for Python 3.7.
What do you think?
The user's guide needs a fair bit of work. It is incomplete and may need some reorganization.
Notable problems include:
A top-level script, say das2_srvcgi_admin, should be created for basic maintenance tasks such as kicking off pre-cache builds, cleaning up log files, and seeing which datasets get the most use.
The task queue programs das2_svr_arbiter and das2_srv_todo are broken under python 3.
CDF based das2 readers relying on spacepy.pycdf expect to write to their home directory, which fails for web-services. Maybe providing a standalone pycdf as part of das2py is a solution.
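The proposed das2_srvcgi_admin script mentioned above might expose sub-commands for each maintenance task. A minimal sketch; the sub-command names are assumptions mirroring the tasks listed, not an existing interface:

```python
import argparse

def main(lArgs=None):
    # Hypothetical entry point for das2_srvcgi_admin; each sub-command
    # maps to one of the maintenance tasks described above.
    psr = argparse.ArgumentParser(prog='das2_srvcgi_admin')
    sub = psr.add_subparsers(dest='cmd', required=True)
    sub.add_parser('precache', help='kick off pre-cache builds')
    sub.add_parser('logclean', help='clean up old log files')
    sub.add_parser('usage', help='report which datasets get the most use')
    opts = psr.parse_args(lArgs)
    return opts.cmd  # dispatch to the real handler would go here
```

This keeps each task discoverable via `das2_srvcgi_admin --help` without inventing a separate script per task.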
I've activated HAPI on a Python3 installation. The HAPI landing page works, but the sub-pages (capabilities, catalog, ...) are broken.
The error is:
Error loading module catalog from das2server.h_api: No module named 'error'
For reference, the util module is working, so there should be a way to make h_api work.
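"No module named 'error'" is the classic symptom of a Python 2 implicit relative import: a module inside a package saying `import error` instead of `from . import error`. Whether that is actually what h_api does is an assumption; the sketch below just reproduces the symptom with an illustrative throwaway package, not the real h_api code:

```python
import os
import sys
import tempfile

# Build a tiny package on disk with a sibling module named "error",
# then show that the Python 2 import style fails under Python 3 while
# the explicit relative import works.
sTmp = tempfile.mkdtemp()
sPkg = os.path.join(sTmp, 'h_api_demo')
os.mkdir(sPkg)
open(os.path.join(sPkg, '__init__.py'), 'w').close()
with open(os.path.join(sPkg, 'error.py'), 'w') as f:
    f.write('MESSAGE = "loaded"\n')
with open(os.path.join(sPkg, 'catalog_py2.py'), 'w') as f:
    f.write('import error\n')            # Python 2 style, fails on 3
with open(os.path.join(sPkg, 'catalog_py3.py'), 'w') as f:
    f.write('from . import error\n')     # explicit relative import

sys.path.insert(0, sTmp)
try:
    import h_api_demo.catalog_py2
    sPy2Result = 'imported'
except ImportError as ex:
    sPy2Result = 'failed: %s' % ex       # "No module named 'error'"

import h_api_demo.catalog_py3
sPy3Result = h_api_demo.catalog_py3.error.MESSAGE
```

If this is the cause, switching h_api's internal imports to the `from . import ...` form (as util presumably already does) should fix the sub-pages.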
When attempting this action:
curl -i "https://jupiter.physics.uiowa.edu/das/server?dataset=Juno/WAV/Jovicentric&server=dsdf"
the server returns a 500 (Internal Server Error) when it should return 404 Not Found instead.
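A sketch of the desired behavior for a CGI handler: catch the missing-dataset case explicitly and emit a 404 status line rather than letting an exception propagate into a 500. The function name and the dictionary lookup are hypothetical, standing in for however the server actually resolves DSDFs:

```python
def handleDsdfReq(sDataset, dKnown):
    # Hypothetical handler sketch: dKnown maps dataset paths to DSDF
    # text.  An unknown dataset yields an explicit 404 response instead
    # of an unhandled exception (which CGI surfaces as a 500).
    if sDataset not in dKnown:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "No such data source: %s\r\n" % sDataset)
    return ("Status: 200 OK\r\n"
            "Content-Type: text/plain\r\n\r\n" + dKnown[sDataset])
```

With this shape, the curl command above would receive an accurate 404 for the nonexistent Juno/WAV/Jovicentric source.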
In many situations it would be handy if the core pyserver could find out how recently a sub-set of data had been modified. To stick with the overall pyserver mentality, the interface for asking this question should be handled by a sub-program. My proposed way to handle this is to add a new process type: a metadata reader.
This is as opposed to the read process type, which directly provides data. Though a full set of metadata items need not be defined here, the most immediately useful one is newest_mod_time. This metadata column should contain the most recent modification timestamp for all input files needed by a reader over the given coordinate range, or coordinate range set.
For a more concrete example, a given meta reader might have an interface as follows:
waves_lrs_meta begin end interval metadata_item1 metadata_item2 metadata_item3
So given the call:
waves_lrs_meta 2022-01-01 2022-01-05 86400 newest_mod_time
The following response might be generated (using 2.3/basic-xml
stream format in this case):
<stream version="2.3/basic-xml" >
  <packet id="1">
    <x pdim="time">
      <array type="isotime" use="min"/>
    </x>
    <y pdim="newest_mod_time">
      <array type="isotime" use="max"/>
    </y>
  </packet>
  <d id="1">2022-01-01 2022-01-15T14:47:32.345</d>
  <d id="1">2022-01-02 2022-01-15T14:47:33.664</d>
  <d id="1">2022-01-03 2022-01-15T14:47:33.976</d>
  <d id="1">2022-01-04 2022-01-15T18:03:18.768</d>
  ...
</stream>
This new capability could be used by the das2-pyserver cache system to determine which cache blocks are out of date, avoiding inefficient blind refreshes.
There are other uses for metadata information (as noted in the comments below) so this basic capability might affect external interfaces as well.
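The cache-refresh use case above can be sketched as a simple comparison: a block is stale when the newest_mod_time reported by the meta reader is later than the cache file's own mtime. Function and parameter names here are hypothetical, not the cache system's real API:

```python
import os
import time

def cacheBlockStale(sCacheFile, sNewestMod):
    # Hypothetical staleness check: sNewestMod is the ISO-8601
    # newest_mod_time value from a meta reader; fractional seconds
    # are dropped for simplicity.  Missing cache files count as stale
    # so they get built on first request.
    tNewest = time.mktime(time.strptime(sNewestMod.split('.')[0],
                                        "%Y-%m-%dT%H:%M:%S"))
    if not os.path.isfile(sCacheFile):
        return True
    return os.path.getmtime(sCacheFile) < tNewest
```

The cache updater would call the meta reader once per candidate block and rebuild only the blocks this predicate flags.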
The setup.py script should not overwrite the following...
... if they already exist in the install area.
Furthermore, there should be a setup.py option to skip installation of the example data sources.
The configuration instructions should include information on how to configure SELinux to allow CGI scripts to be placed in an alternate directory. I don't know how to do this even after spending an hour on the problem. SELinux fails silently, not just at the console but also in its log file /var/log/audit/audit.log.
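A candidate recipe for the documentation, assuming the standard SELinux httpd policy; the directory path is a placeholder, not the project's actual layout. Silent failures even in audit.log are often due to dontaudit rules, which can be lifted temporarily:

```shell
# Label an alternate CGI directory so httpd may execute scripts there
# (path is an example only):
semanage fcontext -a -t httpd_sys_script_exec_t "/srv/das2/cgi-bin(/.*)?"
restorecon -Rv /srv/das2/cgi-bin

# If denials do not appear in /var/log/audit/audit.log, they may be
# suppressed by dontaudit rules; disable suppression while debugging:
semodule -DB
ausearch -m AVC -ts recent
# Re-enable dontaudit suppression when done:
semodule -B
```

If this recipe is confirmed to work, it (or a corrected version) belongs in the configuration instructions.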