
rms-pdsfile's People

Contributors

juzen2003, pds-admin, rfrenchseti


rms-pdsfile's Issues

The return value of "from_path" doesn't match the comments in the function

From rms-webtools created by juzen2003: SETI/rms-webtools#24

  • Current from_path expected results based on comments:

    • 'COISS_2001.targz' --> 'archives-volumes/COISS_2xxx/COISS_2001.tar.gz'
    • 'COISS_2001_previews.targz' --> 'archives-previews/COISS_2xxx/COISS_2001_previews.tar.gz'
    • 'COISS_0xxx_tar.gz' --> 'archives-volumes/COISS_2xxx'
  • Actual return results from function:

    • 'COISS_2001.targz' --> 'previews/COISS_2xxx/COISS_2001'
    • 'COISS_2001_previews.targz' --> 'volumes/COISS_2xxx/COISS_2001'
    • 'COISS_0xxx_tar.gz' --> 'volumes/COISS_0xxx'

Are CORSS VERSIONS rules correct?

From rms-webtools created by rfrenchseti: SETI/rms-webtools#56

The VERSIONS part of rules/CORSS_8xxx.py has the following code:

    (r'volumes/CORSS_8xxx(|_v[0-9\.]+)/(CORSS_8...)/(\w+)(|/.*)', 0,
            [r'volumes/CORSS_8xxx*/\2/#LOWER#\3\4',
             r'volumes/CORSS_8xxx*/\2/#LOWER#\3#MIXED#\4',
             r'volumes/CORSS_8xxx_v1/\2/#UPPER#\3\4',
             r'volumes/CORSS_8xxx_v1/\2/#UPPER#\3#MIXED#\4',
            ]),

The last two lines duplicate the results from the first two, except they also capitalize the REV prefix. When enumerating version files, this results in things like:

'/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/REV07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB'
'/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/Rev07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB'

There is code to de-dup lists like this using the Python set() constructor, but this de-dup is case-sensitive and thus both examples of the file end up being present (see, e.g. PdsFile.all_versions()). Usually this is caught in a later phase of PdsFile, but it causes a warning to be logged (which we don't usually see because we don't have PdsFile logging turned on).
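To make the case-sensitivity point concrete, here is a minimal sketch. It is not the PdsFile implementation: the helper name `dedup_case_insensitive` is hypothetical, and the two paths are the ones quoted above with the leading filesystem root dropped.

```python
paths = [
    "volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/REV07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB",
    "volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/Rev07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB",
]

# set() compares strings exactly, so both spellings survive the de-dup.
exact_unique = set(paths)


def dedup_case_insensitive(items):
    """Keep the first occurrence of each item, comparing case-insensitively.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    seen = set()
    result = []
    for item in items:
        key = item.casefold()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


folded_unique = dedup_case_insensitive(paths)
```

Whether folding case is the right fix is a separate question: on a case-sensitive filesystem the two spellings could in principle be distinct files, so correcting the rule may be safer than loosening the de-dup.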

I found this because it changes the code coverage of the PdsFile tests depending on whether they are run against a Linux or a Mac filesystem.

There is no other case where we have this problem, leading me to believe the VERSIONS for CORSS are incorrect in this instance.

Add pickle files for holdings/documents directory

From rms-webtools created by rfrenchseti: SETI/rms-webtools#76

Currently the documents directory does not have associated pickle files, which means any access by PdsFile needs to go to the filesystem instead of the pickle files. It would be more consistent to have pickle files for the documents directory as well. This involves updating the scripts in validation and also making any needed modifications to PdsFile.

NH observations have multiple preview images of a given size

From rms-webtools created by rfrenchseti: SETI/rms-webtools#13

From pds-opus created by rfrenchseti: SETI/rms-opus#483

There are different versions of NH observations with suffixes like "_0x630" and "_0x631". These suffixes are ignored when making the OPUS ID, but the different versions are available for downloading. However, each version also has its own preview image, which means ViewSet has multiple previews for a given OPUS ID and size.

How do we choose which one to display? What do we do if we want the user to choose which one to look at?
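One way to at least detect the ambiguity is to group preview names by the name with the version suffix removed. The sketch below is only illustrative: the basenames are made-up placeholders, the helper names are hypothetical, and the suffix pattern is an assumption based on the "_0x630" and "_0x631" examples above.

```python
import re
from collections import defaultdict

# Assumed suffix shape, based on "_0x630" and "_0x631"; real names may differ.
VERSION_SUFFIX = re.compile(r"_0x[0-9a-f]+$", re.IGNORECASE)


def strip_version_suffix(basename):
    """Return the basename without a trailing "_0x..." version suffix."""
    return VERSION_SUFFIX.sub("", basename)


def group_by_unversioned_name(basenames):
    """Map each suffix-free name to the list of versioned names sharing it."""
    groups = defaultdict(list)
    for name in basenames:
        groups[strip_version_suffix(name)].append(name)
    return dict(groups)


# Placeholder basenames, not real NH products.
previews = ["obs_a_0x630", "obs_a_0x631", "obs_b_0x630"]
grouped = group_by_unversioned_name(previews)

# Names with more than one version are the ones needing a display policy.
ambiguous = {key: names for key, names in grouped.items() if len(names) > 1}
```

This only surfaces the cases that need a decision; it does not answer which preview should be shown by default.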

Need PdsFile.primary_data_abspath to normalize primary filespecs for all types of data

From rms-webtools created by rfrenchseti: SETI/rms-webtools#4

On Nov 7, 2018, at 1:53 PM, Rob French wrote:

OK...so fundamentally there is a mismatch here. If I read the "primary
file spec" from an index file, assuming it is in the proper format
(ending in .LBL), then I have no way of using PdsFile.from_filespec() to
look it up and get a viable ViewSet from it, since ViewSets explicitly
don't work with the .LBL extension.

So either we need to change PdsFile to look up Viewables when the
extension is .LBL, or we need to change the extension of the primary
file spec before sending it to PdsFile for lookup. Or is there already
some automated way to ask PdsFile for the "primary data product" which
DOES have a ViewSet?

There's also the problem that Cassini ISS and Galileo SSI do NOT use
.LBL as the extension in the index files. That means that, in OPUS, some
observations have a primary file spec ending in .LBL and some end in
.IMG. Do we want to make these consistent by having the import pipeline
switch the extension to all be .LBL? Or do we want to keep the OPUS
database consistent with what's actually in the PDS archives?

On 11/7/2018 12:28 PM, Mark Showalter wrote:

OK, I remember now why I did this and it has to do with making sensible Viewmaster pages.

We can solve this problem by having a PdsFile attribute "primary_data_abspath" that returns the absolute path to the primary data file. Then...

pdsf = pdsfile.PdsFile.from_path(filespec)
viewset = pdsfile.PdsFile.from_abspath(pdsf.primary_data_abspath).viewset

...would do the trick. The problem is that the association between a random file and the primary data file is currently not easy to make unless you turn on "set_opus_lookups()", which is slow. I can fix that.

This will be the quantity that should be used as primary filespec, no matter what appears in the label. Also, PdsFile.from_abspath(primary_filespec).viewset will return a valid viewset.

On Nov 7, 2018, at 9:45 PM, Rob French wrote:

OK, hopefully last question - do you really want abspath stored in the
OPUS database? That exposes our internal filesystem structure. Wouldn't
the logical_path, or better yet the logical_path with "volumes" stripped
off, be more appropriate?

yes, logical path after "volumes/".
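As a rough sketch of what "logical path after volumes/" could mean in code, assuming the logical path is a plain string that begins with "volumes/" (the helper name `strip_volumes_prefix` is hypothetical and not part of PdsFile):

```python
def strip_volumes_prefix(logical_path):
    """Return the part of a logical path that follows the leading "volumes/".

    Raises ValueError for paths that do not start with "volumes/", so that
    unexpected inputs are noticed instead of being stored unchanged.
    """
    prefix = "volumes/"
    if not logical_path.startswith(prefix):
        raise ValueError(
            f"expected a path starting with {prefix!r}: {logical_path!r}"
        )
    return logical_path[len(prefix):]


# A logical path of the form used elsewhere on this page.
stored = strip_volumes_prefix("volumes/COISS_2xxx/COISS_2001")
```

Storing this shortened form keeps the internal filesystem layout out of the OPUS database, which is the concern raised in the question.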

Check pickle file ordering

From rms-webtools created by rfrenchseti: SETI/rms-webtools#30

_get_shelf in pdsfile.py is sorting the pickle files as they are read because they are coming in out of order. But Python 3 stores dictionaries in insertion order, so we need to investigate why the pickle files are out of order. It could just be that some of the files are old and were written with Python 2, in which case we can update the pickle files and remove the sort.
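A quick way to investigate is to check whether a loaded dictionary's keys are already in sorted order. The sketch below assumes the pickle files hold plain dicts keyed by strings, which is an assumption about the file layout; `keys_are_sorted` is a hypothetical helper. It also shows that a pickle round trip in Python 3.7+ preserves insertion order, so files rewritten from a sorted dict should come back sorted.

```python
import pickle


def keys_are_sorted(mapping):
    """Return True if the mapping's keys are already in ascending order."""
    keys = list(mapping)
    return keys == sorted(keys)


# A dict whose insertion order is deliberately not sorted.
unsorted = {"b": 1, "a": 2, "c": 3}

# Round trip through pickle; insertion order is preserved, not sorted.
restored = pickle.loads(pickle.dumps(unsorted))
order_preserved = list(restored) == list(unsorted)

# Rewriting the data from a sorted dict makes the stored order sorted too.
resorted = pickle.loads(pickle.dumps(dict(sorted(restored.items()))))
```

Running `keys_are_sorted` over each existing pickle file would show which ones are out of order and need to be regenerated before the sort can be removed.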

Need way to make random associations in PdsFile

From rms-webtools created by rfrenchseti: SETI/rms-webtools#16

Occultation profiles are associated with a large number of raw data products. There needs to be a way to associate the profile with the products (and vice versa!) so that when the user goes to download the profile, they can also download the raw data. This will probably be stored in some kind of index file in the metadata directory for each affected volume.
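One possible shape for such an association, sketched under heavy assumptions: the index is taken to be a list of (profile, raw product) pairs, the paths are placeholders, and `build_associations` is a hypothetical helper rather than an existing PdsFile function.

```python
from collections import defaultdict


def build_associations(pairs):
    """Build lookups in both directions from (profile, raw_product) pairs."""
    profile_to_raw = defaultdict(set)
    raw_to_profile = defaultdict(set)
    for profile, raw in pairs:
        profile_to_raw[profile].add(raw)
        raw_to_profile[raw].add(profile)
    return dict(profile_to_raw), dict(raw_to_profile)


# Placeholder paths; a real index would be read from the metadata directory.
pairs = [
    ("profiles/p1.tab", "raw/r1.dat"),
    ("profiles/p1.tab", "raw/r2.dat"),
    ("profiles/p2.tab", "raw/r2.dat"),
]
profile_to_raw, raw_to_profile = build_associations(pairs)
```

Keeping both directions in memory makes the "and vice versa" lookup as cheap as the forward one, at the cost of storing every pair twice.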

Code coverage shows potential bugs/missing items in tests

From rms-webtools created by rfrenchseti: SETI/rms-webtools#52

  • rules/COCIRS_xxx.py never executes the loop at 1108 or the if at 1116.
  • rules/COUVIS_0xxx.py doesn't exercise various branches in DATA_SET_ID()
  • rules/COVIMS_0xxx.py doesn't exercise various branches in OPUS_ID_TO_PRIMARY_LOGICAL_PATH()
  • filename_keylen is never tested
  • tests/test_pdsfile_blackbox.py: the clause at line 1271 is never executed
  • tests/test_pdsfile_blackbox.py: the loop at line 3042 never starts
  • tests/test_pdsfile_whitebox.py: the loop at line 877 never starts

Update PdsFile to PDS4

From rms-webtools created by rfrenchseti: SETI/rms-webtools#54

PdsFile and its associated systems (e.g. build/validate shelf files, parse and store index files) need to be updated to PDS4. This is a placeholder issue. Over time, as the scope is better understood, it can be expanded into issues for each stage.
