mango's People

Contributors

bonnarel, gilleslandais, jesusjuansalgado, lmichel, loumir, mcdittmar, molinaro-m, pdowler

mango's Issues

Core parameter for the model

Some of the parameters supported by the model could be tagged as core parameters to help catalog discovery in an archive.
This feature would play a role similar to that played by ObsCore for observation datasets.
This issue was raised by Mr Arviset during the Source DM session on 8/5/2020.

Source and AssociatedData

I'm working on a new example case where I'm annotating multiple models (which is not actually relevant to this ticket).
In that example I want to take CSC data with:

  • TABLE - containing Master Source records = mango:Source instances (one per Source)
  • TABLE - containing Detection records = mango:Source instances (one per Observation)

I'd then like to associate the Detections for each Master source record.

  • I'm not sure if this is a Provenance relation or AssociatedData, but for this exercise I am using AssociatedData
    • and VOModelInstance flavor

The issue is that there is no singular model element to place at ModelInstance. The Source class is for a single record.
If I have 7 detections for Source.id=12345, I'd technically need 7 AssociatedData nodes (1 per detection). It may be worth adding a Catalog class which contains a collection of Source.
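
A minimal Python sketch of the Catalog idea follows. The `Catalog` class, the `master_id` field and the `detections_for` helper are hypothetical names chosen for illustration, not elements of the Mango model. The point is only that one AssociatedData node could target a single collection instead of 7 individual Source instances.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    """Stand-in for mango:Source; one instance per table row."""
    identifier: str
    master_id: str = ""  # hypothetical link from a detection to its master source


@dataclass
class Catalog:
    """Hypothetical container: one model element holding many Source instances."""
    name: str
    sources: List[Source] = field(default_factory=list)

    def detections_for(self, master_id: str) -> "Catalog":
        """Return the sub-catalog of detections belonging to one master source."""
        matches = [s for s in self.sources if s.master_id == master_id]
        return Catalog(name=f"{self.name}[{master_id}]", sources=matches)


detections = Catalog(
    name="detections",
    sources=[Source(f"det-{i}", master_id="12345") for i in range(7)]
    + [Source("det-other", master_id="99999")],
)
# A single collection object to reference, instead of 7 separate nodes.
associated = detections.detections_for("12345")
```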

Parameter: suggested alternative

I strongly suggest that the Parameter class be renamed.
The term is already used in VOTable and Provenance with different meanings.. and Mango has at least some overlap with each of those domains.

I'd like to suggest the term "Property", which is typically used when talking about these entities (even in the dm-usecases descriptions).

Parameter: description

I'm not sure if this issue belongs here, or in the workshop (or both).. it pertains to both modeling and the annotation.

Parameter.description: ivoa:string[1]

Model:

  • first, the multiplicity is 1 for an element which is often missing from serializations
  • as I understand it, things like 'description' are not typically modeled elements. They do not DEFINE the object being modeled, but are merely ancillary information which may or may not be present for the benefit of a human 'reader'.
    • that does not mean that implementations would not/should not have 'description' attributes which convey any text included in the serialization. Just that whether or not this exists is a requirement on the application and has no consequence to the execution of a science thread.
  • I realize this is a grey area, so am not pushing for removal, but think it is worth a conversation
    • notice that Measure and Coordinate types do not have 'description' attributes while their VOTable counterparts often do include some sort of description, however trivial (eg: from standard properties case; "Attribute managed by Saada".)

Annotation

  • as a modeled element (Parameter.description) one would like to be able to annotate it directly as a COLUMN or CONSTANT, referencing the existing description of the corresponding FIELD or PARAM (or GROUP or TABLE). However, in VOTable, DESCRIPTION is just an element of other types, and cannot have an ID to reference.

Discovering dataset mapped with CAB-MSD

The doc should mention a way to discover, with a TAP query, the parameters that are available for each of the published source catalogues.
This issue was raised by Mr Arviset during the Source DM session on 8/5/2020.
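
One possible route, sketched below in Python, is to query the standard `TAP_SCHEMA.columns` table, which every TAP service exposes. It assumes that the column UCDs, units and descriptions are enough to identify the mapped parameters, which the issue does not establish. The table name `public.my_sources` is a placeholder.

```python
def columns_query(table_name: str) -> str:
    """Build an ADQL query listing the columns (with UCD and unit) of one table."""
    safe_name = table_name.replace("'", "''")  # escape quotes for the ADQL string literal
    return (
        "SELECT column_name, ucd, unit, description "
        "FROM TAP_SCHEMA.columns "
        f"WHERE table_name = '{safe_name}'"
    )


# Placeholder table name; a real service would publish its own.
query = columns_query("public.my_sources")
```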

Model evolution proposal

After many discussions, it appears that the way MANGO flattens all properties (formerly Parameter) is being questioned.

The schema below shows a possible categorization of the properties.

Please comment.

[Figure: mango_evolv(2)]

VOModelInstance relation to ModelInstance

The current Mango model AssociatedData thread for including model instances (ModelInstance) uses a composition relation.
I think this should be a reference relation.

  • the Mango object does not own the associated object
  • It is another form of referencing the target object
    • WebEndpoint: references by uri
    • VOModelInstance: references the embedded instance
  • the description for VOModelInstance
    • "Reference to a VO model instance that is part of the associated data."
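
The difference can be sketched in Python as follows. All class and field names (`ref`, `resolve`, the `registry` dictionary) are illustrative assumptions. With a reference relation, VOModelInstance only holds an identifier and the target instance lives elsewhere, so several links can point at the same object.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ModelInstance:
    """Stand-in for an embedded instance of some VO model."""
    instance_id: str
    model: str


@dataclass
class WebEndpoint:
    uri: str  # references the target by URI


@dataclass
class VOModelInstance:
    ref: str  # references the embedded instance by identifier (hypothetical field)

    def resolve(self, registry: Dict[str, ModelInstance]) -> ModelInstance:
        """Look the target up; the Mango object does not own it."""
        return registry[self.ref]


registry = {"cube-1": ModelInstance("cube-1", "cube")}
link_a = VOModelInstance(ref="cube-1")
link_b = VOModelInstance(ref="cube-1")
```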

add a header paragraph before the requirements subsection 2.2

explaining the parameter and associated data split in section 2.1

It is difficult to grasp why parameters are distinguished from associated data, and to see how they differ in nature.
It's a key point of the model that must be clearly explained.

Relate this to existing material about sources.

Parameter: content

Sorry.. this is a long one.

The Source -> Parameter relation is very similar to the Cube model’s NDPoint -> Observable relation.
In Cube, each Observable owns a Measure instance, and adds knowledge of whether this is ‘dependent’ or ‘independent’ data.
But here, each Parameter ends up taking the place of VODML role and type from a formally modeled Source object.
ie: instead of Source.position:Position[*] we have Source with Parameter{semantic=“source position”, ucd=“pos.eq”}

I understand that this model is trying to be generic, and specifically NOT model Source explicitly, so I think having the Source hold a collection of Property-s is a good mechanism. But instead of this providing access to various types of Properties, it has become something that lets you build proxies for things which are not formally modeled, which I think is outside the model scope.

Parameter.semantic:

  • Replaces VODML role (ie: attribute/relation name)
    • I’m concerned that this approach of replacing model elements with semantic vocabularies is going to be a maintenance problem for us, and an implementation problem for clients. If the Parameter.semantic vocabulary can be “totally free as long as it is published” then there is nothing fixed for clients to cue off of to know how to interpret any given Parameter. In other words, different vocabularies can/will define different terms for the same concept, which is the sort of problem our standards are supposed to be solving.

Parameter.ucd:

  • Replaces VODML type (ie: expected Type of the ‘value’)
  • Has the benefit of facilitating the use of concepts with no formally modeled Measure type; “phys.magField”, “phot.mag”..
    • I’ll note that I believe this is Markus’ argument for not having specialized Measure types at all, but only a single Measurement with a semantic tag to identify its nature (ala ucd).
    • In my opinion, this form may be fine for a serialization, but it makes it VERY difficult to specify dependencies/constraints in the models
      • If ucd = “pos.eq” then associated Coordinate SpaceFrame MUST have referenceFrame=“ICRS|FK4|FK5” and Spherical coordinate space
  • Has the vulnerability of being a consistency problem
    • If ucd = “pos.eq” and the measure is “meas:Position but in GALACTIC”, the client will have to handle the inconsistency
    • If ucd = “phot.mag” and the measure is “meas:GenericMeasure”, the client STILL needs to do all the work to determine if the GenericMeasure content is compatible with “phot.mag” type. If they are doing that, then they can identify it as a “phot.mag” without the prompt. NOTE: doing this MAY mean drilling down to the VOTable element, and checking the UCD on the PARAM|FIELD.. noticing that it is “phot.mag”
  • Having the ucd here does not solve the GenericMeasure problem, since it does not help identify dependent metadata
    • If Parameter.ucd = “phot.mag” or “phot.flux” there should/must be an associated “photDM.PhotCal” instance.. how do they know that? where would they find it? This exact scenario is in the TimeSeries workshop use case.
    • I don’t think these sorts of associations are for this model to solve. It is basically constructing model elements.
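
The consistency check that falls to the client can be sketched in Python as below. The rule table is illustrative only; it holds the single “pos.eq” rule from the example above, and the function name is hypothetical.

```python
from typing import Optional

# Illustrative rule table; the allowed frames come from the "pos.eq" example in the text.
UCD_FRAME_RULES = {"pos.eq": {"ICRS", "FK4", "FK5"}}


def ucd_is_consistent(ucd: str, reference_frame: Optional[str]) -> bool:
    """Return False when the UCD implies a frame that the measure does not use."""
    allowed = UCD_FRAME_RULES.get(ucd)
    if allowed is None:
        return True  # no rule known for this UCD, nothing to check
    return reference_frame in allowed
```

A Parameter with ucd “pos.eq” holding a GALACTIC position fails this check, and the model gives the client no guidance on what to do next.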

Parameter.measure:

  • Is the parameter value, which may or may not be of the type identified in the ucd
    • This can be a good thing ( qualifying GenericMeasure as “phot.flux” or “phys.magField” )
    • Or a consistency problem ( ucd=“pos.eq” with measure=Time )
  • There is only 1 option here.. Parameter contains Measure
    • The model text describes that there are other kinds of parameters ( flags, assigned states, classifications ). By only having a Measurement option, the model has improperly extended Measure and Coordinate for these data. That will be another ticket, but I think there is work to do here on how to handle non-measure properties.

Proposal:

  • I would suggest splitting the Parameter into sub-classes
    • Parameter: abstract parent. contains reference to associated parameter if that is needed (haven’t looked into that use case)
    • PhysicalParameter: extends Parameter, contains Measure instance
    • Classification: extends Parameter, contains a vocabulary literal (VocabularyTerm)
      • removes need for VocabMeasure and VocabCoordinate which are not proper extensions of those models
    • Flag: extends Parameter, contains what basically amounts to a user-defined enumeration value
      • value = integer (OK to start, but in Chandra we have bit array flags where each bit represents a different issue )
      • options = pointer to what is currently defined as FlagSys
      • Removes FlagCoord, FlagSys becomes local class as part of Flag Property spec, not extension of CoordSys
  • None of these would have ‘semantic’ or ‘ucd’ attributes to qualify the value.
    • In the PhysicalParameter we’d need to have a discussion on how to handle the complex unmodeled Measure types.
      • The ‘simple’ ones, can be handled by clients interpreting units and/or the underlying VOTable element ucd.

ValueRange proposal

Right now and taking into account the open issues, we can split measures into distinct categories:

  • those linked with a coordinate frame
  • those linked with a photometric calibration
  • those linked with a limited set of possible values
  • the standalone measures

Looking at the XMM data, it turns out that many measures, in any category, are valid for a specific range of a given quantity.
The perfect example for this pattern is given by the energy bands.
Most of the XMM catalogue columns (and Chandra as well) are related to one energy band.

  • These energy bands must be described by the model
  • They must be attached to the relevant measures

My proposal is to allow any Mango measure to point to, e.g., a ValidityRange object.

  • This association would be optional
  • ValidityRange would be applicable to any sort of physical quantity
  • It would contain the following fields (min, max, ucd, unit)
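
A minimal Python sketch of this proposal is given below. The `Measure` stand-in, the `validity` attribute name and the `contains` helper are assumptions, and the band limits are illustrative values rather than values taken from a specific catalogue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValidityRange:
    """The four fields listed in the proposal: min, max, ucd, unit."""
    min: float
    max: float
    ucd: str
    unit: str

    def contains(self, value: float) -> bool:
        """Hypothetical helper: is `value` (in `unit`) inside the range?"""
        return self.min <= value <= self.max


@dataclass
class Measure:
    """Stand-in for any Mango measure."""
    value: float
    unit: str
    validity: Optional[ValidityRange] = None  # the association is optional


# Illustrative energy band shared by several measures.
soft_band = ValidityRange(min=0.2, max=0.5, ucd="em.energy", unit="keV")
flux = Measure(value=1.3e-14, unit="erg/s/cm2", validity=soft_band)
rate = Measure(value=0.02, unit="ct/s", validity=soft_band)
```

Because the range is a separate object, several measures can point to the same band instead of each repeating its limits.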

Meas/Coord extensions

Sorry for the long list here, I think that by trying to push everything under Measure, this section got rather complicated.
We may need to tackle these in waves as we hit the cases in implementation.
Some are already mentioned in Issue #26.

  • Flag
    • is a non-measure type; suggest modeling as a FlagProperty with status:integer[1] and options: reference to FlagSet?[1] (current FlagSys)
    • Flag as Measure becomes Flag as Property
    • FlagSys becomes FlagSet or FlagSpec or something. Local object not extension of coords:CoordSys
    • removes FlagCoord
  • VocabGenericMeasure
    • is a non-measure type; I think this covers the "Classification" type Properties, so suggest modeling as Classification object with value:VocabularyTerm[1] as a generic representation.
    • VocabGenericMeasure as Measure becomes Classification as Property
    • removes VocabCoordinate
  • StringGenericMeasure
    • is a non-measure type; I'm not sure what use case this serves, but value is a simple string, so maybe targeting various string-type columns in catalogs? A specific suggestion would need review of the use case, but I expect if this is needed, it maybe should be served by a sub-class of Property
    • removes StringGenericMeasure, StringCoordinate
  • HardnessRatio
    • is a derived value, so qualifies as a Measure. However, the value is a dimensionless ratio, and has no associated CoordSpace, so is not a Coordinate.
    • I suggest keeping HardnessRatio as a Measure, but collapsing the content which is specific to the Ratio nature
      • value:real[1] == ratio
      • low_band: reference to photDM:PhotometryFilter? (see below)
      • hi_band: reference to photDM:PhotometryFilter?
    • removes HardnessRatioCoord, HardnessRatioSys, and HardnessRatioFrame
  • PhotFilter
    • obviously a stub for photDM classes, the text says "compliant with photDM". I don't see a match, but may be out of sync with current work there.
      • ZeroPoint and MagnitudeSystem are elements of photDM:PhotCal
      • name and bandwidth are elements of photDM:PhotometryFilter
      • not sure what 'unit' maps to.
    • in the end, this should go away in favor of importing photDM.. this should be clear in the document
  • LonLatSkyPosition
    • we've had some discussion on this.. interest in restoring Space-based coordinates
    • ultimately, these objects should merge into the meas:Position and coords:Point objects
    • but even at this point
      • LonLatPoint:
        • longitude and latitude should be Quantity types (they have units)
        • can refer to coords:SpaceSys which defaults to SphericalCoordSpace; no need for LonLatCoordSys
  • Redshift
    • was a concept trimmed from earlier iterations on Meas/Coords to minimize its scope.
    • my understanding is that the modeling of this will depend on which type of Redshift we are talking about
      • z = dimensionless ratio (so OK as real) so, like HardnessRatio maybe is not a Coordinate.
      • with several options for how this is determined
    • at this point, I might suggest this is handled similarly to HardnessRatio
    • as a Measure, it would suffice to extend GenericMeasure and constrain the coord to be unitless.
  • Photometry
    • gets us into the magnitude, flux, luminosity thread.. perhaps best a separate topic
    • but as a Measure, these values are Quantities, so maybe best represented as a specialized GenericMeasure
      • PhotometryCoord extends PhysicalCoordinate (Photometry constrains coord to PhotometryCoord)
        • note, this way the coord can serve all three quantities.
      • remove PhotometryCoordSys (no added value)
      • extend coords:GenericFrame to a PhotFrame which references a photDM:PhotCal instance
      • So Photometry has a PhotometryCoord whose coordSys contains a PhotFrame
  • Orbit
    • there is no spec here, but I assume this is a complex object rather than a single coord measure, so at this point I'd suggest leaving Orbit as extension of Measure (empty), until the case gets worked.
    • remove OrbitCoord
  • Shape
    • a region.. the STC Regions package is a separate beast which has not been refactored to date.
    • should not be extending Measure/Coord
    • my expectation has always been that Regions would use Coords (for the vertices, etc) though the DALI Shape-s work has given me cause to wonder about that. (each vertex doesn't need to reference the coordsys, but the region as a whole does).
  • ObjectTypeCoord/ObjectTypeSys
    • I don't see any usage of these.. and object type sounds like an enumeration/Classification so maybe handled by that?
    • remove

Model Overview - mangoInstance

This is mostly for me to work the thread of making a change to the document.

In Section 3: Model Overview, the paragraph references "mangoInstance" in several places. From the diagram, it looks like this should be "MangoObject".

ModelInstance

This element seems to be a 'catch-all' for any modeled data product.
In that regard, it is very similar to Markus' recommendation, where there is an undescribed class where you then "put appropriate object here".
