We have to implement a way to control which SED goes in the catalog. <p dir="auto"

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Well, gamma_cat_dataset.yaml is written down in <code

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Control which SED goes in the catalog about gamma-cat HOT 6 OPEN

cdeil commented on August 18, 2024

Control which SED goes in the catalog

from gamma-cat.

Comments (6)

pdeiml commented on August 18, 2024

I read through the code which makes the collection and the catalog and I have a few suggestions.

We have https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_dataset.yaml which is intended to control which datasets (tev-*.yaml) go into the collection. But - as you can see in https://github.com/gammapy/gamma-cat/blob/master/gammacat/collection.py#L275 - it is not used to control it. The datasets which are being written to the collection are searched by going through the info.yaml files.
I think this is not the idea of https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_dataset.yaml
Moreover, https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_dataset.yaml is not used for the seds and lightcurves, too.

Therefore, my suggestion is the following:
We keep the CollectionMaker as it is, meaning that the list of leds, seds and datasets which go into the collection are produced by scanning through the info.yaml files and the folders, respectively (I don't see a reason why a sed, lc, dataset which exists in the input should not go to the collection).
To control which sed, lc, dataset goes to the catalog, we use https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_dataset.yaml for the datasets and we write two new yaml-files gamma_cat_lightcurve.yaml and gamma_cat_sed.yaml.
Another way to implement it whould be to only have one file for the control of the catalog which could be formatted like this (the references are arbitrary, I only want to show the format):

source_id: 1
status: source
sed: 2013ApJ...764...38A
lc: 2015ApJ...456..40H
dataset: 2017A&A..56...40A

from gamma-cat.

cdeil commented on August 18, 2024

@pdeiml - gamma_cat_dataset.yaml is the configuration file that controls the content of the catalog.

You can see that it's used from catalog.py here:
https://github.com/gammapy/gamma-cat/search?utf8=%E2%9C%93&q=gamma_cat_dataset.yaml+&type=

Of course, a fix for this issue and generally improvements to the catalog making / configuration are very welcome! If you take on this task, please make small pull requests that improve code / docs / content of the catalog that are easy and quick to review for me. (and not a large restructuring / rewrite, together with a fix for this issue)

Since it seems it wasn't clear to you how the catalog is built, starting by improving the contributor docs (and maybe renaming gamma_cat_dataset.yaml to gamma_cat_catalog_config.yaml or something) on that point would probably be the best first step.

For users, the most useful improvement (apart from fixing this issue which SED is filled in the catalog) would be to explain what's in the catalog here: https://gamma-cat.readthedocs.io/data/catalog.html

Again, I would suggest to make small improvements, e.g. just to link to https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_columns.yaml and say that those columns are available, or add a link to https://github.com/gammapy/gamma-cat/blob/master/input/gammacat/gamma_cat_dataset.yaml and explain that this controls the content of the catalog.

from gamma-cat.

pdeiml commented on August 18, 2024

Well, gamma_cat_dataset.yaml is written down in catalog.py but if you take a closer look to the CatalogMaker, you see in https://github.com/gammapy/gamma-cat/blob/master/gammacat/catalog.py#L576 that not the gamma_cat_dataset.yaml in the input folder is used but the one in the output folder which is basically a sum of all output resources.
This means that in fact input/gamma_cat_dataset.yaml is not used by the catalog (I am sure that make.py will break as soon as I delete the file but it is not really used).

But what about my suggestion above with one single file for all types of data and that we copy the datasets by scanning through the folders?

from gamma-cat.

cdeil commented on August 18, 2024

That sounds like a regression bug then. Initially when I wrote and used this, gamma_cat_dataset.yaml from the input folder was used. Can you start by making a PR fixing this?
In this case, please also update this output file (or all catalog output files) so that we can see what changes in the catalog:
https://github.com/gammapy/gamma-cat/blob/master/output/gammacat.yaml

It would also be good to have some checks, e.g. you could add a test_catalog.py here:
https://github.com/gammapy/gamma-cat/tree/master/gammacat/tests
which reads the catalog and puts some asserts to make sure it's filled OK.
Especially "regression tests", i.e. some information / number that is currently not filled correctly, is a good candidate to add an assert line.

But what about my suggestion above with one single file for all types of data and that we copy the datasets by scanning through the folders?

I don't understand. IMO the way it should work is that only input folder is scanned once to make an input index file. Then the output collection is made, starting with the input index. Then the catalog is made, from the output collection (via the output index), plus the config file what to put in the catalog. I thought it already worked that way; if not, please fix.

@pdeiml - I'm very busy this week, but if there's still questions, I'd prefer a 10 min call over Github back and forth.

from gamma-cat.

pdeiml commented on August 18, 2024

I will take a closer look to this but I think that is a regression bug, yes.

I totally agree with you about the procedure at the end of your comment but I think that is not working like that. I will look at this problem as well the next days.

from gamma-cat.

cdeil commented on August 18, 2024

@pdeiml - Thank you!

from gamma-cat.

Control which SED goes in the catalog about gamma-cat HOT 6 OPEN

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent