fink-science-portal's Introduction

Fink Science Portal

The Fink Science Portal allows users to browse and display alert data collected and processed by Fink from a web browser: https://fink-portal.org.

The backend uses Apache HBase, a distributed non-relational database. The frontend is based on Dash, a Python web framework built on top of Flask, Plotly and React. The frontend also integrates components to fit the data, such as gatspy for variable stars, pyLIMA for microlensing, and the imcce tools for Solar System objects.

Backend structure

After each observation night, the data is aggregated and pushed into Apache HBase tables. The main table contains all alert data processed by Fink since 2019-11-01. As of 01/2024, this represents more than 217 million alerts collected, of which about 147 million are scientifically valid (8.0 TB). The main table is indexed by the objectId of the alerts and the emission date jd.

To allow multi-indexing with HBase, we create index tables. These tables are indexed along different properties (time, sky position, classification, ...). They contain the same number of rows as the main table but fewer columns. The index tables are used to search quickly along arbitrary properties and isolate interesting candidates, while the main table is used to display the final data.
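As a rough illustration of the index-table idea, here is a toy in-memory sketch. Plain dictionaries stand in for HBase tables, and the row-key layouts, alert values, and helper names (`main_key`, `prefix_scan`, `search_by_class`) are assumptions made for this example, not the actual Fink schema or client API.

```python
from bisect import bisect_left

# Toy alerts; identifiers and values are made up for the example.
ALERTS = [
    {"objectId": "ZTF00aaaaaaa", "jd": 2459000.5, "class": "SN candidate", "magpsf": 18.2},
    {"objectId": "ZTF00bbbbbbb", "jd": 2459001.5, "class": "Microlensing candidate", "magpsf": 17.1},
    {"objectId": "ZTF00aaaaaaa", "jd": 2459002.5, "class": "SN candidate", "magpsf": 18.0},
]


def main_key(alert):
    """Hypothetical main-table row key: objectId followed by jd."""
    return f"{alert['objectId']}_{alert['jd']}"


# Main table: one row per alert, all columns.
main_table = {main_key(a): a for a in ALERTS}

# Index table: same number of rows, keyed by class first, storing only
# what is needed to find the row in the main table.
class_index = {f"{a['class']}_{a['jd']}_{a['objectId']}": main_key(a) for a in ALERTS}


def prefix_scan(table, prefix):
    """Return the keys starting with prefix, mimicking a scan over sorted row keys."""
    keys = sorted(table)
    hits = []
    for key in keys[bisect_left(keys, prefix):]:
        if not key.startswith(prefix):
            break
        hits.append(key)
    return hits


def search_by_class(name):
    """Use the index table to find candidates, then read full rows from the main table."""
    return [main_table[class_index[key]] for key in prefix_scan(class_index, name + "_")]
```

Calling `search_by_class("SN candidate")` scans only the index rows sharing that prefix, then fetches the matching full rows, which is the two-step pattern described above.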

We developed custom HBase clients to manipulate the data efficiently (Lomikel, FinkBrowser, more information here).

Tests

You can test the REST API using:

./run_tests.sh --url https://fink-portal.org

The tests folder contains many examples of how to use the REST API.

Deployment

The portal has been tested on Python 3.11. Other versions might work.

Local deployment

If you want to deploy on your machine for test purposes, you can follow the tutorial. Note that a Dockerfile should be ready at some point.

Production

The frontend is hosted on the VirtualData cloud at Université Paris-Saclay, France. To deploy it, just edit config.yml:

APIURL: https://fink-portal.org
IP: fink-portal.org
PORT: 24000
HBASEIP: hbase-1.lal.in2p3.fr
ZOOPORT: 2183
SCHEMAVER: "schema_3.1_5.0.0"
tablename: ztf

and the launch is supervised by gunicorn:

gunicorn index:server -b :24000 --workers=4

In practice we also use a reverse-proxy (nginx).

fink-science-portal's People

Contributors

aflp91, anaismoller, fusroman, julienpeloton, karpov-sv, quentincdr

fink-science-portal's Issues

Early SN class

Add a limit on ndethist<20 for early SNe class. Thanks!

Run SuperNNova on full lightcurve

The SNN scores are computed for each alert, hence using only partial information. In the science portal, we aggregate data, so the user should be able to re-run SNN on all aggregated data.

Number of digits in examples with coordinate formats

Once the coordinates are converted to hours, minutes, seconds for alpha and degrees, arcminutes, arcseconds for delta, the number of digits after the decimal point cannot be the same: alpha needs one more digit.
For example, 13h 09m 47.706s -23d 23' 01.79''
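The extra digit follows from the units: one second of time in alpha spans 15 arcseconds, so alpha needs one more decimal to reach a comparable precision. A minimal sketch of such a formatter is shown below; the function name `to_sexagesimal` and its `dec_digits` parameter are hypothetical, and the portal may well rely on an astronomy library instead.

```python
def to_sexagesimal(ra_deg, dec_deg, dec_digits=2):
    """Format decimal degrees as 'HHh MMm SS.sss +DDd MM' SS.ss'''.

    The seconds of alpha are printed with one more decimal than the
    arcseconds of delta.
    """
    ra_digits = dec_digits + 1

    # Work in integer units of the last printed digit, so that rounding
    # can never produce a field such as 60.000 s.
    ra_scale = 10 ** ra_digits
    ra_units = round(ra_deg % 360.0 / 15.0 * 3600.0 * ra_scale) % (24 * 3600 * ra_scale)
    hours, rest = divmod(ra_units, 3600 * ra_scale)
    minutes, sec_units = divmod(rest, 60 * ra_scale)
    ra = f"{hours:02d}h {minutes:02d}m {sec_units / ra_scale:0{ra_digits + 3}.{ra_digits}f}s"

    dec_scale = 10 ** dec_digits
    dec_units = round(abs(dec_deg) * 3600.0 * dec_scale)
    degrees, rest = divmod(dec_units, 3600 * dec_scale)
    arcmin, arcsec_units = divmod(rest, 60 * dec_scale)
    sign = "-" if dec_deg < 0 else "+"
    dec = f"{sign}{degrees:02d}d {arcmin:02d}' {arcsec_units / dec_scale:0{dec_digits + 3}.{dec_digits}f}''"

    return f"{ra} {dec}"
```

With the coordinates of the example above, `to_sexagesimal(197.448775, -23.383830556)` gives `13h 09m 47.706s -23d 23' 01.79''`.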

Identify Failed xmatch

If the CDS xmatch service is down, we skip the initial xmatch step and continue with the other modules. The cdsxmatch field of each affected alert then contains Fail, but it is not directly searchable.

Action item: Add an entry in the class dropdown button with the Fail category

Under construction warning for mobile view

If the user accesses the web portal from a mobile phone, there should be a warning saying that the mobile website is under construction, or that we advise using the desktop version, just to show that we are aware of the situation.

Cutouts not displayed correctly

When opening an object page, one or several of the cutouts are not displayed correctly. I think the error comes from the fact that the task is multi-threaded (plus different callbacks), and the client that retrieves the cutouts does not like it...
Action item: gather all db calls in the same callback.

Multi-index search

Currently, users can only perform search along one axis (time, space, class or objectId). Ideally, we need to combine different axes (time & space, or space & class...). For a non-relational database, this is tricky, and we might have to create complex index tables based on pre-defined needs.

Extend the representation of different coordinates formats for the conesearch

Comment from Maria:

Is it possible to extend the representation of different coordinates formats? 
Astronomers not only like to use them in a form "hh:mm:ss and dd:mm:ss", 
but also "hh mm ss and dd mm ss" or "275d11m15.6954s+17d59m59.876s", etc. 
This can be useful if we are looking for the alerts on a place of some known objects 
(e.g., dwarf novae) and the coordinates of this object are copied from some astronomical 
catalog or database. Unfortunately, the coordinate format is not homogeneous among 
them. The more formats will be available, the easier it will be for a researcher.
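A minimal sketch of a more permissive parser for the three formats quoted above is given here. It is only an illustration: the function names are hypothetical, decimal degrees are deliberately not handled, and alpha is assumed to be in hours unless its part contains a `d` unit marker.

```python
import re

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def _sexagesimal_to_decimal(part):
    """Convert one 'a b c' sexagesimal triplet (any separators) to a decimal value."""
    sign = -1.0 if part.strip().startswith("-") else 1.0
    fields = [float(x) for x in _NUMBER.findall(part)]
    if len(fields) != 3:
        raise ValueError(f"expected 3 sexagesimal fields in {part!r}")
    return sign * (fields[0] + fields[1] / 60.0 + fields[2] / 3600.0)


def parse_coordinates(text):
    """Return (ra_deg, dec_deg) from strings such as
    'hh:mm:ss dd:mm:ss', 'hh mm ss dd mm ss' or '275d11m15.6954s+17d59m59.876s'.
    """
    text = text.strip()
    # The declination starts at its explicit sign when there is one...
    sign = re.search(r"[+-]", text[1:])
    if sign:
        cut = sign.start() + 1
    else:
        # ...otherwise at the fourth number of the string.
        numbers = list(_NUMBER.finditer(text))
        if len(numbers) != 6:
            raise ValueError(f"cannot split {text!r} into alpha and delta")
        cut = numbers[3].start()
    ra_part, dec_part = text[:cut], text[cut:]

    ra = _sexagesimal_to_decimal(ra_part)
    # Assumption: alpha is given in hours unless it carries a 'd' (degrees) marker.
    if "d" not in ra_part.lower():
        ra *= 15.0
    return ra, _sexagesimal_to_decimal(dec_part)
```

Splitting on the sign of delta first keeps compact strings without whitespace parseable; the fallback on the fourth number covers unsigned northern declinations.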

aesthetic of the Supernova tab

  • in the legend below the probability score plot, the label Random Forest should read Early SN score (to be more homogeneous with the other 2 labels).

  • it would be nice to add more vertical space between the plots and between plots and axis labels, to avoid lines on top of each other.

Bulk download: choosing columns of interest

It should be possible to choose the fields we want to download when performing a bulk download (to reduce the bandwidth and the complexity for the user). Note that bulk download is not yet publicly available.

Update classification of alerts

We use a very simple algorithm to combine individual science module outputs and assign the final classification. We should at least follow what was published in the paper...! This will mostly affect SN & microlensing.

Action item: update utils.extract_fink_classification with paper criteria.

Color evolution in the explorer page

We should add color evolution. Two quantities to consider:

# assuming r and g measurement at epoch t
r_minus_g(t) = mag_r(t) - mag_g(t)

# at last epoch with 2 filters
r_minus_g(t=last) = mag_r(t=last) - mag_g(t=last)

# last epoch - first detection
delta(r_minus_g) = r_minus_g(t=last) - r_minus_g(t=first)
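The quantities above could be computed along these lines. This is a sketch under a simplifying assumption: each band is given as a mapping from epoch to magnitude, and only epochs present in both bands are used. The function name and the input layout are hypothetical; real alerts would first need r and g measurements paired by epoch (for example by night).

```python
def color_evolution(mag_r, mag_g):
    """Compute r - g at common epochs, its last value, and its change since the first.

    mag_r, mag_g: dict mapping epoch -> magnitude in the r and g bands.
    """
    # Epochs with a measurement in both filters, in chronological order.
    common = sorted(set(mag_r) & set(mag_g))
    colors = [(t, mag_r[t] - mag_g[t]) for t in common]
    if not colors:
        return {"r_minus_g": [], "last": None, "delta": None}

    first, last = colors[0][1], colors[-1][1]
    return {"r_minus_g": colors, "last": last, "delta": last - first}
```

When the two bands share no epoch, the sketch returns `None` for both summary values rather than guessing a colour.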

External link to ASAS-SN

# general search
https://asas-sn.osu.edu/?ra=190.82411&dec=29.68735

# variable star search
https://asas-sn.osu.edu/variables?ra=190.8241088&dec=29.6873522&radius=0.5&vmag_min=&vmag_max=&amplitude_min=&amplitude_max=&period_min=&period_max=&lksl_min=&lksl_max=&class_prob_min=&class_prob_max=&parallax_over_err_min=&parallax_over_err_max=&name=&references[]=I&references[]=II&references[]=III&references[]=IV&references[]=V&references[]=VI&sort_by=raj2000&sort_order=asc&show_non_periodic=true&show_without_class=true&asassn_discov_only=false&

# photometry search
https://asas-sn.osu.edu/photometry?utf8=%E2%9C%93&ra=190.8241088&dec=29.6873522&radius=0.5&vmag_min=&vmag_max=&epochs_min=&epochs_max=&rms_min=&rms_max=&sort_by=raj2000

Xmatch page

We should have an xmatch page based on the .pixel table. Users would upload a file, and we would gather all corresponding pixels and return the results. Current performance:

neighbour pixels (i.e. 1 object with increasing conesearch radius)

radius (arcsec)   # pixels   query time (s)
1.5               6          0.03
60                4534       4
180               39772      130

disconnected pixels (N objects with fixed radius = 1.5 arcsec)

# objects   # pixels   query time (s)
50          431        0.5
100         865        0.6
1000        8640       5.4
10000       too many   too long

We should limit the number of objects to 1000.

display Fink probabilities for each point in the light curve

It would be nice to have the Fink probabilities displayed on the right change when the user clicks on different points in the light curve.

They would then correspond to the probabilities of a specific alert (similar to what is already implemented with the cutouts).

Add score for variable stars

gatspy internally computes a score (model.score(period)). Not sure how this can be translated into a chi^2/dof.
Note that we are using

https://github.com/astroML/gatspy/blob/a8f94082a3f27dfe9cb58165707b883bf28d9223/gatspy/periodic/lomb_scargle_multiband.py#L205-L209

def _score(self, periods):
    # Total score is the sum of powers weighted by chi2-normalization
    powers = np.array([model.score(periods) for model in self.models_])
    chi2_0 = np.array([np.sum(model.yw_ ** 2) for model in self.models_])
    return np.dot(chi2_0 / chi2_0.sum(), powers)

so we can easily derive the chi2 using np.sum(model.yw_ ** 2). What remains is to determine the number of degrees of freedom (nterm_base + nterm_band + period?)
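One possible route, offered as an assumption to check against the gatspy source rather than a confirmed property: if the score behaves like the standard normalised Lomb-Scargle power, P = 1 - chi2(period) / chi2_0, with chi2_0 the reference value np.sum(model.yw_ ** 2) summed over bands, then the chi2 of the best fit is chi2_0 * (1 - P). The helper below is hypothetical, and the number of fitted parameters is left to the caller since it is still to be determined.

```python
def reduced_chi2(score, chi2_0, n_points, n_params):
    """Reduced chi2 from a normalised periodogram score.

    Assumes score = 1 - chi2 / chi2_0, where chi2_0 is the chi2 of the
    reference (constant) model. n_params is the number of fitted
    parameters, to be chosen by the caller.
    """
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return chi2_0 * (1.0 - score) / dof
```

For instance, a score of 0.75 with chi2_0 = 200, 55 points and 5 parameters gives 200 * 0.25 / 50 = 1.0.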

[Bug] Card ID in the summary view displays wrong class

The class computation in the card ID is based on extract_fink_classification_single which calls extract_fink_classification. Unfortunately the call to the last one contains a typo: twice d:mulens_class_1... leading to wrong display!

[Supernova]: Display of scores does not follow upper limits

The ML scores are for valid measurements. Hence, when the lightcurve is in units of Difference Mag, the x-axes of the two plots are not aligned... (it works fine in DC mag or DC flux that do not contain upper limits).

Action item: Force the x-axis to be always the same.

Displaying upper limits

We only show valid measurements for the lightcurve, but recorded non-detections could be useful to e.g. constrain the SN-subtyping and prioritisation for followup (thanks M. Smith!).

Summary page with statistics

It would be nice to have a statistics page, accessible from the home page, showing the global footprint as in the paper along with a summary of the alerts processed.

Add external link to SDSS

We can easily perform cross-match with SDSS using

http://skyserver.sdss.org/dr13/en/tools/chart/navi.aspx?ra=190.82411&dec=29.68735
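A small helper could build this link from an object's coordinates, together with the ASAS-SN general search link requested in another issue. The base URLs are the ones quoted in these issues; the function name `external_links` and the five-decimal formatting are assumptions made for this sketch.

```python
from urllib.parse import urlencode

# Base URLs as quoted in the issues.
SDSS_NAVI = "http://skyserver.sdss.org/dr13/en/tools/chart/navi.aspx"
ASASSN_SEARCH = "https://asas-sn.osu.edu/"


def external_links(ra, dec, digits=5):
    """Return external survey links for a position given in decimal degrees."""
    query = urlencode({"ra": f"{ra:.{digits}f}", "dec": f"{dec:.{digits}f}"})
    return {
        "SDSS": f"{SDSS_NAVI}?{query}",
        "ASAS-SN": f"{ASASSN_SEARCH}?{query}",
    }
```

Calling `external_links(190.82411, 29.68735)["SDSS"]` reproduces the URL shown above.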

[bug] No microlensing fit if only r band

We usually have 2 bands for the fit (g & r). If there is only one band, the fit is still performed. For g only, that is fine, but for r only, there is an index mismatch when plotting data measurements (the second band becomes first!) and the fit is not done.

Microlensing: no display when only 1 band is available

When only one band is available, the display fails with the error:

File "/root/miniconda/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/root/miniconda/lib/python3.7/site-packages/dash/dash.py", line 1059, in dispatch
    response.set_data(func(*args, outputs_list=outputs_list))
  File "/root/miniconda/lib/python3.7/site-packages/dash/dash.py", line 994, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "/home/centos/fink-science-portal/apps/plotting.py", line 738, in plot_mulens
    'x': [convert_jd(t, to='iso') for t in normalised_lightcurves[1][:,0]],
IndexError: list index out of range

See e.g. http://134.158.75.151:24000/ZTF20aavtedl
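The traceback shows a positional access (normalised_lightcurves[1]) that fails when the second band is missing. One way to avoid this class of error is to key the data by band and build one trace per band actually present. The sketch below illustrates the idea only: the function names and the (jd, mag, fid) input layout are hypothetical and this is not the code in plotting.py; the filter ids follow the ZTF convention (1 = g, 2 = r, 3 = i).

```python
from collections import defaultdict

# ZTF filter ids.
BAND_NAMES = {1: "g", 2: "r", 3: "i"}


def split_by_band(measurements):
    """Group (jd, mag, fid) tuples by band name; absent bands simply do not appear."""
    by_band = defaultdict(list)
    for jd, mag, fid in measurements:
        by_band[BAND_NAMES[fid]].append((jd, mag))
    return dict(by_band)


def build_traces(measurements):
    """Build one plot trace per available band, never indexing by position."""
    return [
        {"name": f"{band} band", "x": [p[0] for p in points], "y": [p[1] for p in points]}
        for band, points in sorted(split_by_band(measurements).items())
    ]
```

With r-only data this returns a single trace labelled `r band`, so neither the plot nor the fit has to assume that g comes first.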

GitHub link should point to fink-science-portal

The current GitHub link points to fink-broker instead of fink-science-portal!

# building the navigation bar
dropdown = dbc.DropdownMenu(
    children=[
        dbc.DropdownMenuItem("About", href="/about"),
        dbc.DropdownMenuItem("Fink Website", href="https://fink-broker.org/"),
        dbc.DropdownMenuItem(
            "GitHub",
            href="https://github.com/astrolabsoftware/fink-broker"
        ),
        dbc.DropdownMenuItem(
            "Fink documentation",
            href="https://fink-broker.readthedocs.io/en/latest/"
        ),
    ],
    nav=True,
    in_navbar=True,
    label="Explore",
)

Thanks @aflp91!

Public API

Develop necessary routines for users to access the data via an API

display alert classes by highest probability

In the explorer page, whenever someone chooses one of the classes in the drop down menu the list should show alerts in decreasing order of probability.

In this case the user would see the most probable candidates first, and build confidence/intuition for subsequent searches.

Using State instead of Input for explorer

In explorer, we wait for the user to click on the submit button to trigger an action. The code uses a trick to manage it:

# in explorer.py
# Trigger the query only if the submit button is pressed.
changed_id = [p['prop_id'] for p in dash.callback_context.triggered][0]
if 'submit_query' not in changed_id:
    raise PreventUpdate

But that is only needed because we use Inputs. If we use States instead, the callback will be triggered only when the button is pressed. Action:

-@app.callback(
-    Output("table", "children"),
-    [
-        Input("submit_query", "n_clicks"),
-        Input("objectid", "value"),
-        Input("conesearch", "value"),
-        Input('startdate', 'value'),
-        Input('window', 'value'),
-        Input('class-dropdown', 'value')
-    ]
-)
-def construct_table

+@app.callback(
+    Output("table", "children"),
+    [
+        Input("submit_query", "n_clicks"),
+    ],
+    [
+        State("objectid", "value"),
+        State("conesearch", "value"),
+        State('startdate', 'value'),
+        State('window', 'value'),
+        State('class-dropdown', 'value')
+    ]
+)
+def construct_table

and remove the piece of code above.

Put in place annotation mechanism

We have a column family a: for annotation. We should let users upload their comments through the API (PUT) and via a secured channel. For example, user toto would annotate an alert with a:toto -> str

Reset button and not found message

  • whenever a given search does not find anything, there should be a message: "nothing found"

  • It would be nice to add a reset button to the search page, and a message informing the user that the search needs to be re-initiated after it returns no alerts.
