riparias / gbif-alert

GBIF Alert is a GBIF occurrence based alert system.

Home Page: https://gbif-alert-demo.thebinaryforest.net/

License: MIT License

Python 72.83% HTML 7.95% JavaScript 0.41% Vue 14.57% TypeScript 3.76% CSS 0.05% Dockerfile 0.18% Shell 0.05% Scheme 0.20%

biodiversity biodiversity-data biodiversity-informatics django gbif invasive-species webapp

gbif-alert's Issues

Add GitHub repo link to website

Option to filter on validation status

Subsequent question: how to find/interpret this data in DwC.

(on the data publishing side, we should make sure all data providers use the same field and vocabulary. What about data providers that are not part of Riparias?)

Species to be visualized and selectable for the early warning system

As discussed with @timadriaens, we should not only visualize the key species for RIPARIAS, but also the species of EU concern. You can find the list here: https://github.com/trias-project/indicators/blob/master/data/input/eu_concern_species.tsv

It seems this list could be updated soon (~December) with 33 new species. We are going to update the file above as soon as this happens.

Implement user management

User registration
Profile page
Lost password

Tip:

Let's use a custom user model early (so we have flexibility later)

To investigate:

How to deal with authentication in frontend/Vue.js-based components?

Single slide to present the architecture

(the harvester retrieves data daily from the field collection tools and publishes it to GBIF; the dashboard also refreshes its data from GBIF daily)

Test suite for the occurrence import mechanism

This would be useful since it's an important/central part of the tool, and it's probably getting slowly more complex over time.

Data import: maintenance mode during data update

There's now a mechanism in place to automatically refresh data by downloading from GBIF.

During this process, the database spends a few minutes in a messed up state (occurrence duplicates because new data is added before the previous one is deleted, ...).

We need to show the website as temporarily unavailable during this time to avoid presenting incorrect data to users. I'm considering django-maintenance-mode for that.

Additional considerations:

the data imports will probably be run at night, to limit interference with the users
~~infrastructure: we'll need an alert (mail? others) for urgent intervention if the website stays for too long in maintenance mode (error during the data import process: GBIF outage, bug, ...)~~ (not needed thanks to transactions)
we need to take precautions so user alerts (mail for new observations, ...) are not sent when the website is in maintenance mode
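
The struck-through item above mentions transactions. A minimal sketch of that idea, using the standard-library sqlite3 module as a stand-in for the project's actual database (the table and column names are made up for illustration): the delete and the re-insert happen in one transaction, so a failed import leaves the previous data in place.

```python
import sqlite3


def replace_occurrences(conn, new_rows):
    """Swap the whole content of the occurrence table inside one transaction."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute("DELETE FROM occurrence")
        conn.executemany(
            "INSERT INTO occurrence (gbif_id, species) VALUES (?, ?)", new_rows
        )


# Hypothetical, simplified schema (not the project's real one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrence (gbif_id INTEGER PRIMARY KEY, species TEXT)")
replace_occurrences(conn, [(1, "Procambarus clarkii")])

# A failing import (duplicate primary key here) is rolled back as a whole.
try:
    replace_occurrences(conn, [(2, "Lithobates catesbeianus"), (2, "duplicate")])
except sqlite3.IntegrityError:
    pass

rows = conn.execute("SELECT gbif_id, species FROM occurrence").fetchall()
```

With this approach readers of the table see either the old data or the new data, never the intermediate state.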

Implement alert mechanism

Large task, to be done later based on user feedback. First ideas in #9; collecting ideas and requirements here.

Bram confirmed that people will need to draw their own polygons for the geographic selection, but also that having MULTIPOLYGON support (= several polygons for a single alert) would be great, as would shapefile(?) upload (some users already have a shapefile with the multiple tiny zones they manage).
Bram: observers' contact information will be a problem: it is often needed for management, but some people don't want to share it, privacy/GDPR must be respected AND it's not a good fit for GBIF. How to reconcile all this?
Bram: e-mail frequency should be configurable: you want higher frequency for animals than plants (+ user personal preferences).

Occurrence map: replace the hexagon grid by circles

Request by @peterdesmet: make it more similar to https://hp-theme.gbif-staging.org/occurrence?view=MAP.

Circle colour and radius would both reflect the occurrence density

JS bundle size

The inde-bundle.js file generated by Webpack is currently 4.5 MB. We should soon investigate how to reduce it.

Same login/accounts as other Riparias webapps

That's something that popped up during our meeting with BBPF

Experiment with Pinia for a global application state

pass fewer props everywhere, cleaner code

what about TypeScript integration?
what about a Vue 3 update?

Auto-deploy on demo server on each push?

Should be fairly easy to implement with a GitHub action that can execute commands via SSH

Possible database issues with regards to frequently deleted/recreated occurrences

Something to investigate soon, first questions that come to mind:

risk of running out of IDs
performance issues (need to run VACUUM, ANALYZE, ...?)

TimeLine chart: replace Chart.js for the bar chart?

We're currently using Chart.js, which works okayish but might cause us trouble in the future, for example:

Can we customise the look as much as we want?
Can we dynamically change the color of individual bars? (to reflect the selected range when moving the slider)
Can we get rid of #37 without too much hassle?

APIs: throw errors if unexpected parameters

Some of our API endpoints accept GET parameters, for example for filtering (datasetsIds[]=1, ...)

Currently, if we make a typo and pass an unexpected parameter (datasetIds[] for example), the parameter is simply ignored. It would be a better/more defensive approach to throw a (500?) error, with a clear error message.

This is probably not necessary for parameters that are automatically managed with Django routes (I believe errors are already thrown, to be checked), such as "api/tiles/hexagon-grid-aggregated/<int:zoom>/<int:x>/<int:y>.mvt"
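
A minimal sketch of such a check, independent of Django so it can be tried in isolation. The function name and the list of allowed parameters are hypothetical. Note that HTTP 400 (Bad Request) is the conventional status for a client-side mistake such as a misspelled parameter, while 5xx codes signal server faults.

```python
from urllib.parse import parse_qs

# Hypothetical list of parameters accepted by one endpoint.
ALLOWED_PARAMS = {"datasetsIds[]", "speciesIds[]", "startDate", "endDate"}


def find_unexpected_params(received, allowed=ALLOWED_PARAMS):
    """Return the sorted names of query parameters the endpoint does not know."""
    return sorted(set(received) - set(allowed))


# "datasetIds[]" is the typo from the text above ("datasetsIds[]" is expected).
query = parse_qs("datasetIds[]=1&speciesIds[]=2")
unexpected = find_unexpected_params(query)
if unexpected:
    error_message = "Unexpected parameter(s): " + ", ".join(unexpected)
```

In a view, a non-empty result would be turned into an error response carrying `error_message`.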

Dev server: more production-like environment

serve over HTTPS
turn Django debugging off
make sure JS/assets are served in a production-ready way (minified, through Nginx or WhiteNoise, ...)
Enable some reporting mechanism (e-mail? status page?) so we can be efficiently informed of incidents (error 500, data import issues, ...)

Data import: missing dataset names

While working on the data import for the dataset models, it appears some datasets are loaded into the database with a proper GBIF key, but without a name.

After investigation, it appears the datasetName value is missing from the downloaded Darwin Core Archive (for some records/source datasets at least).

Possible solutions:

report the issue to GBIF and see whether it is considered normal behaviour; if not, wait for a fix.
Use the GBIF API to retrieve the dataset name based on the dataset key
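
A rough sketch of the second option, assuming the GBIF registry endpoint `https://api.gbif.org/v1/dataset/{key}` and its `title` field are what we want to use as the dataset name. The function names are made up for illustration, and error handling (timeouts, unknown keys) is left out.

```python
import json
from urllib.request import urlopen

GBIF_DATASET_URL = "https://api.gbif.org/v1/dataset/{key}"


def dataset_url(dataset_key):
    """Build the registry URL for a given GBIF dataset key."""
    return GBIF_DATASET_URL.format(key=dataset_key)


def extract_dataset_name(payload):
    """Return the dataset title from a JSON response body, or None if absent/empty."""
    return json.loads(payload).get("title") or None


def fetch_dataset_name(dataset_key, timeout=10):
    """Network call: ask GBIF for the dataset and return its title."""
    with urlopen(dataset_url(dataset_key), timeout=timeout) as response:
        return extract_dataset_name(response.read())
```

The import could call `fetch_dataset_name()` only for datasets whose name is empty, so the extra requests stay limited.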

Test suite: group model creation in class method "setUpTestData"

So data is more global (to all tests in a given class) rather than having exceptions in each test function.

Can be done incrementally, and maybe it shouldn't be done in all cases:

test_api.py
test_maps.py
test_pages.py

Map view: switch from hexagons to individual occurrences at high zoom level

The map currently shows occurrences aggregated over a hexagon grid. That works great when there are many occurrences on the map.

I was wondering if we should tweak this system so at high zoom levels (i.e. neighbourhood) it would automatically switch to a more classic "one point per occurrence" view.

Any opinion?

We need a proper, shareable URL for a given occurrence

Important remark from @timadriaens (people will share by email)

Mockup / presentation of the alert mechanism

Data import displayed time: shifted by 2 hours

Probably due to server time VS user time issues.
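
To illustrate the suspected cause: a timestamp stored in UTC and formatted without conversion shows up two hours early for a user in Belgium during summer time. The sketch below uses a fixed UTC+2 offset as a simplifying assumption, and the import time is made up. Real code would use a proper time zone (for example via `zoneinfo` or Django's time zone support) so daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone

# Simplifying assumption: Belgian summer time as a fixed UTC+2 offset.
BRUSSELS_SUMMER = timezone(timedelta(hours=2), "CEST")

# Hypothetical import time, stored in UTC on the server.
import_time_utc = datetime(2021, 10, 1, 3, 0, tzinfo=timezone.utc)

# What gets displayed if the UTC value is formatted as-is...
shown_unconverted = import_time_utc.strftime("%H:%M")

# ...versus what the user expects to see.
shown_local = import_time_utc.astimezone(BRUSSELS_SUMMER).strftime("%H:%M")
```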

Architecture explanations

I'd like to have a short paragraph explaining the scope, architecture, and where to find the different pieces of the puzzle, for example:

explain that we're currently targeting the early alert system AND the publishing to global infrastructures.
show the updated architecture diagram
say that this repository focuses on the right-hand part, and that the left-hand part is handled in https://github.com/riparias/occurrences-publishing

Where to write that exactly: I'm not sure. I would like it to be immediately visible to someone who visits our GitHub repositories (so that person can navigate them), but adding it to both READMEs means duplication. Add the paragraph to one of the READMEs and link to it from the other? Use the GitHub wiki so we can have global (not tied to a specific repository) documentation?

@damianooldoni: would you be able to draft something? Do you have an opinion about the best place for that?

JS source maps

Tons of warnings in the Chrome console:

Solving this could make things easier for debugging, and a less bloated dev console would definitely be more comfortable.

Get a nicer URL for dev website

instead of http://54.75.164.69/:

@damianooldoni: any preference/suggestion?

Show application version somewhere

So it's easier to immediately spot which code is running on a given instance

preferably: git commit SHA
where: footer? about page?

Improve species selector

Some ideas:

Select multiple species simultaneously
Group species in <select> (per taxonomic group?, spread status?, ... )
Show more data in <select>? (vernacular name? others?)
(if the species list becomes longer): typeahead feature?

What is needed? What is the priority of this?

Implement date range filtering

Ideally with a graph showing the number of occurrences per month (or week?)
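
The data behind such a graph is a simple count per (year, month). A minimal sketch with made-up dates (the function name is hypothetical):

```python
from collections import Counter
from datetime import date


def occurrences_per_month(dates):
    """Count occurrences per (year, month), in chronological order."""
    counts = Counter((d.year, d.month) for d in dates)
    return dict(sorted(counts.items()))


# Illustrative dates only.
sample = [date(2021, 9, 14), date(2021, 9, 30), date(2021, 10, 2)]
histogram = occurrences_per_month(sample)
```

Months without occurrences are absent from the result, so the frontend would need to fill the gaps with zeros before drawing bars.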

Find and add shapefiles river basins units and subunits

@niconoe: where should I put these files? In which format do you prefer to have these files?

How to harvest data hosted locally

@niconoe: at least one important data provider (CR Jette) collects its data locally as a GIS project on a laptop (data-wise, "CR Jette" = one person). Once the guidelines about GBIF, DwC requirements etc. are sent, I would like to contact CR Jette (ideally at the end of October) about the technical part, suggesting the easiest and most automated way to harvest such GIS data.

@niconoe: do you have suggestions about it? Thinking out loud: what do you think about creating a URL endpoint under the riparias.be umbrella? We cannot share such data openly via GitHub as they contain sensitive information.

Alternatively, we discuss it with CR-Jette directly during a short meeting without proposing any solution in advance.

GitHub actions: only deploy to test server if all tests pass?

Data issue: duplicate in GBIF download?

The following download was generated by the application: https://www.gbif.org/occurrence/download/0022344-210914110416597

Apparently it contains a duplicate (same gbifId: 334518077 and same content).

Report to GBIF?
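
Before reporting, the duplicate can be confirmed with a quick check on the occurrence file of the download. This sketch assumes a tab-delimited file with a `gbifID` column. The sample rows are made up, apart from the ID mentioned above.

```python
import csv
import io
from collections import Counter


def duplicated_gbif_ids(tsv_file, id_column="gbifID"):
    """Return the sorted list of identifiers that appear more than once."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = Counter(row[id_column] for row in reader)
    return sorted(gbif_id for gbif_id, n in counts.items() if n > 1)


# Illustrative content only; a real check would open the downloaded file.
sample = io.StringIO(
    "gbifID\tspecies\n"
    "334518077\tSpecies A\n"
    "334518077\tSpecies A\n"
    "100000001\tSpecies B\n"
)
duplicates = duplicated_gbif_ids(sample)
```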

Comments mechanism

Suggestion by @timadriaens: it would be great to have a comment system where users can leave a comment on a given occurrence.

Pay attention to the data update process: since observations are deleted and recreated every day, comments should be moved from one observation to another during the process.
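
One possible way to do that move, sketched with plain dictionaries. It assumes the gbifId of an occurrence stays stable between two imports (to be verified), and all names are hypothetical. Comments whose occurrence disappeared from the new import are returned separately so we can decide what to do with them.

```python
def remap_comments(comments, old_pk_to_gbif_id, gbif_id_to_new_pk):
    """Re-attach comments to the recreated occurrences, using gbifId as stable key.

    Returns (moved, orphaned): comments pointing at their new occurrence, and
    comments whose occurrence is absent from the new import.
    """
    moved, orphaned = [], []
    for comment in comments:
        gbif_id = old_pk_to_gbif_id.get(comment["occurrence_pk"])
        new_pk = gbif_id_to_new_pk.get(gbif_id)
        if new_pk is None:
            orphaned.append(comment)
        else:
            moved.append({**comment, "occurrence_pk": new_pk})
    return moved, orphaned


# Illustrative data: occurrence 10 is recreated as 57, occurrence 11 is gone.
comments = [
    {"occurrence_pk": 10, "text": "Confirmed in the field"},
    {"occurrence_pk": 11, "text": "Probably a misidentification"},
]
moved, orphaned = remap_comments(comments, {10: 334518077, 11: 42}, {334518077: 57})
```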

Use official project color in nav bar

A detail. The official colour of the project is 00a58d in hex notation (see badge colour in README for example). @niconoe : could you replace the blue colour with this one? Thanks.

How much "integration" in terms of code / infrastructure between this webapp and other RIPARIAS developments?

I suspect this ticket will stay open for some time since I feel we don't have enough information now to make a sound decision about this. It's however clear that some integration/communication will have to take place between the two tools developed by INBO and the one developed by BBPF.

In practice, this integration can take place in many different ways: API communication between different web applications, shared Vue components, common Django backend for everything, ...

There are good reasons for a tight integration (common accounts - #48, common page design, ...), but also good reasons to keep some level of separation and avoid building a huge monolith (different timelines/release cycles, keeping web apps smaller so they can evolve more easily/independently, ...). Where to put the cursor exactly: I don't know, but that's a question we should keep in mind.

The split into different GitHub repositories (or not!) will have to reflect that, so I think it's a secondary, related question.

Occurrences: Import/store/show more data fields

We are importing for each occurrence (only checked are done):

gbifId
source dataset
species name (from taxonKey with a fallback to acceptedTaxonKey)
date (from year, month, day)
location (from decimalLatitude / decimalLongitude)
organism quantity and possible absence (a combination of individualCount, organismQuantity, organismQuantityType and occurrenceStatus)
basisOfRecord
locality / municipality

Feedback welcome: this list can be gradually completed and implemented (let's think about priorities: more important fields first)
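
For the date field in the list above, a minimal sketch of building a date from the separate year, month and day values, assuming they arrive as strings that may be empty or missing. The function name is hypothetical, and returning None for incomplete dates is only one possible policy.

```python
from datetime import date


def build_date(year, month, day):
    """Return a date, or None when a part is missing or the combination is invalid."""
    try:
        return date(int(year), int(month), int(day))
    except (TypeError, ValueError):
        # TypeError: a part is None; ValueError: empty/non-numeric or impossible date.
        return None


complete = build_date("2021", "9", "14")
incomplete = build_date("2021", "", None)
impossible = build_date("2021", "2", "30")
```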

Show occurrence counters (with currently selected filters)

Implement some frontend test infrastructure

We currently have standard Django tests, which are run on each push thanks to GitHub actions.

It would be very useful to have something in place to help test the frontend components (whole pages? Vue components? Use Django or Vue tooling?)

TimeLine chart crashes if filters (species selection for example) change too fast

First investigation: this is apparently due to Chart.js itself, which acts weird when a chart is destroyed and recreated too fast (web workers/animations still running in the background). Possible interactions with Vue lifecycle / event management.

Not fixing now since it's not that easy, and we might drop Chart.js for something else.

Vue.js: consider using components library

We're currently using stock Bootstrap for our UI needs, but it appears more and more clearly that we'll probably build a sophisticated frontend app.

It might then be helpful to use a components library such as PrimeVUE, Quasar Framework, element-plus, ...

That would be an important change, a few considerations:

The earlier we change, the easier
We'd like to have as many components as possible
We want full Vue 3 compatibility and good TypeScript support
We want a well-maintained library that'll continue to exist in the future
We need something that integrates well into Django-rendered templates (i.e. outputs Bootstrap code that fits in a standard Bootstrap grid as generated by a Django template). Something specially designed for SPAs wouldn't work here.

Implement "occurrence details" page

That'll be a central part of the tool, but it's pretty easy to build a basic version now and it'll help present the proposed alert mechanism to stakeholders (#9) soon.

Implement per-dataset filtering

Try to upgrade Vue.js to v3

This would be future proof, and the earlier we try, the easier it should be

Show infos for single observation: pop-ups?

If a user is interested in an observation, he/she should be able to get some details about it. I was thinking about pop-ups.

Some details I think we absolutely need to show, as they are relevant for the field managers:

timestamp
coordinate uncertainty
validation status
validator
link to image(s) if any: field managers are keen to check photographs before going to the fields

@timadriaens : what do you think?