mrmap-community / mrmap Goto Github PK

Spatial Service Registry

Home Page: https://mrmap.rtfd.io/en/master/

License: MIT License

Python 81.45% HTML 4.15% Shell 1.86% Dockerfile 0.77% Gherkin 11.76%

geopython spatial spatial-service-registry

mrmap's Introduction

Mr. Map is a generic registry for geospatial data, metadata, services and their describing documents (e.g. web map services WMS, web feature services WFS and all the other OGC stuff. A similar project maybe Hypermap-Registry - but it lacks off many things which are needed in practice. The information model was inspired by the old well known mapbender2 project and somewhen will be replaced by mrmap.

Since most GIS solutions out there are specified on a specific use case or aged without proper support, the need for an free and open source, generic geospatial registry system is high.

MrMap runs as a web application atop the Django Python framework with a PostgreSQL database. For a complete list of requirements, see requirements.txt. The code is available on GitHub.

Documentation

The complete documentation for MrMap can be found at github page.

Discussion

GitHub Discussions - Discussion forum hosted by GitHub; ideal for Q&A and other structured discussions

Installation

Please see the documentation for instructions on installing MrMap.

Providing Feedback

The best platform for general feedback, assistance, and other discussion is our GitHub discussions. To report a bug or request a specific feature, please open a GitHub issue using the appropriate template.

If you are interested in contributing to the development of MrMap, please read our contributing guide prior to beginning any work.

mrmap's People

Contributors

Stargazers

Watchers

Forkers

majo72 terrestris marcjansen inspiregeo fabiansuck nursix

mrmap's Issues

Dedicated Task Queues

With an increasing number of apps using the background workers, we should think about creating dedicated queues for certain tasks. This should ensure that crucial tasks will not be blocked by time consuming lower priority tasks (e.g. metadata validation).

I suggest we should at least create dedicated queues for metadata validation and service registration, respectively.

What do you think @hollsandre ?

The user should be able to decide delivery strategy and mode

#Status quo
If the user configure a spatial restriction on access editor, mr. map delivers only the content within the areas which the user has defined.

Furthermore the user can disable some layers to prevent them from delivery. This results in hard shaped content like @mipel described it in his lecture on fossgis @ min 16:40.

#Enhancement

Since the user can only define an area which will be delivered by mr. map, we should refactor this by give the user the opportunity to define the strategy how mr. map delivers out the content:
- deny-all-outside --> current way
- allow-all-outside --> additional way
Additionally the user should have the choice, how the restricted data will be presented in state of delivery:
- cut out --> current way. The content is hard shaped.
- gaussian soft focus --> The content which is not delivered in cut out mode will be delivered with gaussian soft focus on it

Example usage of gaussian soft focus: Peek 2020-03-19 07-51.gif
It shows the nuclear facility Marcoule in France.

Refactoring of views

Status quo

Currently we implement the logic for http get/post of index and details views by our self. This is not a django common way to deploy views.

improvement

Django provides a better way to deploy standard view logic with generic views. With this framework we only need to declare some classes to specify the table and filter for example and the framework will do anything else for us. So we get rid of all prepare table functions, configure pagination functions and other custom code.

Refactoring: Update service views

Status quo

Currently there are four different update service views.

new_pending_update_service to start a update
pending_update_service to show the current pending update
dismiss_pending_update_service to dismiss a pending update
run_update_service to start async task for updating

Refactoring

Refactor the views as generic views as done in #493

ModuleNotFoundError: No module named 'vine.five'

Status quo

On a fresh installed debian 10 we have struggle with the current celery package celery==4.3.0

Running runserver command runs in to python-BaseException:

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/jonas/Git/MrMap/MrMap/__init__.py", line 5, in <module>
    from .celery_app import app as celery_app
  File "/home/jonas/Git/MrMap/MrMap/celery_app.py", line 12, in <module>
    from celery import Celery, task
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/celery/__init__.py", line 153, in <module>
    from . import local  # noqa
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/celery/local.py", line 17, in <module>
    from .five import PY3, bytes_if_py2, items, string, string_t
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/celery/five.py", line 7, in <module>
    import vine.five
ModuleNotFoundError: No module named 'vine.five'
python-BaseException

Refactoring: csw/views as generic views

Status quo

As discussed in #493, it is better to use generic views instead of using function views.

Refactoring

Refactor all the following views as generic view:

get_csw_results
harvest_catalogue

Installation script

Status quo

We have a detailed instruction on how to setup MrMap.

Enhancement

We need an installation script, which runs all the commands that can be found in the current instruction.

Please note

The adding of cgi to the apache configuration should only be done, if the configuration is missing. We do not want multiple configurations in the same file for cgi.

move codebase to subfolder

Proposed Changes

move MrMap codebase to a subfolder called src or main

Justification

Sonarcloud has trouble by separating test from source files:

ERROR: Error during SonarScanner execution
ERROR: File tests/test_data.py can't be indexed twice. Please check that inclusion/exclusion patterns produce disjoint sets for main and test files
ERROR: 
ERROR: Re-run SonarScanner using the -X switch to enable full debug logging.

we get a compact project page, where the readme shows quickly up

@jansule Take care of this. I will do it tomorrow. So better wait for it to avoid merge conflicts.

Translations

Status quo

We got a lot of new text snippets due to new frontends in the past weeks. Somehow the translations (de) have been forgotten to be done.

Improvement

Add the missing translations in one bulk!

Migrate to Bootstrap 5

Status quo

We are using bootstrap version 4 as frontend css framework in our project.
Bootstrap 5 is in the starting blocks and the version 5.0.0 is released as a first beta.

What's new

jQuery was removed
Switch to Vanilla JavaScript
Responsive Font Sizes
Drop Internet Explorer 10 and 11 support
Change of gutter width unit of measurement
Removed Card Decks
Navbar Optimization
Custom SVG icon library
Switching from Jekyll to Hugo
Class updates

Migration

Take a look on the migration guide
~~Take also care about pipy package django-bootstrap4 which we can't use after migration~~ use new package django-bootstrap5

Benefits

I see the following advantages for which it is worth migrating

We could get rid of jQuery
We could get rid of FontAwesome and use the svg icons from the bs5 library
We could use responsive font sizes
We could use fullscreen modals
Accordion component comes with a toggle icon. So we doesn't need use custom css
New checkbox switches are available

Rename Codebase Directory

Proposed Changes

It would be nice to rename the directory with the codebase from mrmap to src.

Justification

It's a widely followed convention.

Check reference system parsing

Bug

I registered the resource http://geo5.service24.rlp.de/wms/karte_rp.fcgi?REQUEST=GetCapabilities&SERVICE=WMS&VERSION=1.1.0

For layer Wald 0 (identifier wald_00), there is one reference system with the following data:
prefix: EPSG:31466 EPSG:31467 EPSG:25832 EPSG:4326 EPGS:4258 EPSG:
code: 3857

see:

Todo

Check the parsing. The parser should seperate the reference systems correctly in single ReferenceSystem objects.

Fix OGCOperationRequestHandler to work with CRS namespace

Bug

I'am working currently on the issue #514 and i found a bug in the OGCOperationRequestHandler. If some gis clients starts requesting a resource, for example with the SRS param like SRS=CRS:84, the OGCOperationRequestHandler currently can't handle the CRS namespace.

I added a todo on line 149 in this commit

Todo

implement a logic which converts from the CRS namespace to the EPSG namespace.

If we convert the CRS namespace to EPSG namespace, we can simply use the GEOS API to make lookups in our database.

Duplicate MonitoringResults at on MonitoringRun

Environment

Python version: 3.7
MrMap version: b214ce3

Steps to Reproduce

Register Karte RP http://geo5.service24.rlp.de/wms/karte_rp.fcgi?SERVICE=WMS&VERSION=1.3.0&REQUEST=GetCapabilities
Run health checks

Expected Behavior

Non duplacted MonitoringResult objects should be generated.
A MonitoringResult object shall be unique together for attributes monitored_uri, monitoring_run.

Observed Behavior

Multiple calls for the same uri on a specific run for a specific metadata object happens.

I added a unique_together = ("metadata", "monitored_uri", "monitoring_run") constraint, to avoid saving duplicates in the database.

Celery raises multiple times psycopg2.errors.UniqueViolation errors like:

[2021-02-18 12:30:46,976: ERROR/ForkPoolWorker-7] duplicate key value violates unique constraint "monitoring_monitoringres_metadata_id_monitored_ur_389f4983_uniq"
DETAIL:  Key (metadata_id, monitored_uri, monitoring_run_id)=(bc4b95d3-008c-48fc-b18d-42d449695213, http://geo5.service24.rlp.de/wms/karte_rp.fcgi?REQUEST=GetCapabilities&VERSION=1.3.0&SERVICE=wms, 10af31f7-73d2-4acc-a323-dd9163df69cc) already exists.
Traceback (most recent call last):
  File "/home/jonas/git/MrMap/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "monitoring_monitoringres_metadata_id_monitored_ur_389f4983_uniq"
DETAIL:  Key (metadata_id, monitored_uri, monitoring_run_id)=(bc4b95d3-008c-48fc-b18d-42d449695213, http://geo5.service24.rlp.de/wms/karte_rp.fcgi?REQUEST=GetCapabilities&VERSION=1.3.0&SERVICE=wms, 10af31f7-73d2-4acc-a323-dd9163df69cc) already exists.

@jansule i dont get it why the run_manual_monitoring() task runs in to the described behavior. It also seems that the task runs endless, if an error occurs.

Dataseteditor: Additional related objects are empty

Bug

We provide the feature to add more relations to a dataset from other metadatas by user input in our dataseteditor.

The form field Additional related objects should show all related objects which are added by user. However, after the user adds one new relation, saved the dataset and reopend the dataseteditor, the Additional related objects field is empty.

Fix

The Additional related objects field should always contains all relations which are added by the user.

Implement object based permissions

Environment

Python version: 3.7
MrMap version: v0.0.0

Proposed Functionality

The built in django permission handling should be enhanced by a object level permission handling from django-guardian.

remove get_services()
remove get_services_as_qs()
remove get_metadatas_as_qs()
remove get_datasets_as_qs()
implement PermissionListMixin for all ListViews
change from django PermissionRequiredMixin to PermissionRequiredMixin

Use Case

security benefit

Since #52, the permission handling is only model based. For example, a user could delete a group if he has the structure.remove_mrmapgroup permission in any case. We need also permission handling on object level. This means a user shall only be able to delete a group, if he has specific permissions for this specific group.

filter querysets benefit

With the PermissionListMixin the user will only see object for that he has permissions.

Database Changes

add signals to create permissions on object creation
implement dependency workflow on delete group with the following options:
- If a group has dependencies (objects created_by the group), the group can not be deleted. The user will be forced to decide what to do with his objects. Two options:
  1. move all objects to new group
  2. remove all objects

External Dependencies

django-guardian

Docker Image for Developers

Greetings!

I just remembered the last meeting when the possibility of a development container beeing created was mentioned.

Is that still planned?

Atom Feed Client

Atom Feed

The atom feed itself is technically the successor of RSS feed and was introduced in 2005. RSS and atom are pretty similar to each other, whilst atom is capable of e.g. providing embedded HTML, which can produce nicer rendering on client side. RSS is only capable of plain text, without any formatting or further embedded HTML content.

Basically an atom feed may look like this

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Beispiel Feed</title>
  <link href="http://www.beispiel.org/"/>
  <updated>2005-12-13T17:30:01Z</updated>
  <author>
    <name>Hannes Schmidt</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
    <title>Atom Feed Beispiel</title>
    <link href="http://beispiel.org/2005/12/13/atom05"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-12-13T17:30:01Z</updated>
    <summary>Textzeile</summary>
  </entry>

</feed>

Atom Feed and INSPIRE

Since 2012 INSPIRE expects services to be downloadable using something different, than a desktop gis client (like QGSI). Somehow, the solution ended up being a simple web client, which receives an atom feed url, and extracts information based on the given feed.

The user is then able to download the data, which is provided by the underlying service of the atom feed.

Implementations

There are numerous implementations of this technique out there. Just a few examples

Geoportal Sachsen - Orthografie Atom Feed
- User has to download the whole service's data at once, no further selection of smaller parts
Geoportal RLP - Atom Feed Client
- User is able to select - if provided by client - smaller parts of a service, such as tiles of a regular WMS

Since we are working on a mapbender like successor, we may follow the current Geoportal RLP approach. We may use other techniques and implement further enhancements

GDI-DE technical guidance

The GDI-DE has published a document regarding the implementation of atom feed client. It may be useful to read beforehand.

Atom Feed GDI examples

Geoportal Sachsen with multiple entries (I'm not sure what exactly the differences are for each entry)
Geoportal RLP

Requirements

A new django app has to be created called atom
The atom feed client landing page is callable using a /atom-feed route
1. Additional parameters are
  1. q (query)
    - optional
    - holds multiple keywords, separated using + for the minimal catalogue
    - example: /atom-feed?q=test+keyword+search
The atom feed client for a specific service is callable using /atom-feed/<id>
The client is not embedded into the MrMap UI, but - just as the HTML Metadata - accessible for everyone, even guest users
A new model has to be created called AtomFeedDownload, holding the following attributes
1. id (UUID - no auto increasing integer)
2. metadata (ForeignKey, no related_name)
3. zip_uuid (UUID)
4. timestamp (DateTime)

Atom Feed Generator

The atom feed documents will not be pre-calculated. Since they are rather small, they will be generated on request
A new AtomFeedGenerator has to be created, which may be placed in the project folder /service/helper/atom
1. this generator must generate a xml document, based on the available information from a metadata record
The catalogue API has to be extended
1. Next to the existing attributes of a catalogue metadata entry, another attribute called atom_feed_url must be set
2. atom_feed_url shall follow the structure /service/metadata/<id>/atom-feed
3. A call of this atom_feed_url must generate the corresponding atom feed document

Landing page of client (minimal catalogue)

If none of the optional parameters are given, the user will be greeted by a field input, just like a search bar
1. Any input on this search bar will perform a request on /api/suggestion, which returns a list of possible related keywords, ordered by relevance. Further information on how to use this api route can be found here
2. When a search is started using a given input, a request on /api/catalogue will be performed, which delivers catalogue results, based on the given input. Further information on how to use this api route can be found here
3. Each result is listed with the title, the abstract and the type of resource (metadata.metadata_type -> service|layer|featuretype|dataset|...)
4. Each result has two buttons ons the right:
  1. Download opens the /atom-feed/<id> route with the appropriate id of the search result set --> access to the download client
  2. Feed opens the /service/metadata/<id>/atom-feed route with the appropriate id set --> access to the feed document
Searches on the landing page shall not be handled using AJAX. Instead the start of a search will open the route /atom-feed?q=INPUT, which performs the internal API call and renders the result list
The catalogue API provides pagination, which can be used for pagination on the minimal catalogue UI

Behol! My magnificent layout idea:

Atom Feed Download Client

Let me explain the technical behaviour of this download interface:

How do we get data from web services? (simplified)

Basically, we use two important operations on WMS and WFS:

WMS
- wms-url?service=WMS&version=1.1.1&layers=layer1,layer2,...&bbox=90.0,100.0,-90,0,-100.0&srs=EPSG:4326&format=image/png&width=100&height=100
- This means to provide a smaller subset of data for the user, we simply need to change the requested bbox! All other parameters can be the same!
WFS
- wfs-url?SERVICE=WFS&REQUEST=GetFeature&VERSION=1.1.0&TYPENAME=feature_name&BBOX=5539710.91589949745684862,2528178.23551842849701643,5602187.72767475247383118,2633820.21731885056942701,urn:ogc:def:crs:EPSG::31466
- The same for WFS - we just have to iterate over all available features of this wfs, use the identifier for the parameter typename and the selected bbox to achieve our goal

Back to requirements

The user must be aware whether he/she is logged in or not. This means in one of the upper corners, the user name should be visible (if logged in), otherwise guest should be visible.
1. If a logged in user will perform the requests on internal services, the MrMap logic will automatically crop the resulting data to the allowed areas
There must be a leaflet client, which zooms automatically to the extent of the web service
- The leaflet must render the service extent as polygon to indicate for which area data is available
- The leaflet client must include the leaflet-geoman plugin, just as it's used in the editor-access-geometry-form, so the user will be able to draw intersecting bounding boxes!
  - the plugin can be configured in a way, that only the square polygon tool is availabe. We will read the requested bbox from this polygon!
  - if multiple polygons are drawn by the user, each one leads to an own request, using the specific bbox of the polygon
There must be a Download button, which starts the appropriate WMS/WFS request(s)
- To avoid bots spamming us, we need a reCaptcha plugin here, which has to check whether the user is a real human being or not
- This means, the drawn polygon's extents will be retrieved from the leaflet client, and sent via POST in one request to /atom-feed/<id>/download. How multiple polygons are transferred doesn't matter, as long as they can be retrieved correctly on the backend!
  - Please note: Implement this transfer in such a way, that we could extend it easily in the future. Let's say we want to give the user more options on which resolution the WMS images shall be retrieved. Than we need to match these further parameters to each bounding box without problems easily.
The backend must download and zip the requested data
1. A new pendingTask has to be created, which will give us the option to show a progress on the frontend
  1. Since we know how many requests we have to perform, we can calculate the step size for the progress bar and update the pendingTask record accordingly
2. As stated beforehand, each bbox results in an own request. After each request, the pendingTask has to be updated.
3. The result of each request has to be checked on sanity: WMS must provide some kind of image data (check using the PIL/Pillow package), WFS must deliver xml/gml like data
4. The results must be stored on the file system inside a unique folder, where all requested data will be stored as files
5. When all requests have been performed, the folder has to be zipped and named using a generated UUID4
6. When the folder has been zipped, the original folder has to be removed
7. A new record of AtomFeedDownload has to be created and persisted, where the generated uuid is set as zip_uuid and all other information accordingly
The download url has to be constructed like /atom-feed/<id>/download/<zip_uuid>, where <zip_uuid> refers to the generated UUID4 of the zipped file
1. The frontend must inform the user, after refreshing of the progressbar, that the task is finished and the data can be downloaded for x hours using this link
The user must be able to call the /atom-feed/<id>/download/<zip_uuid> url even after closing the window
1. The corresponding view method just takes the <zip_uuid> value and checks for an existing AtomFeedDownload record. If it exists, the corresponding zip file on the file system will be returned using a specific file response of django which supports streaming of larger files.
2. A one-time periodicTask record with an intervalSchedule, which is set to x hours, must be created after AtomFeedDownload has been created. This periodicTask must delete the zipped file on the file system and the related database entry and remove itself afterwards, to keep things clean.

Behold! Another fabulous layout:

OGC API

Status quo

We provide a RESTful API which does not follow any official specification.

Improvement

The OGC API will be the next step in automatical exchange of data for spatial infrastructures, since it's the first draft for a new specification.

The OGC API has been presented in many videos on YT as well, including a presentation from FOSSGIS 2020 Freiburg,Germany.

The idea of this API follows a modern RESTful design and is therefore extremely interesting!

But since this is just a draft of a final specification, we would have to keep track on changes and maintain them in our system as well

New app

It would be best practice to create a new app ogcApi instead of extending the existing api app. This way we could maintain the code more easily instead of digging through different files, which are not related to the ogc api itself but our regular api.

Specification

It is very important to read and understand the specifications draft before implementation!

All routes have to follow the design and behaviour that is described in there.

CSW Frontend

Status quo

The frontend is callable by

/csw

If a user does not know this exists, he/she/it will never detect it right now. And if a user is not capable of the specifics of a CSW request url building, we should provide some help to get what they need.

Improvement

Rename API menu to Interfaces
Replace dash icon of menu entry to plug icon
Place API Token card inside of API card, for clearer grouping
Rename API card to REST API
Add card CSW, using the same icon as the user is already used to like in resource index view. The following elements have to stay inside this card.
1. Add static Base Url button
  1. Opens the base /csw
2. Add static GetCapabilities button
  1. Opens /csw/?request=GetCapabilities&version=2.0.2
3. Add disabled one-line input element, just as we did for the API Token. This will hold the dynamically created csw link for the user
4. Add some 'filter' inputs below:
  1. Request (DropDown)
    1. GetRecords
    2. GetRecordById
  2. Version (DropDown)
    1. 2.0.2
  3. ResultType (DropDown)
    1. hits
    2. results
  4. OutputSchema (DropDown)
    2. http://www.isotc211.org/2005/gmd
  5. TypeNames (DropDown)
    1. gmd:MD_Metadata
  6. ElementSetName (DropDown)
    1. summary
    2. brief
    3. full
5. Each change of the 'filter' inputs generate the url inside of the disabled text input, so the user can easily copy it from there and use it

New resource type - mapContext

Basic Concept

MrMap needs the possibility to manage "mapContext" resources. These are configurations that allow users to store a kind of mapset. Similar to a QGIS project file.
To allow the external usage of those contexts, MrMap should be able to provide an API for OWSContext documents (OGC OWS Context)

Predecessor

MrMaps predecessor mapbender2 handles user defined mapsets in form of OGC WMC documents. These are stored in a database table with additional metadata and information about the owner.

Concept

mapContext

In MrMap a mapContext object has metadata similar to the datasets and services.

Draft mandatory attributes:

Title
Abstract
...

contextLayer

It has a hierachical list of generic layers - "contextLayers". These layers are resources which can be rendered/used by a gis client. The idea is to search for datasets and select the dataset which should be included in the mapContext. If no dataset is available, the contextLayer maybe simply a wms layer.

Draft attributes for contextLayer object:

title
accessTo - foreign key to internal dataset metadata if exists
hierarchy1 - depends on implementation
hierarchy2 - depends on implementation
type - internal/external
renderingBy - foreign key of internal metadata
renderingActive - boolean
renderingRemote - json string for connection - like ows context offering
featureInfoBy - foreign key of internal metadata
featureInfoActive - boolean
featureInfoRemote - json string for connection - like ows context offering
objectAccessBy - foreign key of internal metadata
objectAccessActive - boolean
objectAccessRemote - json string for connection - like ows context offering
currentDimension - TBD
minscale
maxscale
maxFeatures
featuresPerPage
style?/sld?

Implement channels

Status quo

Async running background tasks

In our project there are async long running tasks. This tasks are processed by a celery worker and only lives in memory db redis.
For example, to show the status of a running registration of a new resource, the task status is persist in a custom django model called PendingTask.

The PendingTaskTable is polled by a simple ajax script:

var pending_task_count = 0;
    var timer;

    function get_pending_task_list() {

        fetch('{% url 'resource:pending-tasks' %}?with-base="False"', {redirect: 'manual'})
        .then(function(response) {
            return response.text().then(function( text ) {
                $('#id_pending_tasks_div [data-toggle="tooltip"]').tooltip("hide");
                $( "#id_pending_tasks_div" ).html( text );
                $('#id_pending_tasks_div [data-toggle="tooltip"]').tooltip();

                // -1 for table header row
                rowCount = $('#id_pending_tasks_div table tr').length -1;

                if ( rowCount < 1 ){
                    // table is empty - stop interval
                    clearInterval( timer );
                    // delete content
                    $( "#id_pending_tasks_div" ).html( '' );
                    timer = setInterval(get_pending_task_list, 5000);
                }

                // todo: refresh resource tables if something becomes changed
            })
        });
    }

    timer = setInterval(get_pending_task_list, 1000);

This is not a smart way, cause for example, if a small service is registerd and the registration process takes less than five seconds, a user maybe do not see the process runs.

Other async tasks are defined in the service/tasks.py file. The following tasks are implemented right now:

async_increase_hits increases the hits attribute of a metadata object. Fast
async_activate_service activates a resource. Long
async_secure_service_task secures a resource. Long
async_remove_service_task removes a resource. Long
async_new_service registes a new resource. Long
async_log_response logs some request on a resource. Fast

Model view controller pattern

Currently there is no model view controller implementation. In the context of this issue it means the following:

We got models like PendingTask and a view called PendingTaskView with a table representation called PendingTaskTable. But there is no controller which one fires events on model changes and bring the changes direcly to the view.

Enhancement

We should get rid of the above polling style and implement a much smarter logic by using the model view controller pattern. However, in a web app this is not simple possible. For this we need to implement new technologies like websocket. But don't be afraid, there are ready to use packages out there.

pip package channels

For django there is a package called channels. This package implements all we need to implement our websockets on server side.

django signals

To fire events on model changes for example, the means of choice is django signals. It is also possible to write custom signals which are fired on custom events.

be carefull of transaction atomic. There is a post on stackoverflow which describes the problem.

on client side

I didn't find a package to create the websocket client easily. So for the first implementation we need to implement it by self.

bring it all together

Example implementation to handle for example the PendingTaskView (adapted from example):

The server side of the force (Controller)

from channels import Group

 
def ws_message(message):
    new_rendered_pending_task_table = ...
    Group('pending_task').send({'text':new_rendered_pending_task_table})
 
def ws_connect(message):
    Group('pending_task').add(message.reply_channel)
    Group('pending_task').send({'text':'connected'})
 
 
def ws_disconnect(message):
    Group('pending_task').send({'text':'disconnected'})
    Group('pending_task').discard(message.reply_channel)

The signals side of the force (Model)

To fire signals in django, we need to tell the framework which receiver function is to call by the event. For that we mark the def ws_message(message) function of the above example with the receiver decorator:

@receiver(post_save, sender=PendingTask)
def ws_message(message):
...

Now the django framework nows that this is our callback to use if the PendingTask model becomes changed.

The client side of the force (View)

In our view we need to implement the client logic to initiate the websocket:

	var pending_tasks_socket = new WebSocket('wss://ws/pending-tasks');
 
    pending_tasks_socket.onopen = function open() {
    	console.log('WebSockets connection created.');
    };
 
    pending_tasks_socket.onmessage = function message(event) {
    	$( "#id_pending_tasks_div" ).html( event.data );
    };
 
    pending_tasks_socket.onclose = function(e) {
    	console.error('Chat socket closed unexpectedly');
    };

The same could be done for all long running async tasks. For example we could disable the activate button till the last async_activate_service task is done.

@hollsandre @armin11 @MarkusSchneider @jansule other proposals?

Refactoring: HTML metadata detail views

Status quo

Currently there are two views for showing metadata details as a html rendered version.
*get_metadata_html to show a html rendered version of the metadatas deprecated. We drop this views and use simple class views as html representations. Tree view is already implemented for services. A plain html view is still needed.

get_service_preview to render a thumbnail of the requested service

Refactoring

Refactor the views as generic views as done in #493.
Maybe we could us the ResourceTreeView for this also?

Refactoring: api/views as generic views

Status quo

As discussed in #493, it is better to use generic views instead of using function views.

Refactoring

Refactor all the following views as generic view:

menu_view
generate_token

Error "The requested service could not be found." when fetching current capabilities of WMS

Steps for reproducing:

Add a new WMS. I used the following URL: https://www.wms.nrw.de/geobasis/wms_nw_bodenbewegungsgebiete_kacheln?&request=GetCapabilities&version=1.3.0&service=WMS
Click "Activate"
Click "Capabilities" -> "Current"

Implement constraints for some models

Status quo

The validation of user input is only handled by some forms with ValidationErrors. In this case we don't have any integrity check on database level. For that constraints are needed.

Advantage of using constraints

We will implement django signals in #504. For example, if we start the async task async_process_securing_access by observing the fields use_proxy_uri, log_proxy_access, is_secured and start the async task on changing event, we need to ensure that the saved data is valid. If we use constraints, only valid data could be saved to our db.

Disadvantage of using constraints

Currently constraints doesn't raise ValidationErrors. So we still have to implement Validation checks on form side. However, a ticket is open, which will implement this.

Conclusion

We should observe the ticket and if the validation check on forms is implemented for CheckConstraints, we should implement constraints instead of implementing form validation.

Implement a quality assurance process for the whole Project

#Status quo

For now some tests are stored directly under the django app directory like this: MapSkinner/service/tests.py
No quality assurance system is in use.

#Improvement

Use sonar-lint in pycharm as static analyser
Setup a sonarqube server to collect all rules and results (we can't use sonarcloud now; only github, gitlab, bitbucket, azure devops as repositry server is suported)
Setup the MapSkinner project. We need to move the testcode. Otherwise sonar can't exclude testcode from the analysis. See best practice from django: https://docs.djangoproject.com/en/3.0/topics/testing/advanced/#testing-reusable-applications

Dark themed table filters

Status quo

It looks like the DARK theme selection in the user settings do not affect the style of the table filters. A light gray would fit in here nicely, I guess.

Improvement

Check the dark theme for table filter sections.

Conformity checks

Requirements - outdated!

New app must be created: quality
New models must be created
1. Rule
  1. name - CharField
  2. field_name - CharField (with choices)
  3. property - CharField (with choices)
  4. operator - CharField (with choices)
  5. threshold - IntegerField
2. RuleSet
  1. name - CharField
  2. rules - M2MField (on Rule)
3. ConformityCheckConfiguration
  1. name - CharField
4. ConformityCheckConfigurationExternal inherits from ConformityCheckConfiguration
  1. external_url - URLField
  2. parameter_map - TextField (contains json)
  3. response_map - TextField (contains json)
5. ConformityCheckConfigurationInternal inherits from ConformityCheckConfiguration
  1. mandatoryrulesets - M2MField (on RuleSet)
  2. optionalrulesets - M2MField (on RuleSet)
6. ConformityCheckRun
  1. metadata - ForeignKey (on Metadata)
  2. confirmity_check_configuration - ForeignKey (on ConformityCheckConfiguration)
  3. time_start - DateTimeField
  4. time_stop - DateTimeField
  5. errors - TextField (contains json)
  6. additional_info - TextField (contains json)
All models must be configurable in the django admin interface

Requirements reduced 2020-10-15

New app must be created: quality
New models must be created
1. ConformityCheckConfiguration
  1. name - CharField
  2. metadata_types - TextField (contains json) - metadata.metadata_type, view format
2. ConformityCheckConfigurationExternal inherits from ConformityCheckConfiguration
  1. external_url - URLField - API
  2. api_configuration - TextField (contains json) - defines test classes, result parsing, ...
3. ConformityCheckRun - async - see pending tasks - type validate
  1. metadata - ForeignKey (on Metadata)
  2. conformity_check_configuration - ForeignKey (on ConformityCheckConfiguration)
  3. time_start - DateTimeField
  4. time_stop - DateTimeField
  5. errors - TextField (contains json)
  6. passed - Boolean
  7. result - TextField (contains json, xml, html)

Conformity checks

We need to separate between internal and external confirmity checks. Please regard the attached diagram to get an idea of the structure.

ConformityCheckConfiguration

Base model for ConformityCheckConfigurationExternal and ...Internal.

name

The name of the configuration

ConformityCheckConfigurationInternal

Holds the configuration for an internal conformity check.

mandatory_rule_sets

A set of ruleSet records, which are mandatory to pass successfully.

optional_rule_sets

A set of ruleSet records, which are nice if passed successfully.

ConformityCheckConfigurationExternal

Holds the configuration for an external conformity check.

external_url

A link pointing to an external test API.

parameter_map

Holds json as text. Further details below.

response_map

Holds json as text. Further details below.

ConformityCheckRun

Holds the relation of a metadata record to the results of a check

metadata

A ForeignKey to the related metadata

conformity_check_configuration

A ForeignKey to a ConformityCheckConfiguration record.

time_start

DateTime when the test started

time_stop

DateTime when the test ended

errors

Holds json as text.

additional_info

Holds json as text.

ConformityCheckInternal

Internal checks are based on Rules and RuleSets.

RuleSet

Groups rules and holds the results of a rule check run. RuleSets iterate over their rules and perform the rule checks.

name

The name of a ruleSet.

rules

A set of rules.

Rule

name

The name of a rule.

field_name

field_name defines an attribute of the Metadata model, such as abstract or keywords. A list of valid choices must be provided, so the rules can be easily created.

Valid choices are

title
abstract
access_constraints
keywords
formats
reference_system

property

A property describes what can be measured on the field_name. E.g. abstract can have the property len, since it's a string. keywords is a queryset of records and can have the property count, which is semantically equal to len but reduces the query time.

For now there are only these two properties which can be checked. The list of choices can be extended in the future.

operator

Defines a mathematical operator, which is used to describe the expected field_name property.

Valid choices are:

>
>=
<
<=
==
!=

How a rule could be used

A rule could be checked like this

def check_rule(metadata: Metadata, rule: Rule):
    prop = rule.property

    if prop == 'len':
        attribute = metadata.__getattribute__(role.field_name)
        real_value = len(attribute)
    elif prop == 'count':
        # count wird für M2M relations verwendet
        manager = metadata.__getattribute__(field)
	elements = manager.all()
	real_value = elements.count()
    else:
        raise Exception("No valid property")
    
    condition = str(real_value) + rule.operator + str(rule.threshold)
    return eval(condition, {'__builtins__': None})


rules = Rule.objects.all()
failed = []

for rule in rules:
    success = check_condition(metadata, rule)
    if not success:
        failed.append(rule)

ConformityCheckExternal

Mandatory attributes for this class are:

external_url
parameter_map
response_map

external_url

Points to an online API

parameter_map

Maps the metadata records field names on the required parameters for the API.

Example:
We have a metadata object with a field named get_capabilities_uri, which has to be used as the parameter inputURI of the API:

{
  "inputURI": "get_capabilities_uri",
}

Example implementation for usage

json_obj = json.loads("...")  # the json from above
params = {}

for key, val in json_obj.items():
	params[key] = metadata.__getattribute__(val)

request.post(api_uri, params)  # perform test request

response_map

Maps the known response elements to a given structure. This way, we are aware of how, e.g. the differen error elements of the response are named, so we can parse the response in a generic way.

Example:

{
  "error_identifier": [],
  "additional_info_identifier": [],
  "success_identifier": "ResultOfTest",
}

error_identifier

Every API must have at least one error element, which contains the found errors during the test. It may be, that there is not just one error element, but rather multiple error elements for different types of error (TypeError, OutOfRangeError, ...). All these error element identifiers have to be collected in a list, which we call error_identifier

additional_info_identifier

Some APIs might provide warnings or hints as well, which are not really errors but some kind of additional information. We proceed here the same as we did for error_identifiers: Put all these hints or warning element identifier in a list, which is called additional_info_identifier.

success_identifier

Every test API should have one element in response, which states whether the test run was successful or failed for the input. This element has to be named by using success_identifier

Example implementation for usage

json_obj = json.loads("...")  # the response_map from above
result = {
    "errors": [],
    "additional_information": [],
    "success": None,
}

error_identifiers = json_obj["error_identifier"]
additional_info_identifier = json_obj["additional_info_identifier"]
success_identifier = json_obj["success_identifier"]

for error_identifier in error_identifiers:
    result["errors"] = response.get_val_for_identifier(error_identifier)

# analogous for additional_information and success

Please note: The function get_val_for_identifier(id: str) does not exist yet and I didn't give any example of how to implement it. This function would be more complex, since it has to handle json, as well as xml responses, in case we deal with a test API, that uses xml.

refactor the rendering of links, buttons and other html content in tables

Proposed Changes

refactor the rendering of links, buttons and other html content in tables.

bad practice:

def render_title(self, record, value):
    return Link(url=record.detail_view_uri, content=value).render(safe=True)

def render_actions(self, record):
    self.render_helper.update_attrs = {"class": ["btn-sm", "mr-1"]}
    renderd_actions = self.render_helper.render_list_coherent(items=record.get_actions())
    self.render_helper.update_attrs = None
    return format_html(renderd_actions)

best practice:

LINK = """
<a href="{{ record.get_absolute_url }}">{{ value }}</a>
"""

title = tables.TemplateColumn(
        template_code=LINK
    )

Justification

speeds up the rendering of tables. This is caused by calling templates with the django rendering engine multiple times for every row.

every calling of the render() function of a django_bootstrap_swt component results in a single render process for the component.

remove unused html templates

Proposed Changes

remove unused html templates
move all templates to subfolders like templates/service/* for templates of the service app for example.

Justification

smaller code base

remove custom Permission and Role model

Proposed Changes

remove the custom Permission and Role model from MrMap.

This results in the following changes:

structure\models.py

remove

class Permission(models.Model):
    name = models.CharField(max_length=500, choices=PermissionEnum.as_choices(), unique=True)

    def __str__(self):
        return str(self.name)

    def get_permission_set(self) -> set:
        p_list = set()
        perms = self.__dict__
        if perms.get("id", None) is not None:
            del perms["id"]
        if perms.get("_state", None) is not None:
            del perms["_state"]
        for perm_key, perm_val in perms.items():
            if perm_val:
                p_list.add(perm_key)
        return p_list


class Role(models.Model):
    name = models.CharField(max_length=100)
    description = models.TextField()
    permissions = models.ManyToManyField(Permission)

    def __str__(self):
        return self.name

    def has_permission(self, perm: PermissionEnum):
        """ Checks whether a permission can be found inside this role

        Args:
            perm (PermissionEnum): The permission to be checked
        Returns:
             bool
        """
        return self.permissions.filter(
            name=perm.value
        ).exists()

remove field

class MrMapGroup(Group):
    ...
    # todo: remove role ForeignKey
    role = models.ForeignKey(Role, on_delete=models.CASCADE, null=True)
    ...

remove custom permissions depending functions

class MrMapUser(AbstractUser):
    @cached_property
    def all_permissions(self) -> set:
        """Returns a set containing all permission identifiers as strings in a list.

        Returns:
             A set of permission strings
        """
        return self.get_all_permissions()

    def get_all_permissions(self, group: MrMapGroup = None) -> set:
        """Returns a set containing all permission identifiers as strings in a list.

        The list is generated by fetching all permissions from all groups the user is part of.
        Alternatively the list is generated by fetching all permissions from a special group.

        Args:
            group: The group object
        Returns:
             A set of permission strings
        """
        if group is not None:
            groups = MrMapGroup.objects.filter(id=group.id)
        else:
            groups = self.get_groups
        all_perm = set(groups.values_list("role__permissions__name", flat=True))
        return all_perm

    def has_perm(self, perm, obj=None) -> bool:
        # Active superusers have all permissions.
        if self.is_active and self.is_superuser:
            return True

        has_perm = self.get_groups.filter(
            role__permissions__name=perm
        )
        has_perm = has_perm.exists()
        return has_perm

    # do not overwrite has_perms cause of using django permission system in Rest API
    def has_permissions(self, perm_list, obj=None) -> bool:
        has_perm = self.get_groups.filter(
            role__permissions__name__in=perm_list
        )
        has_perm = has_perm.exists()
        return has_perm

MrMap/management/commands/setup_settings.py

migrate all custom Permissions settings to the built-in Permision system.

example - current:

DEFAULT_GROUPS = [
    {
        "name": _("Group Administrator"),
        "description": _("Permission group. Holds users which are allowed to manage groups."),
        "parent_group": None,
        "permissions": [
            PermissionEnum.CAN_CREATE_GROUP,
            PermissionEnum.CAN_DELETE_GROUP,
            PermissionEnum.CAN_EDIT_GROUP,
            PermissionEnum.CAN_ADD_USER_TO_GROUP,
            PermissionEnum.CAN_REMOVE_USER_FROM_GROUP,
            PermissionEnum.CAN_TOGGLE_PUBLISH_REQUESTS,
            PermissionEnum.CAN_REQUEST_TO_BECOME_PUBLISHER,
        ]
    },
...
]

should be changed in:

DEFAULT_GROUPS = [
    {
        "name": _("Group Administrator"),
        "description": _("Permission group. Holds users which are allowed to manage groups."),
        "parent_group": None,
        "permissions": [
            'add_mrmapgroup',
            'change_mrmapgroup',
            'delete_mrmapgroup',
            'view_mrmapgroup',
            'change_groupinvitationrequest',
            'change_publishrequest',
        ]
    },
...
]

We should still handle it with the common way by using enums.

cause we refactor all views as class based views, with matching the default CreateView, UpdateView, DeleteView and so on, we don't need to create fancy new custom permissions, if it is not really necessary. Adding or removing a user to a group is a simple update of a given MrMapGroup instance.

Justification

The role concept is also handled by the is_permission_group flag or MrMapGroup
Simplify the codebase by using existing common django built in code

Simplify Generic Views with Django Signals

Original Post by @joki in https://git.osgeo.org/gitea/GDI-RP/MrMap/issues/503#issuecomment-9154

status quo

I currently refactoring the views in #493 to match the django best practice and using generic views. While i'm working on the UpdateView for activating/deactivating services there comes another usefull idea in this context.

I implement the following AsyncUpdateView class:

class AsyncUpdateView(UpdateView):
    template_name = 'generic_views/generic_update.html'
    action = ""
    action_url = ""
    alert_msg = ""
    async_task_func = None
    async_task_params = {}

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        context.update({"action": self.action,
                        "action_url": self.action_url})
        return context

    def form_invalid(self, form):
        content = render_to_string(template_name=self.template_name,
                                   context=self.get_context_data(form=form),
                                   request=self.request)
        response = {
            "data": content
        }
        return JsonResponse(status=400, data=response)

    def form_valid(self, form):
        self.object.save()

        task = self.async_task_func.delay(object_id=self.object.id,
                                          additional_params=self.async_task_params)

        content = {
            "task": {
                "id": task.task_id,
                "alert": Alert(msg=self.alert_msg, alert_type=AlertEnum.SUCCESS).render()
            },
        }

        # cause this is a async task which can take longer we response with 'accept' status
        return JsonResponse(status=202, data=content)

If the form is invalid the form is returned with errors in a json data field. If it is valid it saves the object and starts the async job. The returning is the task with an alert message.

Idea

Maybe it is also possible to use django signal in this context again, to start a async task on post_save of a metadata object if the is_active field becomes changed. If so, we must refactor the returned json for form_valid.

@jansule The AsyncUpdateView refactors the way we using modals. The complete modal is fetched by javascript and the submitting is also handled in javascript. What do you think about the idea to fire signals on post_save event?

action buttons in resource detail views are broken

Environment

Python version: 3.7
MrMap version: v0.0.0

Steps to Reproduce

register a service
open the detail view of the registered resource

Expected Behavior

All action buttons should shown up on the right side

Observed Behavior

No action buttons shown up on the right side

Refactoring of periodic tasks needed?

Traceback (most recent call last):
   File "MrMap/wsgi.py", line 16, in <module>
     application = get_wsgi_application()
   File "/usr/local/lib/python3.7/dist-packages/django/core/wsgi.py", line 12, in get_wsgi_application
     django.setup(set_prefix=False)
   File "/usr/local/lib/python3.7/dist-packages/django/__init__.py", line 19, in setup
     configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
   File "/usr/local/lib/python3.7/dist-packages/django/conf/__init__.py", line 83, in __getattr__
     self._setup(name)
   File "/usr/local/lib/python3.7/dist-packages/django/conf/__init__.py", line 70, in _setup
     self._wrapped = Settings(settings_module)
   File "/usr/local/lib/python3.7/dist-packages/django/conf/__init__.py", line 177, in __init__
     mod = importlib.import_module(self.SETTINGS_MODULE)
   File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
     return _bootstrap._gcd_import(name[level:], package, level)
   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
   File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
   File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
   File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
   File "<frozen importlib._bootstrap_external>", line 728, in exec_module
   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
   File "./MrMap/__init__.py", line 5, in <module>
     from .celery_app import app as celery_app
   File "./MrMap/celery_app.py", line 12, in <module>
     from celery import Celery, task
   File "/usr/local/lib/python3.7/dist-packages/celery/local.py", line 497, in __getattr__
     module = __import__(self._direct[name], None, None, [name])
ModuleNotFoundError: No module named 'celery.task'

Is anyone else getting this error when trying to start celery or is it my setup 🤔 ?

According to this post the periodic task module is deprecated and has to be replaced somehow...

Install jenkins server instance on a public uri

#Status quo
Since we discussed the usage of sonar and automated tests in #94, we need a jenkins server to automate the sonar analyzes and test automation.

#Enhancement
Install jenkins as public (but protected for authentication) server instance

CSW GetRecords for harvested results

Status quo

The CSW GetRecords and GetRecordById is designed to create decent, specification based output for registered services and metadata.

However, using the gmd:Md_Metadata as output schema

http://127.0.0.1:8000/csw/?version=2.0.2&request=GetRecords&resultType=results&outputSchema=http://www.isotc211.org/2005/gmd&typenames=gmd:MD_Metadata

leads in case of harvested metadata to some errors, which roots can be found in the fact, that these metadata do not have Service records, since they didn't provide Service Model related data in the harvesting procedure.

Discussion

@joki
@armin11
@hollsandre

We have to refactor the CSW output gmd:MD_Metadata schema to work ...
1. ... only on Metadata -> will work always, but we can not provide certain Service model data
2. ... only on Metadata for harvested metadata (detectable by related_metadata.relation_type=HARVESTED_THROUGH) and regularly on registered metadata -> means more if-else cases for special treatment
We exclude harvested metadata from the CSW interface generally

Restore dataset metadata is broken

Status quo

To restore a Metadata object from type Dataset there is a function called _restore_dataset_md in service/models.py

Describe the bug

I customize a Metadata object from type Dataset, for example edit the Metadata date field.

After that, i tried to restore it with the function above, but the Metadata object isn't restored right. The edited date is still there and the Metadata object is still marked as custom.

Write documentation

Status quo

Currently there are a some doc snippets in the root folder of the repo:

README.md
UNITTESTING.md
INTEGRATIONTESTING.md
INSTALLDEB10.md
FUNCTIONALITY.md
CONTRIBUTING.md

Readthedocs

We should reorganize our documentation and use readthedocs to document our software project.

For that all documentation should be moved in to a docs folder. Initialize the docs by following this guide.
~~[ ] Create a GDI-RP group on readthedocs.~~ only available for business
Create the project on readthedocs and build it.

Example project

Mayan has a good documentation structure which we could adapt.

remove public id feature

Proposed Changes

remove the feature of generating public id's

Justification

reducing code base by removing non needed feature

Enhancement: Resource detail view

Status quo

Currently there is one ResourceTreeView implemented as generic DetailView.

This view shows a tree detail view for WFS and WMS resources.

Enhancement

implement a version of this for csw resources. @armin11 How should the detail view shows for a csw resource?

User adding to group

Status quo

The MrMap frontend does not provide any functionality to add users to a group.

Improvement

We should discuss the way we implement this feature! There are two possibilities:

Edit group form

We could use the Edit group form to add users, similar to adding keywords to a metadata record in the metadata editor form. This way, the user can enter the partial username and gets some suggestions for matching users.

Con

The user would be added automatically (in a simple implementation of this feature) without having the option to deny the adding.

User overview

In structure we list organizations and groups but no users. We could think of implementing a Users menu, which covers the table and filtering of users and provides actions like Invite to group.

An invitation would be handled just like a PendingRequest object. So the invited user will have a notification on the dashboard on the next login, which can be accepted or declined.

PendingRequest Refactoring

The current state of the PendingRequest model looks like this:

class PendingRequest(models.Model):
    type = models.CharField(max_length=255)  # defines what type of request this is
    group = models.ForeignKey(MrMapGroup, related_name="pending_publish_requests", on_delete=models.CASCADE)
    organization = models.ForeignKey(Organization, related_name="pending_publish_requests", on_delete=models.CASCADE)
    message = models.TextField(null=True, blank=True)
    activation_until = models.DateTimeField(null=True)
    created_on = models.DateTimeField(auto_now_add=True)

I suggest the following refactoring steps:

Generalize an abstract BaseInternalRequest model which holds the basic data, such as message, valid_until, created_on and created_by. Use uuid as primary key!
Create PublishRequest which inherits from BaseInternalRequest and add the missing data from the original PendingRequest model.
1. Refactor the usage of PendingRequest in all currenct code occurences to work with PublishRequest
Create GroupInvitationRequest which inherits from BaseInternalRequest and holds the data
1. invited_user (MrMapUser, ForeignKey)
2. to_group (MrMapGroup, ForeignKey)

Discussion

@joki
I prefer the second, refactoring, solution since we allow many more possible types of request in the future!

metadata validation

This enables the metadata conformity checks following the specification in #36.

Kudos to @MarkusSchneider

We made some adjustments to the proposed models and workflows, but kept the overall concept.

Currently, we support running conformity checks against an internal testsuite as well as against any external ETF testsuite compliant server. Our plugin concept for integrating testsuites allows adding logic for other APIs as well. Thereby, we can add e.g. gdi-de testsuite without refactoring in the future, if needed.

We added the created models to the django admin view, so that configurations can be edited there. Please note that most of the input fields are JSONFields and require valid JSON.

Model Info

ConformityCheckConfiguration.metadata_types expects a list of strings of metadata_types (e.g. ["dataset", "layer"])
ConformityCheckConfigurationExternal.external_url holds the base url to an ETF Testsuite
ConformityCheckConfigurationExternal.validation_target holds the name of the Metadata column containing the url to the xml that should be validated
ConformityCheckConfigurationExternal.parameter_map works as requested. Any string that starts with __ (double underscore) will be treated as variable. See example parameter_map below. Please note that __test_object_id MUST always be used for testObject.id.
ConformityCheckConfigurationExternal.polling_interval_seconds holds the initial polling interval for checking the status of a running test. After each request, the polling interval will be increased by factor 2 in order to decrease the number of requests to the testsuite.
ConformityCheckConfigurationExternal.polling_interval_seconds_max holds the maximum polling interval.
ConformityCheckRun.result holds the complete report of the test as json

Example parameter_map:

{
  "label": "Common Requirements for ISO/TC 19139:2007 based INSPIRE metadata records.",
  "arguments": {
    "files_to_test": ".*",
    "tests_to_execute": ".*"
  },
  "testObject": {
    "id": "__test_object_id"
  },
  "executableTestSuiteIds": [
    "EID59692c11-df86-49ad-be7f-94a1e1ddd8da"
  ]
}

Task Queue

With the current implementation, a worker is reserved until a test completes. This can take quite some time for the ETF tests, which will probably block other processes that run in the background. We should think about creating dedicated queues for different kind of tasks (e.g. queue for service registration, queue for validation, etc.). Thereby, only the queue for metadata validation would be blocked and other background processes could still run. I created a separate issue for that (https://git.osgeo.org/gitea/GDI-RP/MrMap/issues/490).

Docker Dev Setup

We added a docker-compose-dev.yml that can be used for the local dev setup. This installs and runs the databases, a geoserver and a local ETF testsuite.

Please note that the compose file is not meant for going to production, but to simplify the setup process for devs who cannot install all requirements directly on their system.

Please also note the changes in the port for the postgres database. The same port must be set in the django settings, if you want to use the docker setup.

README updates

We made some minor adjustments to the README. Since the migrations were added to the repository, the makemigrations commands are obsolete.

Dependency updates

We made some minor adjustments to the requirements.txt that were needed in order to make the application running in a newly created venv.

@hollsandre please review.

Fix integration test test_access_securing

Test failed

test function: test_access_securing

output:

======================================================================
FAIL: Tests whether the securing of a service changes the returned restuls on GetFeature and GetMap.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jonas/Git/MrMap-sven/tests/integration_tests/editor_app/test_views.py", line 659, in test_access_securing
    self.assertEqual(response.status_code, 200, msg="Wrong status code returned: {}. Content:\n{}".format(response.status_code, response.content))
AssertionError: 500 != 200 : Wrong status code returned: 500. Content:
b"HTTPSConnectionPool(host='epsg.orgexport.htm', port=443): Max retries exceeded with url: /?gml=urn:ogc:def:crs:EPSG::25832 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fe740056390>: Failed to establish a new connection: [Errno -2] Name or service not known'))"

ServiceUrl url attribute is null for some layers in wms version 1.1.0

Status quo

If we register a wms with version 1.1.0 there are some service url which doesn't containing a url. This causes problemes for example in our monitoring runs.

Steps to reproduce

Registere service: http://geo5.service24.rlp.de/wms/karte_rp.fcgi?REQUEST=GetCapabilities&SERVICE=WMS&VERSION=1.1.0
Run monitoring for this new registered service ==> Cause there are None filled url's this runs in to:

Traceback (most recent call last):
  File "/home/jonas/Git/MrMap/monitoring/tasks.py", line 69, in run_manual_monitoring
    monitor.run_checks()
  File "/usr/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/home/jonas/Git/MrMap/monitoring/monitoring.py", line 70, in run_checks
    self.check_wms(check_obj)
  File "/home/jonas/Git/MrMap/monitoring/monitoring.py", line 132, in check_wms
    wms_helper.set_operation_urls()
  File "/home/jonas/Git/MrMap/monitoring/helper/wmsHelper.py", line 44, in set_operation_urls
    self.get_styles_url = self.get_get_styles_url()
  File "/home/jonas/Git/MrMap/monitoring/helper/wmsHelper.py", line 74, in get_get_styles_url
    url = UrlHelper.build(uri, queries)
  File "/home/jonas/Git/MrMap/monitoring/helper/urlHelper.py", line 41, in build
    build = parse.urlunparse((_url.scheme, _url.netloc, _url.path, _url.params, query_string, _url.fragment))
  File "/usr/lib/python3.7/urllib/parse.py", line 475, in urlunparse
    _coerce_args(*components))
  File "/usr/lib/python3.7/urllib/parse.py", line 120, in _coerce_args
    raise TypeError("Cannot mix str and non-str arguments")
TypeError: Cannot mix str and non-str arguments```


`

remove theme feature

Proposed Changes

remove the possibility of choosing themes by user

Justification

reducing code base by removing non needed feature

Refactoring: All tests

Status quo

There are many old tests that are broken after refactoring (see #493).

Refacotring

think about a new test concept for generic views

Refactoring: Resource detail view

Status quo

Currently there are five different views to serve

get_service_metadata
get_dataset_metadata
get_operation_result
get_metadata_legend

This views returns content_type application/xml

Refacotring

refactor get_service_metadata as generic view
refactor get_dataset_metadata as generic view
refactor get_operation_result as generic view
refactor get_metadata_legent as generic view

Automated Updating of Services

With the addition of the Monitoring application (see https://git.osgeo.org/gitea/hollsandre/MapSkinner/pulls/91), we know if a service was updated. Now, we have to add logic to actually apply these updates.

This, among others, includes adding/removing/updating operations and layers.

AFAIK, as soon as a service can be updated, owners of the service will receive a notification and are able to trigger the updating process by clicking on a button.

@mipel, please let me know, if this rough concept complies with your understanding of the updating process

Still running pending tasks

Status quo

If a pending task runs in to an error, an errorreport should be created. This happens for register wms/wfs/csw resources. However, if an async harvest task runs into an error, the error report is'nt generated by Mr. Map.

Reproduction

I registered https://gdk.gdi-de.org/gdi-de/srv/eng/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetCapabilities in Mr. Map.
I start harvest for this csw resource.
The task runs into ReadTimeout exception:

Traceback (most recent call last):
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/jonas/Git/MrMap/csw/tasks.py", line 37, in async_harvest
    harvester.harvest(task_id=async_harvest.request.id)
  File "/home/jonas/Git/MrMap/csw/utils/harvester.py", line 121, in harvest
    hits_response, status_code = self._get_harvest_response(result_type="hits")
  File "/home/jonas/Git/MrMap/csw/utils/harvester.py", line 307, in _get_harvest_response
    connector.load(params=params)
  File "/home/jonas/Git/MrMap/service/helper/common_connector.py", line 98, in load
    response = self.__load_requests(params)
  File "/home/jonas/Git/MrMap/service/helper/common_connector.py", line 201, in __load_requests
    response = requests.request(self.http_method, self._url, params=params, proxies=proxies, timeout=REQUEST_TIMEOUT)
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/home/jonas/Git/MrMap/venv/lib/python3.7/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='gdk.gdi-de.org', port=443): Read timed out. (read timeout=100)

Solution

Wrap all async tasks with a try/catch block to catch all exceptions.

mrmap-community / mrmap Goto Github PK

mrmap's Introduction

Documentation

Discussion

Installation

Providing Feedback

mrmap's People

Contributors

Stargazers

Watchers

Forkers

mrmap's Issues

Status quo

improvement

Status quo

Refactoring

Status quo

Status quo

Refactoring

Status quo

Enhancement

Please note

Proposed Changes

Justification

Status quo

Improvement

Status quo

What's new

Migration

Benefits

Proposed Changes

Justification

Bug

Todo

Bug

Todo

Environment

Steps to Reproduce

Expected Behavior

Observed Behavior

Bug

Fix

Environment

Proposed Functionality

Use Case

security benefit

filter querysets benefit

Database Changes

External Dependencies

Atom Feed

Atom Feed and INSPIRE

Implementations

GDI-DE technical guidance

Atom Feed GDI examples

Requirements

Atom Feed Generator

Landing page of client (minimal catalogue)

Atom Feed Download Client

How do we get data from web services? (simplified)

Back to requirements

Status quo

Improvement

New app

Specification

Status quo

Improvement

Basic Concept

Predecessor

Concept

mapContext

contextLayer

Status quo

Async running background tasks

Other problems

Enhancement

pip package channels

django signals

on client side

bring it all together

The server side of the force (Controller)