
open-geodata-model's Introduction

KfW's Open Project Location Model

About

Welcome! This is KfW's repository for the open project location model, which can be used to collect project locations and meta-information for international development projects, including those financed by KfW. The project location model comes in the form of an Excel template which can be used to collect the data.

A guideline on how to collect this information is published and continuously updated here.

Available versions and versioning of the template.

Please note that the Excel template and the guidelines are living documents. We keep advancing the model, e.g. by extending the list of location types beyond those available in the IATI standard. You might therefore find different versions of the template in this repository over time. We are also working hard to provide the template in different languages.

Open Data Kit / KoboToolbox Templates

In addition to the Excel template named "Project_Location_Data_Template", we provide two other templates named "Project_Location_ODK_Template" for data collection using either ODK or KoboToolbox. All three templates adhere to the same configuration and cover identical required data points, such as project number, location types, DAC codes etc.

Some general remarks

The technical specifications as well as the sample Terms of Reference for collecting the data can be found in this repository. The information in the table on the page "Technical Notes" is also given as comments in the Excel template if you click on the cells. Most importantly, make sure to use WGS 84 as the coordinate reference system when submitting locations in lat/long format. You might also collect proper geodata in formats such as KML or GeoJSON (or shapefile). If you send this information together with the template, it can help to verify the coordinates in the template. In addition, it will be very useful in cases where a project location is better represented by line or polygon geometries (e.g. protected areas or transmission lines).
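
As a minimal illustration of the lat/long remark above, the following Python sketch checks that a coordinate pair lies within the valid WGS 84 degree ranges and wraps it as a GeoJSON point. The function names are hypothetical and not part of this repository. Note that GeoJSON stores coordinates in longitude, latitude order, which is a frequent source of swapped values.

```python
import json


def is_valid_wgs84(latitude: float, longitude: float) -> bool:
    """Return True if the pair lies within the valid WGS 84 degree ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


def to_geojson_point(latitude: float, longitude: float, name: str) -> dict:
    """Wrap a location as a GeoJSON Feature (GeoJSON uses lon, lat order)."""
    if not is_valid_wgs84(latitude, longitude):
        raise ValueError(f"coordinates out of range: {latitude}, {longitude}")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"name": name},
    }


if __name__ == "__main__":
    # Made-up example location, not taken from the template.
    print(json.dumps(to_geojson_point(50.11, 8.68, "example location"), indent=2))
```

A range check of this kind only catches impossible values; it cannot tell whether latitude and longitude were swapped when both happen to fall within range.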

More information on remote management methods

This open data model is part of KfW's RMMV Digital Public Content Initiative. You can find more information on RMMV here.

open-geodata-model's People

Contributors

fretchen, maja4dev, jo-schie, s-r-f-l, goergen95, allcontributors[bot], ckreutz, julest94, dependabot[bot]


open-geodata-model's Issues

Redundant and laborious data structures in geodata collection

I would like to raise the following point for consideration:

Practice already shows that the selection and volume of data do not match actual needs, i.e. there is "too much" of it, and that we
can therefore expect a far too large and potentially inconsistent maintenance effort for geodata:

The Excel template for geodata contains selection fields.
On the one hand, the available options do not match the processes or terminology used by the consultants;
this raises follow-up questions and produces inconsistent metadata.

On the other hand:
The template asks for data that actually belongs in the data model of the PMT!
For example, the "project status" logically belongs to the project and thus to the InPro number in the PMT. A status attached to the geodata is redundant and dangerous, because it leads to uncontrolled growth and incorrect data!
In addition, the status has to be updated again and again for each of the six (!) status options.
I wonder whether maintaining the prescribed amount of data is sustainable given the maintenance cycles.

It would be better to review the columns for duplicates once more and to abolish or heavily restrict the selection options. After all, data such as the project status can / must be fed in from the PMT / GeoApp!

Use external DAC codes

Discussed in #68

Originally posted by fretchen June 14, 2024
I saw in the Excel file that there is a DAC Purpose Codes worksheet. What is this based on? Is it correct that the column with the DAC 5 code is based on the published IATI standard version 2.03? The same question applies to the description: is this the description from the published standard, or has it been edited?

I assume that @Maja4Dev might know more about this?

Given that we use the external DAC codes we should also make use of them and reference them. My suggested approach would be:

  • Copy the official version in identifiable fashion into this repo. This might be the csv or json, no strong opinions here.
  • Provide automatic tests that verify that the DAC codes used in our templates are identical to the ones of the official versions.
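
The two steps above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the official list is taken to be a CSV with a `code` column, the codes used in the template are taken to be already extracted into a set, and the function names and sample values are made up (the sample values are placeholders, not real DAC codes).

```python
import csv
import io


def load_codes(csv_text: str, column: str = "code") -> set:
    """Read one column of a CSV document into a set of stripped strings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip() for row in reader if row.get(column)}


def compare_codes(official: set, template: set) -> dict:
    """Report codes that are missing from, or unknown to, the official list."""
    return {
        "missing_in_template": sorted(official - template),
        "unknown_in_template": sorted(template - official),
    }


# Placeholder data, not real DAC codes.
OFFICIAL_CSV = "code,name\nA1,First purpose\nA2,Second purpose\n"
TEMPLATE_CODES = {"A1", "A3"}

if __name__ == "__main__":
    print(compare_codes(load_codes(OFFICIAL_CSV), TEMPLATE_CODES))
```

An automatic test would then simply assert that both lists in the result are empty for every template in the repository.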

Potential benefits

  • As the DAC codes are updated we have automatic tests that allow us to detect important changes. This would also allow us to make sure that different technical implementations of the project location template adhere to the same standard.
  • In the long term we could try to create the worksheets in the Excel files directly from the external DAC lists, making updates easier. However, this would most likely have to be left to later stages.

@Jo-Schie and @Maja4Dev if there is interest in this I could prepare a draft PR to make clearer what I have in mind.

Governance of future changes to the specification

This issue was raised by @goergen95 in the discussion of #21 . I will cite the initial statement here for clarity:

Currently there is a lack of a clear process for the governance of future changes to the specification. This is very much required if KfW wants to make it legally binding for partners to adhere to the specification. Partners then need to know which version is the legally binding one, how changes are being handled, and what that means for them during the lifetime of a project.

Let us see how we can gain some clarity.

Licence

What is the appropriate licence?

  • It looks to me very much like the other contents of the RMMV context, so the licence would be similar to the one of the d4dtools: CC-BY-NC-SA-4.0

Integration of different data formats

A number of different possible data formats exist and ideally we should find a way to streamline them. One issue with the xlsx format was raised by @goergen95 in #21:

The provisions of the ToR are a burden to technically adept partners. For partners with elaborate GIS systems there are better formats/procedures for the exchange of geospatial information. I would be against KfW making it mandatory for them to deliver their data in sub-optimal and proprietary data formats.

We have started to look into this in #17 and we have first tools for conversion in #18 . However, it is not yet clarified how we can put all of the ideas together...

Create a GitHub Pages site from this repo

@fretchen: do you have experience with GitHub Pages? It would be nicer since we could provide content in a user-friendly way and have the link from the RMMV Guidebook to the page... If this is a quick thing to do we might consider it.

Usability of ODK Template

Hi there, sorry in advance for the length of the post and if any of these points are off topic, but this can happen with an external point of view 😊. I admit I didn't read all the associated documentation, but also on purpose, as I think it's useful to have such data collection tools as self-explanatory as possible to make sure everything is done for a quality data collection.

I guess the users will be perfectly aware/trained on all the notions etc explained (as else very hard to get the whole understanding for someone who does not have prior knowledge)? Because I doubt that many will open the external documentation, so it can be good to embed it further as hints/explanations in the form in my opinion. Relatedly, I would add a clearer title to the form, a quick intro explaining the purpose for the user (who is capturing, when & why), perhaps some section labels to give it flow/ for better readability (project general information, location specific info, location specific geo etc)? Perhaps add color/bold to some labels to make sure some elements are well read/ user friendly, ie start and end date in the activity_start /activity_end labels (https://xlsform.org/en/#styling-prompts).

Here is more specific feedback:
Unique-ID :

  • It’s not clear to me for new locations what should be done for the code, as the variable is mandatory. Is there any guidance on how it should be built that could be added to the hint? Or a skip pattern on whether it’s new or not to be added to deal with the case?
  • For the hint, maybe talk of variable rather than column, easier for the person collecting the data to understand

All text / number / date variables: are there any constraints that should be added (regex, see https://cartong.pages.gitlab.cartong.org/learning-corner/en/5_survey_design_mdc/5_6_form/5_6_4_validation_criteria#use-of-regex--when-how-and-examples-in-the-humanitarian-and-development-fields, or min/max for example) to avoid errors? For example for it to be in a range of numbers, to start with a letter and then have X numbers, for there to be a min and max date or budget, to avoid avoidable errors. Even just making sure there is no negative data captured :)

Data owner: perhaps add example in hint to make it more user friendly? I suppose a dropdown list with an “if other, please specify” would be too complicated to compile?

publishing_restrictions: will the person answering always know for sure if yes or no, or should a “to be discussed further” option be useful?

Location name: what should they do if there is no name? Describe its location, write NA?
Location Activity Status: there is no need to have an “if other please specify”? Just checking

Additional Activity-Description : would it make sense to make the text multiline in appearance so they can describe in more detail? I guess this depends if data is collected on web at all, which I imagine it is?

KC Theme/Sub-sector: I guess this is not finalised, but the names are not coded in a format that is exploitable, they are "just" normal labels, so the cascading lists can't work (there are spaces, special characters etc.)

DAC 5: don’t get the new row for each location, you mean a new submission? Or do you want a repeat group in the form so that a series of location per project can be captured inside one submission?

Budget share: same, do you want this in a repeat group, to then be able to make and check the sums through calculations? Sorry if off topic, hard for me to be sure of the general logic of what is being collected

Latitude & Longitude: If the data is not always captured in the field, could you not at least have a skip pattern to make a GPS type variable usable if relevant (as so much easier / less error prone…). Or if using an online app like Enketo in Kobo for capture, to be able to capture it by pointing on the map as such tools make possible? And if “yet unknown”, should there not be a skip pattern so the latitude/longitude not be captured, or else not make it mandatory?

Do you want any [metadata](https://cartong.pages.gitlab.cartong.org/learning-corner/en/5_survey_design_mdc/5_6_form/5_6_6_quality_control) (such as when the submission was started/ended?) or is it not useful?

And just a side note that made me smile- I just saw in the “start here” tab a mention of our XLSForm cheat sheet, that I initiated a few years ago, nice to see it being used 😊.

hope this is useful, don't hesitate if you have further questions,
Maeve

Originally posted by @maevedefrance in #18 (comment)

Contribution guide

We need a guide that explains to newcomers how they can easily contribute to the website. Most likely they are even new to the whole workflow of issues, PRs etc...

Proposition to add an extra sheet to collect individual project information to the Excel Template

We are frequently asked by the projects if it was possible to extend the location model and add additional information for specific use-cases, sectors, projects etc. We are so far hesitant to do (and allow) that, because we do not want to blow up the model and make it unusable or charged with too much information that people might interpret as obligatory for specific sectors.

One easy and convenient way to circumvent this problem could be to create a new sheet called e.g. "project specific information" in the Excel Template. This sheet could duplicate some of the information from the "fill-me" sheet such as project number, project name, location name, location type, etc., and then projects or sectors could add their own columns with information they need to collect regarding this location anyway.

This would help us to better deal with the Excel file because we could simply protect the "fill-me" sheet so that people who fill it out cannot break it (which will facilitate processing as well) and understand that these are the "minimum requirements". In addition, our projects and sectors would be free to add extra information which can be linked again with the locations if they want to (voluntarily). The extra columns would not be published here, unless there was an agreement that this information should be collected for all projects from a specific sector anyway (an exceptional use case is e.g. the protected areas, where the reporting needs to be extended). This way we will not overwhelm people with additional requests but give projects the flexibility to add information.

I already discussed this via phone with @Maja4Dev but I would also like to hear your opinion @fretchen, @karpfen and @goergen95. Btw. let's not discuss Excel vs. other options here but rather focus on how we can improve the template and the situation in the short term. Other formats etc. should be developed as well of course.

Correct grammar for technical notes

We wrote the technical notes sections ourselves without major native-speaker support. I would like to have a tool check the grammar and then suggest minor grammar corrections to the technical notes file.

for your information @Maja4Dev . It would be great if you could review the pull request afterwards.

Extend the collection of existing location types

This is an issue for a future update of the project location model regarding the standard location types. Updates should probably be made as soon as enough changes are proposed and updates should be made not too frequently. Next Update could be e.g. beginning of next year.

Here is a list of KC Themes and suggested new types. This list can be extended by editing the issue description.

Generic / Cross-Sectoral (incl. Climate, Gender, ICT etc.)

Governance / Decentralization

Peace / Displacement / Fragility

Financial Sector Development

Education

Health

Social Protection

Mobility / Transport

Urban Infrastructure

Energy

Water Management (incl. Water Supply & Sanitation)

Waste

Agriculture / Rural Development

Aquaculture / Fishery

Terrestrial Natural Resources Protection

  • indigenous and traditional territories (ITT)

Marine Natural Resources Protection

  • locally managed marine areas (LMMA)

Fyi @Maja4Dev & @fretchen

Commenting

Do we want something fancy for commenting like we use at d4dtools ? @Jo-Schie

For an example see here.

Make the template example rows consistent with the technical notes

In the template we have a number of rows with examples. My understanding is that they should be in agreement with the technical guidelines. Otherwise they might confuse the user and also fail in automatic validation tests...

This really starts to become obvious as we start work on automatic validators as for example in #76. There the validator currently fails on numerous rows with data. The current errors are:

Row 3 is valid
Row:  4
Error in row: 4
'approximate (admin unit)' is not one of ['exact', 'approximate']
Row:  5
Error in row: 5
'longitude' is a required property
Row:  6
Error in row: 6
'kfwProjectNoINPRO' is a required property
Row:  7
Error in row: 7
'kfwProjectNoINPRO' is a required property
Row:  8
Error in row: 8
'kfwProjectNoINPRO' is a required property
Row:  9
Error in row: 9
'kfwProjectNoINPRO' is a required property
Row:  10
Error in row: 10
'kfwProjectNoINPRO' is a required property
Row:  11
Error in row: 11
'kfwProjectNoINPRO' is a required property
Row:  12
Error in row: 12
'kfwProjectNoINPRO' is a required property
Row:  13
Error in row: 13
'kfwProjectNoINPRO' is a required property
Row:  14
Error in row: 14
'kfwProjectNoINPRO' is a required property
Row:  15
Error in row: 15
'kfwProjectNoINPRO' is a required property
Row:  16
Error in row: 16
'kfwProjectNoINPRO' is a required property
Row:  17
Error in row: 17
'kfwProjectNoINPRO' is a required property
Row:  18
Error in row: 18
'kfwProjectNoINPRO' is a required property

From what I can see all these complaints are valid. I would think that each entered line in the example template should be valid, but maybe three data lines are also sufficient ? @Maja4Dev I think that this one is most likely close to you no ?
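
To make the two kinds of complaints in the output above easier to interpret, here is a stdlib-only Python sketch of such checks: missing required properties and values outside an allowed list. It is not the validator from #76. The names `kfwProjectNoINPRO` and `longitude` and the allowed values come from the output above; `latitude` as a required field, the field name `geographicExactness`, and the sample rows are assumptions made purely for illustration.

```python
REQUIRED = ["kfwProjectNoINPRO", "latitude", "longitude"]  # "latitude" is assumed
ALLOWED = {"geographicExactness": ["exact", "approximate"]}  # field name is assumed


def validate_row(row: dict) -> list:
    """Return a list of error messages for one template row (empty if valid)."""
    errors = []
    for field in REQUIRED:
        # Treat both absent keys and empty cells as missing.
        if row.get(field) in (None, ""):
            errors.append(f"'{field}' is a required property")
    for field, allowed in ALLOWED.items():
        value = row.get(field)
        if value is not None and value not in allowed:
            errors.append(f"'{value}' is not one of {allowed}")
    return errors


# Made-up example rows, not taken from the template.
rows = [
    {"kfwProjectNoINPRO": "12345", "latitude": 1.0, "longitude": 2.0,
     "geographicExactness": "exact"},
    {"kfwProjectNoINPRO": "12345", "latitude": 1.0,
     "geographicExactness": "approximate (admin unit)"},
]

if __name__ == "__main__":
    for number, row in enumerate(rows, start=1):
        print(number, validate_row(row) or "valid")
```

Running a check like this over the example rows in a test would flag any example that drifts away from the technical notes.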

Create a json schema for the model

We now have the specifications for the template fixed within the technical notes. To make progress on this we could use these notes and translate them into a json schema. The advantages of json schema to the markdown format:

  • It is directly machine readable.
  • It can be used as an input for the generation of markdown files.
  • It has very broad support across different technological tools.
  • It can be used to directly create web forms.
  • It plays nicely with python / pydantic, which is under the hood of geonode.
  • It is designed for complex data formats. Hence it could allow us to provide a technical bridge to more advanced systems as suggested in #10, #18 and #21.
  • Any technical implementation could then be automatically tested for compatibility with the jsonschema, leading to much more flexibility and robustness.
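
To give an impression of what "machine readable" means here, the following Python sketch holds a small schema fragment as a dictionary and serialises it to JSON. It is an illustration only, not the schema of this project: the selection of properties, their types, and the helper function are assumptions.

```python
import json

# Illustrative fragment; the choice of properties and their types is an assumption.
LOCATION_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Project location (illustrative fragment)",
    "type": "object",
    "properties": {
        "kfwProjectNoINPRO": {"type": "string"},
        "latitude": {"type": "number", "minimum": -90, "maximum": 90},
        "longitude": {"type": "number", "minimum": -180, "maximum": 180},
    },
    "required": ["kfwProjectNoINPRO", "latitude", "longitude"],
}


def required_but_undefined(schema: dict) -> list:
    """List required names that have no entry under "properties"."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


if __name__ == "__main__":
    print(json.dumps(LOCATION_SCHEMA, indent=2))
    print(required_but_undefined(LOCATION_SCHEMA))
```

A validator library, for example the third-party `jsonschema` package for Python, could then check individual rows against such a schema.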

Alternatives

I currently do not see any good alternatives. Possible options would be:

xlsx

  • Hard to read for machines.
  • Hard to implement complex data structures.
  • Not really a very open standard.
  • Not very broadly used as reference.
  • Better suited as a technical implementation than as a reference.

markdown

  • Not machine readable and hence not able to serve as a technical basis for validations.

uml

  • this is a fairly abstract format which does not allow for validation.
  • so it falls more into the documentation level and allows for less precise implementations.

direct technical implementations

Keeping them compatible requires a common reference / language. This should be digestible across technical implementations. Hence, jsonschema.

I would propose this as first step to see where we can go with this. Comments @Jo-Schie , @Maja4Dev or @goergen95 before I start simple first attempts in this direction ?

Originally posted by @fretchen in #24 (comment)

Merge "Technical Notes on the Project Location Model" with READ-ME in Excel-Template

As agreed with Johannes, we will now merge the page "Technical Notes on the Project Location Model" with the READ-ME in the Excel-Template, so that users only need one information source on how to fill out the Excel Template and we need less effort to keep both up to date.
The plan is for me to propose changes in the page "Technical Notes on the Project Location Model". Once these changes have been amended and /or accepted and merged, we will update the respective READ-MEs in the Excel-Templates.

Standard for the geodata maintenance process

We receive geodata at regular intervals, e.g. with the consultants' reports.

As a first effect, this could lead to the data being uploaded redundantly, producing countless duplicates for a single location.

In principle, all existing data of a project could be re-sent each time, or only changed data could be sent.
As a next consequence, the user has to examine all received data for changes.
The data also becomes redundant by design, because geodata may be sent for projects "in planning" and again later during the implementation phases. This again leaves behind historically grown, unnecessary records with duplicate pins on the map.
Potentially, this results in "a jumble" of data.
Beyond these consistency issues, the maintenance process is EXTREMELY and unnecessarily laborious, especially for duplicated data or individual changes!

It would therefore be important to standardise data delivery so that, for example, after the very first report only changes to a project record are uploaded.
This in turn requires a unique ID for the geodata records (?!).
Alternatively, all data could always be reported anew, which however carries the risk that data is deleted by being overwritten.
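
The change-only delivery described above can be sketched in Python as follows, assuming every geodata record carries a unique ID. The key name `unique_id`, the record fields, and the sample records are hypothetical.

```python
def apply_changes(existing: dict, changes: list) -> dict:
    """Merge changed records into the stored ones, keyed by unique ID.

    Records with a known ID replace the stored version, new IDs are added,
    and records that are not mentioned in the delivery are left untouched.
    """
    merged = dict(existing)  # copy so the stored data is not modified in place
    for record in changes:
        merged[record["unique_id"]] = record
    return merged


# Made-up records for illustration.
stored = {
    "loc-1": {"unique_id": "loc-1", "name": "School A", "status": "planned"},
    "loc-2": {"unique_id": "loc-2", "name": "Well B", "status": "planned"},
}
delivery = [
    {"unique_id": "loc-1", "name": "School A", "status": "implementation"},
    {"unique_id": "loc-3", "name": "Clinic C", "status": "planned"},
]

if __name__ == "__main__":
    for key, record in sorted(apply_changes(stored, delivery).items()):
        print(key, record["status"])
```

Because records are matched by ID, a location reported first as planned and later as under implementation ends up as one record rather than two pins on the map, and nothing is deleted merely because it was absent from a delivery.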

xlsx to kml

We have frequent discussions on how to convert the xlsx files into KML. I have created a first draft of how this might be done.
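
For readers who want a rough idea of such a conversion, here is a minimal Python sketch using only the standard library. It is not the draft mentioned above. It assumes the rows have already been read from the xlsx file into dictionaries (a library such as openpyxl could do that), and the keys `name`, `latitude` and `longitude` are assumed names rather than the template's actual column headers.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def rows_to_kml(rows: list) -> str:
    """Build a KML document with one point Placemark per row."""
    ET.register_namespace("", KML_NS)  # serialise KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for row in rows:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = str(row["name"])
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects "longitude,latitude" in WGS 84 decimal degrees.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{row['longitude']},{row['latitude']}"
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Made-up example row.
    print(rows_to_kml([{"name": "Example site", "latitude": 50.11, "longitude": 8.68}]))
```

The sketch only handles point locations; lines and polygons would need additional geometry elements.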

Validation of project location types

The location types are found in the worksheet Location Types IATI and New. From what I understand they are loosely based on the IATI location types, which can be found here. Those types have four attributes:

  • Code
  • Name
  • Description
  • Category

I suggest that we start comparing the location types here with the ones from the IATI standard and move them towards a json / csv. This would allow

a) versioning
b) simpler validation
c) simpler comparison with the IATI standards

From what I understand our location types also have four attributes:

  • KC-Theme / Subsector
  • physical or immaterial location type
  • (IATI) Location Type Name (EN)
  • Geodata type

So to get started on this I have a few questions.

Questions

  1. Is it correct that our location types and the IATI location types should have the same names?
  2. Is there a way to also use location codes for our location types ? This would allow us to identify IATI types for which we changed the name.
  3. I would strongly suggest to use single words to describe each attribute of our types. They could be Subsector / Immateriality / Name / Type.
  4. Is it correct that we ignore category, code and description within the IATI location types for the moment ?

Solving this seems like a prerequisite for the work on #70 from what I can see.

@Maja4Dev I think that you know the answers to these questions best.

make the repo ready for netlify

It would be nice to have previews in PRs and this might be possible with netlify. However, the builds are currently failing as we might be missing some important docs...

Release prep for v1

In this issue we collect important things that need to be done to finish milestone v1 and then transform them into issues where appropriate.

@Maja4Dev, @goergen95 and @Jo-Schie any lists ?

Questions I have are:

  • French stuff is required or optional ?
  • Readme updates are required or optional ?
  • Any technical details required ?
  • Workflows from #34 required ?
  • Icons of #8 required for v1 ?
  • Governance of changes #23 ?

What else ?
