
psychic-invention's Introduction

Hello there 👋

Fish + AI Business - Fish-Nose

Fish have super good noses that can smell all kinds of things! Well, we made a computer fish nose that can smell fish too! This fish nose machine can smell what kind of fish it is, like if it's a Hoki fish or a Mackerel fish. It can even smell which part of the fish it's sniffing, like the head or the tail. Sometimes the fish get mixed together by accident and the fish nose sniffs that out too!

If stinky oil from a boat leaks onto the fish, the fish nose machine can smell that - yuck! It's good at finding fish that got oily and shouldn't be eaten. The best part is the fish nose knows each fish apart by their smell, like Sally the Hoki and Sammy the Mackerel! Our fish nose makes sure all the fish stay safe to eat. Pretty cool huh?

Shiny links

  1. Proposal
  2. Documentation
  3. AJCAI 2022 Paper

Demos

Personal introduction

My goal is to leave the world a better place than I found it. I plan to bring that goal into reality by creating technology that improves quality of life. This goal motivated my BE Honours (First Class) in Software Engineering, and led me to a contract in Software Development at NIWA, collaborating with scientists and physicists to publish their research on our oceans and atmosphere to a global audience. I am currently undertaking a PhD in Artificial Intelligence. Software is a medium to explore my scientific curiosity and contribute meaningful change.

Social Links

Support

  • Buy me a coffee ☕
  • $ETH --> 0x045BA9c0c69AF53B2Fca0e1A3769E44D9a328696

My Latest Blog Posts 👇

psychic-invention's People

Contributors

woodrock


psychic-invention's Issues

Postgres Schema

Goal

Create a schema for the biological data-set on the wellimos server.

Tasklist

  • Get user access to the wellimos postgres
  • Create schema
  • Create store on Geoserver
  • Add table with blob for download

Success Criteria

There is a postgres schema for the biological data-set. This includes a table which refers to the file path for downloads.
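The schema and download table described above could be sketched in SQL as follows. This is a hypothetical sketch only: the `bio` schema name comes from the issue below, but the table and column names are placeholders.

```sql
-- Hypothetical sketch: table and column names are placeholders.
CREATE SCHEMA IF NOT EXISTS bio;

CREATE TABLE IF NOT EXISTS bio.downloads (
    id        serial PRIMARY KEY,
    file_path text NOT NULL,  -- path to the zipfile served for download
    file_blob bytea           -- optional inline copy of the file (blob for download)
);
```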

GeoNetwork Access

Goal

Acquire an account on the NZODN GeoNetwork to publish metadata records.

Tasklist

  • Email Glenn
  • Create an account
  • Create a test record
  • Purpose

Success Criteria

We are now able to publish metadata records on the NZODN GeoNetwork catalogue. These records must refer to the GeoServer webservice, referencing a specific layer.

Bounding box as geometry

Goal

The biological_map table within the schema bio should have one entry with a polygon geometry. That polygon needs to be the bounding box for the dataset.

Tasklist

  • Find Bounding Box
  • Update geom column
  • grant nzodn role read privileges
  • verify with DBeaver or QGIS

Success Criteria

The bounding box has been added as column to the biological_map table, and we have verified it is correct using a visual tool (i.e., DBeaver or QGIS).
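The steps above can be sketched with PostGIS SQL. The coordinates and SRID below are placeholders, not the dataset's real extent; `ST_MakeEnvelope(xmin, ymin, xmax, ymax, srid)` builds the bounding-box polygon.

```sql
-- Hypothetical sketch: coordinates and SRID are placeholders.
UPDATE bio.biological_map
SET geom = ST_MakeEnvelope(165.0, -48.0, 180.0, -33.0, 4326);

-- Grant the nzodn role read privileges on the table.
GRANT SELECT ON bio.biological_map TO nzodn;
```

The result can then be inspected visually in DBeaver or QGIS as described.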

Investigate FTP usage in bash script

Goal

Understand the existing use of the FTP server within the existing data ingestion script.

Tasklist

  • find FTP usage in loadAllMooringData.sh
  • understand how it works
  • see relation to configuration file prod.ini

Success Criteria

I understand how the existing cron job uses FTP in its scripts, and how the configuration file factors into the whole process. I am now comfortable investigating how this can be implemented as SFTP.

Publish Directory

Goal

Gain access to the /data/niwa/publish directory, then copy the zipfile of the dataset there.

Tasklist

  • Raise ticket to Service Desk
  • Gain access to the directory
  • Email Glenn to clarify process
  • Create bio sub-directory
  • Copy temporary zipfile to bio directory

Success Criteria

I have access to the directory where the other zipfiles are stored, and have copied the temporary zipfile into its own folder called bio.

Service Account

Goal

We need a dedicated service account, say nzodn_admin or robot, to perform the cron job.

Tasklist

  • Privileges to add cron jobs to root
  • Service account nzodn_admin
  • relevant permissions to access directories
  • Privileges to sudo into this account

Success Criteria

There is a service account that can run cron jobs, with permissions to the directories it needs. The cron jobs are scheduled on the root user to use this service account.

Bounding Box and SRS for GeoServer

Goal

Update the bounding box for the biological_map layer on GeoServer. Also ensure that the correct SRS projection is used.

Tasklist

  • Get bounding box
  • Convert to decimal degrees
  • Adjust Bounding box for layer
  • SRS projection for the data

Success Criteria

The correct SRS projection and bounding box information is provided on GeoServer for the Biological dataset.

Move new 🆕 zip file

Goal

Move the new zip file to the publish directory once it has been updated.

Tasklist

  • Update the folder.
  • Zip the folder again.
  • Move to the correct directory.

Success Criteria

The zip file in the publish directory contains the updated version of the directory.

Invalid integer for depth.

Goal

Allow the ingestion script to take CSVs that have depth values that contain decimals.

Background

The existing script expects the depth value to be an integer. However, in the CSV provided, these can be decimals.

invalid input syntax for integer

Tasklist

  • Change datatype for the depth field.

Success Criteria

The ingestion script is able to handle decimal values for depth measurements.
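One way to change the datatype (a sketch only; the actual table name in the ingestion schema may differ, `datasource` is borrowed from a later issue) is to widen the column from integer to numeric:

```sql
-- Hypothetical sketch: table name is a placeholder.
ALTER TABLE datasource
    ALTER COLUMN depth TYPE numeric USING depth::numeric;
```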

Cronjob for Fisheries

Goal

Create a crontab for the fisheries datastream.

Tasklist

  • Find the script to run.
  • Add to crontab.

Success Criteria

There is a crontab that has been created for the fisheries datastream.

Test SFTP Server

Goal

Test access to the NIWA SFTP server.

Tasklist

  • Send IT public SSH key.
  • Get SFTP address.
  • Create test directory for experimentation
  • Test CRUD functionality

Success Criteria

We have remote access to the SFTP server, and sufficient permissions to perform the jobs needed for the ingestion.
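The key-exchange step in the tasklist can be sketched as below. The key name and directory are placeholders; only the public `.pub` file is sent to IT, the private key never leaves the server.

```shell
# Hypothetical sketch: generate an SSH key pair for SFTP access.
# The key path is a placeholder; send only the .pub file to IT.
keydir=$(mktemp -d)
ssh-keygen -t ed25519 -f "$keydir/nzodn_sftp" -N "" -q
ls "$keydir"
```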

SFTP Wiki

Goal

Create a page which outlines the usage of SFTP for the CTD dataset.

Tasklist

  • SFTP / FTP / SSH
  • SSH Keygen
  • SFTP Example

Success Criteria

There is enough documentation on the project wiki, such that a new user, with no technical experience, could get up to speed on the purpose and usage of SFTP within the project.

Zip File Unreadable

Goal

Make the zip file for the first data ingestion readable. It can currently be downloaded, but it is in a format that cannot be read by the system.

Tasklist

Success Criteria

Move Moorings to SFTP

Goal

Ensure the existing Moorings dataset works on the SFTP server.

Tasklist

  • Change FTP commands to SFTP in scripts.
  • Test on existing datasets.

Success Criteria

The existing ingestion stream for the moorings data now works on the SFTP server.

Date/time invalid.

Goal

Make the fisheries scripts pass a valid date/time to the datasource table for the startime and endtime fields.

Background

The following error was thrown.

ERROR:  date/time field value out of range: "2016-10-30 53:1 +12"

This was because the unix timestamp created in the loadFisheriesData.sh script using the gawk utility could not handle times before 10am (i.e., before 10:00). It can't handle the 05:31 am shown above.

Since 05:31 is read as 531, a very simple substring() is called: it extracts the first two characters as the hours (i.e., "53") and the characters from index 3 onward as the minutes (i.e., "1").

This created an invalid timestamp.

Tasklist

  • Handle 539 i.e. 05:39
  • Handle 40 i.e. 00:40

Success Criteria

The script should be able to handle timestamps before 10am. This involves checking the length of the HHMM string, then deciding how best to proceed. A robust script will be able to handle the following times: 40 --> 00:40, 531 --> 05:31, 1521 --> 15:21.
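The fix described above can be sketched in shell (the actual script uses gawk; `to_hhmm` is a hypothetical helper name): zero-pad the HHMM value to four digits before slicing, so the hour and minute substrings are always well defined.

```shell
# Hypothetical sketch: zero-pad HHMM to four digits, then slice.
to_hhmm() {
    padded=$(printf "%04d" "$1")   # 40 -> 0040, 531 -> 0531, 1521 -> 1521
    echo "${padded:0:2}:${padded:2:2}"
}

to_hhmm 40     # prints 00:40
to_hhmm 531    # prints 05:31
to_hhmm 1521   # prints 15:21
```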

CRON

Goal

Learn what a CRON job is and how to implement one.

Tasklist

  • Learn CRON basics
  • Understand existing scripts
  • Write my own CRON job

Success Criteria

I will understand what a CRON job is, what the existing mooring data CRON job does, and have first-hand experience implementing one.
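For reference, a crontab entry is five time fields (minute, hour, day-of-month, month, day-of-week) followed by the command. The entry below is a hypothetical sketch, not the actual mooring job: the schedule and log path are placeholders, and only the script name comes from this document.

```cron
# m  h  dom mon dow  command
  0  2  *   *   *    /path/to/loadAllMooringData.sh >> /var/log/moorings.log 2>&1
```

Entries are edited with `crontab -e` for the current user.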

Data Analysis

Goal

Get an understanding of the datasets that are to be ingested into the NZODN database.

Tasklist

  • find the directory for each data set
  • explore the files
  • purpose for each file type

Success Criteria

I have a general understanding of the CTD and Biological datasets, sufficient to make tracks towards starting the ingestion process.

Chatham CTD Dataset

Goal

Create a new dataset on NZODN for the one-time Chatham biological dataset.

Tasklist

  • GeoServer
  • GeoNetwork
  • PostgreSQL
  • wellimos data folder

Success Criteria

A dataset has been added to the NZODN for the new biological dataset.

Convert FTP to SFTP

Goal

Convert the existing scripts to use sftp rather than ftp.

Tasklist

  • Generate SSH Keys for service account
  • Copy public key to katapo server
  • Replace ftp calls with sftp
  • test it works

Success Criteria

The script now uses sftp, and is able to make a secure connection to the server and pull the data that needs to be ingested.

Missing Bounding Box on Subset

Goal

Show the bounding box for the map on the subset page.

Tasklist

  • Compare to existing example.
  • Copy the existing example's implementation.
  • Update details where appropriate.
  • Publish

Success Criteria

The bounding box is showing on the subset page of the NZODN website.

Datastream for Oceanographic

Goal

Create a new data stream for the Oceanographic data.

Tasklist

  • GeoServer Store
  • GeoServer Layer
  • Wellimos Scripts (copy of mooring)
  • Publish directory 'oceanographic'
  • Test that ingestion stream works on data.
  • Test that the downloads works from NZODN portal.

Success Criteria

There is a new datastream that works for the Oceanographic data.

NZODN Web Portal Down

Goal

Restore functionality to the NZODN web portal.

Background

The NZODN web portal is not working. It has run into the same issue before. It is likely to be related to the same bug, and hopefully an easy fix at the application level.

Tasklist

  • Raise ticket with IT.
  • Troubleshoot ourselves...
  • Get fixed?

Success Criteria

The NZODN web portal is back up and running.

Update Views for other datastreams

Goal

Ensure the materialized views for the other datastreams are updated each time new data is added to the portal.

Tasklist

  • Moorings
    • Add SQL function to update views
    • Test
  • Fisheries
    • Add SQL function to update views
    • Test

Success Criteria

Ensure that the other data streams (mooring and fisheries) also update their materialized views when new data is ingested.
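The "SQL function to update views" items could be sketched as below. The function and view names are placeholders, not the actual database objects.

```sql
-- Hypothetical sketch: function and view names are placeholders.
CREATE OR REPLACE FUNCTION refresh_datastream_views() RETURNS void AS $$
BEGIN
    REFRESH MATERIALIZED VIEW moorings_view;
    REFRESH MATERIALIZED VIEW fisheries_view;
END;
$$ LANGUAGE plpgsql;

-- Called at the end of each ingestion run:
SELECT refresh_datastream_views();
```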

GeoNetwork Metadata Record

Goal

Create a metadata record on the GeoNetwork with the relevant information and that references the GeoServer webservice.

Tasklist

  • Copy similar existing metadata record
  • Fill in information that is available
  • Reference WMS and WFS
  • Publish

Success Criteria

We have created a metadata record on the GeoNetwork catalogue. This references the GeoServer. Once tested, it is published to the NZODN.

NZODN Workflow Documentation

Goal

Create basic documentation for the NZODN workflow.

Tasklist

  • PlantUML diagram
  • Textual explanation for each component
  • Wiki

Success Criteria

Basic documentation has been created for the NZODN data ingestion process, for new users to follow.
