
psychic-invention's Introduction

Hello there 👋

Fish + AI Business - Fish-Nose

Fish have super good noses that can smell all kinds of things! Well, we made a computer fish nose that can smell fish too! This fish nose machine can smell what kind of fish it is, like if it's a Hoki fish or a Mackerel fish. It can even smell which part of the fish it's sniffing, like the head or the tail. Sometimes the fish get mixed together by accident and the fish nose sniffs that out too!

If stinky oil from a boat leaks onto the fish, the fish nose machine can smell that - yuck! It's good at finding fish that got oily and shouldn't be eaten. The best part is the fish nose knows each fish apart by their smell, like Sally the Hoki and Sammy the Mackerel! Our fish nose makes sure all the fish stay safe to eat. Pretty cool huh?

Shiny links

  1. Proposal
  2. Documentation
  3. AJCAI 2022 Paper

Demos

Personal introduction

My goal is to leave the world a better place than I found it. I plan to bring that goal into reality by creating technology that improves quality of life. This goal motivated my BE Honours (First Class) in Software Engineering, and led me to a contract in Software Development at NIWA, collaborating with scientists and physicists to publish their research on our oceans and atmosphere to a global audience. I am currently undertaking a PhD in Artificial Intelligence. Software is a medium to explore my scientific curiosity and contribute meaningful change.

Social Links

Support

  • Buy me a coffee ☕
  • $ETH --> 0x045BA9c0c69AF53B2Fca0e1A3769E44D9a328696

My Latest Blog Posts 👇

psychic-invention's People

Contributors

woodrock


psychic-invention's Issues

Postgres Schema

Goal

Create a schema for the biological data-set on the wellimos server.

Tasklist

  • Get user access to the wellimos postgres
  • Create schema
  • Create store on Geoserver
  • Add table with blob for download

Success Criteria

There is a postgres schema for the biological data-set. This includes a table which refers to the file path for downloads.
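The schema and download table described above could be sketched in SQL as follows. This is a hypothetical sketch only: the `bio` schema name comes from the issue below, but the table and column names are placeholders.

```sql
-- Hypothetical sketch: table and column names are placeholders.
CREATE SCHEMA IF NOT EXISTS bio;

CREATE TABLE IF NOT EXISTS bio.downloads (
    id        serial PRIMARY KEY,
    file_path text NOT NULL,  -- path to the zipfile served for download
    file_blob bytea           -- optional inline copy of the file (blob for download)
);
```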

GeoNetwork Access

Goal

Acquire an account on the NZODN GeoNetwork to publish metadata records.

Tasklist

  • Email Glenn
  • Create an account
  • Create a test record
  • Purpose

Success Criteria

We are now able to publish metadata records on the NZODN GeoNetwork catalogue. These records must refer to the GeoServer webservice, referencing a specific layer.

Bounding box as geometry

Goal

The biological_map table within the schema bio should have one entry with a polygon geometry. That polygon needs to be the bounding box for the dataset.

Tasklist

  • Find Bounding Box
  • Update geom column
  • grant nzodn role read privileges
  • verify with DBeaver or QGIS

Success Criteria

The bounding box has been added as column to the biological_map table, and we have verified it is correct using a visual tool (i.e., DBeaver or QGIS).
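The steps above can be sketched with PostGIS SQL. The coordinates and SRID below are placeholders, not the dataset's real extent; `ST_MakeEnvelope(xmin, ymin, xmax, ymax, srid)` builds the bounding-box polygon.

```sql
-- Hypothetical sketch: coordinates and SRID are placeholders.
UPDATE bio.biological_map
SET geom = ST_MakeEnvelope(165.0, -48.0, 180.0, -33.0, 4326);

-- Grant the nzodn role read privileges on the table.
GRANT SELECT ON bio.biological_map TO nzodn;
```

The result can then be inspected visually in DBeaver or QGIS as described.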

Investigate FTP usage in bash script

Goal

Understand the existing use of the FTP server within the existing data ingestion script.

Tasklist

  • find FTP usage in loadAllMooringData.sh
  • understand how it works
  • see relation to configuration file prod.ini

Success Criteria

I understand how the existing cron job uses FTP in its scripts, and how the configuration file factors into the whole process. I am now comfortable investigating how this can be implemented as SFTP.

Publish Directory

Goal

Gain access to the /data/niwa/publish directory, then copy the zipfile of the dataset there.

Tasklist

  • Raise ticket to Service Desk
  • Gain access to the directory
  • Email Glenn to clarify process
  • Create bio sub-directory
  • Copy temporary zipfile to bio directory

Success Criteria

I have access to the directory where the other zipfiles are stored, and have copied the temporary zipfile into its own folder called bio.

Service Account

Goal

We need a dedicated service account, say nzodn_admin or robot, to perform the cron job.

Tasklist

  • Privileges to add cron jobs to root
  • Service account nzodn_admin
  • relevant permissions to access directories
  • Privileges to sudo into this account

Success Criteria

There is a service account that can run cron jobs, with permissions to the directories it needs. The cron jobs are scheduled on the root user to use this service account.

Bounding Box and SRS for GeoServer

Goal

Update the bounding box for the biological_map layer on GeoServer. Also ensure that the correct SRS projection is used.

Tasklist

  • Get bounding box
  • Convert to decimal degrees
  • Adjust Bounding box for layer
  • SRS projection for the data

Success Criteria

The correct SRS projection and bounding box information is provided on GeoServer for the Biological dataset.

Move new 🆕 zip file

Goal

Move the new zip file to the publish directory once it has been updated.

Tasklist

  • Update the folder.
  • Zip the folder again.
  • Move to the correct directory.

Success Criteria

The zip file in the publish directory contains the updated version of the directory.

Invalid integer for depth.

Goal

Allow the ingestion script to take CSVs that have depth values that contain decimals.

Background

The existing script expects the depth value to be an integer. However, in the CSV provided, these can be decimals.

invalid input syntax for integer

Tasklist

  • Change datatype for the depth field.

Success Criteria

The ingestion script is able to handle decimal values for depth measurements.
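One way to change the datatype (a sketch only; the actual table name in the ingestion schema may differ, `datasource` is borrowed from a later issue) is to widen the column from integer to numeric:

```sql
-- Hypothetical sketch: table name is a placeholder.
ALTER TABLE datasource
    ALTER COLUMN depth TYPE numeric USING depth::numeric;
```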

Cronjob for Fisheries

Goal

Create a crontab for the fisheries datastream.

Tasklist

  • Find the script to run.
  • Add to crontab.

Success Criteria

There is a crontab that has been created for the fisheries datastream.

Test SFTP Server

Goal

Test access to the NIWA SFTP server.

Tasklist

  • Send IT public SSH key.
  • Get SFTP address.
  • Create test directory for experimentation
  • Test CRUD functionality

Success Criteria

We have remote access to the SFTP server, and sufficient permissions to perform the jobs needed for the ingestion.
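The key-exchange step in the tasklist can be sketched as below. The key name and directory are placeholders; only the public `.pub` file is sent to IT, the private key never leaves the server.

```shell
# Hypothetical sketch: generate an SSH key pair for SFTP access.
# The key path is a placeholder; send only the .pub file to IT.
keydir=$(mktemp -d)
ssh-keygen -t ed25519 -f "$keydir/nzodn_sftp" -N "" -q
ls "$keydir"
```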

SFTP Wiki

Goal

Create a page which outlines the usage of SFTP for the CTD dataset.

Tasklist

  • SFTP / FTP / SSH
  • SSH Keygen
  • SFTP Example

Success Criteria

There is enough documentation on the project wiki, such that a new user, with no technical experience, could get up to speed on the purpose and usage of SFTP within the project.

Zip File Unreadable

Goal

Make the zip file for the first data ingestion readable. It can currently be downloaded, but it is in a format that cannot be read by the system.

Tasklist

Success Criteria

Move Moorings to SFTP

Goal

Ensure the existing Moorings dataset works on the SFTP server.

Tasklist

  • Change FTP commands to SFTP in scripts.
  • Test on existing datasets.

Success Criteria

The existing ingestion stream for the moorings data now works on the SFTP server.

Date/time invalid.

Goal

Make the fisheries scripts pass a valid date/time to the datasource table for the startime and endtime fields.

Background

The following error was thrown.

ERROR:  date/time field value out of range: "2016-10-30 53:1 +12"

This was because the unix timestamp created in the loadFisheriesData.sh script using the gawk utility could not handle times before 10am (i.e., before 10:00). It can't handle the 05:31 am shown above.

Since 05:31 is read as 531, a very simple substring() is called: it extracts the first two characters as the hours (i.e., "53") and the characters from index 3 onward as the minutes (i.e., "1").

This created an invalid timestamp.

Tasklist

  • Handle 539 i.e. 05:39
  • Handle 40 i.e. 00:40

Success Criteria

The script should be able to handle timestamps before 10am. This involves checking the length of the HHMM string, then deciding how best to proceed. A robust script will be able to handle the following times: 40 --> 00:40, 531 --> 05:31, 1521 --> 15:21.
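The fix described above can be sketched in shell (the actual script uses gawk; `to_hhmm` is a hypothetical helper name): zero-pad the HHMM value to four digits before slicing, so the hour and minute substrings are always well defined.

```shell
# Hypothetical sketch: zero-pad HHMM to four digits, then slice.
to_hhmm() {
    padded=$(printf "%04d" "$1")   # 40 -> 0040, 531 -> 0531, 1521 -> 1521
    echo "${padded:0:2}:${padded:2:2}"
}

to_hhmm 40     # prints 00:40
to_hhmm 531    # prints 05:31
to_hhmm 1521   # prints 15:21
```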

CRON

Goal

Learn what a CRON job is and how to implement one.

Tasklist

  • Learn CRON basics
  • Understand existing scripts
  • Write my own CRON job

Success Criteria

I will understand what a CRON job is, what the existing mooring data CRON job does, and have first-hand experience implementing one.
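For reference, a crontab entry is five time fields (minute, hour, day-of-month, month, day-of-week) followed by the command. The entry below is a hypothetical sketch, not the actual mooring job: the schedule and log path are placeholders, and only the script name comes from this document.

```cron
# m  h  dom mon dow  command
  0  2  *   *   *    /path/to/loadAllMooringData.sh >> /var/log/moorings.log 2>&1
```

Entries are edited with `crontab -e` for the current user.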

Data Analysis

Goal

Get an understanding of the datasets that are to be ingested into the NZODN database.

Tasklist

  • find the directory for each data set
  • explore the files
  • purpose for each file type

Success Criteria

I have a general understanding of the CTD and Biological datasets, sufficient to make tracks towards starting the ingestion process.

Chatham CTD Dataset

Goal

Create a new dataset on NZODN for the one-time Chatham biological dataset.

Tasklist

  • GeoServer
  • GeoNetwork
  • PostgreSQL
  • wellimos data folder

Success Criteria

A dataset has been added to the NZODN for the new biological dataset.

Convert FTP to SFTP

Goal

Convert the existing scripts to use sftp rather than ftp.

Tasklist

  • Generate SSH Keys for service account
  • Copy public key to katapo server
  • Replace ftp calls with sftp
  • test it works

Success Criteria

The script now uses sftp, and is able to make a secure connection to the server and pull the data that needs to be ingested.

Missing Bounding Box on Subset

Goal

Show the bounding box for the map on the subset page.

Tasklist

  • Compare to existing example.
  • Copy the existing example's implementation.
  • Update details where appropriate.
  • Publish

Success Criteria

The bounding box is showing on the subset page of the NZODN website.

Datastream for Oceanographic

Goal

Create a new data stream for the Oceanographic data.

Tasklist

  • GeoServer Store
  • GeoServer Layer
  • Wellimos Scripts (copy of mooring)
  • Publish directory 'oceanographic'
  • Test that ingestion stream works on data.
  • Test that the downloads works from NZODN portal.

Success Criteria

There is a new datastream that works for the Oceanographic data.

NZODN Web Portal Down

Goal

Restore functionality to the NZODN web portal.

Background

The NZODN web portal is not working. It has run into the same issue before. It is likely to be related to the same bug, and hopefully an easy fix at the application level.

Tasklist

  • Raise ticket with IT.
  • Troubleshoot ourselves...
  • Get fixed?

Success Criteria

The NZODN web portal is back up and running.

Update Views for other datastreams

Goal

Ensure the materialized views for the other datastreams are updated each time new data is added to the portal.

Tasklist

  • Moorings
    • Add SQL function to update views
    • Test
  • Fisheries
    • Add SQL function to update views
    • Test

Success Criteria

Ensure that the other data streams (mooring and fisheries) also update their materialized views when new data is ingested.
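The "SQL function to update views" items could be sketched as below. The function and view names are placeholders, not the actual database objects.

```sql
-- Hypothetical sketch: function and view names are placeholders.
CREATE OR REPLACE FUNCTION refresh_datastream_views() RETURNS void AS $$
BEGIN
    REFRESH MATERIALIZED VIEW moorings_view;
    REFRESH MATERIALIZED VIEW fisheries_view;
END;
$$ LANGUAGE plpgsql;

-- Called at the end of each ingestion run:
SELECT refresh_datastream_views();
```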

GeoNetwork Metadata Record

Goal

Create a metadata record on the GeoNetwork with the relevant information and that references the GeoServer webservice.

Tasklist

  • Copy similar existing metadata record
  • Fill in information that is available
  • Reference WMS and WFS
  • Publish

Success Criteria

We have created a metadata record on the GeoNetwork catalogue. This references the GeoServer. Once tested, it is published to the NZODN.

NZODN Workflow Documentation

Goal

Create basic documentation for the NZODN workflow.

Tasklist

  • PlantUML diagram
  • Textual explanation for each component
  • Wiki

Success Criteria

Basic documentation has been created for the NZODN data ingestion process, for new users to follow.
