ocean-toxin-prediction

A data science project predicting the risk of toxic domoic acid events along the California coast

Background

Algae are important players in our ecosystem and the marine food web, but under certain conditions some algae grow rapidly and release large amounts of toxins that are harmful to humans and marine animals. Such biological events are known as harmful algal blooms. There are many types of harmful algal blooms; one of particular interest to the Pacific coastal states of the U.S. is the Pseudo-nitzschia bloom, which produces the toxin domoic acid. Domoic acid is a neurotoxin that can cause serious neurological symptoms, and sometimes death, if ingested. It is dangerous to swimmers and beachgoers, and it can also accumulate in seafood, especially shellfish such as crabs and clams. High domoic acid levels in coastal waters are therefore both a public health concern and an economic risk. For example, the harmful algal bloom in 2015 caused closures and delays of the Dungeness crab and rock crab seasons, which cost the fishing and seafood industry millions of dollars.

Problem Statement

Physical, chemical, and biological conditions such as water temperature, nutrient levels, seasonality, and the presence of other algae may contribute to the formation of harmful algal blooms and the production of toxins. This project aims to use data on these conditions to predict the risk of toxic algal bloom events in the California coastal region.

Findings

Data were obtained from multiple sources, then combined and cleaned. The resulting dataset had 2,750 samples with 17 independent features; 120 of those samples had dangerous levels of toxin.
Through exploratory data analysis, I found that Pseudo-nitzschia seriata cell concentration was the most important factor in causing toxic events. Chlorophyll concentration, water temperature, and season also had a significant impact on the risk of toxic events.
I built logistic regression, support vector machine, random forest, gradient boosting, and voting models to predict the risk of toxic events. The voting model was the most useful among them, achieving 53% average precision and 91% accuracy.
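The modeling step described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the code from the notebooks: the data are a synthetic stand-in generated to match the reported size and class balance (2,750 samples, 17 features, 120 of 2,750 ≈ 4.4% positives), and the soft-voting setup, hyperparameters, train/test split, and all variable names are assumptions. Because toxic events are rare, the sketch also computes two reference points: the accuracy of always predicting "not toxic" (one minus the prevalence) and the prevalence itself, which is roughly the average precision of a no-skill ranking.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the cleaned dataset, sized to match the
# reported 2,750 samples and 17 features with a rare positive class.
X, y = make_classification(
    n_samples=2750,
    n_features=17,
    n_informative=6,
    weights=[0.956],  # about 4.4% positives before label noise
    random_state=0,
)

# ASSUMPTION: a stratified 75/25 split; the original split is not specified.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# ASSUMPTION: soft voting over the four model families named in the findings,
# all with near-default hyperparameters.
voter = VotingClassifier(
    estimators=[
        ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("boost", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",  # average the predicted probabilities of the members
)
voter.fit(X_train, y_train)

# Average precision is computed from predicted probabilities, accuracy from labels.
toxic_probability = voter.predict_proba(X_test)[:, 1]
average_precision = average_precision_score(y_test, toxic_probability)
accuracy = accuracy_score(y_test, voter.predict(X_test))

# Reference points for an imbalanced problem.
prevalence = y_test.mean()            # approximate no-skill average precision
majority_accuracy = 1.0 - prevalence  # accuracy of always predicting "not toxic"

print(f"average precision: {average_precision:.3f} (no-skill: {prevalence:.3f})")
print(f"accuracy:          {accuracy:.3f} (majority class: {majority_accuracy:.3f})")
```

Printing each metric next to its reference point makes it easier to judge how much of the score comes from the model rather than from the class imbalance. The numbers this sketch prints describe the synthetic data only and are not comparable to the results reported above.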

File structure

  1. Data wrangling, EDA, machine learning code and figures are in Jupyter Notebook form and can be found in notebooks/
  2. Raw data are in data/raw
  3. Cleaned data are in data/cleaned_data.csv
  4. The final report and a presentation are in Final_report.pdf and presentation.pdf

Contributors

zxl124