Data Stream Pilot

Data preparation pipeline on IoT Test Bed!
Explore the docs »

View Demo · Report Bug · Request Feature



Table of Contents
  1. About The Project
  2. Getting Started
  3. Architecture
  4. Data Processing
  5. Documentation
  6. Contact
  7. Acknowledgments

About The Project

This project demonstrates a data preparation pipeline. Our sensors produce erroneous data, which we simulate by adding Gaussian noise. The noise is high enough that the sensors can be considered almost non-functional. The pipeline filters the noise in several stages and saves the processed data to a database.
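The noise model described above can be sketched in a few lines. This is only an illustration, not the project's sensor code: the true temperature, the noise level and the function name `noisy_reading` are all assumptions made for the example.

```python
import random


def noisy_reading(true_value, sigma, rng):
    """Return true_value plus zero-mean Gaussian noise with standard deviation sigma."""
    return true_value + rng.gauss(0.0, sigma)


rng = random.Random(42)  # fixed seed so the run is repeatable
TRUE_TEMP_C = 25.0       # hypothetical true temperature
SIGMA = 10.0             # hypothetical noise level, deliberately large

readings = [noisy_reading(TRUE_TEMP_C, SIGMA, rng) for _ in range(1000)]
mean = sum(readings) / len(readings)
print(f"first reading: {readings[0]:.2f}, mean of 1000 readings: {mean:.2f}")
```

A single reading can be far from the true value, while the average over many readings stays close to it. That is the property the later filtering stages rely on.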

Presentation : Google Slides

Built With

(back to top)

Getting started

Prerequisites

Test bed,

Server,

A server with an IPv6 stack and a public IPv6 address is required, because RIOT OS supported only an IPv6 stack at the time of this project. We use an Amazon EC2 instance.

  • AWS account with EC2 access
  • Docker installed on the EC2 instance

References,

(back to top)

Installation

IoT Test Bed setup

  • Clone the repository to the testbed.
  • Set the variables SENSE_SITE and COAP_SERVER_IP in scripts/setup_env.sh
# grenoble, paris, lille, saclay, strasbourg
export SENSE_SITE=grenoble
  • The system is already set up for grenoble, strasbourg and saclay
    • For other sites, you may have to change the BORDER_ROUTER_IP assignment in the same script
    if [ "$SENSE_SITE" = "grenoble" ]; then
      # 2001:660:5307:3100::/64	2001:660:5307:317f::/64
      export BORDER_ROUTER_IP=2001:660:5307:313f::1/64
    elif [ "$SENSE_SITE" = "paris" ]; then
    ...
  • In scripts/mini_project2.sh, the nodes are selected manually for the same sites. If those nodes are no longer working, select other nodes in the script
if [ "$SENSE_SITE" = "grenoble" ]; then
    export BORDER_ROUTER_NODE=219
    export COMPUTE_ENGINE_NODE_1=220
    export COMPUTE_ENGINE_NODE_2=221
    export COMPUTE_ENGINE_NODE_3=222
elif [ "$SENSE_SITE" = "saclay" ]; then
    export BORDER_ROUTER_NODE=5
    export COMPUTE_ENGINE_NODE_1=7
  • Run the command make run_mini_project_2
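When changing BORDER_ROUTER_IP for a site, it can help to check that the chosen prefix is one the site actually offers. The sketch below assumes that the two prefixes in the script comment for grenoble are the first and last available /64 prefixes; the source does not state this explicitly. The helper name `prefix_in_range` is hypothetical.

```python
import ipaddress


def prefix_in_range(interface, first_prefix, last_prefix):
    """Return True if the interface's network lies between two prefixes (inclusive)."""
    net = ipaddress.ip_interface(interface).network
    low = ipaddress.ip_network(first_prefix)
    high = ipaddress.ip_network(last_prefix)
    return low.network_address <= net.network_address <= high.network_address


# Values copied from the grenoble branch of the setup script.
ok = prefix_in_range(
    "2001:660:5307:313f::1/64",
    "2001:660:5307:3100::/64",
    "2001:660:5307:317f::/64",
)
print("BORDER_ROUTER_IP prefix within assumed range:", ok)
```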

(back to top)

Server setup

Setting Up EC2 and Assigning a Public IPv6 Address

  1. Create an EC2 Instance

    • Launch an EC2 instance with a suitable AMI (Amazon Machine Image).
    • Ensure that the instance has the necessary permissions to interact with other AWS services.
  2. Assign a Public IPv6 Address

  3. Configure Inbound Rules for CoAP

    • Go to the AWS Management Console.

    • Navigate to the EC2 Dashboard.

    • Select your instance, go to the "Security" tab, and click on the associated Security Group.

    • In the Security Group settings, add an inbound rule for UDP at port 5683 for IPv6.

      Type: Custom UDP Rule
      Protocol: UDP
      Port Range: 5683
      Source: ::/0
      

      This allows incoming UDP traffic on port 5683 from any IPv6 address.

    • Note for Testing: For testing purposes, all IPv6 addresses are allowed (::/0). In a production environment, restrict the source to a known range of IP addresses.

  4. Configure Inbound Rules for Grafana

    • Add an inbound rule for TCP at port 3000 for IPv4.

      Type: Custom TCP Rule
      Protocol: TCP
      Port Range: 3000
      Source: 0.0.0.0/0
      

      This allows incoming TCP traffic on port 3000 from any IPv4 address.

    • Note for Testing: For testing purposes, all IPv4 addresses are allowed (0.0.0.0/0). In a production environment, restrict the source to a known range of IP addresses.
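The two inbound rules can also be added from a script instead of the console. The following is a minimal sketch using boto3, which this project does not necessarily use. The helper names `ingress_rule` and `apply_rules` are hypothetical, and calling `apply_rules` requires boto3, AWS credentials and the ID of your own security group.

```python
def ingress_rule(protocol, port, cidr):
    """Build one IpPermissions entry; a CIDR containing ':' is treated as IPv6."""
    rule = {"IpProtocol": protocol, "FromPort": port, "ToPort": port}
    if ":" in cidr:
        rule["Ipv6Ranges"] = [{"CidrIpv6": cidr}]
    else:
        rule["IpRanges"] = [{"CidrIp": cidr}]
    return rule


RULES = [
    ingress_rule("udp", 5683, "::/0"),       # CoAP over IPv6
    ingress_rule("tcp", 3000, "0.0.0.0/0"),  # Grafana over IPv4
]


def apply_rules(group_id, rules=RULES):
    """Add the rules to the given security group (not called in this sketch)."""
    import boto3  # imported here so the rule builder works without boto3 installed

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=rules)


print(RULES)
```

Both rules are open to all sources, as in the console steps, so the same caution applies outside of testing.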

Setting up the server

  1. Clone the repository and change to the server directory

    git clone <repository_url>
    cd <repository_directory>/src/server
  2. Install Docker

    Run the script below to install Docker

    chmod +x install_docker.sh
    ./install_docker.sh
  3. Build and Deploy the CoAP Server, InfluxDB and Grafana

    docker-compose up -d

    This command builds the CoAP server Docker container and deploys it. It also creates the InfluxDB instances and sets up the Grafana dashboard for visualizing time-series data, all as Docker containers.

Usage: Grafana with InfluxDB

  1. Access Grafana Dashboard

    • Open your web browser and go to http://<public-ip>:3000/.

(back to top)

Architecture

Architecture

Data Processing

Spatial consistency checks

By comparing the readings from three sensors, we identify inconsistencies and outliers that may be missed by a single sensor.

Temporal consistency checks

We use a moving window average and z-score normalization for temporal consistency checks. Together, they can identify and correct a wide range of errors in temporal data.

Sensor layer

MATLAB Simulink model used to simulate SMA (simple moving window averaging)

Architecture

Full pipeline inside the nodes

Architecture

  • Calculate the SMA and standard deviation for the window
  • Remove outliers using a deviation factor of 2.0
  • Use the cleaned window again to calculate a more accurate average
  • Use the corrected window data to discard outliers from future sensor readings
  • Continue for each moving window
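The steps above can be sketched as follows. This is an illustration in Python, not the code that runs on the nodes. It assumes that "deviation factor 2.0" means a value is an outlier when it lies more than 2.0 standard deviations from the window mean, that the population standard deviation is used, and that the window holds 10 readings. None of these details are stated in this README, and the function names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

DEVIATION_FACTOR = 2.0  # from the list above
WINDOW_SIZE = 10        # hypothetical; the window length is not stated


def clean_window(window, factor=DEVIATION_FACTOR):
    """Drop values farther than factor * std from the window mean."""
    avg = mean(window)
    std = pstdev(window)
    if std == 0:
        return list(window)
    return [x for x in window if abs(x - avg) <= factor * std]


def filter_stream(readings, window_size=WINDOW_SIZE, factor=DEVIATION_FACTOR):
    """Yield one corrected average per full window, skipping outlier readings."""
    window = deque(maxlen=window_size)
    for value in readings:
        if len(window) == window_size:
            cleaned = clean_window(window, factor)
            avg, std = mean(cleaned), pstdev(cleaned)
            # Discard a new reading that is an outlier relative to the cleaned window.
            if std > 0 and abs(value - avg) > factor * std:
                continue
        window.append(value)
        if len(window) == window_size:
            yield mean(clean_window(window, factor))


samples = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 19.9, 20.3, 19.7, 20.0, 95.0, 20.1]
averages = list(filter_stream(samples))
print(averages)
```

In this run the reading 95.0 is discarded before it enters the window, so it never affects the averages. One limitation of the sketch is that a genuine, sudden change in temperature would also be rejected until the window statistics widen.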

Cloud and Edge Layer

We have three sensors at almost the same location in the sensor layer. We therefore assume that these three temperature sensors report the same temperature value for that location.

We calculate the mean and standard deviation of the three sensor values, then calculate the z-score of each sensor reading. We use a z-score threshold to choose the good sensor readings and take their average to get the final output. This final output is the data we save in the database.
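A minimal sketch of this fusion step is shown below. The threshold value of 1.2, the use of the population standard deviation, and the function name `fuse_readings` are assumptions made for the example; the README does not state the values the server uses.

```python
from statistics import mean, pstdev

Z_THRESHOLD = 1.2  # hypothetical; the threshold is not stated in this README


def fuse_readings(values, z_threshold=Z_THRESHOLD):
    """Average the readings whose absolute z-score is within z_threshold."""
    avg = mean(values)
    std = pstdev(values)
    if std == 0:
        return avg  # all sensors agree
    good = [v for v in values if abs(v - avg) / std <= z_threshold]
    # Fall back to the plain mean if the threshold rejects every reading.
    return mean(good) if good else avg


fused = fuse_readings([21.0, 21.4, 35.0])
print(f"fused temperature: {fused:.2f}")
```

With only three readings and the population standard deviation, no z-score can exceed √2 (about 1.41) in absolute value, so a threshold at or above that value would never reject anything. This is why the sketch uses a threshold below 1.41.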

Architecture

Documentation

Sensor Layer

docs/Sensor explains our sensor layer

Network Layer

docs/Network explains our network layer

Data Layer

docs/Server explains our data layer

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Dilan Fernando - LinkedIn

Nipun Waas - MyPage

Rukshan Perera - MyPage

(back to top)

Acknowledgments

(back to top)
