
sparkcode's Introduction

RDII Apache Spark

All code was written for processing D3S detector data collected from a mobile sensor network. The network consists of small radiation detectors that easily fit in a pocket. Each detector is paired via Bluetooth to a Samsung Galaxy S6. The smartphone runs an app created by our research group that sends the data to our Amazon Web Services cloud, where it is stored in S3. The header of the data is as follows:

Detector ID | Latitude | Longitude | Sigma (cps) | n (cps) | Time (ns) | channel 0 | channel 1 | ... | channel 4096

  • Detector ID is a unique identification number paired with each detector.
  • Latitude and Longitude are the GPS coordinates of the detector in degrees (accurate to within a few meters outdoors).
  • Sigma is the ionizing radiation count rate, in counts per second, measured by the D3S detector.
  • n is the neutron count rate, in counts per second, measured by the D3S detector.
  • Time is the Unix time in nanoseconds.
  • Channels 0-4096 represent the energy spectrum measured by the detector.
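The record layout above can be sketched as a small parser. This is a minimal illustration, not the project's actual loading code: the `|` delimiter, the `D3SRecord` class, the `parse_record` function, and the assumption that channel values are integer counts are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class D3SRecord:
    """One row of detector data (field names are illustrative)."""
    detector_id: str
    latitude: float      # degrees
    longitude: float     # degrees
    sigma_cps: float     # ionizing radiation counts per second
    neutron_cps: float   # neutron counts per second
    time_ns: int         # Unix time in nanoseconds
    channels: List[int]  # energy spectrum, one count per channel


def parse_record(line: str, delimiter: str = "|") -> D3SRecord:
    """Split one delimited line into a D3SRecord.

    Assumes the delimiter is '|' and that every field after the
    sixth is an integer channel count.
    """
    fields = [f.strip() for f in line.split(delimiter)]
    if len(fields) < 7:
        raise ValueError("expected 6 metadata fields plus at least one channel")
    return D3SRecord(
        detector_id=fields[0],
        latitude=float(fields[1]),
        longitude=float(fields[2]),
        sigma_cps=float(fields[3]),
        neutron_cps=float(fields[4]),
        time_ns=int(fields[5]),
        channels=[int(c) for c in fields[6:]],
    )
```

Dividing `time_ns` by `10**9` gives the Unix time in seconds, which is the unit most date libraries expect.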

Additionally, weather data from wunderground.com was used.

GOAL: Sensor networks are increasingly being used for nuclear nonproliferation purposes in urban environments. The goal of this research is to take the GPS- and time-tagged radiation data and generate heat maps using MapReduce and geostatistical techniques such as Kriging and Inverse Distance Weighting. By accurately characterizing background radiation, we can improve outlier detection and find anomalous radioactive sources in our data set. This project required Elastic MapReduce (EMR) clusters with Apache Spark pre-installed. All scripts were run on subsets of the D3S data set. Some data cleaning and manipulation was required before processing.
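As an illustration of one of the techniques named above, here is a minimal pure-Python sketch of Inverse Distance Weighting. It is not the Spark implementation from this repository: the function names `haversine_m` and `idw_estimate`, the use of great-circle distance, and the default power of 2 are assumptions made for the example.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def idw_estimate(
    samples: Iterable[Tuple[float, float, float]],
    lat: float,
    lon: float,
    power: float = 2.0,
) -> float:
    """Estimate a value at (lat, lon) from (lat, lon, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby
    measurements dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, value in samples:
        d = haversine_m(lat, lon, s_lat, s_lon)
        if d == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total
```

Evaluating `idw_estimate` over a regular latitude/longitude grid, with the samples being (latitude, longitude, counts per second) triples, gives the values for a heat map. A larger `power` makes the map follow individual measurements more closely; a smaller one smooths it.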

TODO:

  1. Describe research project, goals, and accomplishments in README
  2. Add descriptive comment headers to each file...

All Spark code was run in the Amazon EMR environment.

DISCLAIMER: These code examples alternate between using SparkSQL and DataFrame operations (often within the same file). I acknowledge that this might not be the best coding practice; this was done for my own learning purposes.


sparkcode's Issues

Sample data

Thanks for the code. Is it possible to post some sample data so that we can run and test it?
