

Anomaly-Detection-On-Temperature-Device-Failure

A typical anomaly detection task, performed with K-Means, PCA, Gaussian distribution (EllipticEnvelope), and Isolation Forest:

  1. Plotting / visualizing the 'original' dataset (hint: this is a Time Series object)

  2. Performing feature engineering on the dataset to add new features. Specifically, create a feature that indicates both the day of the week and the time of day. Name it 'dtcat' (date-time category); it should have four (4) categories:

  • Weekday Day
  • Weekday Night
  • Weekend Day
  • Weekend Night

    Note: Some features such as ‘dayofweek’, ‘hours’, ‘day’, etc. may remain in the dataset. The durations of ‘Day’ and ‘Night’ are defined as follows:

    Day: 7:00am - 7:00pm
    Night: 7:01pm - 6:59am

    Ultimately, we would like to figure out when (weekday, weekend, day, or night) the device fails!

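The 'dtcat' feature described above could be built roughly as follows. This is a minimal sketch, not the repository's actual code: the column names `timestamp` and `value`, the helper name `add_dtcat`, and the demo rows are hypothetical, and it assumes that "weekend" means Saturday and Sunday and that times are compared at minute granularity.

```python
import numpy as np
import pandas as pd


def add_dtcat(df: pd.DataFrame, ts_col: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df with 'dayofweek', 'hours' and 'dtcat' columns added."""
    out = df.copy()
    ts = pd.to_datetime(out[ts_col])

    out["dayofweek"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    out["hours"] = ts.dt.hour

    # Assumption: the weekend is Saturday (5) and Sunday (6).
    is_weekend = ts.dt.dayofweek >= 5
    # 'Day' runs from 7:00am through 7:00pm inclusive; everything else is 'Night'.
    minutes = ts.dt.hour * 60 + ts.dt.minute
    is_day = (minutes >= 7 * 60) & (minutes <= 19 * 60)

    out["dtcat"] = np.select(
        [~is_weekend & is_day, ~is_weekend & ~is_day, is_weekend & is_day],
        ["Weekday Day", "Weekday Night", "Weekend Day"],
        default="Weekend Night",
    )
    return out


# Hypothetical rows, only to exercise the boundaries of each category.
demo = pd.DataFrame(
    {
        "timestamp": [
            "2013-07-04 07:00:00",  # Thursday morning
            "2013-07-04 19:01:00",  # Thursday evening
            "2013-07-06 19:00:00",  # Saturday, last minute of 'Day'
            "2013-07-07 06:59:00",  # Sunday, last minute of 'Night'
        ],
        "value": [70.0, 71.0, 72.0, 73.0],
    }
)
result = add_dtcat(demo)
```

Classifying by the calendar day of each timestamp means that the early hours of Saturday count as 'Weekend Night', even though they follow a Friday evening. That is one possible reading of the definition, not the only one.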
  3. Applying the K-Means algorithm to the revised dataset and determining the best value of K, testing K in the range [1, 20]. Plotting a graph of the number of clusters (K) against the score of each K-Means model.

  4. After determining the best value of K, plotting all K clusters in a scatter plot using 2 features from the dataset. Should the dataset have more than 2 features (which will most likely be the case), applying PCA to derive those 2 features (2 principal components): pca = PCA(n_components=2), then fit pca to the dataset.
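The K search and the PCA projection could look something like the sketch below. It assumes scikit-learn, takes "score" to mean `KMeans.score` (the negative inertia), and clusters on the full feature set before projecting to two components for plotting; the source does not say which order was used. The helper names, the synthetic matrix `X_demo`, and the choice `k=3` are placeholders, not results.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def kmeans_scores(X, k_values=range(1, 21), random_state=0):
    """Fit one K-Means model per K and return {K: score}.

    KMeans.score returns the negative inertia, so values closer to 0 are better.
    """
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        scores[k] = model.score(X)
    return scores


def cluster_and_project(X, k, random_state=0):
    """Cluster X into k groups, then project X onto 2 principal components."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
    pca = PCA(n_components=2)
    components = pca.fit_transform(X)
    return components, labels


# Synthetic stand-in for the engineered feature matrix (not real device data).
rng = np.random.default_rng(0)
X_demo = StandardScaler().fit_transform(rng.normal(size=(200, 4)))

scores = kmeans_scores(X_demo)  # K = 1 ... 20
components, labels = cluster_and_project(X_demo, k=3)  # k=3 is an arbitrary placeholder
```

Plotting `scores` against K and looking for the point where the curve flattens (the "elbow") is one common way to pick K. The scatter plot then uses `components[:, 0]` and `components[:, 1]` as the axes, coloured by `labels`.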

All of the following steps were executed twice:

  • with outliers_fraction = 0.01 (1%), assuming that this figure was given to us
  • with outliers_fraction calculated from the total number of outliers found by the IQR method, specifically the 1.5×IQR rule.
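For the second variant, the 1.5×IQR rule can be turned into a fraction along these lines. This is a sketch under assumptions: the function name and the sample readings are hypothetical, and quartiles use NumPy's default linear interpolation.

```python
import numpy as np


def iqr_outliers_fraction(values, k=1.5):
    """Fraction of values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    is_outlier = (values < lower) | (values > upper)
    return float(is_outlier.mean())


# Hypothetical readings: nine ordinary values and one extreme one.
readings = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
outliers_fraction = iqr_outliers_fraction(readings)
```

Both `EllipticEnvelope` and `IsolationForest` expect a `contamination` value greater than 0 and at most 0.5, so a computed fraction of exactly 0 would need special handling before being passed on.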
  5. Applying the Gaussian distribution (EllipticEnvelope) algorithm to each category, as defined in step 2. Listing anomalies (if any) in each category and showing them graphically.

  6. Applying the Isolation Forest algorithm to each category, as defined in step 2.
    Listing anomalies (if any) in each category and showing them graphically.
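Both detectors can be applied per category with the same helper, as in the sketch below. It assumes scikit-learn, a single numeric column named `value`, and a `dtcat` column holding the four categories. The helper name, the column names, and the synthetic data (including the injected extreme reading) are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest


def flag_anomalies_by_category(df, make_model, value_col="value", cat_col="dtcat"):
    """Fit a fresh detector per category; return a boolean Series (True = anomaly)."""
    flags = pd.Series(False, index=df.index)
    for _, group in df.groupby(cat_col):
        X = group[[value_col]].to_numpy()
        predictions = make_model().fit_predict(X)  # -1 = outlier, 1 = inlier
        flags.loc[group.index] = predictions == -1
    return flags


outliers_fraction = 0.01  # the "given" figure; the IQR-based value can be used instead

# Synthetic data: 100 readings per category, plus one injected extreme value.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "value": rng.normal(70.0, 2.0, size=400),
        "dtcat": np.repeat(
            ["Weekday Day", "Weekday Night", "Weekend Day", "Weekend Night"], 100
        ),
    }
)
demo.loc[0, "value"] = 120.0  # injected anomaly in 'Weekday Day'

gaussian_flags = flag_anomalies_by_category(
    demo, lambda: EllipticEnvelope(contamination=outliers_fraction, random_state=0)
)
forest_flags = flag_anomalies_by_category(
    demo, lambda: IsolationForest(contamination=outliers_fraction, random_state=0)
)

# Rows flagged by each detector, ready to be listed or highlighted in a plot.
gaussian_anomalies = demo[gaussian_flags]
forest_anomalies = demo[forest_flags]
```

Passing a factory (`make_model`) rather than a fitted estimator keeps each category's model independent, so the contamination threshold is applied within each category rather than across the whole dataset.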
