monitoring's Introduction

Solution writeup to the comma.ai Driver Monitoring Challenge

Input
  • 4 × 60 s videos at 20 Hz
Output
  • Annotated face tracking video
  • Head pose feature vector

Dependencies

  • numpy
  • sklearn
  • skimage
  • cv2

Layout

  1. Core

    • Frame preprocessing
    • Facial detection
    • Facial landmark identification
    • Geometric orientation
    • Rendering
    • Main
  2. Support

    • Configuration
    • SVM preprocessing
    • SVM training
    • File utilities
    • Numpy utilities
    • Image utilities
  3. Data

    • Trained SVM
    • Input
      • Video files
    • Intermediate
      • Preprocessing (optional)
      • imagesToFaces
      • videosToFrames
    • Output
      • Annotated video
      • Head pose estimation feature vectors
    • Haar cascade classifier
  4. Spike

    • Random excursions
  5. Tests

    • TBD

Method

video -> video preprocess + dataset preprocess -> train svm -> face detection -> retrain svm -> face detection -> find landmarks -> calculate geometry -> render

Pipeline

read frame -> frame preprocess -> face detection -> find landmarks -> calculate geometry -> render

SVM Preprocessing

  1. HEVC video dataset -> frames
  2. Yale faces dataset -> cropped

SVM Training

  1. Cropped yale faces -> positive samples
  2. 256 object categories dataset -> negative samples
  3. Annotate samples
  4. Train linear SVM
  5. Save SVM model
  6. (After sliding window): Retrain SVM with hard-negative mining
  7. Save new SVM model
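The train, mine, retrain loop above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes scikit-learn's `LinearSVC` as the linear SVM, the function name `train_with_hard_negatives` is hypothetical, and random vectors stand in for the HOG descriptors. The negative pool is deliberately placed closer to the positives so that the first model produces some false positives to mine.

```python
import numpy as np
from sklearn.svm import LinearSVC


def train_with_hard_negatives(pos, neg, neg_pool, C=1.0):
    """Train a linear SVM, then retrain with false positives from neg_pool."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos), dtype=int),
                        np.zeros(len(neg), dtype=int)])
    clf = LinearSVC(C=C, max_iter=10000).fit(X, y)

    # Hard negatives: face-free samples the first model labels as faces.
    hard = neg_pool[clf.predict(neg_pool) == 1]
    if len(hard):
        X = np.vstack([X, hard])
        y = np.concatenate([y, np.zeros(len(hard), dtype=int)])
        clf = LinearSVC(C=C, max_iter=10000).fit(X, y)
    return clf, len(hard)


# Synthetic stand-ins for HOG descriptors (assumption: not the real datasets).
rng = np.random.default_rng(0)
pos = rng.normal(loc=1.0, size=(165, 16))        # "face" descriptors
neg = rng.normal(loc=-1.0, size=(500, 16))       # easy negatives
neg_pool = rng.normal(loc=0.0, size=(2000, 16))  # harder, face-free windows
model, n_hard = train_with_hard_negatives(pos, neg, neg_pool)
```

In the real pipeline the pool would come from sliding-window detections on images known to contain no faces, and the retrained model would be saved as the new SVM.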

Face Detection

  1. Sliding window over image pyramid
  2. Non-maximum suppression
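As a rough sketch of step 1, the generator below enumerates candidate windows across pyramid levels and reports them in original-image coordinates. The window size, step, and downscale factor are illustrative assumptions, and both function names are hypothetical. Only the geometry is shown; cropping each window and scoring it with the SVM is left out.

```python
def pyramid_scales(width, height, window=(64, 64), downscale=1.25):
    """Yield (scale, level_width, level_height) while the window still fits."""
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if w < window[0] or h < window[1]:
            break
        yield scale, w, h
        scale *= downscale


def sliding_windows(width, height, window=(64, 64), step=16, downscale=1.25):
    """Yield candidate boxes (x1, y1, x2, y2) in original-image coordinates."""
    win_w, win_h = window
    for scale, w, h in pyramid_scales(width, height, window, downscale):
        for y in range(0, h - win_h + 1, step):
            for x in range(0, w - win_w + 1, step):
                # Map the window on this level back to the full-size image.
                yield (int(x * scale), int(y * scale),
                       int((x + win_w) * scale), int((y + win_h) * scale))


boxes = list(sliding_windows(128, 128, window=(64, 64), step=32, downscale=2.0))
```

A fixed-size window on a shrinking image is equivalent to a growing window on the original, which is why one trained window size can detect faces at several scales.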

Face Alignment and Head Pose

  1. Facial landmark alignment
  2. 2D-3D point mapping
  3. Compute head orientation
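For step 3, one common way to express head orientation is as pitch, yaw, and roll angles recovered from a 3×3 rotation matrix. The sketch below assumes the rotation matrix has already been obtained from the 2D-3D point mapping (for example from a PnP solver), and assumes the convention R = Rz(roll) · Ry(yaw) · Rx(pitch). The source names neither the solver nor the angle convention, so treat both as assumptions; the function names are hypothetical.

```python
import math


def rotation_from_euler(pitch, yaw, roll):
    """Build R = Rz(roll) @ Ry(yaw) @ Rx(pitch); angles in radians."""
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def euler_from_rotation(R):
    """Recover (pitch, yaw, roll) in radians; valid away from yaw = +/-90 deg."""
    yaw = math.atan2(-R[2][0], math.hypot(R[0][0], R[1][0]))
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return pitch, yaw, roll


# Round trip: angles -> matrix -> angles.
angles = (0.1, -0.4, 0.25)
recovered = euler_from_rotation(rotation_from_euler(*angles))
```

The three recovered angles are one plausible form for the head pose feature vector listed under Output.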

Render Tracking and Pose

Future:

  1. Pupil detection
    • CDF
    • Feature Extraction and Normalization
  2. Gaze Classification and Decision Pruning

Method:

Using the comma.ai dataset: read each HEVC video and extract its frames (60 s at 20 Hz, ~1200 frames per video).

Using the Yale faces dataset: convert the images to grayscale JPG, crop the faces using the built-in Haar cascades, resize them to a uniform size, and write them to disk. Then generate ~165 positive samples for the SVM using the skimage HOG descriptor.

Using the 256_object_categories dataset: generate ~30600 negative samples for the SVM using the skimage HOG descriptor, working in batches of 1000 and saving each batch to disk.
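The batching mentioned above might look like the following sketch. The helper name `iter_batches` is hypothetical; computing the HOG descriptors and writing each batch to disk are left as comments because the source does not say how either is done.

```python
def iter_batches(items, batch_size=1000):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder for the ~30600 image paths of the negative dataset.
image_ids = list(range(30600))
batches = list(iter_batches(image_ids))
# For each batch: compute HOG descriptors, then save that batch to disk,
# so that only one batch of descriptors is held in memory at a time.
```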

Arrange the data and add labels, train the SVM on the positive and negative samples, and save the trained SVM.

Run a sliding window over an image pyramid, apply non-maximum suppression, mine hard negatives, and retrain the SVM.
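The non-maximum suppression step can be sketched as a greedy filter: sort detections by score, then keep a box only if it does not overlap too much with a box already kept. The 0.3 overlap threshold and the function names are assumptions for illustration; the source does not state which variant or threshold is used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it overlaps little with every box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first heavily and is dropped, while the third is far away and survives.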

Find the face, find the eyes, compute the geometric transformation for the facial plane, and generate the feature vector.

monitoring's People

Contributors

acarcher, geohot
