
Real Time Driver State Detection

Python OpenCV

Real-time, webcam-based driver attention state detection and monitoring using Python with the OpenCV and mediapipe libraries.

driver state detection demo

Note: This work is partially based on this paper for the scores and methods used.

Mediapipe Update

Thanks to the awesome contribution of MustafaLotfi, the script now uses the better-performing and more accurate face keypoint detection model from the Google Mediapipe library.

Features added:

  • 478 face keypoints detection
  • Direct iris keypoint detection for gaze score estimation
  • Improved head pose estimation using the dynamic canonical face model
  • Fixed the Euler angles function and its wrong return values
  • Using time variables to make the code more modular and machine-agnostic

NOTE: the old dlib-based version can still be found in the "dlib-based" repository branch.

How Does It Work?

This script searches for the driver's face, then uses the mediapipe library to predict 478 face and iris keypoints. The enumeration and location of all the face keypoints/landmarks can be seen here.

With those keypoints, the following scores are computed:

  • EAR: Eye Aspect Ratio, the normalized average eye aperture, used to measure how open or closed the eyes are
  • Gaze Score: the L2 norm (Euclidean distance) between the center of the eye and the pupil, used to detect whether the driver is looking away
  • Head Pose: roll, pitch and yaw of the driver's head. The angles are used to detect whether the driver is not looking straight ahead or does not hold a straight head pose (and is probably unconscious)
  • PERCLOS: PERcentage of eye CLOSure time, used to measure how long the eyes stay closed over a minute. A threshold of 0.2 (20% of a minute) is used, and the EAR score is used to estimate when the eyes are closed.
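A minimal sketch of a PERCLOS computation over a rolling window of per-frame EAR values (the function name, window length and closure threshold here are illustrative assumptions, not the project's actual code):

```python
def perclos(ear_values, closure_threshold=0.15, window_size=60 * 30):
    """Fraction of frames in the window where the eyes are closed.

    `ear_values`: per-frame EAR scores for the last minute
    (e.g. 60 s * 30 fps frames); `closure_threshold` is the EAR
    value below which the eyes are considered closed.
    """
    window = list(ear_values)[-window_size:]
    closed = sum(1 for ear in window if ear < closure_threshold)
    return closed / len(window)

# A driver whose eyes were closed for 30% of the window
# exceeds the 0.2 PERCLOS threshold used in this project.
ears = [0.05] * 30 + [0.3] * 70   # 30 closed frames, 70 open frames
print(perclos(ears))              # 0.3 -> above the 0.2 "tired" threshold
```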

The driver states can be classified as:

  • Normal: no messages are printed
  • Tired: when the PERCLOS score is > 0.2, a warning message is printed on screen
  • Asleep: when the eyes are closed (EAR < closure_threshold) for a certain amount of time, a warning message is printed on screen
  • Looking Away: when the gaze score is higher than a certain threshold for a certain amount of time, a warning message is printed on screen
  • Distracted: when the head pose score is higher than a certain threshold for a certain amount of time, a warning message is printed on screen
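The classification above can be sketched as a simple rule-based function (threshold defaults and warning strings here are illustrative assumptions, not the script's real values):

```python
def classify_state(perclos, eyes_closed_s, gaze_away_s, pose_off_s,
                   perclos_thresh=0.2, ear_time_thresh=4.0,
                   gaze_time_thresh=4.0, pose_time_thresh=4.0):
    """Map the current scores to a list of on-screen warnings.

    The `*_s` arguments are seconds the respective condition has
    persisted; all threshold defaults are illustrative.
    """
    warnings = []
    if eyes_closed_s >= ear_time_thresh:
        warnings.append("ASLEEP!")       # eyes closed for too long
    if perclos > perclos_thresh:
        warnings.append("TIRED!")        # eyes closed too often
    if gaze_away_s >= gaze_time_thresh:
        warnings.append("LOOKING AWAY!")
    if pose_off_s >= pose_time_thresh:
        warnings.append("DISTRACTED!")
    return warnings or ["NORMAL"]

print(classify_state(0.1, 0, 0, 0))  # ['NORMAL']
print(classify_state(0.3, 5, 0, 0))  # ['ASLEEP!', 'TIRED!']
```

Note that the states are not mutually exclusive: a driver can be both tired (high PERCLOS) and asleep (eyes currently closed) at once.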

Demo

MEDIAPIPE DEMO COMING SOON

OLD DEMO:

demo.mp4

The Scores Explained

EAR

Eye Aspect Ratio is a normalized score that describes the degree of eye aperture. Using the mediapipe face mesh keypoints for each eye (six per eye), the eye length and width are estimated, and from these the EAR score is computed as explained in the image below: EAR

NOTE: the average of the two eyes EAR score is computed
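A sketch of the per-eye EAR computation from six 2D landmarks, following the classic EAR formulation (the landmark ordering below is an assumption for illustration; the actual mediapipe mesh indices differ):

```python
import numpy as np

def eye_aspect_ratio(pts):
    """EAR for one eye from six (x, y) landmarks, ordered so that
    p1, p4 are the horizontal eye corners and (p2, p6), (p3, p5)
    are vertical pairs on the upper/lower eyelids."""
    p1, p2, p3, p4, p5, p6 = np.asarray(pts, dtype=float)
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

# Synthetic landmarks: a wide-open eye vs. a nearly closed one.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
print(eye_aspect_ratio(open_eye))    # ~0.67 (open)
print(eye_aspect_ratio(closed_eye))  # ~0.07 (nearly closed)
```

The final score used by the detector would then be the mean of the left- and right-eye values, as the note above says.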

Gaze Score Estimation

The gaze score gives information about how much the driver is looking away without turning their head.

To measure this, the distance between the eye center and the position of the pupil is computed. The result is then normalized by the eye width, which can vary depending on the driver's physiognomy and distance from the camera.

The image below explains graphically how the Gaze Score for a single eye is computed: Gaze Score

NOTE: the average of the two eyes' Gaze Scores is computed
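A minimal sketch of the per-eye gaze score under these definitions, approximating the eye center as the midpoint of the two eye corners (an assumption for illustration, not the project's exact geometry):

```python
import numpy as np

def gaze_score(eye_corner_left, eye_corner_right, pupil):
    """L2 distance between the eye center and the pupil,
    normalized by the eye width."""
    left = np.asarray(eye_corner_left, dtype=float)
    right = np.asarray(eye_corner_right, dtype=float)
    pupil = np.asarray(pupil, dtype=float)
    eye_center = (left + right) / 2.0
    eye_width = np.linalg.norm(right - left)
    return np.linalg.norm(pupil - eye_center) / eye_width

# Pupil at the eye center -> score 0; pupil at a corner -> score 0.5.
print(gaze_score((0, 0), (4, 0), (2, 0)))  # 0.0
print(gaze_score((0, 0), (4, 0), (4, 0)))  # 0.5
```

Because of the normalization, the score stays comparable across drivers sitting at different distances from the camera.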

Head Pose Estimation

For the head pose estimation, a standard 3D head model in world coordinates is considered, in combination with the corresponding face mesh keypoints in the image plane. In this way, using the solvePnP function of OpenCV, the rotation and translation vectors of the head with respect to the camera can be estimated. The three Euler angles are then computed from the rotation.

The partial snippets of code used for this task can be found in this article.
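solvePnP returns a rotation vector, which can be converted to a 3x3 rotation matrix (e.g. with cv2.Rodrigues) and then decomposed into Euler angles. A numpy-only sketch of that last step, assuming an x-y-z (Tait-Bryan) convention; other conventions swap axes and signs, so this must be matched to the camera and model frames actually used:

```python
import numpy as np

def rotation_matrix_to_euler(R):
    """Extract (pitch, yaw, roll) in degrees from a 3x3 rotation
    matrix under an x-y-z Tait-Bryan convention."""
    sy = np.hypot(R[0, 0], R[1, 0])
    if sy > 1e-6:                       # regular case
        pitch = np.arctan2(R[2, 1], R[2, 2])
        yaw = np.arctan2(-R[2, 0], sy)
        roll = np.arctan2(R[1, 0], R[0, 0])
    else:                               # gimbal lock: pitch near +/-90 deg
        pitch = np.arctan2(-R[1, 2], R[1, 1])
        yaw = np.arctan2(-R[2, 0], sy)
        roll = 0.0
    return np.degrees([pitch, yaw, roll])

# Identity rotation: head perfectly aligned with the camera.
print(rotation_matrix_to_euler(np.eye(3)))  # [0. 0. 0.]
```

Thresholding the absolute values of these angles over time is what drives the "distracted" state described above.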

Installation

This project runs on Python 3.9 with the following libraries:

  • numpy
  • OpenCV (opencv-python)
  • mediapipe

You can use the requirements.txt file provided in the repository using:

pip install -r requirements.txt

Or you can execute the following pip commands on terminal:

pip install numpy
pip install opencv-python
pip install mediapipe

Usage

First, navigate inside the driver_state_detection folder:

cd driver_state_detection

The script can be run with all default options and parameters by calling it via the command line:

python main.py

For the list of possible arguments, write:

python main.py --help

Example of a possible use with parameters:

python main.py --ear_time_tresh 5

This sets the eye closure time threshold to 5 seconds before a warning message is shown on screen.

Why this project

This project was developed as part of a final group project for the Computer Vision and Cognitive Systems course at the University of Modena and Reggio Emilia, in the second semester of the academic year 2020/2021. Given the possible applications of Computer Vision, we wanted to focus mainly on the automotive field, developing a useful and potentially life-saving proof of concept. Sadly, many fatal accidents happen because of driver distraction.

License and Contacts

This project is freely available under the MIT license. You can use/modify this code as long as you include the original license present in this repository in it.

For any question or if you want to contribute to this project, feel free to contact me or open a pull request.

Improvements to make

  • Reformat code in packages
  • Add argparser to run the script with various settings using the command line
  • Improve robustness of gaze detection (using mediapipe)
  • Add argparser option for importing and using the camera matrix and dist. coefficients
  • Reformat classes to follow design patterns and Python conventions
  • Improve performance of the script by minimizing image processing steps

Contributors

  • e-candeloro
  • MustafaLotfi


