Giter VIP home page Giter VIP logo

sbohra-16623-project's Introduction

sbohra-16623-project

Deep Face Verification on iOS - Sudev Bohra

Commercial use not allowed.

Introduction:

For my project I use a deep network to run facial verification/recognition on live video on an iOS device.

IIn the last few years, Convolution Neural Network’s (CNN) have proven to be great at various computer vision tasks like object recognition, pose estimation and face recognition. In this paper, I use a light CNN to implement face recognition on an iOS device.

Systems like DeepFace [2] and FaceNet [3] have beat human-level performance accuracies of over 98-99% on the LFW face recognition. However, these networks are very deep and contain many parameters. DeepFace contains 120 million parmeters and 9 layers. FaceNet contains between 8-20 million parameters and between 9 and 22 layers.

Wu et al.[1] propose a light CNN framework for deep face representation from noisy labels. They define a Max-Feature-Map activation function to replace the, usual ReLU to suppress low-activations. The Max-Feature-Map helps to separate out noisy labels, allowing you to use large automatically collected datasets for training. The CNN This light CNN outputs a 256D CNN based feature vector. Cosine-similarity between these 256D vectors can be used to detect similarity between faces. This representation is a lot more compact compared to systems like DeepFace that extracted 4096D feature vectors. Wu et al.’s CNN achieves, state of the art results of the LFW benchmark of a 98.80% accuracy.

Due to Wu et al. CNN’s light nature and compact face representation, this paper is a great candidate for mobile implementation. I want to see how fast you can run this facial recognition neural network on a mobile device.

Background:

I use Wu et al.’s CNN framework and the pre-trained model from his paper for this iOS implementation since I wanted to focus on the mobile implementation and not on training the model (which is usually done on a GPU server).

I use Google TensorFlow (TF) as my neural network framework of choice. Since TF is available on all platforms, the models are portable across desktop and mobile. TF represents neural networks as a graph of computations. The model can be written in Python and then the graphs can be run on either C++ or Python. The iOS implementation of TF uses Eigen under the hood, which supports explicit vectorization for ARM NEON.

I also explore using Apple’s Metal framework for GPU neural network computations. Although I ran into some issues due to floating point conversions that didn’t allow me to demo the model. I have included the speed of the Metal implementation in this paper as anecdotal evidence.

Initially, I used OpenCV and dlib for facial landmark alignment, which is a necessary step to crop and align faces before passing it to the CNN. Dlib face alignment is based on an ensemble of regression trees and is decently fast. In the end, I replaced this with iOS CoreImage since it runs face detection and 3 point landmark detection very quickly.

I was able to run the network at 5-6 FPS on iOS TensorFlow framework (CPU) and was able to run the network at 10 FPS on Apple Metal framework (GPU).

Schedule:

✔ Get a Neural Network framework running on iOS. Find dataset and sanitize it.

✔ Build the neural network necessary to extract deep face features based on research paper and train it either on a GPU server

✔ Run the neural network on the iOS device and begin running it on camera. Wire up camera to neural network, detecting face, cropping face and passing to the network and converting response to visual overlay on camera view.

✔ Tweak and build a demo project that uses the deep face features on mobile for face verification/recognition.

Recently, there has been a lot of breakthrough in the use of deep networks in face recognition and face verification tasks. For my project I use a deep network to run facial recognition on live video. I use techniques from the lightened CNN paper and tensorflow framework, along with CoreImage Face detection to run face detection live.

I built a lightened CNN with Max-Feature-Map activations for the face detection in tensorflow.

Other github projects referenced:

VGGNet on Metal iOS: https://github.com/hollance/VGGNet-Metal/tree/master/VGGNet-iOS/VGGNet

Caffe-to-Tensorflow model conversion: https://github.com/ethereon/caffe-tensorflow

Tensorflow iOS examples: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ios_examples

These are research projects that I heavily referenced: Research Papers: [1403.2802] Learning Deep Face Representation - arXiv.org https://arxiv.org › cs by H Fan - ‎2014 - ‎Cited by 47 - ‎Related articles

A Lightened CNN for Deep Face Representation https://arxiv.org › cs by X Wu - ‎2015 - ‎Cited by 12 - ‎Related articles

sbohra-16623-project's People

Contributors

sudevbohra avatar

Stargazers

 avatar Nicolas Innocenti avatar CapabilityLimited avatar NSLegend avatar  avatar wang zhuo avatar MadThanos avatar Fengshi avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.