
deepvog's Introduction

DeepVOG

DeepVOG is a framework for pupil segmentation and gaze estimation based on a fully convolutional neural network. Currently it is available for offline gaze estimation of eye-tracking video clips.

Citation

DeepVOG has been peer-reviewed and accepted as an original article in the Journal of Neuroscience Methods (Elsevier). The manuscript is open access and can be downloaded free of charge here. If you use DeepVOG or some part of the code, please cite (see bibtex):

Yiu YH, Aboulatta M, Raiser T, Ophey L, Flanagin VL, zu Eulenburg P, Ahmadi SA. DeepVOG: Open-source Pupil Segmentation and Gaze Estimation in Neuroscience using Deep Learning. Journal of Neuroscience Methods, vol. 324, 2019. DOI: https://doi.org/10.1016/j.jneumeth.2019.05.016

Release Notes

DeepVOG v1.1.4 (Date: 31-07-2019, latest)

Improvements:

  1. Added --skip_existed flag for skipping an operation in --table mode if the output file already exists.
  2. Added --skip_errors flag for skipping an operation in --table mode and continuing with the next video if an error is encountered.
  3. Added --log_errors flag for logging errors and tracebacks to a file in --table mode when an error is encountered.
  4. Added --no_gaze flag for running only pupil segmentation in --infer mode.
  5. Added one more column (with_gaze) to the input csv file for --table mode; it enables/disables gaze estimation per video.

For details of command line arguments, see doc/documentation.md
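
For example, a hedged sketch of a batch run that skips videos whose outputs already exist and continues past failing videos (the exact argument forms, e.g. whether --log_errors expects a log-file path, are defined in doc/documentation.md):

$ python -m deepvog --table /PATH/list_of_operations.csv --skip_existed --skip_errors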

Removed:

  1. Text-based User Interface (TUI) is removed.

For the release history, see RELEASE.md. You can update an existing installation by copying the source code deepvog/ directly into the directory of your installed DeepVOG module.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

To run DeepVOG, you need to have a Python distribution (we recommend Anaconda) and the following Python packages:

numpy
scikit-video
scikit-image
tensorflow-gpu
keras

As an alternative, you can use our docker image, which already includes all the dependencies. The only requirements are an installed NVIDIA driver and nvidia-docker (or the nvidia runtime of docker).

Installation of Package

A step-by-step series of examples that shows you how to get DeepVOG running.

  1. Installing from package
$ git clone https://github.com/pydsgz/DeepVOG
 (or you can download the files in this repo with your browser)

Move to the directory of DeepVOG that you just cloned/downloaded, and type

$ python setup.py install

If some of the dependencies listed above are missing, you can install them with pip:

$ pip install numpy
$ pip install scikit-video
$ ...
  2. It is highly recommended to run our program in Docker. You can pull our docker image directly from Docker Hub. (For tutorials on docker, see docker and nvidia-docker.)
$ docker run --runtime=nvidia -it --rm yyhhoi/deepvog:v1.1.4 bash
or
$ nvidia-docker run -it --rm yyhhoi/deepvog:v1.1.4 bash

Removal of Package

Removal can be done by simply deleting the python package, for example:

$ rm -r /usr/local/lib/python3.5/dist-packages/deepvog-1.1.2-py3.5.egg

The exact path will depend on where you store your installed python package, and the version of deepvog and python.

Usage (Command-line interface)

The CLI allows you to fit/infer on a single video, or on multiple videos by importing a csv table. The commands can be simply called by:

$ python -m deepvog --fit /PATH/video_fit.mp4 /PATH/eyeball_model.json
$ python -m deepvog --infer /PATH/video_infer.mp4 /PATH/eyeball_model.json /PATH/results.csv
$ python -m deepvog --table /PATH/list_of_operations.csv

DeepVOG first fits a 3D eyeball model from a video clip. Based on the eyeball model, it estimates the gaze direction in any other video, provided the relative position of the eye with respect to the camera remains the same. It is perfectly fine to fit an eyeball model and infer the gaze directions from the same video clip. However, for clinical use, some users may want a more accurate estimate by recording a separate fitting clip in which the subjects perform a calibration paradigm.

In addition, you will need to specify your camera parameters, such as the focal length, if they differ from the default values.

$ python -m deepvog --fit /PATH/video_fit.mp4 /PATH/eyeball_model.json --flen 12 --vid-shape 240,320 --sensor 3.6,4.8 --batchsize 32 --gpu 0

Please refer to doc/documentation.md for the meaning of arguments and input/output formats. Alternatively, you can also type $ python -m deepvog -h for usage examples.

Usage (As a python module)

For more flexibility, you may import the module directly in python.

import deepvog

# Load our pre-trained network
model = deepvog.load_DeepVOG()

# Initialize the inferer. It requires your camera's focal length, video shape and sensor size, which should be available in the product manual.
inferer = deepvog.gaze_inferer(model, focal_length, video_shape, sensor_size) 

# Fit an eyeball model from "demo.mp4". The model will be stored as the "inferer" instance's attribute.
inferer.process("demo.mp4", mode="Fit")

# After fitting, infer gaze from "demo.mp4" and output the results into "demo_result.csv"
inferer.process("demo.mp4", mode="Infer", output_record_path="demo_results.csv")

# Optional

# You may also save the eyeball model to "demo_model.json" for subsequent gaze inference
inferer.save_eyeball_model("demo_model.json") 

# By loading the eyeball model, you don't need to fit the model again
inferer.load_eyeball_model("demo_model.json") 
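
Putting it together, here is a minimal, self-contained sketch that plugs in the camera values used in the CLI example above (12 mm focal length, 240x320 px frames, a 1/3-inch sensor of 3.6x4.8 mm); substitute the values of your own camera:

import deepvog

# Camera parameters (example values taken from the CLI example above)
focal_length = 12           # in mm
video_shape = (240, 320)    # (height, width) in pixels
sensor_size = (3.6, 4.8)    # (height, width) in mm, typical 1/3-inch CMOS sensor

model = deepvog.load_DeepVOG()
inferer = deepvog.gaze_inferer(model, focal_length, video_shape, sensor_size)

# Fit the 3D eyeball model on the demo clip, then infer gaze on the same clip
inferer.process("demo.mp4", mode="Fit")
inferer.process("demo.mp4", mode="Infer", output_record_path="demo_results.csv")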

Demo

The demo video is located in the demo directory. After installing DeepVOG, you can move to that directory and run the following commands:

$ python -m deepvog --fit ./demo.mp4 ./demo_eyeball_model.json -v ./demo_visualization_fitting.mp4 -m -b 256
$ python -m deepvog --infer ./demo.mp4 ./demo_eyeball_model.json ./demo_gaze_results.csv -b 32 -v ./demo_visualization_inference.mp4 -m

The -v argument writes a visualization of the fitted ellipse and gaze vector to the designated video. The -m argument draws the segmented pupil heatmap side by side. The -b argument controls the batch size. For more details of the arguments, see doc/documentation.md.

In the results, you should be able to see the visualization in the generated video "demo_visualization_inference.mp4", as shown below.

In addition, you can also test the --table mode by:

$ python -m deepvog --table demo_table_mode.csv

Limitations

DeepVOG is intended for pupil segmentation and gaze estimation under the assumptions below:

  1. The video contains only the features of a single eye (pupil, iris, eyebrows, eyelashes, eyelids, etc.), as in the demo video. Videos with other facial or body features may compromise its accuracy.
  2. DeepVOG is intended for eye videos recorded by a head-mounted camera. Hence, it assumes a fixed relative position of the eye with respect to the camera.

For more detailed discussion of the underlying assumptions of DeepVOG, please refer to the paper.

Annotation tools

See annotation_tool/README.md.

Authors

  • Yiu Yuk Hoi - Implementation and validation
  • Seyed-Ahmad Ahmadi - Research study concept
  • Moustafa Aboulatta - Initial work

Links to other related papers

License

This project is licensed under the GNU General Public License v3.0 (GNU GPLv3) - see the LICENSE file for details.

Acknowledgments

We thank our fellow researchers at the German Center for Vertigo and Balance Disorders for help in acquiring data for training and validation of pupil segmentation and gaze estimation. In particular, we would like to thank Theresa Raiser, Dr. Virginia Flanagin and Prof. Dr. Peter zu Eulenburg.

DeepVOG was created with support from the German Federal Ministry of Education and Research (BMBF) in connection with the foundation of the German Center for Vertigo and Balance Disorders (DSGZ) (grant number 01 EO 0901), and a stipend of the Graduate School of Systemic Neurosciences (DFG-GSC 82/3).

deepvog's People

Contributors

cancan101, mangotee, pydsgz, yyhhoi

deepvog's Issues

import deepvog

import deepvog is not working: .model.DeepVOG_model cannot be found (running on Google Colab).

Enforce Aspect Ratio

I assume that the aspect ratio here should be the same:

ori_video_shape (tuple or list or np.ndarray): Original video shape from your camera, (height, width) in pixel. If you cropped the video before, use the "original" shape but not the cropped shape
sensor_size (tuple or list or np.ndarray): Sensor size of your camera, (height, width) in mm. For 1/3 inch CMOS sensor, it should be (3.6, 4.8). Further reference can be found in https://en.wikipedia.org/wiki/Image_sensor_format and you can look up in your camera product menu

i.e. ori_video_shape[0] / ori_video_shape[1] ~= sensor_size[0] / sensor_size[1] (within floating-point tolerance)?

If that is the case, then the inputs can be checked with an assertion.
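
A minimal sketch of such a check (a hypothetical helper, not part of the DeepVOG API):

import numpy as np

def check_aspect_ratio(ori_video_shape, sensor_size, rel_tol=1e-2):
    # Both tuples are (height, width); their aspect ratios should agree.
    video_ar = ori_video_shape[0] / ori_video_shape[1]
    sensor_ar = sensor_size[0] / sensor_size[1]
    msg = f"Aspect ratio mismatch: video {video_ar:.4f} vs sensor {sensor_ar:.4f}"
    assert np.isclose(video_ar, sensor_ar, rtol=rel_tol), msg

check_aspect_ratio((240, 320), (3.6, 4.8))  # both 0.75, so the assertion passes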

Training parameters of the model

Hi,

Really impressive work, and many thanks for making it open-source. I was trying to replicate your model by re-training on another dataset, but I never reached results comparable to your published pre-trained weights. While I understand that you might not be able to share your training data, could you please reveal some of the hyperparameters you used for training, e.g. learning rate, optimizer, batch size, epochs, regularization, etc. (augmentation was kindly described in the paper)?

Thank you

deepVOG

Hi there,
I need the dataset from that project to use in my own project.

Request for dataset availability

Hey,

Good work. We are working along similar lines, but with really low-dimensional data; we are currently getting our data from a webcam.

Could you please share dataset links as well? We and the community can greatly benefit from them.

tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

Hi,

I'm trying to run the demo as instructed and I get the following:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

I'm using Windows 7 (with an i7 processor and without a GPU)... not sure if that is related.

Thanks in advance!

wrong unit?

I just downloaded the code and tested the fit on the demo video included in the repo and I get the following eye model
{ "eye_centre": [ [ -189.91085622309689 ], [ 129.4567167957034 ], [ 3333.3333333333335 ] ], "aver_eye_radius": 1286.1201062769812 }
Shouldn't the units be in mm? Do these numbers make sense? Also I was expecting the algorithm to estimate the depth (z) of the eye center but it always reports the given initial z value. Is that correct?
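
A hedged reading of those numbers, assuming the fitted model is stored in pixel units and using the mm2px scaling from deepvog/inferer.py (quoted in the "the initialization parameters" issue further below):

import numpy as np

# Scaling used in deepvog/inferer.py to convert mm into pixels
ori_video_shape = np.array([240, 320])  # demo video, (height, width) in px
sensor_size = np.array([3.6, 4.8])      # 1/3-inch sensor, (height, width) in mm
mm2px = np.linalg.norm(ori_video_shape) / np.linalg.norm(sensor_size)  # ~66.7 px/mm

print(3333.3333333333335 / mm2px)  # eye-centre z: ~50 mm, i.e. the given initial z value
print(1286.1201062769812 / mm2px)  # average eye radius: ~19.3 mm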

Keras type error : symbolic inputs/outputs donot match

raise TypeError('Keras symbolic inputs/outputs do not '
TypeError: Keras symbolic inputs/outputs do not implement __len__. You may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model. This error will also get raised if you try asserting a symbolic input/output directly

Blank output

I tried running the "test_if_model_work.py" file. The test_image.png file included with the python and h5 files didn't work, so I just took an image off google images. However, the output image is black. The numbers in the outputted array are on the order of 10^-5, 10^-6 and 10^-7. When I tried scaling these to the 0-255 range, the output was just grayscale noise. I've included the image for completeness. I only slightly modified test_if_model_works, and did not modify anything else. Here is the slightly modified test_if_model_work code I used:

(Edit: I downloaded all of the codes in the DeepVOG folder, but it is not clear which ones are dependencies and which is the main code that runs everything.)

def test_if_model_work():
    model = load_DeepVOG()
    img = np.zeros((1, 240, 320, 3))
    reader = ski.imread("test_image.png") / 255
    reader.resize((240, 320))
    img[:, :, :, :] = reader.reshape(1, 240, 320, 1)
    prediction = model.predict(img)
    ski.imsave("test_prediction.png", prediction[0, :, :, 1])
    # print(prediction)
    viewer = ImageViewer(np.uint8(prediction[0, :, :, 1] * 10000000))
    viewer.show()

(attached image: test_image)

error about "No unprojected gaze lines or ellipse centres were added (not yet initalized). Use add_to_fitting() function to add them first"

Hello, thank you for sharing your project. I installed it as described in the README and started the fit step, but I got the error "No unprojected gaze lines or ellipse centres were added (not yet initalized). Use add_to_fitting() function to add them first". Besides, I want to know whether I can use the project to estimate gaze from videos captured by a DMS (driver monitoring system) NIR camera.

Demo Videos

The demo videos cannot be played; is there a way to play them?

Visualization of eye center

When trying out the visualization branch, I noticed that the eye center was off from where it was supposed to be. The pupil ellipse and gaze vector were accurate, but the blue line from the gaze to the eye center did not form a continuous vector as it should have.

I am assuming the error is in the unprojection but I have not tracked through the math.

the initialization parameters

deepvog/inferer.py, line 46:

self.mm2px_scaling = np.linalg.norm(self.ori_video_shape) / np.linalg.norm(self.sensor_size)  # convert mm to pixels
self.model = model
self.confidence_fitting_threshold = 0.96
self.eyefitter = SingleEyeFitter(focal_length=self.flen * self.mm2px_scaling,
                                 pupil_radius=2 * self.mm2px_scaling,
                                 initial_eye_z=50 * self.mm2px_scaling)  # eyeball model
Hello, I would like to ask a question about the initialization parameters. Because the test data I am using is synthetic image data, I am unable to determine the camera parameters.

  • mm2px_scaling: What is the meaning of this parameter? It seems to have no actual physical meaning.
  • initial_eye_z: It seems that initializing the z-coordinate of the eyeball this way cannot be used as the actual z-coordinate.
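
For reference, a small sketch of what mm2px_scaling evaluates to with the demo camera values; it is the number of pixels per millimetre along the sensor diagonal, used to express mm quantities such as pupil_radius and initial_eye_z in pixels (the values below are the demo ones, not defaults for synthetic data):

import numpy as np

ori_video_shape = np.array([240, 320])  # (height, width) in px
sensor_size = np.array([3.6, 4.8])      # (height, width) in mm
mm2px_scaling = np.linalg.norm(ori_video_shape) / np.linalg.norm(sensor_size)
print(mm2px_scaling)       # ~66.7 px/mm
print(50 * mm2px_scaling)  # initial_eye_z of 50 mm expressed in pixels (~3333 px)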

DeepVOG annotation

How exactly do we upload the annotation to the DeepVOG model once images are annotated? Are we to make a separate script for it or can you provide one?

Low GPU usage / Performance

I am running fitting on your demo script using the command:

python -m deepvog --fit ./demo.mp4 ./demo_eyeball_model.json -m -b 32

I've seen that in your paper (Section 3.1.5 - Inference Speed) you run your program at 130 Hz for batch sizes of 32; however, when I run your program on your demo files (even without visualisation) I am averaging around 15 Hz.

I am using a machine with the following specs:
CPU - Intel Xeon 12-core 2.5Ghz w/ Windows 10
GPU - Nvidia GeForce RTX 2080 Ti
RAM - 64GB

Python - 3.6.1
Tensorflow-gpu - 1.15.0
CUDA - 10.0
cuDNN - 7.6.5

Is there anything obvious that I am missing here that could lead to the weak performance?

ValueError

I've managed to run the code on the demo versions successfully. I then tried to use my own video (in .mp4 format) and receive this error:

ValueError: No way to determine width or height from video. Need -s in inputdict. Consult documentation on I/O.

I'm not sure whether my understanding of the documentation is wrong. I believe I have set the video size and sensor size correctly, and don't fully understand the above error. I have tried running the program using:

python -m deepvog --fit ./output.mp4 ./demo_eyeball_model.json -v ./demo_visualization_fitting.mp4 -b 32

as well as one command I found in the readme:

python -m deepvog --fit ./output.mp4 ./demo_eyeball_model.json --flen 12 -vs 300,400 -s 0.005,0.005 -b 32

Do you have any advice you could offer?

Issue in inferer.py with resizing.

The variable resizing is passed into _preprocess_image, but in the body of the function the variable shape_correct is used, which has the opposite meaning of resizing. Suggest renaming resizing in the function definition to shape_correct and changing line 277 to if not shape_correct:.

Pupil diameter

Good day!

Thanks for your research and implementation. Can I get the pupil diameter?
