lecturetracking / trackhd

An open-source, automated, lecture recording system that tracks the presenter in 4K video streams

C++ 86.27% CMake 0.82% Shell 0.72% C 12.20%

lecture-videos opencv lecture virtual-videography lecture-tracking c-plus-plus

trackhd's Introduction

TRACK4K

Track4K is an open source C++ project that takes a high-definition video of a lecture recording and produces a smaller, cropped output video that frames the lecturer. It uses image processing and computer vision algorithms to track the lecturer, then uses the lecturer's position to pan a virtual camera.

Getting Started

These instructions will help get the program and all its dependencies set up on your machine.

Prerequisites

These instructions assume that the project will be installed on a Linux-based system (preferably a Debian-based distribution). Track4K has been tested on Ubuntu 16.04.

To be able to run this project, you will need to first install the following dependencies:

ffmpeg (3.4 or later)
OpenCV 3 (3.2.0 or later)
OpenCV Extra Modules (latest version on repository)
C++ Libraries (6.3 or later)
CMake (3.8.0 or later)
git (2.10.2 or later)

Installation

FFmpeg

The standard repositories of your distribution may include FFmpeg 3.4+. If not, FFmpeg 3.4 can be built from source (more on that topic here).

For Ubuntu-based distributions, the PPA ppa:jonathonf/ffmpeg-3 allows for simpler installation without needing to build from source. The PPA can be added as follows:

$ sudo add-apt-repository ppa:jonathonf/ffmpeg-3
[Press enter when prompted]
$ sudo apt-get update

Downloading and Installing base dependencies

The first item on the install list (and the most important) is CMake, followed by git, C++ and various multimedia packages. The following terminal command downloads and installs them:

$ sudo apt-get install cmake git build-essential libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev libavfilter-dev libx264-dev libx265-dev libvpx-dev liblzma-dev libbz2-dev libva-dev libvdpau-dev

Downloading and Installing the OpenCV libraries

The next step is to download and install the OpenCV libraries. The necessary OpenCV library comes in two components. First download the core OpenCV library. Choose any directory as your download destination directory. Clone OpenCV from Git as follows:

$ cd `your_chosen_working_directory`
$ git clone https://github.com/opencv/opencv

Next, repeat the process for the Extra modules. Remain in the same working directory and execute the following terminal command:

$ git clone https://github.com/opencv/opencv_contrib

You should now have two folders in your working directory. The next step is to build OpenCV.

Building the OpenCV library

Your chosen directory now contains two folders, opencv and opencv_contrib. The opencv folder contains the main OpenCV libraries and opencv_contrib contains the extra modules.

$ cd `your_chosen_working_directory`

Inside the main OpenCV folder, change directory into the build folder (create one if it does not exist) and remove all files, since it will require rebuilding. To rebuild OpenCV run the following command from within the build folder:

$ cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..

This step generates a Makefile. Next, run make with the -j flag to speed up the build (the number after the flag is the number of processors the job will use). If you are not sure how many processors the machine has, use the following command to find out:

cat /proc/cpuinfo | grep processor | wc -l

Use the result as the value of the -j flag:

$ make -j`processor_count`

Remain in the build folder and run the following cmake command to configure the extra modules. The path described below is an example. Fill in the directory path on your machine which points to the OpenCV Extra modules folder.

cmake -DOPENCV_EXTRA_MODULES_PATH=`OpenCV_Extra_Modules_Folder_Path`/modules ../

The next step is to build these files:

$ make -j`processor_count`

Finally, install these modules by running the following command:

$ sudo make install

Building Track4K

Automatic Method

There is a shell script in the trackhd folder called install_track4k.sh which can be used to install track4k automatically. To use this script, run the following command:

sudo ./install_track4k.sh

This will run all the steps listed in the manual method mentioned below.

Manual Method

This method is for the case where the automatic method does not work. It does everything the shell script does manually.

The trackhd directory should have two main folders inside it: source and build. The source folder contains all the header and source files, while the build folder contains all object files and executables. The first step is to navigate into the build folder. Once inside, delete all files (if any) and then type the following command in the terminal:

cmake ../source

Now it is possible to run the build instruction:

make -j`number_of_processors`

You can now install the project to /usr/local/bin/ by running the following command:

sudo make install

Then build cropvid:

cd cropvid
./build.sh
sudo cp cropvid /usr/local/bin/

Running Track4K

Track4K runs in two parts: track4k analyzes a video file and produces a cropping data file in text format. cropvid crops the video file according to the cropping information in the data file, using ffmpeg libraries.

$ track4k <inputFileName> <outputFileName> <output-width> <output-height>
$ cropvid <input file> <output file> <cropping file>

Example:

track4k presenter.mkv presenter-crop.txt 1920 1080
cropvid presenter.mkv tracked.mkv presenter-crop.txt

Track4K can also output the cropping information in JSON format, when the output filename has a .json extension:

Example:

track4k presenter.mkv presenter-crop.json 1920 1080

The JSON format includes a timestamp as well as a frame number. The timestamp is only guaranteed to be accurate when the source video has been recorded with a fixed frame rate. It may be incorrect for variable frame rate source videos.
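One way to see why the caveat matters: if a timestamp is derived from the frame number alone, it is only correct when every frame lasts the same length of time. The sketch below shows that derivation. The function name is hypothetical and this is not taken from the Track4K source; it only illustrates the fixed-frame-rate assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Track4K): timestamp in seconds of a
// zero-based frame index, assuming a constant frame rate. For a variable
// frame rate source this value can drift away from the real timestamp.
double frameTimestampSeconds(std::int64_t frameIndex, double fps) {
    assert(fps > 0.0);
    return static_cast<double>(frameIndex) / fps;
}
```

For example, frame 50 of a 25 fps recording would be placed at 2 seconds.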

Memory Requirements

The program reads a maximum of 29 frames into memory at a time, so 4 GB of RAM should be sufficient.
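As a rough check of that figure, the size of the raw frame buffer can be estimated as below. This is a sketch that assumes uncompressed 8-bit, 3-channel frames at 3840x2160; the function name is hypothetical, and any working copies the program makes would add to the total.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold `frames` uncompressed frames in memory.
// The default of 3 bytes per pixel assumes 8-bit, 3-channel frames.
std::uint64_t frameBufferBytes(std::uint64_t width, std::uint64_t height,
                               std::uint64_t frames,
                               std::uint64_t bytesPerPixel = 3) {
    assert(width > 0 && height > 0);
    return width * height * bytesPerPixel * frames;
}
```

Under these assumptions, 29 frames of 3840x2160 video occupy 721,612,800 bytes (about 688 MiB), which leaves headroom within 4 GB.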

Built With

OpenCV - The computer vision library of choice

License

Copyright 2016 Charles Fitzhenry / Mohamed Tanweer Khatieb / Maximilian Hahn

Licensed under the Educational Community License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.osedu.org/licenses/ECL-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

trackhd's People

Forkers

mtkhatieb cilt-uct slampunk interactivezoomuniosnabrueck polimediaupv guangwensi tommyteavee

trackhd's Issues

Handle cases where PTS timestamps in source video are not monotonic

In some cases, the input video can contain invalid PTS timestamps, e.g. timestamps that go backwards. This has only been seen once, in a short section of video that appeared to be corrupt.

Nevertheless, for robustness it is better to deal with these cases rather than be unable to process the video.
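One possible way to handle this, sketched below, is to repair the timestamp sequence before using it, so that each value is strictly greater than the one before. The function name and the "previous + 1" repair rule are assumptions made for illustration; the issue does not say how the fix should be implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical repair pass: force a PTS sequence to be strictly increasing
// by bumping any value that does not exceed its predecessor to
// predecessor + 1. Valid timestamps are left unchanged.
std::vector<std::int64_t> makeMonotonic(std::vector<std::int64_t> pts) {
    for (std::size_t i = 1; i < pts.size(); ++i) {
        if (pts[i] <= pts[i - 1]) {
            pts[i] = pts[i - 1] + 1;
        }
    }
    return pts;
}
```

With the input {0, 3000, 6000, 2000, 9000}, only the backwards value 2000 changes (to 6001).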

Add description and website link to github project

Please edit github project info - currently showing "No description, website, or topics provided." at the top. Should have a short description and link to http://track4k.co.za

Add ability to accept command line parameters

Add functionality to the MainDriver class to accept parameters from the command line.
Parameters include:

Input filename
Output filename for cropped video and board video

First crop co-ordinates start at y=0

The y-value for the crop co-ordinates doesn't seem to be set correctly until the first movement.

# track4k presenter.mkv 89877 frames (frame top-left-x top-left-y) output frame size 1920 1080
0 0 0
24 0 600
79 1 600
80 2 600

As track4k is currently pan-only, write out the co-ords with fixed y value.

Aspect ratio calculation

The function to check aspect ratio is not robust enough and fails when two boards overlap. This is because the function only classifies a region as a board if a certain aspect ratio is met. It might need to check colour and shape as well to make a better classification of boards.

Light Correction not working well and inefficient

Light correction is used to detect boards under varying lighting conditions. Currently it only corrects the light for the frame needed. The parameters need more experimentation, and light correction should perhaps be applied only when the lighting is bad. So add a way to check light quality first (assuming this would be more efficient).

Ignore smaller objects

In this example, the tracking is distracted by the white beanie on the right of the frame and moves away from the lecturer on the left.

Detect real top and bottom of the image and adjust crop position if necessary

On some cameras, privacy masks can be set to black out the top and/or bottom of an image.

If these have been set, then track4k should not select a crop frame that includes any portion of the blacked-out part, i.e. the crop y co-ordinates should snap to the top and/or bottom of the real image, or, if the real image height is less than the crop height, then centre the real image so there's an equal height of black above and below the real image.
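The snapping rule described above could be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes the top and bottom rows of the real image (realTop inclusive, realBottom exclusive) have already been detected by some other step.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: choose the crop's top y co-ordinate so that the crop
// avoids blacked-out bands where possible.
// realTop is the first row of real image, realBottom is one past the last.
int snapCropY(int desiredY, int cropHeight, int realTop, int realBottom,
              int frameHeight) {
    assert(0 <= realTop && realTop < realBottom && realBottom <= frameHeight);
    assert(0 < cropHeight && cropHeight <= frameHeight);
    const int realHeight = realBottom - realTop;
    int y = 0;
    if (realHeight <= cropHeight) {
        // Real image is shorter than the crop: centre it in the crop.
        y = realTop - (cropHeight - realHeight) / 2;
    } else {
        // Snap the crop so it stays inside the real image.
        y = std::max(realTop, std::min(desiredY, realBottom - cropHeight));
    }
    // Never let the crop leave the frame itself.
    return std::max(0, std::min(y, frameHeight - cropHeight));
}
```

For a 2160-row frame whose real image spans rows 200 to 1960, a requested y of 0 snaps to 200 for a 1080-row crop.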

Rectangle detection very redundant

Need to remove redundant rectangle detection methods and combine into one function that takes parameters to determine which rectangle detection to use. This is used to detect boards

Write out co-ordinates for board analysis instead of full video

Output the co-ordinates of blackboards instead of a full video.

out_w:out_h:x:y

to make it easy to run a follow-up ffmpeg process

http://video.stackexchange.com/questions/4563/how-can-i-crop-a-video-with-ffmpeg
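A minimal sketch of producing that string in the order ffmpeg's crop filter expects (width, height, x, y). The function name is hypothetical and not part of Track4K.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: format board co-ordinates as an ffmpeg crop filter
// argument of the form crop=out_w:out_h:x:y.
std::string cropFilter(int outW, int outH, int x, int y) {
    assert(outW > 0 && outH > 0 && x >= 0 && y >= 0);
    return "crop=" + std::to_string(outW) + ":" + std::to_string(outH) + ":" +
           std::to_string(x) + ":" + std::to_string(y);
}
```

The result can then be passed to ffmpeg's -vf option, for example -vf "crop=1920:1080:24:600".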

Testing

Test that the system works on Ubuntu and that no code was accidentally removed in the cleaning process.

Check input resolution and exit if it's the same or less than output resolution

If you give a 1280x720 video to track4k with an output resolution of 1280x720, it fails with an assertion error.

There's nothing meaningful to do in this instance (may as well use the input video unchanged), so track4k should check at startup and exit with an error message if input resolution is equal to or smaller than output resolution.

OpenCV: FFMPEG: tag 0x34363258/'X264' is not supported with codec id 28 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x00000021/'!???'
Input frame resolution: Width=1280  Height=720 of nr#: 7499
Input codec type:
Video Duration (Seconds): 299
FPS: 25
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /usr/local/src/track4k/opencv/modules/core/src/matrix.cpp, line 526
terminate called after throwing an instance of 'cv::Exception'
  what():  /usr/local/src/track4k/opencv/modules/core/src/matrix.cpp:526: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat
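The startup check requested in this issue might look like the sketch below. The function name is hypothetical. Treating an input that is larger in only one dimension as invalid is an assumption, since the issue only covers inputs that are equal or smaller overall.

```cpp
#include <cassert>

// Hypothetical startup check: true only when the input frame is strictly
// larger than the requested output in both dimensions, so that a crop
// window can actually move inside it.
bool inputLargerThanOutput(int inW, int inH, int outW, int outH) {
    assert(outW > 0 && outH > 0);
    return inW > outW && inH > outH;
}
```

With this check, a 1280x720 input and a 1280x720 output would be rejected before any frames are processed, rather than failing later with the assertion shown above.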

Build cropvid fails on clean system

Building cropvid as described in the docs on a clean Ubuntu 16.04 system fails:

Package libavfilter was not found in the pkg-config search path.
Perhaps you should add the directory containing `libavfilter.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libavfilter' found
cropvid.c:36:34: fatal error: libavfilter/avfilter.h: No such file or directory

Large repo size

The repo is about 89M to check out, when it should be tiny:

~ $ git clone https://github.com/LectureTracking/trackhd.git
Cloning into 'trackhd'...
remote: Counting objects: 604, done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 604 (delta 9), reused 0 (delta 0), pack-reused 578
Receiving objects: 100% (604/604), 89.01 MiB | 169.00 KiB/s, done.
Resolving deltas: 100% (333/333), done.

There are some techniques for cleaning up git history that you could look into, otherwise you may want to consider deleting and re-creating the repo with a clean commit of source only.

Copy Licence into all code files & Repo structure

Copy the licence into all files and into the GitHub repository, and update the readme once all is completed. Rename folders to single lowercase names and ensure the .gitignore file prevents git from adding build/executable files.

Process mp4s correctly with cropvid

Test case video
https://github.com/LectureTracking/trackhd/files/1464667/vid.zip

Tracking section code cleanup and documentation

Clean debug and unnecessary code from the project and add comments where necessary.

Virtual Cinematographer Code Cleanup

Rename Variables
Remove unnecessary and redundant code

OpenCV Error: Assertion failed

Stage [3 of 3] - Virtual Cinematographer

baqwa-720p.mp4
BoardSegment-baqwa-720pmp4
875967064
OpenCV: FFMPEG: tag 0x34363258/'X264' is not supported with codec id 28 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x00000021/'!???'
OpenCV: FFMPEG: tag 0x34363258/'X264' is not supported with codec id 28 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x00000021/'!???'
[flv @ 0x18e4b40] AMF_DATA_TYPE_STRING parsing failed
Input frame resolution: Width=3840 Height=2160 of nr#: 1948
Input codec type:
Video Duration (Seconds): 67
FPS: 29
End of video file
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /srv/dev/opencv/modules/core/src/matrix.cpp, line 522
terminate called after throwing an instance of 'cv::Exception'
what(): /srv/dev/opencv/modules/core/src/matrix.cpp:522: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat

Board detection - Using static binary threshold is not working

Currently the board is detected based on shape, aspect ratio, size as well as its intensity. It is assumed that the boards are green and when using a threshold, this area will be black, and for each rectangle assumed to be a board, the colour is evaluated and if it is black, this adds confidence to the classification. However under varying light, such as very bright light, the boards turn out to be white after applying the threshold, so the threshold value needs to be variable based on current lighting conditions. Need to make a method to detect light levels and then based on that select a threshold value.

Code cleanup and documentation for Board Segmentation stage

Remove unused methods, variables and classes. Rename variables that are ambiguous and provide comments explaining methods and classes in more detail.

Memory efficiency issue

Currently each frame is read into a C++ vector. This could be changed to initialize a pointer array that stores the frames, so that the pointer can be passed around instead of cloning the vector and frames several times.
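The same goal can also be reached without a raw pointer array, by passing the existing vector by const reference so that no frames are copied. The sketch below uses that alternative technique, not the pointer array proposed in the issue. Frame is a stand-in type invented for this example (the project's frames would be OpenCV images), and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an image type; invented for this example only.
struct Frame {
    std::vector<unsigned char> pixels;
};

using FrameBuffer = std::vector<Frame>;

// Takes the buffer by const reference: the caller's frames are read in
// place and nothing is cloned.
std::size_t totalBytes(const FrameBuffer& frames) {
    std::size_t total = 0;
    for (const Frame& frame : frames) {
        total += frame.pixels.size();
    }
    return total;
}
```

Any processing stage that only reads frames can take the buffer the same way; a stage that needs to modify them can take a non-const reference instead.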

Start panning from centre rather than left of image

Track4K always starts panning from the very left of the image.

If a presenter has already been identified, the first frame should be positioned on the presenter (rather than panning to the presenter).

If there's no identified presenter in the first frame, then start in the horizontal centre of the image.
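The requested starting behaviour could be sketched as below. The function name and the use of a negative value to mean "no presenter found" are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: left edge (x) of the first crop window.
// presenterCentreX < 0 means no presenter was found in the first frame,
// in which case the crop starts at the horizontal centre of the image.
int initialCropX(int frameWidth, int cropWidth, int presenterCentreX) {
    assert(0 < cropWidth && cropWidth <= frameWidth);
    const int centreX =
        presenterCentreX >= 0 ? presenterCentreX : frameWidth / 2;
    const int x = centreX - cropWidth / 2;
    // Keep the crop window inside the frame.
    return std::max(0, std::min(x, frameWidth - cropWidth));
}
```

For a 3840-pixel-wide frame and a 1920-pixel-wide crop, the no-presenter case starts at x = 960 rather than x = 0.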

VC Documentation

Comment methods:

List input variables by type
Describe purpose of the method
List output variables by type

Output framerate and duration differs from source

ffprobe presenter-MAM1000W-20160811.mp4 shows:

Duration: 00:50:00.01, start: 0.033367, bitrate: 5655 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 5652 kb/s, 29.95 fps, 29.97 tbr, 90k tbn, 180k tbc (default)

Output processed to 1920x1080p shows:

Duration: 00:51:38.28, start: 0.000000, bitrate: 1925 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 1922 kb/s, 29 fps, 29 tbr, 14848 tbn, 58 tbc (default)

It looks like the framerate has been scaled down so the duration is incorrect, and possibly the frame timestamp information has been dropped (which is important if the source is not a fixed framerate).
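The reported numbers are consistent with that reading: playing the same frames at 29 fps instead of 29.95 fps stretches a 50:00.01 source to roughly 51:38. The sketch below reproduces the arithmetic; the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Duration in seconds of a clip when the same frames are played back at
// outputFps instead of the sourceFps they were recorded at.
double retimedDurationSeconds(double sourceSeconds, double sourceFps,
                              double outputFps) {
    assert(sourceFps > 0.0 && outputFps > 0.0);
    return sourceSeconds * sourceFps / outputFps;
}
```

Calling retimedDurationSeconds(3000.01, 29.95, 29.0) gives about 3098.29 seconds, i.e. about 51:38.29, which is close to the 51:38.28 shown by ffprobe for the output.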

Improve installation instructions

Suggestions from @slampunk

On the GitHub README.md:

Where applicable, list the minimum required versions for prerequisite packages [e.g. CMake (>= 3.5)]
"Firstly change into the OpenCV directory (download destination)..."

I would assume I need to cd into "<your_chosen_working_directory>/opencv", where in fact, I'd simply need to be in "<your_chosen_working_directory>".
Have it read as

"Your chosen working directory now contains two folders, opencv and opencv_contrib. The opencv folder contains the main OpenCV libraries, and the opencv_contrib...."

Consider the use of backticks for inline nouns (commands, packages, directories). Makes it clear that the word refers to a terminal/console item.
"Inside the main OpenCV folder, change directory into the build folder and remove all files, since it will require rebuilding. "

build folder did not exist for me.
Append with something along the lines of....
"Create a build folder if it does not exist".

make -j8 assumes a cpu with at least 8 cores. Maybe add "cat /proc/cpuinfo | grep processor" to get number of cpus.

Installation Automation

Write an installation README explaining how to install the program's dependencies and run the program itself.
Automate the installation process of the program (excluding dependencies) by writing a shell script.

Open Source Licensing

Paste the Open Source license into every header and source file