
ETA: Extensible Toolkit for Analytics

An open and extensible computer vision, machine learning and video analytics infrastructure.


Requirements

ETA is very portable:

  • Installable on Mac or Linux
  • Supports Python 3.6 or later
  • Supports TensorFlow 1.X and 2.X
  • Supports OpenCV 2.4+ and OpenCV 3.0+
  • Supports CPU-only and GPU-enabled installations
  • Supports CUDA 8, 9 and 10 for GPU installations

Installation

You can install the latest release of ETA via pip:

pip install voxel51-eta

This will perform a lite installation of ETA. If you use an ETA feature that requires additional dependencies (e.g., ffmpeg or tensorflow), you will be prompted to install the relevant packages.
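
As a quick sanity check, you can verify that the core library imports cleanly. This is a minimal sketch; it only assumes that the voxel51-eta package provides the eta namespace and the eta.core utilities mentioned elsewhere in this README:

# Minimal post-install sanity check for a lite installation
import eta
import eta.core.utils as etau

print("ETA imported successfully from", eta.__file__)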

Docker Installation

If you prefer to operate via Docker, see the Docker Build Guide for simple instructions for building a Docker image with an ETA environment installed.

Installation from source

Step 0: Setup your Python environment

It is assumed that you already have Python installed on your machine.

IMPORTANT: ETA assumes that the version of Python that you intend to use is accessible via python and pip on your path. In particular, for Python 3 users, this means that you may need to alias python3 and pip3 to python and pip, respectively.

We strongly recommend that you install ETA in a virtual environment to maintain a clean workspace.

Step 1: Clone the repository

git clone https://github.com/voxel51/eta
cd eta

Step 2: Run the install script

bash install.bash

Note that the install script supports flags that control things like (on macOS) whether port or brew is used to install packages. Run bash install.bash -h for more information.

For Linux installs, the script inspects your system (via the lspci command) to determine whether CUDA is available. If so, TensorFlow is installed with GPU support.

The table below lists the version of TensorFlow that will be installed by the installer, as recommended by the tested build configurations:

CUDA Version Found    TensorFlow Version Installed
------------------    ----------------------------
CUDA 8                tensorflow-gpu~=1.4
CUDA 9                tensorflow-gpu~=1.12
CUDA 10               tensorflow-gpu~=1.15
Other CUDA            tensorflow-gpu~=1.15
No CUDA               tensorflow~=1.15

Note that ETA also supports TensorFlow 2.X. The only issues you may encounter when using ETA with TensorFlow 2 arise when running inference with ETA models that only support TensorFlow 1; TF-slim models are a notable case. In such situations, you will see an informative error message alerting you to the requirement mismatch.

Lite installation

Some ETA users are only interested in using the core ETA library defined in the eta.core package. In such cases, you can perform a lite installation using the -l flag of the install script:

bash install.bash -l

Lite installation omits submodules and other large dependencies that are not required in order for the core library to function. If you use an ETA feature that requires additional dependencies (e.g., ffmpeg or tensorflow), you will be prompted to install the relevant packages.

Developer installation

If you are interested in contributing to ETA or generating its documentation from source, you should perform a developer installation using the -d flag of the install script:

bash install.bash -d

Setting up your execution environment

When the root eta package is imported, it tries to read the eta/config.json file to configure various package-level constants. Many advanced ETA features, such as pipeline building and model management, require a properly configured environment to function.

To set up your environment, create a copy of the example configuration file:

cp config-example.json eta/config.json

If desired, you can edit your config file to customize the various paths, change default constants, add environment variables, customize your default PYTHONPATH, and so on. You can also add additional paths to the module_dirs, pipeline_dirs, and models_dirs sections to expose custom modules, pipelines, and models to your system.

Note that, when the config file is loaded, any {{eta}} patterns in directory paths are replaced with the absolute path to the eta/ directory on your machine.

The default config includes the modules/, pipelines/, and models/ directories on your module, pipeline, and models search paths, respectively. These directories contain the necessary information to run the standard analytics exposed by the ETA library. In addition, the relative paths ./modules/, ./pipelines/, and ./models/ are added to their respective paths to support the typical directory structure that we adopt for our custom projects.
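
For reference, here is a minimal sketch of inspecting your config from Python. It only assumes the keys named above (module_dirs, pipeline_dirs, models_dirs) and reads the raw JSON directly, so any {{eta}} patterns will appear unexpanded:

# Inspect the raw ETA config (reads the file directly, so {{eta}} patterns
# are shown unexpanded, unlike when the eta package loads the config)
import json

with open("eta/config.json", "r") as f:
    config = json.load(f)

print(config.get("module_dirs"))
print(config.get("pipeline_dirs"))
print(config.get("models_dirs"))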

CLI

Installing ETA automatically installs eta, a command-line interface (CLI) for interacting with the ETA Library. This utility provides access to many useful features of ETA, including building and running pipelines, downloading models, and interacting with remote storage.

To explore the CLI, type eta --help, and see the CLI Guide for complete information.

Quickstart

Get your feet wet with ETA by running some of the examples in the examples folder.

Also, see the docs folder for more documentation about the various components of the ETA library.

Organization

The ETA package is organized as described below. For more information about the design and function of the various ETA components, read the documentation in the docs folder.

Directory Description
eta/classifiers wrappers for performing inference with common classifiers
eta/core the core ETA library, which includes utilities for working with images, videos, embeddings, and much more
eta/detectors wrappers for performing inference with common detectors
eta/docs documentation about the ETA library
eta/examples examples of using the ETA library
eta/models library of ML models. The manifest.json file in this folder enumerates the models, which are downloaded to this folder as needed. See the Models developer's guide for more information about ETA's model registry
eta/modules library of video processing/analytics modules. See the Module developer's guide for more information about ETA modules
eta/pipelines library of video processing/analytics pipelines. See the Pipeline developer's guide for more information about ETA pipelines
eta/resources resources such as media, templates, etc
eta/segmenters wrappers for performing inference with common semantic segmenters
eta/tensorflow third-party TensorFlow repositories that ETA builds upon

Generating Documentation

This project uses Sphinx-Napoleon to generate its documentation from source.

To generate the documentation, you must install the developer dependencies by running the install.bash script with the -d flag.

Then you can generate the docs by running:

bash sphinx/generate_docs.bash

To view the documentation, open the sphinx/build/html/index.html file in your browser.

Uninstallation

pip uninstall voxel51-eta

Acknowledgements

This project was gratefully supported by the NIST Public Safety Innovation Accelerator Program.

Citation

If you use ETA in your research, feel free to cite the project (but only if you love it 😊):

@article{moore2017eta,
  title={ETA: Extensible Toolkit for Analytics},
  author={Moore, B. E. and Corso, J. J.},
  journal={GitHub. Note: https://github.com/voxel51/eta},
  year={2017}
}

Contributors

allenleetc, anddraca, aturkelson, benjaminpkane, brimoor, chrisstauffer, clementpinard, dependabot[bot], ehofesmann, findtopher, iantimmis, j053y, jasoncorso, jeffreydominic, jinyixin621, kevinqi34, kunyilu, lethosor, mattphotonman, mikejeffers, nebulae, rohis06, rpinnaka, sashankaryal, swheaton, tylerganter, yashbhalgat


eta's Issues

Support `eta run --dry-run` flag

Request to add pipeline support for a dry-run mode that removes all configs and output files, for the case where the user only wants to see the stdout.

Fresh install does not install tensorflow --- NO WAIT, sudo bash or bash...

Now that vgg is in the repo, we should have the install scripts install tensorflow.
On my Mac, I got this after running the install script and then running embed_image.py:

jcorso@newbury-2 /voxel51/w/eta
$ cd examples/embed_vgg16
/voxel51/w/eta/examples/embed_vgg16
jcorso@newbury-2 /voxel51/w/eta/examples/embed_vgg16
$ python embed_image.py 
Traceback (most recent call last):
  File "embed_image.py", line 20, in <module>
    import tensorflow as tf
ImportError: No module named tensorflow

Ah, after digging a bit deeper, this is actually a problem with the install script. It got up to the Python installation steps, but then quit (without a message) because they failed. My suspicion is that those steps were not executed as sudo, and my Python requires sudo for installing packages for some reason that escapes me (this is on a Mac).

So, something needs to be changed or improved, even if it is just the documentation on how to run install_externals as sudo.

Thoughts?

pipelines: need a way to have global config settings inherited by individual modules

If I have a pipeline with a dozen modules, they may all require a "frames" setting because they are all working with the same video. It would be far easier to set this once in the top-level config and have it inherited by the individual modules, and there is less room for error.

This would be harder if there are multiple videos. But, even less room for error.

(This is a thought I had while working with the pipeline bits. Up for discussion, of course, but wanted to get it down.)
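
A minimal sketch of the inheritance idea, where a top-level "frames" value is used whenever a module does not set its own (the config structure below is purely illustrative, not ETA's actual pipeline config schema):

# Illustrative only: a top-level "frames" setting inherited by modules that
# do not override it (not ETA's actual pipeline config schema)
pipeline_config = {
    "frames": "1-1000",  # global setting
    "modules": [
        {"name": "resize_videos", "parameters": {}},                # inherits "1-1000"
        {"name": "embed_vgg16", "parameters": {"frames": "1-50"}},  # overrides
    ],
}

def frames_for(module, pipeline_config):
    # Use the module's own setting if present, else fall back to the global one
    return module["parameters"].get("frames", pipeline_config["frames"])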

Make eta.core.video.FramesRanges more general

The need to pass around sets of numbers like [1, 5, 6, 7, 10] or "1,5-7,10" is pretty general. We should upgrade the eta.core.video.FramesRanges class to provide this general functionality.

It should accept strings (including "*") and lists (including [])

eta.core.config.Config should also understand how to accept fields of this new type.
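
For illustration, here is a minimal sketch of the kind of parsing the upgraded class would need, accepting either a list or a string like "1,5-7,10" (this is not the actual eta.core.video implementation, and handling for "*" and [] is omitted):

# Sketch: parse "1,5-7,10"-style strings or lists into a sorted list of frames
def parse_frames(frames):
    if isinstance(frames, list):
        return sorted(set(frames))

    result = set()
    for part in frames.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            result.update(range(int(first), int(last) + 1))
        elif part:
            result.add(int(part))

    return sorted(result)

print(parse_frames("1,5-7,10"))        # [1, 5, 6, 7, 10]
print(parse_frames([1, 5, 6, 7, 10]))  # [1, 5, 6, 7, 10]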

Add an `eta.core.objects.BaseFrame` class to encapsulate Frame implementation

There are many types of objects that we will want to store in Frame-like classes. We should have a BaseFrame class that defines all the common functionality and then subclasses like DetectedFrame, EmbeddedFrame, TrackedFrame, etc. that are thin-wrappers over BaseFrame that specify what type of objects are in the list.
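
A minimal sketch of the proposed hierarchy (only the class names come from this issue; the bodies below are illustrative assumptions):

class BaseFrame(object):
    """Container for a list of per-frame objects of a single type."""

    def __init__(self, objects=None):
        self.objects = list(objects or [])

    def add(self, obj):
        self.objects.append(obj)


class DetectedFrame(BaseFrame):
    """A frame whose objects are detections."""


class EmbeddedFrame(BaseFrame):
    """A frame whose objects are embeddings."""


class TrackedFrame(BaseFrame):
    """A frame whose objects are tracked objects."""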

Add an `eta.core.config.Config.parse_enum` method to parse config fields that are enumerations

It would be useful to have an eta.core.config.Config.parse_enum() method that works like this:

class MyConfig(Config):

    def __init__(self, d):
        self.value = self.parse_enum(d, "value", Choices)

where the "enum" can be defined either as a class:

class Choices(Enum):
    A = valA
    B = valB

or a dict:

Choices = {
    "A": valA,
    "B": valB
}

A common pattern will be to use this mechanism when the user needs to choose between one of several classes or functions to use.
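
A minimal sketch of how parse_enum might behave for both forms of "enum" (this is a proposal sketch, not existing eta.core.config code):

# Sketch of the proposed behavior; not existing eta.core.config code
def parse_enum(d, key, choices):
    # `choices` may be an Enum subclass or a dict mapping names to values
    value = d[key]
    if isinstance(choices, dict):
        allowed = set(choices.keys())
    else:
        allowed = set(c.name for c in choices)  # Enum classes are iterable
    if value not in allowed:
        raise ValueError(
            "Field '%s' has value '%s', which is not in %s"
            % (key, value, sorted(allowed)))
    return value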

Developer samples and "User" samples

We have a samples directory now that has example code that seems to be intended for developers.

We also need to create samples for every module. Right now, I am putting such examples in the same place, but it is not clear to me that this is the right thing to do. Having examples that run pipelines will make using and extending eta much easier.

Also: I do not like the word samples here. These are examples.

Should modules be included as a package in eta?

Currently, modules is just a set of executable Python code that uses the eta codebase. It is not a package (it has no __init__.py file), but it is inside of eta within the repo. I'd suggest either moving it outside of the eta directory or turning it into a package.

Is there a fundamental reason why we would not want to allow modules to import other modules? It would not be possible to simply "import modulename", because the actual code may be executing somewhere else.

Sample Data Should Be A Separate Download

We have been putting the sample data into the repository, but this will quickly bloat the repository if we add any sizable amount of data, making it hard to work with. We need to establish a separate data dump that can be fetched if the user wants to run the examples, etc.

new requirement dill

eta.core.serial now imports dill, so it should be added to requirements.txt, e.g.:

dill==0.2.7.1

Serializable needs a write_json method

Need to add a Serializable.write_json method. We really shouldn't be calling serial.write_json directly. Data I/O to disk should almost always be done through a "data class" that implements Serializable.
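
A minimal sketch of the proposed method, assuming the existing module-level eta.core.serial.write_json helper and a serialize() method on Serializable (both are assumptions about the current API, not confirmed signatures):

# Sketch only; serialize() and the write_json signature are assumptions
import eta.core.serial as serial

class Serializable(object):
    # ... existing Serializable functionality ...

    def write_json(self, path):
        """Serializes this object and writes it to disk as JSON."""
        serial.write_json(self.serialize(), path)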

Installs: virtualenv and cross-platform

Probably not best practice to rely on system-wide installs.

Also: the Mac parts rely on brew. Some of us use port (MacPorts) instead of brew. How do we reconcile this? (Virtualenv?)

Need ability to include/run one pipeline within another

Options:
(A) support this only at the pipeline metadata level by adding a "pipelines" field that allows access to I/O of other pipelines. When a pipeline is built, a single pipeline config would be populated based on this information
(B) support this at the pipeline config level by allowing pipeline configs to point to other pipeline configs.

I'm leaning towards (A).

VGG16Featurizer should force user to call start() and stop()

Currently if we use eta.core.vgg16.VGG16Featurizer without explicitly calling start() and stop(), it will silently load and destroy a huge CNN every time featurize() is called. This is never what the user really wants.

I can see why Featurizer allows this to silently happen (setup/tear-down could be cheap), but VGG16Featurizer should raise an error here.

The other option is to set keep_alive=True, but then the naive user would be carrying around a CNN in memory, which also deserves an error.
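
For reference, the usage pattern the issue is advocating looks roughly like this (only start(), stop(), and featurize() come from the issue; the constructor arguments and surrounding code are assumptions):

# Sketch of the intended usage: load the CNN once, featurize many images,
# then tear it down once
from eta.core.vgg16 import VGG16Featurizer

images = []  # fill with decoded images (e.g., numpy arrays)

featurizer = VGG16Featurizer()  # constructor args, if any, are assumptions
featurizer.start()              # load the network once
try:
    embeddings = [featurizer.featurize(img) for img in images]
finally:
    featurizer.stop()           # tear the network down once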

Make eta.core.utils.parse_dir_* methods into builder methods of DataFileSequence

Methods like eta.core.utils.parse_dir_pattern and eta.core.utils.parse_bounds_from_dir_pattern should be converted into builder methods of eta.core.data.DataFileSequence, which should be our one-stop shop for all file-sequence-related operations.

(I like eta.core.data.DataFileSequence --- this idea has been sorely missing)

`Serializable` needs to be reflective

We need everything in eta that is written via json to be reflective. This would enhance and simplify overall functionality.

I also think we should deprecate from_json and write_json to just read and write.

Is a custom OpenCV build necessary or worth it?

We currently build OpenCV from source during our external installs, but it is causing us pain every time we re-install ETA on a new machine (new developers, production deployments, etc). Moreover, the only customization we currently do is setting the WITH_CUDA flag.

Should we continue building OpenCV from source, or would pip install opencv-python suffice for us?

Need ability to assign names to modules in pipelines

This will allow us to, for example, write a pipeline that contains multiple instances of the same module in different places.

These "custom" names would be used when setting parameters and defining the module connections in the pipeline metadata file.

Functionality to query/list available modules and pipelines on the path

A new-to-ETA developer will want to get acquainted with the functionality available out of the box. A seasoned ETA developer will want to learn what new modules or pipelines have been added recently. A pipeline developer will need to list the available modules.

ETA needs an apt-cache-like functionality to navigate the module and pipeline space.

Formalize the notion of conditional execution of modules

For example:

  • only resize a video if it is above a certain allowed resolution. This is currently achieved via a max_size argument of the resize_videos module, but perhaps this is a general enough need that we should provide formal support for it.
  • only resize a video if a size argument is provided; if no argument is provided, the module should be "skipped" altogether. This is currently achieved on a per-module basis in the resize_videos module by symlinking the outputs to the inputs (see the sketch below), but perhaps this is a general enough need that we should provide formal support for it.
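
A minimal sketch of the per-module "skip" behavior described in the second bullet (function and parameter names are illustrative, not the actual resize_videos module code):

import os

def process_video(input_path, output_path, size=None):
    """Sketch: resize only when a size is provided; otherwise pass through."""
    if size is None:
        # No size provided: "skip" the module by symlinking output to input
        os.symlink(os.path.abspath(input_path), output_path)
        return

    # ...otherwise, actually resize the video (e.g., via ffmpeg or OpenCV)...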
