
pliers's Introduction

pliers: a python package for automated feature extraction

Badges: PyPI version · pytest · Coverage Status · Documentation Status · DOI:10.1145/3097983.3098075

Pliers is a Python package for automated extraction of features from multimodal stimuli. It provides a unified, standardized interface to dozens of different feature extraction tools and services--including many state-of-the-art deep learning-based models and content analysis APIs. It's designed to let you rapidly and flexibly extract all kinds of useful information from videos, images, audio, and text.

You might benefit from pliers if you need to accomplish any of the following tasks (and many others!):

  • Identify objects or faces in a series of images
  • Transcribe the speech in an audio or video file
  • Apply sentiment analysis to text
  • Extract musical features from an audio clip
  • Apply a part-of-speech tagger to a block of text

Each of the above tasks can typically be accomplished in 2 - 3 lines of code with pliers. Combining them all--and returning a single, standardized DataFrame--might take a bit more work. Say maybe 5 or 6 lines.
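To make that claim concrete, a single-task workflow might look like the following sketch (the extractor class shown is just one example; exact class names and method signatures may differ across pliers versions):

from pliers.stimuli import ImageStim
from pliers.extractors import FaceRecognitionFaceLocationsExtractor

# Wrap the raw file in a Stim, pick an extractor for that modality, and run it.
stim = ImageStim('my_image.jpg')
ext = FaceRecognitionFaceLocationsExtractor()
result = ext.transform(stim)
df = result.to_df()  # standardized DataFrame of extracted features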

In a nutshell, pliers provides a high-level, unified interface to a large number of feature extraction tools spanning a wide range of modalities.

Documentation

The official pliers documentation on ReadTheDocs is comprehensive, and contains a quickstart, API Reference, and more.

Pliers overview (with application to naturalistic fMRI)

Pliers is a general-purpose tool; naturalistic fMRI is just one domain where it's useful.

Tutorial Video

The video above is from a tutorial that is part of a course on naturalistic data.

How to cite

If you use pliers in your work, please cite both the pliers package and the following paper:

McNamara, Q., De La Vega, A., & Yarkoni, T. (2017, August). Developing a comprehensive framework for multimodal feature extraction. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1567-1574). ACM.

pliers's People

Contributors

adelavega, adswa, andrewheusser, anibalsolon, darius522, ejolly, emdupre, hugovk, jayeeta-roy, jayeetaroy, jdkent, jsmentch, kaczmarj, mgxd, mih, peerherholz, poldrack, qmac, rbroc, rogilmore, shabtastic, snastase, tirkarthi, tsalo, tyarkoni, yarikoptic


pliers's Issues

Add new text dictionaries

FeatureX has a PredefinedDictionaryExtractor class that takes a block of text as input and returns values for each word. For example, via an affective norms database, one can get the valence and arousal of the words in one's text.

Adding new dictionaries is as simple as adding new JSON dictionaries to the dictionaries.json file bundled with the package. Any file added there can subsequently be used in the PredefinedDictionaryExtractor. Since there are potentially hundreds of usable and useful text feature dictionaries on the web, it would be great to expand the current list of supported resources.
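For illustration, usage might look something like the sketch below (the dictionary and column names are placeholders; check dictionaries.json for the real ones, and note the exact API may have changed since this issue was filed):

from pliers.stimuli import ComplexTextStim
from pliers.extractors import PredefinedDictionaryExtractor, merge_results

stim = ComplexTextStim(text='the quick brown fox jumps over the lazy dog')
# 'affect/valence' and 'affect/arousal' are illustrative dictionary/column names
ext = PredefinedDictionaryExtractor(['affect/valence', 'affect/arousal'])
results = ext.transform(stim)   # one result per word in the text
df = merge_results(results)     # single DataFrame of word-level norm values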

Require explicit permission to run a large set of queries against API extractors?

At the moment, the graph API doesn't do anything to prevent a user from trying to run a full-length movie file through an image extractor, which could result in a very large number of queries (1 per frame) to an API extractor if users aren't careful. It might be a good idea to at minimum issue a warning when a large set of queries (e.g., > 100) to an API Extractor is detected, and possibly even require the user to set an explicit flag (e.g., large_jobs=True). Alternatively, we could disallow automatic VideoToImageStim conversion in cases where the resulting video frame set is very large.
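A minimal sketch of the proposed guard, assuming a large_jobs flag and a threshold of 100 queries (both hypothetical):

import warnings

LARGE_JOB_THRESHOLD = 100

def check_query_count(n_queries, large_jobs=False):
    """Warn or refuse before hitting an API extractor with a large set of queries."""
    if n_queries <= LARGE_JOB_THRESHOLD:
        return
    if not large_jobs:
        raise ValueError(f"{n_queries} API queries requested; "
                         "pass large_jobs=True to confirm this is intentional.")
    warnings.warn(f"About to issue {n_queries} queries to a remote API extractor.")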

Tests don't run

When I try to run the nose tests, they fail with:

from .stimuli import VideoStim, AudioStim, TextStim, ImageStim
ImportError: No module named stimuli

This probably means that some modifications haven't been pushed to GitHub yet.

Extractor registry

There's no centralized tracking of Extractors at the moment, which makes it difficult to search for specific extractors, properly attribute credit, etc. We should add some tools for annotating Extractors with information like author, purpose, description, citation, tags, etc.

Add memoization of Converters

There will be a lot of overhead calling Converters repeatedly if implicit Stim conversion is required. We can address this by memoizing the conversion functions with joblib or something similar.
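A minimal sketch of what joblib-based memoization could look like, assuming conversions are deterministic for a given input file (the function here is a placeholder, not the real converter):

from joblib import Memory

memory = Memory(location='/tmp/pliers_cache', verbose=0)

@memory.cache
def extract_audio_track(video_path):
    """Placeholder for an expensive VideoStim -> AudioStim conversion.
    Repeated calls with the same path hit the on-disk cache instead of re-running."""
    # ... the real implementation would invoke the conversion tool here ...
    return video_path.replace('.mp4', '.wav')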

GoogleVisionAPIFaceExtractor unusable output when multiple faces

The flattened output structure of the extractor does not contain an indicator that allows individual features to be bound to a specific face when multiple faces are detected. For every additional face, a new set of columns is added with identical names. It seems that column order cannot currently be used to infer where the features for an additional face begin.

It looks as if a per-face column name prefix could be a solution.
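A toy sketch of that fix, prefixing each face's flattened features with a face index so column names stay unique (illustrative only, not the current behavior):

def flatten_faces(face_annotations):
    """Flatten a list of per-face dicts into one row with face-indexed columns."""
    row = {}
    for i, face in enumerate(face_annotations, start=1):
        for key, value in face.items():
            row[f'face{i}_{key}'] = value
    return row

# flatten_faces([{'joy': 0.9}, {'joy': 0.2}])
# -> {'face1_joy': 0.9, 'face2_joy': 0.2}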

Automatic Stim adapters

Consider a situation where a user wants to take a VideoStim as input and apply the STFTExtractor (i.e., short-time Fourier transform) to the audio track. Currently, an exception will be raised, because the STFTExtractor only handles AudioStim inputs. However, since most movies have an audio track, featurex should be smart enough to attempt to automatically extract an AudioStim from a VideoStim and apply the audio extractor to the result (i.e., basically building an implicit graph) before it raises an exception. This isn't a high priority, but would be nice to have at some point.
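A rough sketch of the desired fallback behavior, using a toy converter registry (all names here are hypothetical stand-ins for whatever featurex ends up exposing):

CONVERTERS = {}   # maps (from_type, to_type) -> conversion callable

def register_converter(from_type, to_type, func):
    CONVERTERS[(from_type, to_type)] = func

def apply_with_fallback(extractor, stim, input_type):
    """Apply the extractor, implicitly converting the Stim first if needed."""
    if isinstance(stim, input_type):
        return extractor(stim)
    convert = CONVERTERS.get((type(stim), input_type))
    if convert is None:
        raise TypeError(f"No converter from {type(stim).__name__} "
                        f"to {input_type.__name__}")
    return extractor(convert(stim))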

StimCollection class

At the moment the standard way to apply extractors to a Stim is via an .extract call to the Stim--e.g.,

stim = ImageStim('my_image.jpg')
extractors = [ExtractorA(), ExtractorB(), ExtractorC()]
stim.extract(extractors)

This allows multiple Extractors to be applied at once to a single Stim, but it would be useful to do multiple stims at once. Some kind of StimCollection container that implicitly loops over Stims might be worth adding. Thoughts?
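Something as simple as the following sketch might do (hypothetical; it just mirrors the existing single-Stim .extract API over a list of Stims):

class StimCollection:
    """Hypothetical container that applies a set of extractors to every member Stim."""
    def __init__(self, stims):
        self.stims = list(stims)

    def extract(self, extractors):
        # mirror the single-Stim API: each member Stim gets every extractor
        return [stim.extract(extractors) for stim in self.stims]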

Consolidated list of all optional dependencies

It's getting hard to keep track of all the optional dependencies; we should add an optional_dependencies.txt file in the package root that users can pip install -r with if they want everything.

Add 'columns' field to dictionaries in dictionaries.json

It's a bit annoying that there's no way to know what the columns are in the lookup dictionaries supported in datasets/dictionaries.json without fetching them. We should add a mandatory 'column_names' field to the JSON objects that lists all valid column names (even if all columns in the target file are valid for use). This way users can easily scan dictionaries.json (and eventually, we can dynamically generate a table inside the docs). We could even extend this eventually to include an optional 'column_descriptions' that describes each column.
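A sketch of what an entry might look like with the proposed fields (the entry name, URL, and columns below are made up for illustration):

{
    "example_affective_norms": {
        "url": "https://example.org/norms.csv",
        "format": "csv",
        "column_names": ["valence", "arousal", "dominance"],
        "column_descriptions": {
            "valence": "mean pleasantness rating (1-9)",
            "arousal": "mean arousal rating (1-9)",
            "dominance": "mean dominance rating (1-9)"
        }
    }
}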

Filters vs. extractors

I'm implementing A-weighting, which filters the audio timeseries, and I was thinking about differentiating filters and extractors. It seems almost wasteful to create an event for every frame in an audio stream, and filters seem like they'd be used to preprocess data rather than to generate timelines.

If filters are sufficiently different, they may merit another submodule along with extractors and stimuli.

Thoughts?
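One toy way to frame the distinction (all names hypothetical): a Filter returns another stimulus of the same kind, whereas an Extractor returns feature values or a timeline.

class Filter:
    """Preprocessing step: transform a stimulus and return a new one of the same type."""
    def transform(self, stim):
        raise NotImplementedError

class GainFilter(Filter):
    """Toy example standing in for a real A-weighting filter: scales audio samples."""
    def __init__(self, gain=2.0):
        self.gain = gain

    def transform(self, samples):
        return [s * self.gain for s in samples]   # new 'stimulus', no events created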

Add economy config setting at package level

Some Extractors now create intermediate files en route to generating feature values. Since some of these are movies or images of same dimension as the original Stims, we could end up consuming a lot of memory. At some point we should add an economy config variable that determines how intermediate files are handled/stored. We'll then need to go over all existing Extractors and make sure they condition properly on that setting.

Add multi-step Converters

To really unlock the potential of the graph API, we need to support implicit conversion between Stim types that involve multiple steps--e.g., VideoStim to ComplexTextStim via an extracted AudioStim. There are (at least) two ways we could go about this:

  1. Recursively try to construct valid paths from the input Stim to the output Stim, and stop as soon as one is found. E.g., suppose we pass a VideoStim to a TextExtractor. Then get_converter would search all possible paths from VideoStim to TextStim until it found VideoStim --> AudioStim --> ComplexTextStim.
  2. Manually add Converter classes for all valid paths, which explicitly call the full chain internally. E.g., we would write a new VideoToComplexTextStimConverter with a _convert method that explicitly uses a VideoToAudioConverter class, then an AudioToTextConverter.

In principle, (1) is the cleaner and more extensible approach, but it introduces completely unnecessary computation when the number of valid paths between Stims is small (as it currently is). The main disadvantage of (2) is that if we add many more Stim types, we could end up with a combinatorial explosion.

I guess for now I favor (2), and if it starts to get unwieldy, we can move to (1).
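For reference, approach (1) could amount to a short breadth-first search over registered converters, along these lines (the registry structure is hypothetical):

from collections import deque

def find_conversion_path(converters, source, target):
    """converters maps (from_type, to_type) -> converter; returns a converter chain or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        if current is target:
            return path
        for (frm, to), conv in converters.items():
            if frm is current and to not in seen:
                seen.add(to)
                queue.append((to, path + [conv]))
    return None   # no valid path; the caller raises the usual exception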

This is a high-priority issue that we should try to get done before revamping the README, because it would be nice to be able to show a Graph example where the user only has to worry about the leaf nodes (all of which are Extractors), and doesn't have to explicitly think about the Converters.

Identify key frames in videos based on magnitude of difference between frames

Many of the APIs only work on images, but we want to process videos by passing in individual frames. To keep processing efficient (and costs low for paid services), we want to pass in as few frames as we can get away with. Rather than processing every Nth frame, we could take the diff between every two frames and identify frames where the scene changes to a significant degree. This could be a method implemented in VideoStim that could be called by any API-based extractor that loops over frames.
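A rough sketch of the idea, using a mean absolute pixel difference against the last retained frame (the threshold and the way frames are accessed are illustrative):

import numpy as np

def key_frame_indices(frames, threshold=20.0):
    """Return indices of frames that differ substantially from the last retained frame."""
    keep = [0]                                     # always keep the first frame
    prev = np.asarray(frames[0], dtype=float)
    for i, frame in enumerate(frames[1:], start=1):
        cur = np.asarray(frame, dtype=float)
        if np.abs(cur - prev).mean() > threshold:  # mean absolute pixel change
            keep.append(i)
            prev = cur                             # compare future frames to this one
    return keep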

Should implicit conversion output CollectionStimMixins?

Say we are passing an AudioStim through a LengthExtractor, which takes TextStim inputs. The implicit conversion will look for converters that go audio->text. However, most of the converters will instead have AudioStim->ComplexTextStim specified.

Should the implicit conversion also look for conversions to collection stimuli whose elements are of LengthExtractor's input type? Either way, it may be a good idea to put an element_type specification in all CollectionStimMixins.

Alternatively (and this is coincidentally what we have implemented now), we could just have converters specify AudioStim->TextStim (even though they actually output ComplexTextStim) and have the logic in transformers.py take over from there.

Multiple stims in API requests

A few of the APIs impose a request limit, with no penalty for including several stimuli in one request. It is therefore much more efficient to chunk stimuli into single API calls. Currently, each API converter/extractor is written to send one stimulus per request. This may be resolved by improving the graph module to automatically handle collections of stimuli.

Improve test coverage

We now have working continuous integration testing via travis-ci; the coveralls report is here. We're not doing too badly, but we should be able to get to 95%+ coverage without too much work. Additionally, as a secondary priority, many of the earliest tests I wrote are overly broad, and could stand to be refactored.

Improved docs: examples, tutorial and/or user guide

Currently the quickstart doc only provides the bare minimum of information about what the package does and how it runs. Pretty much any doc contributions would be great at this point. The easiest place to start might be by adding example Jupyter notebooks illustrating usage for different stimuli. A more comprehensive tutorial would also be nice. Ultimately we want to have a comprehensive user guide, but that can probably wait on #4.

Rename target to _input_type

For consistency and clarity, we should use _input_type and _output_type attributes to identify the expected types of all Stim inputs (and for Converters, the expected returned type).

Distinguish between Stim source and name

There's some ambiguity over what a Stim name means. Right now it defaults to the filename, but it's probably a good idea to separately track the source file and name. This becomes an issue mainly in the context of graphs, where we might want to propagate the initial source file to a Stim as it flows through the graph (e.g., annotated text extracted from a VideoStim should retain some indication of the original video file).

add SRT support

It would be useful to support text feature extraction from subtitle files.

Stop using opencv for image/movie loading

Currently ImageStims and VideoStims are loaded via opencv, which imposes an unnecessary (and difficult-to-install) dependency. OpenCV should only be imported when running extractors that depend on it; we should find an alternative solution for reading in stimuli. For images we could use scipy.misc.imread. Not sure about movies, but I think MoviePy might be the way to go.
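One possible OpenCV-free direction, sketched with imageio (in place of the now-deprecated scipy.misc.imread) and MoviePy; import paths and behavior may vary by library version:

import imageio
from moviepy.editor import VideoFileClip

def load_image(path):
    return imageio.imread(path)          # numpy array, no cv2 required

def iter_video_frames(path):
    clip = VideoFileClip(path)           # MoviePy handles video decoding via ffmpeg
    for frame in clip.iter_frames():
        yield frame                      # each frame is a numpy array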

include data with package

Many data files useful for extraction/annotation can be repackaged under their current license. This is particularly true of word norms (e.g., frequency, emotional valence and intensity, etc.), which can be included in the package to make text feature extraction much more useful out of the box. Key data files should be bundled with the package (or maintained in a separate submodule).

Quickstart needs update

from featurex.stims import VideoStim
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-fd7d648ab17b> in <module>()
----> 1 from featurex.stims import VideoStim

ImportError: No module named stims

Wishlist: Movie frame cropping before content labeling

While exploring the Google Vision API I found that it makes a big difference if movie frames are cropped (freed of any horizontal bars) before labeling. Without cropping they get "Screenshot" labels, but after cropping more of the actual content is tagged.

add option to retain original result dictionary in API extractors

For the Google extractors (and possibly other API extractors), we currently flatten the returned JSON object into a one-level dictionary. This makes life easy when working with pandas DFs, but users could potentially want direct access to the original result. This will require adding a new attribute to ExtractorResult, maybe called something like response, that can optionally be set when the instance is initialized.

Alternatively, we could have a generic metadata attribute on ExtractorResult that is itself a dictionary, which would allow different kinds of Extractors to set different kinds of metadata.

Add pipelines / chained extractors

A fairly common potential use case involves chaining multiple extractors--e.g., transcribing the audio track from a movie and then feeding it into a DictionaryExtractor. Currently there's no automatic way to take the results returned from one extractor and convert them into a Stim to feed into another. We should add a scikit-learn-like pipeline module that allows easy chaining of extractors.
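A minimal sketch of the idea: each step consumes the previous step's output, so converters can feed an extractor at the end (the step objects named in the comment are taken from the discussion above; the Pipeline class itself is hypothetical):

class Pipeline:
    """Chain transformers so the output of one becomes the input of the next."""
    def __init__(self, steps):
        self.steps = steps

    def transform(self, stim):
        result = stim
        for step in self.steps:
            result = step.transform(result)
        return result

# e.g., Pipeline([VideoToAudioConverter(), AudioToTextConverter(), DictionaryExtractor()])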

Switch to py.test for all testing

  • Switch to py.test
  • Simplify tests--we probably don't need the wrapper classes
  • Drop all unittest assertions in favor of just assert

Fix OpenCV dependency in Python 3

Some tests currently fail because OpenCV was difficult to install on Python 3 until recently. There now appears to be a conda installer, so we should fix the travis config to properly install OpenCV on both Python 2 and 3 (and make sure the tests pass).

add opencv to travis-ci

OpenCV (and/or its Python bindings) doesn't install properly on the travis env, so cv2-dependent tests fail.

wide format dataframes

Durations in wide-format data frames repeat if multiple values are extracted (e.g., from the Indico API). For SRT file types, the text is not provided.

add part-of-speech tagging

Wrap nltk's part-of-speech tagging and return a set of binary column features for, e.g., the universal part-of-speech tagset.
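A sketch of what the wrapper could do, using nltk's universal tagset and pandas dummy coding (output layout is illustrative; the relevant nltk data packages need to be downloaded first):

import nltk
import pandas as pd

def pos_features(text):
    """Tag each token and expand the universal POS tags into binary columns."""
    tokens = nltk.word_tokenize(text)                     # requires the 'punkt' data
    tagged = nltk.pos_tag(tokens, tagset='universal')     # requires tagger + tagset data
    df = pd.DataFrame(tagged, columns=['word', 'tag'])
    return pd.concat([df['word'], pd.get_dummies(df['tag'])], axis=1)

# pos_features('the quick brown fox jumps') -> one binary column per tag (DET, ADJ, NOUN, ...)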

Values shouldn't be nested

When multiple Transformers are applied to a single Stim, the returned Value objects are nested, such that the keys in the top-level Value.data dict are Transformer names, and the values are other Value instances (whose data attribute is a normal dictionary of values). This is counter-intuitive and kind of horrendous. The returned top-level object should probably be either a plain dict, or some new container class (e.g., ValueList).
