mitmedialab / ajl.ai

A web application for crowdsourcing image annotations.

License: GNU Affero General Public License v3.0

HTML 7.78% CSS 10.38% JavaScript 79.95% PLSQL 1.76% Shell 0.13%

ajl.ai's Introduction

Image Annotator

This tool is a work-in-progress collaboration between MIT Civic Media and Bocoup to build an open source tool for crowdsourcing image annotation. Our first campaign with the tool is focused on annotating demographics in VGG Faces.

You can check out that campaign at ajl.ai.

This repo currently contains both a base open source annotator and the VGG Faces annotation campaign.

In the future, we'll be working to publish reusable image annotation tools, along with our data and findings. In the meantime, this is a good starting point if you're looking to run a large-scale crowd image annotation campaign.

Repository Structure

ansible - deployment scripts for running the application on Ubuntu 16 server
backend - back-end API written in node/express/postgres
frontend - front-end client written in react/redux/webpack
jest - unit and integration tests
prototypes/ - exploratory interaction prototypes
- fabric-test/ - prototypes using fabric.js for rendering
- paper-test/ - prototype using Paper.js for rendering
- snap-svg-test/ - prototype using SVG to render landmarks
sample-data - initial sample data from researchers
static - initial static data for testing
api.md - REST API reference documentation

REST API

This project exposes a REST API with the following endpoints:

get /api/annotations/types
get /api/annotations/workload
post /api/annotations

More information available in api.md.
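
As a rough sketch of how a client might address these endpoints, the helper below builds request descriptors for the three routes. Only the route paths come from the list above. The `buildRequest` name, the placeholder base URL, and the JSON body handling are assumptions for illustration, and api.md remains the reference for the actual request and response formats.

```javascript
// Hypothetical helper: maps a logical action to one of the three documented routes.
// The route paths come from the README; everything else is an assumption.
const ROUTES = {
  annotationTypes: { method: 'GET', path: '/api/annotations/types' },
  workload: { method: 'GET', path: '/api/annotations/workload' },
  submitAnnotations: { method: 'POST', path: '/api/annotations' },
};

function buildRequest(action, baseUrl, body) {
  const route = ROUTES[action];
  if (!route) {
    throw new Error(`Unknown action: ${action}`);
  }
  const request = { method: route.method, url: baseUrl + route.path };
  // Only POST requests carry a JSON body in this sketch.
  if (route.method === 'POST') {
    request.headers = { 'Content-Type': 'application/json' };
    request.body = JSON.stringify(body);
  }
  return request;
}

// Placeholder base URL; substitute wherever the API server is actually running.
console.log(buildRequest('workload', 'http://example.test'));
```

The resulting descriptor can be handed to whichever HTTP client the project settles on.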

Deployment & Reporting

Image Annotator is configured to be deployable on any Ubuntu 16 server. All you need is Ansible 2.2.x and root access to your desired target machine.

The current VGG Faces campaign is deployed to Bocoup's infrastructure, which is orchestrated from bocoup/infrastructure-foundation.

In order to deploy to, or download reports from, this server, you will need to submit a PR to this (mitmedialab/ajl.ai) repo adding configuration for yourself to ansible/vars/users.yml. Once this has been done, ask one of the project contributors to provision your account and give you the "Vault password" needed to run the following commands:

npm run edit-secrets

This will open the secrets file in your default editor. Secret data is managed by Ansible Vault. All secrets are stored in ansible/vars/secrets.yml.

npm run provision:[production|staging|vagrant]

This will prepare a target machine with all system dependencies needed to run Image Annotator and grant collaborators access to run deployments. You will be prompted for both a Vault password and a SUDO password during this task. You will have configured your sudo password at the beginning of the project when you added yourself to ansible/vars/users.yml.

npm run deploy:[production|staging|vagrant] -- -e commit=master

This will clone your desired commit to the target machine, install all node dependencies, compile the site with webpack, apply any outstanding migrations to the database and restart the API server.

npm run database-restore:[production|staging|vagrant]

The database residing on the production server is backed up to S3 hourly. Running this for any target server will completely replace the database with the most recent backup. Take care not to do this for production unless you know what you are doing!

Reporting

Running the following commands from the root directory of this repository will dump a CSV from our production database to your local machine.

npm run download-annotations

This will export all annotation data to annotations.csv. You'll be prompted for a "Vault password" when running this.

npm run download-feedback

This will export all feedback form data to feedback.csv. You'll be prompted for a "Vault password" when running this. TODO: implement a feedback form in the UI. At the time of this writing we are using google forms for feedback.

Installation

Frontend

The front end runs on React, Redux, and Webpack.

Run npm install to install all front-end dependencies; then run npm start to launch the webpack dev server. The application will then be available at localhost:8080.

The front end currently needs the local node/postgres backend running (installation instructions below).

Other commands

These commands are available after installation within the ajl.ai/ directory:

npm test: run unit tests with Jest, then lint on exit
npm run lint: run ESLint to identify syntax & style issues in the code
npm run build: generate a static build into ajl.ai/dist
npm run storybook: launch React Storybook for component development (accessible at localhost:6006 by default)

Backend

The backend runs on node and PostgreSQL, with an Express-based set of middleware for the REST API we expose. In order to develop the application, you must have an installation of PostgreSQL.

Setting up Postgres

There are numerous ways to run PostgreSQL. Choose the one that is most familiar to you!

If you have Docker on your machine, do the following:

docker run --name ia-pg -d -p 5432:5432 -i postgres
docker exec -it ia-pg su postgres -c 'createdb image-annotator'
npm run migrate:up
npm start

# stop postgres
docker stop ia-pg

# start postgres
docker start ia-pg

If you have VirtualBox and Vagrant on your machine, do the following:

printf "PGHOST=10.10.0.100\nPGUSER=image-annotator" > .env
vagrant up
npm run provision:vagrant
npm run migrate:up
npm start

# stop postgres
vagrant halt

# start postgres
vagrant up

If you want to install PostgreSQL locally on Ubuntu, do the following:

sudo apt-get install postgresql
sudo service postgresql stop
sudo sh -c "printf 'local all all trust\nhost all all 127.0.0.1/32 trust\nhost all all ::1/128 trust' > /etc/postgresql/9.5/main/pg_hba.conf"
sudo service postgresql start
createdb -U postgres image-annotator
npm run migrate:up
npm start

# stop postgres
sudo service postgresql stop

# start postgres
sudo service postgresql start

If you want to install PostgreSQL locally on OSX, do the following:

Ensure you have brew installed. Then, run this:

printf "PGUSER=$USER" > .env
brew install postgresql
brew services start postgresql
createdb image-annotator
npm run migrate:up
npm start

# stop postgres
brew services stop postgresql

# start postgres
brew services start postgresql

Running the prototypes

cd into the prototype directory you want to look at and run npm start (requires Python; equivalent to running python -m SimpleHTTPServer 8000), or use a comparable static web server.

Navigate to the following prototypes in your browser:

test on 25 sample landmarked images of faces sample-data-rotate-regions
UI experiment for three annotation modes: general fabric.js test

Glossary of Terms

This section clarifies the verbiage and terms used within the application code.

Attribute

An attribute is the object representing a specific type of annotation a user will apply to any given image, such as demographic attributes like "perceived ethnicity" or positional values like "the location of the right eye."

Attribute properties:

Name: the human-oriented label of an attribute, such as "Perceived Gender."
Type: the type of data represented by the attribute, e.g. a list of multiple-choice options, or a set of coordinates, etc. Note: This is a database- / data model-oriented value, and has no inherent correspondence to front-end presentation.
Options: the list of accepted values for a multiple-choice image attribute.

Attribute objects have the shape:

{
    "name": "Attribute Name",
    "type": "type-of-data",
    "options": ["list", "of", "accepted", "values", "for", "multiple", "choice", "attributes"]
}

Annotation

An annotation is the object representing the submitted value for a particular Attribute.

Annotation properties:

Name: the name of the image attribute annotated, such as "Perceived Ethnicity".
Value: the value with which the image is annotated, such as "Asian" or "Black."

Annotation objects have the shape:

{
    "name": "Attribute Name",
    "value": "selected-value"
}
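
To illustrate how Attributes and Annotations relate, here is a minimal sketch that checks an Annotation against a list of Attributes. The function name, the "options" type string, and the sample values are assumptions for illustration, not the project's actual validation code.

```javascript
// Sketch: check that an Annotation names a known Attribute and, for
// multiple-choice attributes, that its value is one of the accepted options.
function isValidAnnotation(annotation, attributes) {
  const attribute = attributes.find((attr) => attr.name === annotation.name);
  if (!attribute) {
    return false;
  }
  // Attributes without an options list (e.g. coordinates) are not checked here.
  if (!Array.isArray(attribute.options)) {
    return true;
  }
  return attribute.options.includes(annotation.value);
}

// Sample attribute; the "options" type string is an assumed value.
const sampleAttributes = [
  { name: 'Perceived Gender', type: 'options', options: ['other', 'female', 'male'] },
];

console.log(isValidAnnotation({ name: 'Perceived Gender', value: 'female' }, sampleAttributes)); // true
console.log(isValidAnnotation({ name: 'Perceived Gender', value: 'unknown' }, sampleAttributes)); // false
```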

ImageAnnotation

An image annotation is an object associating one or more Annotations with the specific Image to which they were applied. It contains an array of annotations, and the ID of their associated image.

ImageAnnotation objects have the shape:

{
    "id": 3359,
    "annotations": [{
        "name": "Attribute Name",
        "value": "selected-value"
    }, {
        "name": "Another Attribute Name",
        "value": "selected-value"
    }]
}

Image

An image is a representation of an image to which Annotations will be applied.

Image objects have the shape:

{
    "id": 3359,
    "url": "http://www.url.com/some-image.jpg",
    "width": 250,
    "height": 250
}

Workload

A workload is an object containing a list of Images to be annotated.

Workload objects have the shape:

{
    "id": 121,
    "images": [{
        "id": 3359,
        "url": "http://url.com/some-image.jpg",
        "width": 250,
        "height": 250
    }, {
        "id": 3360,
        "url": "http://url.com/other-image.jpg",
        "width": 250,
        "height": 250
    }]
}

AnnotatedWorkload

An annotated workload is a collection of ImageAnnotations submitted for a specific Workload.

AnnotatedWorkload objects have the shape:

{
    "workloadId": 121,
    "images": [{
        "id": 3359,
        "annotations": [{
            "name": "Attribute Name",
            "value": "selected-value"
        }, {
            "name": "Another Attribute Name",
            "value": "selected-value"
        }]
    }, {
        "id": 3360,
        "annotations": [{
            "name": "Attribute Name",
            "value": "selected-value"
        }, {
            "name": "Another Attribute Name",
            "value": "selected-value"
        }]
    }]
}
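
As an illustration of how the Workload and AnnotatedWorkload shapes line up, the sketch below assembles an AnnotatedWorkload from a Workload and a lookup of annotations per image. The function name and the `annotationsByImageId` lookup are hypothetical and are not part of the application code.

```javascript
// Sketch: combine a Workload with per-image annotations into an AnnotatedWorkload.
// `annotationsByImageId` is a hypothetical lookup keyed by image ID.
function toAnnotatedWorkload(workload, annotationsByImageId) {
  return {
    workloadId: workload.id,
    images: workload.images.map((image) => ({
      id: image.id,
      // Images with no submitted annotations get an empty list.
      annotations: annotationsByImageId[image.id] || [],
    })),
  };
}

const sampleWorkload = {
  id: 121,
  images: [
    { id: 3359, url: 'http://url.com/some-image.jpg', width: 250, height: 250 },
    { id: 3360, url: 'http://url.com/other-image.jpg', width: 250, height: 250 },
  ],
};

console.log(JSON.stringify(toAnnotatedWorkload(sampleWorkload, {
  3359: [{ name: 'Attribute Name', value: 'selected-value' }],
}), null, 2));
```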

ajl.ai's People

Contributors

Stargazers

Watchers

Forkers

boazsender alexxnica mzgoddard enterstudio adebigare kryndex gnarf andytsing jackjiyh

ajl.ai's Issues

Record how long it takes an annotator to annotate an image

change age categories to integers 10-100

Add logic for establishing ground truth and starting new workloads

When a session posts its first set of 12 annotations:

check if they got the 8 ground truths correct
if they didn't get 8 right then have them start over
if they did, then check if the 4 unknown images have 4 agreeing submissions and make them into new ground truths
then respond with a new 3 image workload
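
The steps above can be sketched roughly as follows, reduced to one value per image for brevity. The data structures (`groundTruth`, `priorValues`), the function name, and the choice to count the current submission toward the four agreeing submissions are all assumptions. Issuing the follow-up 3 image workload is left out.

```javascript
// Sketch of the enrollment check, reduced to one value per image.
// `groundTruth` maps image ID -> known value; `priorValues` maps image ID -> values
// already submitted by other sessions. Both structures are assumptions.
const REQUIRED_AGREEMENT = 4;

function evaluateFirstSubmission(submissions, groundTruth, priorValues) {
  const known = submissions.filter((s) => s.imageId in groundTruth);
  const unknown = submissions.filter((s) => !(s.imageId in groundTruth));

  // Every known image must match its ground truth, otherwise the session starts over.
  const passed = known.every((s) => groundTruth[s.imageId] === s.value);
  if (!passed) {
    return { passed: false, newTruths: {} };
  }

  // Promote an unknown image once enough submissions (including this one) agree.
  const newTruths = {};
  for (const s of unknown) {
    const values = (priorValues[s.imageId] || []).concat(s.value);
    const agreeing = values.filter((v) => v === s.value).length;
    if (agreeing >= REQUIRED_AGREEMENT) {
      newTruths[s.imageId] = s.value;
    }
  }
  return { passed: true, newTruths };
}
```

A caller would respond with a new workload when `passed` is true and restart the session otherwise.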

Introduce the concept of seed truth for enrolling users

Create a table for seed_truth and consensus_truth.

Consider capturing workload order for each session

per @cowboy

finish scaffolding backend

Remove skintone from db scaffold and replace with ethnicity
Add endpoint for bootstrapping the client annotation types:

// to load the annotation types into the client (e.g. /api/v1/annotation_types)
  annotations:{
    demographics: [{
        name: 'Perceived Age',
        options: ['infant', 'child', 'young adult', 'adult', 'elderly']
      },
      {
        name: 'Perceived Gender',
        options: ['other', 'female', 'male'],
      },
      {
        name: 'Perceived Ethnicity',
        options: ['black', 'white', 'latino/a', 'asian', 'other']
      }
    ],
    regions: [{}],
    landmarks: [{}]
  },

Add endpoint for bootstrapping the client work loads:

// to load workloads for the client (eg /api/v1/work)
// this would load 12 work items to start, followed by 3 each
// question about whether the server should default to knowing how many to send based on session, 
// or if this should be specified with a query param 
  workload : [{},{},{},{},{},{},{},{},{},{}]

Add endpoint for accepting annotations from the client:

// to take an array of annotations to be processed (eg /api/v1/annotations)
var annotations = [
  {
    image_id: '',
    demographics:[{
      name: '',
      option: ''
    },{
      name: '',
      option: ''
    }],
    regions: [],
    landmarks: [{}, {}, {}],
  },
  {...}
]

Consider adding a secondary demographics annotating interface

For example:

a grid of 12 images where you select all the people who have the same gender as you.

CC @joyab Can you say more about this?

Establish Routing

We should implement a front-end router for the React application, and determine what the server infrastructure will be: should this be purely static w/ front-end routing, or have a server, etc

Improve error handling UX

HTTP errors

Add form logic for perceived demographics

create perceived demographics annotation interface

Perceived ethnicity: https://invis.io/HKACANH6Z#/218146722_Perceived_Ethnicity_-_Mobile
Perceived age: https://invis.io/HKACANH6Z#/218146720_Perceived_Age_-_Mobile
Perceived gender: https://invis.io/HKACANH6Z#/218146724_Perceived_Gender_-_Mobile

Set up CI with travis

setup deployment workflow dev/prod

Consensus criteria

Add concept of consensus criteria to attribute/annotation types.

Consume this in the postAnnotation controller (which is currently hard-coded to require a 100% match with 3 other annotations).

add logic for creating a new truth

In addition to the commits on PR #78, create logic to set a new truth in the known based on some arbitrary criteria of an "annotatable attribute". Perhaps the first pass is '4 different annotators agree'.

tech debt: Change idioms in var names and funcs from "face" to "img"
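
A first pass at the "4 different annotators agree" criterion might look like the sketch below. The submission shape (`annotatorId`, `value`) and the function name are hypothetical, and the threshold is a parameter so that a per-attribute criterion could be passed in later.

```javascript
// Sketch: derive a consensus value for one attribute of one image.
// Each submission is { annotatorId, value }; field names are hypothetical.
function findConsensus(submissions, minAnnotators = 4) {
  const annotatorsByValue = new Map();
  for (const { annotatorId, value } of submissions) {
    if (!annotatorsByValue.has(value)) {
      annotatorsByValue.set(value, new Set());
    }
    // A Set ensures repeat submissions from one annotator count once.
    annotatorsByValue.get(value).add(annotatorId);
  }
  for (const [value, annotators] of annotatorsByValue) {
    if (annotators.size >= minAnnotators) {
      return value;
    }
  }
  return null; // no value has enough distinct annotators yet
}
```

Returning `null` leaves the image in the unknown pool until more annotators have weighed in.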

This is a tool for image annotation, not face annotation.

Move eye ridge landmarks into left eye and right eye regions

prototype SVG landmark interaction

Add unit test framework

The boilerplate I am using for #4 does not have a built-in unit testing framework; this issue tracks pulling something like Jest into the project

Pick a front end stack

Either following the https://github.com/AJL-U/openfaces stack (angular 2) or a different one.

Form Validation

Splitting out from #53: We should require users to have filled out an option for each step of the perceived demographics form before proceeding so they can't get the form into an invisibly invalid state.

This may be a good opportunity to add one final "confirmation" step before submitting & moving on to the next image, and/or to make the checkboxes auto-advance when selected.

This may mean moving form state into redux, per @gnarf's earlier feedback.
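
A minimal sketch of the "every step must have a selection" rule, independent of where the form state ends up living. The `steps` and `selections` structures and both function names are assumptions for illustration.

```javascript
// Sketch: list the demographic steps that still lack a selection.
// `steps` are attribute names; `selections` maps attribute name -> chosen option.
function missingSelections(steps, selections) {
  return steps.filter((name) => {
    const value = selections[name];
    return value === undefined || value === null || value === '';
  });
}

// The form may only advance once no step is missing a selection.
function canSubmit(steps, selections) {
  return missingSelections(steps, selections).length === 0;
}

const demographicSteps = ['Perceived Age', 'Perceived Gender', 'Perceived Ethnicity'];
console.log(missingSelections(demographicSteps, { 'Perceived Age': 'adult' }));
// [ 'Perceived Gender', 'Perceived Ethnicity' ]
```

Keeping the check as a pure function means it could be called from component state or from a redux selector without changes.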

Return 400 code for invalid POST data

Pulled out from @gnarf's review of #65: we have a comment that says,

This post must contain only and all annotations for images sent to the current session. If there is a discrepancy between the current workload sent to the current session's annotation post, the post will return an error 500.

We should probably return a 400 "Bad Request" error instead
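
One way to frame the check, as a sketch: compare the image IDs in the post with the IDs of the workload sent to the session, and treat any mismatch as a client error. The function name and arguments are hypothetical, and wiring this into the actual controller is not shown.

```javascript
// Sketch: decide the HTTP status for an annotation POST by comparing the posted
// image IDs with the IDs in the workload sent to the session.
function statusForAnnotationPost(workloadImageIds, postedImageIds) {
  const expected = new Set(workloadImageIds);
  const posted = new Set(postedImageIds);
  // Sizes must match and the post must not repeat an image.
  const sameSize = expected.size === posted.size && posted.size === postedImageIds.length;
  const allExpected = postedImageIds.every((id) => expected.has(id));
  // 400 signals a client error (bad request); 200 means the post can be processed.
  return sameSize && allExpected ? 200 : 400;
}

console.log(statusForAnnotationPost([3359, 3360], [3360, 3359])); // 200
console.log(statusForAnnotationPost([3359, 3360], [3359])); // 400
```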

Sample data records should have .id instead of .name

I believe it to be an artifact of the LFW sample data set, but I'd like to propose we use a unique ID for each record instead of a name

setup vagrant

so people don't have to get postgres running.

Make face outline region 1

Resolve propType error in demographics component

Design UX checkpoint feedback system

Such as,

You've finished a workload & passed validation - take a moment to call that out, then here's the next workload
You've finished a workload and your answers do not match the "truths" - call that out and give them ability to re-start w/ the new workload
You just tried to annotate a workload that was not sent to you
You have been "enrolled"

Some of the errors (see #53) are more a "state" that the user is in, rather than a system error; we should expand whatever we do for error handling, to account for these workflow actions

Add a google form to optionally capture email address of annotators

Select & implement an AJAX library

We should convert the existing fake api into something that fires an actual AJAX call to load the data, so that there will be an easier migration path once we have a read-write API to call against.

create region UI

https://invis.io/HKACANH6Z#/218160455_Region_-_Mobile

refactor annotation_types into more normal form

annotation_type <-- top level names of all the annotation types
annotation_option <-- ethnicity/gender
annotation_range <-- age
annotation_box <-- regions
annotation_coord <--landmarks

consider adding a score column to the annotator table

per @gnarf eg: compute accuracy

create admin interface

administrate demographics

https://invis.io/HKACANH6Z#/217941303_Labeling_Demographics_-_Dawn_Pinch

administrate regions

https://invis.io/HKACANH6Z#/217941305_Labeling_Region_-_Dawn_Pinch

administrate landmarks

https://invis.io/HKACANH6Z#/217941304_Labeling_Landmarks_-_Dawn_Pinch

Define application state model

Define the data model for the application state and structure it using Redux

Depends on #4

create home page

https://invis.io/HKACANH6Z#/221452966_Welcome_-_Admin_Lady_Justice

Scaffold backend

Record demographics of annotator

add some way to represent multiple images of the same person

Specifically, start by implementing a public_identities table and a link table with all of the images in the system.

identities:

id
real name
public name
imdb_id

Joy's model from mongo

person_id (IMDB Unique Number)
person_realname
person_publicName
dataset_enrollments (list of datasets with celebrity)
Imgs_of_celebrities (list of img urls with celebrity)

Remodel state

opening this issue to discuss how state should be modeled.

I'm leaning toward remodeling the client side state, but perhaps both server and client should evolve toward each other now that we know more about this software.

Create baseline visualization on home page

visualize annotation stats from annotation campaign
visualize ethnic/gender/age of landmark-annotated faces

Primary goal: motivate citizens to participate.

Show users how their participation improves their participation

Setup dev server that is not postgres dependent

either by pointing it at a remote test db, or at a local mock data server.

finish setting up initial development env

Update Loading reducer to be aware of new API actions

We added new API actions, but did not update the loading indicator reducers to account for them. This should be cleaned up.
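
One possible shape for the cleanup, as a sketch: have the loading reducer count in-flight requests, so that adding an API action only means extending two lists. The action type names below are hypothetical and are not the project's actual constants.

```javascript
// Sketch of a loading reducer that tracks in-flight API requests with a counter.
// The action type names below are hypothetical.
const REQUEST_ACTIONS = [
  'WORKLOAD_REQUEST',
  'ANNOTATION_TYPES_REQUEST',
  'ANNOTATIONS_POST_REQUEST',
];
const DONE_ACTIONS = [
  'WORKLOAD_SUCCESS', 'WORKLOAD_FAILURE',
  'ANNOTATION_TYPES_SUCCESS', 'ANNOTATION_TYPES_FAILURE',
  'ANNOTATIONS_POST_SUCCESS', 'ANNOTATIONS_POST_FAILURE',
];

function loading(state = { pending: 0 }, action) {
  if (REQUEST_ACTIONS.includes(action.type)) {
    return { pending: state.pending + 1 };
  }
  if (DONE_ACTIONS.includes(action.type)) {
    // Never drop below zero if a completion arrives without a matching request.
    return { pending: Math.max(0, state.pending - 1) };
  }
  return state;
}

// Selector: the indicator shows while any request is still pending.
const isLoading = (state) => state.pending > 0;
```

A counter, rather than a boolean, keeps the indicator visible when two requests overlap and only one has finished.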

determine landmarking data model according to standard numbering system

Set up React Storybook for developing presentational components

Criteria:

Install React Storybook
Configure Storybook to work with our components
(WIP)Add "stories" for existing presentational components
Evaluate existing interface to determine what other UI should be moved into presentational components
Add "stories" for main UI flow