ajl.ai's Introduction

Image Annotator

This tool is a work-in-progress collaboration between MIT Civic Media and Bocoup to build an open source tool for crowdsourcing image annotation. Our first campaign with the tool focuses on annotating demographics in VGG Faces.

You can check out that campaign at ajl.ai.

This repo currently contains both a base open source annotator and the VGG Faces annotation campaign.

In the future, we'll be working to publish reusable image annotation tools, along with our data and findings. In the meantime, this is a good starting point if you're looking to run a large scale crowd image annotation campaign.

Repository Structure

  • ansible - deployment scripts for running the application on Ubuntu 16 server
  • backend - back-end API written in node/express/postgres
  • frontend - front-end client written in react/redux/webpack
  • jest - unit and integration tests
  • prototypes/ - exploratory interaction prototypes
    • fabric-test/ - prototypes using fabric.js for rendering
    • paper-test/ - prototype using Paper.js for rendering
    • snap-svg-test/ - prototype using SVG to render landmarks
  • sample-data - initial sample data from researchers
  • static - initial static data for testing
  • api.md - REST API reference documentation

REST API

This project exposes a REST API with the following endpoints:

  • GET /api/annotations/types
  • GET /api/annotations/workload
  • POST /api/annotations

More information available in api.md.
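
As a rough sketch of how a client might call these endpoints, the helper below wraps the three routes listed above. The `createApiClient` name, the injectable `fetchImpl` parameter, and the assumption that every endpoint exchanges JSON are illustrative only; api.md remains the reference for the actual request and response formats.

```javascript
// Hypothetical client helper. Only the endpoint paths come from the list
// above; the function names and the JSON request/response handling are
// assumptions made for illustration.
function createApiClient(baseUrl, fetchImpl = fetch) {
  async function request(method, path, body) {
    const options = { method, headers: { Accept: 'application/json' } };
    if (body !== undefined) {
      options.headers['Content-Type'] = 'application/json';
      options.body = JSON.stringify(body);
    }
    const response = await fetchImpl(baseUrl + path, options);
    if (!response.ok) {
      throw new Error(`${method} ${path} failed with status ${response.status}`);
    }
    return response.json();
  }

  return {
    getAnnotationTypes: () => request('GET', '/api/annotations/types'),
    getWorkload: () => request('GET', '/api/annotations/workload'),
    postAnnotations: (annotatedWorkload) =>
      request('POST', '/api/annotations', annotatedWorkload),
  };
}
```

Passing `fetchImpl` explicitly makes the helper easy to exercise in unit tests without a running backend.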

Deployment & Reporting

Image Annotator is configured to be deployable on any Ubuntu 16 server. All you need is Ansible 2.2x and root access to your desired target machine.

The current VGG Faces campaign is deployed to Bocoup's infrastructure, which is orchestrated from bocoup/infrastructure-foundation.

In order to deploy to, or download reports from, this server, you will need to submit a PR to this (mitmedialab/ajl.ai) repo adding configuration for yourself to ansible/vars/users.yml. Once this has been done, ask one of the project contributors to provision your account and give you the "Vault password" needed to run the following commands:

npm run edit-secrets

This will open the secrets file in your default editor. Secret data is managed by Ansible Vault. All secrets are stored in ansible/vars/secrets.yml.

npm run provision:[production|staging|vagrant]

This will prepare a target machine with all system dependencies needed to run Image Annotator and grant collaborators access to run deployments. You will be prompted for both a Vault password and a SUDO password during this task. You will have configured your sudo password at the beginning of the project when you added yourself to ansible/vars/users.yml.

npm run deploy:[production|staging|vagrant] -- -e commit=master

This will clone your desired commit to the target machine, install all node dependencies, compile the site with webpack, apply any outstanding migrations to the database and restart the API server.

npm run database-restore:[production|staging|vagrant]

The database residing on the production server is backed up to S3 hourly. Running this for any target server will completely replace the database with the most recent backup. Take care not to do this for production unless you know what you are doing!

Reporting

Running the following commands from the root directory of this repository will dump a CSV from our production database to your local machine.

npm run download-annotations

This will export all annotation data to annotations.csv. You'll be prompted for a "Vault password" when running this.

npm run download-feedback

This will export all feedback form data to feedback.csv. You'll be prompted for a "Vault password" when running this. TODO: implement a feedback form in the UI. At the time of this writing we are using Google Forms for feedback.

Installation

Frontend

The front end runs on React, Redux, and Webpack.

Run npm install to install all front-end dependencies; then run npm start to launch the webpack dev server. The application will then be available at localhost:8080.

The front end currently needs the local node/postgres backend running (installation instructions below).

Other commands

These commands are available after installation within the ajl.ai/ directory:

  • npm test: run unit tests with Jest, then lint on exit
  • npm run lint: run ESLint to identify syntax & style issues in the code
  • npm run build: generate a static build into ajl.ai/dist
  • npm run storybook: launch React Storybook for component development (accessible at localhost:6006 by default)

Backend

The backend runs on node and PostgreSQL, with an Express-based set of middlewares for the REST API we expose. In order to develop the application, you must have an installation of PostgreSQL.

Setting up Postgres

There are numerous ways to run PostgreSQL. Choose the one that is most familiar to you!

If you have Docker on your machine, do the following:

docker run --name ia-pg -d -p 5432:5432 -i postgres
docker exec -it ia-pg su postgres -c 'createdb image-annotator'
npm run migrate:up
npm start

# stop postgres
docker stop ia-pg

# start postgres
docker start ia-pg

If you have VirtualBox and Vagrant on your machine, do the following:

printf "PGHOST=10.10.0.100\nPGUSER=image-annotator" > .env
vagrant up
npm run provision:vagrant
npm run migrate:up
npm start

# stop postgres
vagrant halt

# start postgres
vagrant up

If you want to install PostgreSQL locally on Ubuntu, do the following:

sudo apt-get install postgresql
sudo service postgresql stop
sudo sh -c "printf 'local all all trust\nhost all all 127.0.0.1/32 trust\nhost all all ::1/128 trust' > /etc/postgresql/9.5/main/pg_hba.conf"
sudo service postgresql start
createdb -U postgres image-annotator
npm run migrate:up
npm start

# stop postgres
sudo service postgresql stop

# start postgres
sudo service postgresql start

If you want to install PostgreSQL locally on OSX, do the following:

Ensure you have brew installed. Then, run this:

printf "PGUSER=$USER" > .env
brew install postgresql
brew services start postgresql
createdb image-annotator
npm run migrate:up
npm start

# stop postgres
brew services stop postgresql

# start postgres
brew services start postgresql

Running the prototypes

cd into the prototypes directory you want to look at and run npm start (Requires python, equivalent to running python -m SimpleHTTPServer 8000) or a comparable static web server.

Navigate to the following prototypes in your browser:

Glossary of Terms

This section clarifies the verbiage and terms used within the application code.

Attribute

An attribute is the object representing a specific type of annotation a user will apply to any given image, such as demographic attributes like "perceived ethnicity" or positional values like "the location of the right eye."

Attribute properties:

  • Name: the human-oriented label of an attribute, such as "Perceived Gender."
  • Type: the type of data represented by the attribute, e.g. a list of multiple-choice options, or a set of coordinates, etc. Note: This is a database- / data model-oriented value, and has no inherent correspondence to front-end presentation.
  • Options: the list of accepted values for a multiple-choice image attribute.

Attribute objects have the shape

{
    "name": "Attribute Name",
    "type": "type-of-data",
    "options": ["list", "of", "accepted", "values", "for", "multiple", "choice", "attributes"]
}

Annotation

An annotation is the object representing the submitted value for a particular Attribute.

Annotation properties:

  • Name: the name of the image attribute annotated, such as "Perceived Ethnicity".
  • Value: the value with which the image is annotated, such as "Asian" or "Black."

Annotation objects have the shape

{
    "name": "Attribute Name",
    "value": "selected-value"
}
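
To illustrate how Attribute and Annotation objects relate, here is a minimal sketch that checks an Annotation against a list of Attributes. The `validateAnnotation` helper and its return shape are hypothetical and not taken from the application code; it only assumes the `name`, `options`, and `value` fields described above.

```javascript
// Illustrative helper, not part of the project's code: checks that an
// Annotation names a known Attribute and, for attributes that list
// options, that the submitted value is one of them.
function validateAnnotation(annotation, attributes) {
  const attribute = attributes.find((attr) => attr.name === annotation.name);
  if (!attribute) {
    return { valid: false, reason: `Unknown attribute "${annotation.name}"` };
  }
  const hasOptions =
    Array.isArray(attribute.options) && attribute.options.length > 0;
  if (hasOptions && !attribute.options.includes(annotation.value)) {
    return {
      valid: false,
      reason: `"${annotation.value}" is not an accepted value for "${attribute.name}"`,
    };
  }
  // Attributes without options (e.g. positional data) are not checked here.
  return { valid: true, reason: null };
}
```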

ImageAnnotation

An image annotation is an object associating one or more Annotations with the specific Image to which they were applied. It contains an array of annotations, and the ID of their associated image.

ImageAnnotation objects have the shape:

{
    "id": 3359,
    "annotations": [{
        "name": "Attribute Name",
        "value": "selected-value"
    }, {
        "name": "Another Attribute Name",
        "value": "selected-value"
    }]
}

Image

An image is a representation of an image to which Annotations will be applied.

Image objects have the shape:

{
    "id": 3359,
    "url": "http://www.url.com/some-image.jpg",
    "width": 250,
    "height": 250
}

Workload

A workload is an object containing a list of Images to be annotated.

Workload objects have the shape:

{
    "id": 121,
    "images": [{
        "id": 3359,
        "url": "http://url.com/some-image.jpg",
        "width": 250,
        "height": 250
    }, {
        "id": 3360,
        "url": "http://url.com/other-image.jpg",
        "width": 250,
        "height": 250
    }]
}

AnnotatedWorkload

An annotated workload is a collection of ImageAnnotations submitted for a specific Workload.

AnnotatedWorkload objects have the shape:

{
    "workloadId": 121,
    "images": [{
        "id": 3359,
        "annotations": [{
            "name": "Attribute Name",
            "value": "selected-value"
        }, {
            "name": "Another Attribute Name",
            "value": "selected-value"
        }]
    }, {
        "id": 3360,
        "annotations": [{
            "name": "Attribute Name",
            "value": "selected-value"
        }, {
            "name": "Another Attribute Name",
            "value": "selected-value"
        }]
    }]
}
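
The relationship between a Workload and an AnnotatedWorkload can be sketched as a small transformation. The `toAnnotatedWorkload` function and the `annotationsByImageId` lookup are hypothetical names introduced for this example; only the object shapes come from the glossary above.

```javascript
// Illustrative only: builds an AnnotatedWorkload from a Workload and a
// lookup of Annotation arrays keyed by image ID. Images with no entry in
// the lookup get an empty annotations array.
function toAnnotatedWorkload(workload, annotationsByImageId) {
  return {
    workloadId: workload.id,
    images: workload.images.map((image) => ({
      id: image.id,
      annotations: annotationsByImageId[image.id] || [],
    })),
  };
}
```

Note that the result keeps only each image's `id`; the `url`, `width`, and `height` fields of the Image objects are not part of the AnnotatedWorkload shape.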

ajl.ai's People

Contributors

boazsender, cowboy, gnarf, isaacdurazo, joyab, kadamwhite, mzgoddard, pbeshai

ajl.ai's Issues

Add logic for establishing ground truth and starting new workloads

When a session posts its first set of 12 annotations

  • check if they got the 8 ground truths correct
  • if they didn't get 8 right then have them start over
  • if they did, then check if the 4 unknown images have 4 agreeing submissions and make them into new ground truths
  • then respond with a new 3-image workload
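
The steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `reviewFirstSubmission` name, the `groundTruths` and `priorSubmissions` lookups keyed by image ID, the return shape, and the choice to count the current submission as one of the 4 agreeing submissions.

```javascript
// Order-independent fingerprint of a list of { name, value } annotations.
function annotationKey(annotations) {
  return JSON.stringify(
    annotations
      .map((a) => [a.name, a.value])
      .sort((x, y) => x[0].localeCompare(y[0]))
  );
}

// Hypothetical sketch of the flow described in this issue.
// submission: array of ImageAnnotation objects ({ id, annotations }).
// groundTruths: { [imageId]: annotations } for images with known answers.
// priorSubmissions: { [imageId]: [annotations, ...] } from earlier sessions.
function reviewFirstSubmission(submission, groundTruths, priorSubmissions, requiredAgreement = 4) {
  const known = submission.filter((image) => image.id in groundTruths);
  const unknown = submission.filter((image) => !(image.id in groundTruths));

  // Every ground-truth image must match its known answer exactly.
  const allCorrect = known.every(
    (image) => annotationKey(image.annotations) === annotationKey(groundTruths[image.id])
  );
  if (!allCorrect) {
    return { action: 'restart', newGroundTruths: [] };
  }

  // Promote unknown images once enough submissions agree (assumption:
  // the current submission counts toward the total).
  const newGroundTruths = unknown
    .filter((image) => {
      const key = annotationKey(image.annotations);
      const earlier = priorSubmissions[image.id] || [];
      const agreeing = 1 + earlier.filter((a) => annotationKey(a) === key).length;
      return agreeing >= requiredAgreement;
    })
    .map((image) => image.id);

  return { action: 'next-workload', nextWorkloadSize: 3, newGroundTruths };
}
```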

finish scaffolding backend

  • Remove skintone from db scaffold and replace with ethnicity
  • Add endpoint for bootstrapping the client annotation types:
// to load the annotation types into the client (eg /api/v1/annotation_types)
  annotations:{
    demographics: [{
        name: 'Perceived Age',
        options: ['infant', 'child', 'young adult', 'adult', 'elderly']
      },
      {
        name: 'Perceived Gender',
        options: ['other', 'female', 'male'],
      },
      {
        name: 'Perceived Ethnicity',
        options: ['black', 'white', 'latino/a', 'asian', 'other']
      }
    ],
    regions: [{}],
    landmarks: [{}]
  },
  • Add endpoint for bootstrapping the client work loads:
// to load workloads for the client (eg /api/v1/work)
// this would load 12 work items to start, followed by 3 each
// question about whether the server should default to knowing how many to send based on session, 
// or if this should be specified with a query param 
  workload : [{},{},{},{},{},{},{},{},{},{}]
  • Add endpoint for accepting annotations from the client:
// to take an array of annotations to be processed (eg /api/v1/annotations)
var annotations = [
  {
    image_id: '',
    demographics:[{
      name: '',
      option: ''
    },{
      name: '',
      option: ''
    }],
    regions: [],
    landmarks: [{}, {}, {}],
  },
  {...}
]

Establish Routing

We should implement a front-end router for the React application, and determine what the server infrastructure will be: should this be purely static with front-end routing, or have a server, etc.?

Consensus criteria

Add concept of consensus criteria to attribute/annotation types.

Consume this in the postAnnotation controller (which is currently hard-coded to require a 100% match with 3 other annotations).
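
One possible way to express configurable consensus criteria is sketched below. The `findConsensus` name and the `minAnnotations` and `minAgreement` fields are hypothetical; this is not how the postAnnotation controller is currently written.

```javascript
// Hypothetical sketch: returns the agreed-upon value for one attribute of
// one image, or null when the criteria are not met.
// criteria.minAnnotations: how many submitted values are required.
// criteria.minAgreement: fraction (0 to 1) that must share the top value.
function findConsensus(values, criteria) {
  const { minAnnotations, minAgreement } = criteria;
  if (values.length === 0 || values.length < minAnnotations) {
    return null;
  }

  // Count how often each value was submitted.
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }

  // Pick the most frequent value.
  let best = null;
  let bestCount = 0;
  for (const [value, count] of counts) {
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }

  return bestCount / values.length >= minAgreement ? best : null;
}
```

Under this sketch, a strict rule such as "4 annotators, all matching" would be written as `{ minAnnotations: 4, minAgreement: 1 }`, while a looser attribute could lower `minAgreement`.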

add logic for creating a new truth

In addition to the commits on PR #78, create logic to set a new truth in the known based on some arbitrary criteria of an "annotatable attribute". Perhaps the first pass is '4 different annotators agree'.

Add unit test framework

The boilerplate I am using for #4 does not have a built-in unit testing framework; this issue tracks pulling something like Jest into the project

Form Validation

Splitting out from #53: We should require users to have filled out an option for each step of the perceived demographics form before proceeding so they can't get the form into an invisibly invalid state.

This may be a good opportunity to add one final "confirmation" step before submitting & moving on to the next image, and/or to make the checkboxes auto-advance when selected.

This may mean moving form state into redux, per @gnarf's earlier feedback.
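
A minimal sketch of the check described in this issue, assuming the form's selections are kept as a plain object keyed by attribute name. The `findUnansweredAttributes` helper and the `selections` shape are hypothetical and independent of whether the state lives in redux or in component state.

```javascript
// Hypothetical helper: returns the names of attributes that have no
// selected value yet, so the UI can block the "next" action until the
// returned list is empty.
function findUnansweredAttributes(attributes, selections) {
  return attributes
    .map((attribute) => attribute.name)
    .filter((name) => {
      const value = selections[name];
      return value === undefined || value === null || value === '';
    });
}
```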

Return 400 code for invalid POST data

Pulled out from @gnarf's review of #65: we have a comment that says,

This post must contain only and all annotations for images sent to the current session. If there is a discrepancy between the current workload sent to the current session's annotation post, the post will return an error 500.

We should probably return a 400 "Bad Request" error instead.
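
As a sketch of the check behind that response, the function below compares a posted AnnotatedWorkload with the Workload that was sent to the session and reports a 400 on any mismatch. The `checkAnnotationPost` name and its return shape are assumptions for illustration; wiring the result into the actual Express controller is left out.

```javascript
// Hypothetical validation helper: the post must contain annotations for
// exactly the images in the workload sent to the current session.
function checkAnnotationPost(sentWorkload, postedWorkload) {
  const byNumber = (a, b) => a - b;
  const sentIds = sentWorkload.images.map((image) => image.id).sort(byNumber);
  const postedImages = Array.isArray(postedWorkload && postedWorkload.images)
    ? postedWorkload.images
    : [];
  const postedIds = postedImages.map((image) => image.id).sort(byNumber);

  const sameImages =
    sentIds.length === postedIds.length &&
    sentIds.every((id, index) => id === postedIds[index]);

  if (!postedWorkload || postedWorkload.workloadId !== sentWorkload.id || !sameImages) {
    // Client error: the request body does not match what was sent out.
    return {
      ok: false,
      status: 400,
      message: 'Posted annotations do not match the workload sent to this session.',
    };
  }
  return { ok: true };
}
```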

Design UX checkpoint feedback system

Such as,

  • You've finished a workload & passed validation - take a moment to call that out, then here's the next workload
  • You've finished a workload and your answers do not match the "truths" - call that out and give them ability to re-start w/ the new workload
  • You just tried to annotate a workload that was not sent to you
  • You have been "enrolled"

Some of the errors (see #53) are more a "state" that the user is in, rather than a system error; we should expand whatever we do for error handling, to account for these workflow actions

Select & implement an AJAX library

We should convert the existing fake api into something that fires an actual AJAX call to load the data, so that there will be an easier migration path once we have a read-write API to call against.

add some way to represent multiple images of the same person

Specifically, start by implementing a public_identities table and a link table with all of the images in the system.

identities:

  • id
  • real name
  • public name
  • imdb_id

Joy's model from mongo

  • person_id (IMDB Unique Number)
  • person_realname
  • person_publicName
  • dataset_enrollments (list of datasets with celebrity)
  • Imgs_of_celebrities (list of img urls with celebrity)

Remodel state

Opening this issue to discuss how state should be modeled.

I'm leaning toward remodeling the client side state, but perhaps both server and client should evolve toward each other now that we know more about this software.

Create baseline visualization on home page

  • visualize annotation stats from annotation campaign
  • visualize ethnic/gender/age of landmark-annotated faces

Primary goal: motivate citizens to participate.

Show users how their participation improves their participation

Fill out missing LFW landmarks

Several sample data faces are missing full landmark data; this issue tracks manually updating the sample data to add in that information.
