Afra: crowdsourcing gene feature annotation

Genomes of emerging model organisms are now being sequenced at low cost. However, obtaining accurate gene predictions remains challenging. Even the best gene prediction algorithms make substantial errors, leading to further erroneous analysis. Therefore, many predicted genes need to be visually inspected and manually curated (Yandell & Ence); this can be infeasible when working with thousands of genes from multiple organisms.

Inspired by crowdsourcing approaches and platforms including Foldit, Galaxy Zoo and Crowdflower, we are developing Afra to recruit additional gene feature curators. This should help dramatically increase the quality of gene curations available for newly sequenced genomes. In the long term we aim to recruit contributors among members of the general public. However, gene curation requires a large amount of specialist knowledge and involves a steep learning curve. While we are working to flatten that learning curve via interactive tutorials and support forums, genome curation is not yet easily accessible to all. Thus, in the first instance, we are recruiting curators among biology students. They perform curations as part of their courses, with the aim of understanding gene structure and/or the challenges of gene identification and gene prediction.

Current status

Users log in to their dashboard using their Facebook account. There they are presented with documentation, guided tutorial exercises, and curation challenges, each of which includes a "Curate" button. Each curation challenge invites the user to contribute towards a different curation project.

[Screenshot: user dashboard]

Clicking 'Curate' sends the user to a JBrowse-derived WebApollo-like curation interface focusing on a single gene model and showing all available tracks of evidence for this gene model. The user starts by dragging one of these models (typically the consensus gene model) to the edit track and can then edit this gene model.

[Screenshot: curation interface]

Users may refer to the tutorials or seek help on our forum using the 'Help & Support' link at the top. A simple step-by-step guide to curation is always available in a sidebar that folds to the right.

Behind the scenes

Afra imports a GFF file of predicted gene models and creates a prioritized list of "curation tasks" based on expected curation difficulty; the administrator can additionally prioritize specific genes for a specific curation project. Each gene prediction is presented to four independent users/curators. Each curator independently examines the gene model and may propose revisions or add comments (e.g., if there is insufficient evidence to curate).
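The import step can be pictured with a minimal sketch. It is not Afra's actual code: the README does not say how curation difficulty is estimated, so the exon count is used here purely as a stand-in heuristic, and the names `exon_counts`, `build_task_queue` and `boosted` are hypothetical.

```python
import heapq
from collections import defaultdict


def exon_counts(gff_text):
    """Count exon rows per parent gene model in GFF3 text."""
    counts = defaultdict(int)
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts


def build_task_queue(gff_text, boosted=()):
    """Return gene model IDs, easiest first; boosted IDs jump the queue.

    Assumption (not from the README): fewer exons means an easier task.
    `boosted` stands in for genes an administrator has prioritized.
    """
    heap = []
    for model_id, n_exons in exon_counts(gff_text).items():
        rank = (0 if model_id in boosted else 1, n_exons, model_id)
        heapq.heappush(heap, rank)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

In a real deployment each queued task would then be handed out until four independent curators have submitted a model for it.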

For each gene prediction, submitted gene models are then automatically compared: if all users propose the same changes to a gene model, these changes are considered to be correct. If gene models proposed by different curators disagree, the different gene predictions are shown to several more experienced curators, who submit their curations in turn. If gene models proposed by the more experienced curators disagree, all predictions are shown to an even more senior curator, who makes the final verdict.
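The comparison and escalation logic described above could be sketched as follows. This is an illustration under stated assumptions, not Afra's implementation: a gene model is reduced to its strand and exon coordinates, and `canonical`, `compare_submissions` and `resolve` are hypothetical names.

```python
def canonical(model):
    """Reduce a gene model to a hashable form: (strand, sorted exon tuples)."""
    return (model["strand"], tuple(sorted((s, e) for s, e in model["exons"])))


def compare_submissions(submissions):
    """Accept the model if every curator proposed the same one, else escalate."""
    distinct = {canonical(m) for m in submissions}
    if len(distinct) == 1:
        return "accepted", distinct.pop()
    return "escalate", sorted(distinct)


def resolve(tiers):
    """Walk through curator tiers, junior first, until one tier agrees.

    `tiers` is a list of submission lists. A final tier holding a single
    senior curator always agrees with itself, which models the final verdict.
    Returns None if no tier reaches agreement.
    """
    for submissions in tiers:
        status, result = compare_submissions(submissions)
        if status == "accepted":
            return result
    return None
```

Comparing normalised coordinates means two curators who drew the same exons in a different order still count as agreeing.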

Roadmap

  • Annotation editing.
  • Prioritized redundant task distribution
  • Basic user dashboard.
  • Simple, non-interactive tutorials.
  • Obtain curations from eight QMUL MSc students.
  • Obtain contributions from 20 undergraduate students.
  • December 2014: Simple editor synchronization between two tabs/windows.
  • December 2014: Improve annotation editing experience. Make it more intuitive.
  • December 2014: Basic automated testing of annotation editing functionality.

Todos:

  • Improve page load times.
  • Partially done genome dashboard: Overview of contributions per genome. How many curations. How many pass auto-check.
  • Comments on curations.
  • Extensive automated testing of annotation editing functionality.
  • Improve annotation editing performance.
  • Interactive tutorial.
  • Roll out to 200 first year students learning about gene structure ... and the inadequacies of Bioinformatics algorithms.

Contributions are welcome

We welcome contributions of code, curations, or documentation. Find us on Gitter to discuss how you could best help.

Our Wiki details setting up a development environment using Docker.

Contact

Please email if you:

  • would like a demo
  • would like to use Afra in your institution to help teach students
  • have any other questions

Afra is Copyright (©) 2013 Queen Mary, University of London.
Parts of Afra are a derivative work of JBrowse and WebApollo which are respectively copyright (c) 2000-2006 The Perl Foundation and copyright (c) 2010 Regents of the University of California.

Contributors

aniarya82, bmpvieira, hargup, raivivek, sa1, yannickwurm, yeban

afra's Issues

scrollwheel-zooming

Normal scroll-to-zoom (with the scroll wheel, or with gestures on an iPad or trackpad) should work: users expect it, it's extremely handy, and we'll need it for tablets.

Zoom level with overlapping genes.

If two maker predictions overlap (even if on different strands), the curation task is much more difficult. We shouldn't show these to junior users.

When we do show them, the browser should probably be zoomed out to show both models (e.g. this didn't happen for the task at Si_gnF.scaffold02694:11748..16889 )
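Both points (flagging overlapping predictions and choosing a zoom level that shows every model) come down to simple interval arithmetic. The sketch below is only an illustration; the dictionary layout, the function names and the 10% padding are assumptions, not Afra's data model.

```python
def overlaps(a, b):
    """True if two predictions share any base, regardless of strand."""
    return (a["scaffold"] == b["scaffold"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])


def view_region(models, padding=0.1):
    """Smallest window covering all models, widened by `padding` on each side."""
    start = min(m["start"] for m in models)
    end = max(m["end"] for m in models)
    pad = int((end - start) * padding)
    return max(1, start - pad), end + pad
```

A task whose models satisfy `overlaps` could be withheld from junior users, and `view_region` would give the coordinates to open the browser at when such a task is shown.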

Installation fails

I tried following the installation instructions on my Mac running OS X 10.9.1. Some of the prerequisites were easy to install (thanks to Homebrew). I did the following:

brew install ruby-install
ruby-install ruby  
# actually installs 2.1, not 2.0.0 as the documentation says. 

brew install chruby
# Add the following to the ~/.bashrc or ~/.zshrc file:
# source /usr/local/share/chruby/chruby.sh

brew install https://raw.github.com/postmodern/chgems/master/homebrew/chgems.rb
brew install n

open /Applications/Postgres.app

That all went very smoothly. However, rake fails. The full log is here. At least one reason is that it cannot install pg gem. Instead, you need: gem install pg -- --with-pg-config=/Applications/Postgres.app/Contents/MacOS/bin/pg_config

If I try to push/hack through these issues, I later get stuck with this type of error:

/usr/local/Cellar/ruby/2.1.0/lib/ruby/gems/2.1.0/gems/sequel-4.7.0/lib/sequel/adapters/postgres.rb:161:in `async_exec': PG::UndefinedTable: ERROR: relation "features" does not exist (Sequel::DatabaseError)
LINE 1: SELECT * FROM "features" LIMIT 1

BTW, in the wiki, the line ruby -r ./app.rb -e 'App.gff2jbrowse is missing a closing '

jBrowse scrolls to bottom

When you click "Curate" from the dashboard, the loaded jBrowse scrolls down to the bottom. This is confusing to users. Like Webapollo & normal jBrowse, it should load scrolled to the top (i.e. where you cannot go any higher)

highlight reading frame

This is applicable only when zoomed in to base pair level. Perhaps highlight the active reading frame with a green border.

Display only relevant data

After clicking "Contribute more", previous curation contribution is still visible (if new location is on same scaffold). Data needs to be flushed/loaded from scratch

search track empty after zooming in

The search option is not user-friendly. In this particular gene (>maker-Si_gnH.scaffold00024-snap-gene-4.24-mRNA-1Si_gnH.scaffold00024:438016..455209(+) genomic 17193bp), I searched [options aa and forward] for the short sequence VELAG. The new track appeared at the bottom of the screen but was left empty (no usual yellow rectangle with an arrow). When zooming out to 435,000 to 460,000 it can be seen (see screenshot 1). But when zooming in on it (click-and-drag), it disappears (see screenshot 2).
[Screenshot 1: velag_search_visible_zoomed_out]
[Screenshot 2: velag_search_not_visible_zoomed_in]

Error message when searching reference sequence

After creating a track “Search reference sequence.” of 5 aa (GWYLVW), I scrolled up to check other tracks. Then when I scrolled back down, an error message popped up (see screenshot). I then realised I did not tick “aa” in the track creation, so I created a second track ticking “aa” and exactly the same error message appeared. The only way to remove these error messages was to go back to the dashboard and click “Curate”.
[Screenshot: search track error, cropped]

Add Alert: Only works in Chrome

Times aren't displayed properly in Safari. Easiest solution is to restrict development & beta usage to Chrome (where things do display correctly).

context menu missing

The menu that appears when you right click on a feature to extract sequence etc in webapollo is missing

splice sites

when editing, non-canonical splice sites fail to light up
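For reference, the check that should drive this highlighting can be sketched in a few lines. The sketch assumes forward-strand models, 1-based inclusive exon coordinates, and GT...AG as the only canonical donor/acceptor pair; `noncanonical_introns` is a hypothetical name, not a function in Afra or WebApollo.

```python
def noncanonical_introns(seq, exons):
    """Return introns whose boundaries are not GT...AG.

    `seq` is the forward-strand genomic sequence and `exons` a list of
    (start, end) pairs in 1-based inclusive coordinates. Each result is
    (intron_start, intron_end, donor, acceptor).
    """
    exons = sorted(exons)
    flagged = []
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        # 1-based positions left_end+1 .. right_start-1 as a 0-based slice.
        intron = seq[left_end:right_start - 1]
        donor, acceptor = intron[:2].upper(), intron[-2:].upper()
        if (donor, acceptor) != ("GT", "AG"):
            flagged.append((left_end + 1, right_start - 1, donor, acceptor))
    return flagged
```

Reverse-strand models would need the intron reverse-complemented before the same comparison.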

Rightclick on exon 2 while resizing exon 1

Rightclick on exon 2 while resizing exon 1 - the resize thing stays blocked:

http://i.imgur.com/4jh7zXP.png

Even if you change pages, the resize-thing stays blocked onscreen

http://i.imgur.com/7JxEOtV.png

The solution is likely to kill any currently active editing action if user right-clicks...

JS: Dragging on first click

After the first click on a gene model you cannot drag it into the edit track. You can only do this after the second click. That is counterintuitive.

nucleotides within gene predictions

When I zoom in, nucleotides often aren't shown correctly in the exons (within the edit track): the spacing is incorrect, so they are left-aligned within each "block" (the equivalent nucleotides are correctly positioned in the DNA track).

gene model becomes non-editable after a glitch while zooming out

I was editing one model and zoomed out, only to see that the gene model had disappeared from the edit track. I selected it again and dragged it onto the edit track. It became shaded (see screenshot) and could no longer be edited. I had to go back to the dashboard and click "Curate" to fully recover the gene model.

[Screenshot: shaded_sequence_when_dragged_on_edit_track]

Stop rebasing master!

PLEASE!!! There's no reason for this... and every time you do it, a kitten dies somewhere... srsly!!! THINK OF THE KITTENS!!!1

Issues installing using Docker

I'm using boot2docker on Mac OS 10.10b3, and having trouble loading the app:

docker build -t yeban/afra .
docker run -t -i -p 9292:9292 yeban/afra

This all works fine: I can see that the run.sh script executes, the database is created, and app.rb is running and responding to my requests. But whenever I load http://localhost:9292 I get a blank page, not the content from www/index.html.

vertical screenspace underused

vertical screenspace is badly used. Only very few gene predictions are shown in comparison with a similar webapollo display

Contribute more should send farther

"Contribute more" shifts me 100bp further to the right. If genes are close together, this makes it feel as if it did nothing.
Instead, "contribute more" should send the user to a random gene with the same priority.
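A minimal sketch of the proposed behaviour, assuming tasks are plain records with an `id` and a `priority` (both field names are hypothetical, as is `next_task`):

```python
import random


def next_task(tasks, current_id, rng=random):
    """Pick a random task with the same priority as the current one.

    Returns None when no other task shares that priority. `rng` can be a
    seeded random.Random instance to make the choice reproducible.
    """
    current = next(t for t in tasks if t["id"] == current_id)
    candidates = [
        t for t in tasks
        if t["priority"] == current["priority"] and t["id"] != current_id
    ]
    return rng.choice(candidates) if candidates else None
```

Excluding the current task guarantees that clicking the button always moves the user somewhere new when an alternative exists.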

Allow alternate translation table

Can do by modifying or swapping CodonTable.js with another once the following are implemented:

  • Afra assumes only one possible start codon (ATG). 'Set Longest ORF', 'Set Translation Start', 'Mark non-canonical translation start site' will need to be flexible to multiple start codons.
  • Known splice donors and splice acceptors should be a constant in CodonTable.js, and functions need to take that into account.
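To illustrate the first point, here is a sketch of a longest-ORF search in which the start codons are a parameter rather than a hard-coded ATG. It is written in Python for brevity and is not a port of Afra's 'Set Longest ORF' code; it scans the forward strand only, and the stop codons shown are those of the standard genetic code.

```python
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})  # standard genetic code


def longest_orf(seq, start_codons=("ATG",), stop_codons=STOP_CODONS):
    """Return (start, end) of the longest ORF as a 0-based half-open span.

    An ORF runs from the first start codon in a frame up to and including
    the next in-frame stop codon. Returns None if no complete ORF exists.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i
            elif start is not None and codon in stop_codons:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best
```

Swapping in another translation table then amounts to passing different `start_codons` and `stop_codons`, for example `longest_orf(seq, start_codons=("ATG", "TTG"))`.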

performance degradation

After spending a few minutes clicking on things the cursor messes up... e.g. stays stuck in "move around mode" or in "zoom in mode"

edit track should be above others.

when dragging the screen the yellow edit track stays "stuck" where it was.
Ok but currently the other tracks go on top of this and above this. Visually very confusing - other tracks should disappear underneath the yellow.

WA?

add WA logo if some WA is used. And/or state in readme that some code is from WA.

Checklist placeholder

Currently need: bullet-points or 'function-less' checkboxes.
Anurag: if you propose something based on webapollo tutorial/documentation in a google doc, I can revise

Broken editing

CSS is messed up when editing (resize boxes, arrowheads)
