
anafora's Introduction


About Anafora

Anafora (pronounced "a-nuh-FOUR-uh", /ænəˈfɔɹə/) is a new annotation tool written at the University of Colorado by Wei-te Chen and Will Styler. Anafora is designed to be a lightweight, flexible annotation solution which is easy to deploy for large and small projects. Previous tools (Protege/Knowtator, eHost) have been written primarily with local annotation in mind, running as native, local applications and reading complex file or folder structures. This limits cross-platform deployment and requires the annotated data to be stored locally on machines, complicating data-use agreements and increasing data fragmentation. Alternatively, these programs can be run remotely via X-windows, but this increases latency and leaves annotation vulnerable to any connectivity interruptions.

Anafora was designed as a web-based tool to avoid these issues, allowing multiple annotators to access data remotely from a single instance running on a remote server. Because Anafora runs in WebKit-based browsers, annotators can work from nearly any modern OS, and no installation, local storage, or SSH logins are required. All data is regularly autosaved, and annotations (not source text) can be saved to local storage for restoration in the event of a connectivity interruption.

In addition, avoiding the complex formats, schemas, and filetypes associated with current solutions, Anafora was built to provide simple, organized representations of the data required for annotation. Annotation schemas and stored annotation data are both saved as human-readable XML, and these are stored alongside plaintext annotated files in a simple, database-free, static filesystem. This allows easy automated assignment of new sets, pre-made data organization, and ease of administration and oversight unmatched by other tools.

Most importantly, though, Anafora has been designed to offer an efficient and learnable means to annotate and adjudicate using even complex schemas, pipelines and workflows. Designed with complex schemas in mind (featuring spanned and spanless annotations, relations, annotation and pointer properties), Anafora has been built to handle any annotation type, and to be equally at home with multiple steps, passes, or annotation types.

Anafora provides annotation projects, simple or complex, with a single, easy-to-use, lightweight, open-source solution to all spanned and spanless annotation needs.

How Anafora's web-based annotation works

Anafora is a secure, web-based tool. Once properly installed on a remote server, it can be accessed from anywhere with a web browser, and no part of the document being annotated is saved to the local machine.

When you open Anafora and select your document, the relevant text is opened in a browser along with the schema and a pane for properties, for you to annotate.

As you annotate, your annotations (only the numerical spans, annotation IDs, and associated properties) are cached in memory and saved to the server automatically every 2 minutes. In the event that your connection is interrupted, reloading the page will reload your annotations from that cache and save them to the server, meaning that although a consistent internet connection is desirable, an occasional interruption shouldn't result in data loss.

When you finish a session, you'll want to use the "save" command and exit the browser. By doing that, you'll know that your work is safe.

Screenshots

Demo Site

Although you cannot currently work on your own documents or navigate to others, the link below is a live instance of Anafora that lets you try annotating, examine annotations, or see the tool in action. No data will be saved.

Anafora Demo Site

Requirements

Server Requirements

To install Anafora on your machine as a central repository for an annotation project, you'll need:

  • Linux- or Unix-based server
  • Apache
  • Django (more details pending)

User-level requirements

In order to run Anafora as an end user/annotator, you'll need:

  • A consistent internet connection (although 100% consistency isn't necessary)
  • A modern browser supporting HTML5, JavaScript and CSS (Google Chrome is our recommended choice and is the browser we're primarily testing with, but Safari, Chromium, Firefox and other modern browsers have also been shown to work well. Anafora is not compatible with any version of Internet Explorer)
  • An account (which is a member of the requisite groups) set up on the server hosting Anafora
  • For adjudication and access to other annotators' data, you must be a member of the "anaforaadmin" group on the server on which Anafora is installed.

Documentation

The Anafora User Manual

The Anafora User/Administrator Manual (PDF) contains instructions for Annotators, Administrators, and Installation.

Frequently Asked Questions (FAQ)

Please see our Anafora FAQ for a complete listing of questions.

Sample Annotations, Schema, and Project file

To help you get a feel for how Anafora data and schemas look, we've created a few sample documents in Anafora, available here under sample data.

Version History and Changelog

For information about iterations of Anafora, view our version history and changelog.

Reference

Wei-Te Chen, Will Styler. 2013. Anafora: A Web-based General Purpose Annotation Tool. In Proceedings of the 2013 NAACL HLT Demonstration Session, Atlanta, Georgia, USA, pp. 14-19.

@InProceedings{N13-3004,
  author = 	"Chen, Wei-Te
		and Styler, Will",
  title = 	"Anafora: A Web-based General Purpose Annotation Tool",
  booktitle = 	"Proceedings of the 2013 NAACL HLT Demonstration Session",
  year = 	"2013",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"14--19",
  location = 	"Atlanta, Georgia",
  url = 	"http://www.aclweb.org/anthology/N13-3004"
}

Get Anafora

Download Anafora

Anafora is Free and Open Source, released under the Apache License. Our source code is available in the src folder on GitHub. We welcome new commits and bugfixes through pull requests.


anafora's Issues

Comparison vs. Gold Standard view for annotators

Once a set has been completed by an annotator and adjudicated, the annotator should be able to click an icon and view their own annotations next to the gold standard, in a sort of adjudication mode, for learning purposes.

Spelling mistake in **views.py** file

Hi, there is a spelling mistake in line 665 on the master branch.

  from django.core.exceptions import ImpoperlyConfigured

It should be ImproperlyConfigured instead of ImpoperlyConfigured. I found this error when I tried to configure it on my own computer.

"About Anafora" screen

We should have an "About Anafora" menu with the version number, logo, author names, etc

Relations Adjudication

Relations Adjudication is missing a few final tweaks before it's ready for deployment.

Anafora creates invalid XML

Anafora creates invalid XML if you use characters like "<" in a text field. For example, if you have a property named "Years" which has a text field for entry, then you get an Anafora XML file containing invalid XML like:

<Years><1</Years>
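
One common way to avoid this class of bug is to escape text values before writing them into an element. The following is a minimal Python sketch using only the standard library. It is not Anafora's actual serialization code, and the `render_property` helper is a hypothetical name introduced for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def render_property(name: str, value: str) -> str:
    """Return an XML element for a text property, escaping &, < and > in the value.

    Hypothetical helper: assumes `name` is already a valid XML element name.
    """
    return f"<{name}>{escape(value)}</{name}>"


fragment = render_property("Years", "<1")
print(fragment)  # <Years>&lt;1</Years>

# The escaped fragment parses cleanly and round-trips to the original text.
assert ET.fromstring(fragment).text == "<1"
```

Building the document with an XML library rather than string concatenation would handle this escaping automatically.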

Supervisor Mode

Allows view-as-annotator, easy viewing of all annotations

It's possible for too many annotators to check out a file if they open it at about the same time

We currently have this situation with a schema set to "directSetGold", which only allows one annotator.

One annotator starts a new task. Then another annotator starts it at virtually the same time -- since the first annotator hasn't saved yet, Anafora doesn't prevent the second one from starting it. It may be helpful if on a save action (one that would lead to creating a file), Anafora double-checks the file system to see if saving would still be compliant with the allowed number of annotators, and warns the user if it's not.
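
The save-time double-check suggested above could look roughly like the sketch below. It assumes annotation files are named `<doc>.<schema>.<annotator>.<status>.xml`; that naming pattern, the function names, and the example document, schema, and annotator names are all assumptions made for illustration, not a description of Anafora's code.

```python
import os
import tempfile


def other_annotators(task_dir: str, doc: str, schema: str, me: str) -> set:
    """Collect annotators other than `me` who already have a file for this task.

    Assumes hypothetical file names of the form <doc>.<schema>.<annotator>.<status>.xml.
    """
    prefix = f"{doc}.{schema}."
    found = set()
    for name in os.listdir(task_dir):
        if name.startswith(prefix) and name.endswith(".xml"):
            annotator = name[len(prefix):].split(".")[0]
            if annotator != me:
                found.add(annotator)
    return found


def may_save(task_dir: str, doc: str, schema: str, me: str, max_annotators: int) -> bool:
    """Re-check the file system at save time instead of trusting the state at open time."""
    return len(other_annotators(task_dir, doc, schema, me)) < max_annotators


with tempfile.TemporaryDirectory() as d:
    print(may_save(d, "doc000149", "Temporal", "bob", 1))    # True
    open(os.path.join(d, "doc000149.Temporal.alice.inprogress.xml"), "w").close()
    print(may_save(d, "doc000149", "Temporal", "bob", 1))    # False
    print(may_save(d, "doc000149", "Temporal", "alice", 1))  # True
```

A check like this narrows the window but does not close it entirely, since two saves could still interleave between the check and the write.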

Newline characters selected in span are not displayed as being part of it

If you compare these two screenshots… One has just the date highlighted because I highlighted from the beginning to the end. The other has the date plus a little bit of whitespace afterwards because I started highlighting on the line below the date and just dragged up to select the text. Doing it that way also selected the newline character to be included in the span. Once you actually hit a hot key to mark the span, that whitespace character isn’t coloured/highlighted, so you can't see anymore that you've accidentally included some whitespace.

screen shot 2017-07-05 at 14 59 29

screen shot 2017-07-05 at 14 59 40

Self annotation has problem

If an entity tries to set itself as the value of one of its own linking properties, the assignment is rejected at first. However, after clicking the entity again, the entity is set as its own property anyway. This should be prevented.

Disagreement in an entity shows up as disagreements in entities that reference it

As you can see in ID 169 here (span "bends"), Anafora is claiming a disagreement in the "part_of" property, when they are actually referring to the same entity ("fault").

31240697-b92ed3ac-a9be-11e7-9063-8cadb805a31c

If we look at that entity, we can see that this is because the annotators disagreed on the properties of the "fault".

31240695-b80d6218-a9be-11e7-9b01-f4d5276e95e0

Anafora should mark the latter as a disagreement, but not the former. So perhaps comparing just span indices and entity types of the properties, not the properties of properties.
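
The shallow comparison proposed above can be sketched in Python as follows. The `Entity` class, the type names, the property names, and the offsets are hypothetical and only illustrate the idea; they are not taken from Anafora's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    type: str
    spans: tuple                                    # e.g. ((start, end),)
    properties: dict = field(default_factory=dict)  # values may be str or Entity


def reference_key(value):
    """Reduce a property value to what is compared: (type, spans) for entity references."""
    if isinstance(value, Entity):
        return (value.type, value.spans)
    return value


def properties_agree(a: Entity, b: Entity) -> bool:
    """Shallow comparison: referenced entities must match on type and span only."""
    if a.properties.keys() != b.properties.keys():
        return False
    return all(reference_key(a.properties[k]) == reference_key(b.properties[k])
               for k in a.properties)


# Two annotators mark the same "fault" span but disagree on one of its properties.
fault_a = Entity("Feature", ((10, 15),), {"certainty": "high"})
fault_b = Entity("Feature", ((10, 15),), {"certainty": "low"})
bends_a = Entity("Feature", ((30, 35),), {"part_of": fault_a})
bends_b = Entity("Feature", ((30, 35),), {"part_of": fault_b})

print(properties_agree(fault_a, fault_b))  # False
print(properties_agree(bends_a, bends_b))  # True
```

With this comparison the disagreement is reported once, on the "fault" entity itself, and not on every entity that points to it.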

Incomplete Annotations Cannot Be Accessed

I'm using Anafora to annotate a corpus of files using a custom schema. However, I've noticed that if I begin to annotate a file and save it, but then leave the site and come back to keep working on it later, I'm not able to access the schema. Even though the file is saved, as long as I walk away from the task and come back I am subsequently unable to mark up the text again.

Any work on the file cannot resume until I delete the existing annotation file on the server, which means losing all the work I had done on it so far. As you can imagine, this is incredibly frustrating. I was wondering if this is something inherent to Anafora or if I'm perhaps doing something wrong.

Any help you can provide would be greatly appreciated!

Annotations extending beyond the end of the document.

I'm currently writing a plugin for GATE (http://gate.ac.uk) to enable us to read in documents annotated with Anafora. You can find the latest version of the plugin at https://github.com/GateNLP/gateplugin-Format_Anafora

While the XML format Anafora uses is nice and easy to parse I've come across quite a few documents where an annotator has somehow managed to produce an annotation that ends after the end of the document. In many cases this isn't just one or two characters difference (something I could understand if there were issues of encoding etc.) but a difference of 50 characters or more. Currently I simply truncate these annotations to match the document length, and it seems that this results in seeing the same annotations in both Anafora and GATE.

As I haven't checked every annotation on every document, though, I'm wondering if this is an isolated issue with annotations that end at the end of a document, or if there might be a wider issue with annotation offsets being stored incorrectly.

I've attached a document and annotation file so you can see what I mean. In this instance the document is 817 characters long (I'm assuming it's UTF-8 but with no multi-byte characters, as it's also 817 bytes long), but the second annotation in the file produced by Anafora spans from offset 656 to 835; in other words, it goes 18 characters beyond the end of the document.

doc000149.zip
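
The truncation workaround described in this issue amounts to clamping each span to the document length. Below is a minimal Python sketch using the offsets quoted above. The `clamp_span` function is a hypothetical name, and this is not the GATE plugin's actual code.

```python
def clamp_span(start: int, end: int, doc_length: int):
    """Clamp a (start, end) character span to [0, doc_length].

    Returns the clamped span and a flag saying whether anything was changed,
    so that out-of-range annotations can be logged for later inspection.
    """
    clamped = (max(0, min(start, doc_length)), max(0, min(end, doc_length)))
    return clamped, clamped != (start, end)


span, changed = clamp_span(656, 835, 817)
print(span, changed)  # (656, 817) True
```

Returning the flag alongside the span makes it easy to count how many annotations in a corpus are affected.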

Installation guide

Hey everyone,

Anafora looks like a really powerful tool, but so far I haven't been able to set it up successfully. Is there any installation guide other than this one? Also, is the project still alive?

RSS for completed sets

Allow administrators to view RSS feeds showing when sets have been completed and by whom.

"Mark as completed" greyed out, but clicking on it still works

I don't have more details than this about the circumstances, I'm afraid, but it's been reported that sometimes when an annotator goes to mark a note as completed, they find the button in the menu is greyed out. But apparently if they click it, the note will get marked as completed appropriately.

User Management

Hi,

I haven't yet understood how to handle multiple users. Could you briefly describe how to create / add new users?

Directory error

We are very impressed with Anafora's features. We intend to try your application for annotating custom transcription documents.

We have just managed to install the application, but we have been encountering issues loading content. Could you suggest a way forward?

The goal is to be able to load our documents and view annotations in a way that works exactly like the demo page.

I have attached the screenshot of the error message
anafora error

Error when open in-progress adjudication task

I got the following error when opening an in-progress adjudication task:

Uncaught TypeError: Cannot read property 'length' of undefined
at Function.each (jquery-1.8.2.min.js:2)
at PropertyType. (anaforaAdjudicationProject.js:914)
at Function.each (jquery-1.8.2.min.js:2)
at Entity. (anaforaAdjudicationProject.js:912)
at Function.each (jquery-1.8.2.min.js:2)
at AnaforaProject. (anaforaAdjudicationProject.js:897)
at Function.each (jquery-1.8.2.min.js:2)
at AnaforaAdjudicationProject.readFromXMLDOM (anaforaAdjudicationProject.js:896)
at String. (annotate.js:217)
at Function.each (jquery-1.8.2.min.js:2)

Automation of Complex Pipelines

When Entity adjudication is completed, preannotations for relations can be automatically created, specified by project/schema. This is currently done by script.

Double clicking on a word selects 2 words

The documentation states:
When you double-click on a word, most browsers will expand the selection to include the entire word.

This works on Edge, but on Firefox or Chrome double-clicking on a word selects that word plus the following word, which is not very ergonomic. It would be nice to have a fix on that.

TLinks disappear when highlighting source or target

Because TLINK/ALINK are not allowable types for filling the Source/Target slots, all relations disappear from the grid when you create a new relation. This is the expected behavior, but is frustrating. Perhaps always display relations, even when they're not allowed in the slot, because that behavior is confusing annotators and me.

Tablet Mode

Anafora is already supported on Safari (iOS 6) and Chrome (Android, iOS), but lacks the click-to-highlight and click-in-tree-to-create interactions required for annotation. Tablet mode could be auto-enabled when a mobile Chrome user agent (for consistency) is detected.

"Cancel" button doesn't work in document selection

The “cancel” button on the screen before you select a specific document doesn’t seem to work. If I accidentally select, say, "colon notes" instead of "brain notes," clicking "cancel" doesn't take me back to the previous screen (or do anything at all, actually); I have to close and restart Anafora.

Annotations created in Anafora 1.0 won't open in v 1.1

I have a lot of annotated documents that we did in Anafora 1.0. When I try to open these in Anafora 1.1, the document opens but no schema loads. Was there a backward-incompatible change made to the format of the annotation documents between versions?

Need ability to scroll through menu of links

Sometimes when a markable participates in a lot of links, when you click on the markable to view those links, it's not possible to see them all because they go off the screen and there's no way of scrolling up or down.
