WiRe57

This repository contains resources for the Open Information Extraction benchmark WiRe57.

It is a companion to the article:

William Léchelle, Fabrizio Gotti, Philippe Langlais (2019), WiRe57 : A Fine-Grained Benchmark for Open Information Extraction, LAW XIII 2019 : The 13th Linguistic Annotation Workshop

Introduction

Open Information Extraction (OIE) systems, starting with TextRunner, seek to extract all relational tuples expressed in text, without being bound to an anticipated list of predicates. Such systems have recently been used for relation extraction, question answering, and building domain-targeted knowledge bases, among other applications.

Subsequent extractors (ReVerb, Ollie, ClausIE, Stanford Open IE, OpenIE4, MinIE) have sought to improve yield and precision. Despite this, the task definition is underspecified, and, to the best of our knowledge, there is no gold standard.

To mitigate this problem, we built a reference annotation for the task of Open Information Extraction, covering five documents. We tentatively resolve a number of issues that arise, including inference and granularity, and seek to better pinpoint the requirements for the task. We provide annotation guidelines that specify what is correct to extract and what is not. In turn, we use this reference to score existing Open IE systems. We address the non-trivial problem of evaluating the extractions produced by systems against the reference tuples, and share our evaluation script. Among the seven extractors compared, we find that the MinIE system performs best.

Available resources

The annotation guidelines define how to annotate English text with triples.

The data directory contains the following files:

  • README.md: Specifies the source of the annotated documents.
  • WiRe57_343-manual-oie.json: The WiRe57 manual reference.
  • WiRe57_extractions_by_ollie_clausie_openie_stanford_minie_reverb_props-export.json: Extractions by state-of-the-art OIE systems on the WiRe57 documents.
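
Since both data files are JSON, a first step before using them is usually to load one and look at its top-level shape. The following is a minimal sketch of that step. The helper names `load_json` and `describe` are hypothetical (they are not part of this repository), the path assumes the script is run from the repository root, and nothing is assumed about the internal schema of the files.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a JSON file as UTF-8 and return the parsed object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(obj, max_keys=5):
    """Return a one-line summary of the top-level shape of a parsed JSON value."""
    if isinstance(obj, dict):
        keys = list(obj)[:max_keys]
        return f"dict with {len(obj)} keys, first keys: {keys}"
    if isinstance(obj, list):
        return f"list with {len(obj)} items"
    return type(obj).__name__


if __name__ == "__main__":
    # Assumption: the script is run from the repository root,
    # so the reference file sits under data/.
    reference_path = Path("data") / "WiRe57_343-manual-oie.json"
    if reference_path.exists():
        print(describe(load_json(reference_path)))
    else:
        print(f"{reference_path} not found; run from the repository root")
```

The same two helpers can be pointed at the system extractions file to compare its layout with the reference before writing any schema-specific code.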

The code directory contains the evaluation script used in the paper.
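
To give a rough idea of what scoring an extraction against a reference tuple can involve, here is a simplified sketch of token-level precision and recall for a single pair of triples. It is not the repository's evaluation script. The function names, the whitespace tokenization, the `(arg1, relation, arg2)` layout, and the rule that each part must share at least one token are all assumptions made for illustration. Refer to the script in the code directory and to the paper for the actual procedure.

```python
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()


def overlap(pred_tokens, ref_tokens):
    """Count tokens common to both lists, respecting multiplicity."""
    common = Counter(pred_tokens) & Counter(ref_tokens)
    return sum(common.values())


def score_pair(predicted, reference):
    """Token-level (precision, recall) for one predicted/reference triple.

    Both arguments are (arg1, relation, arg2) string triples.
    Assumed matching rule: if any part shares no token, the pair
    does not match and scores (0.0, 0.0).
    """
    matched = pred_total = ref_total = 0
    for pred_part, ref_part in zip(predicted, reference):
        pred_tok, ref_tok = tokens(pred_part), tokens(ref_part)
        shared = overlap(pred_tok, ref_tok)
        if shared == 0:
            return 0.0, 0.0
        matched += shared
        pred_total += len(pred_tok)
        ref_total += len(ref_tok)
    return matched / pred_total, matched / ref_total


if __name__ == "__main__":
    # Made-up triples for illustration; not taken from the WiRe57 data.
    predicted = ("the system", "extracts", "relational tuples from text")
    reference = ("the system", "extracts", "relational tuples")
    precision, recall = score_pair(predicted, reference)
    print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this made-up example the prediction contains every reference token but adds two extra ones, so recall is perfect while precision is penalized. A full evaluation would also have to decide which reference tuple each prediction is compared to and how to aggregate scores over a document, which this sketch leaves out.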
