
contradoc's Introduction

ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models

This is the repo for ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models.

Dataset Introduction:

CONTRADOC contains 449 self-contradictory (positive) and 442 non-contradictory (negative) documents sourced from CNN/DailyMail news, Wikipedia, and story summaries, with document lengths ranging from 300 to 2200 tokens. The dataset was created by introducing self-contradictions into documents with GPT-4 and then having human annotators check and verify them. It serves as a benchmark for testing a model's ability to find contradictions in long documents.

Dataset Format:

The positive examples are under "pos" and the negative examples under "neg"; please refer to the paper for more details on each label.

{"pos":
  {"DOC_ID":
    {"text": DOCUMENT, 
      "evidence": SENTENCE_INTRODUCING_CONTRADICTION,
      "unique id": DOC_ID,
      "doc_type":"story_OR_news_OR_wiki",
      "contra_plug": "Insert_OR_Replace", 
      "contra_type": [contradiction type],
      "scope": "global_OR_local_OR_intra",
      "ref sentences"(optional): [sentences contradict the evidence]
    },
  },
},
{"neg":
  {"DOC_ID":
    {"text": DOCUMENT,
      "doc_type": "story_OR_news_OR_wiki",
      "unique id": DOC_ID
    },
  },
}
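Assuming the released JSON follows the schema above, a minimal loading sketch might look like the following (the file name contradoc.json is a placeholder; adjust it to the actual release):

import json

# Placeholder path; point this at the released dataset file.
with open("contradoc.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Positive (self-contradictory) documents carry the contradiction metadata.
for doc_id, doc in data["pos"].items():
    evidence = doc["evidence"]           # sentence that introduces the contradiction
    scope = doc["scope"]                 # "global", "local", or "intra"
    refs = doc.get("ref sentences", [])  # sentences the evidence contradicts, if annotated
    print(doc_id, doc["doc_type"], doc["contra_plug"], scope, evidence[:80])

# Negative (non-contradictory) documents only carry the text and document type.
for doc_id, doc in data["neg"].items():
    print(doc_id, doc["doc_type"], len(doc["text"].split()))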

We will be releasing the code for evaluation and dataset creation soon.


contradoc's Issues

Can't tell if the repository is complete

The README says "We will be releasing the code for evaluation and dataset creation soon." That file was last updated 9 months ago. I am not sure whether this code has been added to the repository yet.

Seven months ago, the file eval_metric.py was added with the comment "Add evaluation metrics in the paper." Was that the evaluation code mentioned in the README?

Has the code for dataset creation been added to the repo? If not, will it be?

Thanks.
