Giter VIP home page Giter VIP logo

malvag / claimlinker Goto Github PK

View Code? Open in Web Editor NEW
4.0 2.0 0.0 87.94 MB

ClaimLinker is a Web service and API that links arbitrary text to fact-checked claims, offering a novel kind of semantic annotation of unstructured content. The system is based on a scalable, fully unsupervised and modular approach that does not require training or tuning and which can serve high quality results at real time.

HTML 1.12% Java 96.31% Shell 0.50% Roff 1.94% JavaScript 0.13%
nlp stanford-corenlp elasticsearch stanford-nlp fact-checking

claimlinker's Introduction

ClaimLinker

ClaimLinker is a Web service and API that links arbitrary text to fact-checked claims, offering a novel kind of semantic annotation of unstructured content. The system is based on a scalable, fully unsupervised and modular approach that does not require training or tuning and which can serve high quality results at real time.

More information is available at the following paper:

Maliaroudakis E., Boland K., Dietze S., Todorov K., Tzitzikas Y. and Fafalios P., 
"ClaimLinker: Linking Text to a Knowledge Graph of Fact-checked Claims", 
In Companion Proceedings of the Web Conference 2021.

A sub-module system has been implemented, divided respectively by the following modules:

Claimlinker_commons (main library for claim linking)

  • ClaimLinker
  • Core NLP API
  • Association Type
  • Similarity Measures
  • Test class

Claimlinker_web (web services)

  • Web Servlet that returns annotations in JSON
  • Web Servlet for Bookmarklet that allows a user to select a piece of text in a web page and check if there are fact-checked claims linked to the selected text.
  • JSP offering a form where the user can give sometext and check if there are fact-checked claims linked to that text.

ElasticSearch_Tools (indexing of claims and search service provision)

  • Initializer for an ElasticSearch server
  • Wrapper API for an ElasticSearch server
  • OpenCSV wrapper class

Getting Started

Installation

  1. We need to install the FEL library to the ClaimLinker_commons pom.
cd ClaimLinker_commons
mvn install:install-file -Dfile=./lib/FEL-0.1.0-fat.jar -DgroupId=com.yahoo.semsearch -DartifactId=FEL -Dversion=0.1.0 -Dpackaging=jar -DgeneratePom=true
  1. Then compile the whole project from the project's root directory:
cd ..
mvn compile package #to compile and package into jar the claimlinker
  1. Get the claim data from a csv, the hash file for FEL library the stopwords file and the punctuations file: i.e.
    • data/
      • claim_extraction_18_10_2019_annotated.csv (a dataset of fact-checked claims)
      • english-20200420.hash (used by FEL)
      • stopwords.txt
      • puncs.txt

ElasticSearch setup

You can check here how you can set up a single-node elasticsearch using docker.

Usage

Initializing ElasticSearch:

java ElasticInitializer -f "data.csv" -h "elasticsearch_host"

Running the ClaimlinkerTest class:

java -cp .:ClaimLinker_commons/target/ClaimLinker_commons-1.0-jar-with-dependencies.jar:ClaimLinker_web/target/ClaimLinker_web-1.0.jar:ElasticSearch_Tools/target/ElasticSearch_Tools-1.0.jar: csd.claimlinker.ClaimLinkerTest

Using it as a library:

ClaimLinker CLInstance = new ClaimLinker(elastic_search_threashold, similarityMeasures, stopwords_file, punctuations_file english_hash_FEL, ElasticSearch_host);

CLInstance.claimLink(text, num_of_returned_claims, associationtype, cleanPrevAnnotations)

i.e. (csd.claimlinker.ClaimLinkerTest)

public static void main(String[] args) throws Exception{
        demo_pipeline("Of course, we are 5 percent of the world's population;\n");
}

static Set<CLAnnotation> demo_pipeline(String text) throws IOException, ClassNotFoundException {
    AnalyzerDispatcher.SimilarityMeasure[] similarityMeasures = new AnalyzerDispatcher.SimilarityMeasure[]{
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_words,           //Common (jaccard) words
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_lemm_words,      //Common (jaccard) lemmatized words
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_ne,              //Common (jaccard) named entities
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_dissambig_ents,  //Common (jaccard) disambiguated entities BFY
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_pos_words,       //Common (jaccard) words of specific POS
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_ngram,           //Common (jaccard) ngrams
		AnalyzerDispatcher.SimilarityMeasure.jcrd_comm_nchargram,       //Common (jaccard) nchargrams
		AnalyzerDispatcher.SimilarityMeasure.vec_cosine_sim             //Cosine similarity
	};
	ClaimLinker CLInstance = new ClaimLinker(20, similarityMeasures, "data/stopwords.txt", "data/puncs.txt", "data/english-20200420.hash", "192.168.2.112");
	System.out.println("Demo pipeline started!");
	Set<CLAnnotation> results = CLInstance.claimLink(text, 5, Association_type.all, true);
	return results;
}

Using it as a Web Service returning results in JSON:

Example request for text "You know, interest on debt will soon exceed security spending.":

http://<claimlinker-url>/claimlinker?app=service&text=You%20know,%20interest%20on%20debt%20will%20soon%20exceed%20security%20spending.

Example of results in JSON: 

{"_results":[{
   "text":"You know, interest on debt will soon exceed security spending.",
   "sentencePosition":0,
   "association_type":"same_as",
   "linkedClaims":[
       { "claimReview_claimReviewed":"'Within a few years, we will be spending more on interest payments than on national security.'",
	 "_score":39.261948,
	 "extra_title":"Mitch Daniels says interest on debt will soon exceed security spending",
	 "rating_alternateName":"false",
	 "creativeWork_author_name":"Mitch Daniels",
	 "claimReview_url":"http://www.politifact.com/truth-o-meter/statements/2011/feb/17/mitch-daniels/mitch-daniels-says-interest-debt-will-soon-exceed-/",
	 "claim_uri":"http://data.gesis.org/claimskg/creative_work/16076cfe-34c2-542b-9ffa-41a4c8677ced"},
       { "claimReview_claimReviewed":"'In just 17 years, spending for Social Security, federal health care and interest on the debt will exceed ALL tax revenue!'",
         "_score":28.52819,
	 "extra_title":"Brat says entitlement and debt payments will consume all taxes in 2032",
	 "rating_alternateName":"mostly true",
	 "creativeWork_author_name":"Dave Brat",
	 "claimReview_url":"http://www.politifact.com/virginia/statements/2015/jun/16/dave-brat/brat-says-entitlement-and-debt-payments-will-consu/",
	 "claim_uri":"http://data.gesis.org/claimskg/creative_work/928f16c7-1ae8-53bb-a6d2-8aa6a59dbd45"},
       { "claimReview_claimReviewed":"'By 2022, just the interest payment on our debt will be greater than the defense of our country.'",
         "_score":28.45221,
         "extra_title":"Will interest on the debt exceed defense spending by 2022?",
         "rating_alternateName":"mostly true",
         "creativeWork_author_name":"Joe Manchin",
         "claimReview_url":"http://www.politifact.com/truth-o-meter/statements/2018/may/10/joe-manchin/will-interest-debt-exceed-defense-spending-2022/",
         "claim_uri":"http://data.gesis.org/claimskg/creative_work/3579711f-d846-57ba-998a-6c0983c2c2cb"},
       { "claimReview_claimReviewed":"'The debt will soon eclipse our entire economy.'",
         "_score":18.63715,
	 "extra_title":"Paul Ryan, in State of the Union response, says U.S. debt will soon eclipse GDP",
	 "rating_alternateName":"true",
	 "creativeWork_author_name":"Paul Ryan",
	 "claimReview_url":"http://www.politifact.com/truth-o-meter/statements/2011/jan/26/paul-ryan/paul-ryan-state-union-response-says-us-debt-will-s/",
	 "claim_uri":"http://data.gesis.org/claimskg/creative_work/7951dee8-b793-512c-bd13-074f648a052a"},
       { "claimReview_claimReviewed":"'Our spending has caught up with us, and our debt soon will eclipse the entire size of our national economy.'",
         "_score":17.90907,
	 "extra_title":"House Speaker John Boehner has the right count on the magnitude of the federal debt",
	 "rating_alternateName":"true",
	 "creativeWork_author_name":"John Boehner",
	 "claimReview_url":"http://www.politifact.com/ohio/statements/2011/jan/10/john-boehner/house-speaker-john-boehner-has-right-count-magnitu/",
	 "claim_uri":"http://data.gesis.org/claimskg/creative_work/4c906dfc-c6af-5d38-aec4-0808edb2dcb9"}]}],"timeElapsed":1117}

claimlinker's People

Contributors

dependabot[bot] avatar fafalios avatar malvag avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.