wesslen / verifi-icwsm-2018

Supplemental materials for Karduni et al. (ICWSM 2018) - "Can You Verifi This? Studying Uncertainty and Decision-Making about Misinformation in Visual Analytics"

License: GNU General Public License v3.0

visual-analytics decision-making-under-uncertainty twitter misinformation natural-language-processing social-network

verifi-icwsm-2018's Introduction

Paper

Karduni, A., Wesslen, R., Santhanam, S., Cho, I., Volkova, S., Arendt, D., Shaikh, S., and Dou, W. (2018). Can You Verifi This? Studying Uncertainty and Decision-Making about Misinformation in Visual Analytics. ICWSM 2018.

@inproceedings{verifimisinfo,
  title = {Can You Verifi This? Studying Uncertainty and Decision-Making about Misinformation in Visual Analytics},
  author = {Karduni, Alireza and Wesslen, Ryan and Santhanam, Sashank and Cho, Isaac and Volkova, Svitlana and Arendt, Dustin and Shaikh, Samira and Dou, Wenwen}, 
  booktitle = {Proceedings of the 12th International AAAI Conference on Web and Social Media},
  series = {ICWSM '18},
  year = {2018},
  location = {Palo Alto, California}
  }

Instructions to run

The code to analyze the study is written in R and requires R 3.4.3 or higher. We highly recommend using RStudio 1.1.383 or higher.

Open the file verifi-icwsm-2018.Rproj.

  1. R-stream-code: Twitter Streaming API Code to pull the original data (Rmd / HTML)
  2. Random Forest Model on Language Feature Selection (Rmd / HTML)
  3. Logistic Regression: Accuracy and Fake Decisions (Rmd / HTML)

The code for Verifi is written in Python, D3.js, Leaflet.js, and Node.js. Please email [email protected] if you are interested in the system. User interaction logs were recorded automatically and stored in a local MongoDB database.

Study

The in-laboratory study was approved by the UNC Charlotte Institutional Review Board ([email protected]), IRB #17-0251.

Data

Format  Description
csv     User Interaction Logs
csv     User Responses (Decisions)
csv     Account Twitter IDs

Account information is provided in accordance with Twitter's Terms of Service, i.e., only Twitter account IDs are provided. For the original dataset, please email me at [email protected]. The data may be used for research purposes only. Pre-questionnaire responses are not provided publicly to avoid privacy concerns.

All user-level data are anonymized. Use user-id for matching.
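As a minimal sketch of that matching step, the Python snippet below adds a per-user interaction count to each decision row by joining on a shared user identifier. The column names (`user_id`, `action`, `account`, `decision`) and the inline sample rows are hypothetical; the actual headers in the repository's CSV files may differ, and the repository's own analysis code is written in R.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real CSV headers and values may differ.
LOGS_CSV = """user_id,action
u01,click_account
u01,hover_map
u02,click_tweet
"""

RESPONSES_CSV = """user_id,account,decision
u01,@XYZ,fake
u02,@XYZ,real
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_user(logs, responses, key="user_id"):
    """Add a per-user interaction count to each response row."""
    logs_by_user = defaultdict(list)
    for row in logs:
        logs_by_user[row[key]].append(row)
    return [
        {**resp, "n_interactions": len(logs_by_user[resp[key]])}
        for resp in responses
    ]


joined = join_on_user(read_rows(LOGS_CSV), read_rows(RESPONSES_CSV))
for row in joined:
    print(row["user_id"], row["decision"], row["n_interactions"])
```

To run this against the real files, replace the inline strings with the contents of the CSV files and set `key` to whatever the user identifier column is actually called.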

Figures

Verifi

Verifi Interface

The Verifi interface: Account View (A), Social Network View (B), Tweet Panel (C), Map View (D), and Entity Word Cloud (E). The interface can be accessed at https://verifi.herokuapp.com.

When accessing the Heroku app, Google Chrome is highly recommended. You may need to adjust your browser zoom depending on your monitor size.

Language Features

Top 20 most predictive language features of Fake and Real news outlets, as measured by each feature’s average effect on Accuracy. The ‘t’ prefix indicates that the feature is normalized by the account’s tweet count, and the ‘n’ prefix indicates normalization by the account’s word count (summed across all tweets). Features with borders are included in Verifi. Random Forests were used (see 02-linguistic-features).
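The two normalizations described in this caption can be sketched as follows. This is an illustration only: the feature names, the counts, and the underscore after each prefix are assumptions, and the repository's actual feature engineering is in the R notebooks.

```python
def normalize_features(raw_counts, tweet_count, word_count):
    """Return 't'- and 'n'-prefixed versions of each raw feature count.

    't' features divide by the account's number of tweets;
    'n' features divide by the account's total word count across all tweets.
    """
    if tweet_count <= 0 or word_count <= 0:
        raise ValueError("tweet_count and word_count must be positive")
    normalized = {}
    for name, count in raw_counts.items():
        normalized["t_" + name] = count / tweet_count
        normalized["n_" + name] = count / word_count
    return normalized


# Hypothetical account: 200 tweets containing 3000 words in total.
account = normalize_features(
    {"anger": 30, "hedges": 12}, tweet_count=200, word_count=3000
)
```

With these made-up counts, the same raw feature yields a per-tweet rate and a per-word rate, so accounts with very different posting volumes become comparable.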

User Study

Accounts Selected for User Decisions

Eight accounts with masked account names. Background colors indicate real (green) and fake (red). Accounts were masked as requested by one of the authors' institutions.

Ground truth (real vs. fake) labels were taken from third-party sources. Fake news accounts came from one of two sources: (1) Propaganda and (2) Satire, Hoax, and Click Bait. The original source of the Satire-Hoax-ClickBait accounts has gone offline since our study, so our link points to a Wayback Machine snapshot from late 2017.

Real news accounts came from one of two sources: Business Insider and Forbes.

For the 82 accounts selected for our study, we manually reviewed the accounts to obtain variety and to enforce basic criteria (e.g., enough tweets, tweets in English, and a profile that had not been deleted and was public at the time of the study).

Account Twitter IDs provides the Twitter profile IDs as well as the third-party labels.

Truth Table

Ground Truth

Available cues for selected accounts (columns) and users' responses regarding the importance of these cues (rows, Q1-Q6). Left: each of the eight selected accounts and the cues available for each of them. Right: the average importance of each cue per account, based on participants' responses. Values in gray circles below each account name show the average accuracy of participants' predictions for that account. The left figure is based purely on the (conflicting) information presented in the cues and is independent of user responses. The right figure is based on user responses on the importance of each cue and coincides with the information in the left table.

Logistic Regression to Explain User Predictions (Accuracy and Fake)

Log odds ratios for each independent variable in two logistic regressions. The Accuracy column is 1 = Correct, 0 = Incorrect Decision. The Fake column is the user's prediction: 1 = Fake, 0 = Real. The @accounts variables use @XYZ as the reference level and the Group variables use the Control Group as the reference level.
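To read such a table, it can help to convert a log odds ratio into an odds ratio and into a predicted probability. The sketch below uses made-up coefficient values and variable names, not results from the paper. Under dummy coding, the reference level (for example @XYZ or the Control Group) contributes no term to the linear predictor, so its effect is absorbed into the intercept.

```python
import math


def odds_ratio(log_odds_ratio):
    """Convert a log odds ratio (a logistic-regression coefficient) to an odds ratio."""
    return math.exp(log_odds_ratio)


def predicted_probability(intercept, coefficients, active_levels):
    """Probability of outcome 1 given the dummy variables that are switched on.

    Reference levels are represented by leaving them out of active_levels.
    """
    linear = intercept + sum(coefficients[level] for level in active_levels)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for illustration only.
coefficients = {"account_ABC": 0.7, "group_treatment": -0.2}

# Reference account and Control Group: only the intercept applies.
p_reference = predicted_probability(0.0, coefficients, [])

# Same group, but a non-reference account.
p_abc = predicted_probability(0.0, coefficients, ["account_ABC"])
```

A positive log odds ratio means the outcome coded 1 (Correct, or Fake) is more likely than at the reference level, and a negative one means it is less likely.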

Study Demographics
