Giter VIP home page Giter VIP logo

canmd's Introduction

Introduction

The CANMD repository is the PyTorch Implementation of CIKM 2022 Paper Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19

We propose contrastive adaptation network for early misinformation detection (CANMD). Specifically, we leverage pseudo labeling to generate high-confidence target examples for joint training with source data. We additionally design a label correction component to estimate and correct the label shifts (i.e., class priors) between the source and target domains. Moreover, a contrastive adaptation loss is integrated in the objective function to reduce the intra-class discrepancy and enlarge the inter-class discrepancy. As such, the adapted model learns corrected class priors and an invariant conditional distribution across both domains for improved estimation of the target data distribution. To demonstrate the effectiveness of the proposed CANMD, we study the case of COVID-19 early misinformation detection and perform extensive experiments using multiple real-world datasets. The results suggest that CANMD can effectively adapt misinformation detection systems to the unseen COVID-19 target domain with significant improvements compared to the state-of-the-art baselines.

Citing

Please consider citing the following paper if you use our methods in your research:

@inproceedings{yue2022contrastive,
  title={Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19},
  author={Yue, Zhenrui and Zeng, Huimin and Kou, Ziyi and Shang, Lanyu and Wang, Dong},
  booktitle={Proceedings of the 31th ACM International Conference on Information & Knowledge Management},
  year={2022}
}

Data & Requirements

Download datasets: FEVER, GettingReal, GossipCop, LIAR, PHEME, CoAID, Constraint and ANTiVax.

Required packages: PyTorch, pandas, numpy etc. For our running environment see requirements.txt

Train Source Misinformation Detection Models

python src/train.py --output_dir=SOURCE_FOLDER --source_data_type=constraint --source_data_path=./data/Constraint;

Excecute the above command (with arguments) to train a source misinformation detection model, select dataset and path (i.e., source_data_type and source_data_path) from FEVER, GettingReal, GossipCop, LIAR, PHEME, CoAID, Constraint and ANTiVax. Trained source models could be found under ./experiments/SOURCE_FOLDER.

Adapt Source Misinformation Detection Models to Target Domain

python src/adapt.py --load_model_path=SOURCE_FOLDER --output_dir=TARGET_FOLDER --source_data_type=constraint --source_data_path=./data/Constraint --target_data_type=coaid --target_data_path=./data/CoAID --alpha=0.001 --conf_threshold=0.6;

Excecute the above command (with arguments) to adapt the source model with CANMD. Specify load_model_path to load the source trained model, select both source and target datasets and paths (i.e., source_data_type, source_data_path, target_data_type and target_data_path) from FEVER, GettingReal, GossipCop, LIAR, PHEME, CoAID, Constraint and ANTiVax. Adapted models could be found under ./experiments/TARGET_FOLDER. Alpha (i.e., lambda in the contrastive loss) and conf_threshold (i.e., tau in confidence thresholding) are hyperparameters for CANMD and can be chosen empirically.

Performance

Note that the performance may vary slightly from the reported results as we adopt an updated version of transformers in this repo.

Acknowledgement

During the implementation we base our code mostly on Transformers from Hugging Face and Abstention by Shrikumar et al. Many thanks to these authors for their great work!

canmd's People

Contributors

yueeeeeeee avatar

Stargazers

Daniel avatar Julian avatar

Watchers

James Cloos avatar  avatar

canmd's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.