Hostile Narrative Analysis.

Guided by theories of violence from Peace Studies, this PhD research proposes a natural language processing (NLP) pipeline, built on spaCy, to support what we call “Hostile Narrative Analysis”.

In conceptualising Conflict Narrative Detection, we use sociological theory to guide the technological design. The defining theory is “cultural violence”, which seeks to explain the processes by which violence is legitimised. From this theory, we have derived a novel methodology for detecting and measuring cultural violence in natural language. This methodology structures the proposed NLP pipeline, and we have conducted several experiments to inform its technical development.

The methodology and experimentation are structured as follows (to be updated as the research develops):

Objective #    Objective                                                     Technology                Tests
-------------  ------------------------------------------------------------  ------------------------  -----
Obj 0.         Pre-processing of text
Obj 0.1        Tokenize texts                                                spaCy Tokenizer
Obj 0.2        Tag texts                                                     spaCy Tagger
Obj 0.3        Parse texts                                                   spaCy Dependency Parser
Obj 0.4        Experiment 0.1 - Named Entity recognition                     spaCy ner
Obj 0.5        Experiment 0.2 - Named Concept recognition                    spaCy custom component
Obj 0.6        Experiment 0.3 - Entity Resolution                            Custom component
Obj 0.7        Experiment 0.4 - Coreference Resolution                       Hugging Face coref
-------------  ------------------------------------------------------------  ------------------------  -----
Obj 1.         Detect the ingroup and outgroup of an orator’s text
               Experiment 1.1 - Regex Hearst Patterns                        Regex
               Experiment 1.2 - Hearst Patterns                              spaCy Matcher
-------------  ------------------------------------------------------------  ------------------------  -----
Obj 2.         Detect and classify phrases as ingroup elevation terms.
-------------  ------------------------------------------------------------  ------------------------  -----
Obj 3.         Detect and classify phrases as outgroup othering terms.
-------------  ------------------------------------------------------------  ------------------------  -----
Obj 4.         Infer intergroup differentiation using measurement schema.
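
As an illustration of the kind of technique Experiment 1.1 names, the following is a minimal sketch of regex-based Hearst patterns, which are lexical templates such as “X such as A, B and C” that pair a general term (hypernym) with its instances (hyponyms). It is not the code used in this research. The function name `extract_hearst_pairs`, the choice of two patterns, the example sentences, and the approximation of a noun phrase by a single word are all assumptions made for this example. How extracted pairs would map onto an orator’s ingroup and outgroup is not covered here.

```python
import re

# Assumption: a "noun phrase" is approximated by a single word.
# Proper noun-phrase chunking would need a tagger or parser.
WORD = r"[A-Za-z][A-Za-z'-]*"
# A list item is any word other than the conjunctions joining the list.
ITEM = rf"(?!(?:and|or)\b){WORD}"

# Pattern 1: "<hypernym> such as <A>, <B> and <C>"
SUCH_AS = re.compile(
    rf"\b(?P<hypernym>{WORD})\s*,?\s+such\s+as\s+"
    rf"(?P<hyponyms>{ITEM}(?:\s*,\s*{ITEM})*(?:\s*,?\s+(?:and|or)\s+{ITEM})?)",
    re.IGNORECASE,
)

# Pattern 2: "<A>, <B> and other <hypernym>"
AND_OTHER = re.compile(
    rf"\b(?P<hyponyms>{ITEM}(?:\s*,\s*{ITEM})*)"
    rf"\s*,?\s+(?:and|or)\s+other\s+(?P<hypernym>{WORD})",
    re.IGNORECASE,
)

# Splits a matched list on commas and on "and" / "or".
LIST_SEPARATOR = re.compile(
    r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+", re.IGNORECASE
)


def extract_hearst_pairs(text):
    """Return (hyponym, hypernym) pairs found by the two patterns above."""
    pairs = []
    for pattern in (SUCH_AS, AND_OTHER):
        for match in pattern.finditer(text):
            hypernym = match.group("hypernym")
            for hyponym in LIST_SEPARATOR.split(match.group("hyponyms")):
                if hyponym:
                    pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "We stand with allies such as Britain, Canada and Australia."
    print(extract_hearst_pairs(sentence))
    # [('Britain', 'allies'), ('Canada', 'allies'), ('Australia', 'allies')]
```

Plain regular expressions only see surface strings, so multi-word names and modifiers are missed. Matching over tokens with part-of-speech tags, as Experiment 1.2 does with the spaCy Matcher, is one way to relax that limitation.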

To develop the pipeline, we curated a dataset comprising Hitler’s “Mein Kampf”, Martin Luther King’s “I Have a Dream”, and political speeches from George Bush and Osama bin Laden during the “War on Terror”. Except for King’s speech, these texts have been used to legitimise violence as a means of bringing about change, so they should be regarded as culturally violent. Because King sought non-violent change, including his speech as a control may provide some insight into the variables that make a text culturally violent. The ingroup and outgroup of each text are well understood, so results can be assessed by observation, which makes this dataset a good source of test data.

This research is funded by the Engineering and Physical Sciences Research Council (EPSRC) through the Web Science Doctoral Training Centre (DTC) at the University of Southampton. Supervisors:

  • Dr George Konstantinidis
  • Dr Craig Webber

In developing this pipeline, big thanks go to:

  • Mark Neumann
  • mmichelsonIF
  • especially the excellent explosion.ai team for creating the spaCy library, without which none of this would be possible.

I have only been coding for 18 months; any suggestions, comments or feedback are very welcome.

Contributors:

  • fourthought