SNA-AH-NLU-Labeling-Cross-Platforms

Natural Language Understanding, Processing, and Sentiment Testing across social media platforms on Amber Heard data using Scientific Methods.
NLU engines, monitoring, classifications, training.

  • Twitter, Reddit, Instagram, YouTube, Change.org, Facebook

Analysis Testing and Data


Below on Main GitHub

Classifying Texts and Accounts in Case Study -
Automated programs use NLP to label each text as helping or harming her under four categories (support, defense, offense, defense_against), and label the accounts in her environment as supporters or offenders.
-> Threat analysis, negative-text analysis, and word filtering specific to the disinformation operations are applied in further analysis. Word clouds show patterns across platforms.
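
The word filtering step above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `WATCHLIST`, `matched_terms`, and `filter_texts` are hypothetical names, and the watchlist entries are placeholders rather than the project's actual terms.

```python
import re
from typing import Iterable, List, Set, Tuple

# Hypothetical placeholder watchlist; the project's real word lists are not reproduced here.
WATCHLIST = {"threat_word_a", "threat_word_b"}

# Lowercase word-like tokens (letters, digits, underscore, apostrophe).
TOKEN_RE = re.compile(r"[a-z0-9_']+")


def matched_terms(text: str, watchlist: Iterable[str] = WATCHLIST) -> Set[str]:
    """Return the watchlist terms that appear as whole tokens in the text."""
    tokens = set(TOKEN_RE.findall(text.lower()))
    return tokens & set(watchlist)


def filter_texts(
    texts: Iterable[str], watchlist: Iterable[str] = WATCHLIST
) -> List[Tuple[str, List[str]]]:
    """Keep only texts that hit the watchlist, paired with their sorted matches."""
    hits = []
    for text in texts:
        terms = matched_terms(text, watchlist)
        if terms:
            hits.append((text, sorted(terms)))
    return hits
```

Matching on whole tokens rather than substrings avoids flagging a text just because a watched word is embedded in a longer, unrelated word.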

Data and Analysis of Twitter, Reddit, Instagram, YouTube, Change.org, Facebook

  • Support and Defense Data is included, as well as Compliments and Love Data for training the NLU, including 5.9K of the 12.4K labeled training texts
    -> 10K NLG-generated supportive compliments about her, from Semiosis, are included to compare labeled texts with articulate compliments from AI. Compare density and prompt-based similarities.
    -> In the context of domestic abuse and coercive control, it's important to show what is supportive and positive in relationships through support, defense, compliment, and love texts. The adversarial framework of operations, or tactical strategy, is a further layer.
  • Monitoring and Dashboards included
  • Instagram threat analysis includes word analysis for both Crime and Human Trafficking, with NLU testing
  • Testing data is flooded with harms
  • High volumes apply: With the monitoring NLU, there were 122K auto-labeled tweets by June 2021 and tens of thousands of accounts from one monitoring account. As the harmful volume increases, the risks increase, and the amounts affect her wellbeing.
    -> Correlate meaning to quantifications
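
One simple way to tie quantities back to meaning is to tally the labels and report the share of texts directed against her. The sketch below is illustrative only: `label_volumes` and `against_share` are hypothetical helpers, and grouping offense with defense_against as "against her" is an assumption drawn from the label descriptions in this README.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("support", "defense", "offense", "defense_against")

# Assumption: these two labels count as directed against her.
AGAINST_LABELS = {"offense", "defense_against"}


def label_volumes(labels: Iterable[str]) -> Dict[str, int]:
    """Count texts per label, reporting zero for labels that never occur."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in LABELS}


def against_share(labels: Iterable[str]) -> float:
    """Fraction of labeled texts that fall under the 'against' labels."""
    volumes = label_volumes(labels)
    total = sum(volumes.values())
    if total == 0:
        return 0.0
    return sum(volumes[label] for label in AGAINST_LABELS) / total
```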

Labels are: support, defense, offense, defense_against (focused on victim - target of operations)

Data used in NLU Testing Analysis is in the /Testing Data folder
Dashboard files show how the program is used to run the monitoring bot. Monitoring JS files are included.

  • Papers are provided under Guides to NLU and "Studying Technologies" in the Background - Preliminary Effects folder, e.g., argumentation research and logic.
    The approach is similar to chess tactics, but in the dynamic context of coercive control or warfare operations. Our goal is to mitigate the operations.
  • We've provided some data on AH's NLU to a data science anti-harassment startup focused on understanding gender dynamics. Please contact us if you need more data.

Natural Language Understanding:

  • A BERT file under config shows BERT as used for Amber Heard NLP training
  • There are Dashboard code files in JavaScript. The requirements are React.js, JavaScript, and Django
  • Package examples and instructions for training the NLU are included, e.g., nltk_data
  • Supporters and Offenders are marked by the numbers of tweets using the NLU classifications
  • There are monitoring pages, showing how the crawler monitors while classifying and storing the texts in a database
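
The monitor-classify-store flow described in the last bullet might look like the sketch below. It is not the repository's crawler: the table layout, `open_store`, `monitor`, and `stub_classify` are all hypothetical, SQLite stands in for whatever database the project uses, and the stub merely takes the place of the trained NLU model.

```python
import sqlite3
from typing import Callable, Iterable, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite store for labeled texts. Schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS labeled_texts ("
        "id INTEGER PRIMARY KEY, account TEXT NOT NULL, "
        "text TEXT NOT NULL, label TEXT NOT NULL)"
    )
    return conn


def monitor(
    items: Iterable[Tuple[str, str]],
    classify: Callable[[str], str],
    conn: sqlite3.Connection,
) -> int:
    """Classify each (account, text) pair and store it; return the number stored."""
    rows = [(account, text, classify(text)) for account, text in items]
    conn.executemany(
        "INSERT INTO labeled_texts (account, text, label) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def stub_classify(text: str) -> str:
    """Stand-in for the trained NLU model; real labels come from the classifier."""
    return "support" if "support" in text.lower() else "unlabeled"
```

In practice `items` would be fed by the crawler, and `classify` would call the trained multi-class model instead of the stub.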

There is an NLU dashboard tester that uses the API for the trained NLU.
The API is tested with Postman.
Pre-processing bot code for extracting text from images, and for testing the results with a video, is added.

Requirements of NluEngine:

  • pytorch-lightning >= 0.9.0
  • torch >= 1.7.0
  • transformers >= 3.2.0
  • kaggle >= 1.5.8
  • pandas >= 1.1.2
  • scikit-learn >= 0.23.2
  • datasets >= 1.0.2
  • tqdm == 4.41.0
  • sentencepiece >= 0.1.94
  • 2nd Requirements file included
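
To check an environment against the versions listed above, something like the following could be used. It is a rough sketch: `REQUIREMENTS` copies the list above, while `parse_version`, `satisfies`, and `check_environment` are hypothetical helpers that compare only the first three numeric components of a version string.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Tuple

# Copied from the requirements list above; tqdm is pinned exactly.
REQUIREMENTS = {
    "pytorch-lightning": (">=", "0.9.0"),
    "torch": (">=", "1.7.0"),
    "transformers": (">=", "3.2.0"),
    "kaggle": (">=", "1.5.8"),
    "pandas": (">=", "1.1.2"),
    "scikit-learn": (">=", "0.23.2"),
    "datasets": (">=", "1.0.2"),
    "tqdm": ("==", "4.41.0"),
    "sentencepiece": (">=", "0.1.94"),
}


def parse_version(raw: str) -> Tuple[int, ...]:
    """Crude parse: first three numeric parts, ignoring local tags like '+cu110'."""
    parts = [int(p) for p in re.findall(r"\d+", raw.split("+")[0])[:3]]
    parts += [0] * (3 - len(parts))  # pad '2.1' to (2, 1, 0)
    return tuple(parts)


def satisfies(installed: str, op: str, required: str) -> bool:
    """Support only the two operators used in the list above: '>=' and '=='."""
    a, b = parse_version(installed), parse_version(required)
    return a == b if op == "==" else a >= b


def check_environment(requirements=REQUIREMENTS) -> Dict[str, str]:
    """Report 'ok', 'missing', or a mismatch message for each requirement."""
    report = {}
    for name, (op, required) in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = satisfies(installed, op, required)
        report[name] = "ok" if ok else f"found {installed}, need {op}{required}"
    return report
```

For strict version semantics (pre-releases, epochs), a dedicated version-parsing library would be a better fit than this simplified comparison.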

NLU version trained on data from: Twitter, Facebook, and Change.org primarily, with some YouTube and Reddit

  1. The first example (old nlpengine) uses only 2 classes, love or hate. It was quickly seen to be inaccurate, which prompted the move to an offense, defense, support strategy with multi-class classification
  2. The second example, a more accurate NLP engine, is a multi-class classifier: support, defense, offense, defense_against (focused on victim - target of operations), trained on 12.4K texts
    Supporters both support and defend. Offenders produce both offense and defense_against texts.
    Support is completely focused on uplifting her, while defense includes constructive words to defend and support her. Offense is purely harmful towards her, while defense_against includes support of her adversary.

    Accounts are labeled by the number of texts classified for, against, or neutral.
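
A minimal sketch of account labeling from per-text classifications is shown below. The majority rule and the names `label_accounts`, `FOR_LABELS`, and `AGAINST_LABELS` are assumptions for illustration; the README only states that accounts are labeled by the number of texts classified for, against, or neutral.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Grouping follows the description above: supporters support and defend,
# offenders produce offense and defense_against texts.
FOR_LABELS = {"support", "defense"}
AGAINST_LABELS = {"offense", "defense_against"}


def label_accounts(classified: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each account to 'supporter', 'offender', or 'neutral'.

    `classified` yields (account, text_label) pairs. An account's label is
    decided by simple majority of for vs. against texts (an assumed rule);
    ties and accounts with neither kind of text are 'neutral'.
    """
    tallies = defaultdict(Counter)
    for account, label in classified:
        if label in FOR_LABELS:
            tallies[account]["for"] += 1
        elif label in AGAINST_LABELS:
            tallies[account]["against"] += 1
        else:
            tallies[account]["neutral"] += 1

    result = {}
    for account, tally in tallies.items():
        if tally["for"] > tally["against"]:
            result[account] = "supporter"
        elif tally["against"] > tally["for"]:
            result[account] = "offender"
        else:
            result[account] = "neutral"
    return result
```

A threshold on the minimum number of classified texts per account could be added so that a single tweet does not decide an account's label.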

Monitoring is easier than responding with NLG; hence, 'like-bots' are easier to create, and there are many of them. They sometimes make classification mistakes, which sets a preliminary precedent for gamification analysis and reverse engineering of them.

- This falls under Natural Language Understanding, which goes deeper than NLP and uses Artificial Intelligence training on a GPU
A further development would be Multi-Agent Modeling and creating Simulations with 'Action-Trees' of perpetrators and victim, supporters and offenders.

Contributors

christinataft, pe-mn, ss-fs-58, yashinh
