This project analyzes the recent positions of the leading players in Artificial Intelligence (AI): China, Europe, and the United States. The analysis extracts insights from the provided documents, which are grouped into "Main sources" and "Additional sources."
- Python: Programming language used for analysis.
- Libraries:
  - pandas
  - nltk
  - gensim
  - wordcloud
  - networkx
  - matplotlib
- `Main_sources/`: Folder containing the primary documents for mandatory analysis.
- `Additional_sources/`: Folder for optional documents, including "AI_EUvsUS.pdf."
- Clean the text by converting it to lowercase, removing stopwords, and keeping only alphanumeric tokens.
- Generate bi-grams and tri-grams to capture meaningful phrases in the text.
- Create word clouds to visually represent the most frequent words in the text.
- Calculate various statistics, including the number of words, unique words, and entropy of the text.
- Create a network graph to visualize relationships between words and derive insights.
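The cleaning and statistics steps above can be sketched as follows. This is an illustrative, dependency-free version: the stopword set is a small inline placeholder (the project itself uses nltk's English stopword corpus), and `clean_text` / `text_stats` are hypothetical helper names, not functions from the repository.

```python
import math
import re
from collections import Counter

# A few illustrative stopwords; the project uses nltk's full English list.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "on", "for"}

def clean_text(text):
    """Lowercase the text, keep alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def text_stats(tokens):
    """Word count, unique-word count, and Shannon entropy (bits) of the token distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {"words": total, "unique_words": len(counts), "entropy": entropy}

tokens = clean_text("The race for AI leadership: China, Europe, and the US invest in AI.")
print(text_stats(tokens))  # e.g. {'words': 8, 'unique_words': 7, 'entropy': 2.75}
```

The entropy here is the usual Shannon measure over the word-frequency distribution: higher values indicate a more varied vocabulary.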
- Generate bi-grams and tri-grams.
- Perform topic detection and generate word clouds.
- Calculate statistics and visualize networks.
- Write findings in the report file.
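A minimal way to generate the bi-grams and tri-grams used in the workflow is a zip-based sliding window; `nltk.util.ngrams` provides the same functionality via the library, but this dependency-free sketch makes the mechanics explicit:

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return list(zip(*(tokens[i:] for i in range(n))))

# Toy token list standing in for a cleaned document.
tokens = ["artificial", "intelligence", "strategy", "artificial", "intelligence", "policy"]
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# Most frequent bi-gram in the toy list:
print(Counter(bigrams).most_common(1))  # [(('artificial', 'intelligence'), 2)]
```

Counting the resulting tuples with `Counter` is then enough to surface the most meaningful phrases.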
- Calculate and print top words and bigrams for China, Europe, and the US.
- Load and preprocess text data for EU and US.
- Generate word clouds for China, Europe, and the US.
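The per-country top-word and top-bigram step can be sketched like this; the token lists are toy stand-ins for the real preprocessed documents, and `top_items` is an illustrative helper name:

```python
from collections import Counter

def top_items(tokens, n_top=3):
    """Return the top words and top bi-grams for one country's token list."""
    bigrams = list(zip(tokens, tokens[1:]))
    return Counter(tokens).most_common(n_top), Counter(bigrams).most_common(n_top)

# Toy stand-ins for the cleaned China / Europe / US documents.
corpora = {
    "China": ["ai", "development", "plan", "ai", "innovation"],
    "Europe": ["trustworthy", "ai", "regulation", "ai", "ethics"],
    "US": ["ai", "initiative", "research", "ai", "leadership"],
}

for country, tokens in corpora.items():
    words, bigrams = top_items(tokens)
    print(country, words, bigrams)
```

The same counts can feed directly into `wordcloud.WordCloud.generate_from_frequencies` for the word-cloud step.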
- Review the attached report for detailed findings and insights from the analysis of the AI positions of China, Europe, and the United States.