msklc / define-a-statistical-hypothesis-with-imdb-movie-ranking-data
My theoretical hypothesis is: "Citizens' ratings of foreign movies are not influenced by their government's official policy." Because the hypothesis is so broad, it is not easy to collect data for it, so I restricted it to the example of the US and two countries with which it has politically tense relations: Russia and Iran. I used two sets of data from IMDb: "the ratings of US citizens for Iranian and Russian movies" and "the ratings of non-US people for Iranian and Russian movies". The data was scraped (collected) from imdb.com with the Python BeautifulSoup library. The mean, standard deviation (std) and p-value of the data were calculated with the Python NumPy library. Pearson's r-value was also calculated for each (Iranian and Russian) dataset, and the results were visualized with the Python Seaborn library.
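The statistics described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names and the rating values are hypothetical placeholders, and the real scraped data is not reproduced here. NumPy itself has no built-in p-value function, so this sketch assumes a simple permutation test for the significance of Pearson's r; the original analysis may have used a different method.

```python
import numpy as np


def summarize(ratings):
    """Return the mean and sample standard deviation of a sequence of ratings."""
    values = np.asarray(ratings, dtype=float)
    return values.mean(), values.std(ddof=1)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def permutation_p_value(x, y, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for Pearson's r (an assumed method).

    Shuffles y repeatedly and counts how often the shuffled |r| is at
    least as large as the observed |r|.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = abs(pearson_r(x, y))
    count = 0
    for _ in range(n_permutations):
        if abs(pearson_r(x, rng.permutation(y))) >= observed:
            count += 1
    # Add-one correction so the estimate is never exactly zero.
    return (count + 1) / (n_permutations + 1)


# Hypothetical placeholder ratings for the same six movies (NOT real IMDb data).
us_ratings = [7.1, 6.4, 8.0, 5.9, 7.5, 6.8]
non_us_ratings = [7.4, 6.1, 8.3, 6.2, 7.9, 6.5]

us_mean, us_std = summarize(us_ratings)
non_us_mean, non_us_std = summarize(non_us_ratings)
r = pearson_r(us_ratings, non_us_ratings)
p = permutation_p_value(us_ratings, non_us_ratings, n_permutations=2_000)

print(f"US:     mean={us_mean:.2f}, std={us_std:.2f}")
print(f"Non-US: mean={non_us_mean:.2f}, std={non_us_std:.2f}")
print(f"Pearson r={r:.3f}, permutation p-value={p:.4f}")
```

In this sketch, a high r between the US and non-US ratings of the same movies would be consistent with the hypothesis that the two groups rate the movies similarly. The p-value indicates how often a correlation at least that strong would appear if the two sets of ratings were unrelated.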