theeconomist / big-mac-data

Data and methodology for the Big Mac index

Home Page: https://www.economist.com/bigmac

License: MIT License

Jupyter Notebook 99.18% R 0.82%

big-mac-data's People

futuraprime macrofinance markedmondson1234 mcgeedata trateotu professorhurst bilguis92 kj14393 jlucab gemmahendy jim7822 michaelchirico vitoraliaga marie-ella ianmadlenya leannewagner jorgepaddle enterstudio chandr123 lavenderbadger ateeqahmad thegrigorian romaingbr hugotave snowdj conorcussell betatim dthboyd marjukahmad jamesmyatt hrbrmstr fhoces metral pbt001 noorbrody fdoperezi stepancheg cryptto jerrywho ulriksartipy robhayward gridl joaopppobr johnaclouse drotskyzim 3dan3 dalton-0x0 fungyip dnzengou filiperigueiro mmdelc ianhongruzhang eszterrp vhcg77 gmontanari ggoral ft9738962 vukosim laixn spswanz smarshallplanit victorcabpo javierdligalvan neochaotic ivan2705 afrohacker juliandres1991 migueldaguirre rafadelamorena luvluvluvluv fabio2234 tamaracursopython franov rubensampe frangb90 vinspa mindeznal wiektor opineda fethullah-ertugrul dtatis gwd999 jbellidol rgpeart ywatanabe55 ltellomiller mcartu htnani cesargongui nuriamartinsanz gongyg generationwhy patesc derjuani ravinandankr92 aliclare chaves1234 vtchoo wanjihia1 ktaranov

big-mac-data's Issues

Finland is missing from big-mac-full-index.csv

I do see Finland in your source data, however (maybe data for the latest year is missing, or something like that?).

Data via Quilt

Quilt provides a way to treat data like code packages. It would be good to hook this data into that service. Then someone who wants to use the data can just write:

$ quilt install theeconomist/big_mac_index
$ python
>>> from quilt.data.theeconomist import big_mac_index

I think Lebanon July 2022 is wrong

Hi folks

In the raw data, I think the value of dollar_ex for Lebanon in 2022 is wrong: it is down as 1512.2, while in the outward-facing data it's 25600. This gives the below figure without adjusting.

cheers
Mark
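To make the size of the discrepancy concrete, here is a minimal Python sketch. It assumes dollar_ex is quoted as local currency units per US dollar, so the dollar price is the local price divided by dollar_ex. The function name and the local price are hypothetical placeholders, not values from the dataset; only the two rates come from this issue.

```python
def dollar_price(local_price: float, dollar_ex: float) -> float:
    """Convert a local-currency price to US dollars.

    Assumes dollar_ex is quoted as local currency units per US dollar.
    """
    if dollar_ex <= 0:
        raise ValueError("dollar_ex must be positive")
    return local_price / dollar_ex


# Hypothetical local price, for illustration only (not taken from the dataset).
local_price = 100_000.0

raw_rate = 1512.2         # dollar_ex in the raw data, per this issue
published_rate = 25600.0  # rate in the outward-facing data, per this issue

price_raw = dollar_price(local_price, raw_rate)
price_published = dollar_price(local_price, published_rate)

# The two implied dollar prices differ by the ratio of the two rates,
# whatever the local price is.
ratio = price_raw / price_published
print(f"raw: ${price_raw:.2f}, published: ${price_published:.2f}, ratio: {ratio:.1f}x")
```

Because the local price cancels out, the ratio depends only on the two exchange rates, which is why a wrong dollar_ex inflates every dollar-denominated column by the same factor.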

Reconciling IMF GDP PC data and the figures in the dataset

Hi and thanks for the package!

You note that GDP PC ($) figures are taken from the IMF but there's no GDP data for most countries (50-70% of the 56 countries) in the dataset pre-July-2011, and consequently no adj_price, etc. I'd be interested in extending the data back to 2000 if possible. So I'd be curious why this wasn't possible with your original IMF source/dataset, since the data does exist.

I'd also be interested in which source you did use as I downloaded the IMF World Economic Outlook Database data from datahub to try and do so myself. This database covers 55 of the 56 countries in the Big Mac dataset. Though observations are on a yearly basis, comparing the GDP PC ($) from this dataset versus the figures from the Big Mac dataset shows consistent variations:

Plot above shows average proportional difference (IMF_GDP-big_mac_GDP)/big_mac_GDP when grouped by year and country where big_mac_GDP is the average of the two GDP figures if there are two in a given year.

Assuming the IMF dataset above is downloaded and saved as values_csv.csv, then the following code reproduces the plot above:

library(tidyverse)

big_mac_data <- readr::read_csv("big-mac-full-index.csv") %>% 
  janitor::clean_names()

IMF_data <- read_csv("values_csv.csv") %>% 
  filter(
    Indicator == "NGDPDPC", # this is the indicator code for GDP pc in $
    Country %in% unique(big_mac_data$iso_a3), 
    Year %in% 2000:2020
  ) 

GDP_data <- big_mac_data %>% 
  mutate(year = lubridate::year(date)) %>% 
  group_by(iso_a3, year) %>% 
  summarize(big_mac_GDP = mean(gdp_dollar, na.rm = TRUE)) %>% # average the (up to two) releases per year
  inner_join(IMF_data, by=c("year"="Year", "iso_a3"="Country")) %>% 
  rename("IMF_GDP" = Value) 

GDP_data %>% 
  filter(!is.na(big_mac_GDP)) %>% 
  mutate(var = IMF_GDP-big_mac_GDP) %>% 
  group_by(year, iso_a3) %>% 
  summarize(prop = mean(var)/big_mac_GDP) %>% 
  ggplot(aes(iso_a3, prop)) + 
  geom_bar(stat = "identity") +
  labs(title="") +
  coord_flip() +
  facet_wrap(~year)
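For readers not using R, the per-country, per-year quantity being plotted, (IMF_GDP - big_mac_GDP) / big_mac_GDP, can be sketched in plain Python. This is only an illustration of the arithmetic: the function names and the sample figures are made up, and it does not read either dataset.

```python
from statistics import mean


def yearly_big_mac_gdp(values):
    """Average the non-missing Big Mac GDP figures for one country-year.

    Returns None if every figure is missing.
    """
    present = [v for v in values if v is not None]
    return mean(present) if present else None


def proportional_difference(imf_gdp, big_mac_gdp):
    """Return (IMF_GDP - big_mac_GDP) / big_mac_GDP."""
    return (imf_gdp - big_mac_gdp) / big_mac_gdp


# Made-up figures for a single country-year, for illustration only.
big_mac_figures = [40_000.0, 42_000.0]  # two releases in the same year
imf_figure = 45_100.0

avg = yearly_big_mac_gdp(big_mac_figures)
if avg is not None:
    print(proportional_difference(imf_figure, avg))
```

A positive value means the IMF figure is higher than the one in the Big Mac dataset; country-years with no Big Mac GDP figure are skipped, mirroring the filter in the R code.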

brew install R

In the "Install R" section, the suggestion for Mac is brew install R. You may consider switching to brew cask install r-app which supports fast download of pre-compiled packages from CRAN, supposedly works better with RStudio, and has some other advantages.

Lack of information about Venezuela

Hi. The file big-mac 2021-07-01 (xls) has no data for Venezuela.
Thanks

Question about the exchange rate

Hi Developers,

Thanks for creating this amazing dataset!
I have a question about the exchange rate used in the dataset. Do you use the rate as of when the output dataset is generated, or the rate as of when the dataset is pushed to GitHub?

Thanks!

Big Mac Material Shrink/Inflation

It has been asserted that over the years, the Big Mac itself has shrunk, or has gotten bigger[1][2]. I don't know if this is true or not, but:

Does The Economist, having used the BMI for a long time:

- Have any hard or soft primary source data on this?
  - Interviews with employees?
  - Old recipe manuals?
- If so, incorporate any of this into the raw or adjusted data?

Doesn't seem to be updated here or on Quandl

Hi. Has the Mac Index been updated? Thanks.

Syntax error in installation instructions

Small typo: I believe this line in the installation instructions contains an error.

install.packages('tidyverse','data.table')

The correct code would use the c() function to create the vector of package names.

install.packages(c('tidyverse','data.table'))

Provide a machine-readable data schema using data packages

A data package is a lightweight standard to describe tabular data. In its simplest form, it's a datapackage.json file like:

{
  "name": "big-mac-data",
  "description": "Lorem ipsum",
  "resources": [
    {
      "name": "big-mac-full-index",
      "path": "output-data/big-mac-full-index.csv",
      "schema": {
        "fields": [
          {
            "name": "date",
            "type": "date",
            "constraints": {
              "required": true
            }
          },
          // ...
        ]
      }
    }
  ]
}

You can see a full example of a data package at https://github.com/vitorbaptista/birmingham_schools.

Several libraries understand this format. For example, https://github.com/frictionlessdata/goodtables-py can validate the data by running goodtables datapackage.json (this can even be run automatically using https://goodtables.io/), and https://github.com/frictionlessdata/datapackage-py can load the data in Python, automatically validating the values and casting them to their declared types. There are also libraries for R, JavaScript, Ruby and other languages.

I'd be happy to talk more about it, and/or write a datapackage.json and send a PR.

(cc @serahrono @pwalsh)
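To illustrate what a schema like the one above buys you, here is a minimal hand-rolled check in Python using only the standard library. It is a sketch of the idea, not the goodtables or datapackage-py API: the check_rows helper is hypothetical, the descriptor is trimmed to the single date field spelled out in the example, and the CSV rows are made-up stand-ins for the real file.

```python
import csv
import io
from datetime import date

# Descriptor trimmed to the one field spelled out in the example above.
descriptor = {
    "name": "big-mac-data",
    "resources": [
        {
            "name": "big-mac-full-index",
            "path": "output-data/big-mac-full-index.csv",
            "schema": {
                "fields": [
                    {"name": "date", "type": "date", "constraints": {"required": True}},
                ]
            },
        }
    ],
}


def check_rows(rows, schema):
    """Return a list of (row_number, field_name, problem) tuples."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for field in schema["fields"]:
            name = field["name"]
            value = row.get(name, "")
            required = field.get("constraints", {}).get("required", False)
            if value in ("", None):
                if required:
                    problems.append((row_number, name, "missing required value"))
                continue
            if field["type"] == "date":
                try:
                    date.fromisoformat(value)  # expects YYYY-MM-DD
                except ValueError:
                    problems.append((row_number, name, "not an ISO date"))
    return problems


# Tiny in-memory CSV standing in for the real file; the values are made up.
sample = io.StringIO("date,iso_a3\n2020-07-01,AAA\n,BBB\nnot-a-date,CCC\n")
schema = descriptor["resources"][0]["schema"]
problems = check_rows(csv.DictReader(sample), schema)
print(problems)
```

In practice one of the libraries linked above would replace check_rows; the point is only that a declared schema lets missing or malformed values be caught mechanically.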

Question about Change in Methodology (US)

The change in methodology has created a significant (downward) price difference in the U.S. Can the names of the "four major US cities" used in the prior methodology be shared?

Calculation for Lebanon is wrong

According to the conversion data for Lebanon, a Big Mac would cost more than 30 dollars in Lebanon. I think there's something wrong with the dollar conversion rate as well as the local currency.

Dockerize this app

I'm a big fan of Docker, because it provides a universal development environment and thus prevents the infamous "but it works on my machine" problem.

Add a Travis CI badge (or other continuous integration badge)

This provides assurance that everything works as expected.

Where did the original data come from?

This repo is great, I might use some of the data to teach a class.

I have a quick question: where did the original data come from?
In other words, how did you obtain the big-mac-source-data.csv file?

Thanks

Data availability prior to 2000 + Frequency Change

Dear Developers / Maintainers,
How are you today?

I'm writing to check whether data before 2000 is available. According to Wikipedia, this index was created in 1986, but the data in this repo appears to start in 2000.

Also, there seems to be a frequency change from annual releases in 2000~2005 to semi-annual releases from 2006 onward. How often is the data refreshed?

Thank you and have a wonderful day!
All the best,
Kathy Gao
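One way to check the release frequency directly from the data is to count the distinct release dates per calendar year. The sketch below is a minimal Python illustration: the releases_per_year helper is hypothetical and the sample dates are placeholders, not actual release dates; in practice you would pass in the date column of big-mac-full-index.csv.

```python
from collections import Counter
from datetime import date


def releases_per_year(iso_dates):
    """Count distinct release dates in each calendar year."""
    distinct = {date.fromisoformat(d) for d in iso_dates}
    return dict(sorted(Counter(d.year for d in distinct).items()))


# Placeholder dates for illustration only. The real column has one entry
# per country per release, so duplicate dates are expected.
sample_dates = ["2004-05-01", "2004-05-01", "2006-01-01", "2006-05-01"]
print(releases_per_year(sample_dates))
```

A year with a count of 1 corresponds to an annual release, and a count of 2 to semi-annual releases.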

Jan 2023 updated data

Is there a plan and/or timeline for updating the repository with the January 2023 data?

MXN dollar_ex

If I understand the meaning of the fields correctly, the value of dollar_ex for the row MXN (Mexico) is wrong. It has been around 20 pesos per dollar for a long time. Regards, Luis

git2r missing

On macOS Mojave (18A391), the devtools R package has issues because libgit2 is missing.

Can be solved with

brew install libgit2

EUZ adjusted price enhancement

Hi,

Are Big Mac prices really the same across the entire euro zone? If so, shouldn't there be a difference between a Big Mac in Germany and one in Slovenia or Greece, for example?

I would assume that based on the GDP per capita there should be a difference that could be calculated for EUR countries.

Is this something you could consider for a future update?

Thanks,
Florian

Data Update 2023

Hi, when will the data be updated for the July 2023 data release please?

Turkey January-23

I think there might be a lag in prices for January 2023 in Turkey. Right now a single Big Mac costs 88 TL. Below you can find the image.