theeconomist / big-mac-data
Data and methodology for the Big Mac index
Home Page: https://www.economist.com/bigmac
License: MIT License
I see Finland in your source data, however (maybe missing data for the latest year or something like that?).
Quilt provides a way to treat data like code packages. It would be good to hook this data into that service. Then someone who wants to use the data can just write:
$ quilt install theeconomist/big_mac_index
$ python
>>> from quilt.data.theeconomist import big_mac_index
Hi and thanks for the package!
You note that GDP PC ($) figures are taken from the IMF, but there's no GDP data for most countries (50-70% of the 56 countries) in the dataset pre-July-2011, and consequently no adj_price, etc. I'd be interested in extending the data back to 2000 if possible. So I'd be curious why this wasn't possible with your original IMF source/dataset, since the data does exist.
I'd also be interested in which source you did use as I downloaded the IMF World Economic Outlook Database data from datahub to try and do so myself. This database covers 55 of the 56 countries in the Big Mac dataset. Though observations are on a yearly basis, comparing the GDP PC ($) from this dataset versus the figures from the Big Mac dataset shows consistent variations:
The plot above shows the average proportional difference (IMF_GDP-big_mac_GDP)/big_mac_GDP, grouped by year and country, where big_mac_GDP is the average of the two GDP figures if there are two in a given year.
Assuming the IMF dataset above is downloaded and saved as values_csv.csv, the following code reproduces the plot above:
library(tidyverse)

big_mac_data <- readr::read_csv("big-mac-full-index.csv") %>%
  janitor::clean_names()

IMF_data <- read_csv("values_csv.csv") %>%
  filter(
    Indicator == "NGDPDPC", # this is the indicator code for GDP pc in $
    Country %in% unique(big_mac_data$iso_a3),
    Year %in% 2000:2020
  )

# one Big Mac GDP figure per country and year, joined to the IMF figure
GDP_data <- big_mac_data %>%
  mutate(year = lubridate::year(date)) %>%
  group_by(iso_a3, year) %>%
  summarize(big_mac_GDP = mean(gdp_dollar, na.rm = TRUE)) %>%
  inner_join(IMF_data, by = c("year" = "Year", "iso_a3" = "Country")) %>%
  rename("IMF_GDP" = Value)

# proportional difference per country, one panel per year
GDP_data %>%
  filter(!is.na(big_mac_GDP)) %>%
  mutate(var = IMF_GDP - big_mac_GDP) %>%
  group_by(year, iso_a3) %>%
  summarize(prop = mean(var) / big_mac_GDP) %>%
  ggplot(aes(iso_a3, prop)) +
  geom_bar(stat = "identity") +
  labs(title = "") +
  coord_flip() +
  facet_wrap(~year)
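For readers who want to check the quantity being plotted without the tidyverse, here is a minimal Python sketch of the proportional difference defined above. The function name and the numbers are hypothetical and only illustrate the arithmetic; they are not taken from either dataset.

```python
from statistics import mean


def proportional_difference(imf_gdp, big_mac_gdps):
    """(IMF_GDP - big_mac_GDP) / big_mac_GDP for one country and year.

    big_mac_gdps holds the one or two Big Mac GDP figures for that year;
    missing figures are passed as None and ignored, like na.rm = TRUE.
    """
    values = [v for v in big_mac_gdps if v is not None]
    if not values:
        return None  # no Big Mac GDP figure at all for this year
    big_mac_gdp = mean(values)
    return (imf_gdp - big_mac_gdp) / big_mac_gdp


# Hypothetical figures: two Big Mac observations in one year, one in another.
two_obs = proportional_difference(11000.0, [9000.0, 11000.0])
one_obs = proportional_difference(11000.0, [10000.0, None])
no_obs = proportional_difference(11000.0, [None, None])
print(two_obs, one_obs, no_obs)
```

A value of 0.1 here would mean the IMF figure is 10% above the averaged Big Mac figure for that country and year.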
In the "Install R" section, the suggestion for Mac is brew install R. You may consider switching to brew cask install r-app, which supports fast download of pre-compiled packages from CRAN, supposedly works better with RStudio, and has some other advantages.
Hi. The file big-mac 2021-07-01 (xls) has no Venezuela data.
Thanks
Hi Developers,
Thanks for creating this amazing dataset!
I have a question about the exchange rate used in the dataset. Do you use the rate at the time the output dataset is generated, or the rate at the time the dataset is pushed to GitHub?
Thanks!
It has been asserted that over the years, the Big Mac itself has shrunk, or has gotten bigger[1][2]. I don't know if this is true or not, but:
Does The Economist, having used the BMI for a long time:
Have any hard or soft primary source data on this?
If so, incorporate any of this into the raw or adjusted data?
Hi. Has the Mac Index been updated? Thanks.
Small typo: I believe this line in the installation instructions contains an error.
install.packages('tidyverse','data.table')
The correct code would use the c() function to create the vector of package names.
install.packages(c('tidyverse','data.table'))
A data package is a lightweight standard to describe tabular data. In its simplest form, it's a datapackage.json file like:
{
  "name": "big-mac-data",
  "description": "Lorem ipsum",
  "resources": [
    {
      "name": "big-mac-full-index",
      "path": "output-data/big-mac-full-index.csv",
      "schema": {
        "fields": [
          {
            "name": "date",
            "type": "date",
            "constraints": {
              "required": true
            }
          },
          // ...
        ]
      }
    }
  ]
}
You can see a full example of a data package at https://github.com/vitorbaptista/birmingham_schools.
There are multiple libraries that understand this format. For example, https://github.com/frictionlessdata/goodtables-py allows the data to be validated by running goodtables datapackage.json (this can even be run automatically using https://goodtables.io/), and https://github.com/frictionlessdata/datapackage-py allows loading the data in Python (automatically validating and casting the data to their specific types). There are also libraries for R, JavaScript, Ruby and other languages.
I'd be happy to talk more about it, and/or write a datapackage.json and send a PR.
(cc @serahrono @pwalsh)
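To give a feel for what such a descriptor buys you, here is a minimal sketch that uses only the Python standard library, not goodtables or datapackage-py, to check rows against the "required" and "date" rules from a trimmed copy of the descriptor above. The check_rows helper and the sample rows are hypothetical and made up for illustration; the real libraries cover far more of the standard.

```python
import csv
import io
import json
from datetime import date

# Trimmed, hypothetical descriptor modelled on the example above.
DESCRIPTOR = json.loads("""
{
  "name": "big-mac-data",
  "resources": [
    {
      "name": "big-mac-full-index",
      "path": "output-data/big-mac-full-index.csv",
      "schema": {
        "fields": [
          {"name": "date", "type": "date", "constraints": {"required": true}}
        ]
      }
    }
  ]
}
""")


def check_rows(fields, rows):
    """Return (row_number, field_name, problem) tuples for rows that break the schema."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for field in fields:
            name = field["name"]
            value = (row.get(name) or "").strip()
            required = field.get("constraints", {}).get("required", False)
            if not value:
                if required:
                    problems.append((number, name, "missing required value"))
                continue
            if field.get("type") == "date":
                try:
                    date.fromisoformat(value)  # expects YYYY-MM-DD
                except ValueError:
                    problems.append((number, name, "not an ISO date"))
    return problems


# Made-up sample rows, not taken from the real dataset.
SAMPLE_CSV = "date,iso_a3\n2020-01-14,ARG\n,AUS\n14/01/2020,BRA\n"
fields = DESCRIPTOR["resources"][0]["schema"]["fields"]
problems = check_rows(fields, csv.DictReader(io.StringIO(SAMPLE_CSV)))
for problem in problems:
    print(problem)
```

With the sample rows above, the second row is flagged for a missing date and the third for a date that is not in ISO format.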
The change in methodology has created a significant (downward) price difference in the U.S. Can the names of the "four major US cities" used in the prior methodology be shared?
According to the conversion data for Lebanon, a Big Mac would cost more than 30 dollars there. I think there's something wrong with the dollar conversion rate as well as the local currency.
I'm a big fan of Docker, because it provides a universal development environment and thus prevents the infamous "but it works on my machine" problem.
This provides assurance that everything works as expected.
This repo is great, I might use some of the data to teach a class.
I have a quick question: where did the original data come from?
In other words, how did you obtain the big-mac-source-data.csv file?
Thanks
Dear Developers / Maintainers,
How are you today?
I'm writing to check whether data before 2000 is available. According to Wikipedia, the index was created in 1986, but the data in this repo appears to start in 2000.
Also, the frequency seems to change from annual (2000-2005) to semi-annual from 2006 onwards. How often is the data refreshed?
Thank you and have a wonderful day!
All the best,
Kathy Gao
Is there a plan and/or timeline to update the data in the repo with the Jan 2023 release?
If I understand the meaning of the fields correctly, the dollar_ex value for the MXN (Mexico) row is wrong. The rate has been around 20 pesos per dollar for a long time. Regards, Luis
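A quick way to sanity-check a suspicious dollar_ex value is to redo the conversion by hand. The sketch below assumes dollar_ex means local currency units per US dollar, so that the dollar price is local_price / dollar_ex and the raw index is the dollar price relative to the US price, minus one. The function names and all numbers are hypothetical round figures, not values from the dataset.

```python
def dollar_price(local_price, dollar_ex):
    """Convert a local price to US dollars, assuming dollar_ex is local units per dollar."""
    if dollar_ex <= 0:
        raise ValueError("dollar_ex must be positive")
    return local_price / dollar_ex


def raw_index(price_usd, base_price_usd):
    """Over- or under-valuation against the base currency (negative means undervalued)."""
    return price_usd / base_price_usd - 1


# Hypothetical figures: a 50-peso burger at 20 pesos per dollar, against a 5-dollar US price.
mx_price = dollar_price(local_price=50.0, dollar_ex=20.0)
us_price = 5.0
mx_index = raw_index(mx_price, us_price)
print(mx_price, mx_index)

# The same burger with a mis-scaled rate (2 instead of 20) looks ten times dearer.
suspicious_price = dollar_price(local_price=50.0, dollar_ex=2.0)
print(suspicious_price)
```

If a recomputed dollar price differs from the published one by a round factor such as 10 or 1000, a mis-scaled exchange rate or a currency redenomination is a likely place to look.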
On macOS Mojave (18A391), the devtools R package has issues because libgit2 is missing.
This can be solved with
brew install libgit2
Hi,
Are Big Mac prices really the same across the entire euro zone? If so, shouldn't there be a difference between a Big Mac in Germany and one in Slovenia or Greece, for example?
I would assume that based on the GDP per capita there should be a difference that could be calculated for EUR countries.
Is this something you could consider for a future update?
Thanks,
Florian
Hi, when will the data be updated for the July 2023 data release please?