This repository creates the weather graph below (inspired by Edward Tufte) using R's ggplot2 package. Updated data is pulled directly from NOAA's servers in CSV format. The entire process is automated using GitHub Actions.
This repo may be useful in three ways.
- replicating or adapting this graph for a different weather station
- learning more about data viz with ggplot2
- learning more about GitHub Actions with R
Full disclosure: Original author is John Johnson
NOAA provides daily data for weather stations in the Global Historical Climatology Network (GHCN).
Citation:
Menne, M.J., I. Durre, B. Korzeniewski, S. McNeal, K. Thomas, X. Yin, S. Anthony, R. Ray, R.S. Vose, B.E. Gleason, and T.G. Houston, 2012: Global Historical Climatology Network - Daily (GHCN-Daily), Version 3.26. NOAA National Climatic Data Center. http://doi.org/10.7289/V5D21VHZ [February 21, 2022].
For every weather station in the daily GHCN, NOAA maintains a file with the station's entire daily history. Each day, they append a new record. In my experience, new records typically lag by a few days.
Each weather station is assigned a unique identifier (ID). The full list of station names, coordinates, and unique IDs is available here.
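As a rough illustration of pulling one station's file by its ID, here is a minimal sketch. The base URL is an assumption on my part (NOAA's public per-station CSV directory), and `ghcn_station_url()` and `read_ghcn_station()` are hypothetical helper names, not functions from this repo.

```r
# Assumed base URL for NOAA's per-station GHCN-Daily CSV files
# (an assumption, not taken from this repository's scripts).
ghcn_base_url <- "https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/access"

# Build the download URL for a single station ID, e.g. "USW00023257".
ghcn_station_url <- function(station_id) {
  stopifnot(is.character(station_id), length(station_id) == 1, nzchar(station_id))
  paste0(ghcn_base_url, "/", station_id, ".csv")
}

# Hypothetical helper: download and read one station's full daily history.
# Requires network access, so it is defined here but not called.
read_ghcn_station <- function(station_id) {
  read.csv(ghcn_station_url(station_id), stringsAsFactors = FALSE)
}

ghcn_station_url("USW00023257")
```

Calling `read_ghcn_station("USW00023257")` would then fetch the whole history in one request, which is consistent with NOAA keeping one file per station.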
I use a merged file (data/GHCN_USC00045532_USW00023257.csv) that contains daily weather reports from Merced Regional Airport stations (GHCN:USC00045532 & GHCN:USW00023257). The start date is June 1, 1899.
GHCN:USC00045532
- Historical data starting from June 1, 1899
GHCN:USW00023257
- More recent data starting from August 1, 1998
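One way to combine two stations with overlapping coverage is sketched below. This is not necessarily how the merged file in this repo is built: the rule of preferring the newer station on overlapping dates is an assumption, the column names (`STATION`, `DATE`, `TMAX`) are assumed, and the values are made up.

```r
# Tiny made-up data frames standing in for the two station files.
historical <- data.frame(
  STATION = "USC00045532",
  DATE = as.Date(c("1998-07-30", "1998-07-31", "1998-08-01")),
  TMAX = c(372, 367, 361)
)
recent <- data.frame(
  STATION = "USW00023257",
  DATE = as.Date(c("1998-08-01", "1998-08-02")),
  TMAX = c(356, 350)
)

# Hypothetical helper: keep every row from the newer station, and fill in
# the dates it lacks with rows from the older station.
merge_stations <- function(older, newer) {
  older_only <- older[!(older$DATE %in% newer$DATE), ]
  combined <- rbind(older_only, newer)
  combined <- combined[order(combined$DATE), ]
  rownames(combined) <- NULL
  combined
}

merged <- merge_stations(historical, recent)
print(merged)
```

The result has one row per date, so downstream code never has to decide between two readings for the same day.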
Refer to R/Retrieve_GHCN_USC00045532_USW00023257.R for a demonstration of downloading and processing this dataset. See NOAA's documentation for detailed descriptions of the original variable definitions.
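As an example of the kind of processing involved, the sketch below parses dates and converts daily highs to Fahrenheit. It assumes `TMAX` is stored in tenths of degrees Celsius, which is the convention in the GHCN-Daily documentation; check the units of your own download before reusing it. The column names and values are made up, and `tenths_c_to_f()` is a hypothetical helper.

```r
# Convert tenths of degrees Celsius to degrees Fahrenheit.
# Assumes the input follows the GHCN-Daily convention for TMAX/TMIN.
tenths_c_to_f <- function(x) {
  (x / 10) * 9 / 5 + 32
}

# Made-up raw records; NA marks a missing observation.
raw <- data.frame(
  DATE = c("2022-02-19", "2022-02-20", "2022-02-21"),
  TMAX = c(200, NA, 150),
  stringsAsFactors = FALSE
)

# Parse the date strings and convert units; NA values pass through unchanged.
processed <- data.frame(
  date = as.Date(raw$DATE),
  tmax_f = tenths_c_to_f(raw$TMAX)
)
print(processed)
```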
The image graphs/DailyHighTemp_USC00045532_USW00023257.png is created by R/BuidlDailyHigh.R. See the README in /graphs for a step-by-step tutorial.
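For a feel of the general approach before reading the tutorial, here is a minimal sketch of a Tufte-style daily-high chart. It is not the script from this repo: the data is synthetic, the column names are assumed, and the choice of layers (a historical range per day of year with the current year drawn on top) is my guess at a reasonable starting point. The plot is only built if ggplot2 is installed.

```r
set.seed(1)  # synthetic data, for illustration only
dates <- seq(as.Date("2019-01-01"), as.Date("2021-12-31"), by = "day")
doy <- as.integer(format(dates, "%j"))
daily <- data.frame(
  date = dates,
  doy = doy,
  year = as.integer(format(dates, "%Y")),
  tmax_f = 75 - 20 * cos(2 * pi * doy / 365) + rnorm(length(dates), sd = 5)
)

past <- daily[daily$year < 2021, ]
current <- daily[daily$year == 2021, ]

# For each day of the year: lowest, average, and highest past daily high.
normals <- aggregate(
  tmax_f ~ doy, data = past,
  FUN = function(x) c(low = min(x), avg = mean(x), high = max(x))
)
normals <- data.frame(doy = normals$doy, normals$tmax_f)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    # Historical range of daily highs as one thin vertical bar per day
    geom_linerange(data = normals,
                   aes(x = doy, ymin = low, ymax = high),
                   colour = "wheat2") +
    # Current year's daily highs drawn over the historical range
    geom_line(data = current, aes(x = doy, y = tmax_f)) +
    labs(x = "Day of year", y = "Daily high (degrees F)") +
    theme_minimal()
  # ggsave("daily_high_sketch.png", p, width = 8, height = 4)  # hypothetical file name
}
```

Separating the summary step (`normals`) from the plotting step keeps the ggplot2 code short and makes the summary easy to check on its own.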
At a high level, the automated workflow:
(1) runs the script to retrieve the updated data
(2) commits the updated dataset to the repository
(3) runs the script to build the graph
(4) commits the graph to the repository
All this takes less than 1 minute per run.
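The four steps above can be approximated locally with a small R driver, sketched below. This is not the workflow file itself: the file paths come from this README, but the commit messages, the function names, and the dry-run switch are all made up for illustration.

```r
# Ordered commands mirroring steps (1) to (4).
# Paths are from this README; commit messages are hypothetical.
workflow_steps <- function() {
  list(
    c("Rscript", "R/Retrieve_GHCN_USC00045532_USW00023257.R"),
    c("git", "add", "data/GHCN_USC00045532_USW00023257.csv"),
    c("git", "commit", "-m", "Update data"),
    c("Rscript", "R/BuidlDailyHigh.R"),
    c("git", "add", "graphs/DailyHighTemp_USC00045532_USW00023257.png"),
    c("git", "commit", "-m", "Update graph")
  )
}

# Run each step in order. With dry_run = TRUE the commands are only printed.
run_workflow <- function(dry_run = TRUE) {
  for (step in workflow_steps()) {
    cmd <- paste(step, collapse = " ")
    if (dry_run) {
      message(cmd)
    } else {
      status <- system2(step[1], args = shQuote(step[-1]))
      if (status != 0) stop("Step failed: ", cmd)
    }
  }
  invisible(TRUE)
}

run_workflow(dry_run = TRUE)
```

Note that `git commit` exits with a non-zero status when nothing has changed, so a real run of this sketch would stop on days when NOAA has not published a new record.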
See the README in /.github/workflows for a line-by-line discussion of the workflow file.