

mikejohnson51 commented on May 29, 2024

Hi Meaghan,

Thanks for the question and for sharing your location data. From what you shared, you are looking to extract data for 1,012 stations, 5 parameters, 10 GCMs, and 107 days per year over 79 years: roughly 427.7 million records (1,012 × 5 × 10 × 107 × 79 = 427,721,800)! This is definitely a TON of data.

What is fortunate about your locations is that they are fairly dense, so I suggest using areal extraction rather than point extraction to get what you want. This allows for more efficient climateR calls, and the point values can be extracted as a post-processing step from the returned raster stacks.

To get a bounding area, the AOI package can be called on your points:

library(climateR)
library(sf)
library(raster)

load('./points.RData') # loads your point data called 'samp'

samp

class       : SpatialPointsDataFrame 
features    : 1012 
extent      : -83.84168, -80.81692, 34.56804, 37.29032  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
variables   : 11

p = st_as_sf(AOI::getBoundingBox(samp)) # convert to an sf bounding box

plot(samp)
plot(p, add = T)

[Plot: sample points with the bounding box overlaid]
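If you'd rather avoid the AOI dependency, the same bounding-box step can be sketched with sf alone: `st_bbox()` computes the extent and `st_as_sfc()` turns it into a polygon geometry. This is a minimal sketch using two made-up stand-in points; with your real data you would use `st_as_sf(samp)` instead.

```r
library(sf)

# Hypothetical stand-in points; with real data: samp_sf <- st_as_sf(samp)
samp_sf <- st_as_sf(data.frame(x = c(-83.8, -80.8), y = c(34.6, 37.3)),
                    coords = c("x", "y"), crs = 4326)

# st_bbox() -> st_as_sfc() gives the bounding rectangle as an sf geometry
p_sf <- st_as_sfc(st_bbox(samp_sf))
st_bbox(p_sf)
```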

With this extent, the example below gets summer-time data for 3 years, 10 GCMs, and 5 variables for all of your locations:

# Define the 10 GCMs
gcms = c("bcc-csm1-1-m", "CanESM2", "CNRM-CM5", "CSIRO-Mk3-6-0", "GFDL-ESM2M",
         "HadGEM2-ES365", "HadGEM2-CC365", "IPSL-CM5A-MR", "MIROC5", "MIROC-ESM-CHEM")

# Define the 5 parameters
params = c('tmax', 'tmin', 'rhmax', 'rhmin', 'prcp')

# Create a vector of GCM/parameter pairs that match climateR outputs
g.fin = levels(interaction(gcms,params,sep="_"))

# Create an empty list equal in size to the GCM/parameter pairs. We will fill this in a loop:
mylist <- setNames(vector("list", length = NROW(g.fin)), g.fin)
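As a toy illustration (separate from the workflow, with two made-up GCMs), `levels(interaction(...))` enumerates every GCM/parameter combination, with the first factor varying fastest within each sorted level of the second; this is what produces the `GCM_param` names used to index the climateR output:

```r
# Toy example: two GCMs x two parameters -> four "GCM_param" names
g <- c("CanESM2", "MIROC5")
p <- c("tmax", "prcp")
combos <- levels(interaction(g, p, sep = "_"))
combos
# "CanESM2_prcp" "MIROC5_prcp" "CanESM2_tmax" "MIROC5_tmax"
```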

system.time({
  
for(y in 2020:2022){  
  yy = getMACA(AOI = p,
          model = gcms,
          param = params,
          startDate = paste0(y, "-05-01"),
          endDate   = paste0(y, "-08-15")
  )
  
    for(i in 1:length(mylist)){
      maca.index =  which(names(yy) == g.fin[i])
      list.index =  which(names(mylist) == g.fin[i])
      
      # Extract point-level values and transpose the matrix so that
      # dates are rows and points are columns
      xt = t(raster::extract(yy[[maca.index]], samp))
      
      # Convert to data.frame and add a date column parsed from the
      # rasterStack layer names (e.g. "X2020.05.01" -> "2020-05-01")
      r = data.frame(gsub("X", "", gsub("\\.", "-", names(yy[[maca.index]]))), xt, stringsAsFactors = F)
      
      # Set column names to the point ids in your original dataset
      colnames(r) = c('dates', paste0("X.", samp$pt_ids))
      rownames(r) = NULL
      
      # Add (bind) the data to the appropriate element of 'mylist'
      mylist[[list.index]]  <- rbind(mylist[[list.index]], r)
    }
  
  message("Finished processing year ", y)

}
})

   user  system elapsed 
 75.564  22.670 152.149 
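The layer-name cleanup used inside the loop can be sanity-checked in isolation: raster layer names come back in the form `X2020.05.01`, and the nested `gsub()` calls turn them into parseable date strings:

```r
# Toy check of the name-to-date conversion used in the loop above
nm <- c("X2020.05.01", "X2020.05.02")
dates <- gsub("X", "", gsub("\\.", "-", nm))
dates
# "2020-05-01" "2020-05-02"
as.Date(dates) # now a proper Date vector, ready for plotting
```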

Each year takes about 50 seconds to process and should scale roughly linearly (on the order of 100,000 values per second), so your whole data request should finish in roughly 1 to 1.5 hours.

The result of this loop is a list of data.frames as shown below:

str(mylist, max.level = 1)
List of 50
 $ bcc-csm1-1-m_prcp   :'data.frame':	321 obs. of  1013 variables:
 $ CanESM2_prcp        :'data.frame':	321 obs. of  1013 variables:
 $ CNRM-CM5_prcp       :'data.frame':	321 obs. of  1013 variables:
 $ CSIRO-Mk3-6-0_prcp  :'data.frame':	321 obs. of  1013 variables:
 $ GFDL-ESM2M_prcp     :'data.frame':	321 obs. of  1013 variables:
 $ HadGEM2-CC365_prcp  :'data.frame':	321 obs. of  1013 variables:
 $ HadGEM2-ES365_prcp  :'data.frame':	321 obs. of  1013 variables:
 $ IPSL-CM5A-MR_prcp   :'data.frame':	321 obs. of  1013 variables:
 $ MIROC-ESM-CHEM_prcp :'data.frame':	321 obs. of  1013 variables:
 $ MIROC5_prcp         :'data.frame':	321 obs. of  1013 variables:
 $ bcc-csm1-1-m_rhmax  :'data.frame':	321 obs. of  1013 variables:
 $ CanESM2_rhmax       :'data.frame':	321 obs. of  1013 variables:
 $ CNRM-CM5_rhmax      :'data.frame':	321 obs. of  1013 variables:
 $ CSIRO-Mk3-6-0_rhmax :'data.frame':	321 obs. of  1013 variables:
 $ GFDL-ESM2M_rhmax    :'data.frame':	321 obs. of  1013 variables:
 $ HadGEM2-CC365_rhmax :'data.frame':	321 obs. of  1013 variables:
 $ HadGEM2-ES365_rhmax :'data.frame':	321 obs. of  1013 variables:

...

Each internal data.frame will have a column for the date and values for each point:

str(mylist$CanESM2_prcp, max.level = 2)
'data.frame':	321 obs. of  1013 variables:
 $ dates: chr  "2020-05-01" "2020-05-02" "2020-05-03" "2020-05-04" ...
 $ X.1  : num  5.72 0 0 13.28 1.47 ...
 $ X.2  : num  5.76 0 0 13.85 1.82 ...
 $ X.3  : num  6.04 0 0 13.98 1.99 ...
 $ X.4  : num  6.12 0 0 13.88 1.9 ...
 $ X.5  : num  7.16 0 0 14.28 1.75 ...
 $ X.6  : num  6.21 0 0 13.57 1.82 ...
 $ X.7  : num  6.51 0 0 13.94 2.03 ...

...

From there you can plot the data...

par(mfrow = c(3,1))
plot(x = as.Date(mylist$`bcc-csm1-1-m_prcp`$dates), 
     y = mylist$`bcc-csm1-1-m_prcp`$X.5, pch = 16, col = 'blue',
     main = 'Location 5 PRCP', xlab = "Dates", ylab = "PRCP (mm)")
plot(x = as.Date(mylist$`bcc-csm1-1-m_tmax`$dates), 
     y = mylist$`bcc-csm1-1-m_tmax`$X.5, pch = 16, col = 'red',
     main = 'Location 5 TMAX', xlab = "Dates", ylab = "TMAX (K)")
plot(x = as.Date(mylist$`bcc-csm1-1-m_rhmax`$dates), 
     y = mylist$`bcc-csm1-1-m_rhmax`$X.5, pch = 16, col = 'purple',
     main = 'Location 5 RHMAX', xlab = "Dates", ylab = "RHMAX")

[Plot: three stacked time series for location 5 (PRCP, TMAX, RHMAX)]

... or write each list element to a CSV...

out_path = "./maca_climate_files/"
dir.create(out_path, showWarnings = FALSE) # create the output folder if needed

for(i in 1:length(mylist)){
  write.csv(mylist[[i]], file = paste0(out_path, names(mylist)[i], ".csv"), row.names = FALSE)
}
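If you'd rather work with one table than 50 CSVs, the list can also be stacked into a single long data.frame. A sketch with a hypothetical two-element list standing in for the real `mylist`:

```r
# Hypothetical two-element list standing in for the full 'mylist'
mylist <- list(
  CanESM2_prcp = data.frame(dates = c("2020-05-01", "2020-05-02"), X.1 = c(1.5, 0)),
  CanESM2_tmax = data.frame(dates = c("2020-05-01", "2020-05-02"), X.1 = c(290, 291))
)

# Tag each table with its GCM_param name, then row-bind everything
long <- do.call(rbind, lapply(names(mylist),
                              function(n) cbind(gcm_param = n, mylist[[n]])))
str(long)  # 4 rows with columns: gcm_param, dates, X.1
```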

Hope that helps, and thanks for pushing these functions to the limit!

Mike

