
spdep's Introduction

spdep


Spatial Dependence: Weighting Schemes and Statistics

A collection of functions to create spatial weights matrix objects from polygon contiguities, from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial autocorrelation, including global Moran's I and Geary's C proposed by Cliff and Ord (1973, ISBN: 0850860369; 1981, ISBN: 0850860814), the Hubert/Mantel general cross-product statistic, Empirical Bayes estimates and the Assunção/Reis (1999) (https://doi.org/10.1002/(SICI)1097-0258(19990830)18:16%3C2147%3A%3AAID-SIM179%3E3.0.CO%3B2-I) index, Getis/Ord G (Getis and Ord 1992) (https://doi.org/10.1111/j.1538-4632.1992.tb00261.x) and multicoloured join count statistics, APLE (Li et al. 2007) (https://doi.org/10.1111/j.1538-4632.2007.00708.x), local Moran's I (Anselin 1995) (https://doi.org/10.1111/j.1538-4632.1995.tb00338.x) and Getis/Ord G (Ord and Getis 1995) (https://doi.org/10.1111/j.1538-4632.1995.tb00912.x), saddlepoint approximations (Tiefelsdorf 2002) (https://doi.org/10.1111/j.1538-4632.2002.tb01084.x) and exact tests for global and local Moran's I (Bivand et al. 2009) (https://doi.org/10.1016/j.csda.2008.07.021), and LOSH local indicators of spatial heteroscedasticity (Ord and Getis) (https://doi.org/10.1007/s00168-011-0492-y), with further extensions in Bivand (2022) (https://doi.org/10.1111/gean.12319). The implementation of most of the measures is described in Bivand and Wong (2018) (https://doi.org/10.1007/s11749-018-0599-x). Extra measures were contributed by Josiah Parry. Lagrange multiplier tests for spatial dependence in linear models are provided (Anselin et al. 1996) (https://doi.org/10.1016/0166-0462(95)02111-6), as are Rao's score tests for hypothesised spatial Durbin models based on fitted linear models (Koley and Bera 2024) (https://doi.org/10.1080/17421772.2023.2256810).

sfdep (https://cran.r-project.org/package=sfdep) provides a piped interface to spdep.

From spdep and spatialreg versions >= 1.2-1, the model fitting functions previously present in this package are defunct in spdep and may be found in spatialreg.
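As a quick orientation, a minimal sketch of the post-split workflow (assuming both packages are installed; the columbus data and col.gal.nb neighbours ship with spData):

library(spdep)       # neighbours, weights, tests
library(spatialreg)  # lagsarlm(), errorsarlm(), impacts(), ...
data(columbus, package = "spData")
lw <- nb2listw(col.gal.nb)                                          # weights built with spdep
fit <- lagsarlm(CRIME ~ INC + HOVAL, data = columbus, listw = lw)   # fitted with spatialreg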

Default branch now main

Moved from R-Forge

spdep's People

Contributors

angela-li, bastistician, edzer, eliaskrainski, guiblanchet, josiahparry, rsbivand, rwesterh


spdep's Issues

Problems integrating functions such as lagsarlm and errorsarlm with R Markdown

Sorry for my English; I'm Brazilian. I was trying to fit a spatial regression with the functions lagsarlm and errorsarlm. They worked when run interactively, but when I put them in an Rmd file to export to PDF, it doesn't work: knitting just keeps loading and never finishes. I searched for an explanation but found nothing, and I tried the same process on two different computers with the same result. So, if you can fix it, please do.
Att, José Vinícius.

Document how spdep functions are verified

I chatted briefly with @edzer at rstudio::conf and he mentioned the idea of writing tests and doing code coverage for spdep. Would this be something we'd be interested in?

I could take a stab at writing a few tests and at least getting some of the main functions covered. Which ones would those be, in your opinion? Maybe functions in active development could use some tests?
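For illustration, a minimal sketch of the kind of unit test this could start with (testthat is an assumption; the grid construction mirrors an example further down this page):

library(testthat)
library(spdep)
library(sf)

test_that("poly2nb finds rook neighbours on a 2x2 grid", {
  sq <- st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))))
  grid4 <- st_sf(geometry = st_sfc(sq, sq + c(0,1), sq + c(1,0), sq + c(1,1)))
  nb <- poly2nb(grid4, queen = FALSE)
  expect_s3_class(nb, "nb")
  expect_equal(card(nb), rep(2L, 4))  # each cell has exactly two rook neighbours
})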

Confusing warning to install "spatialreg" during install

Hi, I'm seeing this warning pop up when installing spdep on macOS, even though spatialreg is already installed:

** byte-compile and prepare package for lazy loading
Warning in eval(exprs[i], envir) : install the spatialreg package
Calls: <Anonymous> ... tryCatchOne -> doTryCatch -> sys.source -> eval -> eval
** help

spData dependency does not load

As the title says, when trying to load spData it returns the following error. I tried re-installing spData, but then when loading spdep, RStudio freezes and, on opening a new session, it returns the same error.

Loading required package: sp
Loading required package: spData
Error: package or namespace load failed for ‘spData’ in .doLoadActions(where, attach):
 error in load action .__A__.1 for package raster: loadModule(module = "spmod", what = TRUE, env = ns, loadNow = TRUE): Unable to load module "spmod": function 'Rcpp_precious_remove' not provided by package 'Rcpp'
Error: package ‘spData’ could not be loaded

Is there any way around this?
I'm using the following versions on a Lenovo ThinkBook 14 (16 GB RAM, Intel Core i5 from 2020); the session info is as follows:
spData: 0.3.10
spdep: 1.1-8

R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Thank you so much.
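A possible workaround, inferred from the error text rather than confirmed: 'Rcpp_precious_remove' was added in Rcpp 1.0.7, so updating Rcpp and reinstalling the package whose load action fails may resolve this.

install.packages("Rcpp")    # 'Rcpp_precious_remove' first appeared in Rcpp 1.0.7
install.packages("raster")  # the failing load action comes from raster's C++ module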

Export to sf

It would be nice if spdep also exported sf objects (not only sp), for example from the nb2lines() and listw2lines() functions.
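In the meantime, a workaround sketch that coerces the sp result (newer spdep versions may also offer an as_sf argument; that is an assumption about the installed version):

library(spdep)
library(sf)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nb <- poly2nb(nc)
cc <- st_coordinates(st_centroid(st_geometry(nc)))
nb_sf <- st_as_sf(nb2lines(nb, coords = cc))  # SpatialLinesDataFrame -> sf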

Update Rnws and/or add vignettes, articles

As mentioned in the website issue (#23), @edzer and I noticed that the SIDS and spatial filtering vignettes were Rnws, which we could change over to Rmds. I suggested maybe the thing to do would be to break them up into articles for the website. @rsbivand even suggested adding a new vignette or two, regarding methods for ML fitting.

Thoughts on what makes sense? I think it could also be useful to add an article/vignette called "Quickstart" walking through performing spatial analysis on one specific example, highlighting a few key spdep functions.

Enhancement: Add depth and grouping function

Would it be possible to add a depth variable used to calculate distances between two points? I research the energy industry (oil and gas) where depth is a key variable in measuring the distances between wells. In addition to depth, I would like to group wells by a variable, in my case formation, similar to the group_by command in dplyr. I've attached a file with a small sample of data.

The code below is what I'm currently using.

coordinates(wells)<- ~LonWGS84+LatWGS84
wellsnb<-neighborsDataFrame(dnearneigh(coordinates(wells), d1 = 0, d2 = .2,longlat=T,row.names = wells$UWI10))

[EXAMPLE.xlsx](https://github.com/r-spatial/spdep/files/2577605/EXAMPLE.

Thanks!

David
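A sketch of one way to get the grouping part with existing tools (it assumes wells is still a plain data frame and that Formation is the grouping column; note that dnearneigh() itself works in 2D only, so depth would need separate treatment, such as the 3D kNN approach sketched under the 3D-distances issue below):

wells_by_fm <- split(wells, wells$Formation)  # group_by-like split per formation
nb_by_fm <- lapply(wells_by_fm, function(w)
  dnearneigh(cbind(w$LonWGS84, w$LatWGS84), d1 = 0, d2 = 0.2,
             longlat = TRUE, row.names = as.character(w$UWI10)))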

Poor performance of MCMC draws in `impacts()`

I'm fitting a spatial Durbin model using lagsarlm() on what seems to me not a terribly large dataset (1,044 observations). impacts() takes 45-50 seconds for 10 MCMC draws, which would be ~42 minutes for 500 draws. Things are even slower with my actual model, which involves 6 or 7 predictors. I'm not sure whether this poor performance is just because MCMC is sometimes slow, or whether it indicates a serious problem somewhere.

MWE:

library(magrittr)
library(sf)
library(spdep)

dataf = readRDS('mwe_data.Rds')

weights_3nn = dataf %>%
    st_centroid() %>%
    st_coordinates() %>%
    knearneigh(k = 3) %>%
    knn2nb() %>%
    nb2listw(style = 'W')

model = lagsarlm(w_use ~ hispanicP, data = dataf, weights_3nn, type = 'Durbin')

# tictoc::tic()
model_impacts = impacts(model, listw = weights_3nn, R = 10)
# tictoc::toc()
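Whether it helps in this particular case is an open question, but the usual speed-up is to precompute traces of powers of W with trW() and pass them to impacts(), which avoids repeated dense solves during the draws (a sketch, assuming the spatialreg-style trW()/impacts() API):

W <- as(weights_3nn, "CsparseMatrix")   # sparse weights matrix from the listw
tr <- trW(W, type = "mult")             # traces of W, W^2, ... used by impacts()
model_impacts <- impacts(model, tr = tr, R = 500)
summary(model_impacts, zstats = TRUE, short = TRUE)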

Dataset

nb should be a list subclass

nb objects are list objects. However, on creation they lose their list class when attributes are set. This means that any function that would normally work on a list does not work on an nb object without manually adding the list class or unclass()-ing the nb object. The latter approach is unwanted because there are no functions for recasting a list to nb.

For example in poly2nb()

class(ans) <- "nb"
should be changed to class(ans) <- c("nb", "list").

Possible issue with the new Durbin= argument

Hello,

About the Durbin= argument, which has recently been modified:
I'm getting an error only when using this argument to subset the RHS variables.
Is it possible to get the error "Input data and neighbourhood list have different dimensions" in the Create_WX step when the number of rows in the spatial data frame and the length of the neighbour list are in fact equal?
Previously it was Durbin = ~ X. What is the most recent formula interface that subsets the RHS variables to be lagged, as mentioned in one of the prior issues?

I don't know if the issue I'm facing is because of the updates on the argument. Looking at the source code, I shouldn't be getting this error. Any thought would be truly appreciated.

Omid

Error when calculating impacts after spBreg_lag

I am trying to calculate impacts after running

spbreg<-spBreg_lag(formula, data=data, listw=wm_reg1, type="Durbin")

But I get the following error message.

Error in impacts.MCMC_sar_g(spbreg, listw = wm_reg1, evalues = evalues, : non-matched coefficient pairs

What does it mean?

Add colnames to localC_calc

The output of localC_perm provides an attribute pseudo-p that has 8 columns. It is unclear which column contains the p-value, as the columns have no names.

I believe they follow the same columns as localmoran_perm(). My guesses as to what these columns represent are below. Once confirmed and corrected, I will make the appropriate pull request.

  1. Mean
  2. Variance
  3. std dev (?)
  4. p-value using pnorm()
  5. Rank probability (?)
  6. folded p-value (?)
  7. Skewness
  8. Kurtosis

localC_p <- function(reps, obs, alternative, nsim) {

Edit: these are provided in the case when conditional = TRUE. I can use that as a reference.

Compilation failed on macOS

macOS 12.1
R 4.1.2
While installing spdep 1.2-1, it produced an error (messages translated from French):
unable to load shared object '/Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so':
dlopen(/Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so, 0x0006): Library not loaded: /usr/local/opt/gdal/lib/libgdal.29.dylib
Referenced from: /Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so
Reason: tried: '/usr/local/opt/gdal/lib/libgdal.29.dylib' (no such file), '/Library/Frameworks/R.framework/Resources/lib/libgdal.29.dylib' (no such file), '/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server/libgdal.29.dylib' (no such file), '/usr/local/Cellar/gdal/3.4.1/lib/libgdal.29.dylib' (no such file), '/Library/Frameworks/R.framework/Resources/lib/libgdal.29.dylib' (no such file), '/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server/

It tried to use libgdal.29.dylib whereas libgdal.30.dylib is present:
√ lib % cd /usr/local/opt/gdal/lib/
√ lib % ls
libgdal.30.dylib libgdal.a libgdal.dylib pkgconfig python3.9
√ lib % ls -al
total 102952
drwxr-xr-x 7 marcgirondot staff 224 6 jan 09:23 .
drwxr-xr-x 10 marcgirondot staff 320 6 jan 08:23 ..
-rw-r--r-- 1 marcgirondot staff 18220456 6 jan 08:23 libgdal.30.dylib
-r--r--r-- 1 marcgirondot staff 34487256 27 déc 21:30 libgdal.a
lrwxr-xr-x 1 marcgirondot staff 16 27 déc 21:30 libgdal.dylib -> libgdal.30.dylib
drwxr-xr-x 3 marcgirondot staff 96 6 jan 08:23 pkgconfig
drwxr-xr-x 3 marcgirondot staff 96 27 déc 21:30 python3.9

If I try to make a symbolic link from libgdal.30.dylib to libgdal.29.dylib, it does not work either:
√ lib % sudo ln -s libgdal.30.dylib libgdal.29.dylib

Error: package or namespace load failed for 'sf' in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so':
dlopen(/Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so, 0x0006): Symbol not found: __ZN10GDALDriver6CreateEPKciii12GDALDataTypePPc
Referenced from: /Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/sf.so
Expected in: /usr/local/Cellar/gdal/3.4.1/lib/libgdal.30.dylib

The sf.so shared object is in place:

cd /Library/Frameworks/R.framework/Versions/4.1/Resources/library/sf/libs/
√ libs % ls
sf.so

Any solution?
Thanks
Marc Girondot

subset.nb fails when "names" attribute is set

If I try to subset an nb object that has a "names" attribute, the subsetting fails:

> subset(nb, !is.na(gini))
Error in attr(z, xattrs[i]) <- attr(x, xattrs[i]) :
  'names' attribute [73056] must be the same length as the vector [64689]

(73056 is the length of the original neighbour list, 64689 that of the subsetted one.)

The error seems to be caused by lines 32ff in R/subset.nb.R:

    for (i in 1:length(xattrs)) {
	if (xattrs[i] != "region.id")
	    attr(z, xattrs[i]) <- attr(x, xattrs[i])
    }

If I understand correctly, the original xattrs (from x) are copied back to the subsetted nb (z); one of these is the vector of ID names. The names are initialized in z as the numbers 1:n and then overwritten, in this case with attr(x, "names"), which however includes all IDs and not only those of the subset. It is therefore longer and not of the "same length", hence the error.
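A sketch of the fix this analysis suggests (a hypothetical patch, not the committed code): subset any attribute whose length matches the original object instead of copying it verbatim.

for (i in seq_along(xattrs)) {
    if (xattrs[i] != "region.id") {
        att <- attr(x, xattrs[i])
        if (length(att) == length(x)) att <- att[subset]  # e.g. "names": keep only retained regions
        attr(z, xattrs[i]) <- att
    }
}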

Fail to Install spdep package in R

Hello, I have been having trouble installing the spdep package in R on Windows. I enabled binary and dependencies in the installation: install.packages("spdep", binary = TRUE, dependencies = TRUE). However, it has trouble loading spData. The error is as follows:

library(spdep)
Loading required package: spData
Error: package or namespace load failed for ‘spData’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
namespace ‘terra’ 1.4-22 is being loaded, but >= 1.5.12 is required
Error: package ‘spData’ could not be loaded
In addition: Warning messages:
1: package ‘spdep’ was built under R version 4.1.2
2: package ‘spData’ was built under R version 4.1.2

Any help would be appreciated!

Adding `localC()`

@JosiahParry has helpfully contributed a prototype implementation of localC() in #66 and #67.

A) One problem being examined is how to deal with no-neighbour observations; this may perhaps be handled by using C code already in place for handling the same problem for global Geary's C, but doubly scaling the input variable.

B) A further open problem is that the prototype permutation uses all of the observations (total permutation), but we know that for local Moran's I and local G permutations, the current observation i is held out of the set (conditional permutation). This relates to https://doi.org/10.1111/gean.12304 (https://doi.org/10.31219/osf.io/ugkhp), already contributed to spdep as #58 (and also very welcome, @jeffcsauer).
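To make the distinction concrete, a toy sketch of conditional permutation for the spatial lag at a single observation i (illustrative only; spdep's C implementation differs):

cond_perm_lag <- function(x, listw, i, nsim = 999) {
  n <- length(x)
  nbrs <- listw$neighbours[[i]]
  wts <- listw$weights[[i]]
  replicate(nsim, {
    xs <- numeric(n)
    xs[i] <- x[i]              # observation i is held fixed (conditional)
    xs[-i] <- sample(x[-i])    # only the remaining values are permuted
    sum(wts * xs[nbrs])        # spatial lag at i under this permutation
  })
}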

glist formatting for resistance-weighted db-MEM

Dear spdep admins,

Thank you for the great package! I would like to perform a distance-based Moran's Eigenvector Map (db-MEM) for further use in RDA to detect local adaptation. Following the guidelines for spatial weighting, I truncated the distance matrix based on the minimum spanning tree (graph-based MEM) and want to weight my neighbours by (highly accurate) cumulative landscape resistance values. However, I am unable to get my landscape resistances into the right format for weighting.

Neighbour list object:
Number of regions: 29
Number of nonzero links: 164
Percentage nonzero weights: 19.50059
Average number of links: 5.655172

listw <- nb2listw(neighbours, glist=resistance_weights, style="W", zero.policy=NULL)
Error in nb2listw(neighbours, glist = resistance_weights, style = "W", : glist wrong length

The resistance_weights item is a symmetrical cumulative resistance matrix.

I can directly integrate the resistance matrix with mat2listw(), but then I can't truncate the matrix, which I understand is advised.

Characteristics of weights list object:
Neighbour list object:
Number of regions: 29
Number of nonzero links: 812
Percentage nonzero weights: 96.55172
Average number of links: 28

Weights style: M
Weights constants summary:
n nn S0 S1 S2
M 29 841 13149.41 902809.7 33894518

Would you have any advice on how to format the resistance matrix for weighting?

Thanks a lot!
With kind regards,
Frederik Van Daele
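The "glist wrong length" error suggests nb2listw() expects a list with one numeric vector per region, each aligned with that region's entry in the neighbour list, rather than a full matrix. A sketch of that conversion (an assumption about the intended usage):

resistance_glist <- lapply(seq_along(neighbours), function(i)
  resistance_weights[i, neighbours[[i]]])  # weights only for actual neighbours of i
listw <- nb2listw(neighbours, glist = resistance_glist, style = "W")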

multivariate localC perm incorrect

When calling sample.int(), the length of x is used, which is nrow(x) * ncol(x) in the multivariate context since x is cast as a matrix. Reproducible code for this problem is below.

It appears that the error stems from localC_perm_calc(), though I am still unclear what the function is doing, mainly the permC_int() subroutine. I've spent about an hour trying to debug this but am struggling. The problem seems to extend past the sampling.

If you have suggestions where to look, I'll give it a shot! Thanks.

library(sf)
library(spdep)

g <- Guerry::gfrance85
nb <- spdep::poly2nb(g)
listw <- nb2listw(nb)
x <- list(g$Crime_pers, g$Crime_prop)

localC_perm(x, listw)
localC(x, listw)

replicate(nsim, localC(x[sample.int(length(x))], listw,

Multivariable local C: control for correlation between the variables

Inference can again be based on a conditional permutation approach. This consists of holding the tuple of values observed at i fixed, and computing the statistic for a number of permutations of the remaining tuples over the other locations. Note that because the full tuple is being used in the permutation, the internal correlation of the variables is controlled for, and only the spatial component gets altered.

from https://geodacenter.github.io/workbook/6c_local_multi/lab6c.html#multivariate-local-geary

The current implementation doesn't control for correlation between the variables.

row.names in poly2nb works only for sp, not sf?

Hi Roger! Hope you are doing well!

It looks like the row.names argument passes its values to the region.id attribute only if the object is a SpatialPolygons object, but not if it is sf/sfc. See below.

Thanks!

library(spdep)
#> Loading required package: sp
#> Loading required package: spData
#> To access larger datasets in this package, install the spDataLarge
#> package with: `install.packages('spDataLarge',
#> repos='https://nowosad.github.io/drat/', type='source')`
#> Loading required package: sf
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
library(sf)
library(dplyr)
packageVersion("spdep")
#> [1] '1.1.3'


columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1], quiet=TRUE) %>% 
  mutate(id_name = apply(t(combn(letters, 2)), 1, paste, collapse = "")[1:n()])

poly2nb(as(columbus, "Spatial"), row.names = columbus$id_name) %>% 
  attr("region.id")
#>  [1] "ab" "ac" "ad" "ae" "af" "ag" "ah" "ai" "aj" "ak" "al" "am" "an" "ao" "ap"
#> [16] "aq" "ar" "as" "at" "au" "av" "aw" "ax" "ay" "az" "bc" "bd" "be" "bf" "bg"
#> [31] "bh" "bi" "bj" "bk" "bl" "bm" "bn" "bo" "bp" "bq" "br" "bs" "bt" "bu" "bv"
#> [46] "bw" "bx" "by" "bz"
poly2nb(columbus, row.names = columbus$id_name) %>% 
  attr("region.id")
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
#> [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
#> [31] "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45"
#> [46] "46" "47" "48" "49"

Created on 2020-04-12 by the reprex package (v0.3.0)
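Until the sf path honours row.names, a workaround sketch is to set the attribute afterwards (region.id values are stored as character):

nb <- poly2nb(columbus)
attr(nb, "region.id") <- as.character(columbus$id_name)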

nbcosts: no-neighbour nodes

I'm running the skater clustering algorithm with my own dataset called "territories". Below are the different steps:

1. Running territoriesNb <- poly2nb(territories) followed by summary(territoriesNb), I get:

Neighbour list object:
Number of regions: 37
Number of nonzero links: 134
Percentage nonzero weights: 9.788167
Average number of links: 3.621622
2 regions with no links:
73 100
Link number distribution:

0 1 2 3 4 5 6 7 11
2 2 4 13 7 4 2 2 1
2 least connected regions:
64 87 with 1 link
1 most connected region:
98 with 11 links

  2. When I run lcosts <- nbcosts(territoriesNb, territoriesSt), I get this error:

Error in nbcosts(territoriesNb, territoriesSt) : nbcosts: no-neighbour nodes.

Can someone help me remove the no-neighbour regions from my sp SpatialPolygonsDataFrame?
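A sketch of dropping the no-neighbour regions before computing costs (it assumes territoriesSt holds one row of attribute data per region, in the same order):

keep <- card(territoriesNb) > 0           # card() counts links per region
territories2 <- territories[keep, ]
territoriesNb2 <- poly2nb(territories2)   # rebuild so indices stay consistent
lcosts <- nbcosts(territoriesNb2, territoriesSt[keep, ])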

p-values of direct and indirect impacts

Hello,
I am trying, unsuccessfully, to get p-values for the direct and indirect impacts of a SARMA model.
I can see the direct, indirect, and total impacts via print(obj) or print.summary.lagImpact(obj), but without p-values.
The code goes as:
#library(spdep)
#library(spatialreg)
#ab.sarma<-sacsarlm(output~X1+X2,data = mydata,listw = W,zero.policy = TRUE)
#obj<-impacts(ab.sarma,listw = W)
#spatialreg::print.summary.lagImpact(obj, zstats=TRUE)

The zstats option does not give any output.
Perhaps I am missing something small; any pointer in the right direction would help.

Thanks,
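A sketch of the usual route to inference on impacts (an assumption about what is missing here): simulation draws are needed before z-statistics can be reported, so request R= draws in impacts() and then ask summary() for zstats.

obj <- impacts(ab.sarma, listw = W, R = 500)  # R simulation draws enable inference
summary(obj, zstats = TRUE, short = TRUE)     # reports z-values and p-values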

Fatal error on installing spdep

I am trying to use the Compind package, but every time I try to install spData and spdep in RStudio, I get a fatal error. No message code is shown. I have updated R to version 4.1.2, but that does not seem to have helped. Any ideas?

poly2nb queen = FALSE argument is inconsistent

Hello,

When using poly2nb, there seems to be an inconsistency in how queen contiguity is treated between SW-NE and SE-NW corners: queen = FALSE is not applied to a SW-NE corner, but is to a SE-NW corner. Below is an example that gets at the issue in a basic 2x2 grid.

library(spdep)
library(sf)
library(tidyverse)

sq <- st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))))
sq2 <- sq + c(0,1)
sq3 <- sq + c(1,0)
sq4 <- sq + c(1,1)
df <- data.frame(id = 1:4, geometry = c(st_sfc(sq), st_sfc(sq2), st_sfc(sq3), st_sfc(sq4))) %>% st_as_sf()
df %>% ggplot() + geom_sf() + geom_sf_text(aes(label = id))

This makes a 2x2 grid, as in the attached file.

Yet, running poly2nb is confusing when using queen = FALSE, as an edge is found for 1-4, but not for 2-3:

nb <- poly2nb(df, queen = FALSE)
> nb[[1]]
[1] 2 3 4
> nb[[2]]
[1] 1 4
> nb[[3]]
[1] 1 4
> nb[[4]]
[1] 1 2 3

To the best of my knowledge, the example is built correctly, as the intersection of 1 and 4 is a point:

> st_intersection(df[1,], df[4,])
Simple feature collection with 1 feature and 2 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 1 ymin: 1 xmax: 1 ymax: 1
CRS:            NA
  id id.1    geometry
1  1    4 POINT (1 1)

Making a larger grid repeats this issue in a consistently inconsistent way.

For now, I'm working around this by using sf::st_relate() with a number of different patterns to capture rook's contiguity, but spdep is much faster when working with large datasets.

Any thoughts would be much appreciated!
Chris
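For reference, a sketch of the st_relate() workaround mentioned above (the DE-9IM pattern for shared-edge, rook-style contiguity comes from the sf documentation):

st_rook <- function(a, b = a) st_relate(a, b, pattern = "F***1****")
st_rook(df)  # sgbp list: cells 1 and 4 (point contact only) are not linked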

Is there a limit to the file size/number of points that can be analysed?

I am trying to run a linear model with spatially autocorrelated dependent variables using lagsarlm. The dataset contains 2521 unique locations and the spatial weights matrix was created in ArcGIS 10.4.1, exported as dbf and then saved as gwt. Loading the data and reading in the spatial weights matrix seems to work (see output below), but R crashes when I try to run lagsarlm. I've tried running it in RStudio, R console and via command prompt, but it always crashes. My gut feeling is that R can't handle the amount of data; however, 2500 points and a 5Mb spatial weights matrix doesn't seem that large to me.

I'm using Microsoft R Open 3.4.3 on Windows 8.1 with a 2.40GHz processor and 32GB of RAM.

Here is the output file I get when running it from command prompt:

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Microsoft R Open 3.4.3
The enhanced R distribution from Microsoft
Microsoft packages Copyright (C) 2017 Microsoft Corporation

Using the Intel MKL for parallel mathematical computing (using 6 cores).

Default CRAN mirror snapshot taken on 2018-01-01.
See: https://mran.microsoft.com/.

> library(spdep)
Loading required package: sp
Loading required package: Matrix
Loading required package: spData
> 
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] spdep_0.7-4          spData_0.2.6.0       Matrix_1.2-12       
[4] sp_1.2-6             RevoUtils_10.0.7     RevoUtilsMath_10.0.1

loaded via a namespace (and not attached):
 [1] lattice_0.20-35     deldir_0.1-14       gtools_3.5.0       
 [4] spDataLarge_0.2.3.1 MASS_7.3-47         grid_3.4.3         
 [7] nlme_3.1-131        coda_0.19-1         gdata_2.18.0       
[10] LearnBayes_2.15     gmodels_2.16.2      boot_1.3-20        
[13] splines_3.4.3       compiler_3.4.3      expm_0.999-2       
> 
> #set working directory
> setwd("C:/ExtHDD_backup/BRIT/Spatial_data/SpatReg")
> 
> ########################################################
> ##load data
> ########################################################
> d<-read.csv("testdat.csv",stringsAsFactors=FALSE)
> 
> head(d)
  practice_id wtIMD_score wtInc_score Cluster
1           1    29.16808    0.203888     hot
2           2    29.74406    0.213128     hot
3           4    34.31226    0.224040     hot
4           6    32.49078    0.231183     hot
5           8    51.58860    0.358767     hot
6           9    35.28521    0.233471     hot
> str(d)
'data.frame':	2521 obs. of  4 variables:
 $ practice_id: int  1 2 4 6 8 9 12 14 16 17 ...
 $ wtIMD_score: num  29.2 29.7 34.3 32.5 51.6 ...
 $ wtInc_score: num  0.204 0.213 0.224 0.231 0.359 ...
 $ Cluster    : chr  "hot" "hot" "hot" "hot" ...
> 
> ##create list of from IDs
> from_IDs<-sort(unique(d$practice_id))
> 
> length(from_IDs)
[1] 2521
> 
> ##########################################################
> ##read in spatial weights matrix
> ##########################################################
> nb1<-read.gwt2nb("All_spat_w_SEL.GWT",region.id = from_IDs)
Warning messages:
1: In read.gwt2nb("All_spat_w_SEL.GWT", region.id = from_IDs) :
  Old-style GWT file
2: In read.gwt2nb("All_spat_w_SEL.GWT", region.id = from_IDs) :
  111, 119, 183, 2263, 4741, 4890, 6231, 7750 are not destinations
> nb1.w <- nb2listw(nb1, style="B",zero.policy = TRUE)
> print(nb1.w)
Characteristics of weights list object:
Neighbour list object:
Number of regions: 2521 
Number of nonzero links: 269993 
Percentage nonzero weights: 4.248218 
Average number of links: 107.0976 
Non-symmetric neighbours list

Weights style: B 
Weights constants summary:
     n      nn     S0     S1        S2
B 2521 6355441 269993 519153 166596259
> 
> ################################################################
> ##lm with spatial autocorrelation in dependent variable
> ################################################################
> dlag<-lagsarlm(wtIMD_score~Cluster, data=d, nb1.w)
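A sketch of something worth trying before concluding it is a size limit (an assumption based on the spdep 0.7-x API, where the fitting functions still lived in spdep): the default method="eigen" computes dense eigenvalues of the n x n weights matrix, which the sparse Cholesky method avoids.

dlag <- lagsarlm(wtIMD_score ~ Cluster, data = d, listw = nb1.w,
                 method = "Matrix", zero.policy = TRUE)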

k-nearest neighbours from distance matrix in spdep (knn in 3D distances)

Dear all,
I am using the spdep package to compute Local Moran Index (LMI). My problem is that I am using 3D coordinates (x,y,z), and I would like to compute the k-nearest neighbours (k=10) for each point in my 3D space. I have already done this in 2D, by doing the following:

library(spdep)
set.seed(11)
x=runif(100, -1, 1)
y=runif(100, -1, 1)
full=data.frame(x,y)
neighs_k <- knn2nb(knearneigh(as.matrix(full), k = 10))  
neighs_mat_k <- nb2listw(neighs_k, style = "W",zero.policy=TRUE)  

I can then easily proceed using the neighs_mat_k object. However, with x, y, z (3D) coordinates I can't run the knearneigh() function. I tried converting my data to a distance matrix and using the mat2listw() function like this:

D <- as.matrix(dist(full, diag=FALSE, upper=FALSE)) 
test1 <- mat2listw(D)  

...but now I don't know how to retrieve the k-nearest weights from my test1 object (which would correspond to k-nearest neighbours) without changing the class of test1, which is:

class(test1) 
# [1] "listw" "nb" 
# and contains...: 
ls(test1)   
# [1] "neighbours" "style"      "weights"  

Thus, if I change the class of test1 to a regular list holding the 10 smallest weights for each point (corresponding to the 10 smallest distances), the subsequent steps to calculate the LMI will not work, because the object is required to be of class listw.

How should I do this? Is this even the right way to proceed? Would it be possible for the function to accept x, y, z (3D) coordinates?

Thanks in advance!
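A sketch of one way to build k-nearest neighbours from 3D coordinates (FNN is an assumption; any kNN package returning an index matrix would do, after which a hand-built nb object feeds into nb2listw() as usual):

library(FNN)
library(spdep)
set.seed(11)
full3d <- cbind(x = runif(100, -1, 1), y = runif(100, -1, 1), z = runif(100, -1, 1))
idx <- get.knn(full3d, k = 10)$nn.index                    # 3D kNN indices
nb <- lapply(seq_len(nrow(idx)), function(i) sort(as.integer(idx[i, ])))
class(nb) <- "nb"
attr(nb, "region.id") <- as.character(seq_len(nrow(idx)))
neighs_mat_k <- nb2listw(nb, style = "W")                  # proceed to the LMI as before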

Robust Standard Errors in spatial error models

This is probably an issue more related to the documentation.
I asked this question on stack overflow but I think it is helpful to bring it here.

I am fitting a Spatial Error Model using the errorsarlm() function in the spdep library. The Breusch-Pagan test for spatial models, calculated using the bptest.sarlm() function, suggest the presence of heteroskedasticity.

A natural next step would be to get robust standard error estimates and update the p-values. The documentation of the bptest.sarlm() function says the following:

lm.target <- lm(error.col$tary ~ error.col$tarX - 1)
if (require(lmtest) && require(sandwich)) {
print(coeftest(lm.target, vcov=vcovHC(lm.target, type="HC0"), df=Inf))}

where error.col is the spatial error model estimated.

Now, I can easily adapt the code to my problem and get the robust standard errors. Nevertheless, I was wondering:

  • What exactly is the "lm.target" component of sarlm objects? I cannot find any mention of it in the spdep documentation. (A tentative reading is sketched below.)
  • What exactly are $tary and $tarX? Again, they do not seem to be mentioned in the documentation.
  • Why does the documentation say it is "technically possible to make heteroskedasticity corrections"? Does it mean the proposed approach is not really recommended for overcoming heteroskedasticity?
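A hedged reading of the internals, offered as an inference rather than documented API: for an error model, tary and tarX appear to be the spatially filtered response and design matrix, so lm.target is the OLS fit on the GLS-filtered variables.

# Inferred, not documented: tary = y - lambda*W y, tarX = X - lambda*W X
lm.target <- lm(error.col$tary ~ error.col$tarX - 1)
lmtest::coeftest(lm.target, vcov. = sandwich::vcovHC(lm.target, type = "HC0"), df = Inf)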

nb plot shows extra neighbors

Hello,

I am baffled why I see an extra neighbor when I run this code:

library(spdep)
NIN <- data.frame(row = rep(1:11, 22), col = rep(1:22, each = 11))
xy.rook <- cell2nb(nrow = max(NIN$row), ncol = max(NIN$col), type="rook", torus = FALSE)

library(sf)
rook_matrix <- matrix(c(NIN$col, NIN$row), ncol = 2)
plot(xy.rook, coords = st_multipoint(rook_matrix, dim = "XY"), points = TRUE)

I see a gridded rook configuration (as expected), plus several diagonal lines (running from point 1 to point 222, from point 3 to point 224, and so on up to point 21 to point 242). I do not understand why these are present. I checked the neighbours at those points and they look as expected:

sapply(xy.rook, length)

Any ideas where these lines are coming from? Thank you.

[Screenshot: the rook grid plot with the unexpected diagonal lines]

Document "GE" and "LE" in dnearneigh()

The bounds argument of dnearneigh() accepts either "GE" or "LE". What these values indicate or do is not documented.

\item{bounds}{character vector of length 2, default \code{c("GE", "LE")}; the first element may also be \code{"GT"}, the second \code{"LT"}; the first bound default was changed from \code{"GT"} to \code{"GE"} in release 1.1-7}
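Reading that together with the release note, the pairs seem to select closed versus open interval ends for d1 and d2 (an inference from the Rd text, not confirmed documentation):

nb_closed   <- dnearneigh(coords, d1 = 0.2, d2 = 0.5, bounds = c("GE", "LE"))  # 0.2 <= d <= 0.5
nb_halfopen <- dnearneigh(coords, d1 = 0.2, d2 = 0.5, bounds = c("GT", "LT"))  # 0.2 <  d <  0.5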

W matrix with no neighbours in mat2listw()

It seems that the function mat2listw() currently does not handle matrix entries with no neighbours, returning a misleading error message.

When trying to pass on a square W matrix to mat2listw() of which some rows contain only zeros (no neighbours), the following error message is returned:
Error in print.listw(x): regions with no neighbours found, use zero.policy=TRUE

Unfortunately, mat2listw() itself does not take the argument zero.policy (and does not pass it on to print.listw()). Is there an intention to restore the functionality of mat2listw() so that it handles entries with no neighbours?
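A workaround sketch (an inference from the traceback: the error comes from auto-printing the returned listw, not from building it):

lw <- mat2listw(W)             # assigning the result avoids the auto-print that errors
print(lw, zero.policy = TRUE)  # print.listw() takes its own zero.policy argument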

cell2nb arguments

It appears that the 'ncol' argument of cell2nb() actually references the number of rows, and the 'nrow' argument the number of columns. Here is an example of the potential problem:

t.nb <- cell2nb(nrow=50,ncol=10,type='rook',torus=FALSE)
head(t.nb)

Note that the neighbors of cell 1 are cell 2 (correct) and cell 51 (incorrect)
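A sketch confirming the report and the interim workaround of swapping the arguments (the first neighbour pair is as reported; the rest is an assumption):

t.nb <- cell2nb(nrow = 50, ncol = 10, type = "rook", torus = FALSE)
t.nb[[1]]   # 2 and 51: laid out as if there were 50 columns
t.nb2 <- cell2nb(nrow = 10, ncol = 50, type = "rook", torus = FALSE)
t.nb2[[1]]  # should give 2 and 11, matching the intended 50-row x 10-column grid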

Interpretation issues.

Hi,
I have run a Moran I test on the residuals of my model using spdep.
I am unsure if I understand correctly how to interpret this, or if I have made a mistake.
Can you help me understand please?

mod11<-lm(B2.av ~ LAI * solar + Nest.type ,data=FINAL)
Mac<-moran.test(residuals(mod11), listw=nb2listw(nlist, style="W"))

Output:

Moran I test under randomisation

data:  residuals(mod11)  
weights: nb2listw(nlist, style = "W")    

Moran I statistic standard deviate = -43.983, p-value = 1
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
    -1.405532e-02     -2.458331e-05      1.017617e-07 

Does this suggest that my Moran I is -1.405532e-02, which would show no spatial autocorrelation? Or have I got that completely wrong?

Thank you
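For what it's worth, a sketch of the usual follow-up (an assumption about intent): the default alternative is "greater", which is why a large negative standard deviate yields p = 1; a two-sided test asks whether there is any autocorrelation, positive or negative.

moran.test(residuals(mod11), listw = nb2listw(nlist, style = "W"),
           alternative = "two.sided")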

Data type matters in lm.LMtests and lagsarlm functions

I did some preprocessing on my database and ended up with a final dataframe, say "df". Its columns were computed using tapply. When I tried to use lm.LMtests or lagsarlm with these data, I got the following error:

Error in matrix(0, nrow = nrow(x), ncol = ncol(x)) : 
  invalid 'ncol' value (too large or NA)

I looked for this issue with no luck. After trying different solutions, I discovered that the class of the columns in the original data matters: x and y internally use the function lag.listw, which checks is.vector. So even though my original data was a dataframe, the column classes matter. I made sure that each column was of numeric type, and then everything worked.

When I had the problem, str() reported num [1:24(1d)]; the problem was solved once str() reported just num.

Hope this helps
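A sketch of the coercion described above (it assumes every column is one of those 1d numeric arrays from tapply):

df[] <- lapply(df, function(col) as.numeric(as.vector(col)))  # drop the 1d array class
str(df)  # columns should now report plain num, not num [1:24(1d)]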

Add a Contributor Code of Conduct

Would it be possible to add a code of conduct to this Github repo? I think it's a good practice and encourages folks to feel comfortable submitting pull requests and issues. sf has one here that we can use.

As a new contributor to open source spatial, it also makes me feel welcome in the community, which is valuable.

two.sided lee.test

The help file of lee.test() mentions the option "two.sided" for the "alternative" parameter. However, I get an error whenever I try this option. Do you know if this could be a bug? I can run lee() and lee.mc() with the same data, so I'm confident the data are suitable for the analysis. The "greater" and "less" options also work.

I'm using spdep v1.1-3
in R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Thanks!

poly2nb not finding links after June 2021

poly2nb() is not detecting non-zero links in a SpatialPolygonsDataFrame. Versions from June 19 and earlier find links, although the number of links differs from that found by the February 2021 version.

The problem is illustrated in poly2nb_version_differences.r (attached), using two sets of polygons (p.rds and cells.rds, attached) run through each version of poly2nb (files named by date of commit, all attached).

poly2nb_version_differences.zip

Cholmod error in ME function

First of all, I appreciate all contributors and authors of spdep package.

I am currently conducting a Moran eigenvector spatial filtering analysis using datasets from the GeoDa Center for a study. There is an issue when applying the ME function to the New York City dataset; it throws the error Error in sW %*% var : Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90. I guess the reason is that there are several distant islands in New York City, which produce problems I cannot pin down when spdep translates spatial neighbours to listw or derived matrix objects. For reference, there was no such error when applying ME to data from other regions.

I hope that this is just a data problem.

Here is a reproducible code chunk:

## problem - reproducible code
library(spdep)
library(rgdal)

replace_na2 <- function(x) ifelse(is.na(x), 0, x)

download.file('https://s3.amazonaws.com/geoda/data/nyc_2000Census.zip', 
              destfile = 'D:/nyc_2000Census.zip') # might require a user to modify a directory to decompress
dir.create('D:/temp')
unzip('D:/nyc_2000Census.zip', exdir = 'D:/temp')

nye <- rgdal::readOGR(dsn = 'D:/temp/NYC_2000Census.shp', layer = 'NYC_2000Census')
nye <- spTransform(nye, CRS('+init=epsg:3857'))

nye$hs_d <- nye$hs / nye$over25 * 1000
nye$somecol_d = nye$somecol / nye$over25 * 1000
nye$col_d = nye$college / nye$over25 * 1000
nye$master_d = nye$master / nye$over25 * 1000
nye$prof_d = nye$prof / nye$over25 * 1000
nye$phd_d = nye$phd / nye$over25 * 1000

for (i in 57:62){
  nye@data[,i] <- ifelse(is.na(nye@data[,i]), 0, nye@data[,i])
}

nyet <- split(nye, nye@data$BoroName)
nyet.nb <- lapply(nyet, function(x) poly2nb(x))
nyet.listw <- lapply(nyet.nb, function(x) nb2listw(x, zero.policy = TRUE))

nyet.xlist <- c('GENDER_PAR', 'PER_PRV_SC', 'YOUTH_DROP', 
                'PER_MNRTY', 'mean_inc', 'HS_DROP', 'PER_ASIAN')
nyet.ylist <- c('hs_d', 'somecol_d', 'col_d', 'master_d', 
                'prof_d', 'phd_d', 'COL_DEGREE')

nyet.forms <- split(nyet.ylist, nyet.ylist) 
nyet.forms <- lapply(nyet.forms, function(x) as.formula(paste(x, '~', paste(nyet.xlist, collapse = '+'))))

# Gaussian linear regression: no error
nyet.lm1 <- lapply(nyet.forms, function(x) lm(formula = x, data = nyet[[1]]@data))

# Moran Eigenvector GLM filtering: produces an error
nyet.me1 <- lapply(nyet.forms,
                   function(x) ME(formula = x, data = nyet[[1]]@data, 
                                  stdev = TRUE, listw = nyet.listw[[1]]))
# Error in sW %*% var : 
# Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90

install.packages("spdef") fails because of dependency on sf

$ R
R version 3.6.1 (2019-07-05) -- "Action of the Toes"

> install.packages("spdep")
# 64
also installing the dependency 'sf'
./configure: line 2071: test: too many arguments
./configure: line 2075: test: too many arguments

./configure: line 2078: test: too many arguments
configure: CC: clang
configure: CXX: clang++ -std=gnu++11
checking for gdal-config... no
no
configure: error: gdal-config not found or not executable.
ERROR: configuration failed for package 'sf'
* removing '/usr/local/Cellar/r/3.6.1/lib/R/library/sf'
....

Platform: Mac OS High Sierra

Split model fitting functions out into new package

I'm surveying the reverse dependencies of spdep (https://github.com/r-spatial/spdep) and see that most packages using functions from spdep either just construct neighbour objects or, in a few cases, also use tests for spatial dependence. A small cluster of packages uses the model fitting functions.

To make maintenance simpler, I'm considering splitting the model fitting functions out into a new package, probably to be called spreg (@gpiras). I'd welcome reactions.

zero.policy = TRUE doesn't appear to make a difference

Hi, I'm currently using spdep to test for spatial dependence in my data; however, I can't get the list of weights because nb2listw tells me to set 'zero.policy=TRUE' when I already include it in the function call.

> nb <- poly2nb(houston)
> nb2listw(neighbours = nb,  zero.policy = TRUE)
Error in print.listw(x) : 
  regions with no neighbours found, use zero.policy=TRUE

Where the houston object is the zip code .shp shapefile read in from here.

Are there any suggestions as to why this might be happening?
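A sketch of what may be going on (an inference from the traceback): the error is raised by print.listw() when the unassigned result is auto-printed, and print.listw() needs its own zero.policy.

lw <- nb2listw(neighbours = nb, zero.policy = TRUE)  # assignment: no auto-print
print(lw, zero.policy = TRUE)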

knearneigh seg faults at k >= 999

library(spdep)

# generate random long/lat data
coord <- cbind(runif(1000, -10.8544921875, 2.021484375), runif(1000, 49.82380908513249, 59.478568831926395))

# gives benign too many ties error
knn <- knearneigh(coord, k = 998, longlat = TRUE)

# seg faults
knn <- knearneigh(coord, k = 999, longlat = TRUE)

# give it more records, just in case that's the issue
coord <- cbind(runif(2000, -10.8544921875, 2.021484375), runif(2000, 49.82380908513249, 59.478568831926395))

# still seg faults
knn <- knearneigh(coord, k = 999, longlat = TRUE)

Better interface between spdep and the tidyverse ?

I am currently using spdep on some spatial data, and I find my code sometimes awkwardly complicated. For instance, to add neighbours to a ggplot graph I resort to something like this:

library(sf)
library(spdep)
library(ggplot2)

COLUMBUS <- st_read(system.file("shapes/columbus.shp", package="spData")[1])

COLUMBUS %>% ggplot() +
  geom_sf(fill="grey90", col="white") +
  geom_sf(
    color="red",
    data = COLUMBUS %>%
      poly2nb() %>%
      nb2lines(coords= st_coordinates(st_centroid(COLUMBUS))) %>%
      as('sf')
  )

... where something like ...

COLUMBUS %>% ggplot() +
  geom_sf(fill="grey90", col="white") +
  geom_sf(
    color="red",
    data = find_neighbors(., how="share_border")
  )

... would seem more readable and fluid.

I know that spatial data are intrinsically hard, and that not every corner case can be covered. But maybe it is possible to further simplify some common use cases.
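A sketch of the suggested helper (find_neighbors() is the hypothetical name proposed above, not an existing spdep function):

find_neighbors <- function(x, how = c("share_border")) {
  how <- match.arg(how)                     # only contiguity implemented in this sketch
  nb <- spdep::poly2nb(x)
  cc <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(x)))
  sf::st_as_sf(spdep::nb2lines(nb, coords = cc))
}

COLUMBUS %>% ggplot() +
  geom_sf(fill = "grey90", col = "white") +
  geom_sf(color = "red", data = find_neighbors(COLUMBUS))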
