The function finds a bandwidth for a given generalised geographically weighted regression by optimizing a selected function. For cross-validation, each candidate bandwidth is scored by the leave-one-out prediction error of the generalised geographically weighted regression (the sum of squared CV errors by default, or the root mean square prediction error if RMSE=TRUE), and the bandwidth minimizing this score is chosen.
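
The search described above can be sketched in base R. This is a toy illustration under stated assumptions, not the package's implementation: it uses simulated data, a Gaussian kernel, an ordinary (Gaussian-family) local linear fit via lm.wfit, and a hand-picked search interval. The names cv_score, bw_hat and all simulated variables are hypothetical.

```r
# Toy sketch only: leave-one-out CV for a kernel-weighted local linear fit.
set.seed(1)
n <- 60
pts <- cbind(runif(n), runif(n))             # simulated point locations
x <- rnorm(n)
y <- 1 + (1 + 2 * pts[, 1]) * x + rnorm(n, sd = 0.3)  # slope drifts with easting
X <- cbind(1, x)                             # design matrix with intercept
d2 <- as.matrix(dist(pts))^2                 # squared inter-point distances

cv_score <- function(bw) {
  err <- numeric(n)
  for (i in seq_len(n)) {
    w <- exp(-0.5 * d2[i, ] / bw^2)          # Gaussian kernel weights (assumed form)
    keep <- seq_len(n) != i                  # leave observation i out
    fit <- lm.wfit(X[keep, , drop = FALSE], y[keep], w[keep])
    err[i] <- y[i] - sum(X[i, ] * fit$coefficients)
  }
  sum(err^2)                                 # sum of squared CV errors
}

# One-dimensional search over the bandwidth; the interval is an assumption.
res <- optimize(cv_score, interval = c(0.1, 1.5),
                tol = .Machine$double.eps^0.25)
bw_hat <- res$minimum
bw_hat
```

The sketch refits the local model once per observation for every candidate bandwidth, which is why a search of this kind can be slow on larger data sets.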

ggwr.sel(formula, data = list(), coords, adapt = FALSE, gweight = gwr.Gauss,
 family = gaussian, verbose = TRUE, longlat = NULL, RMSE=FALSE,
 tol=.Machine$double.eps^0.25)

Arguments

formula

regression model formula as in glm

data

model data frame as in glm, or may be a SpatialPointsDataFrame or SpatialPolygonsDataFrame object as defined in package sp

coords

matrix of coordinates of points representing the spatial positions of the observations

adapt

either TRUE: find the proportion (between 0 and 1) of observations to include in the weighting scheme (k-nearest neighbours), or FALSE: find a global bandwidth

gweight

geographical weighting function: at present gwr.Gauss() (the default), gwr.gauss() (the previous default), or gwr.bisquare()

family

a description of the error distribution and link function to be used in the model, see glm

verbose

if TRUE (default), reports the progress of search for bandwidth

longlat

TRUE if point coordinates are longitude-latitude decimal degrees, in which case distances are measured in kilometers; if data is a SpatialPoints object, the value is taken from the object itself

RMSE

default FALSE, so that scores correspond to the CV scores in newer references (sum of squared CV errors); if TRUE, the previous behaviour of scoring by leave-one-out (LOO) CV RMSE is used

tol

the desired accuracy to be passed to optimize
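
Several of the arguments above (gweight, longlat and adapt) can be illustrated with a short base R sketch. The helper names gauss_w, bisquare_w and haversine_km are hypothetical, and the formulas are common textbook forms assumed for illustration; the package's own weighting functions and distance calculations may be parameterised differently.

```r
# Illustrative helpers only; names and exact forms are assumptions,
# not the definitions used by the package.

# Kernel weights from squared distances (cf. the gweight argument).
gauss_w <- function(dist2, bw) exp(-0.5 * dist2 / bw^2)
bisquare_w <- function(dist2, bw) {
  ifelse(dist2 > bw^2, 0, (1 - dist2 / bw^2)^2)  # zero weight beyond bw
}

# Great-circle distance in km on a sphere (cf. longlat = TRUE).
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Adaptive bandwidths (cf. adapt = TRUE): a proportion q of the observations
# is turned into a per-point distance to the k-th nearest neighbour.
set.seed(42)
pts <- cbind(runif(20), runif(20))
D <- as.matrix(dist(pts))
q <- 0.25
k <- ceiling(q * nrow(pts))
local_bw <- apply(D, 1, function(d) sort(d)[k + 1])  # +1 skips the self-distance

# Weights at distances 0, 50, 100 and 200 for a bandwidth of 100.
d <- c(0, 50, 100, 200)
w_gauss <- gauss_w(d^2, bw = 100)
w_bisq <- bisquare_w(d^2, bw = 100)

# Length of one degree of latitude along a meridian, in km.
one_degree <- haversine_km(0, 0, 0, 1)
```

With a fixed bandwidth every point uses the same kernel width, whereas the adaptive variant widens the kernel where points are sparse and narrows it where they are dense.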

Value

returns the cross-validation bandwidth.
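
The two scoring options and the tol argument can be related in a small sketch. It assumes that the RMSE score is the square root of the mean squared leave-one-out error, so that for a fixed number of observations both scores are minimised at the same bandwidth. The error values and the function toy_score are made up for illustration.

```r
# Made-up leave-one-out errors, for illustration only.
cv_err <- c(1.2, -0.8, 0.5, -2.1, 0.3)
n <- length(cv_err)
sse <- sum(cv_err^2)      # score reported when RMSE = FALSE
rmse <- sqrt(sse / n)     # assumed form of the score when RMSE = TRUE

# Default accuracy passed to optimize(): eps^0.25, which is 2^-13 for doubles.
default_tol <- .Machine$double.eps^0.25

# Hypothetical smooth score with its minimum at a bandwidth of 150.
toy_score <- function(bw) (bw - 150)^2 + 1000

fit_sse <- optimize(toy_score, interval = c(10, 500), tol = default_tol)
fit_rmse <- optimize(function(bw) sqrt(toy_score(bw) / n),
                     interval = c(10, 500), tol = default_tol)

c(sse = fit_sse$minimum, rmse = fit_rmse$minimum)
```

Because the square root is monotone, the choice mainly changes the scale of the reported scores, not the location of the minimum.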

References

Fotheringham, A.S., Brunsdon, C., and Charlton, M.E., 2002, Geographically Weighted Regression, Chichester: Wiley; http://gwr.nuim.ie/

Note

The use of GWR on GLM is only at the initial proof-of-concept stage; nothing here should yet be treated as an accepted method.

See also

Examples

if (require(rgdal)) {
  xx <- readOGR(system.file("shapes/sids.shp", package="spData")[1])
  bw <- ggwr.sel(SID74 ~ I(NWBIR74/BIR74) + offset(log(BIR74)), data=xx,
    family=poisson(), longlat=TRUE)
  bw
}
#> Loading required package: rgdal
#> rgdal: version: 1.4-3, (SVN revision 828)
#> Geospatial Data Abstraction Library extensions to R successfully loaded
#> Loaded GDAL runtime: GDAL 2.4.1, released 2019/03/15
#> Path to GDAL shared files: /usr/local/share/gdal
#> GDAL binary built with GEOS: TRUE
#> Loaded PROJ.4 runtime: Rel. 6.0.0, March 1st, 2019, [PJ_VERSION: 600]
#> Path to PROJ.4 shared files: (autodetected)
#> Linking to sp version: 1.3-1
#> OGR data source with driver: ESRI Shapefile
#> Source: "/home/rsb/lib/r_libs/spData/shapes/sids.shp", layer: "sids"
#> with 100 features
#> It has 22 fields
#> Bandwidth: 302.9456 CV score: 1204.711
#> Bandwidth: 489.6869 CV score: 1211.156
#> Bandwidth: 187.5331 CV score: 1188.477
#> Bandwidth: 116.2043 CV score: 1197.936
#> Bandwidth: 197.1794 CV score: 1190.679
#> Bandwidth: 166.747 CV score: 1183.541
#> Bandwidth: 147.4414 CV score: 1180.42
#> Bandwidth: 135.5099 CV score: 1181.441
#> Bandwidth: 146.8813 CV score: 1180.393
#> Bandwidth: 145.1043 CV score: 1180.346
#> Bandwidth: 141.4396 CV score: 1180.461
#> Bandwidth: 144.5526 CV score: 1180.344
#> Bandwidth: 144.647 CV score: 1180.343
#> Bandwidth: 144.6561 CV score: 1180.343
#> Bandwidth: 144.6555 CV score: 1180.343
#> Bandwidth: 144.6555 CV score: 1180.343
#> Bandwidth: 144.6554 CV score: 1180.343
#> Bandwidth: 144.6555 CV score: 1180.343
#> [1] 144.6555