mbg is an R package for model-based
geostatistics.
The mbg package provides a simple interface to run
spatial machine learning models and geostatistical models that estimate
a continuous (raster) surface from point-referenced observations and,
optionally, a set of raster covariates. The package also includes
functions to summarize raster estimates by (polygon) region while
preserving uncertainty.
Figure: Overview of the MBG workflow
The mbg package combines features from the sf, terra, and data.table
packages for spatial data processing; caret for
spatial ML models; and R-INLA for
geostatistical models.
Using the package
You can install the latest stable version of the mbg package
from CRAN:
install.packages("mbg")
Some core package functions rely on R-INLA, which is not available on
CRAN. If you do not already have the INLA package
installed, you can download it following these
instructions.
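As a rough sketch of that installation step, the helper below installs INLA only when it is missing. The repository URL is the one listed in the R-INLA project's download instructions at the time of writing, so treat it as an assumption and check it against the current instructions. The helper name install_inla_if_missing is hypothetical and not part of mbg.

```r
# Hypothetical helper (not part of mbg): install INLA only if it is not already available.
install_inla_if_missing <- function() {
  if (requireNamespace("INLA", quietly = TRUE)) {
    # INLA is already installed; nothing to do.
    return(invisible(FALSE))
  }
  install.packages(
    "INLA",
    # Assumed repository URL; verify against the current R-INLA instructions.
    repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"),
    dependencies = TRUE
  )
  invisible(TRUE)
}

# Call install_inla_if_missing() once, in an interactive session, before using
# the mbg functions that depend on INLA.
```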
After installing the package and loading it using
library(mbg), you can access the package vignette by
running help(mbg), or get documentation for a specific
function by running, e.g., help(MbgModelRunner).
Package workflow
A typical MBG workflow includes the following steps:
1. Load point data on outcomes, raster covariate surfaces, and a raster
   population surface.
2. (Optional) Run machine learning models relating the input covariate
   surfaces to the outcome, producing predictive raster surfaces from a
   variety of methods.
3. Prepare inputs for the geostatistical model. These include the outcome
   point data, model specifications, a 2-D spatial mesh, and either the
   input covariate surfaces or the ML predictive surfaces.
4. Run the geostatistical model. This model predicts the outcome as a
   linear combination of the raster surfaces and an SPDE (stochastic
   partial differential equation) approximation to a Gaussian process
   over space.
5. Using the model fit, generate gridded predictions of the outcome across
   the entire study area. Uncertainty is captured by generating 250
   posterior predictive draws at each pixel location.
6. Summarize predictive draws as raster surfaces by taking the mean,
   median, and 95% uncertainty interval bounds of the draws at each pixel
   location.
7. (Optional) Aggregate from pixels to administrative boundaries,
   preserving uncertainty.
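The summarization and aggregation steps above can be illustrated with a small base R sketch. This is not the mbg implementation and uses none of the package's functions. The toy draws matrix, the population weights, the region labels, and the helper names summarize_draws and aggregate_draws are all hypothetical. Population weighting is an assumption about how pixels might be combined. The sketch shows the idea that matters: aggregating one draw at a time, then summarizing, so that regional intervals reflect the full set of draws instead of an average of pixel-level intervals.

```r
# Hypothetical toy inputs (not mbg objects): 6 pixels, 250 draws per pixel.
set.seed(1)
n_pixels <- 6
n_draws <- 250
draws <- matrix(runif(n_pixels * n_draws), nrow = n_pixels, ncol = n_draws)
population <- c(100, 50, 25, 10, 200, 80)      # assumed people per pixel
polygon_id <- c("A", "A", "A", "B", "B", "B")  # assumed region of each pixel

# Summarize draws row by row: mean, median, and 95% uncertainty interval bounds.
summarize_draws <- function(draw_matrix) {
  data.frame(
    mean = rowMeans(draw_matrix),
    median = apply(draw_matrix, 1, stats::median),
    lower = apply(draw_matrix, 1, stats::quantile, probs = 0.025, names = FALSE),
    upper = apply(draw_matrix, 1, stats::quantile, probs = 0.975, names = FALSE)
  )
}

# Aggregate pixels to regions separately for every draw, using a
# weighted mean within each region (rows = pixels, columns = draws).
aggregate_draws <- function(draw_matrix, weights, groups) {
  weighted_sum <- rowsum(draw_matrix * weights, group = groups)
  total_weight <- rowsum(weights, group = groups)
  weighted_sum / as.vector(total_weight)
}

pixel_summary <- summarize_draws(draws)            # one row per pixel
polygon_draws <- aggregate_draws(draws, population, polygon_id)
polygon_summary <- summarize_draws(polygon_draws)  # one row per region
print(polygon_summary)
```

Because each column of polygon_draws is a complete regional estimate from a single draw, the regional interval bounds come from the spread of those 250 regional values.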
Many thanks to the following groups of people for their contributions
to the package:
IHME’s Local Burden of Disease core code team, for their development
of geostatistical software tools that helped inspire this package.
Special thanks to Aaron Osgood-Zimmerman, Ian Davis, John VanderHeide,
Jon Mosser, Katie Wilson, Lauren Woyczynski, Michael Collison, Michael
Cork, Mike Richards, Nafis Sadat, Neal Marquez, and Roy Burstein.
The Geospatial Analysis team at the Demographic and Health Surveys
Program