I am looking to use terra to update an analytical workflow (from Mokany et al. 2022, Global Ecology and Biogeography) originally written in R with the raster package. The workflow involves spatial analyses of predicted dissimilarities derived from a generalized dissimilarity model (GDM). The primary data are GDM-transformed predictor rasters (outputs of GDM model fitting), assembled in a single multi-layer raster object. The analyses use values from the transformed rasters to predict dissimilarities among pairs of grid cells and then apply those predictions to various spatial questions. The project raster is large (`mem_info` reports that about 102 GB is needed to process it).
My specific question: what terra-based code is suitable for applying GDM predictions to assess spatial patterns of dissimilarity and uniqueness (i.e., mean similarity between each grid cell and a random sample of locations across the entire study region)? For outputs, I am looking for 1) a raster of predicted dissimilarity and 2) a raster of predicted uniqueness.
This question is closely related to "Employing `terra::` to avoid std::bad_alloc error when extracting values from large SpatRaster stack". I have tried to adapt the relevant parts of that helpful answer here, but have not yet arrived at a workable solution.
I have created a reprex and show the initial code chunk I'm working with. That code draws heavily on the Stack Overflow answer referenced above; I show where I'm stalled.
Create a multi-layer sample raster:
library(terra)
r <- rast(ncol=100, nrow=100, xmin=-150, xmax=-80, ymin=20, ymax=60,
nlyr=5, vals=runif(10000*5))
r[[1:3]][10:20] <- NA
r[100:120] <- NA
Initial (partial) attempt at the new workflow.
Note that a sample of values from the full raster is used here, because calculating pairwise dis/similarities among all grid cells is not advisable (the pairwise matrix grows with the square of the number of cells).
# Number of cells to sample (here 50% of the raster)
n.sub <- ncell(r) * 0.5
# Return matrix of sampled values
x <- na.omit(spatSample(r, n.sub, "regular", as.df=FALSE))
# Compute dissimilarity between grid-cell pairs
# na.omit() may have dropped rows, so reset n.sub to the number of rows actually left in x
n.sub <- nrow(x)
sub.dissimilarity <- matrix(0, n.sub, n.sub)
# Use an arbitrary value to set up loop
gdmRastMod <- list(intercept = 2)
for (i in 1:(n.sub - 1)) {
  for (j in (i + 1):n.sub) {
    ecol.dist <- sum(abs(x[i, ] - x[j, ]))
    sub.dissimilarity[j, i] <- 1 - exp(-1 * (gdmRastMod$intercept + ecol.dist))
  }
}
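For reference, the double loop above can probably be replaced by a vectorized call. The following is only a sketch in base R: it assumes the ecological distance is the Manhattan distance between rows (which is what `sum(abs(x[i, ] - x[j, ]))` computes), and it uses a small random matrix as a hypothetical stand-in for the sampled values `x`. The name `d.mat` is introduced here for illustration only.

```r
# Hypothetical stand-in for the sampled matrix `x` (rows = cells, columns = layers)
set.seed(42)
x <- matrix(runif(200 * 5), ncol = 5)
gdmRastMod <- list(intercept = 2)  # arbitrary value, as in the loop

# Manhattan distance between every pair of rows,
# i.e. sum(abs(x[i, ] - x[j, ])) for all i < j
ecol.dist <- dist(x, method = "manhattan")

# Apply the same transformation as in the loop to all pairs at once;
# the result keeps the compact "dist" structure (lower triangle only)
sub.dissimilarity <- 1 - exp(-1 * (gdmRastMod$intercept + ecol.dist))

# Expand to a full symmetric matrix only if it is really needed
# (as.matrix() fills the diagonal with 0)
d.mat <- as.matrix(sub.dissimilarity)
```

Keeping the result as a `dist` object stores each pair once, which roughly halves the memory needed compared with the full `n.sub` by `n.sub` matrix.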
At this point, I surmise that two functions are needed: one to calculate 1) dissimilarity and one to calculate 2) uniqueness (dis/similarity to the region as a whole), each applied across the study extent.
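To make the idea of two functions concrete, here is a rough sketch of how they might look with terra. It is not a tested solution for the project data. The helper names `dissim_to_ref` and `uniqueness`, the reference sample size of 500, and the intercept of 2 are all hypothetical placeholders. It also assumes that `app()` calls the function once per cell with a vector of layer values, and that uniqueness is the mean similarity (1 minus dissimilarity) to the reference sample, as defined in the question.

```r
library(terra)

set.seed(1)
r <- rast(ncol = 100, nrow = 100, xmin = -150, xmax = -80, ymin = 20, ymax = 60,
          nlyr = 5, vals = runif(10000 * 5))
r[100:120] <- NA

# Reference sample of transformed predictor values (rows = locations)
ref <- na.omit(spatSample(r, 500, "random", as.df = FALSE))
intercept <- 2  # arbitrary placeholder, not a fitted GDM intercept

# Hypothetical helper: predicted dissimilarity between one cell (vector v)
# and every row of the reference sample
dissim_to_ref <- function(v, ref, intercept) {
  ecol.dist <- colSums(abs(t(ref) - v))  # Manhattan distance to each reference row
  1 - exp(-1 * (intercept + ecol.dist))
}

# Hypothetical helper: uniqueness = mean similarity to the reference sample
uniqueness <- function(v, ref, intercept) {
  if (anyNA(v)) return(NA_real_)
  mean(1 - dissim_to_ref(v, ref, intercept))
}

# One value per cell, so the output is a single-layer raster
u <- app(r, function(v) uniqueness(v, ref, intercept))
```

Because the function returns one number per cell, `u` has a single layer; a function that returns one value per reference location would instead give one layer per reference location.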
UPDATE: following @robert-hijmans's helpful solution, I was able to produce a raster of predicted uniqueness across my study extent. The code scaled very well with my project data.
However, when I run the first code chunk (re: dissimilarity across 10 cells) with the sample data, the output is a multi-layer raster. I need a single-layer raster of dissimilarity (comparable to the single-layer raster of uniqueness produced by the other chunk). I also need to know the best way to extend that across all cells in my study extent (not just the first 10, 100, 1000, etc.). Replacing that '10' with the total number of raster cells throws an error.
`rasterize()` was carried over from the old code; I checked for a `terra` alternative and saw that the same function name is used. My question now includes the front end of the code you provided earlier; I've added comments to show my level of understanding. Still learning as I go. Thank you kindly.