In the last section we looked at a simple single-species, single-season closed occupancy model. We still keep those assumptions, but now we model probability of occupancy as a function of habitat covariates.

### Swiss willow tits

Our example data come from the Swiss Breeding Bird Survey (“Monitoring Häufige Brutvögel” MHB). Surveys are done 2-3 times per year during the breeding season at 267 1-km square quadrats across Switzerland. We will use the data for willow tits (*Poecile montanus*) analysed by Royle and Dorazio (2008, p.87ff)^{1}Royle, J.A. & Dorazio, R.M. (2008) *Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities*. Academic Press..

You can download and check the data with the following code:

**# Read in data
wt <- read.csv("http://bcss.org.my/data/willowtits.csv",
comment="#")
str(wt)
#'data.frame': 237 obs. of 15 variables:
# $ y.1 : int 0 0 0 0 0 0 0 0 1 0 ...
# $ y.2 : int 0 0 0 0 0 0 0 0 1 0 ...
# $ y.3 : int 0 0 0 0 0 0 0 0 0 1 ...
# $ c.1 : int 0 0 0 0 0 0 0 0 2 0 ...
# $ c.2 : int 0 0 0 0 0 0 0 0 1 0 ...
# $ c.3 : int 0 0 0 0 0 0 0 0 0 1 ...
# $ elev : int 420 450 1050 1110 510 630 590 530 ...
# $ forest: int 3 21 32 35 2 60 5 13 50 57 ...
# $ dur.1 : int 240 160 120 180 210 150 115 155 165 ...
# $ dur.2 : int 240 155 105 170 225 180 105 140 165 ...
# $ dur.3 : int 240 140 85 145 235 195 95 135 180 ...
# $ day.1 : int 29 13 30 23 28 17 16 24 25 21 ...
# $ day.2 : int 58 39 47 44 56 56 37 47 46 38 ...
# $ day.3 : int 73 62 74 71 73 73 76 74 70 50 ...
# $ length: num 6.2 5.1 4.3 5.4 3.6 6.1 5.1 3.7 ...**

Here’s what the variables are:

- y.1 to y.3 : Detection/nondetection (1/0) on each of three visits to each quadrat;
- c.1 to c.3 : Counts of territories on each of the three visits;
- elev : quadrat elevation (m);
- forest : forest cover (%);
- dur.1 to dur.3 : duration of each visit (mins);
- day.1 to day.3 : Julian date of each visit, 1 = 1st April;
- length : length of the transect walked within the quadrat (km).

In this section we will use the detection/nondetection data (y.1 to y.3) and the elevation (elev) and forest covariates.

### Data preparation

Since we are not using the survey-level covariates (dur.1-dur.3 and day.1-day.3), we can simplify the model by aggregating the detection data to produce vectors with the number of visits to each site $n$ and the number of visits when willow tits were detected $y$.

**# Extract and aggregate the detection data:
Y <- wt[, 1:3]
( n <- rowSums(!is.na(Y)) ) # number of visits
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3...
( y <- rowSums(Y, na.rm=TRUE) ) # number with detection
#[1] 0 0 0 0 0 0 0 0 2 1 0 0 0 0 1 2 0 3 0 1 0 0...**

We will also standardise the elevation and forest data so that our predictors have mean 0 and standard deviation (SD) 1. We use the `standardize`

function in the `wiqid`

package.

**# Standardise the covariates:
library(wiqid)
elevS <- standardize(wt$elev)
mean(elevS)
# [1] 8.442321e-17
sd(elevS)
# [1] 1
forestS <- standardize(wt$forest)
mean(forestS)
# [1] 8.606644e-17
sd(forestS)
# [1] 1**

I like to add an ‘S’ to the end of the variable to indicate that it’s standardised. The means are not quite zero because the SD is adjusted after centering.

## The model

#### Biological process

As with the salamander model, $z_i$ is drawn from a Bernoulli distribution. But now the probability of occupancy depends on the elevation and forest cover, and is different for each site:$$z_i \sim {\rm Bernoulli}(\psi_i)$$

Because $\psi$ is a probability we use a logistic link, with an expression that’s essentially a logistic regression model. Switzerland is mountainous, with elevations ranging from 250 to 2750m; it’s likely that willow tit occupancy is highest at some middle elevation, so we include elevation^2 as a predictor to give a “humpy” curve rather than a straight line. $${\rm logit}(\psi_i) = \beta_0 + \beta_{for} Forest_i + \beta_{elev} Elev_i + \beta_{elev2} Elev_i^2$$

#### Observation process

We won’t use covariates for the probability of detection, so the observation model is the same as before:$$y_i \sim {\rm Binomial}(n_i, pz_i)$$

#### Priors

**Intercept:** The intercept, $\beta_0$, is the probability of occupancy on the logit scale for a site with all the predictors equal to zero. Since we standardised the covariates to form the predictors, that’s a site with forest cover 34.8% forest cover and elevation 1183m. I have difficulty imagining a suitable prior on the logit scale, but I can easily define a uniform prior for the probability and then convert to the logit scale: $$\psi_0 \sim {\rm Beta}(1,1)$$ $$ \beta_0 = {\rm logit}(\psi_0)$$

**Coefficients:** Since we are working on the logit scale and we have standardised the predictors to have SD = 1, we’d be very surprised if the coefficients were outside the range $\pm 5$. We’ll use ${\rm Uniform}(-5, 5)$, as we did with the socks in the box.

**Probability of detection: **We can use a uniform Beta prior for this, just as we did for the salamanders example.

#### Model diagram

#### JAGS code

**# Save this in the file "wt_siteCovs.jags"
model{
# Likelihood
for(i in 1:nSites) {
# biological model
logit(psi[i]) <- b0 + bFor * fst[i] +
bElev * ele[i] + bElev2 * ele2[i]
z[i] ~ dbern(psi[i])
# observation model
y[i] ~ dbin(p * z[i], n[i])
}
# Priors
psi0 ~ dbeta(1, 1)
b0 <- logit(psi0)
bFor ~ dunif(-5, 5) # forest
bElev ~ dunif(-5, 5) # elevation
bElev2 ~ dunif(-5, 5) # elevation^2
p ~ dbeta(1, 1)
# Derived variable
N <- sum(z)
}**

### R code for preparations and running the model

We need to include the predictors in the data to be passed to JAGS. As always with occupancy models, we need to set initial values for $\tt z$ which are consistent with the data: setting them all to 1 works fine provided we discard the first MCMC draws as burn-in. My initial run with $\tt n.iter = 5000$ gave $\tt n.eff$ values that were too small, so I increased that to get values of 10,000 or more; it still takes less than 1 minute to run provided we run the chains in parallel.

**library(jagsUI)
# Prepare the data object:
jagsData <- list(y = y, n = n, nSites = length(n),
fst = forestS, ele = elevS, ele2 = elevS^2)
str(jagsData)
# List of 6
# $ y : num [1:237] 0 0 0 0 0 0 0 0 2 1 ...
# $ n : num [1:237] 3 3 3 3 3 3 3 3 3 3 ...
# $ nSites: int 237
# $ fst : num [1:237] -1.15389 -0.50052 -0.10124 0.00766 ...
# $ ele : num [1:237] -1.18 -1.133 -0.205 -0.112 -1.041 ...
# $ ele2 : num [1:237] 1.392 1.2847 0.0421 0.0126 1.0828 ...
# Specify initial values for 'z'
inits <- function() list(z = rep(1, nSites))
wanted <- c("p", "psi0", "b0", "bFor", "bElev", "bElev2",
"N")
# Run the model (20 secs)
( out1 <- jags(jagsData, inits, wanted,
model="wt_siteCovs.jags", DIC=FALSE,
n.chains=3, n.iter=11000, n.burn=1000, parallel=TRUE) )
# ...
# mean sd 2.5% 50% 97.5% Rhat n.eff
# p 0.787 0.029 0.727 0.788 0.841 1.000 15691
# psi0 0.453 0.067 0.326 0.452 0.586 1.000 30000
# b0 -0.191 0.274 -0.726 -0.191 0.347 1.000 30000
# bFor 0.869 0.239 0.414 0.865 1.347 1.000 10040
# bElev 2.069 0.316 1.484 2.056 2.719 1.001 15521
# bElev2 -1.156 0.275 -1.718 -1.148 -0.647 1.000 30000
# N 80.525 1.350 79.000 80.000 84.000 1.000 30000
wiqid::diagPlot(out1)**

The diagnostic plots look fine. In particular we see that the posterior distributions for the coefficients, `bFor`

, `bElev `

and `bElev2`

, are well clear of the limits of $\pm 5$ imposed by our prior.

The coefficient values show that forest has a positive effect – occupancy is higher with higher forest cover – and the elevation effect has a hump (`bElev2`

is negative)

### Plotting the output

The best way understand the results is to generate some plots. We’ll start with a simple plot of how occupancy varies with elevation for sites with average forest cover (35%) with corresponds to `forestS = 0`

. We’ll use the point estimates for the coefficients in `out1$mean`

.

**# Create a vector of values for the x axis
range(wt$elev)
# [1] 250 2750
xx <- seq(250, 2750, length=101)
# Need to scale xx the same way we scaled the elev data
xxS <- standardize2match(xx, wt$elev)
logit.psi <- with(out1$mean, b0 + bElev * xxS +
bElev2 * xxS^2)
psi <- plogis(logit.psi)
plot(xx, psi, type='l')**

Occupancy reaches a maximum – according to our model – at an elevation of 1760 for sites with average forest cover. That’s quite high: let’s check what values for forest cover were observed.

**plot(wt$elev, wt$forest)
abline(h=35)**

High forest cover (70-80%) only occurs up to about 2000m, and even average cover only gets to 2200m. So the line beyond 2200m on our simple plot of occupancy above is nonsense. We’ll do new plots for three different values of forest cover, 0%, 35% and 80%. We’ll also plot credible intervals, for which we need to use the whole MCMC chains, not just the point estimates. That will give us a vector of 30,000 values for each elevation and forest cover value; we’ll do these in a for loop, just keeping the values we need to plot.

**# Standardize the forest cover values
( forS <- standardize2match(c(0, 35, 80), wt$forest) )
# [1] -1.262784855 0.007657883 1.641084260
# Get the data to plot (mean and credible interval)
toplot0 <- matrix(NA, 101, 3)
for(i in 1:101) {
logit.psi.tmp <- with(out1$sims.list,
b0 + bFor * forS[1] + bElev * xxS[i] +
bElev2 * xxS[i]^2)
psi.tmp <- plogis(logit.psi.tmp)
toplot0[i, 1] <- mean(psi.tmp)
toplot0[i, 2:3] <- hdi(psi.tmp)
}
which(xx == 2200)
# [1] 79
toplot35 <- matrix(NA, 79, 3)
for(i in 1:79) {
logit.psi.tmp <- with(out1$sims.list,
b0 + bFor * forS[2] + bElev * xxS[i] +
bElev2 * xxS[i]^2)
psi.tmp <- plogis(logit.psi.tmp)
toplot35[i, 1] <- mean(psi.tmp)
toplot35[i, 2:3] <- hdi(psi.tmp)
}
which(xx == 2000)
# [1] 71
toplot80 <- matrix(NA, 79, 3)
for(i in 1:71) {
logit.psi.tmp <- with(out1$sims.list,
b0 + bFor * forS[3] + bElev * xxS[i] +
bElev2 * xxS[i]^2)
psi.tmp <- plogis(logit.psi.tmp)
toplot80[i, 1] <- mean(psi.tmp)
toplot80[i, 2:3] <- hdi(psi.tmp)
}
# Plot semi-transparent polygons first, then add the lines
plot(xx, toplot0[, 1], ylim=0:1, type='n', las=1,
xlab="Elevation", ylab="Occupancy")
polygon(x=c(xx, rev(xx)),
y=c(toplot0[, 2], rev(toplot0[, 3])),
col=adjustcolor('yellow', 0.5), border=NA)
polygon(x=c(xx[1:79], rev(xx[1:79])),
y=c(toplot35[, 2], rev(toplot35[, 3])),
col=adjustcolor('skyblue', 0.5), border=NA)
polygon(x=c(xx[1:71], rev(xx[1:71])),
y=c(toplot80[, 2], rev(toplot80[, 3])),
col=adjustcolor('lightgreen', 0.5), border=NA)
lines(xx, toplot0[, 1], lwd=2, col='brown')
lines(xx[1:79], toplot35[, 1], lwd=2, col='blue')
lines(xx[1:71], toplot80[, 1], lwd=2, col='darkgreen')
legend('topleft', c("80% forest", "35% forest",
"0% forest"),
lwd=2, col=c('darkgreen', 'blue', 'brown'), bty='n')**

### Distribution map of Swiss willow tits

Once we have fitted our model, we can predict the probability of occupancy for other 1×1 km quadrats, provided we have the values of the covariates for each of them. The unmarked package has a data set with elevation and forest cover for all of Switzerland.

**library(unmarked)
data(Switzerland)
str(Switzerland)
# 'data.frame': 42275 obs. of 5 variables:
# $ x : int 910942 910942 911942 911942 911942 911942 ...
# $ y : int 54276 55276 54276 55276 56276 57276 ...
# $ elevation: int 340 340 380 390 357 357 500 472 462 472 ...
# $ forest : int 6 11 4 72 9 5 0 43 6 0 ...
# $ water : int 0 1 0 3 7 5 0 0 0 0 ...
# Convert elevation and forest cover to rasters and plot
library(raster)
elevR <- rasterFromXYZ(Switzerland[,1:3])
forR <- rasterFromXYZ(Switzerland[,c(1,2,4)])
par(mfrow=1:2)
plot(elevR, main="Elevation", axes=FALSE, box=FALSE,
col=terrain.colors(255))
plot(forR, main="Forest cover", axes=FALSE, box=FALSE)
par(mfrow=c(1, 1))**

Since we have the habitat variables in a data frame, we could use those to calculate $\psi$ and then convert that to a raster for plotting. But as Ugyen pointed out in the comments, you will usually be starting with a raster layer with those variables, so we’ll work with the rasters. Ordinary arithmetical operations work fine for rasters, but some functions – notably `standardize2match`

and `plogis`

– do not.

We standardise the elevation and forest data to match the way we standardised the original data for the willow tits, then we can calculate occupancy for every pixel of the map and plot it. We’ll use the means of the posterior distributions to do this.

**# Scale the elevation and forest covariates as we did for the willow tits data
elevRS <- (elevR - mean(wt$elev)) / sd(wt$elev)
forRS <- (forR - mean(wt$forest)) / sd(wt$forest)
# Calculate occupancy for each pixel of the map
logit.psi <- with(out1$mean, b0 + bFor * forRS +
bElev * elevRS + bElev2 * elevRS^2)
psiR <- 1 / (1 + exp(-logit.psi))
# Now plot the new raster
mapPalette <- colorRampPalette(c("grey", "yellow",
"orange", "red"))
plot(psiR, col=mapPalette(100),
main="Willow tit occupancy",
axes=FALSE, box=FALSE)**

### What next?

This data set is used by Royle & Dorazio (2008) *Hierarchical modeling and inference in ecology* and Hooten & Hobbs (2015) A guide to Bayesian model selection for ecologists. *Ecological Monographs*, 85, 3-28. We plan to add pages for goodness-of-fit checks and model selection in the future. But priority goes to an analysis with survey covariates for detection probability.

Download a ZIP file with the code here.

This looks great! Is there a possibility to demonstrate how to produce predictions maps from raster layers (importing the raster files) by using the parameter estimates (beta coefficients) from the model? Not an urgency but we can think of that later.

That’s a good point Ugyen, I’ve rewritten the code to work with rasters instead of the data frame.

Mike,

This is really good.

Would it be possible to elaborate on categorical co-variates? I know that many students struggle with these in Bayesian model code and to interpret output?

Only thought – or direct to some relevant examples and code?

Thanks!

Hi Lourens, Yes, that’s on the to-do list, preferably with an example where we do have a categorical covariate. In the meantime, there’s a bit on the topic here.