In this module we will extend the simplest occupancy model that we used for the salamanders analysis to include multiple species. The basic requirements of occupancy modelling are the same: **no false positives**; independent, **replicate observations** at least a subset of sites; and **closure**, which here means that the species is either present or absent at a site for all observations.

## Route NH 017

Our example data come from the North American Breeding Bird Survey (NABBS), specifically for NABBS route #017 in New Hampshire in 1991. Fifty sites at half-mile (800m) intervals were surveyed on 11 occasions in June and July by an experienced birder, and the species detected were recorded. The data are from Royle & Dorazio (2008)^{1}Royle, J.A. & Dorazio, R.M. (2008) *Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities*. Academic Press. We have transposed the data so that sites are in rows and species in columns.. You can download and check the data with the following code.

**NABBS <- read.csv("http://bcss.org.my/data/NABBS_NH017.csv",
row.names=1, comment="#")
str(NABBS)
# 'data.frame': 50 obs. of 99 variables:
# $ VIREO.Red.eyed : int 0 1 0 5 4 1 3 11 8 8 ...
# $ ROBIN. : int 11 11 11 9 9 5 10 8 5 8 ...
# $ OVENBIRD. : int 0 1 0 3 2 3 0 3 7 6 ...
# [...]
# $ CUCKOO.Yellow.billed : int 0 0 0 0 0 0 0 0 0 0 ...
Ymat <- as.matrix(NABBS)
table(Ymat)
# 0 1 2 3 4 5 6 7 8 9 10 11
# 3638 494 245 139 96 83 56 44 58 40 32 25
colSums(Ymat)
# VIREO.Red.eyed ROBIN. OVENBIRD.
# 353 252 252
# [...]
# MERLIN. VULTURE.Turkey CUCKOO.Yellow.billed
# 1 1 1
nOcc <- 11 # Each site visited 11 times
( nSobs <- ncol(Ymat) ) # Number of species observed
# 99
( nSites <- nrow(Ymat) ) # Number of sites along the route
# 50**

A total of 99 species were recorded, ranging from the Red-eyed Vireo with 353 detections (out of 50 x 11 = 550 surveys) to 12 species with only 1 detection each.

## A ‘stacked’ analysis

Our first analysis with just be a single-species analysis repeated 99 times. Doing it in a single call to JAGS means that the posterior distributions are all in the same output object, and the code gives us a starting point for more interesting models.

The model diagram is exactly the same as that for salamanders with the addition of the index *k* for the different species.

**# Single species models stacked
# File name "MSOM_stacked.jags"
model{
for(k in 1:nSobs){ # Loop through species
# Likelihood
for(i in 1:nSites) {
# Ecological model
z[i, k] ~ dbern(psi[k])
# Observation model
y[i, k] ~ dbin(p[k] * z[i, k], nOcc)
}
# Priors
p[k] ~ dbeta(1, 1) # Uninformative prior
psi[k] ~ dbeta(1, 1)
}
}**

Again the JAGS code is the same as for the single species model apart from the loop for the species with index `k`

. We prepare the data and run the model with the following R code:

**# Pack up everything for JAGS
z <- (Ymat > 0)*1 # *1 converts to numeric, keeps matrix
z[z == 0] <- NA
jagsData <- list(y = Ymat, nSobs = nSobs, nSites = nSites,
nOcc = nOcc, z = z)
str(jagsData)
# List of 5
# $ y : int [1:50, 1:99] 0 1 0 5 4 1 3 11 8 8 ...
# $ nSobs : int 99
# $ nSites: int 50
# $ nOcc : num 11
# $ z : num [1:50, 1:99] NA 1 NA 1 1 1 1 1 1 1 ...
wanted <- c("psi", "p") # Parameters monitored
# model run takes about 1.5 mins
library(jagsUI)
outS <- jags(jagsData, NULL, wanted, "MSOM_stacked.jags",
n.chains=3, n.adapt=100, n.burnin = 500, n.iter=12500,
n.thin=6, parallel=TRUE, DIC=FALSE)
library(mcmcOutput)
diagPlot(outS)**

All the diagnostic plots look ok. We only show the diagnostic plots for 8 nodes, 4 for species with lots of detections and 4 for species with only 1 detection each. The first four look fine, but the last 4 have a mode at 0 and are very small.

We won’t bother to print out the values for all 198 nodes, instead we’ll plot them.

**op <- par(mfrow=1:2)
plot(outS$mean$psi, 1:nSobs, xlim=0:1,
pch=16, xlab="psi", ylab="Species")
segments(outS$q2.5$psi, 1:nSobs,outS$q97.5$psi, 1:nSobs)
plot(outS$mean$p, 1:nSobs, xlim=0:1,
pch=16, xlab="p", ylab="Species")
segments(outS$q2.5$p, 1:nSobs,outS$q97.5$p, 1:nSobs)
par(op)**

The species with most detections are at the bottom of the plot, and both `p`

and `psi`

are reasonably well estimated. Number 62 looks odd, with high `p`

and low `psi`

: this is the Cliff Swallow, which was detected at only 1 site but 9 times out of 11 at that site; a nesting colony is likely nearby. The 12 species with only 1 detection each are at the top and unsurprisingly `psi`

is not well estimated, with credible intervals going from near 0 to practically 1. I think we can do better.

## A hierarchical model

We add a layer to the ‘stacked’ analysis to model $\psi$ and $p$. In the analysis of women’s heights in the Galton example, we used a normal distribution; we know that heights of adult humans are pretty well described by a normal distribution. In the sock-throwing example, we used a normal distribution to model the throwing ability of the 108 participants on the logit scale, and estimated the mean and SD of the distribution.

We think the 99 values of ${\rm logit}(\psi)$ can be modelled as draws from a normal distribution with mean and SD to be estimated from the data, and the same for ${\rm logit}(p)$. The advantage of working with the logit scale is that covariates can be added easily.

We replace the Beta prior for species-level $\psi_k$ with $${\rm logit}(\psi_k) \sim {\rm Normal}(\mu_{\rm lpsi}, \sigma_{\rm lpsi})$$and then add community-level hyperpriors$$\overline{\psi} \sim {\rm Beta}(1,1)$$ $$\mu_{\rm lpsi} = {\rm logit}(\overline{\psi})$$ $$\sigma_{\rm lpsi} \sim {\rm Uniform}(0, 5)$$

We also replace the Beta prior for $p_k$ with similar priors and hyperpriors. The model diagram looks like this:

And here’s the JAGS code:

**# Hierarchical model
# File name "MSOM_hierarch.jags"
model{
for(k in 1:nSobs){ # Loop through species
# Likelihood
for(i in 1:nSites) {
# Ecological model
z[i, k] ~ dbern(psi[k])
# Observation model
y[i, k] ~ dbin(p[k] * z[i, k], nOcc)
}
# Priors (species level)
lpsi[k] ~ dnorm(mu.lpsi, tau.lpsi)
psi[k] <- ilogit(lpsi[k])
lp[k] ~ dnorm(mu.lp, tau.lp)
p[k] <- ilogit(lp[k])
}
# Hyperpriors (community level)
psi.mean ~ dbeta(1, 1)
mu.lpsi <- logit(psi.mean)
sd.lpsi ~ dunif(0, 5)
tau.lpsi <- 1/sd.lpsi^2
p.mean ~ dbeta(1, 1)
mu.lp <- logit(p.mean)
sd.lp ~ dunif(0, 5)
tau.lp <- 1/sd.lp^2
}**

The data package for this model is the same. Here is the R code to specify the parameters to monitor and run the analysis:

**# Parameters monitored
wanted <- c("psi.mean", "p.mean", "mu.lpsi", "sd.lpsi",
"mu.lp", "sd.lp","psi", "p")
# model run takes about 1.5 mins
outH <- jags(jagsData, NULL, wanted, "MSOM_hierarch.jags",
n.chains=3, n.adapt=1000, n.iter=12000, n.thin=6,
parallel=TRUE, DIC=FALSE)
# mean sd 2.5% 50% 97.5%
# psi.mean 0.263 0.037 0.196 0.262 0.340
# p.mean 0.154 0.017 0.120 0.153 0.188
# sd.lpsi 1.705 0.155 1.431 1.694 2.031
# sd.lp 1.096 0.125 0.874 1.088 1.366
diagPlot(outH)**

Again convergence is ok and the diagnostic plots look good. We only show the plots for the hyperpriors and the first 4 and last 4 for $\psi$ and $p$. We note that the posteriors for` sd.psi`

and `sd.p`

are not constrained by the Uniform(0, 5) prior.

Again we’ll plot the posterior means and CrIs for all the psi and p nodes. We’ll also add the community mean and the mean ±2 SD.

**op <- par(mfrow=1:2)
plot(outH$mean$psi, 1:nSobs, xlim=0:1,
pch=16, xlab="psi", ylab="Species")
segments(outH$q2.5$psi, 1:nSobs,outH$q97.5$psi, 1:nSobs)
abline(v=plogis(outH$mean$mu.lpsi), col='red')
abline(v=plogis(outH$mean$mu.lpsi + c(2, -2)*outH$mean$sd.lpsi),
col='red', lty=3)
plot(outH$mean$p, 1:nSobs, xlim=0:1,
pch=16, xlab="p", ylab="Species")
segments(outH$q2.5$p, 1:nSobs,outH$q97.5$p, 1:nSobs)
abline(v=plogis(outH$mean$mu.lp), col='red')
abline(v=plogis(outH$mean$mu.lp + c(2, -2)*outH$mean$sd.lp),
col='red', lty=3)
par(op)**

The values for the species with lots of detections – at the bottom of the plot – are practically the same as before, but the CrIs for those with few data are now a lot narrower. For these, $p$ is somewhat higher, closer to the community mean, and $\psi$ somewhat lower.

The following plots compare the means and SDs for the stacked and hierarchical models:

The means don’t show a lot of difference, though the very low estimates of $p$ (for species with a single detection) have been pulled up slightly, and the corresponding $\psi$ have come down. There’s more difference in the SDs, again especially for species with a single detection (the clump of points on the right of each plot), which are much more precisely estimated.

How does partial pooling of the information for all 99 species improve estimation? Imagine you have a fuzzy photo of a person walking across a car park and you want to estimate their height. Maybe your first guess is 165-170 cm. Then you are told that the person is Dutch, male, aged 30. These are (on average) the tallest people in the world, mean height 183.8 cm, SD 7.1. Now the question becomes “How tall is this Dutchman?” and you’ll likely revise your estimate upwards and be more confident in your estimate. On the other hand, if it were a Japanese woman (mean 158.1 ± 5.3 cm), you would revise it downwards. In the same way, knowing that the species in question is a New Hampshire bird, and having information about other bird species in New Hampshire, helps in estimating $p$ and $\psi$ for a species with few data.

## Correlated priors

We noted above that species with high $p$ also tend to have high $\psi$. This positive correlation makes biological sense: common species tend to be present at more sites *and *more abundant – and hence more likely to be detected – where they occur.

In the models above we assumed independent priors for $p$ and $\psi$. We can extend the model by using correlated priors and estimating the correlation coefficient.

Several methods for doing this in JAGS are available: see here for a review of 4 of them. All 4 work fine with our data (I tried them all) and give the same output, so we’ll only show one. In their analysis of the NABBS data, Royle & Dorazio (2008, p.285) inserted a few extra lines to link together the priors for $p$ and $\psi$:

**# Hierarchical model
# File name "MSOM_corr.jags"
model{
for(k in 1:nSobs){ # Loop through species
# Likelihood
for(i in 1:nSites) {
# Ecological model
z[i, k] ~ dbern(psi[k])
# Observation model
y[i, k] ~ dbin(p[k] * z[i, k], nOcc)
}
# Priors (species level)
lpsi[k] ~ dnorm(mu.lpsi, tau.lpsi)
mu.eta[k] <- mu.lp + rho * sd.lp/sd.lpsi *
(lpsi[k] - mu.lpsi)
lp[k] ~ dnorm(mu.eta[k], tau.eta)
psi[k] <- ilogit(lpsi[k])
p[k] <- ilogit(lp[k])
}
# Hyperpriors (community level)
psi.mean ~ dbeta(1, 1)
mu.lpsi <- logit(psi.mean)
sd.lpsi ~ dunif(0, 5)
tau.lpsi <- 1/sd.lpsi^2
p.mean ~ dbeta(1, 1)
mu.lp <- logit(p.mean)
sd.lp ~ dunif(0, 5)
tau.lp <- 1/sd.lp^2
rho ~ dunif(-1, 1)
tau.eta <- tau.lp/(1 - rho^2)
}**

It looks as if the 2 parameters are treated differently, but remember that JAGS does not execute the lines in sequence; the code only defines the relationships between nodes, so there is no asymmetry. The code to run the model is similar: we include `rho `

in the parameters monitored:

**# Parameters monitored
wanted <- c("rho", "psi.mean", "p.mean", "sd.lpsi",
"sd.lp", "mu.lpsi","mu.lp", "psi", "p")
# model run takes about 2 mins
library(jagsUI)
outC <- jags(jagsData, NULL, wanted, "MSOM_corr.jags",
n.chains=3, n.adapt=1000,n.iter=12000, n.thin=6,
parallel=TRUE, DIC=FALSE)
jags.View(outC1)
# mean sd 2.5% 50% 97.5%
# rho 0.414 0.114 0.178 0.420 0.616
# psi.mean 0.251 0.035 0.186 0.251 0.321
# p.mean 0.156 0.016 0.125 0.155 0.186
# sd.lpsi 1.682 0.146 1.416 1.674 1.998
# sd.lp 1.002 0.110 0.812 0.993 1.240
# ...
diagPlot(outC)**

The diagnostic plots all look fine. As we expected, the correlation coefficient is clearly positive, but not very big. The values for the hyperprior parameters are almost the same. We’ll do scatter plots to check the differences in $p$ and $\psi$:

The means are very similar, with just a few being pulled down. But including correlation in the priors has a big effect on the SDs for $\psi$, especially again for the species with only 1 or 2 detections.

Modelling the individual species’ $\psi$ and $p$ as draws from a community-level distribution allowed us to use information from the community to get better estimates for species with few data. In the same way, having a mechanism to link correlated parameters allows information to be shared. To return to the previous example, knowing the Dutchman’s weight would help in estimating his height, because height and weight are correlated.

There’s more information available for $p$ than for $\psi$. Each site contributes one bit of information for estimating $\psi$. But for $p$, each *visit *to an occupied site contributes one bit, so the estimate is based on a larger sample than for $\psi$. Pooling information has a bigger effect on the estimates of $\psi$ than $p$, as can be seen in the scatter plots above.

Let’s do those plots of $\psi$ than $p$ for each species one more time, now with the narrower CrIs:

## What next?

We have no covariate information about the sites along Route NH 017. In the next unit we will work with a data set which does have covariates, and we will use the $z$ matrix to investigate differences between species assemblages at the different sites.

You can download a ZIP file with the data and code here.