Sometimes we want to include the presence of another species in our occupancy model. For example, leopard occupancy might be lower at sites where tigers are present because of competitive interactions, or it might be lower at sites where a preferred prey species such as muntjac is absent.

These we can model as one-way interactions: leopards avoid tigers but tigers don’t care about leopards; leopards need muntjac but muntjac surely don’t need leopards. In these systems, the leopards are the subordinate species and tigers or muntjac are dominant. (It may seem odd to describe prey as dominant and predators as subordinate, but that’s often the ecological reality.)

If we had reliable information on the presence or absence of the dominant species, we could include that in our leopard model as a normal covariate. But usually we only have detection/nondetection data, and need to allow for sites where the dominant species is present but not detected. We thus need to model occupancy for both species simultaneously. We’ll see how to do it using JAGS on this page.

## The example data set

Richmond et al (2010)^{1}Richmond, O.M.W., Hines, J.E., & Beissinger, S.R. (2010) Two-species occupancy models: a new parameterization applied to co-occurrence of secretive rails. Ecological Applications, 20, 2036-2046 describe a study of rails at freshwater marshes in California, USA. They hypothesized that occupancy for the sparrow-sized Black Rail would be lower at sites where the larger Virginia Rail was present.

The data collected by Richmond et al are not available, but the `wiqid`

package has a set of simulated data based on their results. You can load and look at the data with the following code:

**library(wiqid)
?railSims
data(railSims)
head(railSims)
# A1 A2 A3 B1 B2 B3 logArea reeds
# 1 0 0 0 1 1 1 -0.585 TRUE
# 2 1 1 1 0 1 0 -0.231 FALSE
# 3 1 0 0 0 1 1 1.686 FALSE
# 4 0 0 0 1 1 0 0.092 TRUE
# 5 0 0 0 1 1 1 0.155 FALSE
# 6 1 1 1 0 0 0 1.854 FALSE
# Separate the two detection histories
DHA <- as.matrix(railSims[, 1:3])
DHB <- as.matrix(railSims[, 4:6])**

The first 3 columns in the data frame are detection/nondetection data for the dominant species (Virginia rail) for 3 surveys at each site, and the next 3 columns are the detection/nondetection data for the Black rail for the same 3 surveys.

`logArea`

and `reeds`

are site covariates which we will discuss later.

## A model with no interaction

Without interaction, the model is equivalent to separate models for each species. The JAGS code below is the same as that used for salamanders, except that each line is repeated for each species:

**# File name "2sps_noInt.jags"
model{
# Likelihood
for(i in 1:nSites) {
# biological model
zA[i] ~ dbern(psiA) # 1 if sps A present
zB[i] ~ dbern(psiB) # 1 if sps B present
# observation model
yA[i] ~ dbin(pA * zA[i], n)
yB[i] ~ dbin(pB * zB[i], n)
}
# Priors
pA ~ dbeta(1, 1) # Uninformative priors
pB ~ dbeta(1, 1)
psiA ~ dbeta(1, 1)
psiB ~ dbeta(1, 1)
}**

Running this model is straightforward and produces the same output that we would get if we ran two single-species models. We will go on now to the more interesting model with interaction.

## The model with occupancy interaction

Recall that we are interested in one-way interaction: occupancy of species B is affected by the presence of species A, but B has no effect on A. Thus, the code for species A is exactly as before.

For species B, we have 2 values for probability of occupancy, `psiB[1]`

for sites where species A is absent, `psiB[2]`

for sites where it’s present. Which to use at a particular site, `i`

, depends on the value of `zA[i]`

, or rather `zA[i] + 1`

.

**# File name "2sps_Int.jags"
model{
# Likelihood
for(i in 1:nSites) {
# biological model
zA[i] ~ dbern(psiA)
zB[i] ~ dbern(psiB[zA[i] + 1])
# observation model
yA[i] ~ dbin(pA * zA[i], n)
yB[i] ~ dbin(pB * zB[i], n)
}
# Priors
pA ~ dbeta(1, 1) # Uninformative priors
pB ~ dbeta(1, 1)
psiA ~ dbeta(1, 1)
psiB[1] ~ dbeta(1, 1)
psiB[2] ~ dbeta(1, 1)
}**

Now we prepare the data; as before we can aggregate the detections across the 3 survey occasions. Then we can run the model fit:

**# Aggregate detection data
yA <- rowSums(DHA)
yB <- rowSums(DHB)
nSites <- nrow(railSims)
zA <- ifelse(yA > 0, 1, NA)
zB <- ifelse(yB > 0, 1, NA)
jagsData <- list(yA = yA, yB = yB, n = 3, nSites = nSites,
zA = zA, zB = zB)
str(jagsData)
wanted <- c("pA", "pB", "psiA", "psiB")
library(jagsUI)
out <- jags(jagsData, NULL, wanted, model="2sps_Int.jags",
n.chains=3, n.adapt=1000, n.iter=10000, DIC=FALSE,
parallel=TRUE)
mc2sps1 <- mcmcOutput(out)
summary(mc2sps1)
# mean sd median l95 u95 Rhat MCEpc
# pA 0.796 0.028 0.797 0.742 0.852 1.000 0.797
# pB 0.717 0.029 0.717 0.657 0.770 1.000 0.846
# psiA 0.467 0.040 0.467 0.389 0.543 1.000 0.564
# psiB[1] 0.818 0.045 0.821 0.729 0.902 0.999 0.690
# psiB[2] 0.391 0.057 0.390 0.279 0.502 1.000 0.605
diagPlot(mc2sps1, max=5)
postPlot(mc2sps1)
postPlot(mc2sps1, "psiB", xlim=c(0.25, 0.95))**

The diagnostic plots and posterior plots (not shown) look fine. Plotting the posterior distributions of the two `psiB`

parameters with the same x axis clearly shows the difference in occupancy, with black rail occupancy higher where the Virginia rail is absent (`psiB[1]`

) than where it is present (`psiB[2]`

).

## Interaction and detection

One species may behave differently when the other is present, and this may affect the probability of detection. Rails are secretive but have loud calls, so are mostly detected from calls. A change in the frequency of calling will affect detection probability.

We can model this by having 2 values for `pA`

and `pB`

, the first used for sites where the other species is absent and the second when it’s present. The observation model then becomes:

**yA[i] ~ dbin(pA[zB[i] + 1] * zA[i], n)
yB[i] ~ dbin(pB[zA[i] + 1] * zB[i], n)**

A further possibility is that such effects may be occasion-specific: in this case, that the small rails keep quiet (and are not detected) when the big rails are making a noise (and are detected). We then have 3 values for `pB`

depending on whether:

- the big rails are absent,
`zA = 0`

- the big rails are present but not calling (= not detected),
`zA = 1, DHA = 0`

- the big rails are present and calling (= detected),
`zA = 1, DHA = 1`

We can’t use the aggregated data for this, as we have to deal with the survey occasions one by one. Here is the JAGS code:

**# File name "2sps_detect.jags"
model{
# Likelihood
for(i in 1:nSites) {
# biological model
zA[i] ~ dbern(psiA)
zB[i] ~ dbern(psiB[zA[i] + 1])
# observation model
for(j in 1:n) {
DHA[i, j] ~ dbern(pA[zB[i] + 1] * zA[i])
DHB[i, j] ~ dbern(pB[zA[i] + DHA[i, j] + 1] * zB[i])
}
}
# Priors
pA[1] ~ dbeta(1, 1) # spsB absent
pA[2] ~ dbeta(1, 1) # spsB present
pB[1] ~ dbeta(1, 1) # spsA absent
pB[2] ~ dbeta(1, 1) # spsA present, not detected
pB[3] ~ dbeta(1, 1) # spsA present, detected
psiA ~ dbeta(1, 1)
psiB[1] ~ dbeta(1, 1)
psiB[2] ~ dbeta(1, 1)
}**

And here’s the code to prepare everything and run the model:

**jagsData <- list(DHA = DHA, DHB = DHB, n = 3, nSites = nSites,
zA = zA, zB = zB)
str(jagsData)
wanted <- c("pA", "pB", "psiA", "psiB")
out <- jags(jagsData, NULL, wanted, model="2sps_detect.jags",
n.chains=3, n.adapt=1000, n.iter=10000, DIC=FALSE,
parallel=TRUE)
mc2sps2 <- mcmcOutput(out)
summary(mc2sps2)
# mean sd median l95 u95 Rhat MCEpc
# pA[1] 0.846 0.039 0.848 0.766 0.919 1.000 1.015
# pA[2] 0.729 0.053 0.732 0.623 0.829 1.000 1.129
# pB[1] 0.809 0.029 0.810 0.750 0.864 1.001 0.834
# pB[2] 0.679 0.104 0.685 0.482 0.881 1.000 1.156
# pB[3] 0.350 0.069 0.349 0.218 0.486 0.999 1.271
# psiA 0.469 0.040 0.469 0.392 0.547 1.001 0.579
# psiB[1] 0.802 0.044 0.804 0.716 0.885 1.000 0.531
# psiB[2] 0.481 0.081 0.476 0.331 0.646 1.000 1.403
diagPlot(mc2sps2)
plot(mc2sps2)
plot(mc2sps2, "pA", xlim=c(0.6, 0.95))
plot(mc2sps2, "pB", xlim=c(0.2, 0.95))**

The model runs quickly and the diagnostic plots and posterior plots (not shown) look fine.

As we suspected, the little rails are more difficult to detect when the big rails are around, and especially when they are actively calling.

More surprising is that the big rails behaviour changes when the small ones are present, making them less detectable. But the difference is small and the posterior distributions overlap, so let’s check on the difference:

**diff <- apply(mc2sps2$pA, 1, diff)
postPlot(diff, compVal=0)**

So indeed, we can be 95% certain that detection does decline, based on our data and uniform priors.

We’ll include these observation process interactions in the models that follow.

## Model with covariates

The area of the habitat patch may affect occupancy (especially for the big rail which has a larger home range), but the difference in occupancy depends on the ratio of the areas of the sites, rather than the difference in area. Hence we work with the logarithm of the area; `logArea`

is the covariate standardized to mean 0 and SD 1.

We incorporate area into the model with a logistic link:` logit(psiA[i]) <- a0 + aArea * area[i]`

Richmond et al (2010) also hypothesized that the presence of reed-beds in the wetland would give the little rails an advantage and allow them to coexist with the big rails.

Since the `reeds`

covariate has only 2 values, `TRUE/FALSE`

, we could implement this by having 3 values for `psiB`

: species A absent; A present with without reeds; A present with reeds. Picking out the right value to use for each is tricky, however.

`reeds`

is a 0/1 vector, so `reeds+1`

is a 1/2 vector. Multiply by `zA`

and we have a 0/1/2 vector, with 0 when species A is absent. Add 1 to get a 1/2/3 vector we can use as the index into a 3-valued `psiB`

vector.

Here’s the complete JAGS model:

**# File name "2sps_covars.jags"
model{
# Likelihood
for(i in 1:nSites) {
# Ecological model
logit(psiA[i]) <- a0 + aArea * area[i]
zA[i] ~ dbern(psiA[i])
zB[i] ~ dbern(psiB[(reeds[i]+1)*zA[i] + 1])
# Observation model
for(j in 1:n) {
DHA[i, j] ~ dbern(pA[zB[i] + 1] * zA[i])
DHB[i, j] ~ dbern(pB[zA[i] + DHA[i, j] + 1] * zB[i])
}
}
# Priors
pA[1] ~ dbeta(1, 1) # as in previous model
pA[2] ~ dbeta(1, 1)
pB[1] ~ dbeta(1, 1)
pB[2] ~ dbeta(1, 1)
pB[3] ~ dbeta(1, 1)
a0 ~ dunif(-5, 5)
aArea ~ dunif(-5, 5)
psiB[1] ~ dbeta(1, 1) # when sps A absent
psiB[2] ~ dbeta(1, 1) # when sps A present, no reeds
psiB[3] ~ dbeta(1, 1) # when sps A present, reeds
}**

Now we can prepare the data and run the model. In the version of `wiqid`

on CRAN, `railSims$logArea`

is a 1-column matrix instead of a vector; we can fix that with `c()`

. We also change `railSims$reeds`

from TRUE/FALSE to 1/0.

**jagsData <- list(DHA = DHA, DHB = DHB, n = 3, nSites = nSites,
zA = zA, zB = zB, area=c(railSims$logArea),
reeds=as.numeric(railSims$reeds))
str(jagsData)
wanted <- c("pA", "pB", "a0", "aArea", "psiB")
out <- jags(jagsData, NULL, wanted, model="2sps_covars.jags",
n.chains=3, n.adapt=1000, n.iter=10000, DIC=FALSE,
parallel=TRUE)
mc2sps3 <- mcmcOutput(out)
summary(mc2sps3)
# mean sd median l95 u95 Rhat MCEpc
# pA[1] 0.834 0.041 0.837 0.752 0.912 0.998 1.064
# pA[2] 0.738 0.053 0.741 0.630 0.835 0.998 1.184
# pB[1] 0.809 0.029 0.810 0.751 0.862 0.999 0.813
# pB[2] 0.691 0.098 0.696 0.496 0.873 1.001 1.015
# pB[3] 0.345 0.070 0.343 0.211 0.484 0.999 1.409
# a0 -0.150 0.195 -0.150 -0.522 0.243 1.000 0.708
# aArea 1.477 0.266 1.467 0.954 1.998 1.000 0.708
# psiB[1] 0.803 0.044 0.806 0.718 0.888 1.001 0.574
# psiB[2] 0.313 0.086 0.306 0.155 0.487 1.000 1.068
# psiB[3] 0.701 0.116 0.698 0.492 0.950 1.001 1.442
diagPlot(mc2sps3)
plot(mc2sps3)
plot(mc2sps3, "psiB", xlim=c(0.1, 1))
plot(mc2sps3, "aArea")**

As before, the runs quickly and converges nicely; the diagnostic and posterior plots (not shown) reveal no problems.

Reed beds clearly enable the smaller Black rail to survive in wetlands where the Virginia rail occurs, giving occupancy estimates almost as high as sites without the big rail.

Wetland area is an important determinant of occupancy for the Virginia rail: a coefficient value of 1.5 is high for a standardised covariate in a logistic regression. We’ll do a plot of probability of occupancy against log(area):

**range(railSims$logArea) # +/- 2.5
xx <- seq(-2.5, 2.5, length=101)
toPlot <- matrix(NA, 101, 3)
colnames(toPlot) <- c("est", "lo", "up")
for(i in 1:101){
psi <- plogis(mc2sps3$a0 + mc2sps3$aArea * xx[i])
toPlot[i, ] <- c(mean(psi), hdi(psi))
}
plot(xx, toPlot[, 1], type='n', ylim=range(toPlot),
xlab="standardised log area", ylab="Occupancy")
polygon(c(xx, rev(xx)), c(toPlot[,2], rev(toPlot[,3])),
col='skyblue', border=NA)
lines(xx, toPlot[,1], lwd=2)**

You can download a ZIP file with all the code here.

Febri Widodo emailed me with a question about calculation of the Species Interaction Factor (SIF) (MacKenzie et al, 2004) from the output of

`wiqid::occ2sps`

. That’s a mess, but it’s very easy with Bayesian MCMC output. Here’s the calculation based on the`mc2sps1`

output:`SIF <- mc2sps1$psiB[,2] / (mc2sps1$psiA*mc2sps1$psiB[,2] +`

(1-mc2sps1$psiA)*mc2sps1$psiB[,1])

postPlot(SIF)

Here’s the plot

Mike,

Thanks again for the blog posts, they’ve been extremely helpful.

For the species-interaction model with covariates on psiA (area) and psiB (reed), how might the structure of the model differ if reed was a continuous covariate?

Joshua,

Good question! Suppose we have a continuous variable

`reediness`

which goes from 0 to 100. We simulate some random values and standardize with`set.seed(27)`

reediness <- runif(nSites, 0, 100)

`reedinessS <- standardize(reediness)`

In the JAGS code, in the "sites" loop, we need

`logit(psiBA[i]) <- b0 + bRd * reediness[i] # when sps A present`

zB[i] ~ dbern(ifelse(zA[i]==1, psiBA[i], psiBa))

[Edit: the first argument to

`ifelse`

should be a condition, so corrected it to`zA[i]==1`

.]The first line there calculates

`psiBA[i]`

for each site, depending on the value of`reediness[i]`

, and the second line uses`ifelse`

to select`psiBA[i]`

or`psiBa`

depending on the value of`zA[i]`

.`psiBa`

is the same for all sites.Then we need priors for our new parameters:

`b0 ~ dunif(-5, 5)`

bRd ~ dunif(-5, 5)

psiBa ~ dbeta(1, 1) # when sps A absent

In the R code, we need to change

`jagsData`

to include`reediness=reedinessS`

and`wanted`

to include`b0`

,`bRd`

and`psiBa`

.The rest of the code should then work ok. Of course,

`reediness`

is fiction, so we'd expect`bRd`

to be near zero, but with only 160 sites, spurious correlations can easily arise.Mike,

Thank you!!!!!

Mike,

One more quick question. I also wondered how the 2-species model structure might change if a categorical factor was included in either the “no-interaction” or “interaction” models. In this case, “Time” is the categorical factor.

For the no-interaction model: I assumed it would look like the following:

And for the Interaction Model, I assumed it would look something like the following (I’m much less confident in this

structure though!):

Any assistance on this would be much appreciated, Thanks!

I edited your post to include some formatting which you can’t include and corrected typos (italicized).

I need to run things to see if they work! Let’s assume the first 40 sites for the Rails data were surveyed in (say) January, the next 40 in March, then June and September, and we think occupancy changes through the year. So our

`Time`

covariate will be:`Time <- rep(1:4, each=40)`

Your first model works fine (after corrections) and gives 4 values for each of

`psiA`

and`psiB`

.The second model won’t work: you have

`psiB[1]`

and`psiB[2]`

inside the`nCat`

loop, so will get an error saying you are redefining a node. Take those outside the`nCat`

loop will prevent the error, but`psiB`

will not change with`Time`

.`psiB`

needs to be a matrix with different values for each`Time`

and for each`zA`

. Change the ecological model to`zB[i] ~ dbern(psiB[zA[i] + 1, Time[i]])`

and in the

`nCat`

loop have`psiB[1, i] ~ dbeta(1, 1)`

psiB[2, i] ~ dbeta(1, 1)

Do remember in all this that our sample sizes are getting pretty small; ok if your original sample is huge.

A colleague emailed me asking “how [can we] tweak the model to deal with NA data?”

NAs in the detection data are not a problem, provided the two species have the same pattern of NAs. If the analysis used the detection matrices,

`DHA`

and`DHB`

above, everything should just work.If you are using the aggregated data,

`yA`

and`yB`

, it’s a tad more complicated. You need to include`na.rm=TRUE`

in the call to`rowSums`

. Also the number of occasions is no longer the same for all sites, so you need to do:`n <- rowSums(!is.na(DHA))`

and in the JAGS code use

`n[i]`

instead of`n`

.JAGS doesn't allow NAs in covariates. If these are survey-level covariates (eg, survey date) and the NAs correspond to NAs in the detection matrices, the values don't matter, so just plug in 0 or the mean of the other values or whatever.

Got it, thanks very much Mike for the quick answer 🙂