# Occupancy with site covariates

In the last section we looked at a simple single-species, single-season closed occupancy model. We still keep those assumptions, but now we model probability of occupancy as a function of habitat covariates.

### Swiss willow tits

Our example data come from the Swiss Breeding Bird Survey (“Monitoring Häufige Brutvögel” MHB). Surveys are done 2-3 times per year during the breeding season at 267 1-km square quadrats across Switzerland. We will use the data for willow tits (Poecile montanus) analysed by Royle and Dorazio (2008, p.87ff)1Royle, J.A. & Dorazio, R.M. (2008) Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities. Academic Press..

str(wt)
#'data.frame':   237 obs. of  15 variables:
# $y.1 : int 0 0 0 0 0 0 0 0 1 0 ... #$ y.2   : int  0 0 0 0 0 0 0 0 1 0 ...
# $y.3 : int 0 0 0 0 0 0 0 0 0 1 ... #$ c.1   : int  0 0 0 0 0 0 0 0 2 0 ...
# $c.2 : int 0 0 0 0 0 0 0 0 1 0 ... #$ c.3   : int  0 0 0 0 0 0 0 0 0 1 ...
# $elev : int 420 450 1050 1110 510 630 590 530 ... #$ forest: int  3 21 32 35 2 60 5 13 50 57 ...
# $dur.1 : int 240 160 120 180 210 150 115 155 165 ... #$ dur.2 : int  240 155 105 170 225 180 105 140 165 ...
# $dur.3 : int 240 140 85 145 235 195 95 135 180 ... #$ day.1 : int  29 13 30 23 28 17 16 24 25 21 ...
# $day.2 : int 58 39 47 44 56 56 37 47 46 38 ... #$ day.3 : int  73 62 74 71 73 73 76 74 70 50 ...
# $length: num 6.2 5.1 4.3 5.4 3.6 6.1 5.1 3.7 ... Here’s what the variables are: • y.1 to y.3 : Detection/nondetection (1/0) on each of three visits to each quadrat; • c.1 to c.3 : Counts of territories on each of the three visits; • elev : quadrat elevation (m); • forest : forest cover (%); • dur.1 to dur.3 : duration of each visit (mins); • day.1 to day.3 : Julian date of each visit, 1 = 1st April; • length : length of the transect walked within the quadrat (km). In this section we will use the detection/nondetection data (y.1 to y.3) and the elevation (elev) and forest covariates. ### Data preparation Since we are not using the survey-level covariates (dur.1-dur.3 and day.1-day.3), we can simplify the model by aggregating the detection data to produce a vector with the number of visits to each site$n$and the number of visits when willow tits were detected$y$. # Extract and aggregate the detection data: Y <- wt[, 1:3] ( n <- rowSums(!is.na(Y)) ) # number of visits #[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3... ( y <- rowSums(Y, na.rm=TRUE) ) # number with detection #[1] 0 0 0 0 0 0 0 0 2 1 0 0 0 0 1 2 0 3 0 1 0 0... We will also standardise the elevation and forest data so that our predictors have mean 0 and standard deviation (SD) 1. We use the standardize function in the wiqid package. # Standardise the covariates: elevS <- standardize(wt$elev)
mean(elevS)
# [1] 8.442321e-17
sd(elevS)
# [1] 1
forestS <- standardize(wt$forest) mean(forestS) # [1] 8.606644e-17 sd(forestS) # [1] 1 I like to add an ‘S’ to the end of the variable to indicate that it’s standardised. The means are not quite zero because the SD is adjusted after centering. ### The model As with the salamander model,$z_i$is drawn from a Bernoulli distribution. But now the probability of occupancy depends on the elevation and forest cover, and is different for each site:$$z_i \sim {\rm Bernoulli}(\psi_i)$$ Because$\psi\$ is a probability we use a logistic link, with an expression that’s essentially a logistic regression model. Switzerland is mountainous, with elevations ranging from 250 to 2750m; it’s likely that willow tit occupancy is highest at some middle elevation, so we include elevation^2 as a predictor to give a “humpy” curve rather than straight line. $${\rm logit}(\psi_i) = \beta_0 + \beta_{for} forest + \beta_{elev} elev + \beta_{elev2} elev^2$$

We won’t use covariates for the probability of detection, so the observation model is the same as before:$$y_i \sim {\rm Binomial}(n, pz_i)$$

Work in progress