If we want to know the abundance of some species in a study area, why don’t we just go out and count them?

That will work if we can be sure that we detect every individual there, but in most real-life situations we are going to miss some individuals. Visiting the site several times and gathering data for replicate counts would be better. The highest count is going to be closer to the true value, but may still not be a complete count.

As is often the case in wildlife research, we need some way to estimate the probability of detecting individuals and to correct our observed counts for imperfect detection.

In 2004 Andy Royle^{1} devised an approach to model abundance when we have both temporal and spatial replicate observations of a population, i.e., multiple observations of multiple sites.

^{1} Royle, A.J. (2004) N-mixture models for estimating population size from spatially replicated counts. *Biometrics*, 60, 108-115.

As with most of our models, we have two things going on, so two submodels.

## The biological or ecological model

This describes how the individuals are distributed among the sites. We might use a Poisson distribution, so that the number at each site, $N_i$, is drawn from a Poisson distribution with some common parameter for the mean, $\lambda$.

$$N_i \sim {\rm Poisson}(\lambda)$$
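As a minimal sketch of the biological model, the snippet below draws $N_i$ for a set of sites using only the Python standard library. The values of `lam` and `n_sites`, the seed, and the helper `rpois` are all assumptions made for illustration; `rpois` uses Knuth's simple multiplication method, which is adequate for small means but is not what you would use in serious work.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method.

    Adequate for small lam; a library sampler is preferable in real work.
    """
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    # Keep multiplying uniform draws until the product falls below exp(-lam)
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(42)   # fixed seed so the run is repeatable
lam = 3.0                 # assumed mean abundance per site
n_sites = 200             # assumed number of sites

# True (unobserved) abundance at each site
N = [rpois(lam, rng) for _ in range(n_sites)]

print("first ten sites:", N[:10])
print("mean abundance :", sum(N) / n_sites)
```

With a couple of hundred sites the mean of the simulated values should sit close to `lam`, and some sites will typically have no individuals at all.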

This can easily be extended to include covariates which may affect the number at a site.
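One common way to do this, sketched here as an assumption rather than something prescribed above, is to give each site its own mean and model it with a log link. Here $x_i$ is a hypothetical site covariate (elevation, say) and $\beta_0$ and $\beta_1$ are coefficients to be estimated:

$$N_i \sim {\rm Poisson}(\lambda_i), \qquad \log(\lambda_i) = \beta_0 + \beta_1 x_i$$

The log link keeps every $\lambda_i$ positive whatever values the coefficients take.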

Alternatively, we could use a negative binomial distribution, which allows a wider spread of values, or a zero-inflated Poisson distribution, which allows a greater proportion of zeros than a standard Poisson distribution.
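For instance, a zero-inflated Poisson is often written with an extra binary variable for each site. The symbols $z_i$ and $\psi$ are introduced here for illustration only: $z_i$ indicates whether site $i$ can hold any individuals at all, and $\psi$ is the probability that it can.

$$z_i \sim {\rm Bernoulli}(\psi), \qquad N_i \sim {\rm Poisson}(z_i \lambda)$$

When $z_i = 0$ the Poisson mean is zero, so $N_i$ must be zero; these "structural" zeros are in addition to the zeros that the Poisson distribution produces by chance when $z_i = 1$.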

## The detection or count model

Here we consider how the count data are generated. We generally use a straightforward binomial model, where each individual at the site is considered as a trial with some probability of success, $p$, and the count at site $i$ on visit $j$, $C_{ij}$, is the number of successes.

$$C_{ij} \sim {\rm Binomial}(N_i, p)$$
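The count model can be sketched in the same way. The true abundances in `N`, the detection probability `p`, and the number of visits are made-up values for illustration, and `rbinom` is a hypothetical helper that simply counts successes in a series of Bernoulli trials.

```python
import random

def rbinom(n, p, rng):
    """Draw one Binomial(n, p) value as the number of successes in n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(1)   # fixed seed so the run is repeatable
N = [4, 0, 7, 2, 5]      # assumed true abundance at five sites
p = 0.4                  # assumed detection probability
n_visits = 3             # assumed number of replicate counts per site

# C[i][j] is the count at site i on visit j
C = [[rbinom(N_i, p, rng) for _ in range(n_visits)] for N_i in N]

for N_i, counts in zip(N, C):
    print(f"true N = {N_i}, counts = {counts}, highest count = {max(counts)}")
```

No count can exceed the true number at its site, and with a detection probability well below 1 even the highest of the replicate counts will often fall short of it.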

The probability of success – or probability of detection in this case – can be affected by covariates.
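A usual way to include them, given here as an illustrative assumption, is a logit link, which keeps the probability between 0 and 1. Here $w_{ij}$ is a hypothetical covariate recorded at site $i$ on visit $j$ (wind speed, say) and $\alpha_0$ and $\alpha_1$ are coefficients to be estimated:

$$C_{ij} \sim {\rm Binomial}(N_i, p_{ij}), \qquad {\rm logit}(p_{ij}) = \alpha_0 + \alpha_1 w_{ij}$$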

In the next module we will simulate some count data, which will give you a better understanding of how it works.

## N-mixture models

These models are described as N-mixture models, a term which comes from maximum likelihood estimation. Every possible value of the number of individuals at a site, *N*, is considered – in practice, from the highest count to some suitably large number. A likelihood is calculated for the parameter values and each value of *N*, and these likelihoods are then summed. If *N* goes up to several hundred, this can take a long time, and MLE analysis can take many hours.
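For the basic Poisson-binomial case with no covariates, the summation for a single site can be written as follows, where $K$ is a label introduced here for the "suitably large number" at which the sum is cut off:

$$L(\lambda, p \mid C_{i1}, \dots, C_{iJ}) = \sum_{N = \max_j C_{ij}}^{K} {\rm Poisson}(N \mid \lambda) \prod_{j=1}^{J} {\rm Binomial}(C_{ij} \mid N, p)$$

The sketch below evaluates this sum directly. It is an illustration under stated assumptions, not the code used by any particular package: the function names, the example counts and the parameter values are all hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of n, computed on the log scale to avoid overflow."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_pmf(c, n, p):
    """Binomial probability of c successes in n trials."""
    return math.comb(n, c) * p**c * (1 - p)**(n - c)

def site_likelihood(counts, lam, p, K):
    """Likelihood of one site's replicate counts, summed over N from max(counts) to K."""
    total = 0.0
    for n in range(max(counts), K + 1):
        # Probability of all the replicate counts if the true number is n
        detection = 1.0
        for c in counts:
            detection *= binomial_pmf(c, n, p)
        # Weight by the probability that the true number is n
        total += poisson_pmf(n, lam) * detection
    return total

counts = [2, 3, 1]   # hypothetical replicate counts at one site
for K in (5, 20, 100):
    print(K, site_likelihood(counts, lam=4.0, p=0.5, K=K))
```

The sum stops changing once $K$ is well beyond the plausible range of *N*, which is how a cut-off is chosen in practice. The full likelihood is the product of these sums across sites, and the whole calculation is repeated every time the optimiser tries new values of $\lambda$ and $p$, which is why it becomes slow when $K$ is large.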

Analysis in JAGS is more straightforward. The values of *N* for each site are simply a set of parameters to be estimated; we do not need to sum across all possible values.

I think “Poisson-binomial model” would be a better name, but the N-mixture name is now well established in the literature. And it would get complicated if you chose another distribution instead of the classic Poisson.

## Model assumptions

As always, the basic model makes a number of assumptions, some of which can be relaxed by making the model more complex:

- Closure: the number of individuals available for counting at a site is the same for every replicate count.
- No false positives, no double counting: individuals may be missed, so the counts can be smaller than the true number, but a count can never be larger than the truth.
- All individuals have the same probability of detection, and detections are independent.
- Individuals are distributed randomly and independently in space (the underlying assumption of the Poisson distribution).
- Sites are spatially independent.

Link et al. (2018)^{2} pointed out that even quite small departures from these assumptions – too small to be detected by goodness-of-fit tests – would seriously impact population estimates. They recommend (1) treating estimates as indices of relative abundance, and (2) collecting additional data to estimate probability of detection directly.

^{2} Link, W.A., Schofield, M.R., Barker, R.J., & Sauer, J.R. (2018) On the robustness of N-mixture models. *Ecology*, 99, 1547-1551.

## What next?

The next page shows data simulated from a Poisson-binomial model and its analysis in JAGS.