Mark and recapture

Mark and recapture is a method commonly used in ecology to estimate an animal population's size where it is impractical to count every individual. A portion of the population is captured, marked, and released. Later, another portion is captured and the number of marked individuals within the sample is counted. Since the proportion of marked individuals within the second sample should match the proportion of marked individuals in the whole population, an estimate of the total population size can be obtained by dividing the number of marked individuals by the proportion of marked individuals in the second sample. Other names for this method, or closely related methods, include capture-recapture, capture-mark-recapture, mark-recapture, sight-resight, mark-release-recapture, multiple systems estimation, band recovery, the Petersen method, and the Lincoln method.
Another major application for these methods is in epidemiology, where they are used to estimate the completeness of ascertainment of disease registers. Typical applications include estimating the number of people needing particular services, or with particular conditions.

Field work related to mark-recapture

Typically a researcher visits a study area and uses traps to capture a group of individuals alive. Each of these individuals is marked with a unique identifier, and then is released unharmed back into the environment. A mark-recapture method was first used for ecological study in 1896 by C.G. Johannes Petersen to estimate plaice, Pleuronectes platessa, populations.
Sufficient time is allowed to pass for the marked individuals to redistribute themselves among the unmarked population.
Next, the researcher returns and captures another sample of individuals. Some individuals in this second sample will have been marked during the initial visit and are now known as recaptures. Other animals captured during the second visit will not have been captured during the first visit to the study area. These unmarked animals are usually given a tag or band during the second visit and are then released.
Population size can be estimated from as few as two visits to the study area. Commonly, more than two visits are made, particularly if estimates of survival or movement are desired. Regardless of the total number of visits, the researcher simply records the date of each capture of each individual. The "capture histories" generated are analyzed mathematically to estimate population size, survival, or movement.
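The bookkeeping described above can be sketched in a few lines. This is only an illustration: the record format (an animal identifier paired with a zero-based visit index rather than a calendar date), the function names `capture_histories` and `two_visit_counts`, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict


def capture_histories(records, n_visits):
    """Turn (animal_id, visit_index) records into a 0/1 history per animal."""
    histories = defaultdict(lambda: [0] * n_visits)
    for animal_id, visit in records:
        histories[animal_id][visit] = 1
    return dict(histories)


def two_visit_counts(histories):
    """Summarise a two-visit study as (first-visit, second-visit, both)."""
    first = sum(1 for h in histories.values() if h[0] == 1)
    second = sum(1 for h in histories.values() if h[1] == 1)
    both = sum(1 for h in histories.values() if h[0] == 1 and h[1] == 1)
    return first, second, both


# Hypothetical field records: animals A, B, C on visit 0; A, D, C on visit 1.
records = [("A", 0), ("B", 0), ("C", 0), ("A", 1), ("D", 1), ("C", 1)]
histories = capture_histories(records, n_visits=2)
counts = two_visit_counts(histories)
```

Here animals A and C are recaptures, B was seen only on the first visit, and D only on the second. The three counts are the inputs that the two-visit estimators require.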

Notation

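Let
N = number of animals in the population
n = number of animals marked on the first visit
K = number of animals captured on the second visit
k = number of recaptured animals that were marked
A biologist wants to estimate the size of a population of turtles in a lake. She captures 10 turtles on her first visit to the lake, and marks their backs with paint. A week later she returns to the lake and captures 15 turtles. Five of these 15 turtles have paint on their backs, indicating that they are recaptured animals. This example is $(n, K, k) = (10, 15, 5)$. The problem is to estimate $N$; the simplest estimate is $\hat{N} = nK/k$.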

Lincoln–Petersen estimator

The Lincoln–Petersen method can be used to estimate population size if only two visits are made to the study area. This method assumes that the study population is "closed". In other words, the two visits to the study area are close enough in time so that no individuals die, are born, or move into or out of the study area between visits. The model also assumes that no marks fall off animals between visits to the field site by the researcher, and that the researcher correctly records all marks.
Given those conditions, estimated population size is:
$$\hat{N} = \frac{nK}{k}$$

Derivation

It is assumed that all individuals have the same probability of being captured in the second sample, regardless of whether they were previously captured in the first sample.
This implies that, in the second sample, the proportion of marked individuals that are caught should equal the proportion of the total population that is marked. For example, if half of the marked individuals were recaptured, it would be assumed that half of the total population was included in the second sample.
In symbols,
$$\frac{k}{K} = \frac{n}{N}.$$
A rearrangement of this gives
$$\hat{N} = \frac{nK}{k},$$
the formula used for the Lincoln–Petersen method.

Sample calculation

In the example $(n, K, k) = (10, 15, 5)$ the Lincoln–Petersen method estimates that there are $\hat{N} = (10 \times 15)/5 = 30$ turtles in the lake.
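The same calculation can be written as a short function. This is a minimal sketch, assuming n is the number marked on the first visit, K the number captured on the second visit, and k the number of recaptures; the function name `lincoln_petersen` and the guard against zero recaptures are choices made for this illustration.

```python
def lincoln_petersen(n, K, k):
    """Lincoln-Petersen estimate n*K/k of a closed population's size."""
    if k <= 0:
        # With no recaptures the ratio is undefined.
        raise ValueError("at least one recapture is needed")
    return n * K / k


# Turtle example: 10 marked, 15 captured later, 5 of them marked.
estimate = lincoln_petersen(10, 15, 5)
```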

Chapman estimator

The Lincoln–Petersen estimator is asymptotically unbiased as sample size approaches infinity, but is biased at small sample sizes. An alternative, less biased estimator of population size is given by the Chapman estimator:
$$\hat{N}_C = \frac{(n+1)(K+1)}{k+1} - 1$$

Sample calculation

The example $(n, K, k) = (10, 15, 5)$ gives
$$\hat{N}_C = \frac{(10+1)(15+1)}{5+1} - 1 \approx 28.3$$
Note that the answer provided by this equation must be truncated, not rounded. Thus, the Chapman method estimates 28 turtles in the lake.
Surprisingly, Chapman's estimate was one conjecture from a range of possible estimators: "In practice, the whole number immediately less than $(K+1)(n+1)/(k+1)$ or even $Kn/(k+1)$ will be the estimate. The above form is more convenient for mathematical purposes." Chapman also found the estimator could have considerable negative bias for small $Kn/N$, but was unconcerned because the estimated standard deviations were large for these cases.
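The Chapman calculation, including the truncation step, can be sketched as follows. The formula used is $(n+1)(K+1)/(k+1) - 1$, with n marked on the first visit, K captured on the second, and k recaptured; the function name `chapman` is an assumption made for this illustration.

```python
import math


def chapman(n, K, k):
    """Chapman estimate (n+1)(K+1)/(k+1) - 1, before truncation."""
    return (n + 1) * (K + 1) / (k + 1) - 1


raw = chapman(10, 15, 5)      # 11 * 16 / 6 - 1
estimate = math.floor(raw)    # truncate rather than round
```

Unlike the Lincoln–Petersen ratio, this expression stays finite when there are no recaptures, because the denominator is k + 1.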

Confidence interval

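An approximate $100(1-\alpha)\%$ confidence interval for the population size $N$ can be obtained as:
$$K + n - k - 0.5 + \frac{(K-k+0.5)(n-k+0.5)}{k+0.5} \exp\left(\pm z_{\alpha/2}\,\hat{\sigma}_{0.5}\right),$$
where $z_{\alpha/2}$ corresponds to the $1-\alpha/2$ quantile of a standard normal random variable, and
$$\hat{\sigma}_{0.5} = \sqrt{\frac{1}{k+0.5} + \frac{1}{K-k+0.5} + \frac{1}{n-k+0.5} + \frac{k+0.5}{(n-k+0.5)(K-k+0.5)}}.$$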
It has been shown that this confidence interval has actual coverage probabilities that are close to the nominal level even for small populations and extreme capture probabilities, in which cases other confidence intervals fail to achieve the nominal coverage levels.
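As a rough numerical illustration, the interval can be computed as below. The sketch assumes the interval has the form $K + n - k - 0.5 + \frac{(K-k+0.5)(n-k+0.5)}{k+0.5}\exp(\pm z\,\hat{\sigma})$ with $\hat{\sigma}^2 = \frac{1}{k+0.5} + \frac{1}{K-k+0.5} + \frac{1}{n-k+0.5} + \frac{k+0.5}{(n-k+0.5)(K-k+0.5)}$; the function name `approx_ci` and its `alpha` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def approx_ci(n, K, k, alpha=0.05):
    """Approximate (lower, upper) confidence limits for population size."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile
    a = k + 0.5          # recaptured
    b = K - k + 0.5      # caught only on the second visit
    c = n - k + 0.5      # caught only on the first visit
    sigma = math.sqrt(1 / a + 1 / b + 1 / c + a / (c * b))
    centre = K + n - k - 0.5
    scale = b * c / a
    lower = centre + scale * math.exp(-z * sigma)
    upper = centre + scale * math.exp(z * sigma)
    return lower, upper


# Turtle example with n = 10, K = 15, k = 5.
lower, upper = approx_ci(10, 15, 5)
```

For the turtle example this gives a 95% interval of roughly 22 to 65, noticeably asymmetric around the point estimate of 30.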

Bayesian estimate

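The mean value ± standard deviation is
$$N \approx \mu \pm \sqrt{\mu \epsilon}$$
where
$$\mu = \frac{(n-1)(K-1)}{k-2}, \qquad \epsilon = \frac{(n-k+1)(K-k+1)}{(k-2)(k-3)}.$$
The example $(n, K, k) = (10, 15, 5)$ gives the estimate $N \approx 42 \pm 21.5$.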
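The Bayesian summary can be checked numerically with a short sketch. It assumes the mean is $\mu = (n-1)(K-1)/(k-2)$ and the standard deviation is $\sqrt{\mu\epsilon}$ with $\epsilon = (n-k+1)(K-k+1)/((k-2)(k-3))$; the function name `bayesian_estimate` and the guard on k are choices made for this illustration.

```python
import math


def bayesian_estimate(n, K, k):
    """Return (mean, standard deviation) of the population size estimate."""
    if k <= 3:
        # The denominators (k - 2) and (k - 3) must both be positive.
        raise ValueError("more than three recaptures are needed")
    mu = (n - 1) * (K - 1) / (k - 2)
    eps = (n - k + 1) * (K - k + 1) / ((k - 2) * (k - 3))
    return mu, math.sqrt(mu * eps)


# Turtle example with n = 10, K = 15, k = 5.
mean, sd = bayesian_estimate(10, 15, 5)
```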

Capture probability

The capture probability refers to the probability of detecting an individual animal or person of interest, and has been used in both ecology and epidemiology for detecting animal or human diseases, respectively.
The capture probability is often defined as a two-variable model, in which f is defined as the fraction of a finite resource devoted to detecting the animal or person of interest from a high-risk sector of an animal or human population, and q is the frequency with which the problem occurs in the high-risk versus the low-risk sector. For example, an application of the model in the 1920s was to detect typhoid carriers in London, who were either arriving from zones with high rates of tuberculosis (with probability q), or low rates (with probability 1 − q). It was posited that only 5 out of every 100 of the travelers could be detected, and 10 out of every 100 were from the high-risk area. Then the capture probability P was defined as:
$$P = \frac{5}{10} f q + \frac{5}{90} (1 - f)(1 - q),$$
where the first term refers to the probability of detection in a high-risk zone, and the latter term refers to the probability of detection in a low-risk zone. Importantly, the formula can be rewritten as a linear equation in terms of f:
$$P = \left(\frac{5}{10} q - \frac{5}{90}(1 - q)\right) f + \frac{5}{90}(1 - q).$$
Because this is a linear function, it follows that for values of q for which the slope of this line is positive, all of the detection resource should be devoted to the high-risk population (f should be set to 1), whereas for other values of q, for which the slope of the line is negative, all of the detection resource should be devoted to the low-risk population (f should be set to 0). We can solve for the values of q for which the slope is positive, and hence for which f should be set to 1 to maximize the capture probability:
$$\frac{5}{10} q - \frac{5}{90}(1 - q) > 0,$$
which simplifies to:
$$q > \frac{1}{10}.$$
This is an example of linear optimization. In more complex cases, where more than one resource f is devoted to more than two areas, multivariate optimization is often used, through the simplex algorithm or its derivatives.
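The allocation rule can be illustrated with a small sketch. It assumes the detection coefficients 5/10 for the high-risk sector and 5/90 for the low-risk sector, which is one reading of the London example (5 detections available per 100 travelers, 10 of whom are high risk); the function names and default parameters are hypothetical.

```python
def capture_probability(f, q, high=5 / 10, low=5 / 90):
    """P = high*f*q + low*(1-f)*(1-q) for resource share f and frequency q."""
    return high * f * q + low * (1 - f) * (1 - q)


def best_allocation(q, high=5 / 10, low=5 / 90):
    """Return the f in {0, 1} that maximises P, from the sign of the slope."""
    slope = high * q - low * (1 - q)
    return 1.0 if slope > 0 else 0.0


# Value of q at which the slope changes sign.
threshold = (5 / 90) / (5 / 10 + 5 / 90)

p_all_high = capture_probability(1.0, q=0.2)
p_all_low = capture_probability(0.0, q=0.2)
```

Because P is linear in f, the maximum always sits at one end of the interval [0, 1], so only the sign of the slope matters. With q = 0.2, above the threshold, devoting everything to the high-risk sector gives the larger capture probability.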

More than two visits

The literature on the analysis of capture-recapture studies has blossomed since the early 1990s, and very elaborate statistical models are now available for the analysis of these experiments. A simple model which easily accommodates the three-source, or three-visit, study is a Poisson regression model. Sophisticated mark-recapture models can be fit with several packages for the open-source R programming language. These include "Spatially Explicit Capture-Recapture" (secr), "Loglinear Models for Capture-Recapture Experiments" (Rcapture), and "Mark-Recapture Distance Sampling" (mrds). Such models can also be fit with specialized programs such as MARK or M-SURGE.
Other related methods which are often used include the Jolly–Seber model and Schnabel estimators. These are described in detail by Sutherland.
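As a rough illustration of a multi-visit calculation, the sketch below uses one commonly cited form of the Schnabel estimator, $\hat{N} = \sum_t C_t M_t / \sum_t R_t$, where $C_t$ is the number caught on visit t, $M_t$ the number of marked animals at large just before visit t, and $R_t$ the number of recaptures on visit t. This formula is not given in the text above and is stated here as an assumption; the function name and the three-visit data are hypothetical.

```python
def schnabel(caught, marked_before, recaptured):
    """Schnabel-type estimate sum(C_t * M_t) / sum(R_t) over all visits."""
    total_recaptures = sum(recaptured)
    if total_recaptures == 0:
        raise ValueError("at least one recapture is needed")
    weighted = sum(c * m for c, m in zip(caught, marked_before))
    return weighted / total_recaptures


# With two visits the estimate reduces to the Lincoln-Petersen ratio.
two_visit = schnabel(caught=[10, 15], marked_before=[0, 10], recaptured=[0, 5])

# Hypothetical third visit: 12 caught, 6 of them already marked.
# Marked animals at large before visit 3: 10 + (15 - 5) = 20.
three_visit = schnabel(
    caught=[10, 15, 12], marked_before=[0, 10, 20], recaptured=[0, 5, 6]
)
```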

Integrated approaches

Modelling mark-recapture data is trending towards a more integrative approach, which combines mark-recapture data with population dynamics models and other types of data. The integrated approach is more computationally demanding, but extracts more information from the data, improving parameter and uncertainty estimates.
