Functional data analysis


Functional data analysis (FDA) is a branch of statistics that analyzes data providing information about curves, surfaces, or anything else varying over a continuum. In its most general form, each sample element in an FDA framework is considered to be a function. The physical continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, or another continuous quantity.

Level of error

The data may be so accurate that error can be ignored, may be subject to substantial measurement error, or may even have a complex, indirect relationship to the curve that they define. For example, measurements of the heights of children over a wide range of ages have an error level so small as to be ignorable for many purposes, but daily records of precipitation at a weather station are so variable as to require careful and sophisticated analyses in order to extract something like a mean precipitation curve.
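
As an illustration of the noisy case, the following minimal sketch recovers a smooth curve from heavily perturbed observations using a Gaussian kernel (Nadaraya-Watson) smoother. The synthetic sine-wave data, the noise level, the bandwidth of 0.05, and the function name kernel_smooth are all assumptions made for this example; they do not come from any particular dataset or FDA library.

```python
import math
import random


def kernel_smooth(t_obs, y_obs, t_eval, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel at each point of t_eval."""
    fitted = []
    for t0 in t_eval:
        # Observations close to t0 receive larger weights.
        weights = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in t_obs]
        total = sum(weights)
        fitted.append(sum(w * y for w, y in zip(weights, y_obs)) / total)
    return fitted


def mse(a, b):
    """Mean squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
n = 200
t_obs = [i / (n - 1) for i in range(n)]

# Hypothetical underlying curve plus substantial measurement error.
truth = [math.sin(2 * math.pi * t) for t in t_obs]
y_obs = [f + random.gauss(0.0, 0.5) for f in truth]

smooth = kernel_smooth(t_obs, y_obs, t_obs, bandwidth=0.05)

mse_raw = mse(y_obs, truth)      # error of the raw observations
mse_smooth = mse(smooth, truth)  # error after smoothing
```

On this synthetic data the smoothed curve should sit much closer to the underlying function than the raw observations do. In practice the bandwidth would be chosen by a data-driven method such as cross-validation rather than fixed by hand.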

Use of derivatives

However these curves are estimated, it is the assumption that they are intrinsically smooth that often defines a functional data analysis. In particular, FDA often makes use of the information in the slopes and curvatures of curves, as reflected in their derivatives. Plots of first and second derivatives as functions of the argument t (often time), or plots of second derivative values as functions of first derivative values, may reveal important aspects of the processes generating the data. As a consequence, curve estimation methods designed to yield good derivative estimates can play a critical role in functional data analysis.
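
One way to obtain derivative estimates is to represent the curve in a basis of smooth functions and differentiate the fitted expansion analytically. The sketch below does this with a small Fourier basis fitted by least squares, then assembles the pairs of first and second derivative values that a phase-plane plot would display. It assumes periodic synthetic data and uses NumPy; the helper fourier_design, the number of harmonics, and the noise level are hypothetical choices for illustration, and a smoothing spline could be used instead.

```python
import numpy as np


def fourier_design(t, n_harmonics, period, deriv=0):
    """Design matrix of a Fourier basis, each column differentiated `deriv` times."""
    t = np.asarray(t, dtype=float)
    # The constant term vanishes under differentiation.
    cols = [np.ones_like(t) if deriv == 0 else np.zeros_like(t)]
    shift = deriv * np.pi / 2.0
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        # d^m/dt^m sin(w t) = w^m sin(w t + m*pi/2), and likewise for cos.
        cols.append(w ** deriv * np.sin(w * t + shift))
        cols.append(w ** deriv * np.cos(w * t + shift))
    return np.column_stack(cols)


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100, endpoint=False)

# Hypothetical periodic curve observed with small measurement error.
y = np.sin(2.0 * np.pi * t) + rng.normal(0.0, 0.02, size=t.size)

# Least-squares fit of the basis coefficients.
coef, *_ = np.linalg.lstsq(fourier_design(t, 3, 1.0), y, rcond=None)

# Derivatives come from differentiating the basis, not from differencing the data.
velocity = fourier_design(t, 3, 1.0, deriv=1) @ coef
acceleration = fourier_design(t, 3, 1.0, deriv=2) @ coef

# Rows are (first derivative, second derivative) pairs for a phase-plane plot.
phase_plane = np.column_stack([velocity, acceleration])
```

Differentiating the fitted basis expansion avoids the noise amplification that comes from taking finite differences of the raw observations, although higher derivatives are still estimated less precisely than lower ones.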

Contrast with other methods

The extensive use of kernel smoothing and smoothing splines to enforce smoothness assumptions is why functional data analysis is, at its core, a nonparametric statistical technique. Nevertheless, models for functional data and methods for their analysis may resemble those for conventional multivariate data, including linear and nonlinear regression models and principal components analysis, among others. This is because functional data can be thought of as multivariate data whose dimensions are ordered. The possibility of using derivative information greatly extends the power of these methods, and also leads to purely functional models such as those defined by differential equations, often called dynamical systems.
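
The relationship to multivariate data can be sketched by sampling each curve on a common grid and applying ordinary principal components analysis to the resulting matrix. The example below uses synthetic curves and NumPy; the simulated scores, noise level, and roughness measure are assumptions for illustration only. It also shuffles the grid points to show that the multivariate analysis ignores their order, while the smoothness of the leading component depends on it.

```python
import numpy as np

rng = np.random.default_rng(1)
n_curves, n_grid = 50, 100
t = np.linspace(0.0, 1.0, n_grid, endpoint=False)

# Hypothetical sample: each curve mixes two smooth modes of variation plus noise.
scores_a = rng.normal(0.0, 2.0, size=(n_curves, 1))
scores_b = rng.normal(0.0, 0.5, size=(n_curves, 1))
curves = (
    scores_a * np.sin(2.0 * np.pi * t)
    + scores_b * np.cos(2.0 * np.pi * t)
    + rng.normal(0.0, 0.1, size=(n_curves, n_grid))
)


def pca(data):
    """Return variance proportions and components of the column-centered data."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return s, s ** 2 / np.sum(s ** 2), vt


def roughness(values):
    """Sum of squared second differences, a crude measure of wiggliness."""
    return float(np.sum(np.diff(values, n=2) ** 2))


# PCA on the curves sampled in their natural order.
singular_values, explained, components = pca(curves)
first_component = components[0]

# The same PCA after shuffling the grid points (the "dimensions").
perm = rng.permutation(n_grid)
singular_values_perm, _, components_perm = pca(curves[:, perm])

rough_ordered = roughness(first_component)
rough_shuffled = roughness(components_perm[0])
```

The singular values are unchanged by the shuffle, because multivariate PCA treats the grid points as exchangeable variables. The leading component is smooth only in the original ordering, which is the structure that derivative-based functional methods rely on. With an equally spaced grid the quadrature weights are constant, so no reweighting is applied here; an unequally spaced grid would need them.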