### principal component analysis

**(PCA)**
(statistical strategy of devising uncorrelated variables)

**Principal component analysis** (**PCA**) is a statistical
procedure for exploring multidimensional data to find what is
driving its variation. The strategy is to devise a list of new
variables, each a linear combination of the data coordinates,
that are mutually uncorrelated (according to the data), with the
first variable in the list showing the maximum possible variance
(termed the first **principal component**), and each following
variable showing the maximum possible remaining variance. The
result is a set of linear transforms that map the data into these
variables. Each transform can be used on its own, so low-variance
components can be dropped to ignore some of the sources of
variation, and the coefficients of each transform show relations
between the variables in the source data. One aim is to find a
source of variation that is non-random.
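
The procedure described above can be sketched in a few lines of
Python with NumPy. This is a minimal illustration, not a reference
implementation: the function name `pca`, the synthetic two-column
data set, and the choice to diagonalize the covariance matrix
(rather than, say, use a singular value decomposition) are all
assumptions made for the example.

```python
import numpy as np

def pca(data):
    """Sketch of PCA for an (n_samples, n_features) array.

    Returns (components, variances, scores): one component per row,
    sorted from largest to smallest variance.
    """
    # Center each coordinate so variances are measured about the mean.
    centered = data - data.mean(axis=0)
    # Covariance matrix of the coordinates (features are columns).
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    variances = eigvals[order]
    components = eigvecs[:, order].T
    # Each score column is one of the new, mutually uncorrelated variables.
    scores = centered @ components.T
    return components, variances, scores

# Hypothetical data: y is nearly a linear function of x, plus small noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + 0.1 * rng.normal(size=500)
data = np.column_stack([x, y])

components, variances, scores = pca(data)
print("first component:", components[0])
print("fraction of variance:", variances / variances.sum())
```

In this made-up data set nearly all of the variance falls along the
first component, which points along the direction (1, 2); the second
component carries only the small added noise and could be ignored.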

The technique can be used in various areas of astrophysics,
including demographics, galaxy morphology
(classification of shapes), and analysis of observational data
such as light curves. In some cases, it can be used to
automate some kinds of classification, such as identifying
a particular kind of object or event.

An obvious and undoubtedly common additional strategy is to add
variables to the data under analysis that are non-linear
transformations of the original data's coordinates, e.g., their
log, exponential, square, etc. There can be some well-established
motivation regarding which to try.
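
As a rough illustration of adding transformed variables, the sketch
below appends log columns to a synthetic data set in which one
quantity follows a power law of the other. Everything here is an
assumption made for the example: the names `mass` and `lum`, the
cubic relation between them, the helper functions, and the choice to
standardize each column before comparing.

```python
import numpy as np

def add_log_columns(data):
    """Append log10 of every (positive) column as additional variables."""
    return np.hstack([data, np.log10(data)])

def leading_fraction(data):
    """Fraction of total variance carried by the first principal component,
    computed on standardized columns so differing units do not dominate."""
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    eigvals = np.linalg.eigvalsh(np.cov(standardized, rowvar=False))
    return eigvals.max() / eigvals.sum()

# Hypothetical power-law data: lum ~ mass**3 with small multiplicative scatter.
rng = np.random.default_rng(1)
mass = rng.uniform(1.0, 10.0, size=400)
lum = mass**3 * np.exp(0.05 * rng.normal(size=400))

augmented = add_log_columns(np.column_stack([mass, lum]))

frac_raw = leading_fraction(augmented[:, :2])  # original coordinates
frac_log = leading_fraction(augmented[:, 2:])  # log-transformed coordinates
print("raw:", frac_raw, "log:", frac_log)
```

Because a power law is linear in log space, the log columns are
described by a single component more completely than the raw
columns are, which is the kind of motivation that can suggest
trying a log transform.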

(*statistics*)
**Further reading:**

http://en.wikipedia.org/wiki/Principal_component_analysis

**Referenced by pages:**

PCA analysis

telluric line

Index