Kernel trick

Jump to navigation Jump to search

In machine learning, the kernel trick is a method for using a linear classifier algorithm to solve a non-linear problem by mapping the original non-linear observations into a higher-dimensional space, where the linear classifier is subsequently used; this makes a linear classification in the new space equivalent to non-linear classification in the original space.

This is done using Mercer's theorem, which states that any continuous, symmetric, positive semi-definite kernel function K(x, y) can be expressed as a dot product in a high-dimensional space.

More specifically, if the arguments to the kernel are in a measurable space X, and if the kernel is positive semi-definite — i.e.

<math>\sum_{i,j} K(x_i,x_j) c_i c_j \ge 0</math>

for any finite subset {x1, ..., xn} of X and subset {c1, ..., cn} of objects (typically real numbers) — then there exists a function φ(x) whose range is in an inner product space of possibly high dimension, such that

<math>K(x,y) = \varphi(x)\cdot\varphi(y).</math>

The kernel trick transforms any algorithm that solely depends on the dot product between two vectors. Wherever a dot product is used, it is replaced with the kernel function. Thus, a linear algorithm can easily be transformed into a non-linear algorithm. This non-linear algorithm is equivalent to the linear algorithm operating in the range space of φ. However, because kernels are used, the φ function is never explicitly computed. This is desirable, because the high-dimensional space may be infinite-dimensional (as is the case when the kernel is a Gaussian).

The kernel trick was first published by Aizerman et al.[1]

It has been applied to several kinds of algorithm in machine learning and statistics, including:

The origin of the term kernel trick is not known.

References

  1. M. Aizerman, E. Braverman, and L. Rozonoer (1964). "Theoretical foundations of the potential function method in pattern recognition learning". Automation and Remote Control. 25: 821–837.

See also

External links

de:Kernel (Maschinelles Lernen)

Template:WikiDoc Sources