Curse of dimensionality

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming. Cursed phenomena occur in domains such as numerical analysis, sampling, combinatorics, machine learning, data mining and databases. The common theme of these problems is that when the dimensionality increases, the volume of the space increases so fast that the available data become sparse. This sparsity is problematic for any method that requires statistical significance. In order to obtain a statistically sound and reliable result, the amount of data needed to support the result often grows exponentially with the dimensionality. Also, organizing and searching data often relies on detecting areas where objects form groups with similar properties; in high-dimensional data, however, all objects appear to be sparse and dissimilar in many ways, which prevents common data organization strategies from being efficient.

In some problems, each variable can take one of several discrete values, or the range of possible values is divided to give a finite number of possibilities. Taking the variables together, a huge number of combinations of values must be considered. This effect is also known as the combinatorial explosion. Even in the simplest case of d binary variables, the number of possible combinations is already O(2^d), exponential in the dimensionality. Naively, each additional dimension doubles the effort needed to try all combinations.

There is an exponential increase in volume associated with adding extra dimensions to a mathematical space. For example, 10^2 = 100 evenly spaced sample points suffice to sample a unit interval (a '1-dimensional cube') with no more than 10^-2 = 0.01 distance between points; an equivalent sampling of a 10-dimensional unit hypercube with a lattice that has a spacing of 10^-2 = 0.01 between adjacent points would require 10^20 sample points. In general, with a spacing distance of 10^-n, the 10-dimensional hypercube appears to be a factor of 10^(n(10-1)) 'larger' than the 1-dimensional hypercube, which is the unit interval. In the above example, n = 2: when using a sampling distance of 0.01 the 10-dimensional hypercube appears to be 10^18 'larger' than the unit interval. This effect is a combination of the combinatorics problem above and the distance-function problem explained below.

When solving dynamic optimization problems by numerical backward induction, the objective function must be computed for each combination of values. This is a significant obstacle when the dimension of the 'state variable' is large.
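A minimal Python sketch of both effects above (the helper name lattice_points, the 0.01 spacing and the chosen dimensions are illustrative assumptions, not taken from the text):

    import itertools

    def lattice_points(dimension, spacing=0.01):
        """Number of points in a regular grid that samples the unit hypercube
        [0, 1]^dimension with the given spacing along each axis."""
        points_per_axis = int(round(1 / spacing))   # 100 points per axis for spacing 0.01
        return points_per_axis ** dimension         # grows exponentially in the dimension

    print(lattice_points(1))     # 100 points for the unit interval
    print(lattice_points(10))    # 100**10 = 10**20 points for the same spacing

    # Combinatorial explosion: d binary variables have 2**d joint combinations,
    # so each extra dimension doubles the number of cases to try.
    print(list(itertools.product((0, 1), repeat=3)))   # the 2**3 = 8 combinations
    for d in (10, 20, 30):
        print(d, 2 ** d)

The doubling per added binary variable is exactly the 'each additional dimension doubles the effort' statement above, and the 10-dimensional lattice count reproduces the 10^20 figure.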
In machine learning problems that involve learning a 'state-of-nature' from a finite number of data samples in a high-dimensional feature space, with each feature having a range of possible values, an enormous amount of training data is typically required to ensure that there are several samples with each combination of values. A typical rule of thumb is that there should be at least 5 training examples for each dimension in the representation. With a fixed number of training samples, the predictive power of a classifier or regressor first increases as the number of dimensions or features used is increased, but then decreases; this is known as the Hughes phenomenon or peaking phenomenon.

When a measure such as Euclidean distance is defined using many coordinates, there is little difference in the distances between different pairs of samples. One way to illustrate the 'vastness' of high-dimensional Euclidean space is to compare the proportion of an inscribed hypersphere with radius r and dimension d to that of a hypercube with edges of length 2r. The volume of such a sphere is 2 r^d π^(d/2) / (d Γ(d/2)), where Γ is the gamma function, while the volume of the cube is (2r)^d. As the dimension d of the space increases, the hypersphere becomes an insignificant volume relative to that of the hypercube. This can clearly be seen by comparing the proportions as the dimension d goes to infinity:

    π^(d/2) / (d 2^(d-1) Γ(d/2)) → 0 as d → ∞.

Furthermore, the distance between the center and the corners is r√d, which increases without bound for fixed r. In this sense, nearly all of the high-dimensional space is 'far away' from the center. To put it another way, the high-dimensional unit hypercube can be said to consist almost entirely of the 'corners' of the hypercube, with almost no 'middle'.
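A short numerical check of the sphere-to-cube proportion and the center-to-corner distance, using only the formulas quoted above (the function name sphere_to_cube_ratio and the sampled dimensions are illustrative assumptions):

    import math

    def sphere_to_cube_ratio(d, r=1.0):
        """Volume of the d-dimensional ball of radius r divided by the volume
        of the enclosing hypercube with edge length 2r."""
        sphere = 2 * r**d * math.pi**(d / 2) / (d * math.gamma(d / 2))
        cube = (2 * r) ** d
        return sphere / cube   # equals pi^(d/2) / (d * 2^(d-1) * Gamma(d/2))

    for d in (2, 3, 10, 100):
        # The ratio collapses toward 0 while the corner distance r*sqrt(d) keeps growing.
        print(d, sphere_to_cube_ratio(d), math.sqrt(d))

For d = 2 and d = 3 the ratio is the familiar π/4 ≈ 0.785 and π/6 ≈ 0.524; by d = 100 it is on the order of 10^-70, while the center-to-corner distance has grown to 10r.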

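The earlier claim that pairwise distances become nearly indistinguishable in many dimensions can also be checked empirically. The sketch below (the helper distance_spread, the uniform sampling and the point count are illustrative assumptions) measures the relative spread (max − min) / min of pairwise Euclidean distances between random points in the unit hypercube:

    import math
    import random

    def distance_spread(dimension, n_points=100, seed=0):
        """Relative spread (max - min) / min of the pairwise Euclidean distances
        between n_points uniform random points in the unit hypercube."""
        rng = random.Random(seed)
        points = [[rng.random() for _ in range(dimension)] for _ in range(n_points)]
        distances = [
            math.dist(points[i], points[j])
            for i in range(n_points)
            for j in range(i + 1, n_points)
        ]
        return (max(distances) - min(distances)) / min(distances)

    for d in (2, 10, 100, 1000):
        print(d, round(distance_spread(d), 3))   # the spread shrinks as d grows

In low dimensions the nearest and farthest pairs differ by a large factor; as the dimension grows the spread shrinks markedly, which is why distance-based grouping loses its discriminative power.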
[ "Algorithm", "Machine learning", "Artificial intelligence", "Pattern recognition", "Statistics", "tensor train", "Intrinsic dimension", "dimensionality estimation", "Maximally informative dimensions" ]