Correlation

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data. In the broadest sense, correlation is any statistical association, though it commonly refers to the degree to which a pair of variables are linearly related. Familiar examples of dependent phenomena include the correlation between the physical statures of parents and their offspring, and the correlation between the demand for a limited-supply product and its price.

Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling. However, in general, the presence of a correlation is not sufficient to infer the presence of a causal relationship (i.e., correlation does not imply causation).

Formally, random variables are dependent if they do not satisfy a mathematical property of probabilistic independence. In informal parlance, correlation is synonymous with dependence. However, when used in a technical sense, correlation refers to any of several specific types of relationship between mean values. There are several correlation coefficients, often denoted ρ or r, measuring the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may be present even when one variable is a nonlinear function of the other). Other correlation coefficients have been developed to be more robust than the Pearson correlation – that is, more sensitive to nonlinear relationships. Mutual information can also be applied to measure dependence between two variables.

The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient, or "Pearson's correlation coefficient", commonly called simply "the correlation coefficient". It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton.

The population correlation coefficient ρ_{X,Y} between two random variables X and Y with expected values μ_X and μ_Y and standard deviations σ_X and σ_Y is defined as

    \rho_{X,Y} = \operatorname{corr}(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\operatorname{E}[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}

where E is the expected value operator, cov means covariance, and corr is a widely used alternative notation for the correlation coefficient. The Pearson correlation is defined only if both standard deviations are finite and positive. An alternative formula purely in terms of moments is

    \rho_{X,Y} = \frac{\operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y)}{\sqrt{\operatorname{E}(X^2) - \operatorname{E}(X)^2}\,\sqrt{\operatorname{E}(Y^2) - \operatorname{E}(Y)^2}}

The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X), which follows from the commutative property of multiplication. It is a corollary of the Cauchy–Schwarz inequality that the absolute value of the Pearson correlation coefficient is no greater than 1. The correlation coefficient is +1 in the case of a perfect direct (increasing) linear relationship, −1 in the case of a perfect decreasing (inverse) linear relationship (anticorrelation), and some value in the open interval (−1, 1) in all other cases, indicating the degree of linear dependence between the variables. The closer the coefficient is to zero, the weaker the relationship; the closer it is to either −1 or +1, the stronger the correlation between the variables.

If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true, because the coefficient detects only linear dependencies between two variables:

    \begin{aligned}
    X, Y \text{ independent} \quad &\Rightarrow \quad \rho_{X,Y} = 0 \quad (X, Y \text{ uncorrelated}) \\
    \rho_{X,Y} = 0 \quad (X, Y \text{ uncorrelated}) \quad &\nRightarrow \quad X, Y \text{ independent}
    \end{aligned}
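As a concrete illustration of the two equivalent formulas above, the following is a minimal sketch in Python, assuming NumPy is available (the data and variable names are purely illustrative). It estimates the coefficient from the covariance divided by the product of standard deviations, from the raw-moments form, and via numpy.corrcoef; all three agree.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two linearly related variables with some noise.
    x = rng.normal(size=1000)
    y = 2.0 * x + rng.normal(scale=0.5, size=1000)

    # Definition: covariance divided by the product of standard deviations.
    # ddof=0 keeps the covariance and the standard deviations on the same convention.
    r_cov = np.cov(x, y, ddof=0)[0, 1] / (x.std() * y.std())

    # Equivalent raw-moments form: (E[XY] - E[X]E[Y]) / (sqrt(Var X) * sqrt(Var Y)).
    r_mom = (np.mean(x * y) - x.mean() * y.mean()) / np.sqrt(
        (np.mean(x**2) - x.mean() ** 2) * (np.mean(y**2) - y.mean() ** 2)
    )

    print(r_cov, r_mom, np.corrcoef(x, y)[0, 1])  # all three match, close to +1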

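To see why the converse fails, consider the standard textbook construction: if X is symmetric about zero and Y = X², then Y is completely determined by X, yet the Pearson coefficient is zero because the dependence is not linear. A short sketch of this, again assuming NumPy:

    import numpy as np

    rng = np.random.default_rng(1)

    # X is symmetric about zero; Y is a deterministic, nonlinear function of X.
    x = rng.normal(size=100000)
    y = x**2

    # cov(X, Y) = E[X^3] - E[X] * E[X^2] = 0 for a symmetric X, so the Pearson
    # coefficient is (approximately) zero despite complete dependence.
    print(np.corrcoef(x, y)[0, 1])  # approximately 0

This is exactly the one-way implication stated above: independence forces ρ_{X,Y} = 0, but ρ_{X,Y} = 0 does not force independence.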