X.—CORRELATION : ILLUSTRATIONS AND METHODS. 205
In such cases it would always tend to give a value for r markedly
higher than that given by the product-sum method. The
product-sum method gives in fact a value based on the standard-
deviation round the line of regression; the method used above
gives a value dependent on the standard-deviation round a line
which sweeps through all the means of arrays, and the second
standard-deviation is necessarily less than the first. We reach,
therefore, a generalised coefficient which measures the approach
towards a curvilinear line of regression of any form.
Let $s_{ax}$ denote the standard-deviation of any array of X's, and
let $n$, as before, be the number of observations in this array (Chap.
IX., § 11), and further let
$$\sigma_{ax}^2 = \frac{\sum (n\, s_{ax}^2)}{N} \qquad (2)$$
Then $\sigma_{ax}$ is an average of the standard-deviations of the arrays
obtained as suggested at the end of the last section. Now let
$$\sigma_{ax}^2 = \sigma_x^2\,(1 - \eta_{xy}^2)$$
or
$$\eta_{xy}^2 = 1 - \frac{\sigma_{ax}^2}{\sigma_x^2} \qquad (3)$$
Then $\eta_{xy}$ is termed by Professor Pearson a correlation-ratio (ref.
18). As there are clearly two correlation-ratios for any one table,
it should be distinguished as the correlation-ratio of X on Y: it
measures the approach of values of X associated with given
values of Y to a single-valued relationship of any form. The
calculation would be exceedingly laborious if we had actually to
evaluate $\sigma_{ax}$, but this may be avoided and the work greatly
simplified by the following consideration. If $M_x$ denote the mean
of all X's, $m_x$ the mean of an array, then we have by the general
relation given in § 11 of Chap. VIII (p. 142)
$$N\sigma_x^2 = \sum n\,(s_{ax}^2 + [M_x - m_x]^2).$$
Or, using $\sigma_{mx}$ to denote the standard-deviation of $m_x$,
$$\sigma_x^2 = \sigma_{ax}^2 + \sigma_{mx}^2 \qquad (4)$$
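The passage from the general relation to (4) may be written out in full. The lines below only restate the definitions already given, on the reading that the standard-deviation of the array means is weighted by the array frequencies $n$; that weighting is an interpretation, though it is the one the relation itself requires. Dividing the general relation through by $N$:
$$\sigma_x^2 = \frac{1}{N}\sum n\, s_{ax}^2 + \frac{1}{N}\sum n\,(M_x - m_x)^2 = \sigma_{ax}^2 + \sigma_{mx}^2, \qquad \text{where } \sigma_{mx}^2 = \frac{1}{N}\sum n\,(M_x - m_x)^2 .$$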
Hence, substituting in (3)
$$\eta_{xy}^2 = \frac{\sigma_{mx}^2}{\sigma_x^2} \qquad (5)$$
The correlation-ratio of X on Y is therefore determined when we
have found, in addition to the standard-deviation of X, the
standard-deviation of the means of its arrays.
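The arithmetic may be illustrated by a short sketch in Python. It is offered only as an illustration under stated assumptions: the function name `correlation_ratio`, the dictionary layout (each value of Y mapped to the X's of its array) and the small table of figures are all hypothetical and are not taken from the text. The sketch computes the squared correlation-ratio from the means of arrays, as in (5), and also returns the quantities entering (3) and (4) so that the two routes may be compared.

```python
import math

def correlation_ratio(arrays):
    """Squared correlation-ratio of X on Y from grouped data.

    `arrays` maps each value of Y to the list of X's observed with it
    (a hypothetical layout chosen for this illustration).
    """
    all_x = [x for xs in arrays.values() for x in xs]
    all_y = [y for y, xs in arrays.items() for _ in xs]
    N = len(all_x)
    M_x = sum(all_x) / N                      # mean of all X's
    M_y = sum(all_y) / N
    var_x = sum((x - M_x) ** 2 for x in all_x) / N
    var_y = sum((y - M_y) ** 2 for y in all_y) / N

    var_ax = 0.0   # weighted mean of the squared array standard-deviations
    var_mx = 0.0   # weighted variance of the array means about M_x
    for xs in arrays.values():
        n = len(xs)
        m_x = sum(xs) / n                     # mean of this array
        s2_ax = sum((x - m_x) ** 2 for x in xs) / n
        var_ax += n * s2_ax / N
        var_mx += n * (M_x - m_x) ** 2 / N

    # Product-sum correlation-coefficient, for comparison.
    cov = sum((x - M_x) * (y - M_y) for x, y in zip(all_x, all_y)) / N
    r = cov / math.sqrt(var_x * var_y)

    return {"var_x": var_x, "var_ax": var_ax, "var_mx": var_mx,
            "eta2": var_mx / var_x, "r": r}

# Made-up table: the array means rise and then fall, so the regression is curved.
arrays = {1: [1, 3], 2: [5, 7], 3: [1, 3]}
result = correlation_ratio(arrays)
print(result)
```

In this made-up table the means of arrays are symmetrical about the middle value of Y, so that the product-sum coefficient vanishes while the correlation-ratio does not; the figures are chosen to show how widely the two may differ when the regression is far from linear.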
21. The correlation-ratio of X on Y cannot be less than the
correlation-coefficient for X and Y, and $\eta_{xy}^2 - r^2$ is a measure of
the divergence of the regression of X on Y from linearity. For
or