X.—CORRELATION : ILLUSTRATIONS AND METHODS. 203
found in various ways, for the most part dependent either (1)
on the formulae for the two regressions $r\,\sigma_x/\sigma_y$ and $r\,\sigma_y/\sigma_x$, or (2) on
the formulae for the standard deviations of the arrays $\sigma_x\sqrt{1-r^2}$
and $\sigma_y\sqrt{1-r^2}$. Such approximate methods are not recommended
for ordinary use, as they will lead to different results in different
hands, but a few may be given here, as being occasionally useful
for estimating the value of the correlation in cases where the
data are not given in such a shape as to permit of the proper
calculation of the coefficient.
(1) The means of rows and columns are plotted on a diagram,
and lines fitted to the points by eye, say by shifting about
a stretched black thread until it seems to run as near as may
be to all the points. If $b_1$, $b_2$ be the slopes of these two lines
to the vertical and the horizontal respectively,
$$r = \sqrt{b_1 b_2}.$$
Hence the value of $r$ may be estimated from any such diagram
as figs. 36-40 in Chapter IX., in the absence of the original
table. Further, if a correlation-table be not grouped by
equal intervals, it may be difficult to calculate the product
sum, but it may still be possible to plot approximately a diagram
of the two lines of regression, and so determine roughly the
value of $r$. Similarly, if only the means of two rows and
two columns, or of one row and one column in addition to the
means of the two variables, are known, it will still be possible
to estimate the slopes of RR and CC, and hence the correlation
coefficient.
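The relation in (1) follows because the two regression slopes are $r\,\sigma_x/\sigma_y$ and $r\,\sigma_y/\sigma_x$, whose product is $r^2$. The arithmetic may be sketched in Python as below, assuming the two slopes have already been read off a diagram. The function name and the sample slopes are hypothetical and are not taken from the text; the sign of the estimate is taken from the common sign of the two slopes.

```python
import math


def r_from_two_slopes(b1: float, b2: float) -> float:
    """Rough estimate of r from the slopes of the two fitted lines.

    b1: slope of the line through the row means, measured to the vertical.
    b2: slope of the line through the column means, measured to the horizontal.
    Both names are illustrative assumptions.
    """
    if b1 * b2 < 0:
        # The two regressions always share the sign of r, so slopes of
        # opposite sign indicate a misreading of the diagram.
        raise ValueError("the two slopes must have the same sign")
    magnitude = math.sqrt(b1 * b2)
    # Attach the common sign of the slopes to the geometric mean.
    return math.copysign(magnitude, b1)


# Hypothetical slopes read by eye from a diagram.
estimate = r_from_two_slopes(0.4, 0.9)
print(round(estimate, 2))
```

Since the lines are fitted by eye, the result is only as good as the reading of the slopes, and two readers may well obtain somewhat different values.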
(2) The means of one set of arrays only, say the rows, are
calculated, and also the two standard-deviations $\sigma_x$ and $\sigma_y$. The
means are then plotted on a diagram, using the standard-deviation
of each variable as the unit of measurement, and a line fitted by
eye. The slope of this line to the vertical is $r$. If the standard
deviations be not used as the units of measurement in plotting,
the slope of the line to the vertical is $r\,\sigma_x/\sigma_y$, and hence $r$ will be
obtained by dividing the slope by the ratio of the standard-deviations.
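The calculation in (2) may be sketched in Python as follows, assuming a single line has been fitted to the row means on a diagram drawn in the original units, so that its slope to the vertical is $r\,\sigma_x/\sigma_y$. The function name, argument names, and sample figures are hypothetical illustrations, not values from the text.

```python
def r_from_one_slope(slope_to_vertical: float,
                     dispersion_x: float,
                     dispersion_y: float) -> float:
    """Rough estimate of r from one regression line and two dispersions.

    slope_to_vertical: slope of the line through the row means,
        read from a diagram plotted in the original units.
    dispersion_x, dispersion_y: measures of dispersion of the two
        variables, both of the same kind (e.g. both standard deviations).
    """
    if dispersion_x <= 0 or dispersion_y <= 0:
        raise ValueError("dispersions must be positive")
    ratio = dispersion_x / dispersion_y
    # The slope equals r times the ratio, so dividing recovers r.
    return slope_to_vertical / ratio


# Hypothetical figures: slope 0.8, standard deviations 2.0 and 1.5.
estimate = r_from_one_slope(0.8, 2.0, 1.5)
print(round(estimate, 2))
```

A slope read by eye may occasionally give a value a little beyond unity in magnitude; the sketch returns it unaltered so that the discrepancy remains visible.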
This method, or some variation of it, is often useful as a
makeshift when the data are too incomplete to permit of the
proper calculation of the correlation, only one line of regression
and the ratio of the dispersions of the two variables being required:
the ratio of the quartile deviations, or other simple measures of
dispersion, will serve quite well for rough purposes in lieu of the
ratio of standard-deviations. As a special case, we may note that