IX.—CORRELATION.
are asymmetrical, the skewness being positive for the rows at
the top of the table (the mode being lower than the mean), and
negative for the rows at the foot, the more central rows being
nearly symmetrical. The maximum frequency lies towards the
upper end of the table in the compartment under the row and
column headed “30-”. The frequency falls off very rapidly
towards the lower ages, and slowly in the direction of old age.
Outside these two forms, it seems impossible to delimit empirically
any simple types. Tables V. and VI. are given simply as illustrations
of two very divergent forms. Fig. 31 gives a graphical
representation of the former by the method corresponding to the
histogram of Chapter VI., the frequency in each compartment
being represented by a square pillar. The distribution of
frequency is very characteristic, and quite different from that
of any of the Tables I., II., III., or IV.
6. It is clear that such tables may be treated by any of the
methods discussed in Chapter V., which are applicable to all
contingency-tables, however formed. The distribution may be
investigated in detail by such methods as those of § 4, or tested
for isotropy (§ 11), or the coefficient of contingency can be
calculated (§§ 5-8). In applying any of these methods, however,
it is desirable to use a coarser classification than is suited to the
methods to be presently discussed, and it is not necessary to
retain the constancy of the class-interval. The classification
should, on the contrary, be arranged simply with a view to avoiding
many scattered units or very small frequencies. A few examples
should be worked as exercises by the student (Question 3).
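
The procedure of this section may be sketched in code. What follows is only a minimal illustration, assuming the usual definition of the coefficient of contingency, C = sqrt(χ² / (N + χ²)), with χ² the sum of (observed − expected)² / expected over all compartments. The function names, the grouping scheme, and the small table are hypothetical and are not taken from the text.

```python
from math import sqrt


def merge_classes(table, row_groups, col_groups):
    """Pool rows and columns of a frequency table into coarser classes.

    row_groups and col_groups list the original indices that go into each
    new class; the groups need not be of equal width.
    """
    return [
        [sum(table[i][j] for i in rows for j in cols) for cols in col_groups]
        for rows in row_groups
    ]


def contingency_coefficient(table):
    """Coefficient of contingency C = sqrt(chi2 / (N + chi2))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # an empty row or column contributes nothing
                chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n + chi2))


# Hypothetical fine-grained table with several small frequencies.
fine = [
    [5, 3, 1, 0],
    [2, 6, 3, 1],
    [1, 3, 6, 2],
    [0, 1, 3, 5],
]

# Pool adjacent classes in pairs to avoid scattered units.
coarse = merge_classes(fine, [[0, 1], [2, 3]], [[0, 1], [2, 3]])
c = contingency_coefficient(coarse)
```

The grouping is passed explicitly because, as stated above, the class-interval need not be kept constant: the classes are chosen only so that no compartment is left with a very small frequency.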
7. But the coefficient of contingency merely tells us whether,
and if so, how closely, the two variables are related, and much
more information than this can be obtained from the correlation-table,
seeing that the measures of Chapters VII. and VIII. can be
applied to the arrays as well as to the total distributions. If the
two variables are independent, the distributions of all parallel
arrays are similar (Chap. V. § 13); hence their averages and
dispersions, e.g. means and standard deviations, must be the same.
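
The statement may be checked numerically. The following is a minimal sketch, assuming each row of the table is treated as an array and each column is represented by the mid-value of its class-interval; the standard deviation is taken with divisor N. The function names and the two small tables are hypothetical, chosen only for illustration.

```python
from math import sqrt


def array_mean_sd(frequencies, midpoints):
    """Mean and standard deviation (divisor N) of a single array."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / n
    return mean, sqrt(variance)


def row_array_summaries(table, col_midpoints):
    """Mean and standard deviation of every row-array of the table."""
    return [array_mean_sd(row, col_midpoints) for row in table]


midpoints = [10, 20, 30]  # hypothetical mid-values of the column classes

# Rows proportional to one another: the variables are independent.
independent = [
    [1, 2, 1],
    [2, 4, 2],
]

# Frequencies shift to the right in the lower row: the variables are related.
related = [
    [3, 1, 0],
    [0, 1, 3],
]

independent_summaries = row_array_summaries(independent, midpoints)
related_summaries = row_array_summaries(related, midpoints)
```

In the independent table both arrays give the same mean and the same standard deviation; in the second table the means of the two arrays differ.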
In general they are not the same, and the relation between the
mean or standard deviation of the array and its type requires
investigation. Of the two constants, the mean is, in general, the
more important, and our attention will for the present be confined
to it. The majority of the questions of practical statistics
relate solely to averages: the most important and fundamental
question is whether, on an average, high values of the one variable
show any tendency to be associated with high (or with low)
values of the other. If possible, we also desire to know how great
a divergence of the one variable from its average value is associated