XVII.—SIMPLER CASES OF SAMPLING FOR VARIABLES. 347
of n, we may take the following case. If the student will turn to
the calculated binomials, given as illustrations of the forms of
binomial distributions in Chap. XV. § 3, he will find there the
distribution of the number of successes for twenty events when
q=0.9, p=0.1: the distribution is extremely skew, starting at
zero, rising to high frequencies for 1 and 2 successes, and thence
tailing off to 20 cases of 7 successes in 10,000 throws, 4 cases of 8
successes and 1 case of 9 successes. But now find the distribu-
tion for the mean number of successes in groups of five throws,
under the same conditions. This will be equivalent to finding
the distribution of the number of successes for 100 such events,
and then dividing the observed number of successes by five—the
last process making no difference to the form of the distribution,
but only to its scale. But the distribution of the number of
successes for 100 events when q=0.9, p=0.1, is also given in
Chap. XV. § 3, and it will be seen that, while it is appreciably
asymmetrical, the divergence from symmetry is comparatively
small: the distribution has gained very greatly in symmetry
though only five observations have been taken to the sample.
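
The comparison may be checked numerically. The following is only a sketch: it recomputes the tail frequencies of the twenty-event binomial quoted above and compares the skewness of the two distributions. The function names are illustrative, and the skewness measure used, (q - p)/sqrt(npq), is an assumption brought in for the comparison and is not taken from this passage.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent events."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_skewness(n, p):
    """Moment coefficient of skewness of the binomial, (q - p) / sqrt(n p q).

    Assumption: this measure is chosen for illustration; the text itself
    judges asymmetry from the tabulated distributions.
    """
    q = 1 - p
    return (q - p) / sqrt(n * p * q)


p = 0.1  # chance of success, with q = 0.9

# Expected frequencies per 10,000 throws in the tail of the n = 20 case.
tail_20 = {k: round(10_000 * binomial_pmf(k, 20, p)) for k in (7, 8, 9)}
print(tail_20)

# Skewness for 20 events and for 100 events (the mean of five throws
# has the same form as the 100-event distribution, differing only in scale).
skew_20 = binomial_skewness(20, p)
skew_100 = binomial_skewness(100, p)
print(round(skew_20, 3), round(skew_100, 3))
```

On this measure the asymmetry of the 100-event distribution is well under half that of the 20-event distribution, which agrees with the statement that it remains appreciable but comparatively small.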
We may therefore reasonably assume, if our sample is large,
that the distribution of means is approximately a normal
distribution, and we may calculate, on that assumption, the
frequency with which any given deviation of an observed mean
from a theoretical value, or from a value observed in some other
series, will arise from fluctuations of simple sampling alone.
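
As a rough illustration of such a calculation, the sketch below takes the standard error of the mean as sigma/sqrt(n), the usual result for simple sampling, and evaluates the chance of a deviation at least as great as the one observed, in either direction, on the normal assumption. The function names and the figures (a deviation of 0.5, a standard deviation of 2.5, a sample of 100) are hypothetical and chosen only for the example.

```python
from math import erfc, sqrt


def standard_error_of_mean(sigma, n):
    """Standard error of the mean of n independent observations."""
    return sigma / sqrt(n)


def two_sided_tail_probability(deviation, sigma, n):
    """Chance that the mean deviates by at least |deviation|, in either
    direction, assuming the distribution of means to be normal."""
    z = abs(deviation) / standard_error_of_mean(sigma, n)
    # For a standard normal variable Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return erfc(z / sqrt(2))


# Hypothetical figures, not taken from the text.
prob = two_sided_tail_probability(deviation=0.5, sigma=2.5, n=100)
print(round(prob, 4))
```

With these figures the deviation is twice the standard error, and such a deviation would arise from fluctuations of sampling in somewhat under 5 per cent. of samples.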
The warning is necessary, however, that the approach to
normality is only rapid if the condition that the several drawings
for each sample shall be independent is strictly fulfilled. If the
observations are not independent, but are to some extent positively
correlated with each other, even a fairly large sample may con-
tinue to reflect any asymmetry existing in the original distribution
(cf. ref. 32 and the record of sampling there cited).
If the original distribution be normal, the distribution of
means, even of small samples, is strictly normal. This follows at
once from the fact that any linear function of normally distributed
variables is itself normally distributed (Chap. XVI. § 6). The
distribution will not in general, however, be normal if the
deviation of the mean of each sample is expressed in terms of the
standard-deviation of that sample (cf. ref. 30).
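
The last point may be illustrated by a small sampling experiment. The sketch below is only an illustration under stated assumptions: it draws samples of five from a normal distribution with zero mean and unit standard deviation, and counts how often the deviation of the mean exceeds twice its standard error, first using the true standard deviation and then using the standard deviation of the sample itself (here taken with divisor n). The sample size, the number of trials, the threshold, and the seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def tail_fractions(n, trials, threshold=2.0, seed=0):
    """Fraction of samples whose mean deviates from zero by more than
    `threshold` standard errors, with the standard error computed
    (a) from the true standard deviation, 1, and
    (b) from the standard deviation of each sample (divisor n)."""
    rng = random.Random(seed)
    with_true_sd = 0
    with_sample_sd = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = fmean(sample)
        if abs(mean) / (1.0 / sqrt(n)) > threshold:
            with_true_sd += 1
        if abs(mean) / (pstdev(sample) / sqrt(n)) > threshold:
            with_sample_sd += 1
    return with_true_sd / trials, with_sample_sd / trials


known_sd, own_sd = tail_fractions(n=5, trials=20_000)
print(known_sd, own_sd)
```

The first fraction should lie near the normal value of about 0.046, while the second should be decidedly larger, showing that the ratio formed with the standard deviation of the sample has longer tails than the normal curve when the sample is small.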
14. Let us consider briefly the effect on the standard error of
the mean if the conditions of simple sampling as laid down in
§ 2 cease to apply.
(a) If we do not draw from the same record all the time, but
first draw a series of samples from one record, then another
series from another record with a somewhat different mean and