324 THEORY OF STATISTICS.
The square of the standard-deviation is given by the sum of
the terms in col. (4) less the square of the mean, that is,
$$\sigma^2 = np\left\{q^{n-1} + 2(n-1)q^{n-2}p + 3\,\frac{(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^2 + \dots\right\} - n^2p^2.$$
But the series in the bracket is the binomial series $(q+p)^{n-1}$
with the successive terms multiplied by 1, 2, 3, . . . It therefore
gives the difference of the mean of the said binomial from −1,
and its sum is therefore $(n-1)p + 1$. Therefore
$$\sigma^2 = np\{(n-1)p + 1\} - n^2p^2 = np - np^2 = npq.$$
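The result may be checked numerically. The following is a minimal sketch only, assuming the illustrative values n = 10 and p = 3/10; the function names are hypothetical and are not taken from the text. It sums the bracketed series directly and confirms that it equals (n − 1)p + 1, and that the resulting square of the standard-deviation equals npq.

```python
from fractions import Fraction
from math import comb


def bracket_series(n, p):
    """Sum the terms of (q + p)^(n-1), multiplied in turn by 1, 2, 3, ..."""
    q = 1 - p
    return sum(
        (k + 1) * comb(n - 1, k) * q ** (n - 1 - k) * p ** k
        for k in range(n)  # k = 0, 1, ..., n-1 covers every term
    )


def variance_from_series(n, p):
    """np times the bracketed series, less the square of the mean np."""
    return n * p * bracket_series(n, p) - (n * p) ** 2


# Illustrative values only (an assumption, not from the text).
n, p = 10, Fraction(3, 10)
q = 1 - p

# Exact rational arithmetic, so the comparisons are exact.
assert bracket_series(n, p) == (n - 1) * p + 1
assert variance_from_series(n, p) == n * p * q
```

Exact fractions are used rather than decimals so that the two sides agree without rounding error.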
7. The terms of the binomial series thus afford a means of
completely describing a certain class of frequency-distributions—
i.e. of giving not merely the mean and standard-deviation in
each case, but of describing the whole form of the distribution.
If N samples of n cards each be drawn from an indefinitely large
record of cards marked with A or a, the proportion of A-cards
in the record being p, then the successive terms of the series
$N(q+p)^n$ give the frequencies to be expected in the long run of
0, 1, 2, . . . A-cards in the sample, the actual frequencies only
deviating from these by errors which are themselves fluctuations
of sampling. The three constants N, p, n, therefore, determine
the average or smoothed form of the distribution to which actual
distributions will more or less closely approximate.
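As an illustration of how the three constants fix the smoothed distribution, the following sketch computes the successive terms of the series for assumed values N = 1000, n = 4, p = 0.5. These values and the function name are hypothetical and chosen only for the example.

```python
from math import comb


def expected_frequencies(N, n, p):
    """Successive terms of N(q + p)^n: the expected number of samples,
    out of N, containing 0, 1, 2, ..., n A-cards."""
    q = 1.0 - p
    return [N * comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]


# Illustrative constants only (an assumption, not from the text).
freqs = expected_frequencies(N=1000, n=4, p=0.5)
for r, f in enumerate(freqs):
    print(r, f)
# prints:
# 0 62.5
# 1 250.0
# 2 375.0
# 3 250.0
# 4 62.5
```

The frequencies sum to N, and an observed distribution of 1000 such samples would be expected to deviate from these figures only by fluctuations of sampling.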
Considered, however, as a formula which may be generally
useful for describing frequency-distributions, the binomial series
suffers from a serious limitation, viz. that it only applies to a
strictly discontinuous distribution like that of the number of
A-cards drawn from a record containing A’s and a’s, or the number
of heads thrown in tossing a coin. The question arises whether
we can pass from this discontinuous formula to an equation
suitable for representing a continuous distribution of frequency.
8. Such an equation becomes, indeed, almost a necessity for
certain cases with which we have already dealt. Consider, for
example, the frequency-distribution of the number of male births
in batches of 10,000 births, the mean number being, say, 5100.
The distribution will be given by the terms of the series
$(0.49 + 0.51)^{10{,}000}$ and the standard-deviation is, in round numbers,
50 births. The distribution will therefore extend to some 150
births or more on either side of the mean number, and in order
to obtain it we should have to calculate some 300 terms of a
binomial series with an exponent of 10,000! This would not
only be practically impossible without the use of certain methods
of approximation, but it would give the distribution in quite