XIV.—REMOVING LIMITATIONS OF SIMPLE SAMPLING. 277
greater than 0.5, greater than is possible for positive errors. The
assumption is not, however, likely as a rule to lead to a serious
mistake; as stated at the commencement of this paragraph, the
point is of importance only when n is small, for when n is large the
distribution tends to become sensibly symmetrical even for values
of p differing considerably from 0.5. (Cf. Chap. XV. for the
properties of the limiting form of distribution.)
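The tendency towards symmetry as n increases can be illustrated numerically. The sketch below uses the moment coefficient of skewness of the binomial distribution, (q − p)/√(npq), as the measure of asymmetry; that choice of measure, and the function name `binomial_skewness`, are assumptions made for illustration and are not taken from the text.

```python
from math import sqrt

def binomial_skewness(p, n):
    """Moment coefficient of skewness of a binomial distribution.

    p is the chance of success in one trial, n the number of trials.
    A value of zero indicates a symmetrical distribution.
    """
    q = 1.0 - p
    return (q - p) / sqrt(n * p * q)

# For a fixed p well away from 0.5, the skewness shrinks as n grows.
for n in (10, 100, 1000):
    print(n, round(binomial_skewness(0.1, n), 3))
```

For p = 0.5 the coefficient is zero whatever the value of n; for other values of p it falls off as 1/√n.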
2. In the second place, the student should note that, where we
were unable to assign any a priori value to p, we have assumed
that it is sufficiently accurate to replace p in the formula for the
standard error by the proportion actually observed, say π.
Where n is large so that the standard error of π becomes small
relatively to the product pq the assumption is justifiable, and no
serious error is possible. If, however, n be small, the use of the
observed value π may lead to an under- or over-estimation of the
standard error which cannot be neglected. To get some rough
idea of the possible importance of such effects, the approximate
standard error ε may first be calculated as usual from the
observed proportion π, and then fresh values recalculated, replacing
π by π ± 3ε. It should be remembered that the maximum
value of the product pq is given by p = q = 0.5, and hence these
values, if within the limits of fluctuations of sampling, will give
one limiting value for the standard error. The procedure is by
no means exact, but may serve to give a useful warning.
Thus in Example iii. of Chap. XIII. the observed proportion of
tall plants is 29/68, or, say, 43 per cent. The standard error of
this proportion is 6 per cent., and a true proportion of 50 per
cent. is therefore well within the limits of fluctuations of sampling.
The maximum value of the standard error is therefore
$\left(\frac{50 \times 50}{68}\right)^{1/2} = 6.06$ per cent.
On the other hand, the standard error is unlikely to be lower
than that based on a proportion of 43 − 18 = 25 per cent.,
$\left(\frac{25 \times 75}{68}\right)^{1/2} = 5.25$ per cent.
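The rough check described in this section can be sketched in a few lines. The following is only an illustration, assuming the usual simple-sampling formula √(pq/n) for the standard error of a proportion; the function names and the parameter `k` (the multiple of the standard error, 3 in the text) are hypothetical and introduced here for convenience.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of a proportion p (as a fraction) in a sample of n cases."""
    return sqrt(p * (1.0 - p) / n)

def standard_error_range(pi, n, k=3.0):
    """Rough limits for the standard error when pi is replaced by pi ± k*eps.

    Returns (eps, lowest, highest), where eps is the usual standard error
    calculated from the observed proportion pi.
    """
    eps = standard_error(pi, n)
    low_p = max(0.0, pi - k * eps)
    high_p = min(1.0, pi + k * eps)
    candidates = [low_p, high_p]
    # pq is greatest at p = q = 0.5, so if 0.5 lies within the limits
    # it supplies the maximum value of the standard error.
    if low_p <= 0.5 <= high_p:
        candidates.append(0.5)
    errors = [standard_error(p, n) for p in candidates]
    return eps, min(errors), max(errors)

# Example iii. of Chap. XIII.: 29 tall plants out of 68.
eps, lowest, highest = standard_error_range(29 / 68, 68)
print(round(100 * eps, 2), round(100 * lowest, 2), round(100 * highest, 2))
```

Because the sketch works from the unrounded proportion 29/68 rather than from 43 per cent. and a standard error of 6 per cent., its lower limit differs slightly in the second decimal place from the figure obtained above. Calling `standard_error(0.25, 68)` and `standard_error(0.5, 68)` directly reproduces the rounded working.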
3. The two difficulties mentioned in §§ 1 and 2 arise when n,
the number of cases in the sample, is small. The interpretation
of the value of the standard error is also more limited in this
case than when n is large. Suppose a large number of observations
to be made, by means of samples of n observations each, on
different masses of material, or in different universes, for each of
which the true value of p is known. On these data we could