273 THEORY OF STATISTICS.
form a correlation-table between the true proportion p in a given
universe and the observed proportion $\pi$ in a sample of n
observations drawn therefrom. What we have found from the work of
the last chapter is that the standard-deviation of an array of $\pi$’s
associated with a certain true value p, in this table, is $(pq/n)^{1/2}$;
but the question may be asked: What is the standard-deviation
of the array at right angles to this, i.e. the array of p’s associated
with a certain observed proportion $\pi$? In other words, given an
observed proportion $\pi$, what is the standard-deviation of the true
proportions? This is the inverse of the problem with which we
have been dealing, and it is a much more difficult problem.
On general principles, however, we can see that if n be large,
the two standard-deviations will tend, on the average of all
values of p, to be nearly the same, while if n be small the
standard-deviation of the array of $\pi$’s will tend to be appreciably the
greater of the two. For if $\pi = p + \delta$, $\delta$ is uncorrelated with p,
and therefore if $\sigma_p$ be the standard-deviation of p in all the
universes from which samples are drawn, $\sigma_\pi$ the
standard-deviation of observed proportions in the samples, and $\sigma_\delta$ the
standard-deviation of the differences,
$$\sigma_\pi^2 = \sigma_p^2 + \sigma_\delta^2.$$
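The additive relation between the three variances can be checked exactly on a small example. The sketch below is illustrative only: it assumes a hypothetical set of universes with true proportions 0.1, 0.2, ..., 0.9, all weighted equally, and binomial sampling within each; the function name `variance_decomposition` and the grid `ps` are not from the text.

```python
from fractions import Fraction
from math import comb


def variance_decomposition(ps, n):
    """Exact variances of pi, of p, and of delta = pi - p.

    `ps` is a hypothetical list of true proportions, one per universe,
    all universes being weighted equally; `n` is the sample size.
    """
    w = Fraction(1, len(ps))
    mean_p = sum(ps) * w
    var_p = sum((p - mean_p) ** 2 for p in ps) * w

    sum_pi = sum_pi_sq = var_delta = Fraction(0)
    for p in ps:
        for x in range(n + 1):
            # Probability of drawing universe p and then x successes in n trials.
            prob = w * comb(n, x) * p**x * (1 - p) ** (n - x)
            pi = Fraction(x, n)
            sum_pi += prob * pi
            sum_pi_sq += prob * pi**2
            var_delta += prob * (pi - p) ** 2  # delta has mean zero
    var_pi = sum_pi_sq - sum_pi**2
    return var_pi, var_p, var_delta


ps = [Fraction(k, 10) for k in range(1, 10)]  # assumed grid 0.1 ... 0.9
for n in (5, 20, 80):
    var_pi, var_p, var_delta = variance_decomposition(ps, n)
    print(n, float(var_pi), float(var_p), float(var_delta),
          var_pi == var_p + var_delta)
```

Exact fractions are used so that the equality holds without rounding error. The variance of the differences, multiplied by n, is the same for every n in this example, which is the sense in which it varies inversely as n.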
But $\sigma_\delta^2$ varies inversely as n. Hence if n become very large, $\sigma_\delta$
becomes very small, $\sigma_\pi$ becomes sensibly equal to $\sigma_p$, and therefore
the standard-deviations of the arrays, on an average, are also
sensibly equal. If n be large, therefore, $[\pi(1-\pi)/n]^{1/2}$ may be
taken as giving, with sufficient exactness, the standard-deviation
of the true proportion p for a given observed proportion $\pi$. But
if n be small, $\sigma_\delta$ cannot be neglected in comparison with $\sigma_p$, $\sigma_\pi$ is
therefore appreciably greater than $\sigma_p$, and the standard-deviation
of the array of $\pi$’s is, on an average of all arrays, correspondingly
greater than the standard deviation of the array of p’s—the state-
ment is not true for every pair of corresponding arrays, especially
for extreme values of p near 0 and 1. Further, it should be
noticed that, while the regression of $\pi$ on p is unity (i.e. the
mean of the array of $\pi$’s is identical with p, the type of the
array), the regression of p on $\pi$ is less than unity. If we
assume, therefore, that a tabulation of all possible chances, observed
for every conceivable subject, would give a distribution of p
ranging uniformly between 0 and 1, or indeed grouped symmetrically
in any way round 0.5, any observed value $\pi$ greater than
0.5 will probably correspond to a true value of p slightly lower
than $\pi$, and conversely. We have already referred to the use of
the inverse standard error in § 13 of Chap. XIII. (Case II., p. 269).
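The behaviour of the array of p’s for a given observed proportion can be illustrated numerically under one explicit assumption: that p is distributed uniformly between 0 and 1 over all universes. On that assumption the true proportion, given x successes in n trials, follows a Beta(x + 1, n - x + 1) distribution. The following is a minimal sketch; the function names `direct_sd` and `inverse_mean_sd` and the trial value 0.7 are illustrative choices, not from the text.

```python
from math import sqrt


def direct_sd(pi, n):
    """[pi(1 - pi)/n]^(1/2), the direct formula with pi put in place of p."""
    return sqrt(pi * (1 - pi) / n)


def inverse_mean_sd(x, n):
    """Mean and standard-deviation of p given x successes in n trials.

    Assumption: p is uniform on (0, 1) over all universes, so that p
    given x follows a Beta(x + 1, n - x + 1) distribution.
    """
    a, b = x + 1, n - x + 1
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


for n in (10, 100, 1000):
    x = 7 * n // 10          # observed proportion 0.7 in each case
    pi = x / n
    mean, sd = inverse_mean_sd(x, n)
    print(f"n={n:5d}  pi={pi:.3f}  mean p={mean:.4f}  "
          f"inverse SD={sd:.4f}  direct SD={direct_sd(pi, n):.4f}")
```

Under the uniform assumption the mean of p for given $\pi$ is $(n\pi + 1)/(n + 2)$, a straight line in $\pi$ with slope $n/(n + 2)$. The slope is less than unity, so the mean lies slightly nearer 0.5 than $\pi$ does, and the two standard-deviations draw together as n increases.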
If we determine, for example, the standard error of the difference
Tw