346 THEORY OF STATISTICS. 
than the other. If two samples be drawn quite independently 
from different universes, indefinitely large samples from which 
exhibit the standard-deviations $\sigma_1$ and $\sigma_2$, the standard error of
the difference of their means will be given by 
$$\epsilon_{12}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \qquad (7)$$
This is, indeed, the formula usually employed for testing the 
significance of the difference between two means in any case: 
seeing that the standard error of the mean depends on the 
standard-deviation only, and not on the mean, of the distribution, 
we can inquire whether the two universes from which samples 
have been drawn differ in mean apart from any difference in
dispersion. 
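As a minimal sketch of how formula (7) might be applied, the following computes the standard error of the difference between two means and the ratio of the observed difference to that standard error. The function name `se_diff_means` and all the numerical figures are hypothetical, chosen only for illustration; the standard-deviations of the samples are assumed to stand in for those of the universes, as is usual with large samples.

```python
import math


def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference of two independent sample means.

    Implements formula (7): eps^2 = sd1^2 / n1 + sd2^2 / n2.
    """
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Hypothetical figures (not taken from the text): mean, standard-deviation
# and number of observations for each of two independent samples.
m1, sd1, n1 = 67.5, 3.0, 100
m2, sd2, n2 = 66.0, 4.0, 100

se = se_diff_means(sd1, n1, sd2, n2)
ratio = (m1 - m2) / se  # observed difference in units of its standard error

print(f"standard error of the difference: {se:.3f}")
print(f"difference / standard error: {ratio:.2f}")
```

The larger the ratio, the less readily the difference can be ascribed to fluctuations of simple sampling alone.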
If two quite independent samples be drawn from the same 
universe, but instead of comparing the mean of the one with the 
mean of the other we compare the mean $m_1$ of the first with the
mean $m_0$ of both samples together, the use of (6) or (7) is not
justified, for errors in the mean of the one sample are correlated 
with errors in the mean of the two together. Following precisely
the lines of the similar problem in § 13, Chap. XIII, case III, we
find that this correlation is $\sqrt{n_1/(n_1+n_2)}$, and hence
$$\epsilon_{01}^2 = \sigma^2\,\frac{n_2}{n_1(n_1+n_2)} \qquad (8)$$
(For a complete treatment of this problem in the case of samples 
drawn from two different universes cf. ref. 22.)
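The step from the correlation to (8) may be sketched as follows. The notation is assumed for the purpose: $\sigma$ is the standard-deviation of the common universe, $n_1$ and $n_2$ are the numbers of observations in the two samples, and $r$ is the correlation between errors in the mean of the first sample and errors in the mean of both together. The mean of the first sample has standard error $\sigma/\sqrt{n_1}$, that of both together $\sigma/\sqrt{n_1+n_2}$, and the usual formula for the standard error of a difference of correlated quantities is taken as given.

```latex
\begin{align*}
\epsilon_{01}^2
  &= \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_1+n_2}
     - 2r\,\frac{\sigma}{\sqrt{n_1}}\cdot\frac{\sigma}{\sqrt{n_1+n_2}},
     \qquad r = \sqrt{\frac{n_1}{n_1+n_2}} \\
  &= \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_1+n_2} - \frac{2\sigma^2}{n_1+n_2} \\
  &= \sigma^2\,\frac{n_2}{n_1(n_1+n_2)}.
\end{align*}
```

When $n_2$ is very large compared with $n_1$ the result approaches $\sigma^2/n_1$, the squared standard error of the first mean alone, as it should.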
13. The distribution of means of samples drawn under the 
conditions of simple sampling will always be more symmetrical 
than the distribution of the original record, and the symmetry 
will be the greater the greater the number of observations in the 
sample. Further, the distribution of means (and therefore also of 
the differences between means) tends to become not merely sym- 
metrical but normal. We can only illustrate, not prove, the 
point here; but if the student will refer to § 13, Chap. XV., he will
see that the genesis of the normal curve in this case is in accord- 
ance with what we then stated, viz. that the distribution tends to 
be normal whenever the variable may be regarded as the sum 
(or some slightly more complex function) of a number of other 
variables. In the present instance this condition is strictly fulfilled.
The mean of the sample of n observations is the sum of
the values in the sample each divided by n, and we should expect 
the distribution to be the more nearly normal the larger n. As 
an illustration of the approach to symmetry even for small values