Full text: An Introduction to the theory of statistics

THEORY OF STATISTICS. 
however, be chosen, for simplicity in classification, so that no 
limit corresponds exactly to any recorded value (cf. § 8 below). In 
some exceptional cases, moreover, the observations exhibit a marked 
clustering round certain values, e.g. tens, or tens and fives. This 
is generally the case, for instance, in age returns, owing to the 
tendency to state a round number where the true age is unknown. 
Under such circumstances, the values round which there is a 
marked tendency to cluster should preferably be made mid-values 
of intervals, in order to avoid sensible error in the assumption that 
the mid-value is approximately representative of the values in the 
class. Thus, in the case of ages, since the clustering is chiefly round 
tens, “25 and under 35,” “35 and under 45,” etc., the classification 
of the English census, is a better grouping than “20 and under 
30,” “30 and under 40,” and so on (cf. the Census of England and 
Wales, 1911, vol. vii., and also ref. 5, in which a different view is 
taken). When there is any probability of a clustering of this kind 
occurring, it is as well to subject the raw material to a close 
examination before finally fixing the classification. 
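
The effect of the choice of limits can be sketched numerically. The following is a minimal illustration in Python, not part of the original text: the ages are invented solely to show clustering round 30 and 40, and the helper name class_means is an assumption made for this example. It compares each class mid-value with the mean of the values actually falling in the class, for limits starting at 20 and at 25.

```python
from collections import defaultdict

def class_means(values, start, width):
    """Group values into intervals [lower, lower + width) on a scale
    beginning at `start`, and return
    {(lower, upper): (mid_value, mean_of_values_in_class)}."""
    groups = defaultdict(list)
    for v in values:
        k = (v - start) // width          # index of the interval holding v
        groups[start + k * width].append(v)
    return {
        (lower, lower + width): (lower + width / 2, sum(vs) / len(vs))
        for lower, vs in sorted(groups.items())
    }

# Hypothetical ages, invented for illustration: heavy clustering on 30 and 40.
ages = [27, 29, 30, 30, 30, 30, 31, 33, 36, 38, 40, 40, 40, 40, 41, 44]

for start in (20, 25):
    print(f"Scale beginning at {start}:")
    for (lo, hi), (mid, mean) in class_means(ages, start, 10).items():
        print(f"  {lo} and under {hi}: mid-value {mid}, class mean {mean:.2f}")
```

With these invented figures the class means lie close to the mid-values when the round ages are placed at the centres of the intervals, and well away from them when the round ages are made class-limits.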
7. Classification.—The scale of intervals having been fixed, the 
observations may be classified. If the number of observations is 
not large, it will be sufficient to mark the limits of successive 
intervals in a column down the left-hand side of a sheet of paper, 
and transfer the entries of the original record to this sheet by 
marking a 1 on the line corresponding to any class for each entry 
assigned thereto. It saves time in subsequent totalling if each 
fifth entry in a class is marked by a diagonal across the preceding 
four, or by leaving a space. 
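
The tallying process just described may be sketched as follows. This is only an illustration under stated assumptions: the death-rates and the class-limits are hypothetical figures, the function names tally and tally_marks are invented for the example, and each interval is taken to include its lower limit and exclude its upper one.

```python
def tally(values, limits):
    """Count the values falling in each interval [limits[i], limits[i+1])."""
    counts = [0] * (len(limits) - 1)
    for v in values:
        for i in range(len(limits) - 1):
            if limits[i] <= v < limits[i + 1]:
                counts[i] += 1
                break
        else:
            # No interval contains v: the scale of intervals is too narrow.
            raise ValueError(f"{v} lies outside the scale of intervals")
    return counts

def tally_marks(n):
    """Render a count as marks, leaving a space after every fifth mark."""
    full, rest = divmod(n, 5)
    return " ".join(["11111"] * full + (["1" * rest] if rest else []))

# Hypothetical death-rates and class-limits, invented for illustration.
limits = [12, 14, 16, 18]
rates = [13.2, 15.7, 14.1, 16.9, 15.3, 15.8, 14.6, 17.2, 15.1, 14.9, 13.8, 15.5]

for i, n in enumerate(tally(rates, limits)):
    print(f"{limits[i]} and under {limits[i + 1]}: {tally_marks(n)}  ({n})")
```

Like the paper tally, the function keeps only the counts, so a discrepancy between two runs cannot be traced back to the entry that caused it.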
The disadvantage in this process is that it offers no facilities for 
checking: if a repetition of the classification leads to a different 
result, there is no means of tracing the error. If the number of 
observations is at all considerable and accuracy is essential, it is 
accordingly better to enter the values observed on cards, one to 
each observation. These are then dealt out into packs according 
to their classes, and the whole work checked by running through 
the pack corresponding to each class, and verifying that no cards 
have been wrongly sorted. 
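
The card method can be sketched in the same way. The example below is a hedged illustration rather than a prescribed procedure: the figures are hypothetical, the names class_index, deal and misplaced are invented here, and the lower limit of each interval is assumed to be included in the class. Because every card keeps its recorded value, a wrongly sorted card can be identified on running through the packs.

```python
from bisect import bisect_right

def class_index(value, limits):
    """Index of the interval [limits[i], limits[i+1]) containing value."""
    i = bisect_right(limits, value) - 1
    if i < 0 or i >= len(limits) - 1:
        raise ValueError(f"{value} lies outside the scale of intervals")
    return i

def deal(cards, limits):
    """Deal the cards into packs, one pack per class."""
    packs = {i: [] for i in range(len(limits) - 1)}
    for card in cards:
        packs[class_index(card, limits)].append(card)
    return packs

def misplaced(packs, limits):
    """Run through each pack and list (pack, card) pairs wrongly sorted."""
    return [
        (i, card)
        for i, pack in packs.items()
        for card in pack
        if class_index(card, limits) != i
    ]

# Hypothetical class-limits and recorded values, invented for illustration.
limits = [12, 14, 16, 18]
cards = [13.2, 15.7, 14.1, 16.9, 15.3, 17.2]

packs = deal(cards, limits)
clean = misplaced(packs, limits)      # nothing wrongly sorted as yet

# Simulate a sorting slip: one card is put into the wrong pack.
packs[2].remove(17.2)
packs[0].append(17.2)
errors = misplaced(packs, limits)     # the check now traces the slip

print("after dealing:", clean)
print("after the slip:", errors)
```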
8. In some cases difficulties may arise in classifying, owing to 
the occurrence of observed values corresponding to class-limits. 
Thus, in compiling Table I., some districts will have been noted 
with death-rates entered in the Registrar-General’s returns as 
16.5, 17.5, or 18.5, any one of which might at first sight have 
been assigned indifferently to either of two adjacent 
classes. In such a case, however, where the original figures for 
numbers of deaths and population are available, the difficulty may 
be readily surmounted by working out the rate to another place 