The 1852 Census of Upper and Lower Canada:
Proposed Oversampling Strategy, and Discussion

Reaction by Michael Haan, Department of Sociology, University of Toronto

Following up on my promise to send you information on nearest neighbour/hot deck imputation methods, I attach the references: three from Statcan methodologists and one from a couple of academics...

Note, however, that all of these resources refer to individual imputation, not regional imputation. To perform regional imputation, you will first have to create an aggregate dataset (at the level of the subdivision) and sort it by the variables of interest (district, plus any other characteristics you pull off the published counts). Then fill each missing value with the preceding or succeeding value in the sorted file (usually the preceding one), and merge the new, complete data back to the individual file.
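
The steps above could be sketched roughly as follows, in Python with only the standard library. Everything named here is an assumption made for illustration: the variable avg_household, the field names, the toy values, and the helper hot_deck are hypothetical and do not come from the census files. The sketch sorts the subdivision-level rows, fills each gap from the preceding row, records which values were filled in, and merges the result back to the individual records.

```python
from operator import itemgetter

# Hypothetical individual-level records: each person belongs to one subdivision.
individuals = [
    {"person_id": 1, "subdivision": "A1"},
    {"person_id": 2, "subdivision": "A2"},
    {"person_id": 3, "subdivision": "A2"},
    {"person_id": 4, "subdivision": "B1"},
    {"person_id": 5, "subdivision": "B2"},
]

# Hypothetical aggregate file, one row per subdivision.
# None marks a value that is missing and needs a donor.
subdivisions = [
    {"subdivision": "B2", "district": "B", "population": 900, "avg_household": None},
    {"subdivision": "A1", "district": "A", "population": 400, "avg_household": 5.2},
    {"subdivision": "A2", "district": "A", "population": 450, "avg_household": None},
    {"subdivision": "B1", "district": "B", "population": 850, "avg_household": 6.1},
]


def hot_deck(rows, sort_keys, target):
    """Sort the rows, then fill each missing target from the preceding donor."""
    ordered = sorted(rows, key=itemgetter(*sort_keys))
    filled = []
    last_donor = None
    for row in ordered:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row[target] is None:
            new_row[target] = last_donor  # stays None if no donor precedes it
            new_row["imputed"] = last_donor is not None
        else:
            last_donor = new_row[target]
            new_row["imputed"] = False
        filled.append(new_row)
    return filled


# Sort by district, then by a published characteristic (here, population).
completed = hot_deck(subdivisions, ("district", "population"), "avg_household")

# Merge the completed subdivision values back onto the individual file.
by_subdivision = {row["subdivision"]: row for row in completed}
merged = [
    {
        **person,
        "avg_household": by_subdivision[person["subdivision"]]["avg_household"],
        "imputed": by_subdivision[person["subdivision"]]["imputed"],
    }
    for person in individuals
]
```

In this toy file, subdivision A2 takes its value from A1 and B2 takes its value from B1. Note that a simple sequential fill like this one can borrow a donor from the previous district when the first row of a district is missing; whether that is acceptable depends on the sort variables chosen.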

Many statisticians warn against doing this: the imputed values are treated as if they had been observed, which results in underestimated variances in any regression run on the data.
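
A small numerical illustration of this warning, using made-up values and the simplest possible case (the standard error of a mean, which is what an intercept-only regression estimates). The numbers and the helper standard_error are assumptions for illustration only. Four observed values each donate once to a missing neighbour, so the completed file has eight rows but no new information.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical observed values for four subdivisions.
observed = [2.0, 4.0, 6.0, 8.0]

# The same file after each observed value has been copied to one
# missing neighbour (preceding-value hot deck).
completed = [2.0, 2.0, 4.0, 4.0, 6.0, 6.0, 8.0, 8.0]

se_observed = standard_error(observed)    # about 1.29
se_completed = standard_error(completed)  # about 0.85

# The mean is unchanged, but the standard error shrinks even though
# the completed file contains no additional real observations.
print(statistics.mean(observed), statistics.mean(completed))
print(round(se_observed, 3), round(se_completed, 3))
```

The apparent gain in precision comes entirely from counting the copied values as independent observations.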

If you include a flag identifying the imputed values, however, people can choose not to use them.

