The 1852 Census of Upper and Lower Canada:
Proposed Oversampling Strategy, and Discussion

Reaction by Prof. Lisa Dillon, Département de démographie, Université de Montréal

Hi Kris,

Thank you for your comments on my proposed 1852 extra-sampling approach as well as on Benoit Laplante's recommendations. You make a very good point that, with weights, I can design several schemes to restore representativity, and that this approach is more flexible than my own idea of adding extra "substitute" cases (Mike Haan, a grad student in Sociology at U of T, told me this is known as "nearest neighbour imputation").
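To make the weighting idea concrete, here is a minimal sketch of one common scheme, post-stratification, in which each sampled case is weighted by the ratio of its stratum's population share to its sample share. This is only an illustration under stated assumptions, not the scheme proposed for the 1852 sample: the function name, the district labels and all counts are hypothetical and invented for the example.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Return one weight per stratum: population share / sample share.

    sample_strata: list with the stratum label of each sampled case.
    population_counts: dict mapping stratum label -> population count.
    """
    sample_counts = Counter(sample_strata)
    n_sample = len(sample_strata)
    n_pop = sum(population_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        pop_share = population_counts[stratum] / n_pop
        sample_share = count / n_sample
        weights[stratum] = pop_share / sample_share
    return weights


# Hypothetical, made-up figures: district_A is under-represented
# in the sample relative to its share of the population.
sample = ["district_A"] * 30 + ["district_B"] * 70
population = {"district_A": 500, "district_B": 500}

weights = poststratification_weights(sample, population)

# Weighted number of cases per district in the sample.
weighted_cases = {s: weights[s] * n for s, n in Counter(sample).items()}
```

With these invented figures, cases from the under-represented district receive a weight above 1 and the others a weight below 1, so the weighted sample reproduces the population shares. Because the weights are computed after the fact, several alternative weight variables could be attached to the same records, which is the flexibility referred to above.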

Gordon has recommended trying both approaches, and I would like to do that. We might be able to do so in the context of taking another 10% slice of the manuscript (bringing the total to a 20% sample). Next week, we are going to look at our budget to see whether we have enough money for another 10% (in addition to verification, checking and cleaning). If we did take another 10%, I could identify within the second 10% certain cases that could be said to "stand in" for the missing cases in the first 10%, allowing users to try out the "nearest neighbour" approach if they wish. I could also add a series of weight variables to the sample that would work with the data if all 20% were used.

My only remaining concern about Benoit Laplante's recommended approach is that I would have to find an expert to help me design the weights, and we have limited funds.

An additional point: we have been examining the aggregate census statistics for the 1852 Census and have found a few problems. For example, the proportion of children aged 0 to 5 in the population in the aggregate statistics is lower than it ought to be. As a result, we are going to have to go through these aggregate statistics quite carefully before using them as the basis for any sort of extra-sampling or weighting.

