PRDH 1852 :: Reaction by prof. Gordon Darroch

The 1852 Census of Upper and Lower Canada:
Proposed Oversampling Strategy, and Discussion

Reaction by prof. Gordon Darroch, Institute of Social Research, York University

Lisa, I had a chance to read Laplante's suggestions as well as your proposal, and I pass on a couple of thoughts, for what they are worth.

I thought Laplante's ideas sound and interesting. I don't know anything about bootstrapping techniques, but I hear a lot about them and know there is a good deal of literature and experience with them. The general logic seems sound to me: weight down the existing sample to match "estimates" made from aggregate data about the population. I think it has the merit of adjusting the sample to 'known' values without risking making population estimates from an oversample that will tend to homogenize values (by duplicating sampled cases, the donors) where there might have been more diversity in the population. It also does not allow users to inflate N's by failing to weight an oversample, though you can probably guard against this in the database dissemination. It has the limitation that it tends to deflate N's by weighting down (I think).
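To make the weighting logic and the point about deflated N's concrete, here is a minimal sketch of ordinary post-stratification weighting, together with Kish's approximation for the effective sample size. It is not Laplante's actual procedure (which involves bootstrapping and is not described here); the strata, the counts, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def poststratification_weights(sample_strata, population_counts):
    """Return one weight per sampled case so that the weighted stratum
    shares match the population shares. Weights average to 1."""
    n = len(sample_strata)
    sample_counts = Counter(sample_strata)
    pop_total = sum(population_counts.values())
    return [
        (population_counts[s] / pop_total) / (sample_counts[s] / n)
        for s in sample_strata
    ]

def effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical example: the sample holds 8 rural and 2 urban cases,
# while the aggregate tables say the population is 60% rural, 40% urban.
sample = ["rural"] * 8 + ["urban"] * 2
population = {"rural": 600, "urban": 400}

weights = poststratification_weights(sample, population)
print(weights[0], weights[-1])   # rural cases weighted down, urban cases up
print(effective_n(weights))      # smaller than the nominal N of 10
```

In this toy case the rural cases are weighted down and the urban ones up, and the effective N falls below the nominal 10 cases, which is one way to read the "deflated N's" limitation.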

The oversample approach has a stronger intuitive appeal and may, therefore, encourage users. It also seems to me workable, though maybe less elegant than Laplante's suggestions. I assume you would make the weighted, added cases very evident, so users have to make a decision to use them and know their implications.

But you have an opportunity (in principle) to do both. I suggest you try both, at least for a subsample of the 1852 sample, compare them in a series of cross tabs, regressions and other models with appropriate weights, and find out what difference it makes. In some analyses, I bet, it won't matter at all; in others it may make a big difference. At least, whichever approach you adopt, if you haven't time or funds to examine both, you should consider, I think, systematic comparative analysis between the original sample and the weighted ones, and provide these results for users to see the differences, if any.
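As a rough illustration of the kind of comparison suggested here, the sketch below tabulates one categorical variable with and without weights and reports the difference in shares. The occupations, the weights, and the function names are hypothetical; a real comparison would use the project's own variables and weights and extend to cross tabs and regressions.

```python
def weighted_shares(categories, weights=None):
    """Share of each category; unweighted when no weights are given."""
    if weights is None:
        weights = [1.0] * len(categories)
    total = sum(weights)
    shares = {}
    for category, weight in zip(categories, weights):
        shares[category] = shares.get(category, 0.0) + weight / total
    return shares

def share_differences(categories, weights):
    """Weighted share minus unweighted share, per category."""
    unweighted = weighted_shares(categories)
    weighted = weighted_shares(categories, weights)
    return {c: weighted[c] - unweighted[c] for c in unweighted}

# Hypothetical data: ten sampled cases, the last two carrying larger weights.
occupations = ["farmer"] * 6 + ["labourer"] * 2 + ["merchant"] * 2
case_weights = [0.75] * 8 + [2.0] * 2

for occupation, diff in share_differences(occupations, case_weights).items():
    print(f"{occupation}: {diff:+.2f}")
```

Differences near zero would suggest the weighting barely matters for that table; large ones flag the analyses where users most need to see both versions.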

Finally, are there enough small towns and parts of urban areas to use them to estimate other urban areas in the same fashion, either to weight the existing few urban area cases or to add cases? Would it be worth an experiment? Or could you look for rural areas that might match the aggregate tabs for urban ones and use these to estimate urban populations? It is something of a presumption, especially in 1852, that urban populations are much different from their rural neighbors; after all, people were moving back and forth, in and out of urban places all the time (the old transiency and migration findings).

I am interested; keep me informed.
Cheers.

Last updated: 2/10/2021
