Monday, January 02, 2006

New Canada Election poll 2006: Globe and Mail / CTV / Strategic Counsel

Can we trust the latest Globe and Mail / CTV / Strategic Counsel poll (released January 2)? Sometimes, the numbers just don't add up, and Strategic Counsel's latest election poll is a case in point.
  • Canada: 1,000 (3.2)
  • Quebec: 248 (6.3)
  • Rest of Canada: 753 (3.6)
  • Ontario: 568 (5.0)
  • West: 297 (5.7)
  • B.C.: 133
How the hell do these numbers add up to anything? Adding any combination of these numbers to try to reach the sample size of 1,000 is fruitless. The only combination that comes close is Quebec plus the Rest of Canada (but even then, you end up with 1,001 and not 1,000).
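
For anyone who wants to check the arithmetic, here is a small Python sketch that tries every combination of the regional sample sizes listed above against the stated total of 1,000. The figures are the ones quoted in this post; the variable and function names are just mine, and I'm assuming B.C. is a sub-sample of the West.

```python
from itertools import combinations

# Sample sizes as listed in the post (B.C. is presumably part of the West).
samples = {
    "Quebec": 248,
    "Rest of Canada": 753,
    "Ontario": 568,
    "West": 297,
    "B.C.": 133,
}
TARGET = 1000  # the national sample size reported for the poll


def subset_sums(sizes):
    """Return (total, regions) for every combination of two or more regions."""
    names = list(sizes)
    results = []
    for r in range(2, len(names) + 1):
        for combo in combinations(names, r):
            results.append((sum(sizes[name] for name in combo), combo))
    return results


sums = subset_sums(samples)

# Combinations that hit the target exactly, and the one that comes closest.
exact = [combo for total, combo in sums if total == TARGET]
closest_total, closest_combo = min(sums, key=lambda s: abs(s[0] - TARGET))

print("Exact matches:", exact)
print("Closest:", closest_combo, closest_total)
```

No combination lands on 1,000 exactly; the closest is Quebec plus the Rest of Canada at 1,001.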

ANOTHER THOUGHT: "Findings have been rolled up and analyzed over a three-day period. Interviews were conducted between Dec. 31 and Jan. 1." Something must not be right here. Dec. 31 to Jan. 1 is a two-day, not three-day, period. On second thought: I suppose if you count today, it would be correct to say that they were rolled up over a three-day period. My bad.

AND ANOTHER THOUGHT: Is the Strategic Counsel even polling the Atlantic region? From what I have seen - no. Not in the poll, and not in any past polls that I can remember.

Which begs the question: can such a poll even be trusted?


Patrick said...

A 1000 person poll does not just poll 1000 persons across the country. It might, for example, poll 1500, and then weight each part of the country for the proportion of seats they have in Parliament to come up with the 1000 figure. That is why adding Ontario plus the west plus Quebec doesn't add up to 1000.

They then report the numbers according to statistical significance. That is why the North (with only three seats where, in theory, one person represents the entire territory of Nunavut in the poll) is not reported. That is likely also why Atlantic is not reported.

When they have a part of Canada where there is greater interest, they can choose to poll a larger number. For example, they polled 568 in Ontario, whereas their share of the national population would have warranted 335 in a poll of 1000.

That doesn't explain why they reported B.C. (133 is not a significant number) and not Atlantic (where the number for the four Atlantic provinces would have been about 16%).

Basic polling theory indicates that the only two numbers which have any sort of reliability for prediction are those for Canada as a whole and for the Rest of Canada. Everything else is just data.

With regard to the three day period, the interviews were over a two day period, but "the findings have been rolled up and analyzed over a three-day period", so the analysis was today.

Jonathan said...

Thanks for clearing things up Patrick.

I don't get why they don't report Atlantic Canada though...or make it appear as though they don't report it.

I guess I was jumping the gun based on the little information the CTV poll gave me. Strategic Counsel still hasn't posted their raw numbers yet.

Patrick said...

The Maritimes number can be extrapolated from what has been reported, since it's the only set that hasn't been reported. This is what I came up with on my spreadsheet - likely within a 1/2 point of actual in each instance.

Conservatives 36.5
Liberals 38
NDP 19.5
Green 6

I know it doesn't add up to 100 - the original numbers didn't either. But it gives you a good approximation.

The actual gap between the Conservatives and the Liberals would be a little closer because I had no way to pull out the North's three seats, and in the North, the Liberals are far ahead of the Conservatives.

Bottom line, the Conservatives are within a point or two of the Liberals in the Maritimes. I'll leave it to you to conjecture why neither CTV nor Globe and Mail pointed that out, while dealing with a much smaller sample set for B.C.

Patrick said...

By the way, also unreported is the figure for the Rest of Canada (Canada not including Quebec).

There, we can extrapolate from the data that the Conservatives are ahead of the Liberals by 38 to 35, even with Ontario being 38 to 32 for the Liberals. This is the first time the ROC poll has shown such a large difference in the Strategic Counsel's figures.

You can come up with a pretty good idea of what's really going on by combining Decima and Strategic Counsel.

Ogilvie said...

I have to admit the mathematics of it all is a little beyond me. It always surprises me how accurate polls usually are. It's a great pity, because I think releasing results during the campaign tends to influence voters. On one American statistician's website, he said there are indications that NDP supporters will flood to the Liberals to block the Conservatives if the latter appear to have power within reach. I've never been phoned by a pollster, but would refuse to answer anyway. I guess they somehow factor in people who refuse to respond. Would they be included in the undecideds, I wonder?

Patrick said...

No, they would be included in the 80% who either refuse to participate or who aren't reachable.

Polling theory is based on the principle that if you have enough of a sample size of randomly chosen participants, then the extremes on each end will not be reflected in the final results because they will cancel each other out. For example, there are some who will deliberately give the wrong answer (i.e. not what they actually believe). Maybe twenty in each sample. But they give DIFFERENT wrong answers. There are others who say they're voting Liberal because they think the pollster is an agent of the government wanting to check out their fealty to the ruling party. But they're cancelled out by those who say they're voting Conservative because they want to send the message that they're mad as hell at corruption. And so on.

The ideal sample size is 2000. 1000 is borderline in its usefulness. What a 1000 sample poll is great for is tracking trends IF the same question is asked the same way in successive polls.

The one thing almost nobody understands is that the range of 1000-2000 is the required sample size for any population above 100 thousand. You need the same sample size to accurately poll London, Ontario as you do to poll China.
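
A quick way to see why population size barely matters is the textbook margin-of-error formula with the finite population correction. The Python sketch below is only an illustration: it assumes a simple random sample, the usual worst case of a 50/50 split, and rough round population figures that I've picked for the two places (they are not from the poll).

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of n drawn from a
    finite population, using the finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


# Rough, assumed population figures, for illustration only.
populations = {
    "London, Ontario": 350_000,
    "China": 1_300_000_000,
}

for place, size in populations.items():
    moe = margin_of_error(1000, size)
    print(f"{place}: +/-{100 * moe:.2f}% with n=1000")
```

Under these assumptions both results come out at roughly 3.1%, because the correction factor is already very close to 1 once the population is hundreds of times larger than the sample.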

Stats Prof said...

Good eye, Jonathan. This is very wonky poll reporting. On the SC website, it's presented as a TWO-day Dec 30-31 poll (n=1000) as follows. They usually do 3-day roll-ups, but not this time due to the stat holiday. So, this is a typo, I believe.

297 West
379 Ontario
753 ROC
247 Que

I'm going to assume that they MUST be polling the Atlantic. In which case, that n=170. But why do they never report it? What neglect of both comprehensive poll reporting and of Atlantic Canada. The sample size is enough to report, as the Atlantic margin of error is +/- 7.5% (based on n=170).

Patrick: I'm afraid you do not understand polling or polling theory (more properly termed "sampling theory") completely, and you are unintentionally misinforming people with your post. I hope this helps:

First, no polling firm conducts interviews in the North as part of their public release polls. None. They only do the provinces. That's a fact.

Second, if they report the poll as an n=1000, then they polled 1000. They did not poll more or less. Why? Because when you calculate margin of error, you must do so based on the UNweighted numbers.

Third, regional quotas are used to ensure statistically valid and comparable sample sizes in the different regions. Therefore, statistical weighting (using actual Census region, age, and gender data, usually) of the results is conducted to ensure that the final data-processed results (that we see, that they report) actually reflect the country's real population proportions. BUT, because the actual regional sample sizes are based on quotas and because margin of error is based on these unweighted figures, we have statistically valid and somewhat comparable regional results.
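
To make the weighting idea concrete, here is a minimal Python sketch of weighting quota samples back to population proportions. Every number in it is invented for illustration (the regions, interview counts, population shares, and support levels are not from Strategic Counsel or any real poll), and real firms weight on more variables than region alone.

```python
# All figures below are made up for illustration; they are not from any poll.
regions = {
    # name: (completed interviews, census population share, share backing Party X)
    "Region A": (400, 0.25, 0.30),
    "Region B": (300, 0.40, 0.40),
    "Region C": (300, 0.35, 0.50),
}

# Unweighted total: this is the n that the margin of error is based on.
total_n = sum(n for n, _, _ in regions.values())


def weight(n, pop_share):
    """Per-respondent weight: population share divided by sample share."""
    return pop_share / (n / total_n)


# Raw result: every respondent counts equally, so oversampled regions dominate.
unweighted = sum(n * support for n, _, support in regions.values()) / total_n

# Weighted result: each region counts in proportion to its population share.
weighted_total = sum(n * weight(n, pop) for n, pop, _ in regions.values())
weighted = sum(
    n * weight(n, pop) * support for n, pop, support in regions.values()
) / weighted_total

print(f"Unweighted n: {total_n}")
print(f"Unweighted support: {100 * unweighted:.1f}%")
print(f"Weighted support:   {100 * weighted:.1f}%")
```

In this made-up case Region A is oversampled, so its respondents get a weight below 1 and the national figure shifts once the weights are applied, while the number of completed interviews stays at 1000.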

Fourth, statistical weighting is done on the census variables I mention above. During elections, polling firms also add other factors into the mix such as Past Vote Intent, or they slightly weight-up the Conservative (or maybe the Liberal these days) vote because of pollster beliefs that these voters are less likely to honestly self-report. They ABSOLUTELY DO NOT weight based on proportion of seats in parliament - NO firm does or would do that. It does not make proper sense. They use Census data.

Fifth, if you are confused about how they do riding projections ... they apply these weighted data to past vote results, sometimes with other formulas applied. But they do not weight their data based on parliamentary proportions.

Sixth, 133 IS a statistically significant number within its confidence interval or level (aka margin of error). In this case, 133 holds a +/-8.5% MOE. In fact, any number of completed surveys of 35 or greater can have a statistical significance test (a Z-Test on these results, T-Test where there are mean averages to test) applied to it to test and confirm whether it is significant. This is a standard test that all these firms apply to their data before processing and releasing it.

Seventh, your discussion of the "ideal sample size" being 2000 reveals that you do not understand sampling/polling theory properly. A poll of 1000 has a +/-3.1% margin of error at a 95% confidence level. That's very reliable. For 2000, it's +/-2.2%. Stronger, but not much stronger overall. However, it would provide larger regional sub-samples with greater reliability. Still, the regional sub-samples in the polls that we've seen in this election are okay as long as you keep the margin of error in mind when reading the results.
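
For readers who want to reproduce these margins of error, the sketch below uses the standard formula for a proportion at a 95% confidence level, assuming a simple random sample and the worst case of a 50/50 split. The function name is just a label chosen for this example; the sample sizes are the ones quoted in this comment.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error for a proportion p estimated from n completed surveys."""
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes mentioned in this comment: national, the "ideal" 2000,
# the assumed Atlantic sub-sample, and B.C.
for n in (1000, 2000, 170, 133):
    print(f"n={n}: +/-{100 * margin_of_error(n):.1f}%")
```

Because the margin shrinks with the square root of n, doubling the sample from 1000 to 2000 only trims it from about 3.1 to about 2.2 points.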

You are correct in saying that "the one thing almost nobody understands is that the range of 1000-2000 is the required sample size for any population above 100 thousand. You need the same sample size to accurately poll London, Ontario as you do to poll China" ... but where you come up with 2000 as the ideal sample size for any type of survey research is beyond me. There is no scientific basis for that comment. And, in terms of economic doability - well it's really expensive. Ask any market/opinion research firm how often they do a survey among a sample size that high - for release or for their big-walleted private sector and public sector clients - and they will tell you that this happens maybe 3-5 times a year.

This isn't an attack, Patrick. I'm merely trying to help you and Jonathan and the rest of the readers.