Are poll aggregators bad for democracy?

Clickbait title but this was, partially at least, the debate I was having on Twitter earlier today.

It all started when Frank Graves, from Ekos, tweeted this:


As well as this, later in the conversation:


Then Darrel Bricker, from Ipsos, followed up by tweeting this:


And finally Jean-Marc Léger (from Léger360) weighed in as well:

Ok, so let's break down the various issues raised here.

1. Seat projections were bad in 2015 while polls were accurate

Yes and no. Yes, seat projections, mine included, were definitely quite off. Most poll aggregators and forecasters had the Liberals winning a minority; I even had them winning a relatively small one. They ultimately won a comfortable majority. My model failed on many levels.

With that said, I get tired of the claim that the polls were right. Yes, polls did relatively well in 2015, but let's not pretend they didn't underestimate the Liberals, in particular in Quebec. A raw poll average of the last week puts the Liberals at 37.8%; they ultimately got 39.5%, almost 2 full points higher. Only one poll (Forum) actually overestimated the LPC. For the polls to be "right", the poll average should have been closer.

I know what you'll say: yes, but margins of error. Fair enough. But a poll average has a really low margin of error, so missing the mark by 1.7 points isn't nothing. For projections, a 2-point difference can easily turn a minority into a majority. Seat projections, and actual seats in an election, are very sensitive to every percentage point. We don't get the luxury of being within 2-3% of the correct result and calling it a success.
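To put rough numbers on that, here is a quick back-of-the-envelope sketch. It assumes simple random samples (which modern polls are not) and a hypothetical field of eight polls of 1,500 respondents each, so treat it as an illustration of the idea, not of any actual aggregate:

```python
# Rough illustration only: real polls are not simple random samples, and a proper
# aggregate would also weight by sample size, recency and house effects.
import math

def moe_95(p, n):
    """Approximate 95% margin of error for a proportion p with sample size n."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

p = 0.38                # a party polling around 38%
single_poll_n = 1500    # hypothetical sample size of one poll
polls_in_average = 8    # hypothetical number of polls in the average

pooled_n = polls_in_average * single_poll_n
print(f"Single poll (n={single_poll_n}): +/- {moe_95(p, single_poll_n):.1%}")
print(f"Average of {polls_in_average} polls (n={pooled_n}): +/- {moe_95(p, pooled_n):.1%}")
# Roughly +/-2.5% for one poll versus +/-0.9% for the pooled average, so a
# 1.7-point miss on the average is not something sampling error alone explains.
```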

The polls were particularly bad in Quebec, where they underestimated the Liberals by over 5 points while overestimating the NDP by 1. This alone accounts for much of why seat projections had the Liberals at a minority, but it isn't the only factor. Models, like mine, probably also failed to anticipate the increased turnout in some key ridings.

I personally underestimated the Liberals even more because I don't allocate undecided voters proportionally. My method has worked well in Quebec, Alberta and BC, but in 2015 it made me miss the Liberals by even more. I stand by my choice of a non-proportional allocation, as the evidence clearly shows incumbents tend to be underestimated. Also, my final average was actually closer overall than the one from 308/CBC, for instance; it just made me miss the most important party this time. With that said, it did get the Conservatives closer to their real result. I will take the blame on this one, as I should have accounted for the last-minute momentum towards the Liberals, but I'll continue to aggregate polls with a non-proportional allocation of the undecided.
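For readers unfamiliar with the distinction, here is a minimal sketch of proportional versus non-proportional allocation of the undecided. The incumbent bonus, its 25% size and all the vote shares are made up for the example; they are not the actual coefficients of my model.

```python
# Toy example: how the undecided are allocated changes the projected shares.

def allocate_proportional(decided, undecided):
    """Give each party a share of the undecided equal to its share of the decided vote."""
    total = sum(decided.values())
    return {party: share + undecided * share / total for party, share in decided.items()}

def allocate_with_incumbent_bonus(decided, undecided, incumbent, bonus=0.25):
    """Non-proportional variant: reserve an extra chunk of the undecided for the
    incumbent party, then split the rest proportionally. The 25% bonus is illustrative."""
    extra = undecided * bonus
    allocated = allocate_proportional(decided, undecided - extra)
    allocated[incumbent] += extra
    return allocated

# Hypothetical decided shares (in %) with 10% undecided; CPC as the incumbent.
decided = {"CPC": 30.0, "LPC": 33.0, "NDP": 20.0, "Other": 7.0}
undecided = 10.0

for label, result in [("proportional", allocate_proportional(decided, undecided)),
                      ("incumbent bonus", allocate_with_incumbent_bonus(decided, undecided, "CPC"))]:
    print(label, {party: round(share, 1) for party, share in result.items()})
```

In this made-up case the incumbent bonus moves a couple of points from the challengers to the incumbent, which is exactly the mechanism that got my Conservative numbers closer in 2015 while pushing my Liberal numbers further away.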

Still, my model, for all its shortcomings in 2015, would have predicted a Liberal majority with 170 seats had it been fed the correct percentages. So while I agree that I need to improve the model, particularly for elections where turnout can change greatly, I also don't think 2015 should be remembered only as the victory of the pollsters and the failure of seat projections. The polls share part of the blame for the failure of the projections.


2. Poll aggregators are potentially bad for democracy

I can't see how. If anything, I find it far more misleading when the media report on a single poll instead of an average. In recent weeks, for instance, we've had some Forum polls that were very different from the average (I talked about them in the past) and some newspapers ran with them. That is far more dangerous and misleading than a careful aggregation. A single poll is subject to a much higher margin of error than an average (well, except in 2015 where, somehow, the latest Nanos or Forum polls would have done better alone, even at the regional level, despite ridiculously small sample sizes...). I don't buy for one second the argument that it would be less shameful or less dangerous for democracy if newspapers or other media were paying for polls and analyzing them separately. I don't even see the connection.

As for the risk of influencing the election: meh. We hear the same argument against polls themselves. Here too, poll aggregators at least provide a better analysis. I see seat projections as providing information, and information is always welcome.

The bigger debate here was also how the media aren't paying for polls anymore and instead "steal" the numbers from polls they didn't pay for. I understand why it could be frustrating to see the giant CBC not commissioning polls and getting a ton of traffic by simply averaging everyone else's. At the same time, if pollsters aren't happy with that, they could just stop releasing public polls. My understanding is that political polls have long been a way for pollsters to get exposure, not a money maker. And if exposure is the objective, they should be happy that sites like mine mention them. That said, there may well be a good case for the media to pay for polls.

As for asking for permission to use their numbers... sorry, but that argument doesn't hold in my view. The numbers are publicly published. Why should I, or anyone else, have to ask for permission to use them? Bricker said he doesn't think his data should be used for riding-level projections and doesn't endorse it. He has the right to think that, but that's pretty much it. I don't even understand the logic: the numbers themselves aren't intellectual property or anything, they are pure information.

Just to be clear, I've always had a good relationship with all the pollsters mentioned here. But on this issue, we won't agree.

I'm not a lawyer, but I'd like to hear the opinion of one on this.


3. The seat projection models aren't transparent enough.

I'm not sure I agree here. Eric Grenier of 308/CBC has always put pretty much the entire methodology of his simple model on his site; read the blog and you can recreate the model in 10 minutes. I don't personally publish all the coefficients and details, but I provide enough of the methodology. Also, my model is publicly available for everyone to use. Some projectors do tend to think they are the sh*t and have a secret, magical recipe; The Signal from Vox Populi was a perfect example of that.
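To give a sense of how simple such a model can be, here is a generic proportional-swing sketch of a riding-level projection. It is neither the actual 308/CBC model nor mine, and every number in it is hypothetical.

```python
# Generic proportional swing: scale each party's last riding result by the change
# in its province-wide support, then renormalize. Repeat over 338 ridings and
# count the winners to get a seat projection.

def project_riding(riding_last, prov_last, prov_now):
    """Project one riding from its last result and the provincial swing."""
    raw = {party: riding_last[party] * prov_now[party] / prov_last[party]
           for party in riding_last}
    total = sum(raw.values())
    return {party: 100 * share / total for party, share in raw.items()}

# Hypothetical riding and provincial numbers (in %).
riding_2011 = {"CPC": 42.0, "LPC": 25.0, "NDP": 28.0, "Other": 5.0}
prov_2011   = {"CPC": 40.0, "LPC": 28.0, "NDP": 27.0, "Other": 5.0}
prov_polls  = {"CPC": 33.0, "LPC": 38.0, "NDP": 22.0, "Other": 7.0}

projection = project_riding(riding_2011, prov_2011, prov_polls)
winner = max(projection, key=projection.get)
print({party: round(share, 1) for party, share in projection.items()}, "->", winner)
```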

Look, I'm not totally against publishing the source code and everything on GitHub, for instance, but I'm not sure pollsters are the ones who should be telling me to be more transparent.

Pollsters publish very few details of their methodology. Ipsos publishes more detailed tables, but that's about it, and they are almost the only ones to do so. None publishes the exact sampling method or weighting process, and let's remember that those are essential. In the US, the same raw data was given to four different analysts and they all came up with different estimates. There was one poll during the last US election that gave access to its full raw data (ironically, it turned out to be one of the least accurate polls, at least for the estimates based on the original weighting).
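To illustrate why the weighting process matters, here is a toy example in which the same six raw responses give different topline estimates depending solely on which variable is used for weighting. The respondents, population targets and both weighting schemes are invented for the example; real pollsters weight on many more variables.

```python
# Same raw data, two different (made-up) weighting schemes, two different estimates.

def weighted_share(votes, weights, party="LPC"):
    """Weighted share of respondents supporting the given party."""
    return sum(w for v, w in zip(votes, weights) if v == party) / sum(weights)

# Hypothetical raw responses with two demographic attributes.
respondents = [
    {"vote": "LPC", "age": "18-34", "edu": "university"},
    {"vote": "CPC", "age": "55+",   "edu": "no degree"},
    {"vote": "LPC", "age": "35-54", "edu": "university"},
    {"vote": "CPC", "age": "55+",   "edu": "university"},
    {"vote": "NDP", "age": "18-34", "edu": "no degree"},
    {"vote": "LPC", "age": "55+",   "edu": "no degree"},
]
votes = [r["vote"] for r in respondents]

# Scheme A: weight on age only (made-up population targets vs sample shares).
age_targets = {"18-34": 0.27, "35-54": 0.34, "55+": 0.39}
age_sample  = {"18-34": 2 / 6, "35-54": 1 / 6, "55+": 3 / 6}
weights_a = [age_targets[r["age"]] / age_sample[r["age"]] for r in respondents]

# Scheme B: weight on education only.
edu_targets = {"university": 0.30, "no degree": 0.70}
edu_sample  = {"university": 3 / 6, "no degree": 3 / 6}
weights_b = [edu_targets[r["edu"]] / edu_sample[r["edu"]] for r in respondents]

print("LPC share, weighted by age:      ", round(weighted_share(votes, weights_a), 3))
print("LPC share, weighted by education:", round(weighted_share(votes, weights_b), 3))
# Six respondents is absurdly small, but the point stands at any sample size:
# the weighting choices, which are rarely published in full, move the numbers.
```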

If the pollsters don't have to publish their "secret sauce", I'm not sure why poll aggregators should.