psepho

Posts taken from an old project with an expired domain name

Toronto election data

As with any analytical project, we invested significant time in obtaining and integrating data for our neighbourhood-level modeling. The Toronto Open Data portal provides detailed election results for the 2003, 2006, and 2010 elections, which is a great resource. However, the results are saved as Excel files with a separate worksheet for each ward, which is not an ideal format for working with R.

We’ve taken the Excel files for the mayoral-race results and converted them into a data package for R called toVotes. This package includes the votes received by ward and area for each mayoral candidate in each of the last three elections.
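
To illustrate the kind of reshaping this involves, here is a minimal Python sketch. It is not the code used to build toVotes (which is an R package), and it assumes each ward's worksheet has already been read into a list of rows. The ward numbers, area labels, candidate names, vote counts, and the `to_long_format` helper are all made up for illustration.

```python
# Hypothetical input: one "worksheet" per ward, each a list of rows.
# Every name and number here is invented for illustration.
worksheets = {
    1: [
        {"area": "001", "Candidate A": 120, "Candidate B": 80},
        {"area": "002", "Candidate A": 95, "Candidate B": 110},
    ],
    2: [
        {"area": "001", "Candidate A": 60, "Candidate B": 140},
    ],
}


def to_long_format(worksheets, year):
    """Flatten per-ward worksheets into one row per (ward, area, candidate)."""
    rows = []
    for ward, sheet in sorted(worksheets.items()):
        for record in sheet:
            for column, value in record.items():
                if column == "area":
                    continue  # identifier column, not a candidate
                rows.append({
                    "year": year,
                    "ward": ward,
                    "area": record["area"],
                    "candidate": column,
                    "votes": value,
                })
    return rows


long_rows = to_long_format(worksheets, year=2010)
```

With one row per ward, area, and candidate, totals by candidate or by ward become a simple filter and sum rather than a walk across worksheets.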

If you’re interested in analyzing Toronto’s elections, we hope you find this package useful. We’re also happy to take suggestions (or code contributions) on the GitHub page.

First attempt at predicting the 2014 Toronto mayoral race

In our first paper, we describe the results of some initial neighbourhood-level modeling of which candidates voters are likely to support in the 2014 Toronto mayoral race. All of our data comes from publicly available sources.

We use a combination of proximity voting theory and statistical techniques (linear regression and principal component analysis) to undertake two streams of analysis:

  1. Determining what issues have historically driven votes and what positions neighbourhoods have taken on those issues
  2. Determining which neighbourhood characteristics might explain why people favour certain candidates

In both cases we use candidates’ currently stated positions on issues and assign them scores from 0 (‘extreme left’) to 100 (‘extreme right’). While certainly subjective, there is at least internal consistency to such modeling.
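
As a rough illustration of the proximity idea, the Python sketch below assumes a single left-right dimension and predicts that each neighbourhood supports the candidate whose score is closest to its own position. The candidate names, scores, neighbourhood positions, and the `closest_candidate` helper are hypothetical; the modeling in our paper is more involved than this.

```python
# Hypothetical scores on the 0 ('extreme left') to 100 ('extreme right') scale.
candidate_positions = {
    "Candidate A": 25.0,
    "Candidate B": 50.0,
    "Candidate C": 80.0,
}

# Hypothetical neighbourhood positions on the same scale.
neighbourhood_positions = {
    "Neighbourhood X": 30.0,
    "Neighbourhood Y": 70.0,
}


def closest_candidate(position, candidates):
    """Return the candidate whose score is nearest to the given position."""
    return min(candidates, key=lambda name: abs(candidates[name] - position))


predicted = {
    neighbourhood: closest_candidate(position, candidate_positions)
    for neighbourhood, position in neighbourhood_positions.items()
}
```

Because every candidate is scored on the same scale, shifting one candidate's stated position changes the predictions for all neighbourhoods in a consistent way, which is the internal consistency mentioned above.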

This work demonstrates that significant insights on the upcoming mayoral election in Toronto can be obtained from an analysis of publicly available data. In particular, we find that:

We are now moving on to something completely different, where we use an agent-based approach to simulate entire elections. We are actively working on this now and hope to share our progress soon.

What is PsephoAnalytics?

Political campaigns have limited resources, both time and money, that should be spent on attracting the voters who are most likely to support their candidates. Identifying these voters can be critical to the success of a candidate.

Given the privacy of voting and the lack of useful surveys, there are few options for identifying individual voter preferences:

The goal of PsephoAnalytics* is to model voting behaviour in order to accurately explain campaigns (starting with the 2014 Toronto mayoral race). This means attempting to answer four key questions:

  1. What are the (causal) explanations for how election campaigns evolve – and how well can we predict their outcomes?
  2. What are the effects of (even simple) shocks to election campaigns?
  3. How can we advance our understanding of election campaigns?
  4. How can elections be better designed?

* Psephology (from the Greek psephos, for ‘pebble’, which the ancient Greeks used as ballots) deals with the analysis of elections.