psepho
Posts taken from an old project with an expired domain name
Friday, September 12, 2025
As with any analytical project, we invested significant time in obtaining and integrating data for our neighbourhood-level modeling. The Toronto Open Data portal provides detailed election results for the 2003, 2006, and 2010 elections, which is a great resource. But, they are saved as Excel files with a separate worksheet for each ward. This is not an ideal format for working with R.
We’ve taken the Excel files for the mayoral-race results and converted them into a data package for R called toVotes. This package includes the votes received by ward and area for each mayoral candidate in each of the last three elections.
If you’re interested in analyzing Toronto’s elections, we hope you find this package useful. We’re also happy to take suggestions (or code contributions) on the GitHub page.
Tuesday, September 16, 2014
As with any analytical project, we invested significant time in obtaining and integrating data for our neighbourhood-level modeling. The Toronto Open Data portal provides detailed election results for the 2003, 2006, and 2010 elections, which is a great resource. But, they are saved as Excel files with a separate worksheet for each ward. This is not an ideal format for working with R.
We’ve taken the Excel files for the mayoral-race results and converted them into a data package for R called toVotes. This package includes the votes received by ward and area for each mayoral candidate in each of the last three elections.
If you’re interested in analyzing Toronto’s elections, we hope you find this package useful. We’re also happy to take suggestions (or code contributions) on the GitHub page.
Friday, September 12, 2014
In our first paper, we describe the results of some initial modeling - at a neighbourhood level - of which candidates voters are likely to support in the 2014 Toronto mayoral race. All of our data is based upon publicly available sources.
We use a combination of proximity voter theory and statistical techniques (linear regression and principal-component analyses) to undertake two streams of analysis:
- Determining what issues have historically driven votes and what positions neighbourhoods have taken on those issues
- Determining which neighbourhood characteristics might explain why people favour certain candidates
In both cases we use candidates’ currently stated positions on issues and assign them scores from 0 (‘extreme left’) to 100 (‘extreme right’). While certainly subjective, there is at least internal consistency to such modeling.
This work demonstrates that significant insights on the upcoming mayoral election in Toronto can be obtained from an analysis of publicly available data. In particular, we find that:
- Voters will change their minds in response to issues. So, “getting out the vote” is not a sufficient strategy. Carefully chosen positions and persuasion are also important.
- Despite this, the ‘voteability’ of candidates is clearly important, which includes voter’s assessments of a candidate’s ability to lead and how well they know the candidate’s positions.
- The airport expansion and transportation have been the dominant issues across the city in the last three elections, though they may not be in 2014.
- A combination of family size, mode of commuting, and home values (at the neighbourhood level) can partially predict voting patterns.
We are now moving on to something completely different, where we use an agent-based approach to simulate entire elections. We are actively working on this now and hope to share our progress soon.
Wednesday, September 10, 2014
Political campaigns have limited resources -–both time and financial - that should be spent on attracting voters that are more likely to support their candidates. Identifying these voters can be critical to the success of a candidate.
Given the privacy of voting and the lack of useful surveys, there are few options for identifying individual voter preferences:
- Polling, which is large-scale, but does not identify individual voters
- Voter databases, which identify individual voters, but are typically very small scale
- In-depth analytical modeling, which is both large-scale and helps to ‘identify’ voters (at least at a neighbourhood level on average)
The goal of PsephoAnalytics* is to model voting behaviour in order to accurately explain campaigns (starting with the 2014 Toronto mayoral race). This means attempting to answer four key questions:
- What are the (causal) explanations for how election campaigns evolve – and how well can we predict their outcomes?
- What are effects of (even simple) shocks to election campaigns?
- How can we advance our understanding of election campaigns?
- How can elections be better designed?
Psephology (from the Greek psephos, for ‘pebble’, which the ancient Greeks used as ballots) deals with the analysis of elections.