When I started the blog a couple of years ago I sort of promised to write a series of posts on how the simulation works so that others could replicate the results, if need be. Unfortunately gainful employment has interfered, and one week out from the election there is no way I will get this finished. Still, better late than never.

While there is a bit of maths going on behind the scenes, the general principle is surprisingly simple: average all available NZ political polls, and then run a Monte Carlo simulation to get all the interesting information we need to make the graphs. This process is summarised in the schematic below:

The process can be divided into three main steps:

1. **Polling information (red):** Moving averages of the political polls and information regarding electorate swings are calculated from the input polls and the results of the 2005 and 2008 General Elections. (NB: this information, along with the party lists in step 3, is the only information that goes into the calculations.)
2. **Election simulation (blue):** Using the Monte Carlo method, running on a standard laptop with a standard pseudo-random number generator, an election is simulated, based on the polling averages and electorate swings calculated in step 1.
3. **Scenario analysis (green):** Using the simulated election results from step 2, we look at the party lists and figure out who gets into parliament. We then look at any other result that may be of interest. Normally this would be the number of seats won by each party, which parties will form a coalition and so on, but in theory, if the simulation in step 2 is working correctly, it can be anything you may be interested in looking at after a real election. For example, if you wanted to, you could look at the number of women candidates winning a South Island electorate seat.

Of course, depending on the pseudo-random numbers dished up in step 2, you may get a relatively unlikely result: perhaps, based on current polling, your simulation gives National 47% of the vote, Labour 32%, the Greens 14%, and New Zealand First 5%. This is possible, but not the most probable outcome. To make sure the results are realistic we simply repeat steps 2 and 3 a large number of times, and keep a running total for each variable or outcome we are interested in measuring. By doing this, any unlikely statistical fluctuations should cancel each other out, and we can get a meaningful measurement of the numbers we are interested in.
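To make the "repeat and keep a running total" idea concrete, here is a minimal Python sketch. It is not the simulation used for this site: the party names and polling averages are made up, and the stand-in for step 2 (`simulate_once`, which just adds Gaussian noise with an assumed spread of 2 points and rescales to 100%) is a hypothetical placeholder for the real election simulation.

```python
import random

# Made-up polling averages (percent of party vote), purely for illustration.
POLL_AVERAGES = {"Party A": 45.0, "Party B": 35.0, "Party C": 12.0, "Party D": 8.0}


def simulate_once(rng, averages, noise_sd=2.0):
    """Hypothetical stand-in for step 2: one noisy draw of the party vote.

    Each party's share is its polling average plus Gaussian noise
    (noise_sd is an assumed value), rescaled so the shares sum to 100.
    """
    raw = {party: max(rng.gauss(avg, noise_sd), 0.0) for party, avg in averages.items()}
    total = sum(raw.values())
    return {party: 100.0 * share / total for party, share in raw.items()}


def run_simulations(averages, n_runs=50_000, seed=1):
    """Repeat the simulated election n_runs times, keeping running totals."""
    rng = random.Random(seed)  # a standard pseudo-random number generator
    totals = {party: 0.0 for party in averages}
    for _ in range(n_runs):
        result = simulate_once(rng, averages)
        for party, share in result.items():
            totals[party] += share  # running total for each variable
    # Dividing by the number of runs gives the expected value of each variable.
    return {party: totals[party] / n_runs for party in averages}


if __name__ == "__main__":
    for party, share in run_simulations(POLL_AVERAGES).items():
        print(f"{party}: {share:.1f}%")
```

Any single call to `simulate_once` can land some way from the averages, but the per-party means returned by `run_simulations` settle down close to them as `n_runs` grows.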

Typically steps 2 and 3 are repeated 50,000 times for each day we simulate an election for, which takes about a minute or so of computer time. To get the time series graphs, we have to do these simulations for each day we are interested in, although normally they are just run for the last couple of hundred days to pick up any recent movement, as in the Scenario Analysis time series graph, for example (scroll down to “Scenario Analysis”).

Each time we complete step 3, we update a running total of the variables we are interested in (number of seats won by National, number of women candidates winning a South Island electorate seat, etc.) and also of the variables squared (number of seats won by National squared, number of women candidates winning a South Island electorate seat squared, etc.). We then divide by the total number of simulations (say, 50,000), and that gives the expected values and the expected values of the squares. For example, in yesterday's simulations the National Party won a total of 3,270,000 seats across all 50,000 simulated elections, and dividing by 50,000 gives an expected value of 65.4 seats. A bit of seventh-form stats gives the root mean square (RMS) error on the expected value, and that is how we get the final value of 65.4 +/- 0.8 seats for National (scroll down to “Seats in Parliament”).
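The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name `summarise` and the five toy seat counts are made up, and it assumes the "RMS error" is the usual standard deviation, i.e. the square root of (expected value of the square) minus (expected value) squared.

```python
import math


def summarise(samples):
    """Return (expected value, RMS spread) using only running totals."""
    n = 0
    total = 0.0     # running total of the variable
    total_sq = 0.0  # running total of the variable squared
    for x in samples:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n        # expected value
    mean_sq = total_sq / n  # expected value of the square
    # Assumed definition of the RMS error: sqrt(E[x^2] - E[x]^2).
    # max() guards against a tiny negative value from rounding.
    spread = math.sqrt(max(mean_sq - mean * mean, 0.0))
    return mean, spread


if __name__ == "__main__":
    # The arithmetic from the text: 3,270,000 seats over 50,000 simulations.
    print(3_270_000 / 50_000)  # 65.4

    # Toy seat counts from five pretend simulations (not real results),
    # chosen so that the summary comes out as 65.4 +/- 0.8.
    mean, spread = summarise([64, 65, 66, 66, 66])
    print(f"{mean:.1f} +/- {spread:.1f}")  # 65.4 +/- 0.8
```

Keeping only the two running totals means nothing has to be stored per simulation, which is handy when each day involves 50,000 runs.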

That’s all there is to it. The calculations for the poll averaging and the simulation get a bit more involved, although probably not much harder than a first-year uni level maths course, but the general principle of the calculation should be surprisingly simple.
