Claims of US election fraud: the ‘abnormal’ updates.

The 2020 US presidential election vote count happened ‘live’ — not only were there YouTube feeds from some of the counting places but also ‘running totals’ for a state could be observed live, as they were being updated with newly counted votes. One is able to obtain the numbers of votes in each update together with the exact time the update was made.

This led to claims of foul play. Because Joe Biden’s vote was concentrated in areas of high population density, and because he dominated the mail-in voting, it is reasonable to suppose that the updates consisting of mail-in votes from large, densely populated metropolitan areas were mostly votes for him. It was, after all, President Trump who discouraged people from voting by mail, resulting in more of his supporters opting to vote in person.

There were several such claims, using various methods but with the same ‘trick’ under the bonnet: assuming that vote shares across an insufficiently large geographical region conform to a Gaussian distribution, then declaring any update that is an outlier to be fraudulent. Here is one such analysis, and here is the kind of chart typically presented as evidence:

How not to apply a Gaussian fit to election results.

The above plot is a histogram of the updates for Michigan. The x-axis is logarithmic: a value of 3 means an update in which Biden has a thousand times as many votes as Trump, whilst a value of -2 means an update in which Trump has a hundred times as many votes as Biden. The green bars are updates made before 3AM on election night, and the red bars are updates made after 3AM. The claim being made is that the red updates on the extreme right violate the (typically log-normal) distribution usually seen during elections, and that they are therefore either extremely improbable or fraudulent.
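To make the x-axis concrete, here is a minimal Python sketch of how such a value could be computed for a single update. The function name `log_ratio` and the sample vote counts are illustrative assumptions, not figures from the Michigan data. Base-10 logarithms are assumed because a value of 3 is described as a thousand-fold ratio.

```python
import math

def log_ratio(biden_votes, trump_votes):
    """Base-10 log of the Biden-to-Trump vote ratio in one update.

    Hypothetical helper: both counts must be positive, because the
    ratio is undefined when either candidate has no votes in an update.
    """
    if biden_votes <= 0 or trump_votes <= 0:
        raise ValueError("both vote counts must be positive")
    return math.log10(biden_votes / trump_votes)

# Made-up updates as (Biden votes, Trump votes) pairs; not real data.
sample_updates = [(1000, 1), (1, 100), (500, 500)]
for biden, trump in sample_updates:
    print(biden, trump, round(log_ratio(biden, trump), 2))
```

An update split evenly between the two candidates sits at 0, and each step of 1 along the axis is a further factor of ten in one candidate's favour.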

In reality, those updates were indeed votes from heavily Democrat, urban areas and, in one case, a quickly corrected clerical error in Shiawassee County, with an extra zero added to the Biden total. In Wisconsin, for example, Milwaukee officials confirmed that their mail-in ballots had to be added to the total in a single update of about 169,000 votes: ‘Wisconsin law requires the results of those absentee ballots be reported all at once, Wisconsin Elections Commission Administrator Meagan Wolfe explained Wednesday’.

In case you still do not believe it, think of it this way: surely it makes sense to update with the mail-in ballots for a large city all in one go and surely that update would indeed swing heavily towards Biden? And, since the mail-in ballots take longer to count, surely those would be added towards the end of the count?
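The same point can be illustrated numerically. The sketch below uses invented vote counts (not real 2020 data): a handful of small in-person updates that lean slightly towards Trump, plus one large mail-in batch from a hypothetical city that leans heavily towards Biden. Measuring that batch against a Gaussian fitted to the in-person updates makes it look wildly improbable, even though no fraud is built into the numbers.

```python
import math
import statistics

def log_ratio(biden_votes, trump_votes):
    """Base-10 log of the Biden-to-Trump vote ratio in one update."""
    return math.log10(biden_votes / trump_votes)

# Invented in-person updates as (Biden votes, Trump votes); not real data.
in_person = [(480, 520), (450, 550), (510, 490),
             (470, 530), (495, 505), (440, 560)]

# Invented mail-in batch from a large, heavily Democrat-voting city.
mail_in_batch = (90_000, 10_000)

# Fit a Gaussian to the in-person updates only.
ratios = [log_ratio(b, t) for b, t in in_person]
mean = statistics.mean(ratios)
stdev = statistics.stdev(ratios)

# How many standard deviations from that fit is the mail-in batch?
z_score = (log_ratio(*mail_in_batch) - mean) / stdev
print(f"mail-in batch z-score: {z_score:.1f}")
```

The large z-score says only that the mail-in batch was drawn from a different population of voters than the in-person updates. It says nothing about whether the votes in it are genuine.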

It is a valid technique — used in the literature — to flag up possible election fraud by looking at vote percentages and turnouts and detecting anomalies. Here, it is applied to detect fraud in the Russian and Ugandan elections — the characteristic bi-peaked distribution pinpointing areas of high turnout and high vote share for the winning candidate in a small minority of jurisdictions in which fraud takes place. The fraudulent second peak is circled in the diagram below:

An image from the paper of Klimek, Yegorov, Hanel and Turner plotting turnout against votes for winner for 12 different elections. The high-turnout, high-vote ‘second peaks’ for Russia and Uganda are evidence of ballot-stuffing. In Canada, the second peak represents Quebec voting for nationalist parties.
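As a rough illustration of what that second peak corresponds to in the data, the sketch below flags electoral units where both turnout and the winner's vote share are very high. It is a crude threshold-based stand-in for the two-dimensional histogram in the paper, not the authors' method. The function name, the 0.9 cutoffs and the three sample units are all hypothetical.

```python
def flag_second_peak(units, turnout_cutoff=0.9, share_cutoff=0.9):
    """Return names of units with very high turnout AND winner share.

    `units` holds (name, eligible_voters, votes_cast, votes_for_winner).
    The cutoffs are illustrative assumptions, not values from the paper.
    """
    flagged = []
    for name, eligible, cast, winner in units:
        turnout = cast / eligible   # fraction of eligible voters who voted
        share = winner / cast       # winner's fraction of votes cast
        if turnout >= turnout_cutoff and share >= share_cutoff:
            flagged.append(name)
    return flagged

# Made-up units; only "C" has both near-total turnout and near-total share.
units = [
    ("A", 1000, 620, 330),
    ("B", 1200, 700, 410),
    ("C", 900, 890, 870),
]
print(flag_second_peak(units))
```

The check needs turnout as well as vote share for each unit, which is exactly the information a stream of live updates does not carry.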

Those high-turnout, high-vote share areas are most likely ones where ballot-stuffing (fraudulently adding votes on behalf of people who did not vote to one candidate’s total) took place. But this analysis has several key differences from the ones performed by Trump supporters.

Firstly, an update is not the same as a vote total for a region: mail-in and election-day votes for the same region arrive in different updates, whilst a single update might contain votes for several regions. Secondly, the fact that ballot-stuffing produces higher turnout is not factored into this particular pro-Trump analysis; turnout figures are not considered at all. Thirdly, if there were vote-rigging for Biden, the whole state of Michigan would look anomalous compared to the rest of America, whilst its metropolitan areas would look anomalous compared to other metropolitan areas in terms of turnout and/or Biden vote share. This is not the case: neither the battleground states nor the Democrat-voting large metropolises within them are outliers when considered against other states and other large metropolises respectively.

For example, take the swings by state. The highest swing to Biden in a ‘battleground’ state is in Georgia, with +5.37%: this is not even in the top ten of Biden swings, which Vermont leads on +9.00%. Arizona is on +3.81% and Michigan on +3.00%, whilst Pennsylvania and Wisconsin are below +2%, and therefore below the nationwide swing of +2.35%.

The turnouts in the battleground states are likewise not out of the ordinary compared to the others. A claim to the contrary resulted from miscalculating the 2020 turnout by dividing the number of people who voted by the number of registered voters: in all previous years, turnout was calculated by dividing the number of people who voted by the number of those eligible to vote.
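The effect of switching denominators is easy to see with a small worked example. The figures below are invented round numbers, not data for any real state. The sketch assumes, as is usually the case, that fewer people are registered than are eligible.

```python
def turnout(votes_cast, denominator):
    """Turnout as a fraction of whichever population is the denominator."""
    return votes_cast / denominator

# Invented figures for a hypothetical state; not real data.
votes_cast = 600_000
registered_voters = 800_000
eligible_voters = 1_000_000

by_registered = turnout(votes_cast, registered_voters)
by_eligible = turnout(votes_cast, eligible_voters)

print(f"turnout by registered voters: {by_registered:.0%}")
print(f"turnout by eligible voters:   {by_eligible:.0%}")
```

The same election yields two quite different turnout figures, so comparing one year's registered-voter figure with earlier years' eligible-voter figures creates an apparent jump from the change of method alone.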

Regarding the Democrat-supporting cities in those states, it is also not true that Biden’s performance in them was inconsistent with the other US metropolitan areas — a claim alleging the contrary was shown to be false. In fact, Biden improved his margin of victory compared to Hillary Clinton in 31 out of 36 urban counties — and Philadelphia was even one of the five in which he did not.

The moral of the story is that an update to a ‘live’ election count is not an official election statistic. It is an artefact of the counting method and not in itself ‘hard’ election data — such as vote totals split by county or precinct or by voting method. Unless the supposedly anomalous patterns spotted within the updates are corroborated with anomalous patterns in the ‘hard data’, they do not mean anything. This is especially so when the supposed ‘anomalies’ in the updates have a perfectly reasonable explanation not involving fraud.