Claims of US election fraud: geographical variations in the Biden vote part 2

In part 1, we considered allegations of vote-fixing in suburban counties, including deeply Republican ones. Biden’s under-performance in metropolitan areas such as Philadelphia, compared with his over-performance in these suburban and exurban areas, was presented as evidence of fraud. The argument was that the heavily Democratic metropolitan areas showed the ‘actual’ Biden performance, whilst the Republican outlying areas in which Biden made gains had their results ‘fixed’ somehow.

One feature of the claims made by Trump’s supporters is their scattergun approach: they pump out claims that are mutually contradictory. In a paper by John R. Lott, the exact opposite claim is made: supposedly the election was fixed in metropolitan areas such as Fulton County in Georgia and Allegheny County in Pennsylvania. Whilst Hounsell cites Allegheny as an example of ‘normal’ voting behaviour and treats other areas as suspect, Lott claims Allegheny is fraudulent and the areas around it ‘normal’. Such contradictions are common among Trump supporters as they scramble to construct multiple narratives with time running out.

Donald Trump retweeted Lott’s paper personally

Here is Lott’s paper, ‘A Simple Test for the Extent of Vote Fraud with Absentee Ballots in the 2020 Presidential Election: Georgia and Pennsylvania Data’. I will focus only on the data-related claims it makes. Lott’s methodology is to pick pairs of adjacent precincts across county boundaries. The ‘data’ group consists of pairs in which one precinct is in a ‘problem’ county and the other is in a ‘no problem’ county. The ‘control’ group consists of pairs in which both precincts are in ‘no problem’ counties.

He takes, for each pair of precincts, the differences between Trump’s in-person vote shares and, separately, between his mail-in vote shares, constructs a linear model that relates the two differences and adds an extra negative ‘fraud’ term for when one of the pair of precincts is in a ‘problem’ county. He performs a statistical test on whether the data is better fit with or without a ‘fraud’ term. He finds that in 2016 the answer is ‘without’ but in 2020 the answer is ‘with’. Here is his model, as he states it:

Lott’s description of his model.
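For readers who cannot see the figure, the regression as I have described it in prose can be sketched roughly as follows. The symbols are my own labels, chosen for illustration, and are not necessarily Lott’s notation; the demographic correction terms mentioned below are left out.

```latex
% A sketch of the model as described in the prose above (my notation, not Lott's).
% i indexes a pair of adjacent precincts.
\[
  \Delta M_i \;=\; \alpha \;+\; \beta\,\Delta P_i \;+\; \gamma\,F_i \;+\; \varepsilon_i
\]
% \Delta M_i : difference in Trump's mail-in vote share between the two precincts in pair i
% \Delta P_i : difference in Trump's in-person vote share between the same two precincts
% F_i        : 1 if one precinct of the pair lies in a `problem' county, 0 otherwise
% \varepsilon_i : noise term
% The fraud claim corresponds to \gamma < 0; the test asks whether \gamma differs from zero.
```

In this reading, the 2016 result corresponds to a fitted ‘fraud’ coefficient that is indistinguishable from zero, and the 2020 result to one that is significantly negative.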

A comparison of racial and gender demographics is performed and the model is corrected for those. Strangely, no analysis of social class, income or education level is done, in spite of the fact that having a degree is a bigger predictor than ever of whether one votes for Trump or not.

Two cases of ‘problem county’ are discussed: Fulton County in Georgia (vs four of its neighbours) and Allegheny County in Pennsylvania (vs four of its neighbours). The results purport to show that, when corrected for the differences in the in-person vote, Trump’s share of the mail-in votes is consistently lower by around 7% in the precincts on the Fulton side of the county border than just opposite them on the other side. For Allegheny, the corresponding figure is 3–4%. The analysis of 2016 results shows up no such cross-border variation at all. In other words, the claim being made is that all those boundary areas are politically uniform except for the discrepancy in the mail-in votes for the 2020 election.

Georgia county map showing Fulton County and four of its neighbours considered in the analysis — Cherokee, Forsyth, Carroll and Coweta counties.

The problems. No detail is given regarding which precise precinct pairs are considered and how they were selected for this analysis except for stating that:

‘In one case, Fulton County precinct ML02A matches up with four different precincts in Cherokee County (Mountain Road 28, Avery 3, Union Hill 38 and a small portion of Freehome 18)’

Fortunately, we can consult maps and work out what precincts he could have picked. Note the following phrase above: ‘the other counties are matched west to east and south to north’. The only way to interpret this is to suppose the ‘control group’ consisted of pairs of precincts along the borders of the ‘no problem’ counties with each other.

Possible precinct choices in Fulton. Let us dwell on the possible precinct choices for Fulton, using the data for Fulton County precincts here.

Carroll County border. Carroll County is a strange choice. It only borders one Fulton precinct and a small piece of another. Both those precincts occupy a considerable area, so Lott’s claim about ‘the precincts just across the street’ is demonstrably false. We have no idea whether he included them in his analysis; however, it seems likely, otherwise why mention Carroll County at all?

Coweta County border. Coweta County, like Carroll, borders Fulton from the south, and of the 4 bordering precincts (one of those also borders Carroll), only one is not sizeable in area.

Forsyth County border. Forsyth, in the north-east, borders 13 Fulton precincts, one of which also borders Cherokee County.

Cherokee County border. Here, things look better, with 7 border Fulton precincts that do not go very deep into Fulton County. Even so, one precinct extends 4 km away from the county border. This is still not ‘across the street’, as Lott claims, and you would expect some variability in voter behaviour here. The precinct ML02A that Lott mentions explicitly is here, stretched along the border, but still reaching as far as 2.5 km away from it.

Let us fetch the Google satellite photo of this precinct. Notice that the county border here is a river. A river! And yet Lott wants us to think that communities separated by a river, potentially a ten-minute drive away from each other, must vote in an identical way! The sheer audacity of such a suggestion takes the breath away.

Screenshot of a part of the Fulton-Cherokee county border (the border is in orange and Fulton is to the south and east of it).

If you do not believe my 10-minute claim, here is the length of drive from a location in ML02A to Mountain Road 28, the voting location for one of the Cherokee County precincts that Lott mentions explicitly.

Lott’s analysis has 22 ‘observations’ — but above we counted 24 precincts of Fulton bordering Cherokee, Forsyth, Carroll and Coweta. Furthermore, one has to assume that an ‘observation’ is a pair of two precincts sharing a border, given Lott’s explicit example of one Fulton precinct bordering 4 Cherokee ones and, presumably, giving 4 data points. One has to wonder about not just which border precincts from Fulton were included but also which ones were left out and on what basis.
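The tally behind the figure of 24 can be written out explicitly. This is only a restatement of the counts given in the sections above; the lower bound on the number of pairs is my own inference and assumes every border precinct is included and contributes at least one pair.

```python
# Fulton border precincts per neighbouring county, as counted in the text above.
border_counts = {"Carroll": 2, "Coweta": 4, "Forsyth": 13, "Cherokee": 7}

# Two precincts touch two neighbouring counties each and would otherwise be
# counted twice: one borders both Carroll and Coweta, one borders both
# Forsyth and Cherokee.
double_counted = 2

distinct_fulton_precincts = sum(border_counts.values()) - double_counted

# Lott's own example: ML02A alone is matched with 4 Cherokee precincts,
# i.e. it contributes 3 more pairs than a precinct with a single neighbour.
# Assumption: every other border precinct contributes at least one pair.
min_pairs_if_all_included = distinct_fulton_precincts + 3

lott_observations = 22

print(distinct_fulton_precincts, min_pairs_if_all_included, lott_observations)
```

Under that assumption the number of possible cross-border pairs exceeds the 22 reported observations, which is why the question of what was left out matters.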

The control group. Now let us consider Lott’s ‘control group’: pairs of adjacent precincts where he claims no fraud occurred in either. His description of how he selects them is very unclear, but it is reasonable to suppose that they all lie along either the Cherokee/Forsyth or the Carroll/Coweta boundary, as he says in the paper.

Look at the link Lott himself supplies for Georgia precinct boundaries. Only two Cherokee precincts border Forsyth, and for one of these the boundary cannot be more than a few hundred metres in length. They border four Forsyth precincts, but two of those also have a very short boundary. All in all, you can make just two pairs here.

Along the Carroll/Coweta boundary, two Carroll precincts border three Coweta ones. How many data points does this make for Lott? We are not told, but at most three, looking at which precincts share a common boundary. So all in all Lott uses a control group size of… five. I hope, for his sake, that this is wrong and he considered a far larger control group, but this is the conclusion I have to draw from the information in his paper and in the maps.
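To give a rough feel for what a control group of five means, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that the observations are independent and equally noisy, so that the uncertainty of a group average scales as one over the square root of the group size. This is a simplification of my own and not how Lott’s pooled regression treats the data.

```python
import math

# Control pairs inferred from the maps, as described above.
cherokee_forsyth_pairs = 2
carroll_coweta_pairs_max = 3
control_size = cherokee_forsyth_pairs + carroll_coweta_pairs_max

# Number of 'observations' Lott reports for the Fulton analysis.
data_size = 22

# Assumption: independent, equally noisy observations, so the standard error
# of a group mean is proportional to 1 / sqrt(n).
relative_se = math.sqrt(data_size / control_size)

print(control_size, round(relative_se, 2))
```

Under these assumptions, the baseline estimated from the control pairs would be roughly twice as uncertain as an average over 22 observations, before any of the other problems are taken into account.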

Selection bias. As well as small sample issues, there are of course wider selection bias issues. Of the ten counties bordering Fulton, just four were picked. No attempt was made to generalise the analysis or to show what the expected variation in vote share across a precinct boundary might be.

Issues with the model. The central feature of Lott’s model is the relationship between in-person and mail-in votes, which he postulates to be linear with a noise term. However, in 2020, due to COVID-19 and to the statements Trump made about mail-in voting, this relationship can be expected to be particularly ‘noisy’ for Democrat voters: there is statistical variation in how many people switched to voting by mail in different precincts.

Suppose that in a precinct, there are the following variables:

a = number of Trump voters by mail

b = number of Biden voters by mail

c = number of Trump in-person voters

d = number of Biden in-person voters

Let us ignore voters for other candidates for simplicity. Lott operates with the fractions a/(a+b) and c/(c+d) by forming differences of those fractions across precinct boundaries. However, if the split of Biden voters between d and b is ‘noisy’, with a lot of potential variation from precinct to precinct, all this analysis is compromised. Taking differences further magnifies this noise: when you take the difference of two independent noisy quantities, their variances add rather than cancel, so the standard deviation of the difference is larger than that of either term.
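The point that a difference of two independent noisy quantities is noisier than either one (their variances add) can be checked with a small simulation. Every number below is invented for illustration: each precinct has an identical electorate of 500 Trump and 500 Biden voters, a fixed 20% of Trump voters vote by mail, and the fraction of Biden voters who vote by mail varies at random between 30% and 60%. None of these figures comes from Lott’s paper or from real data.

```python
import random
import statistics

def trump_mail_share(n_trump, n_biden, p_trump_mail, p_biden_mail):
    """Trump's share of the mail-in vote, a / (a + b)."""
    a = n_trump * p_trump_mail  # Trump voters by mail
    b = n_biden * p_biden_mail  # Biden voters by mail
    return a / (a + b)

def simulate_precinct(rng):
    # Hypothetical precinct: identical electorate everywhere; only the rate
    # at which Biden voters switch to mail-in voting varies.
    p_biden_mail = rng.uniform(0.3, 0.6)
    return trump_mail_share(500, 500, 0.2, p_biden_mail)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_pairs = 20000

shares_x = [simulate_precinct(rng) for _ in range(n_pairs)]
shares_y = [simulate_precinct(rng) for _ in range(n_pairs)]

# Cross-border difference in Trump's mail-in share for each simulated pair.
diffs = [x - y for x, y in zip(shares_x, shares_y)]

var_single = statistics.pvariance(shares_x)
var_diff = statistics.pvariance(diffs)
sd_diff = statistics.pstdev(diffs)

print(f"variance ratio (difference / single): {var_diff / var_single:.2f}")
print(f"std dev of cross-border difference:   {sd_diff:.3f}")
```

In this toy setup there is no fraud and no political difference between the paired precincts, yet the mail-in shares on the two sides of a ‘border’ typically differ by several percentage points, and the variance of the difference is about twice that of a single precinct’s share.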

Therefore, one has to wonder whether Lott’s claim of a ‘7% discrepancy’ in Fulton is just an artefact of the noisy and uneven way in which Democrats switched to mail-in voting, a result of his too small sample sizes and failure to build a model in a way that keeps noise down.

The unaccounted-for change in the nature of mail-in voting compared to 2016. The comparison to 2016 is therefore heavily problematic, because the nature of mail-in ballots changed so much in the meantime. Back in 2016, you could expect mail-in ballots to have the same Trump vote share as in-person ballots; now you cannot. When the central feature of your model is the relationship between in-person and mail-in votes, one that you further postulate to be linear with a noise term, you simply cannot ignore such a massive change in that relationship between the two elections.

Other possible reasons for cross-border discrepancy. But let us give the author the benefit of the doubt (despite the very strong reservations about the validity of the results and even though his credibility has been called into question) and suppose there is a statistically significant discrepancy. It would show that Democrats, for some reason, were able to go the extra mile in finding mail-in votes for Biden in Fulton that were not obtained in the four other counties. What could explain this other than fraud?

  • The different conditions for mail-in ballot delivery in different counties. The great thing about mail-in ballots — despised by the Republicans — is the ability to extend the vote to people who might not have been regular voters in previous years, for any number of reasons.
  • Signature verification and matching is not an exact science. Unfortunately, there is a degree of unconscious subjectivity involved. It is reasonable to suspect that one county might be more ‘liberal’ in what it allows than another, within the rules.
  • A quicker, more efficient ballot curing process, helping voters correct their mail-in ballots if there is a problem.
  • According to Gabriel Sterling, volunteers for both parties helped inform voters if their ballot was rejected. Democrats were supposedly more efficient at doing this.

All these share one common theme: it is not law-breaking to attempt to make mail-in voting easier or harder as long as you stay within the regulations and as long as you treat Trump voters, Biden voters and voters for any other candidate the same. The Republicans are working hard to convince us that this is fraud, but it is not. Lott has failed in two ways to show us that rules were broken: his statistical analysis is questionable, and there are credible explanations other than fraud for the features he claims to observe in the data.