Claims of US election fraud: Benford’s Law.

Mara Nale-Joakim
Dec 29, 2020

Benford’s Law is used to detect fraud in finance, accounting, forensics and, sometimes, elections. It states that in a ‘naturally occurring’ set of numbers, the leading digit follows a particular logarithmic distribution: a leading digit d occurs with probability log10(1 + 1/d), so about 30% of the numbers start with ‘1’ and fewer than 5% start with ‘9’. If this distribution is not satisfied, it might mean that the numbers have been manipulated.

The expected distribution of leading digits in a ‘naturally occurring’ dataset.
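The expected frequencies can be computed directly from the standard statement of the law, P(d) = log10(1 + 1/d). The function name below is only illustrative.

```python
import math

def benford_probability(digit: int) -> float:
    """Expected share of numbers whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)

# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
# 1 30.1%
# 2 17.6%
# 3 12.5%
# 4 9.7%
# 5 7.9%
# 6 6.7%
# 7 5.8%
# 8 5.1%
# 9 4.6%
```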

In the US election, the smallest unit for which vote totals are available is the precinct. Benford’s Law was applied to the two candidates’ precinct-by-precinct vote totals in some of the heavily Biden-supporting counties in the US swing states and presented as evidence of vote tampering:

Leading digits of precinct vote totals in Chicago for each candidate.

Evidence pointing to fraud? Not really, and not just because a violation of Benford’s Law can only indicate that fraud might have happened and that further investigation is needed; it cannot prove guilt on its own. Here the law is misapplied, because precinct sizes are not random: precincts are designed to contain roughly equal numbers of voters. To give an arbitrary example, suppose my city has precincts of around 800 registered voters. In any precinct where Biden receives the votes of at least 25% of registered voters, i.e. 200 or more votes, his total cannot start with a ‘1’. If my county has many pro-Biden precincts, the leading digits of his totals will not follow the Benford distribution. Of course, in reality precincts are not exactly the same size, but the principle still holds. For a more detailed explanation see this article, which gives data for Chicago, Milwaukee, Allegheny and Fulton Counties, four of the five areas in the country whose results were most vigorously contested by Trump. (One of the things it states is that precinct sizes in Chicago vary from around 500 to around 1200 registered voters, which puts my estimate of 800 close to the average size.)
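The arithmetic behind the 800-voter example can be checked in a few lines. The precinct size is the hypothetical figure from the example, not real data, and the helper name is illustrative.

```python
def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

PRECINCT_SIZE = 800  # hypothetical precinct from the example above

# Every possible vote total in the precinct that starts with a '1'.
starts_with_one = [v for v in range(1, PRECINCT_SIZE + 1) if leading_digit(v) == 1]
largest = max(starts_with_one)

print(largest, largest / PRECINCT_SIZE)
# 199 0.24875
```

Any total from 200 up to 800, i.e. 25% or more of registered voters, starts with a digit between 2 and 8.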

So why does Trump’s vote in those areas follow Benford’s Law? Because Trump’s figures were relatively low, in our hypothetical precincts of size 800 he would get vote totals whose first digit was more likely to be a small one, which better fits the distribution predicted by Benford’s Law. The charts posted by Trump supporters even offer evidence for this hypothesis: notice how Trump’s vote totals have first digits of ‘1’ and ‘2’ slightly more frequently than the fitted distribution (in yellow) says they should.

We see Biden’s vote totals dominated by numbers with leading digit 3–6 and Trump’s by numbers with leading digit 1–3. This is what one would expect in a Democrat-heavy county with an average precinct size of several hundred registered voters.
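A rough simulation illustrates the effect. This is a sketch under made-up assumptions, not a model of any real county: precinct sizes are drawn uniformly from the 500 to 1200 range quoted for Chicago, while the turnout range (55% to 75%) and the Biden share of votes cast (70% to 90%) are invented purely for illustration.

```python
import random
from collections import Counter

def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def simulate(n_precincts: int = 2000, seed: int = 0):
    """Leading-digit frequencies for two candidates in invented precincts."""
    rng = random.Random(seed)
    biden_digits, trump_digits = Counter(), Counter()
    for _ in range(n_precincts):
        registered = rng.randint(500, 1200)    # size range quoted for Chicago
        turnout = rng.uniform(0.55, 0.75)      # assumption, not real data
        biden_share = rng.uniform(0.70, 0.90)  # assumption, not real data
        votes_cast = round(registered * turnout)
        biden = round(votes_cast * biden_share)
        trump = votes_cast - biden
        biden_digits[leading_digit(biden)] += 1
        trump_digits[leading_digit(trump)] += 1

    def to_freq(counts):
        return {d: counts[d] / n_precincts for d in range(1, 10)}

    return to_freq(biden_digits), to_freq(trump_digits)

biden_freq, trump_freq = simulate()
for d in range(1, 10):
    print(d, f"Biden {biden_freq[d]:.1%}", f"Trump {trump_freq[d]:.1%}")
```

Under these assumptions, no fraud is simulated at all, yet the Biden totals almost never start with a ‘1’, because they rarely fall below 200. The exact shape of each histogram depends on the invented parameters.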

The fact that this analysis was not applied, for comparison, to areas of concentrated Trump support is in itself a red flag, once again underlining the selective way that Trump’s cheerleaders use data. If it were applied to areas of Trump support with precinct voter numbers of several hundred, one would expect Trump’s vote totals to violate Benford’s Law in exactly the same way.

What are the situations in which Benford’s Law can be useful? There are scholarly articles that apply it to elections, but they usually apply it to vote totals across an entire country, with far greater variability in the size of electoral jurisdictions. Nor do they restrict the analysis to areas that support only one of the candidates. The linked article applies Benford’s Law to vote totals in 366 voting areas across Iran, including both sparsely and densely populated regions (varying from around a thousand to several hundred thousand voters). It also applies the law to the second digit as well as the first, and to combinations of the first two digits. Whilst the conclusions can be contested, this is an example of a rigorous approach, unlike that of Trump’s supporters.
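For reference, the expected distributions for the second digit and for the first two digits follow from the standard generalisation of Benford’s Law. The sketch below uses those textbook formulas; whether the linked article computes them in exactly this way is an assumption, and the function names are illustrative.

```python
import math

def first_two_digits_probability(n: int) -> float:
    """Expected share of numbers whose first two digits form n (10 to 99)."""
    return math.log10(1 + 1 / n)

def second_digit_probability(k: int) -> float:
    """Expected share of numbers whose second digit is k (0 to 9).

    Sums the two-digit probabilities over every possible first digit.
    """
    return sum(first_two_digits_probability(10 * d1 + k) for d1 in range(1, 10))

for k in range(10):
    print(k, f"{second_digit_probability(k):.2%}")
```

The second-digit distribution is much flatter than the first-digit one, ranging from roughly 12% for ‘0’ down to roughly 8.5% for ‘9’.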
