Benford’s Law: An Analytical Tool For Sniffing Frauds

As a high school student, I was fond of putting questions to “Guru Joe”, an online game character who posed as a fortune teller. For every funny question I asked, he would look thoughtful, peer into his magic ball and show me either a thumbs up or a thumbs down. Now, as a professional accountancy student, I and my colleagues often face difficult situations in which we must assess whether the propositions put before us contain anomalies, frauds, unusual transactions or misleading information. This time, though, we need not depend on “Guru Joe” (or indeed his programmer) to settle our doubts, thanks to the analytical tools and techniques that have been developed over the years. One such tool is Benford’s Law, a digital analysis technique also known as the first digit law, the first digit phenomenon or the leading digit phenomenon. It compares how often each digit appears at a given position in the data with the frequency the law predicts, in order to detect anomalies.

Benford’s Law

Benford’s Law suggests that when a sufficiently large population of naturally occurring, unbiased numbers is selected, their digits tend to follow a fascinating pattern. The first digit of a number can be any of the nine digits 1 to 9, so simple probability would suggest that the chance of the first digit of any number being 1 is 1/9, or about 11.1%. This turns out not to be true: if we empirically observe a large population of data, we find that the first digit is 1 about 30.10% of the time, 2 about 17.61% of the time, and so on in decreasing order. To our amusement, this trend is followed by a remarkably wide variety of number sets, almost as dependably as the Law of Gravity, provided that the numbers are not biased or manipulated.

This was exactly what Simon Newcomb, an American astronomer and mathematician, noticed back in 1881 when he observed that library copies of books of logarithms were more worn in the opening pages, which dealt with low digits, and progressively less worn on the pages dealing with higher digits. He inferred from this pattern that his fellow scientists used the tables to look up numbers starting with the digit 1 more often than numbers starting with 2, 3 and so on. He calculated that the probability that a number has a particular non-zero first digit is given by:

P(D1 = d1) = log10(1 + 1/d1), for d1 ∈ {1, 2, ..., 9}

So the probability of the first digit being 9 would be log10(1 + 1/9), i.e. log10(1.111...), or 0.04576. This means that the probability of the first digit being 9 is 4.576%.
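The same calculation can be repeated for every first digit. The following is only a minimal sketch in Python; the function name benford_first_digit is an illustrative choice, not something taken from Newcomb or Benford.

```python
import math

def benford_first_digit(d: int) -> float:
    """Expected probability that the first digit is d (1 to 9) under Benford's Law."""
    return math.log10(1 + 1 / d)

# Print the expected first-digit frequencies as percentages.
for d in range(1, 10):
    print(f"{d}: {benford_first_digit(d) * 100:.2f}%")
```

The nine probabilities add up to 1, since the terms telescope to log10(10).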

Though Newcomb formulated the model, he never gave a theoretical explanation of the phenomenon, nor did he test his theory extensively. It was only in 1938, almost 60 years later, that Dr. Frank Benford, a physicist working at General Electric, again noticed that the first few pages of his logarithm tables were more worn than the last few. He tested his hypothesis extensively by collecting 20 lists of numbers with a total of 20,229 observations. His lists came from varied sources, such as geographic, scientific and demographic data. One list contained all the numbers in an issue of Reader's Digest. He found that about 31% of the numbers had 1 as the first digit, 19% had 2, and only 5% had 9 as the first digit. Benford then made some physics-related assumptions about the distribution of naturally occurring data and, using integral calculus, computed the expected frequencies of the digits and digit combinations (see Table 1 for the calculated expected frequencies of the digits).

Table 1 suggests that the probability of the digit 1 appearing in the first place is 30.10%, whereas in the second place it is 11.39%. Similarly, the digit 6 should appear in the first place of about 6.69% of the numbers under consideration.

With appropriate variations in the formula, the numbers can be put through various levels of tests, in which the appearance of digits is tested at various positions and even for various combinations, such as the first two digits, the last two digits, the second digit, the last digit etc.
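As a sketch of two such variations, the code below uses the standard generalization of the first-digit formula: a number begins with the two digits n (10 to 99) with probability log10(1 + 1/n), and the second-digit probability is obtained by summing over all possible first digits. These formulas and the function names are additions for illustration and are not spelled out in the text above.

```python
import math

def first_two_digits_prob(n: int) -> float:
    """Expected probability that a number starts with the two digits n (10 to 99)."""
    return math.log10(1 + 1 / n)

def second_digit_prob(d: int) -> float:
    """Expected probability that the second digit is d (0 to 9).

    Sums the first-two-digits probability over every possible first digit k.
    """
    return sum(first_two_digits_prob(10 * k + d) for k in range(1, 10))

# Expected second-digit frequencies as percentages.
for d in range(10):
    print(f"second digit {d}: {second_digit_prob(d) * 100:.2f}%")
```

For the digit 1 in the second place this gives roughly 11.39%, the figure quoted above.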

Logical Explanation of Law

The billion-rupee question is why the digits appear in exactly the calculated frequencies.

An intuitive explanation of Benford's Law is to consider the total assets of a mutual fund that is growing at 10% per year. When the total assets are Rs. 100 million, the first digit of total assets is 1. The first digit will continue to be 1 until total assets reach Rs. 200 million. This requires a 100% increase (from 100 to 200), which, at a growth rate of 10% per year, will take about 7.3 years (with compounding). At Rs. 500 million the first digit will be 5. Growing at 10% per year, the total assets will rise from Rs. 500 million to Rs. 600 million in about 1.9 years, significantly less time than they took to grow from Rs. 100 million to Rs. 200 million. At Rs. 900 million, the first digit will be 9 until total assets reach Rs. 1 billion, which takes about 1.1 years at 10%. Once total assets are Rs. 1 billion the first digit will again be 1, until total assets grow by another 100%. Tables 2 and 3 explain the dominance of lower digits in the first place of the numbers.
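The arithmetic of the mutual fund example can be checked with a few lines of Python. This is a rough sketch assuming annual compounding at a constant rate; the function name is hypothetical.

```python
import math

def years_with_leading_digit(d: int, growth_rate: float = 0.10) -> float:
    """Years a compounding balance spends with first digit d before reaching d + 1."""
    return math.log((d + 1) / d) / math.log(1 + growth_rate)

# Time for the balance to grow tenfold, i.e. to pass through all nine first digits once.
total_years = sum(years_with_leading_digit(d) for d in range(1, 10))

for d in range(1, 10):
    years = years_with_leading_digit(d)
    print(f"digit {d}: {years:.1f} years, {years / total_years * 100:.2f}% of the time")
```

The share of time spent on each first digit works out to log10(1 + 1/d) whatever growth rate is chosen, which is exactly the frequency the law predicts.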

The persistence of 1 as a first digit will occur with any phenomenon that has a constant (or even an erratic) growth rate. The law is “base invariant” and “scale invariant” as well, so it does not matter which currency the numbers are in or whether they are all multiplied or divided by some constant: numbers that follow the law will continue to follow it.

Applicability in the Accounting and Auditing Profession

While the applicability of Benford’s Law is wide-ranging, its application in the accounting and auditing profession has been a boon. It helps to uncover fraudulent and irregular accounts and data prepared with mala fide intent. Benford’s Law has become an integral part of detecting tax and bank frauds and is included in the analytical procedures of many big firms and government authorities. Its application can range from simple spreadsheet tests to specialized computer programs performing complex calculations.

So how can it be used to detect fraud? To explain, let us discuss an incident at the Georgia Institute of Technology. Dr. Theodore P. Hill asks his mathematics students there to go home and either flip a coin 200 times and record the results, or merely pretend to flip a coin and fake 200 results. The following day he runs his eye over the homework data and, to the students’ amazement, easily picks out nearly all those who faked their tosses. When interviewed, Dr. Hill said that he was able to spot the fake data because many students did not know the real odds of such an exercise and so could not fake data convincingly. When a coin is tossed 200 times, at some point heads or tails is very likely to come up six or more times in a row. The students who pretended to flip the coin did not anticipate this and were easily caught.
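How likely such a streak really is can be worked out exactly. The sketch below assumes a fair coin and independent tosses, and tracks the length of the current streak from toss to toss; the function name and the approach are illustrative additions, not Dr. Hill's own method.

```python
def prob_run_at_least(n_tosses: int, run_len: int) -> float:
    """Probability that n fair coin tosses contain a streak of run_len or more identical results."""
    # state[k] = probability that the current streak has length k + 1
    # and no streak of run_len has occurred so far.
    state = [0.0] * (run_len - 1)
    state[0] = 1.0  # after the first toss the streak length is 1
    for _ in range(n_tosses - 1):
        new_state = [0.0] * (run_len - 1)
        new_state[0] = 0.5 * sum(state)        # result changes: streak resets to 1
        for k in range(run_len - 2):
            new_state[k + 1] = 0.5 * state[k]  # result repeats: streak grows by one
        # A repeat from the longest tracked streak reaches run_len and is dropped.
        state = new_state
    return 1.0 - sum(state)

print(f"Chance of a streak of 6 or more in 200 tosses: {prob_run_at_least(200, 6):.3f}")
```

The result is well above 90%, so a list of 200 "tosses" with no long streak is itself a warning sign.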

Fraud perpetrators have the same problem. People cannot behave truly randomly even when it is to their advantage to do so, and they tend to impose their own patterns on the accounts they fake. Even when people invent numbers without a goal such as fraud in mind, the digit frequencies do not conform well to Benford's Law.

So extracting the percentage of actual appearances of each digit and comparing it with the percentage calculated under Benford’s Law gives us an idea of whether there are any abnormalities. If the appearance of digits deviates widely from the calculated frequency, it can reasonably be suspected that something is wrong and that further scrutiny is warranted.
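One common way to judge whether the deviations are "vast" is a chi-square goodness-of-fit statistic. This test is an addition here and is not prescribed by the article; the sketch below assumes the nine first-digit counts have already been tallied, and the function name is hypothetical.

```python
import math

def benford_chi_square(counts: list[int]) -> float:
    """Chi-square statistic comparing observed first-digit counts with Benford's Law.

    counts[0] is the number of values starting with 1, counts[8] with 9.
    """
    total = sum(counts)
    statistic = 0.0
    for d, observed in enumerate(counts, start=1):
        expected = total * math.log10(1 + 1 / d)
        statistic += (observed - expected) ** 2 / expected
    return statistic

# With nine digits there are 8 degrees of freedom; the 5% critical value is about 15.51.
CRITICAL_5_PERCENT = 15.507

sample_counts = [100] * 9  # made-up, perfectly even counts, for illustration only
print(benford_chi_square(sample_counts) > CRITICAL_5_PERCENT)
```

A statistic above the critical value only says the data do not fit the law; it does not by itself prove fraud.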

In one such instance, auditors uncovered a fraud at a bank where the digit 4 appeared far more often in credit card debt write-off figures than Benford’s Law suggested. On further scrutiny, the auditors found that the officer writing off most of the debts had a write-off limit of $50,000. He was getting his friends to run up credit card purchases of just under $50,000 and not pay, and then he would write the debts off. Thus the digit 4 was appearing more often than it should have.

Many countries have also adopted Benford’s Law to detect tax fraud, scrutinizing the data in tax returns with specialized software. Several states of the US, including California, are using detection software based on Benford's Law.

When to use and when not to use the Law?

Care must be taken, however, because the fit of a number set with Benford’s Law can be misleading, and the law needs to be applied with reasonable discrimination. If we checked the telephone numbers of the Kathmandu Valley, for example, we would obviously find that the digit 4 appears in the first place almost 100 percent of the time. Similarly, where there is a fixed sales price such as Rs. 24 per unit, the digit 2 will obviously appear more frequently than the others if single-unit sales are abundant. Thus care should be taken not to jump to the assumption of fraud as soon as deviations are found. Biased numbers, assigned numbers such as cheque or invoice numbers, and numbers simply picked without going through any calculation or arithmetic do not tend to follow the law. Dr. Nigrini, a prominent researcher in the application of Benford’s Law in accounting, further explains: "You can't use it to improve your chances in a lottery. In a lottery someone simply pulls a series of balls out of a jar, or something like that. The balls are not really numbers; they are labeled with numbers, but they could just as easily be labeled with the names of animals. The numbers they represent are uniformly distributed, every number has an equal chance, and Benford's Law does not apply to uniform distributions."

The law can reasonably be used to test most sets of accounting data, including accounts receivable, accounts payable, disbursements, sales, expenses, a full year’s transactions, bank transactions or any transaction-level data. However, it will not work for numbers that are influenced by human thought, such as prices of Rs. 1.99.

An example of digital analysis

Still skeptical of the law’s real applicability even after going through lots of texts on the internet on this topic, I decided to test it myself. So I opened an Excel sheet and entered the first 518 numbers appearing in the Mid-January 2004 Bank and Financial Statistics published by Nepal Rastra Bank, which was available to me. To my astonishment, the results were nearly identical to those predicted by the law. Table 4 and Graph 1 depict the results obtained.

The digit frequencies nearly followed the law, the only significant variances being 3.43% in the appearance of digit 2 and 2.32% in that of digit 7.

Easy Steps to Perform Test in Excel

If you still have any doubt about the law, the following easy steps for performing the first-digit Benford test in Excel may help.

1. Open an Excel spreadsheet and fill the first column, “A”, with numbers from any source. (For the law to show up distinctly a large population is required, usually more than 50 numbers.)
2. In the second column, “B”, extract the first digit of each number. The formula below assumes the entries are positive numbers of 1 or more. For the number in cell A1, enter in cell B1:
   =LEFT(A1, 1)
3. Enter the digits 1 to 9 in cells C1 to C9 of the third column, “C”.
4. In the fourth column, “D”, find the frequency of each digit in column B with the COUNTIF function. For instance, against digit 1, enter in cell D1:
   =COUNTIF(cell range in column B, 1)
   Similarly, to count the appearances of digit 2, enter in cell D2:
   =COUNTIF(cell range in column B, 2)
   and so on up to digit 9.
5. After finding the frequencies of all the digits 1 through 9, total them in cell D10 with the formula:
   =SUM(D1:D9)
6. In column “E”, find the percentage of appearance of each digit by dividing its count by the total in cell D10 and multiplying by 100.
7. Enter the percentages suggested by Benford’s Law in column “F”, as given in Table 1.
8. Calculate the deviation from the law for each digit by subtracting the actual percentage in column “E” from the Benford percentage in column “F”.
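For readers who prefer a script to a spreadsheet, the steps above can be sketched in Python as follows. The function names are hypothetical, and the sample data (the first 200 powers of two) is only a stand-in chosen because it is easy to generate; it is not the Nepal Rastra Bank data used in the article.

```python
import math
from collections import Counter

def first_digit(x: float) -> int:
    """Leading non-zero digit of a non-zero number, ignoring its sign."""
    # Scientific notation puts the leading digit first, e.g. '4.250000000000000e+03'.
    return int(f"{abs(x):.15e}"[0])

def benford_table(values):
    """Rows of (digit, count, actual %, Benford %, Benford % minus actual %)."""
    digits = [first_digit(v) for v in values if v != 0]  # zeros have no first digit
    counts = Counter(digits)
    total = len(digits)
    rows = []
    for d in range(1, 10):
        actual_pct = 100 * counts[d] / total
        benford_pct = 100 * math.log10(1 + 1 / d)
        rows.append((d, counts[d], actual_pct, benford_pct, benford_pct - actual_pct))
    return rows

sample = [2 ** n for n in range(1, 201)]  # illustrative data only
for d, count, actual, expected, deviation in benford_table(sample):
    print(f"{d}  {count:4d}  {actual:6.2f}%  {expected:6.2f}%  {deviation:+6.2f}")
```

Unlike the LEFT formula, this version also copes with negative numbers and with numbers smaller than 1, because it reads the digit from scientific notation.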

Conclusion

The sustainability and value of any profession undoubtedly depend on the quality of the services its members provide. Rather than sticking to conventional judgemental techniques in performing audits and other related services, we should embrace new developments and tools that add value to the services provided. Applying digital and computerized analytical tools such as the Benford’s Law test would perhaps be one step in that direction.

Benford’s Law seems to have vast scope for application, from sniffing out fraudulent corporate reporting and anomalies to finding tax frauds, making it a handy tool for our profession. Its usefulness lies basically in pointing suspicion at frauds, embezzlers, tax evaders, sloppy accountants and even computer bugs.

(Published in ICAN Journal)

Further References

• Mark J. Nigrini, “I’ve Got Your Number”
• http://www.nigrini.com/
• Malcolm W. Browne, “Following Benford’s Law, or Looking Out for No. 1”
• R. Raimi, “The First Digit Problem”