Ella places two printouts on John’s desk and points at the total columns. “Why don’t these numbers match?” she asks. John looks for a second, hoping Ella has made a simple mistake. But she’s right, they should match. John spends the rest of the day crawling through dozens of spreadsheets, attempting to find the source of the discrepancy. Eventually the problem is found, but only after turning up five or six similar issues. The team collectively pauses for breath when they consider the number of decisions made on faulty data. It is another week before John has fixed the problems and can return to his original task.

This is a particularly familiar story to many growing financial companies, which often prompts the question “how did things get like this?”. To find the answer, we need to investigate a misunderstood dynamic in spreadsheet and tool development. Consider the following diagram:

Figure 1 – The Square Law of Spreadsheets Effects Diagram.

There is, roughly, a set number of defects per spreadsheet. Therefore, for each new sheet, a number of faults are introduced. This creates labor to locate (and remove) faults. However, for each new sheet, there are extra places to look. In this dynamic, labor and risk are considered the same thing. This is what we call the ‘square law of spreadsheets’ and it leads to the situation shown in the following table.

The table shows that 10 sheets or tools are manageable for a financial engineering team. However, as the number of sheets or tools tends towards 100, the potential amount of work needed to maintain them is an order of magnitude higher. This is how some financial engineering teams ‘grind to a halt’; at one point, the maintenance work simply explodes. Often, a ‘capacity problem’ is blamed; ostensibly there is simply not enough staff to do all the work. However, it is often a case of the engineering team relying on manpower instead of machine power to do a lot of the work. The engineering team needs to start listening to the Matrix Principle: “never send a human to do a machine’s job”.

The first steps toward maximizing machine power are to create automated tests for all the teams’ spreadsheets and any other software tools they have developed. Our article ‘First Steps in Testing’ describes an easy path to getting started. It’s also the beginning of dealing with the square law of spreadsheets. Therefore, this technical practice has a direct influence on operational risk.

Other Factors: Requirements and Equation Reuse
Automated testing helps reduce the manpower to locate faults by being partially automated and by focusing the team on writing basic tests. However, there are two more things that can reduce the operational risk/labor:

• Reduce the number of sheets
• Reduce the amount of code in each sheet

By strictly managing the creation of new sheets – remember for every 10 new sheets there are 100 work units of maintenance – the total amount of work can be limited. With the same logic, retiring older sheets is also a way to reduce the number of sheets. By removing duplicated equations, for example by using VBA instead of copying and pasting equations between sheets, the amount of labor can also be reduced.

Finally, let’s reconsider the scenario laid out at the start of this article. In a team which has a thorough automated test-suite running on a regular basis, it may go something like this:

John receives a new e-mail. It’s an automated message from the test system, telling him that a test has failed on one the spreadsheets he is responsible for. Strange – he thought he’d checked that before committing the changes to the team’s repository. Still, the problem has to be in the changes he made to the sheet in the last hour – the time when the tests last passed. Quickly he realizes that he simply forgot to check in the new version of a library he was using, meaning the sheet was using an old library with a bug in it. It takes five minutes to fix the issue and re-run all the tests, which pass this time. Ella never needs to talk to John who happily moves onto his next task, pausing only to give thanks to the test system.