With the headlines currently dominated by the collapse of Silicon Valley Bank, it seems apt to share a recent coding mistake I made with floating point numbers. My mistake neither caused nor was caused by the SVB collapse, but it happened while modifying systems that had been built to use SVB.
As with many companies, we were regularly processing transactions through Silicon Valley Bank. We needed to quickly adjust our process to allow us to send the transfer through a different financial institution. Rather than taking the long route and modifying our whole payment process, I proposed to write a conversion script to run on the files generated by the existing system, as this would be quicker to implement, be lower risk for delivery, and have less chance of introducing new problems into the system, which has mostly been working well for years.
The setup for this file conversion is relatively standard as far as financial transfer files go. Read through a 30-40 page PDF that outlines the file structure and naming, which fields are required, and the length and contents of each. These are mostly boring, tedious details, but they are important to get right, or the bank's system will reject the whole file, often without providing any useful error. In terms of development cycles, it would be quite painful if we needed several iterations to diagnose and resolve an issue.
However, all of that boring stuff turned out to be straightforward, and after we discovered a particular naming convention we needed to follow, the converted files worked the first time.
The most pertinent detail in all of this was that our input file stored transaction values in decimal, such as 40.62, where the new format used integers to store this same value in "cents", as 4062.
Enter the bug
We had run several tests using previous transaction files to ensure our script would generate files acceptable for the bank's systems to ingest and process, and we were feeling confident. However, when we ran the conversion on the live payment data, we noticed a discrepancy in the total payment amounts.
The payment file included several hundred thousand dollars of transactions, but was short by $3.89.
Diagnosing the issue
Fortunately, I had also spent a few minutes writing a script to convert back while writing the conversion script. Doing this allowed me to diff the input file with a like-for-like display of what was included in the output file. This quickly highlighted where the problem was, which was everywhere, and what the problem was.
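As a rough illustration of the round-trip check described above, the sketch below converts integer cents back to a decimal string using only integer arithmetic, then reports any amounts that do not survive the trip. The method names and the sample data are hypothetical. The real scripts worked on the bank's file formats, which are not shown here.

```ruby
# Hypothetical helper: format integer cents as a decimal string using
# integer arithmetic only, so no float is involved on the way back.
# Assumes non-negative amounts.
def cents_to_decimal(cents)
  dollars, remainder = cents.divmod(100)
  format("%d.%02d", dollars, remainder)
end

# Hypothetical check: pair each input amount with its converted value and
# keep the pairs that differ after converting back.
def mismatches(input_amounts, converted_cents)
  input_amounts.zip(converted_cents).reject do |original, cents|
    original == cents_to_decimal(cents)
  end
end

# Sample data for illustration only.
p mismatches(["12.34", "5.00"], [1233, 500])  # prints [["12.34", 1233]]
```

Keeping the reverse direction free of floating point means any mismatch it reports must come from the forward conversion.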
The diff showed that the input value of 40.62 had been converted to 40.61. For exactly 389 transactions recorded in the payment file, the amount to be transferred was off by one cent ($0.01), which accounts for the $3.89 shortfall. This quickly triggered memories of floating point errors, with the popular claim that such and such programming language is bad because
0.1 + 0.2 == 0.3 // false
0.1 + 0.2 // 0.30000000000000004
Floating point arithmetic
Floating point errors can occur when computers represent real numbers with a finite number of bits. This can result in rounding errors and inaccuracies in calculations. While these errors may seem small, they can accumulate over time and lead to significant discrepancies in the final results.
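A minimal Ruby sketch of this behaviour, assuming the usual 64-bit double used by Ruby's Float:

```ruby
# Most decimal fractions have no exact binary representation, so the
# stored value is only the nearest representable double.
sum = 0.1 + 0.2
puts sum == 0.3            # prints false
puts sum                   # prints 0.30000000000000004

# Printing extra digits reveals the approximation behind the literal 0.1.
puts format("%.20f", 0.1)  # prints 0.10000000000000000555
```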
Knowing what the problem was made it easy to pinpoint the line of software at fault. This line of code correctly converted the dollar amount to cents most of the time, but in some cases, like the one above, the floating point representation of a number was only "almost correct".
amount_in_cents = (amount.to_f * 100).to_i
The 'to_i' method in Ruby converts the result of the mathematical operation 'amount.to_f * 100' to an integer by discarding the fractional part. This was the source of the problem, as the multiplication would sometimes produce a number just a little larger or smaller than the expected result. For numbers which were a little larger, to_i would truncate them, effectively rounding them down, which worked in our favour. However, for the numbers which were a little smaller, to_i would round them down a whole unit, so 40.62 would be converted to 4061.
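The failing case can be reproduced in isolation. This is a small sketch in which the variable names are illustrative and the amount is assumed to arrive as a string read from the input file:

```ruby
amount = "40.62"              # value as read from the input file (assumed to be a string)
product = amount.to_f * 100   # lands just below 4062 in binary floating point

puts product < 4062           # prints true
puts product.to_i             # prints 4061, because the fraction is simply dropped
```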
The fix is in
To resolve this issue, we needed to explicitly round the result to the nearest integer, rather than truncating the number, which is accomplished in Ruby using the round method:
amount_in_cents = (amount.to_f * 100).round
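To make the behaviour easy to check, the corrected line can be wrapped in a small method. The name to_cents is hypothetical and only used for this sketch:

```ruby
# Hypothetical wrapper around the corrected line.
# Rounds to the nearest cent instead of truncating.
def to_cents(amount)
  (amount.to_f * 100).round
end

puts to_cents("40.62")   # prints 4062
puts to_cents("0.29")    # prints 29
puts to_cents("100.00")  # prints 10000
```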
And then, after adding some tests to ensure these cases would forever be covered, a new file was generated and sent to the finance team for processing. The team checked the numbers and confirmed that everything added up.
A better solution
After some further research, it appears that Ruby provides a standard library for BigDecimal, which is actually the appropriate solution to use in this case. By casting our amount to a BigDecimal we can then apply our calculation of * 100 without losing precision.
require "bigdecimal"
amount_in_cents = (BigDecimal(amount) * 100).to_i
And as we have previously written unit tests covering these cases, we can be confident with our change.
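A self-contained sketch of the BigDecimal approach, assuming the amount is a decimal string such as "40.62" (the method name to_cents_exact is hypothetical):

```ruby
require "bigdecimal"

# Hypothetical wrapper; assumes amount is a decimal string like "40.62".
# BigDecimal parses the string as an exact decimal, so multiplying by 100
# yields exactly 4062 and to_i has no fractional part to drop.
def to_cents_exact(amount)
  (BigDecimal(amount) * 100).to_i
end

puts to_cents_exact("40.62")  # prints 4062
puts to_cents_exact("0.29")   # prints 29
```

This relies on the amount still being a string when it reaches BigDecimal. If it has already been converted to a Float, the inexact value is carried along, and BigDecimal() raises an error for a Float unless a precision is supplied. Note also that to_i still truncates, so an input with more than two decimal places would silently lose the extra digits.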