A number of people have commented that "nobody" uses Floating Point in financial calculations, for all the reasons Dan Clarke went into in his recent article on Floating Point and rounding issues. But it seems to me that this misses the point. Yes, I'm old enough to remember when IBM mainframes, and its COBOL, supported BCD ( …
Why this phobia of floating-point?
Like David Norfolk, I cut my programming teeth in the pre-Java, pre-C#, pre-Perl era. My first language was FORTRAN, and that forced me to understand the difference between integer, real and double-precision, to understand the limitations of each data type, and to know when to use each one. I've carried that knowledge with me for a quarter of a century, and I apply it today when I write in C and Java, because it's equally valid in any strongly-typed language.
I'm no genius. I'm just an ordinary programmer, and I've made my fair share of dumb mistakes in the past, yet I know when and how to use floating-point numbers, and even how to mix them correctly with integers.
So I have to ask: why do so many programmers today seem to have such a problem with floating-point numbers, verging apparently on a phobia?
Perhaps we should require all new programmers to serve an apprenticeship working in COBOL or FORTRAN for a year or two to gain a better understanding of the fundamentals of programming, stripped of the fancy add-ons for GUIs, networking, XML, web programming and the like.
Then, when they truly understand the Tao of floating-point, we'll let them write in Java or C# or Perl.
Training for programmers
Good point - but I'd suggest that mainframe assembler is even better training - as long as part of the assessment is whether the programs fail in production and can be maintained in the future.
With something like C# you can sometimes get away with poor structure and program design, but with Assembler, you really can't!
More seriously, I got into professional IT when I was picked up off the streets of Canberra as a failed chemist and bored hifi salesman. I was sent to college to do computer science and given on-the-job training in the holidays. The organisation I was working for trained a percentage of its input in this way in the "good practices" it wanted and placed us in key places in the organisation (the best person in the intake - not me - became training officer for the next year).
But back in England, all you had to do to get a job in IT (all too often) was to "sex up" your CV - you could easily get training in specific technologies, but training in fundamental computer science and "good practice" was harder to come by...
Perhaps it's different now.
At least one company...
A past employer of mine uses floating point for money, and they specialise in accounting software.
When the company was set up, the first generation developers came from a background where they simply didn't need to know about the problems with floating point. Only suddenly they were working with a mixture of C and assembler on early PC compatibles, and floating point seemed the obvious choice.
Somehow, when they decided to 'fix it for good', that meant a library that would handle all the rounding issues in as near a consistent way as possible, still using floating point. I know they wish they'd done better, if only to save the embarrassment when they explain it to new developers - yes, it really was the old guys who were programming in the seventies who got it wrong! So much for the ignorance of us (relative) youngsters, eh! Mind you, when you look at the kids that learned to code in the last ten years or so, who wouldn't know a bitwise operator if it bit them, mutter mutter mutter...
Anyway, neither 0.1 nor 0.01 can be represented precisely as floating point values, so there is always a tiny error when money is represented using floating point unless you have an exact number of dollars. But there is normally plenty of precision, so that the error is a tiny fraction of a cent.
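To make that concrete, here is a minimal sketch in Python (chosen only for illustration; nobody in this thread mentions it), assuming ordinary 64-bit doubles. It prints the exact binary value that ends up stored for 0.1 and 0.01, and measures how far the stored cent is from a true cent.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal(float) and Fraction(float) convert the stored binary value exactly,
# so they reveal what the machine really holds for each literal.
exact_tenth = Decimal(0.1)
exact_cent = Decimal(0.01)
print(exact_tenth)   # a long decimal expansion, not exactly 0.1
print(exact_cent)    # likewise not exactly 0.01

# Size of the representation error for one cent, expressed in cents.
error_in_cents = abs(Fraction(0.01) - Fraction(1, 100)) * 100
print(float(error_in_cents))

# A whole number of dollars is stored exactly.
whole_dollars_exact = Fraction(25.0) == 25
print(whole_dollars_exact)
```

The error is non-zero but many orders of magnitude below a cent, which is the point being made above.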
Adding and subtracting aren't a problem as long as you round over and over again to prevent errors accumulating. The rounding keeps pulling the value back to the nearest possible approximation of the correct figure. Multiplication by whole numbers is, similarly, never much of a problem - though it could be if the numbers were big enough, of course.
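A rough sketch of that "round over and over" habit, in Python for illustration. The helper name `add_dimes` is made up for this example, and rounding to two decimal places with the built-in `round` stands in for whatever a real rounding library would do.

```python
def add_dimes(count, round_each_step):
    """Add 0.10 to a running total `count` times, optionally rounding to cents."""
    total = 0.0
    for _ in range(count):
        total += 0.10
        if round_each_step:
            # Pull the total back to the nearest float to a whole number of cents.
            total = round(total, 2)
    return total

drifting = add_dimes(10, round_each_step=False)
rounded = add_dimes(10, round_each_step=True)

print(drifting == 1.0)  # False: the tiny errors have accumulated
print(rounded == 1.0)   # True: rounding after each step keeps the total on target
```

Ten additions are enough to show the drift; over a long ledger the unrounded total wanders further, while the rounded one keeps being pulled back.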
Problems tend to occur when dealing with percentages (tax calculations etc), but the rules tend to allow for small variations due to different rounding.
Big errors only really occur in three cases...
- Multiplying a value with a small error by a large multiple.
- Dividing a value with a small error by a small divisor.
- Dividing by a small value with a small error.
These don't happen much in bookkeeping, since on the whole the small errors occur at the ends of calculations. For instance, you calculate the tax on totals, not separately for each item. Then the tax amount is saved to the database, neatly rounded to the nearest cent, to avoid the need to recalculate it again later (when the code has possibly been modified so the rounding might come out different).
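The three cases listed above can be sketched as follows, again in Python and purely as an illustration. The values are invented, `abs_error` is a hypothetical helper, and powers of two are used as the large multiple and small divisor so that the scaling step itself adds no rounding of its own.

```python
from fractions import Fraction

def abs_error(stored, intended):
    """Exact distance between a stored float and the intended value."""
    return abs(Fraction(stored) - Fraction(intended))

# A value carrying a small error: it is meant to be exactly 0.30.
price = 0.1 + 0.2
small_error = abs_error(price, "0.3")

scale = 2.0 ** 50                       # large multiple; 1/scale is a small divisor
intended_scaled = Fraction(3, 10) * 2**50

# Case 1: multiplying a value with a small error by a large multiple.
multiplied_error = abs_error(price * scale, intended_scaled)

# Case 2: dividing a value with a small error by a small divisor.
divided_error = abs_error(price / (1 / scale), intended_scaled)

# Case 3: dividing by a small value that itself carries a small error.
divisor = 1.000001 - 1.0                # meant to be 0.000001
divisor_error = abs_error(divisor, "0.000001")
quotient_error = abs_error(1_000_000.0 / divisor, 10**12)

print(float(small_error), float(multiplied_error), float(divided_error))
print(float(divisor_error), float(quotient_error))
```

In the first two cases an error far below a cent grows to a few cents; in the third, an error in the divisor that is just as tiny turns into an error in the quotient that is many orders of magnitude larger.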
Not that I'm recommending the use of floating point by any means, but the world doesn't necessarily end because of it.
By the way, BCD just to handle money amounts is absurd. It used to be done as much to make the I/O easier as anything, saving the binary to decimal conversions to display the figures, but it's not a very good trade-off.
As has already been pointed out, it's all a matter of scale - just use fixed point arithmetic with scaled integers. That can simply mean working in cents/pence/whatever instead of dollars/pounds/whatever. Or if you need more precision use a larger scale factor. Worst case, you need wider integers to cope with the range of values. And it's hard to imagine a case that 64 bits can't deal with, though it's easy enough to roll your own wider-than-the-widest-available-int type if you need one, and there are variable-width integer libraries available if you need them.
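A minimal sketch of the scaled-integer approach, in Python for illustration. The function names, the basis-point tax rate and the round-half-up rule are all assumptions made for this example, and the parser only handles non-negative amounts with at most two decimal places.

```python
def to_cents(text):
    """Parse a non-negative amount such as '19.99' into an integer number of cents."""
    dollars, _, cents = text.partition(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0")[:2])

def format_cents(cents):
    """Render an integer number of cents as dollars with two decimal places."""
    return f"{cents // 100}.{cents % 100:02d}"

def tax_cents(total_cents, rate_basis_points):
    """Tax on a total, rounded half up to the nearest cent using integers only."""
    return (total_cents * rate_basis_points + 5_000) // 10_000

items = ["19.99", "5.01", "2.50"]
total = sum(to_cents(item) for item in items)   # 2750 cents
tax = tax_cents(total, 825)                     # 8.25% of 27.50, rounded to 227 cents

print(format_cents(total))        # 27.50
print(format_cents(tax))          # 2.27
print(format_cents(total + tax))  # 29.77
```

Every intermediate value is an exact integer, so the only rounding is the one explicit step in `tax_cents`. Python integers happen to be arbitrary width; in C or Java a 64-bit integer would play the same role.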
A general lack of understanding
Floating point vs. integer, etc.... A problem, yes.
But what about the ignorance of other factors that affect overall efficiency, such as how to avoid clogging up the "pipes" with useless data transmission? What about thinking about bloated and nonsensical loops? What about comments?
It seems that, due to the increased speed and memory of the "standard" user's machine, as well as GUIs and OOP, the training (or lack thereof) doesn't bother to teach/remind programmers that computing resources and time are finite.
Let's stop acting like miserable old gits
People have been complaining about modern tools encouraging poor development practices ever since the first high level languages and compilers were developed - probably even before then. But wasting endless developer time over trivial details is by far the greater problem.
If you criticise developers for every problem that is obvious now, you create an environment where everyone wastes time worrying about hundreds of possible future problems that will probably never occur. The codebase gets overcomplicated dealing with all these non-issues, and maintenance gets harder as a result.
Personally, I was told a lot in the classroom, but it didn't mean a lot. I really learned through experience - i.e. by seeing the results of my own and other people's mistakes. The real value of the teaching was probably to help me recognise the true nature of those mistakes.
Where the same mistakes are relevant using modern tools, the modern tools aren't at fault for 'hiding them'. In what sense are they supposed to be hidden? For instance, the problems using floating point in Java, C++ or Delphi are no different to, and no more hidden than, the problems using floating point in Fortran or whatever. But it's so easy to forget that we had to learn from our mistakes too.
You think that's not the issue? You definitely remember being taught the problems with floating point in some classroom in the seventies, and never made that mistake ever?
Well, what about all those lessons you were taught all that time ago that turned out, with experience, or just as times changed, to be wrong or trivial and unimportant?
I remember a teacher making an endless fuss about flowcharts. No design was complete without a flowchart, that had to describe every single detail of the code, because after all the design had to be directly and mechanically translatable to code.
You know flowcharts - every arrow is effectively a goto! And they are such low level representations that, used as that teacher intended, they aren't design at all - they are just graphical code.
Basically, he was telling me never to hack together real testable code but rather to hack together untestable graphical code and expect it to magically work first time. In general, the best way to keep him happy was to write the real code behind his back and translate the working result into a flowchart. Don't tell me you never did the same!
But then he'd never worked on more than a few hundred lines of code at a time, as far as I can tell. He basically expected to work on one well-specified subroutine at a time, flowcharting no more than two or three loops, ready to code it in assembler. And he thought compiled high level languages were a fad that would eventually be restricted to a few niches, insisting that there was no real gain since 'a single line procedure call maps to a single line jsr call anyway'.
The point being that he probably became a teacher because he was already a bit of a dinosaur even then, and struggling to get development work.
So maybe you suddenly recognise the value of learning from experience, and of not obsessing over everything you were told in the classroom. A degree of scepticism is healthy, and you learn what is really important by doing the job and making an occasional mistake.
The world would be a strange place if the more experienced didn't know things that the less experienced don't. It's nothing to get angry about. Just be glad about it, in this age of change when the less experienced have an annoying habit of knowing things you don't.
In praise of being modern...
Well, there really is something in all that, a miserable old git writes. IT is one area where the young turks do sometimes know things that the old gits don't - it's a young discipline and changing fast.
BUT, sometimes the young turks have new jargon instead of new ideas - and I sometimes get a bit tired of seeing the wheel reinvented, not once, but over and over again.
As for flow charts - yes I agree about the "graphical gotos" - which is why I preferred Nassi-Shneiderman charts. But flowcharts (updated and improved) are coming back into fashion - see Borland Defineit and any number of business process modelling tools. Perhaps even MDA. So, as long as we remember to bone up on the new jargon, us old gits can start to look young and fashionable again :)
Sometimes you have no choice but to use floating point
9/3 was coming out at 2.9999999999 (9999999 ...)
and there was no way to fix it.
I still don't know how he made it look right.
It still amuses me that more precision meant a less accurate result, at least to a human.