The law of unintended consequences...
Reading this has dredged up memories from a dim & distant past when I wrote code that ran on computers using assembly-level language, rather than stuff that tries to run in people's brains using PowerPoint:
I've been deliberately obtuse here. See if you can identify the company & product I'm talking about.
Case 1: "But it always worked up until now!"
Back in the 1980's I worked in the third-line support unit for the main database product offered by a UK-based computer manufacturer. This particular product had been written back in the 1970's as an in-house tool on one system, had its source code translated to run on another, and had finally been migrated onto the systems I was looking after. Additional functionality had been bolted on around the sides and patched into the middle, and some of the worst excesses brought about by automatic code translation had been massaged out by hand, but at its core it was still the same stuff bolted together as an in-house tool in the early 70's.
Then one day, one of the core pieces of functionality, a hashing algorithm used for both storage and retrieval of data, started failing randomly when run in the test labs on a new model of computer with a completely re-designed hardware architecture.
This architecture had some fiendishly clever pipe-lining built into the CPU design which allowed for pre-emptive execution of instructions, essentially executing several instructions "ahead" of where they would normally be executed, thereby allowing memory to be shuffled around more effectively. Very clever indeed. Except for the piece of self-modifying code in the guts of that hashing algorithm, which now failed because certain instructions were being fetched and executed before the code that was supposed to modify them had run.
It's over thirty years ago now, but if I recall correctly, the "fix" was to patch in an exec instruction immediately before the piece of self-modifying code in question which told the CPU to "Stop pipe-lining", and another immediately afterwards that said "As you were". Nobody had the guts to touch the algorithm itself. It had remained unchanged since 1972, even down to the bug that failed for certain low address numbers...
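The same hazard still exists today, just with better-documented fixes. Purely for illustration (this is modern x86-64 Linux and a GCC/Clang builtin, nothing like the machine in the story; the cache-flush builtin is the modern stand-in for that "Stop pipe-lining" exec patch), the shape of the problem looks like this:

/* Sketch only: write fresh instruction bytes, then tell the CPU about
 * it before executing them. On a hardened system with strict W^X the
 * RWX mapping below may be refused. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* x86-64 machine code for: mov eax, 42 ; ret */
    unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    unsigned char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return 1;

    memcpy(buf, code, sizeof code);

    /* The moral equivalent of "Stop pipe-lining" / "As you were":
     * make sure the instruction pipeline and caches see the freshly
     * written bytes before we jump to them. */
    __builtin___clear_cache((char *)buf, (char *)(buf + sizeof code));

    int (*fn)(void) = (int (*)(void))buf;
    printf("%d\n", fn());   /* prints 42 */
    return 0;
}

On x86 the hardware snoops for this sort of thing and the flush is nearly free; on ARM and friends, skip it and you get exactly the 1980's failure mode: the CPU happily executing the stale instructions it prefetched.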
Case 2: "Who put the brakes on?"
Same company. Same Database. Different family of computer systems.
So, about five years after the time of the first story I was sitting in the Pub one lunchtime (still in the 80's then, eh?) with a mate who worked on the hardware design side of the company. He was telling me that they were running performance tests on some new multiple-node configurations, but that something was slowing them down. In theory, each node should operate independently, running the virtual machines currently resident within it and only synchronizing with the other nodes in the configuration when a special event occurred.
However, these "stop everything and pay attention" instructions were coming along far more often than anyone had expected once they started testing customer workloads.
When I asked what caused the syncs to happen he reeled off a series of circumstances, and one nearly made me bite through the rim of my pint glass. There was a particular low-level instruction which returned the system time. I'll call it "Clock". "Clock 0" returned the node time; "Clock -1" returned the system time, and that caused a node synchronization, exactly one of those "Hang on people, let's compare notes" events that was slowing the systems down.
Now, I'd been looking at system dumps for long enough to know that:
a) We had "Clock -1" scattered through the code all over the show. It was used every time we wanted
a time stamp for something important, such a s DB log entry, or for something relatively trivial, like a time stamp for a journal file or to return a value used as a seed for a randomization algorithm call.
b) The "high level" language that chunks of the Operating System, Database, Transaction Monitor, and just about everything else was written in had a "Time" instruction that took no parameters and compiled down to "Clock -1".
No wonder the multi-node performance was, how to put this, "non-optimal". The systems were stopping to have a huddle every couple of hundred CPU cycles.
Suffice it to say that I spent the next month writing an unending series of software patches turning "Clock -1" into "Clock 0".
We source-cleared the problem by exploiting a little-known feature of that "high level" language which allowed the coder to embed snippets of assembler in their modules.
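If you want a feel for the trade-off on kit you can actually get at, here is a purely illustrative sketch in C on modern x86-64 Linux: RDTSC standing in for "Clock 0", the kernel-maintained wall clock standing in for "Clock -1", and GCC inline assembler standing in for that language's escape hatch.

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* The "Clock 0" analogue: a per-core cycle counter read with RDTSC.
 * No cross-CPU coordination, so it is cheap but only locally meaningful. */
static inline uint64_t clock_local(void) {
    uint32_t lo, hi;
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* The "Clock -1" analogue: system-wide wall-clock time, which the OS
 * keeps consistent across all CPUs; correct everywhere, but dearer. */
static uint64_t clock_global(void) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + ts.tv_nsec;
}

int main(void) {
    printf("local  ticks: %llu\n", (unsigned long long)clock_local());
    printf("global nanos: %llu\n", (unsigned long long)clock_global());
    return 0;
}

The principle is the same as those 1980's patches: only pay for global agreement when you actually need globally comparable time.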
People don't go to the Pub any more. So I guess this sort of problem never gets spotted. Explains a lot...