The mind truly boggles ..
Perhaps this approach could be tried elsewhere.
Nuclear reactor control systems
Early warning systems
Intensive care units
Maybe it is !
I recently wrote about how a bad round of software testing lost Wall Street trading firm Knight Capital an estimated $440m – enough to almost put the company out of business. I speculated that Knight could be bailed out if it's allowed to unwind its computer system's unexpected burst of loss-making trades on the stock market - …
It's amazing what you can achieve when your main KPI is lower cost.
Why introduce new software if it wasn't to improve efficiency (ie reduce cost)?
Now what about adding windows for warships to your list?
"We have to do a better job on our testing environment."
That's got to be a contender for Quote of the Week
It should, with attribution, become QotW's permanent header!
What if the software had MADE $440 million?
That question is about as relevant to the discussion as the question "What if unicorns flew out of your ass?".
Sure, it might lead to some interesting or entertaining speculation, but it's got no useful bearing on the situation at hand, so why spend braincells on it?
Well it did, didn't it? Just not for the people running it.
Actually, it is relevant to the discussion. Say the company had made $440m, then we most likely wouldn't be having this discussion as the news would be a small section on page 7 of the Financial Times. Yet the fact would still remain that they put a badly managed system into production and escaped by sheer good fortune!
What it serves to highlight is that win or lose, a mistake is still a mistake and given that the house usually wins, for every time you get lucky and benefit from a mistake, the house will win another nine times. So what this whole thing really proves is the need for solid configuration management and effective release processes that stop test applications getting released into production. Also the question has to be asked, if all this was done by a testing app, shouldn't that app have been making copious amounts of diagnostic output?
It wouldn't have made $440 million and certainly not in an hour.
It's far easier to loose large amounts of money on the markets than to make it.
This a hundred times over, it's not some simple two way street that Knight went the wrong way on - there's no possibility the system could have made money as fast as it lost it.
The story has made the news because of the impressive amount lost in such a short time through bad IT practice.
If they had good IT practices they would have tested outside of a live environment, implemented the system and made small amounts of money over time.
Err, that's what it's designed to do. The whole point is to complete LOTS of trades at a very small profit each.
The problem was that the test wasn't validating that the deal was making a profit, and it just merrily went ahead with trades profit or loss.
then the whole 'market is the best way to do everything' bollocks that has driven the descent of western civilisation for the last 40 years will have been proved wrong and we could all go back to being decent human beings.
It's designed to process lots of trades quickly, each one making a profit.
However, the $440M was only manifested in the time period concerned because the trades were unprofitable (and bloody stupid to boot).
Had the trades being made by the system been profitable for Knight Capital or their backers, the story would have a completely different tone but also a much smaller amount of money associated with it.
That story would also have featured a whole lot of traders initiating unscheduled brownware deposits in the rear baggage areas, as they suddenly saw a future in which they themselves were obsolete. But then again, you know what they say about things that sound too good to be true. :|
"Err, that's what it's designed to do. The whole point is to complete LOTS of trades at a very small profit each."
No, that's not the point of the software. Knight are a market maker. That means their job is not to make money on the market - though they possibly do a small bit of that as I think they hold 'stock' (as it were) of some less frequently traded shares. What they do is aggregate trades for a bunch of clients, to make life easier, and cheaper. They then charge a small fee for each transaction.
So if customer a has 100 Apple shares it wants to sell, and customer Z wants to buy 90 of them, then Knight can handle both those trades, and may hold on to the spare 10 Apple shares to fulfill an order tomorrow. Or sell them on itself. Only on a rather bigger scale.
One of the advantages for their customers is that Knight will pay all the cash up front for a sale, rather than leaving shares on the market for people to buy in chunks, as they want them. So you just dump all the stock you don't want on Knight, and you've got the cash instantly to go shopping with.
So they don't make their cash from taking risks on investments, but by being useful.
The mistake would not have been noticed, and the test loader would continue to make willy-nilly trades until its lucky streak ran out. Sooner or later a loss of more than $10^9 would have occurred, so they would still be in the same pile of poo but perhaps landing in it a day or two later.
Very unlikely given the nature of the problem.
If you look at the price of shares on the market, you get two prices, a lower price at which you can sell them, and a higher price at which you can buy them.
In this situation, if you start firing off hundreds of buy and sell orders every second, you are guaranteed to lose money, and that is what happened to them.
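To put rough numbers on the spread argument above, here is a minimal sketch. The quotes, share count and function name are made up for illustration; they are not Knight's actual figures.

```python
from decimal import Decimal


def round_trip_loss(bid: Decimal, ask: Decimal, shares: int, round_trips: int) -> Decimal:
    """Loss from repeatedly buying at the ask and selling straight back at the bid."""
    if ask < bid:
        raise ValueError("ask must not be below bid")
    spread = ask - bid  # what you give up on every buy-then-sell pair
    return spread * shares * round_trips


# Hypothetical quote: you can sell at 10.00 and buy at 10.05.
loss = round_trip_loss(Decimal("10.00"), Decimal("10.05"), shares=100, round_trips=1000)
print(loss)
```

Five cents a share looks like nothing, but multiplied by the share count and by the number of round trips it adds up fast, and every extra order only adds to the total.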
I should hit this point harder in the article(s). Knight isn't a firm that trades to make money. They aren't a hedge fund or anything like that. They don't take on much risk, they execute trades on behalf of other brokerages. They're a cog in the big financial services machine and, as 'I Ain't Spartacus' (great name, btw) put it, they make their money by being useful.
To me, this is a major reason why this story is so interesting and chilling at the same time. These guys aren't taking a lot of risk, not doing anything that's wildly or even mildly speculative - they were just testing out a new trading package. And in less than 45 minutes, an IT error almost took down their entire company to the tune of a $440 million loss. That's gotta give anyone in IT pause.
Proof that the stockmarket IS a casino, no skill involved.
> These guys aren't taking a lot of risk
They certainly were. $440M of it...
ITYM "they didn't think they were taking much risk". And that's the thing about risk - we're notoriously crap at assessing it...
Thank you for the sausages.
"To me, this is a major reason why this story is so interesting and chilling at the same time. These guys aren't taking a lot of risk, not doing anything that's wildly or even mildly speculative"
In my original post I wrote that Knight don't take risks on the market, and then deleted the sentence for the obvious silliness, given they'd just lost nearly half a billion dollars. I couldn't find an elegant way of describing the position.
Their business model only involves a low risk. However they can make catastrophic errors that could lose them hundreds of millions. But this situation isn't unique to companies in the financial services industry. Toyota lost a fortune on having to fix design faults in cars, any large transport or building company can kill hundreds of people by some combination of bad management, error, carelessness, negligence or rogue employees, and that could result in similar sized losses. It's just that's a bit too much text to fit on the t-shirt...
As you say, it's pretty scary what can go wrong. I work in the water industry, and about this time every year you get a news story about how a few people have died in a Legionnaires' disease outbreak. And then a few days later you're talking to someone on the phone who wants to spend the least amount of money possible on their equipment, and you just know that their idea of a maintenance regime is to call someone when water doesn't come out of the taps in about 15 years' time...
At least they noticed the problem after 45 minutes with a $440 million loss. If they'd left it running overnight and only caught it in the morning that loss could well have been $10 billion.
They only noticed the problem because their credit lines ran out. So they couldn't lose any more simply because no one was giving them any more money.
Not sure whether it was their credit lines or not. They may have also been alerted by the exchange. Their trading had caused huge price and volume swings in some stocks and the NYSE traced it to Knight.
Just reading about this mind-bogglingly stupid act was so chilling and sphincter puckering that it was the equivalent of realising that I'd done something quite stupid myself.
I'd like to thank Knight Capital for raising my awareness.
" And since it’s just a testing program, it didn’t keep track of any of its activity – meaning it wasn’t easy for Knight to immediately understand the magnitude of what had happened and the massive losses they’d incurred."
That implies that development was far from finished and had no audit logs in place.
What's the modern equivalent of "Designed on the back of a fag* packet?"
*cigarette packet for you Yanks
"Designed on the back of a fag*" - yeah, that would be considered a hate crime here...
We ain't all chavs...
So irrespective of the testing program not keeping logs, Knight were using the new RLP application to place trades on the NYSE and it is a regulatory requirement to maintain an audit trail of all transactions.
What they really meant was that there were so many logs that they couldn't easily tell which trades came from the testing software and which were placed through other means.
I would have thought a testing program would take more logs than normal operation. Otherwise how do you know the results of the test?
Certainly my test programs create more logs than my released software.
"So irrespective of the testing program not keeping logs, Knight were using the new RLP application to place trades on the NYSE and it is a regulatory requirement to maintain an audit trail of all transactions."
Err, no. As I understand it there is the RLP part which was supposed to be connected to the NYSE and comply with all its various rules. Then there is the testing package which is only meant to be used in the lab to load test the RLP (in the lab) to see if it is reacting correctly to the trades thrown at it and the transaction rate it is being hit with.
What happened is that the testing part was released into the wild with the RLP part and instead of just the RLP part connecting to the exchange, the testing module was now connected to the exchange also (rather than its lab setup of direct to RLP). This meant that the load generator started merrily firing off transactions and load testing but rather than being in an isolated environment with RLP it was on the actual exchange - somewhere it was never intended to connect to. Couple this with the fact it isn't supposed to log, track, or report trades as it is an artificial load generator not a trading system and you therefore have an issue with how many trades have been done. I'd imagine they would have to get some sort of report off of the exchange for trades requiring settlement and have to back out the ones the real system did correctly for clients to identify the rogue ones. When you look at how quickly these systems work it should come as no surprise as to how many transactions resulted and how much they lost - I believe NYSE has sub-millisecond trade speeds.
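The "back out the real ones" step guessed at above could look something like this. It is only a sketch: the record layout, the `order_id` field and the function name are assumptions for illustration, not anything Knight or the NYSE actually uses.

```python
def find_rogue_trades(exchange_report, own_audit_log):
    """Return trades the exchange saw that our own trading system never logged."""
    known_ids = {trade["order_id"] for trade in own_audit_log}
    return [trade for trade in exchange_report if trade["order_id"] not in known_ids]


# Hypothetical settlement report from the exchange.
exchange_report = [
    {"order_id": "C1", "symbol": "XYZ", "side": "buy", "qty": 100},
    {"order_id": "T7", "symbol": "XYZ", "side": "buy", "qty": 5000},
    {"order_id": "C2", "symbol": "ABC", "side": "sell", "qty": 90},
    {"order_id": "T8", "symbol": "ABC", "side": "sell", "qty": 7000},
]

# Hypothetical audit log from the real trading system (client orders only).
own_audit_log = [
    {"order_id": "C1", "symbol": "XYZ", "side": "buy", "qty": 100},
    {"order_id": "C2", "symbol": "ABC", "side": "sell", "qty": 90},
]

rogue = find_rogue_trades(exchange_report, own_audit_log)
print([trade["order_id"] for trade in rogue])
```

Anything left over after the subtraction came from something that wasn't logging, which in this story would be the load generator.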
> What happened is that the testing part was released into the wild with the RLP part
You have to wonder why the testing package wasn't on a physically separate machine which gets thrown into the canal once its job is done.
Surely someone must have thought about the risk of losing track of it...
"It's far easier to loose large amounts of money"
One of the few occasions when the wrong spelling of 'lose' is just about OK. Still annoying, though...
Fingers faster than brain :)
"Fingers faster than brain :)"
I'm sure that's what Knight's IT guy is using as an excuse to his boss right about now...
Automated trading programs are running faster and faster. Some companies will pay huge amounts to trim milliseconds off transaction times so they can then make tiny amounts on each transaction - and make vast numbers of those transactions.
More and more companies are shifting more and more money through those systems.
$440 million is trivial compared to what might happen - and probably will happen sooner or later. And not very much later.
It appears some banks are too big to fail and the others will just be bailed out anyway!
Other than the fact they're not a bank, and it was quite possible they could be allowed to fail* you've got everything else in that sentence right...
What do you mean there's nothing else in there? Oh dear.
*They were bailed out because they were a useful and profitable company, who happened to make one enormous fuck-up. So long as they don't connect their test software to the real world again, they can be expected to carry on doing their boring, but useful job for many years to come. They're not a risk-taking casino-banking operation. Or at least they're not supposed to be...
He also got the bailed out bit wrong as well.
Bailed out is what happens when governments throw a loads of money at a business or an industry with little or no hope of a return.
Knight has been loaned the $400m by other firms (no government involvement) and in return the other firms get dividends and the opportunity to convert the loans into shares. If you have ever taken part in a company share scheme then they usually offer you the opportunity to take shares or cash with interest after a minimum term. This is basically the same except the firms lending the money get the interest payments (dividends) from the start.
You left out an important word. Here is the corrected sentence:
Bailed out is what happens when governments throw a shit loads of money at a business or an industry with little or no hope of a return.
"I speculated that Knight could be bailed out if it's allowed to unwind its computer system's unexpected burst of loss-making trades on the stock market - effectively taking a mulligan* on the 45-minute debacle. Turns out that ain’t gonna happen."
And I'm glad for it. These few traders permitted to use HFT (high frequency trading) systems -- the fantasy they promote is that they "provide liquidity", i.e. if the HFT systems did not exist nobody would buy up those stocks others want to sell. In reality, they are parasites -- if you find a seller at $10 a share, and a buyer at $10.05, will you get that $0.05 a share? Hell no, an HFT system will beat you to it, EVEN if you have a buy order already entered (because the HFT computer will execute a buy faster than your computer!) buy up all that $10 stock, and either sell it to you at $10.04, or to the other person directly at $10.05. Taking that $0.05 profit right out of anyone else's pockets and putting it into their own. BUT, there've been at least two major HFT malfunctions before, and before, NYSE would obligingly roll back their trades. How fair is that? They get this privileged position that lets them make countless millions of dollars every day at the expense of every other stock trader, but when the HFT system cocks up and everyone else makes a little of that money back at the expense of the HFT systems, the NYSE would roll back those trades! I'm glad they've shown the stones to refuse to do this this time!
This isn't a HFT system.
Knight are a market maker. Their primary business is not to make money trading. They're simply a convenient sales and order processing system for people who do. Which is why they get paid for doing something that traders could otherwise do themselves (but at slightly greater financial and time cost).
Whoever was responsible for making this happen will be fired, re-hired, double-fired, then re-hired and triple-fired. Assuming the company isn't bust now.
...it's worth pointing out (as non-Spartacus has done several times) that Knight weren't pro traders here. That is to say: they weren't buying and selling on their own behalf, but on behalf of their clients, e.g. pension funds, other banks, hedge funds, high net worth individuals (aka the 1%)
Based on other stuff I've read the test module which went wrong was designed to simulate orders from those clients in a test environment. It wasn't a flash git algo gone wrong - it was never designed to go into production.
Vic, who gets all the above, then says "You have to wonder why the testing package wasn't on a physically separate machine which gets thrown into the canal once its job is done."
This simply isn't how it's done. I'd put large amounts of money on the production systems being physically separate (and probably beefier) machines from the test systems. Most likely these production hosts sit on a different network in a secure data-centre somewhere, running primary/standby (or more) and proofed against all manner of threats. What happened here - and I'm guessing, but it's a guess based on a number of years' industry experience - is that it was a failure of deployment, i.e. a 'test mode' switch was left flicked, an extra module was installed during an install/upgrade...
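For what it's worth, the sort of start-up check that catches a flicked 'test mode' switch or a stray test module might look like the sketch below. Everything in it (the module names, the environment labels, the exception) is hypothetical; nobody outside Knight knows what their deployment actually looked like.

```python
# Hypothetical names for modules that must never run against a live exchange.
TEST_ONLY_MODULES = {"load_generator", "order_simulator"}


class UnsafeDeploymentError(RuntimeError):
    """Raised when test tooling is about to start in production."""


def check_deployment(environment, enabled_modules, test_mode):
    """Refuse to start in production if any test-only setting has leaked in."""
    if environment != "production":
        return  # anything goes in the lab
    problems = []
    if test_mode:
        problems.append("test mode flag is set")
    leaked = sorted(set(enabled_modules) & TEST_ONLY_MODULES)
    if leaked:
        problems.append("test-only modules enabled: " + ", ".join(leaked))
    if problems:
        raise UnsafeDeploymentError("; ".join(problems))


# Fine in the lab, refused in production.
check_deployment("lab", {"order_router", "load_generator"}, test_mode=True)
try:
    check_deployment("production", {"order_router", "load_generator"}, test_mode=False)
except UnsafeDeploymentError as err:
    print("refusing to start:", err)
```

A check like this is no substitute for a proper release process, but failing loudly at start-up is a lot cheaper than finding out from the exchange.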
So this time, we can't blame Cocaine Jimmy the Sociopathic Trader, nor Bob the Banker (he who bathes in the tears of widows and orphans). We should look inwards. It's an IT failure, and anyone who hasn't made a mistake like this (perhaps without the repercussions, but the same level of error) simply hasn't been around long enough. Although the game was played out in a financial arena, this was an IT cockup, and better IT practices are called for.