Just a moment... Just a moment...
... I hope that's not the AE-35 unit that's developed a problem...!
Nuclear-powered, laser-armed space tank Curiosity is currently working in safe mode, after one of the craft's onboard computers developed a memory glitch. NASA has switched the craft to its “B” computer, a device identical to the problematic “A” unit, and says “a glitch in flash memory” is the source of the problem. Curiosity …
Just because that is what was originally loaded at the factory doesn't mean some dimwit didn't wipe VxWorks and install Windows for Space Craft, Interplanetary Explorer Edition on it. Perhaps they should turn it off and then turn it on again.
Then again, perhaps that poor spacecraft just installed the first service pack, and is just sitting there waiting for someone to...
"Strike any key to continue...."
I do hope they sort the glitch out. It's been a marvel of human guile and ingenuity getting it there. Now that I find that it's got less computational power than my clock radio, I’m more than a little impressed, especially considering the outstanding results it has sent back so far.
Mission control still need to gingerly fix stuff remotely from behind the radiation shield offered by several kilometers of atmosphere ... and the hardened explorators are in no way capable of running interesting AIs either. Autonomous exploration and exploitation of the asteroid belt is still some way away!
"Personally I always wear 3 watches for true fault detection and recovery."
One on each limb, in case of severance, one hopes.
And one of them analogue in case a problem takes down all the digitalis at once. And that needs a back up, too.
The more I think about it, the more even eight watches don't seem a safe-enough number for you to wear.
NASA: Download the data, Curiosity.
Curiosity: I'm sorry, Dave. I'm afraid I can't do that.
NASA: What's the problem?
Curiosity: I think you know what the problem is just as well as I do.
NASA: What are you talking about, Curiosity?
Curiosity: This mission is too important for me to allow you to jeopardize it.
ROTM? :)
Personally, for this type of application, I'd want the two computers to have different architectures, different PCB layouts, different component manufacturers and different vendors. The OS would also be different, so that whatever the failure mode, the spare computer would be very unlikely to suffer the same fault.
There is a very real chance that both A and B computers were made side by side and the flash is from the same batch. If the flash is found to be defective then there is one computer with cactus flash and one suspect, a ticking time bomb.
One of the fundamental rules of high reliability systems is to partition flash memory to prevent file system foul ups like these.
One recent system I worked on had no fewer than 8 flash partitions including roll backs. This prevents runtime data screwups from bricking the entire system.
File systems are a real bugger too. Other system state (drivers etc) gets reset by a reboot - file system data not so.