Starliner snafu could've been worse: Software errors plague Boeing's Calamity Capsule

Troubled aerospace giant Boeing will "re-verify" the flight software code for its calamity capsule, the CST-100 Starliner, after it was revealed that December's anomaly could have been a lot, lot worse. Boeing had already coughed to a timer error that made the spacecraft's internal clock 11 hours out of whack while sat on the …

  1. Anonymous Coward

    Capability Maturity Model ?

    Wasn't this supposed to stop shit like this happening ?

    Although if Boeing's reaction to it was the same as GEC's*, the extra hour a week it cost would never have been allowed by management.

    *Anyone under 25 should ask their parents about GEC, and more specifically, what happened to it.

    1. Red Ted
      FAIL

      What happened to GEC

      George Simpson is what happened to GEC.

      Right after ruining another big British engineering company Lucas Industries.

      1. sawatts

        Re: What happened to GEC

        Simpson! I had completely forgotten that I worked for GEC aka Marconi Medical Systems 2000-2002, but that name riles me. Saw it all collapsing from the inside.

        He burnt the cash reserves built up by former generations on a splurge during the dot com boom. Marconi supplied the telecoms industry, which stopped buying after it gave all its money to the government for 3G licenses.

        One of the reasons I have very little faith in most senior managers.

        (See also HP & Meg Whitman)

    2. Saruman the White

      Re: Capability Maturity Model ?

      Unfortunately CMM turned out to basically be a fad - once the bean counters realised just how expensive it was to implement (a company like Boeing would need to spend hundreds of millions of dollars to get to Level 5) they very quietly dropped it.

      This and the 737-Max fiasco have me thinking that Boeing is starting to suffer from a hardening of the corporate arteries. Management have lost sight of what the company actually does (aircraft, spacecraft, ...) and have become far too concerned with the bottom line. As a result, shortcuts are being taken in the engineering (software, mechanical, ...) that are having major Safety of Life implications. The only solution may be a radical overhaul of corporate management, starting at the top and working down to the bottom: the corporate equivalent of "nuke it from orbit, it's the only way to be sure".

      1. Anonymous Coward

        Re: Capability Maturity Model ?

        The whole point of CMM was that OVERALL it added a man-hour a week as an overhead to the company. However the SNAFUs it eliminated saved countless man-years on big projects.

        Its real problem - as with any initiative to improve quality - is that it required an investment. Something few companies are happy with. Especially investments that might take - gasp - a couple of years to start making a change.

        Which leads us to the root of all the woes most big companies are facing right now. They've streamlined their processes to within a percent of perfect, and there's just no room to invest any more.

        Oh, and big hint. "Spending" is not "investing". Spending is what you do on shareholder and management bonuses. And there's plenty of that going on.

        1. Doctor Syntax Silver badge

          Re: Capability Maturity Model ?

          "They've streamlined their processes to within a percent of perfect"

          Define "perfect".

          1. Yet Another Anonymous coward Silver badge

            Re: Capability Maturity Model ?

            We landed on the runway within 1% of the correct altitude

          2. Steve Davies 3 Silver badge

            Re: Define Perfect?

            For this lot, perfect is delivering as little as possible in the minimum time allowed and still walking away with 100% of the available dosh (Damn those shareholders!)

            Minimum effort for maximum returns. Perfecto!

            1. John Smith 19 Gold badge
              Unhappy

              "still walk away with 100% of the available dosh "

              Bingo.

              BTW Boeing's award for Commercial Crew (which this is) was 61% higher than SpaceX's.

              SX's test had some drama.

              But not on this scale.

              1. John Brown (no body) Silver badge

                Re: "still walk away with 100% of the available dosh "

                If NASA were forced to choose just one supplier, they'd choose a massively large incumbent over a new incomer every time. If SpaceX has a major disaster, it could kill them dead. Boeing, as we have seen, can ride out a big disaster and, hopefully, recover from it. It's no real surprise that when NASA spread the costs and risk, they spread it more thinly over the newcomers. This could change over the coming decade if SpaceX continue to prove themselves and keep costs down and show they have the reserves and/or enough other business to survive a disaster.

      2. Doctor Syntax Silver badge

        Re: Capability Maturity Model ?

        "starting at the top and working down to the bottom"

        Or start at the top and just work down as far as they need to go.

      3. Mike Richards

        Re: Capability Maturity Model ?

        Didn't the management of Boeing decamp to Chicago so they could be closer to the money markets? And of course that management of Boeing came from McDonnell Douglas where they had done wonders for shareholders by cost-cutting even as they drove large parts of that company into irrelevancy.

    3. Anonymous Coward

      "Anyone under 25 should ask their parents about GEC"

      AKA "The Go Easy Corporation"*

      AKA "The Generally Evil Company"

      *Home of the 9-5 engineer.

    4. John Smith 19 Gold badge
      FAIL

      "Capability Maturity Model " A history lesson.

      CMM was developed by Carnegie Mellon after studying how IBM Federal Systems built the software for the Space Shuttle.

      1) Design the software in full before you start coding.

      2) Have 3-4 person teams walk through each piece of code and log any mistakes found

      3) Fix them later

      4) Analyze the form of the mistake

      5) Re-analyze the code base to find any other instances of it

      6) Update the checklist of known bug patterns to stop it happening again.

      7) Repeat until system is done.
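
      Steps 4 to 6 are the part most shops skip, so here is a toy sketch of what they amount to. This is an illustration only, not how IBM Federal Systems actually did it: the pattern names, the regexes and the function names are all invented for the example, and a real checklist would be far richer than a couple of regular expressions.

```python
import re

# Hypothetical checklist of known bug patterns. The names and regexes are
# invented for illustration and are deliberately simplistic.
KNOWN_BUG_PATTERNS = {
    "float-equality": re.compile(r"==\s*\d+\.\d+"),
    "literal-divide-by-zero": re.compile(r"/\s*0\b"),
}


def scan_source(sources, patterns=KNOWN_BUG_PATTERNS):
    """Step 5: re-analyse the code base for other instances.

    `sources` maps a file name to its text. Returns a list of
    (file name, line number, pattern name) tuples, one per match.
    """
    findings = []
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for pattern_name, regex in patterns.items():
                if regex.search(line):
                    findings.append((name, lineno, pattern_name))
    return findings


def record_new_pattern(patterns, name, regex):
    """Step 6: return a copy of the checklist with the new pattern added."""
    updated = dict(patterns)
    updated[name] = re.compile(regex)
    return updated
```

      The point of the design is that the checklist only ever grows: every mistake found in a walkthrough becomes a pattern that the whole code base is re-checked against, not just the file where it was first spotted.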

      It takes 2 minutes to explain but decades to implement and Fed Systems code was estimated to be 10x more expensive than industry standard LOC.

      It never failed on any Shuttle mission, although a couple of the processors that hosted it did glitch.

      But if you pay peanuts you will get code monkeys.

      1. Mike the FlyingRat
        Boffin

        Re: "Capability Maturity Model " A history lesson.

        "But if you pay peanuts you will get code monkeys."

        That is not always true. (I mean I agree with you that going cheap never is.)

        You can easily overpay for underqualified coders. This is a problem that exists in our industry because of how the system is dumbing down and allowing anyone to 'code'.

        A friend and I used to joke about the 10 week wonders that came out of Accenture. They hired 'smart' college kids who got 'A's in non-technical majors and turned them into a horde of consultants. They couldn't and shouldn't touch serious code.

        Yet today, they get job titles of 'software engineers' thinking that they are.

        Universities aren't teaching the basic theories that we learned years ago. For the most part today's crop of coders can write high level code, but when it comes to understanding how to write the low level code... very few can and do.

        Today's 'Agile' teams don't adjust their schedules to fix code and worry more about their burn down rate than doing things right.

        Sorry to rant, but it's more than just $$$. We are in an industry where the bean counters see developers, coders, and software engineers as all the same thing.

        You can spend $$$ to find a chef, but you may end up paying $$$ for a line cook because he's the best you can find, or can't tell the difference.

      2. TeeCee Gold badge
        Facepalm

        Re: "Capability Maturity Model " A history lesson.

        1) Design the software in full before you start coding.

        But, but, but Agile damnit, Agile is where it's at. You can't fuck with what's trendy or you'll piss off the millennial snowflakes and then you'll get a "h4te for you and everything you stand for" hashtag trending on Tw@tter.

        Better to let a few people die...

        1. Giovani Tapini Silver badge

          Re: "Capability Maturity Model " A history lesson.

          Corporate Agile: -

          Security = Blocker

          Planning = Blocker

          WTF is this for = Blocker

          Who is the customer = Blocker

          Compiles = Ship...

          Agile is now a term used to disguise activity as progress, and eliminate any type of common sense thinking. I'm so 2010 in my use of the term.

        2. Grooke

          Re: "Capability Maturity Model " A history lesson.

          Agile may have become popular because of millennials and the start-up mentality, but the real problem is all the middle-age middle-managers trying to stick agile on everything to prove that they're still relevant.

  2. lglethal Silver badge
    Go

    Hmmm...

    Has anyone else noticed that ALL of Boeing's problems seem to come from either a) Management (we don't want to recertify what's in effect a new aircraft because it will be expensive) or b) Software.

    Who wants to place a bet that b) can also be placed under a), due to some manager along the line going "Nahh, we don't need to perform a review to detect errors or do extensive testing. That's expensive and we don't have time..."

    I'm not so sure Boeing's workplace culture is the problem, Boeing's management culture on the other hand...

    1. nematoad Silver badge
      FAIL

      Re: Hmmm...

      "...see what is lurking in Boeing's workplace culture..."

      To misquote Shakespeare, "First let's kill all the accountants."

      This seems to be the management having the choice of "Good, Fast, Cheap" and going for fast and cheap. No doubt junior heads will roll.

      1. Mike the FlyingRat
        Facepalm

        @nematoad Re: Hmmm...

        Oooh! That's a micro-aggression!

        I'm telling HR.

    2. Anonymous Coward

      Re: Hmmm...

      "I'm not so sure Boeing's workplace culture is the problem, Boeing's management culture on the other hand..."

      This isn't management's fault, it's everyone else's, as management's investigation and report will clearly show.

    3. macjules Silver badge
      Facepalm

      Re: Hmmm...

      Boeing's management seems to work along the lines of, "Does it work? No? Let's rename it to Starliner-MAX in that case"

    4. fidodogbreath Silver badge

      Re: Hmmm...

      Has anyone else noticed that ALL of Boeing's problems seem to come from either a) Management (we don't want to recertify what's in effect a new aircraft because it will be expensive) or b) Software.

      It was a few years ago, but I think the self-immolating Dreamliner batteries were actually an engineering fail.

    5. Mark 85 Silver badge

      Re: Hmmm...

      I suspect you're right that it's the management culture. Boeing seems to be in the "too big to fail" state of mind. It's probably the other way around... they're too big to pay attention to details and thus primed for failure.

      With as many government pots they have their hands in, they'll probably start looking for subsidies just to get some programs finished.

      1. John Brown (no body) Silver badge

        Re: Hmmm...

        Subsidies? But that would be illegal! Just ask Airbus.

    6. Dr. Ellen
      FAIL

      Re: Hmmm...

      Boeing's problems are not all from management and/or software. What about the stray tools rattling around on the tankers they (try to) sell to the air force?

      The problem is that the bosses and the workers don't live in the same building any more. Neither knows exactly what the other is doing.

      1. Alan Brown Silver badge

        Re: Hmmm...

        "What about the stray tools rattling around on the tankers they (try to) sell to the air force?"

        That's a management problem - the same kind that killed British Leyland

        You don't HAVE that kind of issue with workers (or militant unions) unless management is utterly rotten. Happy workers don't do that kind of thing.

        Look into Al Jazeera's 2011 report on the Boeing 737NG whistleblowers - substandard fuselage ribs were being supplied to the factory in the 1990s on falsified paperwork, and line management told the workers to put them in anyway and beat panels into shape to make them fit, covered the damage up with filler and paint, then sent it all down the line on more falsified paperwork. The FAA shopped the whistleblowers back to Boeing within a week in 2003 and Boeing sacked them.

    7. Electronics'R'Us Silver badge
      FAIL

      Re: Hmmm...

      The issue here seems to definitely be from management but the real problem is that their attitude filters down to just about everyone.

      I have seen issues where a ship's company (Royal Navy) was so toxic that the entire crew were re-assigned elsewhere (and all on the same day) with an entire new crew brought in. In that case, the more senior officers were also re-assigned to desk jobs or quietly convinced to resign / retire.

      The same has happened to army regiments in the past. Once the rot sets in, it affects everyone.

      In this case it appears the bean counting attitude has filtered down to lower management (so the rot is just about complete) and the best solution would be to completely replace all of them (which sadly won't happen).

      I have worked with safety critical kit, and we would never skip a design review (where you have to show just how you have actually met the system requirements); this tells me that Boeing apparently thinks that system requirements and formal validation are unnecessary - this is total madness. If we were late on the programme, we were late and we would tell the customer and explain how we were going to recover.

      Skimping on proper engineering (not just software - if this is happening there, it is happening in all engineering disciplines) is a recipe for disaster and that is where it has taken Boeing in more than one area of the company with only a token CEO 'leaving'; this attitude needs to be ripped out entirely.

  3. Pascal Monett Silver badge

    "re-verifying flight software code"

    Once upon a time, we didn't have the Internet (at least not like it is today), but we did have game consoles. In those days, when you shipped a console to be sold, it had to work. There was no such thing as a firmware update, no downloadable software patches, nothing. If your console didn't work 100% out of the box, you were getting returns and a bucket load of bad rep.

    Today, we have the Internet, and ever since there has been nothing that has worked out of the box. There's always a patch, always an update, and nobody knows how to write and review code so that it works first time. Including, apparently, large industrial companies with a long engineering history that should damn well know better. But, since we now have the convenience of being able to patch, we all wait for something to crop up and scramble to find the bug when the product has already shipped.

    Let's face it: humans are lazy. With the Internet, we have lost the drive to check and re-check and polish that code until it glows. We just chuck it out because we know that, if something does go wrong, well we'll just patch it. This attitude is now so ingrained into our minds it is tainting the minds of engineers who should know better.

    1. Andytug Silver badge

      Re: "re-verifying flight software code"

      Plus, now we have the internet, it's far too easy to download/copy/paste someone else's code into your work without checking it properly, or whether it'll work with the code you already have. Thus saving money and keeping the manglement happy.....

      1. iron Silver badge

        Re: "re-verifying flight software code"

        Seen a lot of examples of rocket engine control code on StackOverflow have you?

        1. Kevin Johnston
          Joke

          Re: "re-verifying flight software code"

          So is it not the same as this drone flight stability code I just found?

        2. HamsterNet

          Re: "re-verifying flight software code"

          It’s worse.

          They cut and pasted their own code, from the capsule to the service module, but then didn’t update what lookup tables it used.

          Had it not been spotted and rush-patched, when the two parts separated the service module would have used completely wrong thrust, then tried to self-correct using even more incorrect thrust, and so on until it was out of thrust or had crashed back into the capsule.
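
          For what it's worth, this class of mistake is cheap to guard against. A minimal sketch, assuming nothing about Boeing's real code: tag every lookup table with the vehicle configuration it was built for and refuse to use it anywhere else. The class, the configuration labels and the jet and valve numbers below are all hypothetical.

```python
from dataclasses import dataclass


class TableMismatch(Exception):
    """Raised when a table is used outside the configuration it was built for."""


@dataclass(frozen=True)
class ThrusterTable:
    """A lookup table tagged with the configuration it is valid for."""
    valid_for: str        # hypothetical configuration label
    valve_for_jet: dict   # jet name -> valve id (made-up values)


def select_valve(table, current_config, jet):
    """Look up the valve for `jet`, but only if the table matches the
    configuration the vehicle is actually in."""
    if table.valid_for != current_config:
        raise TableMismatch(
            f"table built for {table.valid_for!r} used in {current_config!r}"
        )
    return table.valve_for_jet[jet]
```

          With a check like that, pasted code that still points at the wrong table fails loudly the first time it runs in a simulator, instead of quietly firing the wrong jets after separation.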

          Boeing sent multiple untested, unvalidated software patches, written on the fly, to the Starliner whilst on mission, just to get it to return safely, and it still failed to reach the ISS.

          The approach and docking at the ISS hasn’t been tested.

          Let’s not forget this was a prove-it-all-works mission, one that without direct intervention would have resulted in total loss.

        3. John Brown (no body) Silver badge

          Re: "re-verifying flight software code"

          No, but there may be a lot of date conversion and timer management code snippets just waiting to be pulled, and any number of other routines that could be useful :-)

    2. steelpillow Silver badge

      Re: "re-verifying flight software code"

      To be fair, the other big change is that back in the day, 16k of code was a typical OS for a console or home computer and many had a lot less. Many also still had the odd arcane bug or two. Nowadays 16G is not unusual. That's a million times more stuff to QA before shipping. And arcane bugs at 30,000 ft can be a lot less amusing to the people on board than they are when playing Jet Set Willy at home.

      Nevertheless, Boeing's management have taken them to the bottom of the Western aerospace league (I don't know about the rest) and its shareholders are at last getting restless. The CEO has already been sacked; let's hope they understand about tips and icebergs.

      1. Doctor Syntax Silver badge

        Re: "re-verifying flight software code"

        "Nevertheless, Boeing's management have taken them to the bottom of the Western aerospace league"

        US Govt policy is to ensure that wherever Boeing might be on merit any non-US companies will be penalised into a lower place.

      2. Anonymous Coward

        In some ways, that's the problem

        A lot of projects now have many, many lines of overly complex code "because that's how modern code is written", rather than have something that's been crafted by a highly-skilled practitioner to provide the expected (required) functionality in a form that is easy to understand, test and reason about.

        Think of the mess you get with a lot of frameworks - small fragments of apparently unrelated code that are connected via some "magic" that makes it very hard for a human to comprehend. It's easy to produce, but...

        1. Alan Brown Silver badge

          Re: In some ways, that's the problem

          > A lot of projects now have many, many lines of overly complex code "because that's how modern code is written"

          Case in point: Toyota engine management code. This used to be simple and robust. It's now so complex and fragile that it hid a number of fail-dangerous modes that took months of analysis by 3rd parties to prove

      3. Mike the FlyingRat
        Boffin

        @ steelpillow Re: "re-verifying flight software code"

        Sorry had to down vote you.

        Back in the day, 16K was low level code. Assembler and C code.

        Today, everything is abstracted and just a few lines of code call something that calls something which is abstracted from the developer so that his 4 lines of code really represent a lot of code.

        It's not the size of the code, but the level of effort to really understand what the code is doing.

        Today you still see people who write embedded systems writing small code that still does a lot.

        The difference is that back in the day... you had time to walk through the code to make sure it did what you wanted it to. Now you don't.

    3. vincent himpe

      Re: "re-verifying flight software code"

      but it worked on my arduino ...

  4. Jemma Silver badge

    Cock-up base here

    The calamity has landed..

    1. Jemma Silver badge

      Re: Cock-up base here

      You know - I am kind of wondering if they haven't subscribed to the British Leyland school of management..

      Might explain a lot.

      "Well you only ever need one knock sensor, why'd you need two stall sensors..?"

      "African pilots? Whatever next? They don't like it up 'em you know."

      "Why bother photostating all the manuals again Bill, no one reads 'em, and you can't do it anyway - the union would have a fit - it's not your job"

      Or

      "GMT were good enough for Red Robbo so it's good enough for us..."

      Some time later

      "oh crap".

      It's like Boeing is channelling the spirit of BL Solihull on a wet Friday in November. With rockets.

      1. ClockworkOwl
        Mushroom

        Retaining Plunger

        Nahh, It's just CaptainBoeing and the StarLiner...

        "set the controls for the heart of the earth"

        Aerospace Age Inferno ->

  5. iron Silver badge
    Mushroom

    If I was the Russians (or anyone else with a stake) I would refuse to let Starliner near the ISS until it demonstrated successful docking some other way. What guarantee do they have it won't collide with the station causing damage and potential station loss?

  6. Philip Storry

    How things have changed...

    This article has stuck with me since I first read it well over a decade ago:

    https://www.fastcompany.com/28121/they-write-right-stuff

    It shows a mature, confident development process that understands the risks and chooses to minimise them.

    For example, if they find a bug they don't just fix it. They check what kind of bug it is, and if it's one they've not encountered before (certain types of arithmetic for example) they then check the whole code base for the same issue.

    This is why the Shuttle never had a software problem that killed people. Culturally, they took it seriously.

    By contrast, the management at Boeing evidently have a different culture. One of cost cutting and what could charitably be called "personal development".

    If they encounter a bug, they're probably wondering whether they should fix it or just rewrite in Node.Js. The latter buys time, and looks good on the CV when the inevitable failure happens anyway...

    This would be amusing, if people's lives weren't on the line.

    1. Steve K Silver badge
      Flame

      Re: How things have changed...

      Compare and contrast with the F-35 ALIS replacement supply-chain software where Lockheed Martin are looking to move to an Agile approach and ship Minimum Viable Product.....

    2. Anonymous Coward

      Re: This is why the Shuttle never had a software problem that killed people

      Whilst true, the fact that people have died - twice - in shuttle failures means it's a shame the whole damn project isn't covered thusly.

      1. Doctor Syntax Silver badge

        Re: This is why the Shuttle never had a software problem that killed people

        "it's a shame the whole damn project isn't covered thusly."

        Which is the exact point that Feynman made in the original inquiry.

      2. Jemma Silver badge

        Re: This is why the Shuttle never had a software problem that killed people

        Slightly unfair comparison - both those situations were physical hardware events - one avoidable but irreversible once the launch was committed - the other could have been recoverable if the right action was taken (wing damage).

        Both were a result of forcing the physical design envelope of the whole entity against advice and then in the latter case abject gormless indifference to even check for damage, ascertain the situation and work a solution - it was less of a technical challenge than Apollo 13 that's for sure. Worst case scenario they could have fueled another shuttle and launched with minimal stores and a supply of heat resistant tiles or just evac-d the crew onto the rescue shuttle and hit the next window - it was not that technically challenging.

        The situation in this case is a whole different kettle of fuck up. It's not a one point failure situation - it's the exact opposite - you'd be hard pressed to find anything that these people did right. I think my A level computer studies group could have done better (in BASIC) than this shower. They cut/pasted code and, worst of all, look up tables into software for a different system, didn't check it visually or otherwise, and didn't simulate it on ground systems that should have been available as part of the contract.

        Given what's come out I'm amazed they managed to cobble things together so it got down in one piece - but I know I'd not be allowing that thing off the pad until it's been fixed and tested throughout and I'd seriously be looking at having some senior and middle management staff fall on their wallets.

        If that thing had taken the ISS down catastrophically and the remains smacked into a major Chinese or Iranian city...that plus "Dickwit diplomacy"...?

        1. Bubba Von Braun

          Re: This is why the Shuttle never had a software problem that killed people

          Sorry but both Shuttle accidents were avoidable. They are both rooted in the same sort of management malaise affecting Boeing.

          Engineers presented that there was an issue and it was dismissed on the basis of schedule and/or cost pressures.

          A great read is "Truth, Lies and O-Rings" by Allan McDonald, one of the Thiokol folks who blew the whistle on the Challenger mismanagement.

          BvB

          1. Jemma Silver badge

            Re: This is why the Shuttle never had a software problem that killed people

            I think I said that. But the challenger on wasn't as soon as the boosters fired. Even then it might have made it but for higher than average wind shear.

            I think your egotesticle might have defeated your reading ability.

            1. Sweep

              Re: This is why the Shuttle never had a software problem that killed people

              "the challenger on (e) wasn't (avoidable) as soon as the boosters fired."

              I don't see your point.

              The problems with Starliner were also unavoidable, once it had launched with shoddy code......

              1. Getmo

                Re: This is why the Shuttle never had a software problem that killed people

                There's so many problems with Starliner, you're going to have to be more specific.

                The problems that did happen, with the 11 hour clock difference causing attitude control fuel burn, yes, unavoidable after launch was initiated. The other problems, with code controlling a separation event being bad (and who knows what else), actually were avoidable, since they avoided them by uploading some untested duct-tape software patches when it was on orbit.
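
                On the clock difference: a plausibility check on the mission timer is about as basic as flight software checks get. A minimal sketch, with the big caveat that I have no idea how Starliner's timing is actually wired up; the function name, the idea of an independent reference and the five second tolerance are all assumptions for the example.

```python
def mission_timer_plausible(onboard_met_s, reference_met_s, tolerance_s=5.0):
    """Return True when the onboard mission elapsed time (seconds) agrees
    with an independent reference to within `tolerance_s` seconds.

    Both the reference source and the default tolerance are hypothetical.
    """
    return abs(onboard_met_s - reference_met_s) <= tolerance_s
```

                An 11 hour disagreement (39,600 seconds) fails a check like this by a very wide margin, at which point the sensible thing is to hold the timed sequence rather than start burning propellant.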

                Nothing like bringing a "just ship it now, we'll fix the rest with patches after launch" attitude to crewed space travel, eh?

    3. Anonymous Coward

      Re: How things have changed...

      Actually, no, Shuttle software (and spacecraft software in general) is *not* highly reliable.

      Here are the points that everyone misses, which is what allows the bullshitters in that industry to prosper (I know, I was close to the coal face).

      1) CPU throughput is approximately 1 MIPS. So what? It executes 3000x slower than you are used to. Mean Time To Failure requirement of ten years sounds stringent until you realise that’s the equivalent of only 24 hours uptime on a 3GHz CPU.

      2) All complex programming constructs are disallowed by policy. No dynamic memory allocation, no multi-threading, no caches (for deterministic execution times).

      3) Shuttle software is about 240kLOC, but with a rather high ratio of comments and formatting (nothing wrong with that) it’s only 70kLOC executable. Let’s not pretend this is complex software. Most of it is extremely simple command sequencing on tightly controlled operational procedure.

      4) Productivity, the elephant in the room. Having gone through all the processes, average productivity for the industry is 0.1 LOC per day, including comments. Yup, you read that correctly. The average programmer in the space industry writes a grand total of *6* lines of executable code per man year.

      I’m fairly sure that *most* teams could write code with uptime exceeding 24 hours under those circumstances!
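
      For anyone who wants to check the arithmetic in points 1 and 4, here is a rough sketch. The assumptions are mine, not part of the figures quoted above: a 3GHz CPU is treated as roughly 3000 MIPS (one instruction per cycle), and a man year is taken as about 220 working days.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap days


def equivalent_uptime_hours(mttf_years, slow_mips, fast_mips):
    """Hours the faster CPU needs to execute as many instructions as the
    slower CPU executes over its whole MTTF."""
    return mttf_years * HOURS_PER_YEAR * slow_mips / fast_mips


def executable_loc_per_year(loc_per_day, working_days, executable_ratio):
    """Executable lines per man year, given total lines written per day."""
    return loc_per_day * working_days * executable_ratio


# Assumed inputs: 3GHz ~ 3000 MIPS, 220 working days per year,
# 70k executable lines out of 240k total.
uptime = equivalent_uptime_hours(10, 1, 3000)       # about 29 hours
loc = executable_loc_per_year(0.1, 220, 70 / 240)   # about 6.4 lines
```

      So under those assumptions the ten year requirement works out at a bit over a day of 3GHz uptime, and 0.1 LOC per day does come to roughly six executable lines per man year.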

      1. John Smith 19 Gold badge
        Unhappy

        "I'm fairly sure that *most* teams could write code with uptime exceeding 24 hours under those circumstances!"

        Well this team couldn't.

      2. MCPicoli

        Re: How things have changed...

        First of all: Where are the sources beyond "I was around"? If true, there HAS to be written record somewhere?

        1) Reaching a MTTF of 10 years is independent of whatever architecture you are using, from the software point of view. "Bug free" software (or 99.99x% bug-free software if you like) will run bug-free regardless of clock speed. Hardware is another story.

        1a) Why are we using 3GHz processors anyway? The old shuttle-era processors did the work, why suddenly do we need 3000x more processing power? OK, it allows for more complex scenarios, and a lot of "quality-of-life" improvements, but at some great cost in terms of development/bugs.

        2) And why is that bad? Or, in other words, what do error-prone "advanced" programming constructs get you? Use them in non-critical places only, stick with tried-and-true "simple" programming where it matters.

        3) Again, WHY do we need millions of LOC? Keep it as simple as possible. Less code, well designed, made, audited and tested, correlates with (but does not guarantee) fewer problems.

        4) Source or I'll call it bogus. And let me tell you, if several tens of billions of dollars and several human lives are in play, so be it, productivity be damned, quality (functionality, performance, security among others) FIRST, productivity after.

        1. Alan Brown Silver badge

          Re: How things have changed...

          " Why are we using 3GHz processors anyway?"

          In general for orbital stuff we're not. It's too susceptible to radiation events glitching things out.

          There's a reason that space hardware uses rad-hardened kit...

    4. phuzz Silver badge

      Re: How things have changed...

      As we all know, catching 100% of bugs is a tall order, even if you throw hours and people at it, but it was this bit that worried me:

      Fortunately, the team noticed that second error while reviewing the code following the first, and uploaded the fix prior to landing.

      They spotted the error whilst in flight, which presumably means they weren't specifically looking for it, and they certainly didn't have a lot of time to find it, which implies it was a pretty obvious bug.

      It wasn't some really subtle bug that 'only happened on flight hardware when the moon is in the third quarter and a subroutine had run within the last hour but not cleared etc. etc.' This was a bug that someone managed to spot while the craft was in space, presumably not while specifically searching for it. This was a bug that should have been picked up by someone before that rocket ever launched, indeed, before that code was ever uploaded to the spacecraft.

      That's a management problem, not a software one.

  7. Chris G Silver badge

    URDA

    Large Aerospace Company, looking for experienced programmer to write URDA protocols for large scale projects.

    Must be able to work to tight timescales and budgets.

    URDA: Unplanned Rapid Disassembly Avoidance.

  8. cantankerous swineherd Silver badge

    the term "glitch" indicates something like "ooh this road's a bit bumpy" rather than implying a catastrophic ball of fire. or is that just me?

    1. Anonymous Coward
      Anonymous Coward

      Not just you. "Glitch", to me, is indeed a very tiny issue with no consequence.

      Like a visual artefact on a TV screen: doesn't prevent you from watching but is a bit of an inconvenience.

      In this world, words are important, mostly to the uneducated (aka politicians).

      "Glitch" won't make them move, but "bug" certainly will have them run around crying !

  9. sawatts
    Childcatcher

    Won't anyone think of the shareholders?!

    Perhaps Congress need to throw a few more $Billion at Boeing?

    Management must be wondering why, with all their MBAs, they are having so many problems with engineering and software quality.

    1. EVP Bronze badge

      Re: Won't anyone think of the shareholders?!

      Logic clearly dictates that they ditch the engineers and hire highly competent MBAs to replace them. In the words of the cannibal in The Secret of Monkey Island: ”That should do it!”

  10. This post has been deleted by its author

  11. Anonymous Coward
    Anonymous Coward

    This is the same company stuffed with engineers, processes, and managers that seem to think it's acceptable to use a single sensor reading in a control system that will trim to the point of crashing an aircraft if that reading is incorrect. So obviously, that's the way they designed it and NOBODY, not a single soul in any of the hundreds of meetings, code reviews, design reviews, and tests, wondered whether that was a sensible idea. Because Boeing.

    1. Doctor Syntax Silver badge

      "NOBODY ... wondered whether that was a sensible idea"

      I'm sure a lot of people did. The reality distortion field in a typical large business would have made sure such doubts never got through.

    2. ElectricPics

      The point here isn't the reliance on a single sensor, which is bad enough, but the complete omission of MCAS from the pilot manual and simulators, along with the fairly simple but also undocumented method for regaining manual trim control.

      Where else do Boeing have safety critical systems that are unknown and undocumented for the operator?

  12. imanidiot Silver badge
    Alert

    The 'nauts must be nervous

    I bet the NASA astronauts assigned to fly the thing must be getting a tad nervous. Things like this slipping through all review processes and ending up in a flight vehicle are indications of serious underlying management issues.

  13. Duncan Macdonald Silver badge
    Mushroom

    Be nasty

    Demand that Boeing does 2 launches - one a target to simulate the ISS without endangering the ISS itself and another Starliner launch to rendezvous with the target. Also require that the software is checked (at Boeing's expense) by a good outside software company.

    Icon for what should happen to Boeing's senior management ===============>

    1. Adair

      Re: Be nasty

      Also require that the Head of Software Design, Head of Engineering, and Head of Safety be the first crew. They should all be competent to do so, and have every faith in the competence of their product.

    2. Doctor Syntax Silver badge

      Re: Be nasty

      "Icon for what should happen to Beoing's senior management"

      No. They should be on a manned flight. Even when it's in production one of them should be present as ballast.

  14. Anonymous Coward
    Anonymous Coward

    Make the repeat test manned with a few directors as passengers?

  15. anthonyhegedus Silver badge

    Seeing as there are millions of possible items to QA in the code, perhaps they need a rethink about how they write code. This isn't Microsoft, where the normal state of code is 'broken' and you don't expect it to work particularly well. They need to have every bit of code's inputs and outputs sanity-checked by a supervisor program (or three), and of course the whole thing triplicated. It CAN be engineered properly, but it has to be able to cope with the inevitable bugs. What's all this AI revolution supposed to be for if we can't get it to supervise the decision-making of the primary systems? It seems to me that the current software design methodology needs to be uprated anyway.
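
    For what it's worth, the "sanity-check it and triplicate it" idea can be sketched in a few lines. This is only an illustration, not anything Boeing or NASA actually flies: the function name `vote`, the plausibility limits and the tolerance are all made up for the example. Three redundant channels report a value, anything outside a plausible range is thrown away, and a result is only accepted if at least two surviving channels agree.

```python
from statistics import median


def vote(readings, lo, hi, tolerance):
    """Return a voted value from redundant readings, or None if no safe value exists.

    All names and limits here are hypothetical, chosen only for illustration.
    """
    # Sanity check: discard readings outside the physically plausible range.
    plausible = [r for r in readings if lo <= r <= hi]
    if len(plausible) < 2:
        return None  # fewer than two healthy channels: refuse to act

    # Agreement check: keep only readings close to the median of the survivors.
    mid = median(plausible)
    agreeing = [r for r in plausible if abs(r - mid) <= tolerance]
    if len(agreeing) < 2:
        return None  # the surviving channels disagree: refuse to act

    return median(agreeing)


if __name__ == "__main__":
    # One channel has failed high; the other two agree, so their value is used.
    print(vote([5.0, 5.5, 74.5], lo=-20.0, hi=30.0, tolerance=1.0))
    # Two plausible channels that disagree badly: no value is returned.
    print(vote([5.0, 25.0, 74.5], lo=-20.0, hi=30.0, tolerance=1.0))
```

    The design choice that matters is that the voter returns nothing rather than guessing when the channels disagree, so whatever consumes it has to fall back to a safe mode instead of acting on a single bad reading.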

  16. ITnoob

    Anyone remember the stink the Russians kicked up when SpaceX docked man-rated Dragon for the first time? There was talk of donning suits and all sorts. I wonder if there will be the same histrionics when / if Boeing finally approaches?

  17. astounded1

    Is It 'How' They Write The Code? Or Is It 'Who' And Where And What Are They Paid?

    You have your code subcontractors and then you have the Boeing approach. It is not, of course, an issue confined to Boeing. Then again, lots and lots of code subs living in almost subhuman conditions in India making three quid an hour (but being billed out by job shops to their clients at vastly higher rates) aren't all working on aircraft and spacecraft.

    Now tell that to 737 MAX passengers who went down. Oh, that's right, you can't, can you?

    Behind those prime contractors used by the glossy names of global business that don't want to employ developers themselves anymore are young people paid just above dirt for their work hidden beyond the view of the brass at companies that think they are getting a deal - until things start falling out of the sky.

  18. DrM
    Thumb Down

    ASIC -- GUI

    It's 2020. All Software Is Crap -- Get Used to It.

  19. Brangdon Bronze badge

    "Process escapes" is the new "Normalising the deviance"

    "But the two software issues you talk to that you all know about are indicators of the software problems, but they are likely only symptoms, they are not the real problem. The real problem is that we had numerous process escapes in the design development test cycle for software."

    1. EVP Bronze badge

      Re: "Process escapes" is the new "Normalising the deviance"

      My reply to such gibberish: ”I don’t care. I want it to work, reliably. Make it so.”

      Never play along with bullshitters.

  20. Torchy

    SSW Mode.

    Boeing has Shit Software Writers engaged in head in the clouds mode.

    We now have a new term at work that goes by the statement of "doing a Boeing" when things go titsup.

Biting the hand that feeds IT © 1998–2020