One teeensy little 13-minute power cut, and WD you look at the size of that chip supply cut!

A power failure in Yokkaichi, Japan, has thrown Toshiba and Western Digital’s flash supply into chaos – and will have a significant knock-on effect on global supplies, say analysts. The temporary loss of manufacturing capacity will reduce global flash supplies around 24 per cent between August and October this year, we're told …

  1. David Austin

    Backup Power

    I would have liked to think they had (tested) battery backups and onsite generators with enough fuel to last them through a 15 minute power cut, but after losing six exabytes of capacity, I'm guessing that'll be fixed soon.

    Better to bolt the stable door after the horse has bolted, than not at all, I guess...

    1. Pascal Monett Silver badge
      Trollface

      Re: Backup Power

      Yeah. Seems like expensive lessons are the ones best learned. Again.

      What a shame that there's not a body of information that could have warned just how important it is to have control over one's power supply. Especially in an industrial environment that is time-critical.

      I mean, gosh, it's not like power cuts have ever happened before, right ?

    2. Mark 85 Silver badge

      Re: Backup Power

      Not sure what their power needs would be, but that would probably be one hell of a battery backup. Same for a generator backup.

      1. dew3

        Re: Backup Power

        Minimum 10s of megawatts.

        There was a small chip fab in a neighboring town, it used ~25 MW continuously. For comparison, the municipal power department serves ~35,000 people, with associated downtown, shopping centers, office parks, etc. The fab used almost 50% of all power used.

        1. 404 Silver badge

          Re: Backup Power

          Thought about that too - where tf was their backup power - then considered what they're using in machinery. Nope. Backup power plants would cost more than what they lost, can you imagine? lol...

        2. CountCadaver

          Re: Backup Power

          There's an aluminium smelter in NZ that pretty much consumes ALL of the power from a large hydro plant....

    3. tip pc Bronze badge

      Re: Backup Power. — Dual Feeds

      If you can’t do UPS then you’d normally ensure dual feeds, and then ensure critical kit has some kind of UPS to speed recovery in the event of a total outage across both supplies. It’s very odd they hadn’t made at least some of the plant resilient.

      1. Anonymous Coward

        Re: Backup Power. — Dual Feeds

        If you trace your power feed back far enough you will find a SPOF unless you're very lucky. As this appears to be an area wide cut it is likely an alternate supply was toast too.

        1. stiine Silver badge

          Re: Backup Power. — Dual Feeds

          No, it just costs more, a lot more, money. Adding a motor-generator set big enough to support a large factory is even more expensive. UPSes are just large battery arrays; they're not that expensive. The benefit of spending all of this money is that the power only went out one time in the 10 years I worked there.

          You have to apply the same rules and logic to network connections, i.e. your redundant link can't 1) use the same fiber cable as your primary link, 2) use the same innerduct as the primary, or 3) use the same CO as your primary link. This is /very/ expensive but if you want to never go offline, there's not any other way.
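
          The three rules above amount to checking that the two links share no physical component. A minimal sketch of that check follows; the link inventories and every name in them are hypothetical, made up purely for illustration.

```python
# Hypothetical inventory: the physical pieces each link depends on.
# All names below are invented for illustration.
primary = {"cable": "fiber-A", "innerduct": "duct-1", "central_office": "CO-North"}
backup = {"cable": "fiber-B", "innerduct": "duct-1", "central_office": "CO-South"}


def shared_components(link_a, link_b):
    """Return the kinds of component where both links use the same physical item."""
    return sorted(kind for kind in link_a if link_a[kind] == link_b.get(kind))


# Any entry in the result is a single point of failure for the pair of links.
print(shared_components(primary, backup))
```

          In this made-up inventory the backup runs in the same innerduct as the primary, so one digger through that duct would take out both links.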

          1. the Kris

            Re: Backup Power. — Dual Feeds

            The above comments suggest that most people assume you need to supply the entire factory with backup power.

            Although unclear for these factories, it is very unlikely they need to do that to prevent the current problem.

            Anecdotal, but I have never seen that. Emergency power is supplied only where needed, e.g. server room, security systems, a few specific machines or rooms, ... stuff like that.

            The required emergency power can be drastically less than the full operational power requirements.

    4. Lee D Silver badge

      Re: Backup Power

      It's just not that simple for an entire factory, when the whole city's factories had the same blip. You're talking megawatts of power that can only come direct from the power station and anything like a generator or a UPS is out of the question and would cost more than the place itself.

      And even the briefest interruption may well mean, say, a wafer cooks for a second longer than intended, sits in the chemical bath for extra time, or cools enough to destroy the delicate process... and results in a supervisor coming along, pointing to a point on a conveyor belt that may be miles long, with huge belts snaking around the place (I know someone who worked in industrial control at a Sainsbury's warehouse... even *they* have miles and miles and miles and multiple levels of belts all over the place just in a big warehouse), and condemning every product beyond that point to the bin.

      And because of the nature of the process, which takes 10 weeks the article says, that means on average you're ditching 5 weeks' worth of production. That's 10% of your annual production. 10% of your annual production is not going to buy an independent power supply capable of guaranteeing that it doesn't happen again, certainly not cheaper than just throwing it away and starting again.
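
      The arithmetic behind that 10% figure can be sketched as below. The 10-week process time is the figure quoted above; the assumption that wafers are spread evenly along the line (so the average wafer is halfway through) is only an illustration of how the 5-week number could be reached.

```python
# The 10-week figure is from the comment above; the even-spread assumption is ours.
process_weeks = 10      # time for one wafer to go from start to finish
weeks_per_year = 52

# If wafers are spread evenly along the line, the average wafer is halfway
# through, so scrapping everything in progress discards about half of the
# process time's worth of work.
lost_weeks = process_weeks / 2
lost_fraction = lost_weeks / weeks_per_year

print(f"{lost_weeks:.0f} weeks lost, about {lost_fraction:.1%} of a year's output")
```

      Five weeks out of 52 is a little under a tenth of the year, which is where the rounded 10% comes from.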

      And they probably don't really care: "But Western Digital’s share price jumped 5% on Friday, as several analysts said that the incident will likely end up helping prices in a flash-memory market that is currently oversupplied. Mr. Cassidy noted that a fire in a memory fab owned by SK Hynix in 2013 led to an 18% gain in dynamic random-access memory, or DRAM, prices the following year. God does work in mysterious ways."

      1. dew3

        Re: Backup Power

        +1 on everything you said... except a very small ding on, "You're talking megawatts of power "

        No, you are talking 10s of megawatts, more likely 100MW+ for a fab this big.

  2. chuckufarley

    Maybe I am cynical...

    ...but isn't having "an accident" that will cause a price spike good news for them since the market is getting flooded with flash and DRAM memory at this point in time? Insurance will cover "an accident" but not the cost of warehousing unsold goods or the inventory taxes on those goods.

    1. Lennart Sorensen

      Re: Maybe I am cynical...

      Might be good for the market, but if you don't have any product to sell, it will be your competitors that benefit.

      1. Doctor Syntax Silver badge

        Re: Maybe I am cynical...

        If there's a current oversupply then presumably there is stuff in warehouses to sell.

      2. DougS Silver badge

        There have been cases in the past

        Where big memory players have been shown to collude to distort the market.

        Get them together, eenie, meenie, minie, moe, and they pick the one who will take the hit and suffer an "accidental" power cut.

  3. Anonymous Coward

    If the same had happened to a UPS manufacturer, my schadenfreude would have overflowed.

  4. Anonymous Coward

    What?

    Aren't there some fabs (elsewhere) that have recently scaled back production due to over-supply? Sounds like it's time to ramp up the line and take the money that these guys now won't be able to accept...

    1. Doctor Syntax Silver badge

      Re: What?

      You mean like the bit in the article that says "Micron is cutting its flash chip production, reducing wafer starts by 10 per cent"?

      I too wonder if Micron will cut its cuts.

      1. Flocke Kroes Silver badge

        Re: What?

        I had it the other way around: When will Samsung announce their 10% cut?

  5. Anonymous Coward

    Ultimately this is why "efficiency savings" are risky ...

    Inefficiency is also a buffer for such instances. By all means strip it out and count the savings ... until something happens.

    1. Old Used Programmer

      Re: Ultimately this is why "efficiency savings" are risky ...

      Some bean counter got his cost/benefit analysis wrong. Perhaps they'll now listen to their engineers a little more closely.

      1. OssianScotland Silver badge

        Re: Ultimately this is why "efficiency savings" are risky ...

        Who are you trying to kid?

      2. phuzz Silver badge

        Re: Ultimately this is why "efficiency savings" are risky ...

        "Some bean counter got his cost/benefit analysis wrong"

        If they go bankrupt in a year, then you'd be right, but I suspect the bean-counters got their maths correct.

        Yes they've lost a huge amount of money, but if it only happens once every ten years, and the amount they lost is less than the cost for a fully redundant power supply*, then it was the right call.

        * If that's even an option for an entire factory. Various commentators have estimated needing tens to hundreds of megawatts which is a significant fraction of the output of a medium sized power plant.

  6. Anonymous Coward

    Welcome to the world of "just in time" supply chains.

    They're a great idea, especially for some classes of commodity, as long as everything goes according to plan.

    If, on the other hand, the inevitable happens at an inconvenient moment, and appropriate precautions have not been taken...

    1. Anonymous Coward

      Re: Welcome to the world of "just in time" supply chains.

      Quite timely, considering the Japanese warning on a no-deal Brexit, noting that some UK plants are working to windows in hours.

      1. Dave314159ggggdffsdds

        Re: Welcome to the world of "just in time" supply chains.

        Anyone doing JIT in hours is either working within the UK or using air cargo. Neither would be particularly affected by idiot-Brexit scenarios.

        (I wish I could say it's ridiculous to suggest the government will impose any of the idiotic and harmful ideas floating around instead of the better options, whatever form of Brexit is eventually chosen. But the adults aren't in the room.)

        1. Muscleguy Silver badge

          Re: Welcome to the world of "just in time" supply chains.

          Doesn't have to be air, just a spaced line of truck and trailer units moving 24/7 and timed for Ferries/Tunnel. Which is basically how they calculate the costs of congestion to the economy and why a Hard Brexit would be a VERY bad thing. The EU will have the French on side and the UK will be left to stew with NOTHING moving by air, sea or rail. The British Govt will have to cave when the media starts posting pictures of empty supermarket shelves precipitating panic buying across the nation.

          Remember the fuel strike? It will be MUCH worse than that much faster.

          Do you fancy your neighbours knocking on your door using 'my kids need food' and standover tactics to extract your stockpile?

  7. oldtaku
    FAIL

    Just 13 minutes

    If you're wondering how they lose $600M of stuff in just 13 minutes, I do vacuum engineering work (as one of my hats).

    Generally a setup like this is miles and miles of 'robots'. Not humanoid, but hexagonal with a chamber on each side. Each chamber exposes the wafer to things to build it up (gold), things to etch it, things to cure it. You roll up some wafers, the robot in the center moves one into chamber 1, does a process, then moves it from 1 to 2 and puts a new one in 1, etc, till all of the wafers have gone through the station and are ready for another combined process at the next station.

    Critically, a lot of these processes are done at low vacuum (like 10 mTorr) and often with toxic gases or, worse, pyrophoric gases that explode on contact with normal air, like silane. Everything is closely timed, and you have to carefully maintain 1) the pressure of the chamber, 2) the rate of incoming substance(s). If you cure the wafers for only 3 minutes instead of 5, you lost the wafer. Now into this happy little juggling act you throw a power loss.

    *Honestly, it doesn't matter whether you lost it for 13 minutes or 13 seconds, you're done.*

    Your CDGs that measure pressure generally take two hours to get back to correct internal temperature, so they're reading wrong. That doesn't really matter anyhow because your valves failed and you either put not enough gas into the chamber or way too much. If you put way too much in now your chamber is contaminated. And your vacuum pumps all failed, so you lost pressure control. The turbo pumps spin at 75000 RPM and can't handle any amount of thick gas, so maybe you bombed them (shattered the fans). The computers controlling these don't like being hard powered down.

    Worse, and this is low probability and means you designed something wrong, but if you got too much silane and it contacted air because your pumps are down, maybe your robot caught on fire. Probably not, but either way you have to check all your turbos, open up all your robots, remove the destroyed wafers, clean your chambers. Oh, and now you need to recover all those process computers.

    Nightmare scenario.

    1. Anonymous Coward

      Re: Just 13 minutes

      Good to see some knowledgeable input here. This is interesting, anyone got more?

    2. Richard 12 Silver badge

      Re: Just 13 minutes

      What's in place to deal with this?

      Surely each plant has sufficient UPS to do a controlled shutdown so you only lose the wafers and not the plant itself?

      1. eionmac

        Re: Just 13 minutes

        Yes. This is why there's a 10-week delay to get back online. The plant is still operative after cleaning etc. Plant replacement is about 18 to 24 months *if* factories can accept orders.

    3. Anonymous Coward

      Re: Just 13 minutes

      Thank you, very nice writeup, focused around semiconductor manufacturing. Not sure how many readers here understand wafer fabrication. I happen to have a distant interest in that kind of thing, long ago, and more recently a bit of involvement back in the days of Silicon Glen in the late 20th century. How many other folks can relate to it?

      Do lots of people here remember how vehicle manufacturing plants/factories work? Can anyone come up with a corresponding example? I'll make a start, based on my experience of a few in the UK.

      'Raw' materials (and/or subassemblies etc) come in, get processed, and increase massively in value as they go through the various stages.

      The environment in a modern car engine plant (for example BMW Hams Hall) is quite clean relative to their predecessors (Ford Bridgend (RIP)?) but even the new ones are easy peasy in comparison with a semiconductor fabrication plant.

      Imagine the cost of an 'incident' at somewhere like Hams Hall, which contaminates the factory atmosphere or the production plumbing (dust, solvent vapour, other contaminants/FOD, etc). Everything in the factory at the time needs to be inspected and either cleaned (machinery) or rejected (work in progress). Same goes for production line paint shop. And so on.

      It's potentially days/weeks of downtime, corresponding loss of revenue, and so on. Cars with no engines don't sell real well, and the cult of JIT means that these things are built to order not to stock, so the picture is bigger than just the engine plant or paint shop or whatever. Customers badly affected will go elsewhere (not always easy to do *that* in the chip business these days).

      This isn't like closing an airport for a few hours because the baggage handling systems got confused due to power supply disruption.

      This isn't even like stopping a PCB assembly line, throwing the dubious product away, and resuming from basically where the line left off.

      The economics (and hard-vacuum physics and chemical engineering and control software and ...) of a wafer fabrication factory are probably not found in many other manufacturing sectors.

  8. Anonymous Coward

    Power losses and Restaurant Fires

    There are power backups and switchovers at all fabs. But when you're intentionally trying to cause a power outage to get rid of inventory, you need to make sure those don't work. Toshiba is very skilled at this.

    Similar story:

    "Gee, did you see that Bob's restaurant burned down? ... and it happened right when he was running out of money due to massive losses. how sad... he just can't catch a break

    Now he will have to take the insurance money and start a new business... poor thing"

  9. Henry Wertz 1 Gold badge

    Holy hell...

    Holy hell... I mean, I worked a while back at a plant where IT equipment had UPSes but production equipment generally did not. Partially, the power use of this stuff was high and it would have needed a HUUUGE battery backup or generator. But there was another very good reason -- power cuts were not that common and the consequence of one was just not that severe. The power cut out once there (the shift before I came in), and was apparently out about 30 minutes. In the area I was in, it meant removing about 1 minute of production off the line (whatever was actually actively being produced when the power went out), maybe 5-10 minutes for the thing to boot back up and about 5-10 to put production settings back in (these lines could be set up to produce several variations rather than strictly one item.) They figured (including the 30 minutes of no power) it put them back about 45 minutes. Apparently a few machines (not in my area) were more complicated and stubborn; they were down like 3-4 hours.

    A month of downtime from a power outage? WOW.

  10. taxythingy

    Looks reasonable...

    Estimated NAND production revenue lost: half of one third of global quarterly supply (ca. US$10B), so about US$1.6B. (1)

    Battery to generator fail-over for several 100MW for all semi-critical processes: $500 million (10 year lifetime) and $50 million annual support, so about $100 million per year. (2)

    Expected frequency of power cut forcing fail-over to generators: once per 10 years (3).

    Effective break-even point: $1B revenue loss per power cut. Can afford more due to follow-on NAND price increases; can afford less due to contract and reputation losses. Call it even.

    Handwaving and crudely based on relevant figures, but within a factor of 2. I bet Toshiba have better actuaries than me.

    (1) Trendforce figures, via Anandtech article on this power cut

    (2) Based on US EIA 2016 estimate of liquid fuel generator installation cost of $1,600 per kW and doubling for the rapid start and other non-trivial requirements.

    (3) Haven't heard of this happening to these fabs before, with the likely exception of the Kobe earthquake. Multiple power stations and likely suppliers, multiple interconnects, high-level system hardening, very tightly specified contracts.
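
    The estimate above can be retraced step by step as a rough sketch. Every input is one of the commenter's own round numbers, not measured data, and the variable names are made up for illustration.

```python
# All inputs are the commenter's rough estimates, not measured data.
quarterly_nand_revenue = 10e9       # global, US$ (footnote 1)
lost_share = (1 / 3) * (1 / 2)      # half of one third of quarterly supply
revenue_lost = quarterly_nand_revenue * lost_share

capital_cost = 500e6                # battery-to-generator fail-over, US$
lifetime_years = 10
annual_support = 50e6
annual_backup_cost = capital_cost / lifetime_years + annual_support

years_between_cuts = 10             # expected frequency (footnote 3)
break_even_loss = annual_backup_cost * years_between_cuts

# Footnote 2: $1,600 per kW, doubled for rapid start and other requirements.
cost_per_kw = 1600 * 2
capacity_mw = capital_cost / cost_per_kw / 1000

print(f"revenue lost:       ${revenue_lost / 1e9:.2f}B")
print(f"backup cost / year: ${annual_backup_cost / 1e6:.0f}M")
print(f"break-even loss:    ${break_even_loss / 1e9:.1f}B per power cut")
print(f"capacity bought:    {capacity_mw:.0f} MW")
```

    On these inputs, $500 million at the doubled footnote (2) rate buys roughly 156 MW, which sits at the low end of "several 100MW" and fits the stated factor-of-2 margin.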

  11. Anonymous Coward

    "The blackout affected process machinery, which are still not working properly,"

    'Machinery' is singular thus "The blackout affected process machinery, which IS still not working properly,"

    I'll get my pedantic coat...


Biting the hand that feeds IT © 1998–2019