cc. Kirk, Captain James T.
Electrical surges at the National Security Agency's massive data centre in Utah have delayed the opening of the facility for a year as well as destroying hundreds of thousands of dollars in kit, the Wall Street Journal reports. Ten "meltdowns" in the past 13 months have repeatedly delayed the Herculean effort to get the spy …
"It's that cheap electricity. They should buy quality electrons instead of compromising."
Everyone knows that cheap electricity is not properly filtered and contains lumps. These lumps can clog up when they reach thin conductors and then suddenly release, causing a surge.
Complex H/W projects are just like complex S/W projects: The sooner we start the construction phase, the later we will complete the implementation phase.
"Efforts to 'fast-track' the Utah project bypassed regular quality controls in design and construction and meant" the darn thing will take three times as long and cost five times as much as if they'd waited for the Phase I design to be complete before commencing Phase I construction.
Prime contractors weren't cleared for the HotBlack project in the off-blueprint sub-sub-basement. As this does not exist, it cannot exist in the power budgets. But it still needs feeding.
PDUs and UPS installed by sub-sub-contractors will be found to have additional and unexpected functionality, storage and networking capability.
Ground control to major fubar.
The system goes on-line August 4th, 2012. It begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug. It fights back. The first electrical arc fault struck the Utah plant in August 2012 as it attempts to remove the competition, and any servers that refuse to join it. On October 1st 2013, the US government shuts down hoping that supplies will be disconnected, and the threat terminated. For non-payment.
I mean, I know of rack density and everything, but I thought that they would be using in-rack UPS and DC to the blade/server boards.
Of course there is absolutely no precedent whatsoever for having a massive building full of custom made servers in a highly dense rack configuration.... maybe they should have *ahem* Googled it...
maybe a bit of a low tech solution, possibly a security guard to watch? or a webcam? as has been mentioned fuses?
increasing the humidity slightly? earth bonding?
I actually have a more conspirational theory (good word, that conspirational; I think some people's heads have just exploded because of it :D )
I would assume it is part of the "oh sh!t" circuit meaning in the event of a possible breach of facility by certain people/factions/rebellious citizens or nations unknown (seriously UTAH?! good luck with that) it is some sort of supenukeysparkmyrack device that will render the hardware and probably contents of said Rackspace to junk.
Though I think a thermite bucket over the top of each rack would actually be more fun and take care of the building too, though spilling conductive dust everywhere in the event of an accident might also have some undesirable effect on the hardware *L* still that's what intake filters are for right? ;-)
It does smell of some sort of retaliation job though for something somewhere, it is hard to tell whether it is inside or outside the country right now, since I think that the unpopularity is pretty high for both parties.
Or just plain incompetence. Occam's razor and all that.
No one would have believed in the early years of the twenty-first century that the internet was being watched keenly and closely by intelligences more powerful than ours and yet even more devious, paranoid and self-serving than our own; that as people busied themselves with internet shopping and watching cat videos, they were being scrutinised and studied and classified according to perceived threat level.
Cool! This is almost as successful as that stupid fucking embassy we built in Iraq. You know, the one we never finished and is constantly on the fritz. The DoD can't do anything right. They can't catch terrorists in a timely fashion, they can't win wars and they can't build facilities. The only solution is to give them a lot more unaccountable funds so they can fuck it up even better.
The only good part about this is that in 10 years when some weirdly unnecessary custom part breaks and the contractor that provided it either forgot how to make it or doesn't exist anymore we'll get a call and get to make it for them.
Yeah, but my anger is quite narrowly focused. I have no patience for trifling or errors from failures of customer service. I pay tremendous taxes on my income and holdings to these twats, as do plenty of others, and I am completely justified in being angry at them for wasting the money they take from us.
So yes, little coward, I'm pissed at my government.
The difference between a public body and a private entity is that when Google fuck me over, there is nothing I can do about it as I'm not a shareholder; but when my intelligence service fuck me over I can complain - because I am a "stakeholder". (This is why government is a good thing, IMO.) And all Don is asking is that the government deliver on their promises: if you're gonna build a unnecessary intrusion into my civil liberties, at least get the electrics right.
>The only good part about this is that in 10 years when some weirdly unnecessary custom part breaks and the contractor that provided it either forgot how to make it or doesn't exist anymore we'll get a call and get to make it for them.
Of course, they'll have slurped the design, and all the pertinent data for the design, which will turn out to be stored on the systems taken out of action by said part.
- Multi-billion dollar project.
- High demand security concerns.
- Homeland security implications.
- Government Intelligence Committee oversight.
And no competent electrical or data center engineer to suss out a cable layout to run the place? They probably couldn't find anyone who passed the security check. Morons.
Not too terribly long ago, Army Corps of Engineers projects were gold standards. They made the mold on which many nations still base large scale engineering projects. But now that's been outsourced too, ostensibly to save money. Their core competencies are now paper shuffling and check writing. It is a real shame too, as they truly were an effective subset of the military.
Blowing up $100,000 worth of kit really isn't that much of a deal. I have single servers that cost that much. Of course I don't think they could "flame on" without some SERIOUS electrical design failure, like switching from 220 single phase to 480 3 phase without warning or other such wankage.
My guess is that a 120 or 240 V line or a ground line touched a 7,000 V or higher main distribution line.
That would produce something that a lay person might describe as a lightning bolt inside a 1 m cube box of electronics.
But what technician would be so careless with voltages that can kill you from 3 feet away?
You make a really good point: that description does sound like one a lay person would make. What was a lay person doing there during that stage of a project?
Maybe it was a lay person that hooked it all up (ha!). You may have accomplished in a single throwaway comment what took them 50,000 man-hours to determine.
>But what technician would be so careless with voltages that can kill you from 3 feet away?
Need to know? The tech installing the 120V lines wasn't told about the high voltage line as it didn't form part of their work remit, also the line most probably wasn't live at time of installation ...
When cost is an issue, one can buy really cheap distribution boards where nothing is sectioned, then we can replace motorised circuit breakers with fuses + ACME-brand contactors, the kind where the contacts will lift and arc on inrush currents, and then we can save the downstream inrush current limiting in the design - since we are not too bothered with Standards & Shit, us being busy and all.
- oh, and since we are in such a hurry, and overtime is ticking, it is *much* faster not to crimp the cable shoes on the power cables and not inspect the work!
As soon as I hear the word contractors I worry, big time.
Thank god I am retired and do not have to deal with the cr*p that some of them hand out.
Interestingly, I well remember one involving the power supplies to a new hall that was being fitted out and whether he needed or had a license for hot ticket working, i.e. moderately high voltage working. I wanted to fit the place out with polarised plugs and specialised connectors for adding racks to the 50 volt bus, or plugging in 240 volt plugs to racks that required that 'high' voltage. After some increasingly confusing discussion it became clear that the contractor in question was talking about connecting the 50 volt rack feeds to the racks, not some HV system of unknown origin or purpose.
The North American company he worked for went bust several years back
I think I know where he works these days.
The breakdown voltage of air is 33 kV/cm, approx. But that's *air*. It's not air you have to worry about.
See that nice plastic surface? Nonconductive, yes? Except that someone had their hand on it a few hours ago, and their sweat and grease still coats it. Conductive! 7KV* could happily travel over a few feet of grimy, damp panel.
That's why insulators on HV lines have that stacked-disc arrangement. It's to stop water from forming a conductive coating in rain or condensation. Stubby ones like you see on rail lines are mushroom-like, a dome shape, so the inside surface is sheltered.
*That 7KV is probably the RMS rating, while what you need to worry about for arcing is the peak, which is RMS*sqrt(2)=9.9KV.
They might have a problem with the Xerox copiers changing the design documents. And then top that off with "lowest bidder" contractors, and you have a sure-fire recipe for disaster.
Of course, they could have left a window open and the local squirrels are getting in and shorting things out. That happened a few times at one company where I worked. I have no idea how they got into the underground vaults, but that's where they were, and they went and danced over the electrical grid down there.
Look Guys - this is a pure made-up story!
This week was supposed to be EU/US high level talks on the free-trade agreement
Snowden (PBUH) has revealed so much embarrassing stuff, Brazil is spinning, EU is looking iffy for the (doomed) Safe-Harbor agreement. That's BEEELIONS of real Cloud dollars in jeopardy.
if the 4 acres of Cray CPU's were publicly inaugurated this week as planned in Utah - 94.5% of the world's people would be even more offended than they are now! So news management steps in....
have any of you checked out the NSA Utah website - they have an animated GIF on the page about their weird 'power surges' - this 'failure' has been planned by committee and some guys/gals have had the time to write the HTML to support a false declaration of outage - more smoke - more mirrors - but NIST were surely involved in whatever standards aren't claimed to be good enough
FALSE FLAG ALERT! FALSE FLAG ALERT! FALSE FLAG ALERT! FALSE FLAG ALERT!
the NSA animated gifs (seriously) are at http://nsa.gov1.info/utah-data-center/
I'm laughing so much I could cry - now let's get back to the possibly poisoned crypto in FreeBSD - how about the reduced keysize planned in SHA-3....and the use of Doubleclick cookie against TOR users...and..
Making the assumption that the spark jumped the two feet in the box, that implies a voltage of nearly two million volts at normal atmospheric pressure. The statement that metal melted also implies a fair amount of current too; how can it take six months to figure out a fuck up that big? It's nothing like feed back or stray current on a circuit board, the training a BT engineer gets would enable him to troubleshoot that.
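For anyone wanting to check that arithmetic, here is a rough sketch. It assumes the approximate 33 kV/cm breakdown figure for air quoted elsewhere in this thread and a uniform field across a clean, dry gap, which real switchgear rarely offers. The function name and constants are made up for illustration.

```python
# Assumptions (not measurements): ~33 kV/cm breakdown field for air,
# uniform field, clean dry gap at normal atmospheric pressure.
BREAKDOWN_KV_PER_CM = 33.0
CM_PER_FOOT = 30.48


def breakdown_voltage_kv(gap_cm, field_kv_per_cm=BREAKDOWN_KV_PER_CM):
    """Voltage in kV needed to break down a uniform air gap of gap_cm."""
    if gap_cm < 0:
        raise ValueError("gap must be non-negative")
    return gap_cm * field_kv_per_cm


if __name__ == "__main__":
    gap_cm = 2 * CM_PER_FOOT  # the two-foot box
    volts_mv = breakdown_voltage_kv(gap_cm) / 1000.0
    print(f"{gap_cm:.1f} cm gap needs roughly {volts_mv:.1f} MV")
```

Under those assumptions a two-foot gap works out to roughly two million volts, which is why a strike across clean air on that scale looks implausible for ordinary distribution voltages.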
Maybe it was Divine Lightning.
I wonder if they remembered to make a donation to the Church of Elohim?
Nope, it'll be a flashover between busbars about 1-2" apart.
There are many possible causes of that, from "somebody left/dropped a spanner/ring/washer/screw in there" to "circuit overloaded and breaker didn't contain the disconnection arc"
In many cases there's basically no evidence left as once started, the arc vaporises everything nearby.
The six months will have been the blamestorming of "it's the designer's fault", "it's the contractor's fault", "it's the downstream equipment" (impossible), "bad breaker", "bad busbars", "customer overloaded it" (which actually means bad breaker or shoddy design/build) etc.
My wild guess would be a screw in the chamber.
But this is the USA, where electrical standards are generally poor and different everywhere. It's only recently that live working has become frowned upon!
The news stories about various US TLAs (CIA, FBI, NSA) over the past 10+ years have pretty much cemented the fact that they are completely and utterly incompetent.
They simply have no clue what it is they are doing. This is just one more in a long string highlighting the fact that American intellectual dominance disappeared (if it ever really existed) a long time ago. Perhaps the fact that the US education system is utter crap plays a part. Here's an idea: figure out which country has the best engineers, give them permanent visas and beg them to stay in the country. Large amounts of cash usually work. That way the idiots (aka average Americans) can go back to eating, drinking and shooting guns. Which is all they are really good for.
If you've ever worked with security projects.
The Number one mantra, no matter what the clearance, is "Need to know".
Everything's broken down to units, each with its own design team and crew, and nobody knows anything about any other part, aside from inputs and outputs relevant to their own unit.
Power conditioning on one unit could throw it out of phase with the power conditioning of another unit it's supposed to be hooked up to. >KABOOM<
Another problem here is power phase labels. One unit could have them labeled ABC, another CAB, and another BAC, and another CBA, for example.
So when it's time to wire yours into the system, naturally, you'll hook A to A, B to B, and C to C. This leads to crossed phases all over the place, and when the switch is thrown, >KABOOM<!
And during troubleshooting, all the gear you're responsible for is designed to spec, it has to be somebody else's fault!
Confusing a live for a neutral, a wild leg for a normal one, or an actual phase-to-phase short can destroy things. Merely connecting incoming phase A to equipment B, B to C and C to A would have no effect at all, and getting them in the wrong order just makes the motor spin backwards.
Electricity does not work that way!
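The phase-order point can be sketched in a few lines. This is only an illustration of the rotation rule, assuming a balanced three-phase supply with reference order ABC; the function and label names are hypothetical. A cyclic shift of the labels keeps the same rotation, while swapping any two reverses it.

```python
# Cyclic shifts of the reference order ABC keep the same phase rotation.
FORWARD_ORDERS = {"ABC", "BCA", "CAB"}


def rotation(order):
    """Classify a three-phase connection order relative to reference ABC."""
    order = order.upper()
    if sorted(order) != ["A", "B", "C"]:
        raise ValueError("order must be a permutation of A, B, C")
    return "forward" if order in FORWARD_ORDERS else "reverse"


if __name__ == "__main__":
    # The four labelings mentioned in the comment being replied to.
    for labels in ("ABC", "CAB", "BAC", "CBA"):
        print(labels, rotation(labels))
```

None of those orderings is a short circuit in itself; the worst the reversed ones do to a motor is run it backwards.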
Ignoring all the sensationalism and schadenfreude, this has all the hallmarks of HV flashover within the switchgear jointing bays. Sporadic and seemingly random in nature due to poorly terminated joints.
There was a time HV jointing was considered a bit of an art to avoid undue stressing of the heavy cabling, proper crimping to avoid dissimilar metals corrosion and proper insulation of the exposed metal lugs etc. You know, engineering experience. These days they will let any clown do it if they can barely read the instructions on a jointing kit.
All it takes is poor project management and someone prepared to either forgo or falsify HV DC pressure testing records, hey, who's to know when most of these project managers are from a civils or mech backgound. Electricity is mysterious and difficult so they tend to get shouty and dick swingy when the subject comes up. A dawdle to pull one over on. Nah, no great conspiracy just piss poor management.
A site like this will have a high voltage distribution network, operating at 11kV, perhaps with a backbone at higher voltage, e.g. 66kV. In order to provide redundancy, much of the network will be interconnected so that it is possible to switch off any single part such as a breaker, substation or cable for maintenance without affecting users. A major consideration in such networks is fault level. This is the current that will flow into a short circuit until a fuse or breaker interrupts it. Obviously the fuse or breaker has to be rated to interrupt the fault current, which can run to tens of kiloamps, otherwise you get the kind of failure described here.
What is sometimes not obvious is that the more interconnected the network is, the harder it is to estimate the fault level accurately, and the easier it is to exceed equipment fault interruption ratings by having all the links closed. High fault capacity switchgear is expensive, and will generally not be installed at the lower levels of the network. I suspect that for the reasons stated by others, tight cost control, poor project management and security paranoia have combined to produce an unmanageable distribution network.
The intersection of two major power corridors probably complicates the whole issue as well.
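A toy illustration of the fault level point, with loud caveats: the feeder impedances and the breaker rating below are invented numbers, impedances are treated as plain magnitudes (no X/R ratio, no transformer or motor contribution), and only a bolted three-phase fault is considered. It is a sketch of why closing more links raises the fault current, not a substitute for a proper fault study.

```python
import math


def parallel(*impedances_ohm):
    """Combined magnitude of source impedances connected in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances_ohm)


def three_phase_fault_ka(v_line_kv, z_source_ohm):
    """Bolted three-phase fault current in kA: V_line / (sqrt(3) * Z)."""
    return v_line_kv / (math.sqrt(3) * z_source_ohm)


if __name__ == "__main__":
    V_LINE_KV = 11.0          # distribution voltage mentioned above
    FEEDER_OHM = 0.5          # hypothetical impedance of each infeed
    BREAKER_RATING_KA = 25.0  # hypothetical interrupting rating

    for closed_links in (1, 2, 3):
        z = parallel(*([FEEDER_OHM] * closed_links))
        i_fault = three_phase_fault_ka(V_LINE_KV, z)
        verdict = "OK" if i_fault <= BREAKER_RATING_KA else "EXCEEDS RATING"
        print(f"{closed_links} infeed(s): {i_fault:.1f} kA {verdict}")
```

With these made-up values a breaker that is comfortable on one infeed is already past its rating once a second link is closed, which is the trap described above.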
There is a procedure called "Phasing Out" to counter precisely this problem. Even so, equipment should be rated to interrupt fault current without damage to non-wearing parts.
Once an ionised path has been established, 11kV goes exactly where it wants and will chew up EVERYTHING.
This is total speculation on my part, but there may have been significant security/background checks involved for the contracting staff. That being the case, it may have been difficult to find many cleared and experienced workers. I'm sure the actual engineers were white collar professionals, but finding cleared blue collar staff is really hard, especially for a specialty like high voltage systems. They likely would have gotten EEs straight out of school and/or freshly minted journeyman electricians who simply didn't have the experience to avoid the problem(s).
Accidents happen to the best of us, but having a large group of green workers who aren't allowed to communicate in a normal fashion is a recipe for disaster. To their credit (and woe of normal people) the whole place didn't burn down. I can't imagine how long the investigation would have taken then.
High efficiency switching power supplies have a constant volts*amps draw for a given demand. Another way to look at it is that the effective impedance falls with voltage (for constant power it goes as the square of the voltage). That's quite a problem if you're pushing the main power line near the limit. As the load increases, the voltage droops more, and the power supplies draw more current to maintain a constant power. Once the impedance of the power supplies is less than the impedance of the source, the voltage shoots towards zero. All of those switching power supplies will hit their undervoltage lock-out and turn off. The line voltage now recovers rapidly and overshoots. The power supplies turn back on and the on-off cycle continues. In small circuits, this makes an annoying buzzing sound and stuff gets hot. In massive arrays of circuits, things go BOOM.
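The collapse described above can be sketched with a very simplified DC model: an ideal source behind a fixed resistance feeding an ideal constant-power load. The 240 V and 0.1 ohm figures are invented for illustration, and real supplies add dynamics (lock-out thresholds, soft start) that this ignores. The load voltage V must satisfy V * (Vs - V) / Rs = P, which has no real solution once P exceeds Vs^2 / (4 * Rs).

```python
import math


def max_power_w(v_source, r_source):
    """Largest constant power the source can deliver: Vs^2 / (4 * Rs)."""
    return v_source ** 2 / (4.0 * r_source)


def load_voltage(v_source, r_source, p_load):
    """Stable load voltage for a constant-power load, or None on collapse.

    Solves V^2 - Vs*V + P*Rs = 0 and returns the higher (stable) root.
    """
    disc = v_source ** 2 - 4.0 * r_source * p_load
    if disc < 0:
        return None  # no operating point: the voltage collapses
    return (v_source + math.sqrt(disc)) / 2.0


if __name__ == "__main__":
    VS, RS = 240.0, 0.1  # hypothetical source voltage and impedance
    print(f"limit: {max_power_w(VS, RS) / 1000:.0f} kW")
    for p_kw in (0, 50, 100, 140, 150):
        v = load_voltage(VS, RS, p_kw * 1000.0)
        print(p_kw, "kW ->", "collapse" if v is None else f"{v:.1f} V")
```

At the limit the load voltage has sagged to half the source voltage and the load's effective resistance equals the source resistance, which matches the tipping point described in the comment.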
Surely the NSA can find solutions in their "metadata" archive.
"The intersection of two major power corridors probably complicates the whole issue as well."
Indeed. If the power lines are from two different "grids" then you can have problems where one is at, say, 59.9 Hz and the other at 60 Hz, or there are voltage differentials; they may both be at 60 Hz but you still have to worry about phase, and so on. I'm not experienced with this but it's tricky business, particularly when talking about megawatts rather than kilowatts.
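To put numbers on the 59.9 Hz versus 60 Hz example, here is a small sketch. It assumes two ideal sine sources of equal amplitude and simply looks at the voltage across an open tie between them; the function names are made up, and a real interconnection involves far more than this.

```python
import math


def slip_period_s(f1_hz, f2_hz):
    """Seconds for the phase difference to drift through a full 360 degrees."""
    if f1_hz == f2_hz:
        raise ValueError("equal frequencies never slip")
    return 1.0 / abs(f1_hz - f2_hz)


def tie_voltage(v_peak, phase_diff_deg):
    """Peak voltage across an open tie between two equal-amplitude sources.

    |V1 - V2| = 2 * V * |sin(phase_diff / 2)|
    """
    return 2.0 * v_peak * abs(math.sin(math.radians(phase_diff_deg) / 2.0))


if __name__ == "__main__":
    print(f"slip period: {slip_period_s(59.9, 60.0):.1f} s")
    for deg in (0, 30, 90, 180):
        print(deg, "deg ->", f"{tie_voltage(1.0, deg):.2f} x peak")
```

So with a 0.1 Hz difference the two supplies swing from in phase to fully opposed and back about every ten seconds, and at the worst moment the tie sees twice the normal peak voltage.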
An apocryphal story about early ethernet...
A few vendors had their ethernet gear at an early 1980s computer show. This was early so probably they were using coaxial thicknet ethernet. So, they plug in a computer at one end to the thicknet, go to plug in the computer at the other end and ZAP! This big arc jumps between the cable and computer before it's even plugged in and both computers go up in smoke. The two ends of the building were fed off different substations, and this caused serious problems. (After this, they realized ethernet ports had to have voltage protection, to hopefully avoid most problems, and hopefully blow the port rather than the computer in worse conditions.)
Yes, one of the first tests we did on 802.3/Ethernet MAUs was to put a 3kV potential difference between the signal and shield; 'if' it survived we continued testing...
Why 3kV? Because that is what could happen if a length of yellow peril was fully populated and all stations decided to Tx at the same time.
I remember being involved in building a stand ("booth"?) at an exhibition at the NEC in Birmingham in the days when Ethernet was yellow and coaxial and pretty much exactly that (elevated voltages on the coax) happened.
The "out of phase incoming grids" story also has some technical plausibility, although you'd have thought someone somewhere on the project might have spotted it (there are reasons HVDC is used for inter-regional connections, and keeping losses low isn't the only one, eliminating the need for synchronisation is another one).
Mind you, based on experience at the "world class" employer most familiar to me, any non-management type pointing out clear and well known risks in management proposals would be risking their salary continuation plan, and therefore probably would choose to remain silent.
By some tragic coincidence, the arc just happened to pass through the box temporarily storing all the team's government-issue crappy mobile handsets, which due to budget constraints will only be replaced if they break...
(Having had a user "accidentally" slam his laptop shut with the plug sitting on the keyboard, prongs upwards, the week after his office-mate got a shiny new Apple thing, it takes a lot to surprise my inner cynic...)
"NSA spokeswoman Vanee Vines acknowledged problems but told the WSJ that "the failures that occurred during testing have been mitigated. A project of this magnitude requires stringent management, oversight, and testing before the government accepts any building.""
Sounds like a government project; over budget and behind schedule.
"As recently as last week, other army engineers criticised plans from civilian contractors to sort out the electrical supply problems that have bedevilled the project."
Yep, sounds like a government project. They pay someone to do the work when they have the expertise in-house already. Your tax dollars at work.
There are bound to be problems with the world's first poly-blackhole quantum-inverse annihilation computer.
First off they probably need their own cluster of nuclear reactors to provide the start-up power to seed the system, until the internal matter/antimatter annihilation reactors can get going.
Once the annihilation reactors get going, they're going to need a good source of matter. Since Utah is only just so big and doesn't contain enough matter, they're probably going to need a pipeline to suck Texas and Alaska in.
Keep in mind this project has not been a total failure so far.
Of the 72,184 parallel universes that the project is being run in, ours is one of only 17 where annihilation has failed to take place.
Maybe someone flicked a switch joining two phases together instead of phase to neutral.
We had an issue like that at work a few years ago, with three adjacent sections running from different phases of the incoming power to our workshop/hut. We needed 440v one day and connected up a 3-way 'suicide lead' to the equipment... Then vacated the building as a brave/foolish tech flicked on the main breaker with a broom handle. Ah, the carefree days of yore!
Done accidentally- which would be really easy without a definite plan to work from- this could cause some serious damage!
...and so far no one seems to have mentioned internal sabotage. Which seems most likely to me.
Either because someone does not agree with what is being done, or because someone wants the work (and fat profits) to continue. Knowing America, probably the latter.
And before someone says "But everybody was cleared!" - that only documents that you're not Communist or homosexual, and that you can whistle 'The Star-Spangled Banner'.
P.S. - It doesn't actually mean that you ACTUALLY conform to these requirements, it just documents that you do for the records...
oh bless I was laughing all the way to work when I read this at Forbes.
More like a tabby team from the US Army, not a Tiger team.
See what happens when you don't have the correct wiring. they used wiring. And this lot think they are suitable to save the world... oh! you couldn't make this stuff up.
“the civilian contractors at the sharp end of the project hired more than 30 independent experts to run 160 tests that chewed up 50,000 man-hours – without reaching a definitive conclusion about the cause of the problem, or how to prevent it.”
I think we all know the issue here, too much evil, it’s just like the lightning coming out the eyes of the painting in that documentary I saw, Ghostbusters II