...the systems were in the middle of one of the tediously long forced reboots that tend to accompany patching when airtime came upon them.
A Pittsburgh, Pennsylvania television station was forced to broadcast its noon newscast from its parking lot on Wednesday because this month's Windows update wasn't installed in time. The "major meltdown" occurred during a ribbon-cutting ceremony to celebrate Channel 11's new facility, attended by executives from Cox …
I tried WPXI on YouTube, but only found a 19-second splash on the opening of the new HDTV-studio:
If anyone can come up with a recording of a flaw, it would be more fun !
Btw., if you guys bothered to read the link, there was no mention of the story from ElReg, but a lot of other, almost equally funny incidents with the shiny new system.
That's why so many systems in broadcasting run on Linux. Sometimes you have single devices running 20 or more copies of the Linux kernel.
There is just virtually no reason to use Windows for such a system. It's not like your titler or disk recorder has to run Word.
Well sounds like they indeed need competent IT staff!
On the other hand, I would by no means say the linux update solution is any better - OK, it does not reboot your system without you knowing.
I have used Suse and Ubuntu for years and:
Suse's automatic update decided to install a glibc version which was too new for the vast majority of programs running on the box. One morning, all of a sudden, not one program I tried worked, except reboot .... I mean even xterm would not come up; thank god I had one open. After the reboot it would not even boot into single-user mode, so I grabbed an Ubuntu CD!
Ubuntu? Well, twice it has broken my X server, and it even recently UNINSTALLED xinetd - of course without telling me, and throwing everything in /etc/xinetd.d away !!!!! WTF ? I am a programmer and need a couple of [x]inetd daemons running ...
Of course, I still use GNU/Linux, but Mac OS X's updater is just so much better, honest! I have the /Applications folder in my dock and a lot of apps in that folder. So to speed things up, I classify them - I have Utils, Multimedia ... folders where I put my apps. Well, iTunes is in Multimedia and the updater never replaces the iTunes.app in Multimedia, BUT just installs the new one into /Applications! WTF!
None is perfect, but the Mac OS X glitch is the easiest to live with, IMHO!
I'm no [Linux|Mac|Win] fanboy!
Fraggle, the problem is that MS decides when you'll do an update and when it is critical. Any tech who decides they must override MS's warnings is putting their job on the line whereas if they obey MS's exhortations that the World Will End Soon and things screw up, that wasn't their fault (or at least there was no way for any problem NOT to be their fault, whatever they did).
Get some decent IT staff in who can manage updates properly. With the tools MS provide to manage updates there is simply no excuse for this sorta thing. It's dickheads like these guys that give Windows a bad name.
All OSes have and require updates. I downloaded Ubuntu last week. It's under 6 months old and there were 122 'security updates' waiting for me to download and apply - which sorta blows your theory out of the water, Christian.
Linux, Windows, Mac, even Unix - all require patching. Just so happens that a large number of Windows admins are incompetent, giving the platform a bad rep.
Shite admins = shite systems, regardless of platform.
Ah, yet another problem that has Windows involved, so it must be bad Microsoft. Err, not this time. There is only one reason to patch a live system, and that is because something does not work. So you have to ask yourself why they were patching the system: are they trying to say that the system did not work before the patch was installed? If that was the case, then they should have pulled the plug hours if not days before, as the risk is too high. Did they leave Windows Update on with auto reboot? Then the system admins are just crap and should be fired. Either way, I am afraid that it is not MS at fault this time.
Although it's not unknown for large worldwide financial institutions who run Linux to use antiquated, unpatched, out of date distributions because a.) applying fixpacks can (and sometimes does) break their functionality, b.) the fixpacks do not include all fixes for every vulnerability found, c.) upgrading to the latest and greatest distro, which is more secure, is a massive undertaking whilst ensuring BAU, and finally d.) the distro vendors pull support for old distros after a while.
This is no different to running Windows in a production environment - all updates and service packs must be fully tested before deployment. If the update breaks something, then what do you do? Often it's easier to just avoid the fix than to roll out updates to system software across thousands of servers across the world.
Any kind of automated update on a production server is bad irrespective of the OS.
"Not have maintenance performed just before being required to be available unless *absolutely necessary*."
Windows seems to be perfectly capable of forcing "maintenance" and rebooting at inconvenient times no matter what options you set for update -- unless you turn updates off completely, which is also not a good idea.
They should have scrapped the news bulletin altogether and put up a photo of Bill Gates instead. OK, I know he's theoretically not "in charge" any longer, but most of this arrogance of "we know better than you when to reboot your machine even though you're running an overnight job and you've told us not to do it" came about in the days when he was.
As regards the choice of Windows in a critical environment, it's possible that management decided to buy an application that only runs under Windows (after a few business lunches with the vendors!), leaving the systems guys with no options once the PHB has committed them to this choice of OS.
But I'd like to know why they were patching the system that close to going live. You just DON'T disturb a live system without an exceptionally GOOD reason. If you leave auto-update on, then you've only got yourself to blame when it does what it's designed to do, rather than what you wanted it to do (but didn't tell it to do!). The question is really, what was he doing fannying around with a live, critical system (that would have been up for days and already grabbed automatic updates) minutes before they go live? And why does one Windows box going down cause the entire studio to end up in the parking lot??? Surely it doesn't control that much of production that they basically cannot function without it???
A competent *broadcaster* had a mission critical system exposed to the internet? WTF were they thinking?
They performed updates on a *live* system? WTF were they thinking?
Or maybe they only *have* one system? See above...
Broadcasting IT - playout systems, newsroom systems, scheduling systems etc - is a black-box technology. There's absolutely no reason to attach it to the world. If you want the world to talk to you, you drop it into firewalls and DMZs. Even the manufacturer should only be able to get access to it after an explicit permission and connection from the inside, with appropriate risk assessment and method statements properly considered before the event:
Procedure: unspecified system update
Risk: complete system failure
Likelihood might have been low but you don't run that risk without very careful thought.
And as Gordon has pointed out: was the vision switcher, the audio mixer, the lighting system, the camera controls, the entire gallery, and the talkback all running from the *same* system? This way madness lies...
Playout system goes down. It happens. You read it from a script... in the studio.
Disgusted of Tunbridge Wells
Leaving the "Why Windows?" argument aside....
What kind of moron does system maintenance or updates at a high or critical usage time? Updates should be done when the system isn't in use, or when only stuff that can be delayed is being done... Such as 2am, when even if there was a breaking news event, only a handful of folks would be awake to notice.
They clearly need a new IT team.
OK, you've got XYZ OS and it is critical to your production infrastructure. Do you:
a) Install clearly untested updates (or even worse, have your machine set up to automatically install all updates)
b) Test them properly to make sure that the updates don't screw your systems and make you look like an incompetent tit on live TV.
It's a toughie, but I'd go for B. I don't think this is a Windows problem. I've seen updates screw zOS, OS390, Solaris, AIX, HP-UX, Linux and Windows (I don't know anyone who uses OS-X in my line of work, but I'd wager you'll get updates that can knacker it as well.) The thing is that the vast majority of these updates were being tested on pre-prod systems. Yes, there were more problems with Windows updates, but when you have a company with 1 Z server, about 50 or so Unix boxes, a handful of Linux and about a thousand Windows servers, you'd expect that.
Like most OSes, Windows can be very stable if you bother to use quality hardware (Proliant, nothing else comes close), configure it properly rather than take defaults, change control it and not run software on it that is badly written or untested. You will get any system to run unreliably if you install it straight out of the box, don't bother checking updates or testing new software, run it on shit hardware and don't have change control.
The trinity of how to break your systems without human intervention... automatic download, automatic install, automatic reboot.
I don't even do these things automatically on my personal computer, let alone mission-critical servers. Having a machine "notify of updates" is as far as I'm prepared to go with automation.
(a) automatic download - when that massive Service Pack gets posted to Windows Update, your network is at risk of being flooded with incoming stuff.
(b) automatic install - all those individual updates can drag a system's performance down. Plus, should you rely on Microsoft getting a fix right first time?
(c) automatic reboot - never, ever let a critical system reboot itself without it being a planned maintenance activity (i.e. during a low-risk time window)
Get your operators to go through each system in turn rather than letting them all go nuts after Microsoft say so.
Of course, I'm talking of your handful of major critical systems here: you have to make an exception if you've got hundreds of parallel blades to manage.
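For what it's worth, the "planned maintenance activity" rule in (c) can be written down as a check an operator's script might run before it lets a box restart. This is only a rough sketch under my own assumptions: the function names, the 02:00-04:00 window and the two-hour guard before airtime are all made up for illustration and are not taken from any real update tool.

```python
from datetime import datetime, time, timedelta


def in_window(now, start, end):
    """True if the time of day of `now` falls in [start, end).

    Also handles windows that cross midnight (e.g. 23:00 to 01:00).
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def reboot_allowed(now, window_start, window_end, airtimes, guard):
    """Allow a reboot only inside the maintenance window AND only if
    no scheduled airtime starts within `guard` after `now`.

    All parameters are hypothetical; `airtimes` is a list of datetimes.
    """
    if not in_window(now, window_start, window_end):
        return False
    for airtime in airtimes:
        # Too close to (but not after) a broadcast: refuse.
        if airtime - guard <= now <= airtime:
            return False
    return True


# Example: noon newscast, overnight window, two-hour safety margin.
noon_news = datetime(2007, 10, 10, 12, 0)
window = (time(2, 0), time(4, 0))
guard = timedelta(hours=2)

overnight_ok = reboot_allowed(datetime(2007, 10, 10, 2, 30), *window, [noon_news], guard)
just_before_air = reboot_allowed(datetime(2007, 10, 10, 11, 45), *window, [noon_news], guard)
```

With those made-up values, the 02:30 request is allowed and the 11:45 one is refused, which is the whole point: the machine does not get to decide on its own that a quarter of an hour before airtime is a fine moment to restart.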
Several years ago in Chicago, the 10pm newscast got blown off the air by a circuit breaker tripped by a floor buffer. After several minutes of dead air, the weatherman resumed the broadcast from the emergency office used to broadcast tornado alerts, which has an independent power system.
Not a Windows issue - admin issue. Sounds like they had automatic updates switched on with the reboot option, well, that and the fact that IT screwed up somewhere and used the ol' "It was Windows Update" excuse (I have also used this 'viable' excuse to cover bugs in our software). If the machine(s) did go down, I would have thought it was a forced reboot rather than a crash.
As for all the "Should have used Linux" talk... No, you shouldn't. They want an easy to use system with production software that works and isn't written in someone's garage. To be honest, I'm surprised that they aren't run off Macs; the only commercially available "weather forecast display" software I've seen runs on Macs, so at least they have one good sector.
@Outcast - Damn right
This is exactly why we use Windows 2000 configured with the great XPLite (LitePC) tools. Once you remove all the superfluous junk that comes with every Windows install (solitaire on a server?), you have what is actually quite a lean, fast and stable operating system.
Once you've uninstalled windows updates, Internet Bugsplorer, Telnet server, remote admin, file&print sharing and the other assorted junk, windows becomes quite a secure operating system...
5 minutes to install and reboot...
Leaves 25 minutes.
Methinks the sysadmins are just a bunch of incompetents and decided to point the finger at Microsoft because it's "the done thing", and also to save their own jobs.
BTW, there was nothing in this month's 'patch Tuesday' that forced an involuntary reboot. Funny, there is that little prompt that you can actually press the 'Later' button.
Why is everyone assuming it was the studio production systems that went down? It could have been the security system that lets people into the building. If you can't get in, you can't broadcast, but it's less embarrassing to say it was the studio production system, as it's way more complicated than most other things in the building.
Getting performant reliability out of a system is completely unrelated to the OS and completely RELATED to the admin!
I have run Windows, Linux and Unix servers in broadcast critical environments and they have run flawlessly apart from user error! (like when I left the overnight playout system to auto-update at 2am and it rebooted! - I turned that off as soon as I got in at 6am the next day)
You cannot blame MS for this; it's the admins, who should not have been messing with updates or letting auto update do the work. Auto update on servers should always be OFF. You take a full live backup of a system which you can switch back in, in a matter of seconds, then update, test, and if all is OK you're safe, otherwise roll back.
I used to do this during my radio station's downtime - between 2am and 6am, not once I'd got back into the office at 9am!
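The backup, update, test, roll back routine above is simple enough to sketch in a few lines. Treat this as an illustration only: the four callables are placeholders I have invented, standing in for whatever backup and patch tooling a given station actually has, and the sketch says nothing about how the backup itself is taken.

```python
def update_with_rollback(take_backup, apply_update, run_checks, restore):
    """Backup first, then update, then test; restore the backup if the
    update raises or the checks fail.

    All four arguments are hypothetical callables supplied by the caller:
      take_backup()  -> returns a handle to the backup/snapshot
      apply_update() -> installs the patch
      run_checks()   -> returns True if the system still works
      restore(h)     -> switches the backup identified by h back in
    """
    snapshot = take_backup()
    try:
        apply_update()
        if run_checks():
            return "updated"
    except Exception:
        # A failed install is treated the same as a failed test.
        pass
    restore(snapshot)
    return "rolled back"


# Dry run with stand-in steps that only record what happened.
log = []
outcome = update_with_rollback(
    take_backup=lambda: "snap-1",
    apply_update=lambda: log.append("update"),
    run_checks=lambda: False,  # pretend the playout test failed
    restore=lambda snap: log.append("restore " + snap),
)
```

In the dry run the checks fail, so the snapshot is switched back in and `outcome` is "rolled back". The order matters more than the code: nothing gets patched until there is something to roll back to.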
Even in their darkest days, Macs were primarily chosen because they sit and do their job.
I recently saw a million dollar audio recording studio running OS X 10.2.8 (from 2003!) and an old version of Pro Tools without any kind of problem.
MS policies forced these people, professionals, to give up perfectly working Windows 2000 (with all service packs) and move to XP, and they are pulling the same tricks to move them to Vista now.
Using Windows in a broadcast environment may seem cheaper, but if we speak about TCO, TV stations (even local ones) may actually pay $20.000 for a single missed advertisement roll. That is the cheapest price BTW.
There are many options to control updates in a corporate environment (like WSUS) and eliminate auto reboots. In fact, if you want, a user can't install a patch even if he tries. Sure, Windows has its flaws, but when it has the lion's share of the market it is going to have the lion's share of the horror stories, even if all the platforms were equal.
Like a few people who understand the problem and aren't fanboys said:
crap admins = crap systems
Anyone else thinking of the TV station hack in Hackers?
<Norm> security, uh Norm, Norm speaking
<Dade> Norman? this is Mr. Eddie Vedder from accounting
<Dade> I just had a power surge here at home that wiped out a file I was working on
<Dade> listen, I'm in big trouble
<Dade> you know anything about computers?
<Norm> uh gee
<Dade> right, well my BLT drive on my computer just went AWOL, and I've got a big project due tomorrow for Mr. Kawasaki.....
Using Video Toaster on Windows XP has cost us printing 60 pages of EDL (offline edit text file) on a laser printer and re-editing hours of video.
Video Toaster’s fault? NO, it was traced to idiotic handling of files by Windows XP.
I mention this just in case any broadcast professionals who are on the verge of deciding between Mac and Windows are reading these comments.
I repeat: Toaster was doing its job; it was just the OS which couldn’t handle a simple, text-only file right.
- You don't disturb a live system
- they should have pulled the plug hours if not days before
- Get some decent IT staff in who can manage updates properly
All very true. Now I'm just curious about one thing : has anyone taken into account the fact that we're talking about TV ? And that they have timing requirements like nobody else ? Does anyone really think that, apart from CNN or Fox News, any local TV station has the means to have a second, fully operational IT infrastructure that is just waiting to go ?
Sure, they should. I'll bet they probably can't. And when the manager says "go", it's go - whether you're ready or not.
The only thing I find really interesting is how they were able to broadcast although their IT was down. Looks like the good old analog systems are not going to disappear yet.
Did ANYone read the source story? Apparently the control software ("Ignite!", running under Windows) had been crap from the initial install. In fact, the vendor tech reps were on site trying to fix the problems that had existed since the initial install, and the condition was getting worse!
Shame on the author of this story! Nowhere in the source article was Windows mentioned. No mention of an update, just crappy production software that didn't like to interact with the new hardware. (no mention of what hardware)
IF (and based on the initial article, it's a very BIG if) this was in fact due to a software update, two things immediately come to mind:
1. since the system had a history of failing to work properly, it may have been a production system change/update implemented by the 'tech reps' that was responsible, and
2. the tech reps may have blamed an uninstalled, noncritical patch as the issue (our software relies on the latest beta version of xxx (mdac, perhaps) and since you don't have it...)
Well, they were lucky enough to even _have_ updates for those machines. There are plenty of "embedded" Windows machines which just fail to work once you update Windows. That's also a reason why those systems should never be connected to anything else.
BTW, did you already know that a lot of TV transmitters today support SNMP?
Much of TV technology is suffering from being stuck in a no-man's land between the worlds of IT, TV engineering and TV ownership and associated finance.
Many of the old world TV engineers initially rejected the early production control arrangements which were built on general purpose computers. In fact, they rejected pretty much anything which hadn't been built with massive redundancy and well-documented, graceful failure tendencies. So many of them never gained any experience with these platforms until they were forced to, late in the game.
Meanwhile, station owners and managers have been desperate to trim costs, and the last 10 years have created many ways to replace expensive dedicated hardware or humans with general purpose computers. For the people making these computer systems, they cut costs to be competitive with one another by using off-the-shelf windows and other components, just like the rest of the IT world.
So the new shape of hell involves old-school video engineers who don't trust the new kit and don't want anything to do with it and often refuse to take responsibility for it while retreating back to specialties not yet replaced by general purpose equipment. Management solves that problem by tossing IT whiz kids into the mix, and they typically have no understanding of a production schedule or the impact of a mis-timed update.
There you are, with the near-perfect storm where management literally can't afford to do it the old way with reliable dedicated gear and quality human talent. Yet it stumbles forward with two (often hostile) classes of engineers with different mindsets and operational priorities working to er, patch it together. Toss in a "free" installation of new robots by the vendor's own team and you can guarantee that the resident crew left to deal with it won't understand the cable markings or operational control.
As for the parking lot broadcast? It feels safe to guess that the studio had installed robotic cameras in the newsroom, and without a working control system or trained/rehearsed operators on staff, it probably made more sense to activate their emergency remote broadcast plan. Note that this would also clear the studio to give engineering more time to clean up for the 5pm.
Is there anyone in the Windows IT field who doesn't know about the second Tuesday of each month?
By at least 10 AM Pacific time on Patch Tuesday, the updates can be manually installed. No one is compelled to wait for automatic updates.
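Since the schedule is just "second Tuesday of the month", anyone planning maintenance around it can work the date out ahead of time. A minimal sketch using only Python's standard library; the function name is my own, and it deliberately ignores the time of day and time zone of the release.

```python
from datetime import date, timedelta


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


# Example: October 2007 started on a Monday, so the first Tuesday
# was the 2nd and the second Tuesday was the 9th.
patch_day = second_tuesday(2007, 10)
```

Feed the result into whatever change calendar you keep, and nobody has an excuse for being surprised by the updates turning up.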
Despite my best efforts, I have not been able to come up with any Paris Hilton or Britney Spears angle. However, if one delves back into the 1920s, one can find the Clara Bow angle.
She was known as (I'm SO ashamed) "The IT Girl!"
Old geezer slowly shuffles across room and fetches coat...
I'm forced to use a MSXP pro system, and I had it running several critical tests for new hardware,
I had automatic updates turned off, because I was tired of the damn thing spontaneously rebooting, or nagging me to do so.
Guess what I discovered this last Wednesday AM?
Yeap, POS had rebooted itself for a "critical security update" and reset to full auto update.
Linux/BSD/UNIX/Solaris realize I'm running something besides a games box, and don't do this crap!
Perhaps it was a situation of over priced hardware from companies that depend on XP and think that it is fantastic. These companies also release vaporware all of the time. Can you imagine having 8 different XP systems talking cryptic crap to each other and by the way there is no documentation for most of this.
Or it could be just a really stupid mistake that they blamed on Windows because that is what they see others do all of the time.
....It's an XP "server" sitting between a huge Newsroom and the Production Control Room. It parses data from the Newsroom over IP and then ends up controlling the Audio Mixer, Video Switcher, Graphics, Teleprompter, Cameras, Audio Server for sound bites...etc.. It does pretty much everything but lighting.
I've been around the broadcast industry a long time and have heard of similar meltdowns with this type of technology....but what's interesting is that it's common for non-automation systems to crash as well. For example, the Video Switcher that this automation system uses is also a Windows machine...even though it looks like a highly sophisticated piece of hardware taken out of a space ship. A station without automation has a heck of a hard time getting a clean newscast on air if a Switcher fails. They sometimes fail because people will plug a laptop into them and transfer some graphics...or they land it on the network to load software...
Windows has snuck into almost everything in Broadcast....an industry that traditionally used Linux etc....
Maybe that trend will start to reverse after more accidents like this.