IBM has launched its next-generation System z mainframe, the zEnterprise 196. Now we will get to find out, in the next few quarters or so, if the mainframe business still has some legs and can grow or the Great Recession of 2008 and 2009 has permanently knocked it down a peg or two. At the launch event today in New York, IBM's …
Why Don't They
..use/adapt QEMU and emulate S/390, x86, SPARC, PA-RISC, Itanic and all that on POWER. Emulation/Translation now approaches Native Speed.
The S/390 CPUs are artificially overpriced to pay for total system R&D costs anyway. They are a kind of dongle for the mainframe software. Just adapt the POWER/QEMU pricing so that it results in a competitive MIPS/$ rate.
IBM could then say
"We run ANY operating system (zOS, AIX, OS/400, VM, Windows Server, Linux, Solaris, HPUX, BSD, MacOS X*, Haiku,...) on our hardware."
"You can use each and every processor for any operating system and workload".
The current solution is the typical kind of hodgepodge which only The Pointy-Haired invent.
*at least technically. Legally they would have to talk to His Royal Highness, The Steve of California before doing that.
Hmm, IO is not near native speed, and since this is a mainframe we are talking about, it's pretty darn important... Which is also where a lot of the R&D goes - making sure the IO paths are all blazingly fast.
For that matter, how fast would all the mainframe-specific CISC instructions run emulated on a RISC CPU? Not very, would be my guess.
Maybe leave the solutions to the pointy-haired IBMers in future.
Indeed, I/O performance is one of the major strong aspects of the mainframe. CPU throughput has never been impressive; the "Seymour Cray One Man Show" beat the hordes of IBM engineers on a regular basis in terms of CPU performance.
But I/O is actually the easiest to virtualize/translate. A POWER system can have exactly the same I/O facilities (I/O Processors/Channels, DASD, Fibrechannel, Infiniband, 10G ethernet, DMA etc) as a current mainframe. What I am suggesting is to just remove the S/390 CPU but not necessarily all other mainframe hardware components.
My point was that it does not make sense to develop a POWER and an S/390 CPU at the same time. As POWER is the fastest CPU architecture around, even translated CISC programs would have excellent performance. Various projects at HP, DEC, Apple, Transmeta and now QEMU proved this. Some even argue that translation can optimize code "on demand", based on a real workload, and achieve higher throughput than a statically optimized program. It is well known that profile-driven optimizers achieve the best results.
DEC achieved competitive performance with their Alphas emulating (actually dynamically translating) x86 CPUs using FX!32 technology.
I was under the impression that the z/Arch CPUs were modified POWER CPUs anyway, just with different decode front ends and/or microcode.
So, if they're already using a POWER core, but instead running z/Arch instructions in hardware or microcode... that's better than having to translate to POWER instructions, and then to the micro-ops of the real CPU, no?
That would imply they can perform some sort of translation inside the CPU. Transmeta did this, but they could apparently not compete with "real" x86 processors. Also, their technology included some very sophisticated software performing the translation and I have never heard of that with Power or S/390 CPUs which are part of production systems.
My guess is that IBM reuses ALUs, caches, branch prediction logic etc from Power CPUs in S/390 CPUs and the other way around. Microcode can only be used for infrequently used instructions, as that makes them *really* slow.
What I suggest is very similar to Apple's CPU transition strategies. They successfully supported 68K code on Power and the Power code on x86 with acceptable performance penalties.
Marketing talk about the "System of Systems" jargon
IBM's Mainframe was made to run z/OS and that's the only system it runs effectively... not just due to HW or OS reliability, but largely because of all the IT staff surrounding it.
Let's face it, for just one mainframe box running some LPARs for critical workload processing, there are dozens of highly qualified (and well paid!!) IT professionals working to keep it healthy. Since they don't have to support thousands of different machines, most even know in detail the source code of a legacy COBOL application which supports their company's business. Within the Unix and Windows environments the situation is not even remotely comparable... a support team's productivity is often measured in terms of "hundreds of machines per employee", thus making it impossible to provide the same effort for supporting a critical system....
The integration of POWER and x86 servers is being marketed as if it were part of the mainframe technically speaking, when in fact it is nothing more than normal racks interconnected through Ethernet... The advantage of integrated management using the HMC is basically software, so besides a strategic reason I can't see why it couldn't be done with Blade Chassis sold separately and connected in the same way...
By the way, the idea of the RAIM memory sounded very interesting at first... hmm, a RAID of memory... but then I remembered that even simple 2-socket Intel x86 machines have provided memory mirroring for some time; isn't that similar to RAID 1 as well?? I couldn't find further details, so maybe it is something like RAID 5, but then I also wondered... is there a RAIM controller for that, or does the processor have to spend cycles managing it? Would that controller have cache!?? Well, never mind...
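For anyone wondering what a "RAID 5 of memory" would even mean, here is a minimal sketch of plain XOR parity, which is the basic idea behind RAID 5 style recovery. This is purely an illustration of the guess above, not a description of how IBM's RAIM actually works; the function names, the three "channels" and their contents are all made up for the example.

```python
from functools import reduce

def xor_parity(blocks):
    """Bytewise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recover the single lost block by XORing the survivors with the parity."""
    return xor_parity(list(surviving_blocks) + [parity])

# Three made-up data "channels" plus one parity channel.
channels = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = xor_parity(channels)

# Pretend channel 1 has failed and rebuild it from the other two and the parity.
recovered = rebuild_missing([channels[0], channels[2]], parity)
```

Mirroring (RAID 1 style) needs a full second copy; parity like this needs only one extra channel but has to do the XOR work somewhere, which is exactly the "who spends the cycles" question.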
"the rock-solid IBM mainframe environment"
I'm not disputing the reliability of the IBM mainframe environment (though you can get pretty close to it on a well-configured Unix/Linux or even Windows box, if you understand what you're doing), but it's heavily dependent on IBM's total control of the hardware and OS (no third party drivers for this beast). I don't see how plugging an Intel blade into this box will allow it to run Windows or Linux more reliably than a similar blade in a Dell or HP box.
Shops that maintain IBM mainframe environments usually have a clear dividing line between the folks supporting the mainframe and those supporting Intel servers (partly for historical, but mainly for good, practical reasons, the skill sets being rather different). Human nature being what it is, there's often some tension (not always in a bad way) between these teams - the PC guys regarding the mainframers as computing dinosaurs, and the mainframers seeing the PCs as toys while they get on with the serious computing. If you place an Intel blade inside an IBM box, the mainframers are likely to want to do the monitoring and support - cue lots of 'my server's stopped working', 'nah, our monitor shows it's fine, your problem must lie elsewhere' discussions.
Rock Solid Windows!
Mainframe reliability -> "...you can get pretty close to it on a well-configured Unix/Linux or even Windows box, if you understand what you're doing"
lol! What the f* do you do to a Windows box to make it as reliable, or 'pretty close', as a mainframe! Unix may be more reliable, but MTBF for the components in any of those servers is not in the league of a Mainframe by miles. If you come up with the Chris Miller method of making cheap Windows and Unix machines that reliable you will be a billionaire in no time. Getting anywhere near mainframe levels of availability for an app on Windows or UNIX requires a distributed design - not one box - which is where the Mainframe usually sits; it's a workload that cannot be distributed (or not easily, anyway).
Your statement about third party drivers shows you have little understanding of how reliability is designed into zOS. There are plenty of shitty bits of code running on zOS - unlike on Windows and, to a lesser extent, UNIX, they could *never* take the system down.
"What the f* do you do to a Windows box to make it as reliable, or 'pretty close', as a mainframe!"
You configure and manage it correctly. Just because you don't know how to do it doesn't mean it can't be done. I expect my Windows servers to deliver at least 99.999% availability and that's just the sweet spot - higher levels are achievable, but it starts to cost serious money and most applications can't justify it.
Of course, if you're a bank or a stock exchange or a telco five-nines isn't good enough. But they usually go for high-availability solutions such as HP Integrity (Tandem to us old timers). IBM mainframes, good though they are, are really just for running legacy code.
You don't work for Big Blue, by any chance?
I won't even bother with the software side of things.... What Windows server has components with MTBF of decades with continuous heavy use?
Five nines availability over what period: a day, month, year? We are talking about decades here, and I'm saying mainframes, not IBM mainframes (OK, so there is a bit of a large bias towards IBM in the market!). A client I work for has a Unisys mainframe, for example, that has been running non-stop for over 15 years. These are not systems made from off-the-shelf motherboards where cheap labour pops a chip / mem into, say, a Dell wintel server. Redundant power supplies, memory, IO, everything you can think of is built into a mainframe, and built in by engineers in a clean room. OK, so this example is IBM only - the eFuse technology even allows bits of a CPU to be re-routed when they start to fail to avoid system failure.
No amount of configuring is going to make a Windows box anywhere near as reliable as a mainframe. I think, though, you have confused availability with reliability in your last paragraph - I mentioned before I would not deny any well designed distributed system running Windows / UNIX or any other OS or hardware you fancy could be 100% available as long as you design for failure and the workload is suited to being distributed.
Thanks for responding - I can, to some extent, see where you're coming from. Five-nines is equivalent to a second a day or 5 minutes a year, so we tend to measure it over a rolling 12-month period.
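For anyone who wants to check the "second a day or 5 minutes a year" figure, here is a quick back-of-the-envelope sketch. It assumes a 365-day year and treats availability as a simple fraction of elapsed time; the function and variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap years ignored

def downtime_seconds(availability, period_seconds):
    """Permitted downtime, in seconds, over a period at a given availability."""
    return (1.0 - availability) * period_seconds

five_nines = 0.99999
per_day = downtime_seconds(five_nines, SECONDS_PER_DAY)
per_year_minutes = downtime_seconds(five_nines, SECONDS_PER_DAY * DAYS_PER_YEAR) / 60

print(f"per day:  {per_day:.3f} s")
print(f"per year: {per_year_minutes:.2f} min")
```

That works out to a little under one second per day and a little over five minutes per year, so the rule of thumb holds.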
Availability is the only thing that counts to the business. How reliable the hard drives are is a second-order concern: as they're in a RAID configuration, my systems are still available even if (when) one fails. I do monitor them, because I need to budget for replacement, but actually modern hard drives are very reliable with an MTBF measured in years. You may choose to believe that IBM have some secret sauce that they apply to make them even more reliable, but I believe they use the same heads and platters as everyone else, but put their own chips and connectors on them - partly to improve performance, but mainly to stop you using cheap plug-compatible parts for replacement or upgrade.
Every business should be asking itself, what level of availability does this application require? Five-nines is more than adequate for many typical apps (other factors, such as the application software itself, are likely to be significantly less reliable) - and for that level of availability a Wintel solution can suffice and be significantly more cost effective. This is the reason that there have been essentially zero new mainframe sales in the last decade, nearly all the purchases (which amount to several billions of dollars a year) have been for upgrades or replacements to increase performance or reduce maintenance and power consumption costs.
Seems we are arguing on the same side now - availability is always possible with cheap Windows or UNIX boxes. And you are right to say it's a function of the reliability of the underlying components to a degree, although designing the system to be parallel will make for a very highly available system even with unreliable components.
This is not the space the Mainframe operates in -> mostly workloads are not able to be distributed so the reliability of the underlying components is very important when you have one very large box in your primary datacentre and another parallel sysplexed into your backup DC. You would never put such a workload on one Windows box in a DC - even if it could handle the demands it would never be reliable enough. You no longer seem to be arguing against this point...
The physical storage MTBF is a pointless discussion, since disk is not in the mainframe but would be FICONd in from an attached SAN. The same SANs (DS8000s, for example) are used in pSeries and Z, so it's a different question. It's also an example where the sub-system is designed to be highly available even when the components it is made up of are not, so there's an analogy there to the overall discussion of reliability vs availability....
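To put a rough number on the "highly available system from unreliable components" point, here is a minimal sketch of the textbook formula for redundant nodes in parallel. It assumes node failures are independent and that any single surviving node can carry the whole workload, which is precisely what a non-distributable workload cannot assume. The function name and the 99% figure are made up for illustration.

```python
def parallel_availability(node_availability, nodes):
    """Availability of a system that is up as long as at least one node is up.

    Assumes independent failures; real systems share power, network and
    software bugs, so treat the result as an upper bound.
    """
    return 1.0 - (1.0 - node_availability) ** nodes

# A mediocre box that is up 99% of the time, deployed one, two and three wide.
for n in (1, 2, 3):
    print(n, f"{parallel_availability(0.99, n):.6f}")
```

Three such boxes in parallel come out at about six nines on paper, while a single box stays at two, which is why component reliability matters so much more when the workload has to live on one machine.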
The mainframe is back...
Hmmm... impressive stats. No doubt top-end TP shops that have moaned about performance and/or the encroachment of Dell etc now have an option. Whether the z196 will actually capture more market share is debatable. If it's not a mainframe shop now, it's unlikely to change to be one unless the performance / reliability / price-point matrix is *very* compelling...
[Flashback... 5250/3270.... great days.... and no-one could hack RACF...]
Is RAIM really new? I remember testing a prototype ProLiant box in 2004ish and that had RAIDed RAM, RAID 3 if memory serves. I presumed this was 'mainframe trickle down' technology.
Also, traditionally Mainframe, UNIX and wintel/lintel have been run by separate hardware purchasing policies. For this box to be successful, it would require people from the mainframe, unix and intel teams to work together, understand each other's technologies and not start a fight by slagging each other's systems off...
I heard IBM Mainframe cpus are a derivative of POWER6, is it true? If so, that explains the high clock of 5.2GHz and it explains why the Mainframe cpus are so dog slow. You need four POWER6 to match two 2.93GHz Intel Nehalem (not Nehalem-EX, just ordinary Nehalem). A Mainframe CPU is 5-10x slower than a modern x86 such as Nehalem-EX or AMD Bulldozer. If you have 16 Nehalem-EX then it gives more CPU performance than this newest IBM Mainframe with 96 cores. That is a bit funny. No, cpu performance has never been the IBM Mainframe's strong side.
BTW, there are OpenVMS machines with 17 years of reported uptime. OpenVMS clusters are legendary for their availability and run for decades, where you upgrade one machine at a time without any downtime.