
HP revs up Integrity, Superdomes for Itanium 9500s

Now that Intel's "Poulson" Itanium 9500 processors are out and Oracle is supporting its database on HP-UX 11i v3 running atop those processors, HP CEO Meg Whitman has two fewer things to worry about. The life of Ric Lewis, the new general manager of the Business Critical Systems division who took over that job late last week, is …

COMMENTS



IT Angle

2.7x the performance of Tukwila systems isn't much to brag about... that still leaves Itanium at half the per-socket performance of first-generation POWER7 (which also scales to 1,024 cores, not 256).

Silver badge
FAIL

"......half the per-socket performance of first-generation POWER7...." If only IBM could actually build a system that could balance out performance throughout the whole design, rather than concentrating on harping on about fictional core performance and ignoring the bottlenecks that make their AIX servers no faster than Integrity ones at best. Oh, and don't mention the Power blades - an even worse case of bottleneck city, all wrapped up in a blade chassis that can't deliver enough power to actually run all the blades at full whack!

/Aint this FUD stuff fun?

Anonymous Coward

Double the cores = double the performance... again

The second generation in a row where HP... er, I mean Intel (being held by the collar by HP so they don't run off the stage), has doubled the performance of the CPU!*

*Double the performance with twice the number of cores = zero per-core performance improvement

Silver badge
FAIL

Re: Double the cores = double the performance... again

"....Double the performance with twice the number cores = zero performance improvement." Well, going on the hp press release (I haven't got a demo box to verify with yet), there is more than double the performance increase. But the actual article quotes; "We have systems in the labs that are significantly above 3X." You did read the article before dribbling, didn't you?

IT Angle

a few questions from folks stuck with a few Itanium systems

1) Is Itanium the last major chip to get to 8 cores? Better 3 years late than never I guess.

2) Does it still have the strange 5 QPI links? You need 7 for the blades and they only use 3 for superdome

3) Do they have hardware based virtualization yet? IVM needs hardware assists

4) When will HP have an SAP benchmark? three generations since a benchmark?

5) What is the performance per oracle license? Seems to be the lowest in the list.

6) After tukwila delay we were promised socket compatibility thru kittson as a reason, what happened to that promise.

7) Are they still planning to have two versions of kittson the first one handicapped to pretend there are two more chips?

to-da-loo

Silver badge
Facepalm

Re: a few questions from folks stuck with a few Itanium systems

Well, on the plus side the IBM FUD is getting briefer, even if it is still just as much fantasy as the last launch. You know IBM are worried about a competitor when they start FUDing so hard!

"1) Is Itanium the last major chip to get to 8 cores? Better 3 years late than never I guess." But the first in a balanced design without the performance bottlenecks crippling Power systems, which are already being outperformed by cheaper Xeon systems.

"2) Does it still have the strange 5 QPI links? You need 7 for the blades and they only use 3 for superdome" Meaning what? That it worked before and works now with even faster QPI links? Please, try and actually come up with a point to your FUD, Alli.

"3) Do they have hardware based virtualization yet? IVM needs hardware assists" You know that hp has had a far better partitioning story for years than Power, having full electrical isolation hardware partitioning (nPars) years ago and which IBM still can't match. There's also software partitioning (vPars) and virtual machine hosting (IVM) and shared resources (Containers). IBM simply cannot match that. I'm always amused that the IBM trolls even want to try arguing this point seeing as AIX has been playing catchup on partitioning for years.

"4) When will HP have an SAP benchmark? three generations since a benchmark?" Dunno, ask hp. Which benchmark do you have in mind as I'm just dying to point out it's probably one IBM gamed with $5m of storage doign the real work. I always prefer benchign in my own environment, with my own data, but if you want to blindly follow benchmarks then I suppose it's no surprise you buy IBM.

"5) What is the performance per oracle license? Seems to be the lowest in the list." Compared to what? Seeing as you don't have any benchmarks to compare that is pulling a statement on performance out of your anus.

"6) After tukwila delay we were promised socket compatibility thru kittson as a reason, what happened to that promise." They are socket compatible, it's just the new servers have extra motherboard tech to maximise the advantage of the new core design. Now, concentrate, and try and remeber an IBM upgrade that wasn't a fork-lift one...? Trolls in glass houses and all that.

"7) Are they still planning to have two versions of kittson the first one handicapped to pretend there are two more chips?" What on Earth is that about? Seeing as Intel haven't released any such details (I get to see the NDA stuff from hp and Intel) I'd have to say that's more info pulled out of your ar$e. Please provide a link to backup (any of) your claims. I'm betting you can't!


Re: a few questions from folks stuck with a few Itanium systems

HP instructed Intel to break Kittson (renamed “Kittson22” or “K22”) into “two sequential HP system product releases” separated by one to two-and-a-half years “with the timing as requested by HP . . . .” The obvious and intended purpose of this is to further the illusion of a longer roadmap—and again, extend the end of life visibility date that was so important to customers. In HP’s words, “HP will be able to extend the Itanium roadmap by releasing a follow-on to Kittson about 2 years later (dubbed K22+ for now)” which will “[e]xtend our BCS and TS profit pool longer (this takes us to about 2017).” Importantly, the “second” Kittson chip (K22+) will not reflect incremental development and functionality. Under the agreement, the aggregate functionality of the two releases is established first, with HP retaining the right to withhold known Kittson functionalities until the ostensibly next-generation chip. HP did not reveal any of this to the marketplace.

19. The new agreement also clearly allows Intel to disinvest in Itanium, immediately. A key part of this is that K22 is to be a “Xeon socket compatible” microprocessor. That means that Intel is only developing a new Itanium “core,” which will then be combined with Xeon components (“uncore”) to create the full chipset solution. The typical reason this is done—and the reason here—is that it is cheaper, here for Intel, to reuse uncore from another product (Xeon) rather than build specialized uncore for Itanium.

Silver badge
Happy

Re: Re: a few questions from folks stuck with a few Itanium systems

LOL, Alli is mainlining the IBM FUD! I particularly like an IBMer trying to FUD the idea of a speed-step when that is all the Plus versions of the Power chips are! What, can I claim IBM have "given up on" Power just because they released a Power7+? Truly a desperate attempt at FUD. Look at the Xeon roadmap, look at the "tick-tock" design philosophy - a new microarchitecture on the tock, then a die shrink on the tick - and you'll see the K22+ idea is simply business as usual, as used on the previous generations of Itanium. IBM really are getting desperate!

I also like the bit where the long-term hp plan of Xeon and Itanium socket-compatibility, first discussed back in 2002, is somehow "disinvesting"! Could it be because the benefits to Intel's partners that make both Xeon and Itanium servers - having only to develop one line of servers for both - won't be available to IBM, because IBM will still have to pay the extra to develop a completely separate Power server line alongside Xeon ones? Well, that's if IBM haven't sold all their x64 server bizz to Lenovo by then. Come on, Alli, go give your Elmers a kicking and tell them they need to do a lot better than that effort. If anyone is "disinvesting" it would seem to be IBM's marketing people as, going by this poor effort, they obviously can't pay for serious competitive analysis.

Of course there is the other option - IBM aren't FUDing Kittson too hard because they plan to have their own future Xeon servers also have Itanium chip options. After all, they do need a way to get off the Power bandwagon before it gets completely overtaken by x64. And the last time they shipped Itanium servers, without even trying, they sold 10,000+ X445 units DESPITE only selling it as an option when their customers said no to Power. Just think how much IBM might sell if they actually didn't have to worry about the politics of the Power lobby? Maybe even half as much as hp.

Mushroom

Re: a few questions from folks stuck with a few Itanium systems

Matt has gone completely insane.

My questions are not IBM questions, they are customer questions.

I jokingly copied the kittson remarks from Oracle's website and he thinks it's from IBM marketing. Google the kittson comment and you will see they are Oracle's. The Oracle rep here is still telling us they are doing the minimum development possible, and since hp-ux is not getting any new licenses, customers would be insane to put new releases on something that does not have critical mass 'cause they might get nuked.

Then Matt goes on to say IBM might have Itanium servers in the future. Matt, are you on vacation in Colorado smoking something? Last I checked the only o/s's running on itanium were HP-UX, VMS and Tandem non-stop. I am sure there is some obscure OS somewhere, since they show Bull, NEC, Hitachi, some Chinese vendor or other. I think NEC is really just an HP OEM so they don't count.

Socket compatibility with Xeon is divestment from Intel, not a customer benefit, since they broke their promise of socket compatibility from tukwila to kittson.

Well, so much for my weekend in FL; gotta catch my flight back to NY. I miss FL

Too-da-loo

Silver badge
FAIL

Re: a few questions from folks stuck with a few Itanium systems

"....I jokingly copied the kittson remarks from Oracle's website...." Oh, so you simply switched to mixing Oracle's FUD in with your usual IBM FUDfests? I can't say I noticed any increase in either relevance or reality.

".....The Oracle rep here is still telling us they are doing the minimum development possible....." The minimum is that Oracle is contractually bound to deliver their software on Itanium for hp-ux. There is no such binding contract for AIX. In fact, hp-ux and OpenVMS on Itanium are the only OSs with the guaranteed availability of Oracle software. I'm sure the US legal authorities would love to talk to your Oracle rep about his statements as they seem to be in breach of the US court judgement of September 20th 2012; "A California judge has ruled that Oracle's decision to drop support for the Itanium processor in future versions of its database products was a breach of contract, that Oracle is required by its agreement to continue to develop products for Itanium "until such time as HP discontinues the sales of its Itanium-based servers," and that "Oracle is required to port its products to HP’s Itanium-based servers without charge to HP.""(http://arstechnica.com/information-technology/2012/08/hp-wins-judgement-in-itanium-suit-against-oracle/). Looks like your Oracle FUD is just as easily debunked as your IBM FUD, will you try Dell's next?

"....... Last I checked the only o/s's running on itanium were HP-UX, VMS and Tandem non-stop....." A very carefully crafted answer, as you should be well aware that IBM was on the Itanium bandwagon before they realised in 2002 it would kill their mainframe golden goose. A version of AIX 5L was booted on Itanium, IBM even sold forty licences for it before they killed the project, and it wouldn't take much development work to get the current releases working on Itanium.

".....since they broke their promise of socket compatibility from tukwila to kittson....." But they didn't. They are socket-compatible, it's just that you need the new mobo's with their additional tech to get the advantage of the new Poulson CPUs. But it's not like this line of FUD is new, it's the same with all Intel chip releases, you get one or the other from the IBM Elmers - if the new CPU fits in the old socket in exactly the same way the IBM FUD is "there is no development, that chip is dead"; and if there is any change then the IBM FUD is "they have not kept socket-compatibility". LOL, so predictable, and just as lame. Seriously, if this is the best FUD you can get then IBM seem to be the ones disinvesting in marketing.

"....Too-da-loo." Ah, I think I have just spotted the source of your new FUD - the loo!

Anonymous Coward

Re: a few questions from folks stuck with a few Itanium systems

I don't think you can call it "Oracle FUD" when Oracle takes exactly what HP executives are writing to each other about scamming their customers and then copies/posts it. I suppose it does create Fear, Uncertainty and Doubt... but completely justifiably. If you are on Itanium and don't have any doubts, you are either on the 4 person Itanium development team or you haven't been paying attention for the last decade.

Silver badge
FAIL

Re: a few questions from folks stuck with a few Itanium systems

"I don't think you can call it "Oracle FUD" when Oracle takes exactly what HP executives are writing to each other about scamming their customers and then copies/posts it....." But that's not what Oracle did. As the judge in teh case ruled, they took unrelated quotes and info, wound it into a completely incorrect story, and then tried to use it to influence the market and sell their own servers. The outcome was not what Oracle intended - hp's Integrity line is now the ONLY one with a guarantee of availability of Oracle software. IBM's Power does not have that, even Oracle's own CMT range, nor Fujitsu's SPARC64.

".....you are either on the 4 person Itanium development team....." From the article; "That's because some of the largest financial networks employ Itanium-based servers in one form or another, as do most of the Fortune 100." Seem a lot of us customers are quite fine with Itanium, and we're in the Fortune 100 before you ask.

Silver badge
Boffin

"....what happened to the i3 generation of machines....."

Apparently there is an Intel chip with that name and hp didn't want journalists getting confuzzled. According to our hp salesgrunt, they also wanted to stress the doubling of the number of cores per socket, hence i2 to i4.

Headmaster

Well.....

Nice article!

IMHO, the problem for HP is going to be that SD2s are going to compete against POWER 770s and 780s, and that is going to BE tough on price, as you are competing against a midrange system with your best system.

Although I think this is a nice upgrade, it is too little, too late.

// Jesper

Silver badge
WTF?

Re: Well.....

"....SD2's are going to compete against power 770 and 780's...." Why? The modular 770 and 780 can be outmatched by the hp Integrity blades. To go up against the SD2 with Poulson IBM will have to go to the expensive 795 and still not have an answer to either hp's superior partitioning options.

Alert

Re: Well.....

Show proof that Itanium can compete with POWER7+ and is not 1/3rd the performance. Everything we have seen from IBM and HP would say POWER7 cores are 2.5x the performance of Tukwila cores and POWER7+ will be at least 3x the performance of Poulson cores. When do Poulson systems ship? We got delivery of POWER7+ systems last month.

Do the math and SD2 can only compete with the 770, not with the new 780 POWER7+ box.

Still trying to find out why the Tukwila/Poulson chips have 5 QPI links, since there are not enough for the blades and only 2 (3) are used on SD2. Just seems bizarre.

Silver badge
FAIL

Re: Well.....

".....show proof that itanium can compete with Power7+...." Actually, I'd like you to show a real World case where this is true.

"..... Everything we have seen from ibm and hp would say Power7 cores are 2.5x the performance of tukwila cores and Power7+ will be at least 3x the performance of Poulson cores....." What, you mean those really realistic IBM benchmarking sessions, where they switch off all the CPUs in the systems except one, but keep all thirty-two sockets of cache switched on to distort the performance figures? Or the ones where they have $5m of short-stroked storage doing the actual work in the backend? Yes, we all know how believable the IBM benchmarks are. The truth is IBM will not backup their benchmarks with guarantees. If you ask IBM to guarantee you will see the same performance on site then they suddenly start backtracking and making all types of excuses. I know, I have asked them.

The truth is, despite the Oracle media hype during the courtcase, and despite IBM's best efforts to both FUD Itanium and poach hp SD customers during the case, hp still kept on selling Itanium servers. With the court judgement removing the Oracle threat, and with the main plank of IBM's attack removed, it's pretty easy to see that hp will soon be winning sales of SD2 against 795.

".....5 qpi links since ther are not enough for the blades...." I love how you keep repeating this statement without supplying any actual technical argument as to why! Do you really believe that if you repeat your fondest wishes enough some fairy godmother will alter reality to make them true? Sorry, darling, that's not how the real World works.

Holmes

Re: Well.....

You've gotta be kidding Matt.

Yes, I've also seen the HP sales manual for how to sell against POWER; that's what you get from partnering with everyone. That doesn't mean that it's right.

Do you seriously think that an 8-socket Poulson blade with limited IO and limited memory can compete against a POWER7+ based POWER 770/780 with 16 sockets, full-blown IO and memory? And hotswap of everything?

That is in Kebabbert territory of fanboiship. You should know better.

I think the HP Poulson blades look like a solid product, but they're not remotely in a position to compete against POWER 770s or 780s.

Now I have no doubt that the bl890 i4 will be able to beat the PS704 blade, but again that is a 2.4GHz POWER7 system, in a form factor that I would never use for enterprise computing.

And you say it yourself.. partitioning... POWER stopped doing partitioning back in 2005. You know ... it's kind of a bugger to claim that you have the best moped when trying to win a race against someone in a race car. Nobody ... except SUN and HP talks about partitioning anymore... and HP does actually have a decent virtualization option.

come on....

// Jesper

Devil

Re: Well.....

"where they switch off all the CPUs in the systems except one, but keep all thirty-two sockets of cache switched on to distort the performance figures?"

Eh? You need to go and read some manuals, Matt. When turning off cores in a POWER7 chip, you are also turning off the part of the distributed L3 cache that is held close to those cores. Kind of a bugger, IMHO.

And as unfair as I think the whole Oracle versus Itanium story is.. it has cost Integrity; it will still have the smell of "dead man walking", which I take absolutely no pleasure in stating. But again.. people don't buy hardware, they buy solutions to run a software stack on, and when the higher parts of the solution stack state that they don't like the lower parts... then it's the lower parts that have the problem.

And as for guarantees, that is typical EDS sales tactics, and we've seen that in action a few times.. Sure, we got our performance guarantees wrong by a factor of 2.. dear client.. here are some more HP x86 blades for you as we promised, oh.. yes, you have to pay for us managing them.. and the Oracle licenses?

Now it's also you who has to pay the extra cost there.. check your contract.

Guarantees are not about having the best hardware but about having the most scrupulous lawyers.

// Jesper

Silver badge
Happy

Re: Re: Well.....

".....You've gotta be kidding Matt....." You wish. Unlike Alli, I can back my statements up.

".......Do you seriously think that a 8 socket Poulson Blade with limited IO......" How is the IO limited? A BL890c i4 will have 12 PCIe mezz slots, each of which can hold quad-port cards if required. And that's on top of the built-in Flex LOMs which give 16 Flex-10 converged network adapters. A four-box p780 has to start adding expansion IO bays in a second rack to get even close to the single BL890c i4's IO capability, and then still does not have built-in CNAs. For the P780 you have to add in dual-port PCIe FCOE cards to get a CNA option, and doing so reduces the number of PCIe slots available to other adapters - only six per box unless you add expansion IO cages. And when the P780 adds IO expansion cages it does not add bandwidth but merely spreads the same bandwidth across more IO slots. That's not scale-out IO, that's inevitable IO contention.

".....limited Memory....." <Yawn> The existing BL890c i2 can do 1.5TB with the old style memory, and it looks like the new RAM options for the BL890c i4 will at least match the maxed out P780's 4TB.

".....a POWER7+ based POWER 770/780 with 16 sockets...." Ah, but you didn't mention the limits on your P780 when you start trying to load it up to sixteen sockets. To have octo-core CPUs you have to drop the speed down to 3.8GHz, and then those CPUs only come with 4MB of L3 cache per core compared to the 4.2GHz CPU's 8MB. But then that's a trick in itself - what IBM do is supply a proecessor card for the P780 with two octo-core 3.8GHz CPUs which you can then run in either MaxCore mode (as a slower octo-core with 4MB cacher per core) or TurboMode (four core's switched off, the remaining four cranked up to 4.14GHz and sharing the cache out at 8MB per core). Once again, another IBM design, just like the IBM blades, where you can't have speed and scale becasue the system can't supply enough power to run all the cores flat out and keep them cool. Once again, the P780 is another case where IBM forces you to choose either scale or performance, but you can't have both. Hilariously, IBM tries to sell this inability to deliver as a feature! With the new eight-core Poulson Itanium CPUs actually drawing LESS power than the Tukwila quad-cores you can still maximise both scale and performance with the hp option, being able to use the fastest octo-core Poulsons without having to switch cores off or choose less cache and less CPU grunt.

"....POWER stopped doing partitioning back in 2005...." I think that's more of a case of IBM throwing in the towel when they realised the limitations of the Power architecture meant they could not match the hp option of nPars (true share-nothing hardware partitioning), vPars (software partitioning), Integrity Virtual Machines (hosted VMs) and now Containers (shared OS image supporting multiple sandboxed instances). Please do explain to Alli that IBM gave up in 2005 as she seems to be running about eight years behind everyone else.

Silver badge
Pirate

Re: Re: Re: Well.....

Oh come on, Jesper, I've been waiting all day for you to fall into my trap, please get a move on! What, one post stopped you dead? Even Alli keeps on FUDing longer!

OK, let's take it another step and look at what Turbomode on the P780 actually does and - more importantly - how, and then why IBM tries so hard to keep the realities of Turbomode a secret from their customers. First, look at how IBM sells Turbomode, it's all about "increasing performance by increasing the core frequency". By how much? A paltry 7% increase in clock. Yet IBM will talk wildly about 10-20% increase in single-threaded applications (just don't ask them for a guarantee of this). So how can a 7% increase in clock make that much difference? Truth is it doesn't, all the increase in performance is in the increase in cache and memory per core that you get when you switch half the cores off. Instead of eight cores per socket fighting for the same cache and memory bandwidth it's now divided between four cores. It probably also helps that the congested IO structure on the P780 is eased when half as many cores are trying to drag data through. In truth, the 7% clock hike is just there to make you think the cores are generating the extra oomph, when the reality is it is just IBM smoothing out the bottlenecks in their system.
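A back-of-envelope sketch of that argument, using only the figures quoted in this thread (3.8GHz/4.14GHz clocks, 4MB/8MB of L3 per core, eight versus four active cores per socket); the numbers are the posters' claims, not vendor data:

```python
# Where does the TurboCore/Turbomode gain come from? Ratios computed from the
# figures quoted in this thread (not from IBM spec sheets).

maxcore = {"active_cores": 8, "clock_ghz": 3.8,  "l3_mb_per_core": 4}
turbo   = {"active_cores": 4, "clock_ghz": 4.14, "l3_mb_per_core": 8}

clock_gain  = turbo["clock_ghz"] / maxcore["clock_ghz"]            # ~1.09x
cache_gain  = turbo["l3_mb_per_core"] / maxcore["l3_mb_per_core"]  # 2.0x
mem_bw_gain = maxcore["active_cores"] / turbo["active_cores"]      # 2.0x per core

print(f"clock per core:            {clock_gain:.2f}x")
print(f"L3 cache per core:         {cache_gain:.2f}x")
print(f"memory bandwidth per core: {mem_bw_gain:.2f}x")
# The clock moves by well under 10%, while cache and bandwidth per core double,
# which is the point being made about where any single-thread gain comes from.
```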

And that's just the start of why this is a big secret that IBM tries to hide. The Power architecture has awful bottlenecks and it is a truly terrible piece of design when switching off half the cores actually makes the overall system faster. But the even bigger secret IBM will not tell you about Turbomode is the impact on licensing, and if you bring this up in the sales conversation it will really make your IBM salesgrunt cry. Since Alli is so fond of mentioning Oracle, maybe she'd like to explain what happens with Oracle Enterprise licensing when you switch off half your cores? No? OK, I'll tell you - Oracle doesn't give a f*ck and still charges you for all eight cores per socket! Yes, that's right - if you want the same thread count and plan on using Turbomode it will not only mean doubling up on the P780 modules you need (extra hardware cost) but also effectively doubles your licensing cost. And that is why BL890c i4 will win against P780 in Turbomode, because the savings on Oracle licenses and support costs will pay for another server!

Trollface

Re: Well.....

Actually you are not backing up your claim.

And first, before I start, I have the utmost respect for the BL8x0c i2/4 blades; these are great products, but it is what it is, a blade server.

Memory.

Well, the memory you can buy now in a bl890c i4 is 1.5 TB; many of the HP Itanium servers have always had good memory sizes. Sure, they needed it because of the bloated code, but that is another story. In a POWER 770-MMD it's 4 TB. Both machines will most likely double the amount of memory they will be able to hold, but the POWER server has a trick up its sleeve, and that is memory compression. And with the POWER7+ that is done in hardware outside the core. Sure, it's not a miracle cure, and although it supports up to a factor of 10 and IBM marketing crap material will put in big numbers, we've seen good results ranging from 1.5-2.

And then there is HPVM; let's see what the sizing guide says: oh yes, set aside 18% for virtualization overhead, and then there is the usual few percent for bloated Itanium executables. So your 1.5TB of RAM is all of a sudden down to 1.2TB. Not that a POWER server doesn't use memory for virtualization, but not in the ranges that HPVM does.
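A quick sketch of the arithmetic behind that 1.5TB-to-1.2TB claim; the 18% figure is the sizing-guide number quoted in the post, and the "few percent" for executables is assumed here at 2%:

```python
# Usable memory after the overheads claimed in the post: 18% virtualization
# set-aside plus "the usual few percent" for executables (assumed 2% here).

installed_tb = 1.5
virtualization_overhead = 0.18
executable_overhead = 0.02   # assumption for "the usual few percent"

usable_tb = installed_tb * (1 - virtualization_overhead - executable_overhead)
print(f"usable memory: ~{usable_tb:.2f} TB")   # ~1.2 TB, matching the post
```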

IO.

Come on. You have 1 QPI link going off each chip to a single Intel IO chip; that is a total of 8 QPI links, versus the 16 GX++ links in the 770-MMC, and I haven't seen the MMD layout but potentially it can go to at least 32 links, as each chip has two GX++ busses.

And the BL890c i2/4 cannot even hotswap an adapter.. honestly.. it's a blade.

And as for the number of adapters... the POWER 770/780 can house up to 184 PCI-e x8 adapters.... versus 12 for the bl 890c i4. Honestly.. how can you even start to compare.

And your ramblings about limits in the POWER 780.. RTFM. Sorry, not to be harsh but ....

As for Processor power..

Again, according to Intel the specintrate2006 numbers for Poulson are ~2.3 times those of Tukwila, which suggests a result of less than 1250 specintrate2006 for a bl890c i4 blade. And that sure is good documentation that the bl890c i4 blade is just as fast as a POWER 780-MHD, which does a documented 6130 specintrate2006.

That difference is so big...... that it's really not a contest. You should start to be a bit more sceptical about marketing material.
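The scaling arithmetic behind those figures, laid out as a sketch; the Tukwila-blade baseline is simply back-derived from the post's own numbers (1250 / 2.3), and both results quoted are the post's claims, not independently verified:

```python
# Back-of-envelope version of the comparison above. The bl890c i2 baseline is
# implied by the post's own figures (less than 1250 divided by the ~2.3x
# Poulson uplift); the 6130 figure is the POWER 780-MHD result as cited.

poulson_uplift = 2.3
projected_bl890c_i4 = 1250                                  # post's upper-bound projection
implied_bl890c_i2 = projected_bl890c_i4 / poulson_uplift    # ~543
cited_power_780_mhd = 6130

print(f"implied Tukwila blade baseline: ~{implied_bl890c_i2:.0f}")
print(f"projected Poulson blade:        <{projected_bl890c_i4}")
print(f"cited POWER 780-MHD:             {cited_power_780_mhd}")
print(f"gap claimed by the poster:      ~{cited_power_780_mhd / projected_bl890c_i4:.1f}x")
```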

Furthermore the bl890c is a blade... it can't hotswap memory, SMP links, Processors, IO, it doesn't have redundant clock, service processors and and and...

Now virtualization...

The fact that you still think partitioning is king just shows that you don't get it. Sure, partitioning has advantages when it comes to limiting and controlling software licenses, but the whole electrically isolated carving of a machine up into smaller, more inefficient bits is.. well.. so 90s. Sure, HPVM is on the right track, and far superior to anything that Oracle is doing, but it's not in the same class as POWERVM; it's a freaking HPUX OS running guest HPUXes. It's not like people are doing different firewall zones, mixing and matching test, development and production inside the same VM host without blinking, as people are on POWERVM. Try to get out in the real world a bit, you've been doing too many x86 blade solutions.

// Jesper

Trollface

Re: Well.....

Amazing ... you haven't even understood the difference between TurboCore mode and Intelligent Energy Optimization (which corresponds to Intel's Turbo Boost). There is no such thing as something called Turbomode for POWER.

And you build a whole post up around this.. amazing...

TurboCore mode is mostly useless. Unless you have bought a fully loaded system where you only enable half the cores, in which case you have the option of letting the cores you have activated run at a constant higher frequency.. IMHO not particularly useful. But again.. it's not a feature that costs anything. It's IMHO mostly marketing bull. But hey, it seems to have worked.. it got you all fired up.

And I am very well aware of the fact that Oracle will charge you for the full 8 cores, which IMHO is not that unfair... you have bought all the cores.. and I don't think that Oracle recognises activations as a hard partitioning technology.

// Jesper

Silver badge
Happy

Re: Re: Well.....

".....the memory you can buy now in a bl890c i4 is 1.5 TB...." Not very convincing start when you start with a mistake! You can't order BL8x0c i4 models yet, only the i2 versions. Which does kinda blow a big hole in the rest of your comparisons.

".... and that is memory compression...." Oh dear, do you really want to go there? I'm sure you don't want to tell any potential customers about the performance hit you get when you try compressing memory in p-series.

"......lets see what does the sizing guide says oh yes set aside 18% for virtualization overhead...." What, first generation Integrity Virtual Machines? Times have moved on. But how about you compare to hp npars or vpars, which don't require any overheads, which can be booted, started, stopped and patched independently of each other, then try and find an IBM equivalent? Don't mention Lpars as PowerVM means all your eggs are in one basket. And does PowerVM run on smiles and pixie dust? Er, no, it requires a service partition which is an overhead, just like Integrity Virtual Machines. And when you need to patch that PowerVM service partition? All the PowerVMs have to come down then!

"....as each chip have two GX++ busses...." Each chip might, but both chips on a module are connecting to the same two port bus on the back of the module. You also need to add in more power (and cooling) into your rack for the P780 expansion cages, whereas with the BL8x0 blades the power (and cooling) is in the blade chassis and adding mezz cards and/or more chassis switches does not mean running more power cables into the rack. Oh, and don't forget the cost of those external switches you have to add to your P780 solution to get to close to the same capability as are already in the blades chassis. Compared to hp's i4 Integrity blades P780 will be pricey and use far too much rack space, and when compared to Superdome2 with Poulsons it will be slow and use far too much rack.

"....specintrate2006...." And we're off into the usual hidey-hole for IBMers, the Benchmark Zone! Not to be confused with the Twilight Zone though there is a similar amount of smoke and mirrors and a marked lack of reality. Why does an IBMer like talking about specintrate2006? Because it is as far from the reality of daily enterprise computing as possible. If your job entails running specintrate2006 all day, then you might be interested in actually running a specintrate2006 comparison for a Poulson-equipped BL890c i4 (hp haven't yet, so Jesper is just guessing). Let's ignore that the hp blade Jesper does want to compare results with was not only running the older CPU but also wasn't even running the fastest version of that CPU, so his scaling for Poulson performance is already way off, but instead ask is your company is willing to buy a P780 and run JUST ONE CORE? Not one CPU, but only one core is used in the SPECint tests. Now, what happens on Power when you run one just one core? You get all the cache and all the memory. Does any customer in the World run their business software on just one core of a fully-stacked P780? Of course not. But even if it was your business to run nothing but specintrate2006 all day, you couldn't actually run the same test as the software used by IBM at the time of the test (Oct 2012) isn't available for sale ("Software availability" is Nov 2012 in the test report but not listed on the IBM website). So not only is Jesper's use of the benchmark a farce, he couldn't even reproduce that test for a customer that wanted it!

".....That difference is so big...... that it's really not a contest...." Well of course it's not, seeing as you actually haven't even run the test on Poulson! Maybe you should wait until the Poulson systems are orderable and then buy one to test. Better still, try some real World benchmarking with real data and applications. I would never think of making a purchase without doing just that, but I can see why you would prefer pushing IBM benchmarks.

"..... it can't hotswap memory...." That one always makes me smile. IBM will say they can hotswap EVERYTHING! Then ask them to do it on a live system with penalties in the support contract if it falls over. Guess what they ask you to do then - stop the system!

Then you have to consider what happens with the P780 when you hotswap items such as adapters, because they can't offer Virtual Connect or any similar capability on either LAN or SAN on P780. When you hotswap an HBA on a P780 you have to first stop and move all the SAN devices connected to it if you want to stay connected - but your app probably won't like that, so you have downtime for your app even if the box is still up. After you "hotswap" it you then have to go through all the SAN switch zoning and change all the WWN entries before you can re-start your application. With LAN "hotswaps" on P780 it's the same story, the MAC address changes. In short the so-called hotswap is just the start of a larger administration exercise. On the hp BL8x0 blades I can use Virtual Connect and FlexFabric to ensure the same WWN and MAC addresses are still presented to the same slots, meaning I can actually change the WHOLE server out and still have a new one boot up with the same identity WITHOUT any changes to the LAN or SAN. But I suppose the extra admin tasks keep IBM Global Services admins in work. And, at the end of the day, seeing as BL890c i4 is likely to be a lot cheaper than P780 even without you spending twice as much on licenses for the P780 cores you can't even use, I can always build a cluster of two BL890c i4s and still save on the cost of a P780 solution.

"......partitioning has advantages when it comes to limiting and controlling software licenses...." What, after you cost the customer TWICE in Oracle licensing because you had to turn half the cores of to make the P780 go faster, now you want to NOT talk about license savings? LOL! A lot of the UNIX market out there is consolidating older systems onto new and more powerful ones, which means partitioning if you are to get the best utilisation. Of course, seeing as you don't seem to care about license costs you probably don't care about utilisation either. Sure, there are customers out there that actually want 256 cores in one OS instance but they're a much smaller amount of market than the ones doing consolidations. And for those that do want large instances hp will simply put up a SD2 option, probably two for the same cost as you wasted on core licenses you can't even use on your P780!

Silver badge
Happy

Re: Well.....

"....There is not such thing as something called Turbomode for POWER...." Apologies, I admit it's been a few months since the IBM salesgrunt got laughed out of the room for suggesting we try it.

".....TurboCore mode is mostly useless...." Really? You posted a specintrate2006 benchmark that used exactly that, a fully-stacked P780 with only four cores per CPU active, i.e. in TurboCore mode! In fact, it seems to be the preferred way for IBM to configure their system for all benchmarks. So what you're saying is that all IBM benchmarks are useless? Glad we agree on something at last. Care to explain why you think a system where you have to turn off half the CPUs to get it to go faster is a good design when it is very obviously a move to surmount bottlenecks in the system? Surely you would like to think that more cores would mean more processing power, not less, especially when IBM like to harp on about their superduperfast cores?

"...... it's not a feature that costs anything....And I am very well aware of the fact that Oracle will charge you for the full 8 cores......" LOL! So you simply chose to ignore the point on Oracle Enterprise licensing being per core for all eight cores in each Power CPU even when you switch off half of them? That would seem to be a feature that costs a small fortune! Looking at the list price for RAC it's $23000 per core without software upgrades and support - ouch! So that's $736,000 dollars flushed down the drain for your benchmark P780 in TurboCore mode just on RAC, what about the rest of the software stack? No wonder you don't want to talk about it.

Thumb Down

Re: Well.....

Oh.. just what I needed after a terrible day at work with dorks that

"Not very convincing start when you start with a mistake! You can't order BL8x0c i4 models yet"

Big clients get privileges; you can be quite sure that if I call HP and ask for 100 BL890c i4 blades, fully equipped, I'll be able to put in an order, you can bet I would. And with regards to how much RAM the bl890c i4 can have versus the POWER7+ based 770/780, I am stuck with what information is out there.

A part of my job is to evaluate and interact with different vendors, this includes HP, IBM, Cisco, Oracle and others, hence I talk about what is out in the open space. Not the different vendor NDA info I have.

"Memory compression"

Actually it performs quite well, especially on POWER7+; sure, there isn't anything like a free ride, but it actually cuts down on memory bandwidth needs. But again, it's not for all software stacks. But it works.

"18% HPVM/IVM overhead"

The source is current and valid, no bull here, and it's even with 64K page size. Again... not 4K pages *hint* *hint* So no worst-case bullshit from my side, I am being deadly honest and serious. And no, we do not run 6.1 yet. An n-1 strategy works. I know that if you look at sizing a partition then it's 8.0-8.5% you have to allocate. But for the whole server the sizing guidelines from HP say 18%. It is what it is.

"nPars/vPars"

Sure they require an overhead. It's a partition layout overhead, but it's still an overhead.

One of the big advantages of big servers is that you are able to 'pack' the virtual machines better. This is first year computer science Matt. And you know it.

"POWERVM"

Service partition? POWERVM? Ehhhh... it can do IO virtualization through X number of Virtual IO Servers, no single point of failure *hint* *hint* if you choose to do so. You don't have to. And POWERVM can do processor pools, which basically is partitioning, especially good for limiting licenses, but which is still under the shared processor pool concept. We had a client who financed two POWER 795s in license savings, by simply having me and their WW lead architects sit down and design things right.

The whole logical/virtual/physical processor abstraction layer actually protects against errors, simply by the fact that it puts in another real abstraction layer between the hardware and the OS. Hence it's basically not the OS that primarily has to worry about failing hardware.

AFAIK it's much the same functionality you get in HPVM. So your whole talk about vPars and nPars just shows that you are out of touch with reality.

"Each chip might, but both chips on a module are connecting to the same two port bus on the back of the module."

No. I don't know what you are talking about. Again, you need to read a manual. I think I recognise that exact wording... isn't that from the HP attack kit against POWER servers? It looks like the line that is meant to counter MultiChipModules. But again, it's not of the slightest relevance to current POWER7+ servers.

And sure, to house 184 adapters you need IO drawers. Which means you can have close to 800 physical LAN/SAN ports, which is so far beyond what you can have in a bl890c i2/4 that you are being ridiculous. But generally the internal IO is more than enough. So the modular design means that you are not limited to what you can put in a blade chassis.

Furthermore, if you've touched any serious recent literature on datacenter design you would know that having high heat density blade chassis, be it IBM/HP/DELL, will have serious consequences for your datacenter. I remember 6 years ago when I was in a project where a client insisted upon buying blades, due to the huge savings; the only problem was that their datacenter couldn't handle the heat density, so they built an extra cooling unit just where the blades were. Which kind of ruined their business case. And just as a side story, I remember their HP hw technician was furious because the dorks hadn't taken airflow into account, so the temperature in their racks of rx4640's (and IBM and SUN eq also) actually meant that they violated their service contract.

And besides your whole premiss that a BL890c i4 is just as fast as a POWER 780 is a load of bull.

Now your whole rambling about specint.. is just ... well.. you can't keep using "Oracle won't let us do benchmarks" as an excuse all the time. And I am relating to the Intel-released numbers, which are not done on recompiled code... Again, Poulson is a new compiler target for compiling code for Itanium. So we should see better numbers.. perhaps up to the factor of 3 that HP is claiming.. but it's still way behind POWER7+. And you don't want to talk about RL performance... or price/performance.. it just makes the case for POWER better.

As for hotswap.

Sure.. with the skill you are displaying with regards to POWER, then surely they will ask you to stop the system. That actually sounds rather responsible of the HW technician. If your client doesn't seem to have a clue, then don't put their system in jeopardy.

When I design solutions where RAS is of the essence, then you can be damn sure that I'll involve my IBM/HP/SUN whatever hw technician(s), get his/her opinion, and let them know the system they are to service in the future. The knowledge base that those people can draw upon is far greater than what I have access to. So I would be a fool not to involve them. And sometimes my HP/SUN/IBM hw technician will say, I know that the manuals say xxx but we have a gut feeling that .... so we think we need to close down the machine; then I'll not hold it against them, on the contrary, I'll know that I am dealing with a professional vendor.

And we do hotswap memory and cores and hot-patch microcode etc. on POWER, but it's not something you just do... you have to know what you are doing.

And your ramblings about hotswapping adapters on a POWER 780 are ... well, laughable.. they have nothing to do with reality. Basically the way you are describing is how you would have done things on a POWER4-based p690 or an SD. It has nothing to do with how things are done today. Again, you are talking blade servers versus old old iron. RTFM Matt. Don't project the shortcomings of how you did things, back when you actually had anything to do with systems, onto your competitors' current offerings.

And even if you want to run with physical adapters then there is something called MPIO.

But again we run virtual machines with up to 168 logical cores on a slice of a POWER 780, with fully virtualized IO, that does several GB/sec of IO sustained. Sure such large single system images needs a bit of special care, but it's not a problem.

As for licenses.

Why do you keep on talking about TurboCore? Don't you get it? You are rambling. We do not have a single system in production that runs TurboCore; sure, the few 780s and 795s we have can do it, but again, it's a feature code that doesn't cost you anything, so why not order it. You are trying to use the worst-case scenario on POWER, a scenario that no sane architect would ever use.

Your comparison is fundamentally flawed; consolidation means partitioning? Why.. why not just simply virtualize? I have POWER 770s and 570s here that run hundreds of different virtual machines from perhaps 20-30 different clients, production, test, development... located in different firewall zones.. all on the same physical boxes.

Sure, we do need smaller machines, like for example we have some 740s, for those people that have license restrictions, but the virtual machines are more expensive on the small hardware. (Again, a good reason for using POWER is that you can get away with perhaps an Oracle Std. [One] edition, where you have to cough up for a more expensive license on Itanium.)

And you are still talking about partitioning when doing consolidation..... Amazing.. how can you stay competitive using such antiquated methods?

And as for 256 cores, who would want that? That is, 99% of the time, a sign of a bad design, but perhaps the architects where you work aren't any better?

If you really want to talk about a problem with the POWER servers, then it's that they are getting too powerful; the capacity growth rate is higher than the growth in capacity needs of many of our clients. So even for our largest clients, who used to have hundreds of UNIX servers, we are today contemplating moving them to a shared environment. And we are trying to cannibalize other solutions.

But what stands when the dust clears is that your comparison between an entry-level product like blades and high-end/midrange UNIX servers is fundamentally flawed.

// Jesper

Silver badge
FAIL

Re: Well.....

"....Big clients get privileges, you can be quite sure that If I call HP and ask for 100 BL890c i4 blades.. fully equipped I'll be able to put in an order, you can bet you I would...." Great, but you haven't which means you're just guessing. It's OK, you can admit it, everyone on the forum knows it. Well, except probably Alli as she seems to be still on the loo.

"......sure there isn't anything like a free ride....." So you admit there is a performance hit. Good. Next point?

"....And no we do not run 6.1 yet....." So you're not talking about the latest version, as I said.

"....."nPars/vPars" Sure they require an overhead. It's a partition layout overhead, but it's still an overhead....." Wrong! The nPars tech has zero overhead as it's managed by the frame, not the hardware used by the OS images. And vPars you only use the system when you create or change the vPars, which means no overhead in operation. Which means both are superior to PowerVM, which does have overhead, and you still haven't answered the original question and provided any p-series comparable tech. I'm not surprised you'd avoid that comparison though.

".....with the skill you are displaying with regards to POWER, then surely they will ask you to stop the system....." LOL! Our Power stacks are designed, POC'd and implemented with IBM, so if they are duff designs as you contend then so are IBM's best practices! It says a lot when IBM themselves have no faith in their own so-called "hotswap" tech!

"......And your ramblings about hotswapping adapters on a POWER 780...." Gosh, I was rambling, yet you didn't manage to disprove a single point! You claim it was that way on Power4, that it's not the same on P780, yet fail to actually supply any information to back up this claim. Want to try again?

".....Why do you keep on talking about TurboCore...." Well, apart fromt eh way IBM try and sell Power by saying it's so fast, look at the GHz rating, over 4GHz (in quad-core mode, not eight-core), but also because YOU insisted on basing your argument around a specintrate2006 result. That result was attained using TurboCore to work round the bottlenecks in the Power architecture. You then salted as TurboCore as "useless" - make your mind up! And now you admit "We do not have a single system that in production runs TurboCore"! So the whole range of P780 benchmarks using TurboCore is completely unrealistic. And yet you saw fit to use it as the basis of your argument that P780 was the better system..... You sure you know how to design systems for customers? How many of them did you feed that specintrate male-bovine-manure to? But I do note your whole rambling non-argument STILL does not address the licensing issue.

So, let's review - you admit you don't have a BL890c i4 to test; you admit the memory tech you tried to tell us was "great" actually has a performance hit; you admitted you were not talking about the latest version of Integrity partitioning; you failed to answer the challenge of matching nPars and vPars; you failed to answer the point about IBM not having faith in their own hotswap tech; you failed to answer the problem with adapter swaps on P780 as it can't do Virtual Connect; you fail to counter the licensing point; and finally you fail (again) to note that the tech you described as "useless" is how IBM avoid the bottlenecks in Power. Looks like you must have had a very bad day as that whole rambling post of yours was just more evasion. Try again!

Big Brother

Re: Well.....

"Great, but you haven't which means you're just guessing"

No. I am not. I don't have the prices for the individual parts, or very detailed info. But I do know which Itanium processors will be available for the blades, I know minimum and max RAM sizes etc. etc. Again, there are benefits from being a business partner.

"So you admit there is a performance hit. Good. Next point?"

Sure there is, why would I claim otherwise? If you want to criticise the POWER7 platform then it should be that at least the 'B' versions have had too much processor power compared to memory; you simply ran out of memory before you ran out of processor resources. Now here we've had very positive effects with using that excess processor power to level out the playing field.

Try looking at this SAP paper:

http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/b074b49a-a17f-2e10-8397-f2108ad4257f?QuickLink=index&overridelayout=true&51346334076479

On the 'C' versions of the POWER7 machines, things got better with 4TB RAM support, on the machines we use (770's and 780's)

"So you're not talking about the latest version, as I said"

No, you said "What, first generation Integrity Virtual Machines?" And no we don't, we do n-1. And I haven't seen any references to a change in memory overhead in 6.1. So.. nice try at a brush-off. I'm right and you know it.

"Wrong! The nPars tech has zero overhead as it's managed by the frame"

You don't listen to what I say. I'll try to cut it out into cardboard for you.

I have two SDs, let's call them SD1 and SD2, each with 128 cores.

I carve SD1 up into 16 nPars of 8 cores each; inside those I can run vPars that match the virtual machine sizes I want. On SD2 I simply run one large IVM host of 128 cores.

Now let's say I want to fit 'n' client virtual machines onto these machines, of an average size of 3 cores. Then the overhead I am wasting per pool is on average something like the average virtual machine size divided by 2.

On SD2 the 'waste overhead' will be around ~1.5 cores. On SD1 it'll be around 16 x 1.5 cores ~= 24 cores.

It's really not that complicated; this is why partitioning is not cutting edge any more. Again, this is pretty basic capacity planning. That is why technologies like IVM/HPVM and POWERVM are kicking the butt of partitioning technologies like LDOMs/vPars. And we haven't even begun talking about overcommitment yet.
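That packing argument is easy to illustrate with a small simulation; the sketch below uses the post's numbers (16 fixed 8-core partitions versus one 128-core pool, VMs averaging about 3 cores) and is purely illustrative, not a sizing tool:

```python
# Illustrate the fragmentation argument above: carve a stream of VMs
# (1-5 cores, average ~3) into either 16 fixed 8-core partitions or one
# shared 128-core pool, closing each partition when the next VM no longer
# fits, and count the cores left stranded. Numbers follow the post's example.
import random

def stranded_cores(partition_sizes, rng):
    total_waste = 0
    pending = None                     # VM that spilled over from the previous partition
    for size in partition_sizes:
        free = size
        while True:
            vm = pending if pending is not None else rng.choice([1, 2, 3, 4, 5])
            pending = None
            if vm <= free:
                free -= vm
            else:
                pending = vm           # carried over to the next partition
                break
        total_waste += free            # leftover cores in this partition are stranded
    return total_waste

print("16 x 8-core partitions:", stranded_cores([8] * 16, random.Random(1)), "cores stranded")
print("one 128-core pool:     ", stranded_cores([128], random.Random(1)), "cores stranded")
# Typically on the order of 20 cores stranded across the 16 partitions versus
# only a couple in the single pool, in line with the ~24 vs ~1.5 figures above.
```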

"Gosh, I was rambling, yet you didn't manage to disprove a single point! You claim it was that way on Power4, that it's not the same on P780, yet fail to actually supply any information to back up this claim. Want to try again?"

Are you kidding? Are you using your obvious lack of knowledge of something as basic as MPIO (that is, multipath I/O) to try to fabricate an issue? Do you want me to sit down with you and run through the whole freaking manual?

RTFM Matt.. it's not that hard. And with virtualization and NPIV and MPIO it's even easier, cause then all the devices are virtual.

"Turbocore"

No, the specintrate2006 result I mentioned was not using TurboCore mode. You just have to look it up:

http://www.spec.org/cpu2006/results/res2012q4/cpu2006-20121002-24653.html

"CPU(s) enabled: 128 cores, 16 chips, 8 cores/chip, 4 threads/core"

That wasn't that hard, was it? And no.. the POWER 795 running in TurboCore mode with 128 4.1GHz cores is not the same as the POWER 780 running with 128 cores with Intelligent Energy Optimization enabled and thus a maximum frequency of 4.1 GHz.

Again, you haven't got your facts straight; the whole premise of your argument is dead wrong.

And no, I don't use specint.... in my work. We pay a company called Ideas International to use their independent measurement figures to do sizings. For example, they have a measurement called RPE2. And I can't praise them enough. But I am not allowed to post their RPE2 numbers for the different platforms on the web. It's in their contract; you have to understand this is their bread and butter. And IMHO it's money well spent. Although I think their numbers for the Oracle T4 systems are too high.

The reason I used specint in my post here is that it is more or less the only Tukwila benchmark that HP have posted.

And btw, IBM does sell an 8-core-per-chip POWER7 running at 4.1 GHz. There is one in the Flex System p260 Compute Node.

Your "let's review"

So.. there is no licensing issue, no TurboCore issue (other than that we agree it's mostly useless), no hotswap issue, and the male-bovine-manure comes from you. And if your local IBM staff is crap, then hire a business partner or take responsibility and educate your people, or don't use their products. What do I care..

// Jesper

Silver badge
Happy

Re: Re: Well.....

".....If you want to critisize the POWER7 platform then it should be that at least the 'B' versions have had to much processor power compared to Memory...." So, as I said, it's an unbalanced design with bottlenecks, and IBM's (and yours and Alli's) constantly harping on about core speed is pointless as the bottlenecks in the design mean the core performance is neither here nor there. Thanks for confirming that point. Which is why I would suggest anyone looking to buy Power (or any server) benchmark it in their own environment, with their data, and not believe the fairytales that are IBM benchmarks and core perfarmance blather.

"....no we don't, we do n-1. And I haven't seen any references to a change in memory overhead in 6.1...." So, as I said, you don't use the latest version of hp's partitioning software and don't have proof it is still the case with the latest version.

".......Now lets say I want to fit 'n' client virtual machines onto these machines, of an average size of 3 cores....." Which just goes to show you don't understand hp's partitioning or the SD2. In your carefully constructed worst case, three cores, I would simply create an npar of three Superdome2 blades (twelve sockets) and then split those down into 3-core vPars without any wasted cores whatsoever, and no overhead. Indeed, I could go a step further and include iCAP or TiCAP cores so I can have the exact right cores at the time of implementation and then activate extra cores as required should the solution grow, adding those core online or even migrating them online between vPars. Now, do you really want to compare that to Power (no nPars)? And I bet you don't want to compare that with the wasting four cores per socket of P780 TurboCore!

".....It's really not that complicated....." Not when you know what you're talking about.

"......And with virtualization and NPIV and MPIO it's even easier, cause then all the devices are virtual....." You mean using the FCOE adapters, the only CNA option on P780, which just doesn't have the flexibility or featureset of Virtual Connect or FlexFabris? Nice try. What about the other LAN and SAN cards on P780, all of which have fixed MAC and WWN addresses? Sure, you can multipath, but it's not the same as FlexFabric.

".....http://www.spec.org/cpu2006/results/res2012q4/cpu2006-20121002-24653.html....." Oh, you mean the one where IBM had to dial down the core speed to 3.724GHz? Because that's what they have to do with the Power design, limit core frequency as they scale up, because the system cannot supply enough power or cooling for running eight cores continually at 4+GHz, and nowhere near the 4.4GHz they blather on about in their marketing slides when they implement TurboCore. And all Intelligent Energy Optimization does it let the cores run very short bursts at high frequency, and then not all the cores at once. In short, it's another IBM fudge. I didn't realise you were talking about that particular specintrate result becuase it was with RHEL (evidently it's not just SLES that is faster then AIX then), rather then the fastest 780 result with AIX which was what I linked. My sincere apologies for not realising you wanted to show that AIX is slower than that Linux distro too.

".....And no I don't use specint....." So you admit that you wouldn't use TurboCore becasue it is "useless", and wouldn't use specint in your work (so why mention it then?). Yet your argument is based around specint and you quote frequency figures for TurboCore......

"......But I am not allowed to post their RPE2 numbers for the different platforms on the web....." So, having quoted a benchmark you wouldn't use in work, you now want to refer to a benchmark you can't share.... Still not convincing! What's next, are you going to channel the dead for some performance stats? Why not ask IBM what they say about benchmarks - "All performance estimates are provided “AS IS” and no warranties or guarantees are expressed or implied by IBM. Buyers should consult other sources of information, including system benchmarks and application sizing guides to evaluate the performance of a system they are considering buying.Actual system performance may vary and is dependent upon many factors including system hardware configuration and software design and configuration. IBM recommends application-oriented testing for performance predictions." (IBM Power Facts and Features IBM Power Systems, IBM PureFlex and Power Blades October 2012). Looks like IBM don't actually put any worth in the specintrate blather either!

"....IBM does sell a 8 core per chip POWER7 running at 4.1 GHz. There is one in the Flex System p260 Compute Node...." Which is basically a specilaist blade with very limited scaling capabilities for AIX and very limited choice of specialised low-height RAM modules. The BL860c i4 will trounce it in every way. So IBM can build a two-socket Power7+ blades and run all eight cores per socket at 4.1GHz? No, they can't. Again, it's "up to 4.1GHz" in the p260 because they're using Intelligent Energy Optimization to hit the 4.1GHz figure the average speed you can actually get is closer to 3.7GHz. And the four-core option for p260 goes up to 4.4GHz per core, again, which hints at what the eight-core could do if IBM could figure out power and cooling. BTW, weren't IBM marketing talking about 5.1GHz Power7+ at launch, where did that go then?

The likely reason Power7+ has bottleneck issues is because IBM chose to use cheaper and more space-efficient DRAM for L3 cache rather than faster SRAM (which is what Poulson uses). In their mainframes they add an off-die L4 cache to get around the bottleneck of slow L3 cache, but that's too pricey to implement in p-series. Cache is very important (unless your name is Kebabbert) as you need cache to keep your cores spinning productively, otherwise you're just wasting cycles and electricity whilst waiting on memory or disk to provide data. So you want lots of cache and as fast as possible. IBM compromised and went for lots of slow DRAM as L3 cache so they could fit it inside the power and cooling envelope of the Power7+ package. To have used SRAM on Power7+ within the same package size would have meant reducing the L3 cache (meaning more cycles spent waiting for data, making the Power bottleneck even bigger) or reducing the number of cores to make room on the die and provide power for the cache (sales suicide with Poulson going eight-core and Xeon already at ten). So instead IBM compromised with TurboCore, which allows half the cores to get twice as much (slow) L3 cache. Whilst Poulson can do a full-speed eight-core with lots of fast cache, in a balanced system design, with IBM it's compromise, compromise, compromise, choose either performance or scale; they just can't seem to get round those bottlenecks in the Power design.
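(The textbook way to reason about that trade-off is average memory access time; here is a minimal sketch with made-up latencies and hit rates - nothing below is a POWER7+ or Poulson figure.)

```python
# Why both cache latency and hit rate matter: the standard average memory
# access time (AMAT) model. All numbers below are invented for illustration;
# none are POWER7+ or Poulson figures.

def amat(hit_cycles: float, hit_rate: float, miss_penalty_cycles: float) -> float:
    """AMAT = hit time + miss rate * miss penalty."""
    return hit_cycles + (1.0 - hit_rate) * miss_penalty_cycles

# Hypothetical big-but-slow L3 versus small-but-fast L3, same miss penalty:
big_slow   = amat(hit_cycles=40, hit_rate=0.97, miss_penalty_cycles=300)  # 49.0
small_fast = amat(hit_cycles=20, hit_rate=0.94, miss_penalty_cycles=300)  # 38.0

print(f"big, slow L3:   {big_slow:.1f} cycles per access")
print(f"small, fast L3: {small_fast:.1f} cycles per access")
# Which design wins depends on the workload's hit rates; neither "more cache"
# nor "faster cache" settles the argument by itself.
```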

0
1
Anonymous Coward

Re: Well.....

Chill, Jesper. Honestly, you, Matt and Alli are descending into a nonsensical rant.

> Sure they needed it cause of the bloated code but that is another story.

Why would you want to run HP-UX or AIX anyway and pay extraordinarily high costs? Except for the few folks with bags of money, the world+dog doesn't give a sh!t about UNIX in the long term.

Show me 1 workload that won't run on x86 ... and I'll show you 1000s that do...

0
0
Anonymous Coward

Re: Well.....

Wow, how can you even compare Poulson to P7+?

POWER7+ will have 3-4x the L3 cache, run at twice the clock speed, and likely have more threads and threading options, as POWER7 runs native 4-threaded SMT while Poulson's main enhancement is that it will run "up to 4 threads"... whatever that means. This is better than Tukwila, but not comparable with POWER7+ on any level. It will probably have very similar performance to the Xeon E7 that it is based on.

1
1
Thumb Down

Re: Well.....

"So, as I said, it's an unbalanced design with bottlenecks..."

*CACKLE* Now that is kind of pathetic. No, you were doing some rather incoherent rambling about TurboCore and how that means an 8-core POWER7 doesn't perform. And now that I say there was too much processing power in the earlier versions of the 'B' machines, compared to how much RAM those machines could hold, that is suddenly what you said?

Have you considered running for office in the US?

"So, as I said, you don't use the latest version of hp's partitioning software and don't have proof it is still the case with the latest version."

It's the same. The 6.1 manual refers to the same whitepaper as the 4.3 manuals. And actually there are examples in the 6.1 manuals that indicate the average memory overhead has gone up.

But at least this time they talk about overcommitment. So sorry, Matt, you are wrong.

"Which just goes to show you don't understand hp's partitioning or the SD2. In your carefully constructed worst case"

Who ever talked about an SD2? And it's not the worst case; it could have been much worse. I've seen much worse, but that was on Sun kit back in the late '90s.

But I can evaluate your proposition. First, I talked about an 'average' size of 3, not an exact size of 3.

Again, this means that on each nPar you'll waste ~1.5 cores because the fit isn't exact. Furthermore, each nPar will have to have its own iCAP and TiCAP cores rather than drawing from a single pool.

Hence on a machine with 4 nPars you have an overhead of 4 x ~1.5 cores = 6 cores, plus an overhead because you'd have to have 4 pools of iCAP processors rather than just one. Let's be super optimistic and put that waste at half a core per nPar, again because we are being nice to Matt; that gives us 2 cores.

Then there are the vPars. Again, we talked about an average size of 3 cores. That means all the virtual machines that need 2.1-2.9 cores will get 3 cores, so approximately 0.5 cores are wasted per vPar. And inside your 12-socket nPar you have 32 vPars, hence roughly 16 cores wasted.

Now, as we are talking partitions, not virtual machines capable of running overcommitted, the processor time these vPars aren't using is simply wasted: if a virtual machine isn't using its CPU cycles, nobody else can use them. That means that if a virtual machine runs at 100% utilization for 1 minute and then at 0% utilization for the next 4 minutes, 80% of that processor power is wasted. Sure, normal workloads are a bit different, but the general picture is the same.

Everybody who runs VMware, IVM, Xen, PowerVM, z/VM... knows this for a fact. So on top of the partitioning waste you are stuck at perhaps 20-30% physical utilization, rather than the 50-70% you would get if you ran a fully virtualized environment.

So basically, Matt, your SD2 here, run the antiquated way you want to do things, will have a massive overhead, most likely somewhere around 50% of all the cores inside the machine.
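(A back-of-the-envelope sketch of that arithmetic, using only the illustrative assumptions above - 8-core sockets, a 12-socket nPar carved into 32 three-core vPars, the rounding waste per vPar and per nPar, and the busy-then-idle utilization example. None of these are measured figures.)

```python
# Sketch of the partitioning-waste argument above. Every figure here is an
# illustrative assumption from the post, not a measurement.

cores_per_socket = 8
sockets_per_npar = 12
cores_per_npar = cores_per_socket * sockets_per_npar        # 96 cores

vpars_per_npar = 32                  # vPars averaging ~3 cores each
rounding_waste_per_vpar = 0.5        # a 2.1-2.9 core workload still gets 3 cores
npar_fit_waste = 1.5                 # cores lost per nPar to the inexact fit
icap_pool_waste = 0.5                # extra iCAP headroom per nPar vs. one shared pool

waste = vpars_per_npar * rounding_waste_per_vpar + npar_fit_waste + icap_pool_waste
print(f"allocation waste per nPar: ~{waste:.0f} of {cores_per_npar} cores")

# Dedicated partitions cannot lend idle cycles to a neighbour, so low
# utilization is also waste: 1 minute at 100% then 4 minutes idle.
busy_fraction = 1 / 5
print(f"cycles idle in the busy-then-idle example: {1 - busy_fraction:.0%}")
```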

If this is how you design solutions, how the hell do you stay in business... you guys need to be outsourced.

"You mean using the FCOE adapters...."

No, I mean virtual devices inside the virtual machines. You know, where with a few quick commands, a few clicks of a mouse or a script, whatever you want, you have created new virtual networks, with virtual network adapters and virtual storage adapters.

"Specint"

You blew it... just admit it. And who cares about clock speed? People care about capacity. And no matter how much TurboCore and GHz FUD you throw, it still stands that Intel's own numbers (you know, the guys who make Itanium) claim the top-bin Itanium will do around 1250 SPECint_rate2006 on 8 sockets and 64 cores in a BL890c i4 blade. And you are claiming that blade is just as fast as a POWER 780 with 16 sockets and 128 cores doing 6130/6134 SPECint_rate2006. Again... the claim is ridiculous, and the fact that you just won't admit you are wrong is... actually kind of sad.
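(Taking the figures quoted in that paragraph at face value, the per-core and per-socket arithmetic is easy to sketch; these are the claimed numbers from this thread, not independently verified benchmark disclosures.)

```python
# Per-core / per-socket arithmetic using only the SPECint_rate2006 figures
# quoted above (claimed results from the thread, not verified disclosures).

systems = {
    "BL890c i4 (Itanium 9500 claim)": {"score": 1250, "sockets": 8,  "cores": 64},
    "POWER 780 (result cited above)": {"score": 6130, "sockets": 16, "cores": 128},
}

for name, s in systems.items():
    print(f"{name}: {s['score'] / s['cores']:.1f} per core, "
          f"{s['score'] / s['sockets']:.1f} per socket")
```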

RPE2.

I really don't understand why you need to put down Ideas International, or the fact that I use an independent vendor when doing sizings for our clients. I personally think it's a great idea, and I would recommend that clients use, for example, Ideas International when benchmarking vendors against each other. It is IMHO a great investment, as it lets you cut through the vendor crap.

And sure, all performance depends... if you don't size things right and don't use the features the sizing depends upon, then things will differ. You are FUDing.

"Which is basically a specilaist blade with very limited scaling "

So... when everybody but HP is doing blades, then it's specialist... and crap. *CACKLE* Actually the 2-socket p260 blade is OK. It supports 512GB of RAM and a dual Virtual I/O Server setup.

And yes, it's 4.1GHz, not 'up to 4.1GHz with boost', which again makes your ramblings about TurboCore and speeds... kind of funny.

"The BL860c i4 will trounce it in every way."

Oh? ... *CACKLE* You do know that the p260 beats the ProLiant BL460c Gen8 with the Intel Xeon E5-2680 by around 35%? So you are actually predicting that the BL860c i4 will basically crush the current Intel Xeon processors?

Perhaps you should think a bit about that, and contact your local Intel sales representative for a confirmation on that wet dream.

Your last TurboCore SRAM cache story is... well, so far out there that it belongs in the same category as orbital mind-control lasers and the faked moon landing.

Personally, I think increasing your on-chip L3 cache from 4MB to 10MB per core is better than reducing it, as has happened from Tukwila to Poulson (6MB to 4MB per core).

Again you ramble on, without checking your facts.

// Jesper

1
1
Headmaster

Re: Well.....

Well, IMHO your comment is actually very valid.

But for large companies and people who do sourcing of IT for clients it does make sense.

We have the advantage of the sheer volume that actually makes large UNIX servers a viable option.

For example, in the UNIX shared environment I am responsible for, it's actually cheaper to buy a virtual server that runs AIX with XXX capacity units than it is to buy a virtual Windows server.

Even though our UNIX hardware platform is a huge hotswap-everything UNIX machine, compared with 2-socket, reasonably good quality x86 servers.

Simply because we have the volume to do things efficiently, and thus expensive UNIX servers make sense. So it's not for everyone.

Now, having 1000 Wintel servers that all run at lousy utilization, compared to a few large UNIX boxes, is... well... expensive. You just spend your money on something else.

// Jesper

1
1
Silver badge
FAIL

Re: Re: Well.....

"Power 7+ will have 3-4x the L3 cache...." Did you even read the spec of Poulson or Power7+ before you wrote that? If you did then you have the mathematical ability of a goldfish. Poulson has 4MB of faster SRAM L3 cache per core, whereas Power7+ has 5MB per dore of the slower DRAM L3 cache. Last time I checked, five was not "3-4x" four..... I won't even bother trying to explain the advantage the Poulson has with faster SRAM cache as that would require some technical knowledge to understand, and you obviosuly don't qualify.

".....run at twice the clock speed...." Top bin Poulson 2.53GHz for all eight cores. Top bin Power7+ 3.7GHz for all eight, with possible bursts of up to 4.1GHz for individual cores when using Intelligent Energy fudge mode. Even when they switch off half the cores they max out at 4.4GHz. Now, concentrate, it's the maths bit - what is two times 2.53GHz? Want some help? It's 5.06GHz, not 3.7GHz or 4.1GHz or even 4.4GHz. I know IBM marketing prattled on about 5GHz+ chips at teh Power7+ launch but they are nowhere near that in reality. I suggest next time you try reading more than just the marketing material.

"....7 runs native 4 threaded SMT and Poulson's main enhancement is that it will run "up to 4 threads"... whatever that means....." It means both can run four threads per core. Your inabaility to comprehend even that basic statement is - frankly - worrying. I can't believe you actually work in computing so please tell me you are not in a position to operate machinery as you seem a danger to yourself and others.

"....Probably have very similar performance to the Xeon E7 that it is based on." You really don't have a clue, do you? Please go do a lot more reading on the development history of the three CPUs mentioned before attempting a post, and make sure you do some remedial maths.

0
1
Gimp

Re: Well.....

"Poulson has 4MB of faster SRAM L3 cache per core, whereas Power7+ has 5MB per dore of the slower DRAM L3 cache. Poulson has 4MB of faster SRAM L3 cache per core, whereas Power7+ has 5MB per dore of the slower DRAM L3 cache. Last time I checked, five was not "3-4x" four..... I won't even bother trying to explain the advantage the Poulson has with faster SRAM cache as that would require some technical knowledge to understand, and you obviosuly don't qualify."

Apropos maths: you do know that it's 10MB per POWER7+ core, right? 80MB of L3 cache divided by 8 gives you... 10MB of L3 cache per core.

"Top bin Power7+ 3.7GHz for all eight, with possible bursts of up to 4.1GHz for individual cores when using Intelligent Energy fudge mode."

BZZZZZZZZ Wrong. It's 4.1GHz without any boost. It won't become more true just because you repeat it.

And we haven't seen the top bin yet; that usually goes in the POWER 795. So who knows... 2x the frequency (not that it matters) might end up being right.

You haven't been doing anything but posting false numbers about processor frequencies and TurboCore... has Kebabbert perhaps hacked Matt's account?

// Jesper

1
0
Silver badge
Facepalm

Re: Well.....

"....You do know that it's 10MB for per POWER7+ core right ?...." Apologies, I was looking at Power7, not Power7+. It's still not "3-4x" as claimed by the other IBM troll, though. And it is still slow DRAM cache, not faster SRAM cache. Please do try and deny that, I'd love to see how you want to make out that DRAM cache will outperfrom SRAM cache.

"....BZZZZZZZZ Wrong. IT's 4.1 GHz without any boost....." Really? So IBM gurantee you will have all eight cores spining at 4.1GHz? Want to ask them? I already have. Check the brochures, they coyly state "up to 4.1GHz" because they cannot all spin at 4.1GHz at the same time. Once again, IBM cannot supply enough electrical power and cooling to do that, they have to keep them throttled back. It's only when they can restrict it to four cores per socket, by turning off half the cores per socket (but still requiring eight core licences per socket, i.e. doubling licensing costs), that they can suddenly spin the cores up to 4.4GHz. Now, please do explain how they can spin four cores to 4.4GHz but eight only "up to 4.1GHz) if there is no issue with power and cooling and no need to throttle the cores back?

"....And we haven't seen top bin yet..." So you want to introduce vapourware into the discussion? I know you like to play fast and lose with the facts but that's stretching it even by your low standards.

"....You haven't been doing anything but posting false numbers about Processors frequencies and TurboCore number ....." Go look in the IBM guide I posted a link to. You also earlier admitted to the TurboCore issue and already tried to ignore it by calling it "useless", despite it being used by IBM for their benchmarks that you also referred to. What, anything you find too hard to agree with isn't allowed? Good luck trying to sell to us customers, we kinda like asking questions rather than just accepting whatever marketing slides you throw at us.

0
2
Silver badge
WTF?

Re: Re: Well.....

"....We have the advantage of being able to have a sheer volume that actually makes Large UNIX servers a viable option....." I nearly fell off my chair laughing at that! I'm guessing you designed the Windows environment because the idea of making Power even match x64 prices is laughable. Shall we compare? A top end Xeon already scales to ten cores and costs a lot less than a Power CPU. A sixteen-core AMD costs even less. I guess what you are trying to pretend is that you are using PowerVM on a large p-series system and comparing it to the most expensive two-socket Xeon box you can find. You are no doubt using PowerVM to produce lots of insecure instances with one common service partition (which means if you have to work on the servcie partition then all the instances have downtime). That is easy to exceed in VMware, which is more feature-rich than PowerVM, more secure, and - frankly - more stable, and a whole lot cheaper, and I can do it on Xeon systems that scale to hp DL980 size (eight ten-core Xeons) for a fraction of the price of p-series, and maximise utilisation with VMware. Or I can scale out on cheaper 2-socket blades like the BL460 or the SL6500 scaleable platforms, both of which will be massively cheaper than p-series, and still maximise utilisation with VMware (or KVM, or Xen).

But don't take my word for it, let's look at real large-scale environments like Google, are they using Power? No, they are using x64, they just design their solutions a lot better than your architects seem able to. For you to even try and pretend that AIX on Power can be cheaper than Windows on x64 is too silly to contemplate.

0
1
Holmes

Re: Well.....

"Cache"

Actually it's not that simple, fast versus faster; size also matters a lot. And the cache in Itanium is shrinking on a per-core level, whereas POWER's is increasing.

"Frequency"

Again you are just quoting HP marketing material. Kind of pathetic, really. If you had bothered to read and understand how EnergyScale works, you would have realized (or I assume you would have) that it is the frequency boost beyond 100% that is not guaranteed. Here is what the manual says:

Support Notes

1 Note that CPU frequencies in excess of 100% are not guaranteed.

Again you don't get it and start with the TurboCore mode again, and wild speculations that have no hold in reality. The way the POWER7 processor acts is highly configurable: you can tell it to maximize performance or to try to save energy, or you can simply tell it to cap the power usage or not to boost frequency beyond 100% (that is the 4.1GHz). And you can schedule this behaviour. Hence your rather outdated expectations of how a modern processor works come up short. And funnily enough, the Intel Xeons behave in much the same way.

// jesper

1
1
Silver badge
Happy

Re: Well.....

"Actually it's not that simple, as fast versus faster..." Of course not, you're trying to defend IBM's decision to use cheap and slow DRAM so they could squeeze it onto the same die package, why would I expect you to admit it's slower than the SRAM Intel have on Poulson?

"....And the cache in itanium is shrinking on a Per core level...." Yes but it's still faster SRAM, meaning it will handle those requests from the cores faster, meaning lower latencies. And IBM does not have good cache hit ratios as Intel so the faster SRAM is even better in practice.

"....Again you are just quoting HP marketing material...." No, all frequencies quoted for IBM CPUs are from IBM brochures.

".....that it is the frequecy boost beyond 100%, that is not guaranteed....." Well, using the phrase "up to 4.1GHz" doesn't sound very guaranteed! And then there's the obvious question - if 4.1GHz is supposedly flat out, how come the cores run at 4.4GHz when in TurboCore mode? Last time I checked, 4.1GHz was slower than 4.4GHz, or are you now going to tell us it's not as simple as 4.1GHz being slower than 4.4GHz, that IBM has special cycles that make 4.1GHz actually NOT slower than 4.4GHz?

"....and start with the turbo core mode again, and wild speculations,that have no hold in reality...." What, now you're saying I made up TurboCore? OK, can you deny the cores run at 4.4GHz in TurboCore mode but only 4.1GHz in eight-core mode (I'll be genarous and forget the "up to" as it pains you so much)? That should be quite easy for you. Then I want you to look at the two numbers and admit 4.4GHz is faster than 4.1GHz. Then I want you to explain how the cores can run faster in four-core mode than eight-core mode. If IBM could they would run all eight cores at 4.4GHz, which means they either can't supply enough power through the socket to do so, or they can't deal with the heat of eight cores running at 4.4GHz. That is called a design limitation. I know it's hard for you to accept IBM also have design limitations, they always try to sell them to us customers as "features", but they do. And this is a "feature" where you get a tiny jump in frequency but still have to pay for licences for all eight cores. That is called expensive.

0
2
Pirate

Re: Well.....

"You are no doubt using PowerVM to produce lots of insecure instances with one common service partition"

*CACKLE* You really have no bloody clue how PowerVM works? You are quoting the insecurities of HPVM and projecting them onto PowerVM. *CACKLE*

And sure, plastic virtualization like VMware has more features and bells and whistles than all the other virtualization technologies put together.

But still what is left here is the fact that you really have no clue. You don't.

And the Google comment is even more hilarious. I have great respect for the stuff Facebook, Google and others are doing, but the problem they are solving is more akin to supercomputing than 'business computing'. The mechanics are different.

And yes, x86 hardware is cheap, but sometimes cheap is expensive. Pulling the DL980 out of the hat is again textbook HP sales tactics. I get to see the slides: "when you fail to lead with Itanium first (SD2/BL890c i2/i4), then pull out the DL980". What became of all the Itanium benefits? What became of all the HP BladeSystem advantages? No... now it's the DL980.

Again, you are not trying to champion one platform over another; you are simply trying to FUD the POWER platform. Your sudden change of direction is kind of a tell.

Sorry Matt but you are kind of predictable.

// Jesper

1
1
Anonymous Coward

Re: Well.....

"It's still not "3-4x" as claimed by the other IBM troll, though"

POWER7+ has 80MB across 8 cores. Poulson will have 20-33MB across 8 cores... or 25% to 41% that of POWER7+. I suppose it is technically 2.42-4x the amount of L3 cache but, point being, a ton more than Poulson.

"So IBM gurantee you will have all eight cores spining at 4.1GHz?"

The IBM data sheet has four procs in the 770 CECs at 4.42GHz. Again, though, we are talking miles apart. Call it 4.4, 4.1, 3.8... much faster than anything Poulson can provide is the point. You are splitting hairs to determine whether P7+ is much faster or much, much faster than Poulson. If you determine it is merely much faster, is that some sort of victory for Poulson? The highest clock speed even *claimed* by HP is 2.53GHz... probably subject to the same thermal caveats you claim to be true for Power. POWER7+ fully clocked down will still be considerably faster than Poulson top bin.

1
0
Silver badge
FAIL

Re: Well.....

"....You have really no bloody clue of how POWERVM works...." Really? So what is the VIOS, the Virtual IO Server, in PowerVM? It's the IBM marketing name for the service patition that has to sit below ALL instances in PowerVM to handle virtualising the hardware for the software instances! Shall we see what IBM says in its Redbooks about VIOS? "The Virtual I/O Server as a virtualization software appliance provides a restricted, scriptable command line interface (IOSCLI). All Virtual I/O Server configurations must be made on this IOSCLI using the restricted shell provided......A Virtual I/O Server partition with dedicated physical resources to share to other partitions is also required......" So not only is it an overhead (it requires system cycles and RAM to operate), but you also have to go play with it in a CLI, no GUI! How last century! Don't beleive me then go read here, page 218 and page 223 - http://www.redbooks.ibm.com/redbooks/pdfs/sg247940.pdf

But it gets better! Not only is VIOS an overhead, but IBM actually advise you to mirror the service partition as they know it is a vulnerability that could take down ALL the hosted VMs, so you actually end up with DOUBLE the overhead! Please do try and deny it, Jesper, just for fun. Wait, don't tell me, you're now going to claim IBM knows nothing about PowerVM too.....

"....And the google comment is even more hilarious...." More evasions. You stated you could produce a large AIX environment on Power and make it cheaper than 2-socket Windows instances. The cookie-tray type of x64 server hardware used for the Google systems is exactly what can be used with Windows and is being done so by hosting companies. You just don't want to admit you were wrong. Again.

".......Again pulling the DL980 out of the hat is actually again text book HP sales tactics...." So providing proof of a large-scale x64 system that is a lot cheaper than your IBM equivalent can't be mentioned because hp might actually have told their salesgrunts to sell it? Oooh, winning argument - NOT! You brought Windows into the discussion with your claim you could make AIX and Power cheaper, don't get upset when I show how easy it is to debunk that.

0
2
Silver badge
FAIL

Re: Well.....

"....a ton more than Poulson...." LOL! So you can have four times SLOWER cache (becasue DRAM is slower than SRAM) but only if you turn off half the cores on Power. Hmmmm, let me see, do I really want to double my hardware costs (to get the same number of cores in TurboCore means twice as many CPUs which means twice as many P780 system blocks) and double my licensing costs (you still have to pay core licenses for the cores you have switched off), just to get enough cache to get round the IBM bottlenecks? Sounds a very expensive option to me!

"....much faster than anything Poulson can provide is the point...." And this is the crux of the IBM sales schpiel - "it has a faster clock, that's better, because faster is always better!" So Power7+ much be better than Poulson, right? And of course, Power7+ muct be better than Power6, right, otherwise why upgrade? And Power6 must have been much, much better than Power5, right, as Power5 was only 2.2GHz max? If IBM's and their trolls are to be believed, higher clock frequency means better performance, so you think their own benchmark results would show this (and we know Jesper just loves IBM benchmarks). But they don't. If you go look at the TPC-C results for the IBM System P5 570 (Power5 2.2GHz dual-core) it produces a result of 64,073.125 per core, but the 595 (Power6 5GHz dual-core) scored 95080.71875 per core, only a 48% increase. Ignoring that IBM made gaming TPC results an art, this is not as impressive a jump as the IBM trolls would like us to believe seeing as this faster result was with twice as many CPUs, eight times as much memory in the system and the the core frequency actually went up 127%! The test rig went from $4.5m to $17.1m (after discounts) - an increase of 280%!

http://c970058.r58.cf2.rackcdn.com/individual_results/IBM/IBM_570_16_20060213_ES.pdf

http://c970058.r58.cf2.rackcdn.com/individual_results/IBM/IBM_595_20080610_ES.pdf

So more than double the clock frequency did not produce double the per-core performance, despite what the IBM trolls like to pretend. But it gets better! Power6 is actually clocked faster than Power7+, or didn't you know that? Power6 runs at 5GHz in Enterprise 595 servers; the best Power7+ is 4.4GHz. Oops, did that just blow a great big hole in the "faster clock is best" bullcr*p? Want to go back to the Pentium 4 at 3GHz versus an i7 at 2.8GHz and try claiming the Pentium 4 will give better per-core performance just because it has a faster clock? Only idiots swallow the "faster clock means faster system" sales spiel, and you have just been exposed as an idiot.

What IBM desperately try to hide is that clock frequency on its own makes little difference; otherwise there would have been a far bigger jump between Power5 and Power6, and Power7+ would actually be SLOWER per core than Power6. The truth is that it is what you do with each clock cycle that counts, and not only does Poulson do more with each clock cycle, the hp Integrity servers do a better job of making sure the required data is in cache each time it is needed.
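(A quick sketch of the per-core arithmetic above, taking the per-core tpmC figures and prices quoted from the linked TPC-C disclosures at face value.)

```python
# Per-core TPC-C arithmetic using the figures quoted above. Illustrative only;
# TPC-C results depend heavily on the full system configuration and pricing.

p5_570_per_core = 64_073.125      # POWER5 2.2GHz system, tpmC per core
p6_595_per_core = 95_080.71875    # POWER6 5GHz system, tpmC per core

print(f"per-core throughput gain: {p6_595_per_core / p5_570_per_core - 1:.0%}")  # ~48%
print(f"clock frequency gain:     {5.0 / 2.2 - 1:.0%}")                          # ~127%
print(f"test rig price increase:  {17.1e6 / 4.5e6 - 1:.0%}")                     # ~280%
```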

0
3
Anonymous Coward

Re: Well.....

"but IBM actually advise you to mirror the service partition "

It learns! Up till now you have either been lying or just shown a complete lack of competence. Which is it?

"but you also have to go play with it in a CLI, no GUI! How last century!"

I retract the question.

1
1
Holmes

Re: Well.....

"L3 cache."

You do understand why the Itanium implementation of an EPIC architecture needs large caches to perform well, right? Caches much larger than, for example, those of x86 and RISC designs.

You do understand that trying to claim hilarious things like "IBM does not have as good cache hit ratios as Intel, so the faster SRAM is even better in practice" (in the context of Itanium) shows that you don't really understand the whole idea behind Itanium?

RTFM, Matt. My first compile and test on Itanium was in January 2001, on an Intel 'whitebox', and I've been using the architecture on and off ever since.

Again

Merced: 1 core, 800MHz, no L3 cache (0MB L3 per core)
McKinley: 1 core, 900MHz, 1.5MB L3 cache (1.5MB L3 per core)
McKinley: 1 core, 1GHz, 3MB L3 cache (3MB L3 per core)
Madison: 1 core, 1.5GHz, 6MB L3 cache (6MB L3 per core)
Madison: 1 core, 1.67GHz, 9MB L3 cache (9MB L3 per core)
Montecito/Montvale: 2 cores, 1.66GHz, 24MB L3 cache (12MB L3 per core)
Tukwila: 4 cores, 1.73GHz, 24MB L3 cache (6MB L3 per core)
Poulson: 8 cores, 2.53GHz, 32MB L3 cache (4MB L3 per core)

So basically the amount of L3 cache in Poulson brings us back to the days of McKinley in terms of L3 per core. And furthermore Poulson uses hardware multithreading, which makes it even worse in comparison to, for example, Madison.

For comparison, POWER7+ has 80MB of L3 cache, which is actually very cleverly divided into two parts, a local and a 'not so local' region, which speeds up access to 10MB of the L3 cache per core.

And lastly, rumour has it that Haswell will be using eDRAM.

"No, all frequencies quoted for IBM CPUs are from IBM brochures."

Nice attempt at ducking.

"Well, using the phrase "up to 4.1GHz" doesn't sound very guaranteed!"

You are simply not making sense. You are mixing things up, purposely misinterpreting numbers and drawing conclusions that have nothing to do with reality.

It's very simple. An 8-core 4.1GHz POWER7+ processor will run flat out at 4.1GHz on all cores if you provide load for all cores, and provide the cooling and airflow specified in the manuals.

If you specify in the energy policy that the system should optimize energy usage over performance, it will do so, including for the cores: it will put cores (and processors, for that matter) into states that use less energy if there isn't enough load to provide work for those cores, processors, I/O slots, whatever.

If you specify in the energy policy for the system that it should prioritize performance over energy usage it will try to boost the frequency of the cores when it's needed.

If you are using a POWER7 system booted up in TurboCore mode, the cores that are activated will run at a boosted frequency rather than the normal frequency all the time; there will be no additional frequency boost available, according to the manual.

Actually it's much like what Intel is doing with its Xeon processors. It's pretty simple... or at least for everyone other than you.

// Jesper

1
0
Devil

Re: Well.....

"I nearly fell off my chair laughing at that! I'm guessing you designed the Windows environment because the idea of making Power even match x64 prices is laughable"

Yes sure it is:

http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=111050501&layout=

and

http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=110041301&layout=

A 6% price difference per transaction between an HP 2-socket server with a pure Windows stack and a hotswap-everything, high-end POWER system.

"You are no doubt using PowerVM to produce lots of insecure instances with one common service partition"

No. Insecure instances with one common service partition... that is HPVM; you've got things mixed around. There is no service partition in PowerVM; there are optional VIO servers, and that is exactly for ensuring security.

"Google ?"

*CACKLE* You do know that those guys are trying to solve 1-4 different problems where they have total, 100% control of their software stack, right?

We have almost everything. The mechanics of the business are very, very different. Again, Google/Amazon etc. are more supercomputing-like installations.

// Jesper

1
1
Holmes

Re: Well.....

*CACKLE*

Man, it's fun to watch you quote things from the manual and try to twist them into something that fits inside your misguided world view. And fail, because you don't understand it.

The Virtual I/O Server is what it is: a server that serves virtual I/O to the virtual machines running on the physical server. It's been there for what, 7+ years? In my standard design we don't have 1... or 2 or 3; we have 4, one residing inside each System Unit and owning the I/O devices in that System Unit, which is very smart when doing concurrent maintenance on the physical machine. We don't yet use the federated capability of the new VIO code, but we will. So for now it's SEA failover for network and MPIO for SAN traffic. Works like a charm; no single point of failure. And I can just reboot a VIO server if I want to... no loss of packets, no loss of traffic. Sure, virtual machines that are using this VIO server will cough and complain as they 'route' traffic to another VIO server if I 'just' pull the plug, so a controlled, managed takedown is clearly preferred.

Now, you don't have to use VIO servers if you don't like them. You can just dedicate an adapter, no problem, or use the IVE adapters (hardware-virtualized adapters) for network. But if you are running a multitenant environment you want the flexibility and security VIO servers give you. A VIO server does not run beneath the virtual machines like IVM; it runs beside them. So your whole double-overhead story is hilarious... again you are taking the shortcomings of IVM and projecting them onto POWER, shortcomings that aren't there.

Is there an overhead? Sure there is. Memory-wise it's usually 16-32GB for a machine with 2-4TB of RAM, and CPU-wise, who cares: it normally reserves perhaps 2-3 cores of entitlement. That is the real overhead, the real price you pay. Well, kind of, because some of the I/O CPU usage normally done by the virtual machines is actually shipped to the VIO servers, so the overhead is perhaps more like 1-2 cores. But again, compared with HPVM/IVM it's peanuts... I mean 18% compared to what, 1-2% memory overhead for POWER. *CACKLE*
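(If you want to sanity-check the memory side of that claim, the arithmetic is simple; the figures are the rough ones quoted above, not measured values.)

```python
# Rough VIO server memory overhead, using the "16-32GB on a 2-4TB machine"
# figures quoted above. Illustrative arithmetic only.

vios_gb = (16, 32)
machine_gb = (2 * 1024, 4 * 1024)

best = vios_gb[0] / machine_gb[1]    # 16GB carved out of a 4TB machine
worst = vios_gb[1] / machine_gb[0]   # 32GB carved out of a 2TB machine
print(f"VIOS memory overhead: {best:.1%} to {worst:.1%}")   # roughly 0.4% to 1.6%
```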

And with regard to the CLI: what is wrong with a CLI? IMHO it's actually nice, and it keeps the cowboy GUI "just click here and here and here and then..." riders away from the VIO part.

But you can actually configure your whole machine through a GUI, using the System Planning Tool, in which you design the entire virtualization layout of the physical machine. You then just point and click and push that design out to your physical machine via the HMC and, presto... you have a machine with VIO servers installed and all your virtual devices, virtual networks, virtual machines etc. defined, ready to install AIX/Linux/whatever.

Sure, it isn't that popular with your local cowboy consultant sysadmin, because it takes away the hours he could spend clicking away, sipping coffee and making error-40s.

But again... so your whole POWER attack plan in defence of Poulson comes down to the fact that the main VIO server management is done via a CLI?

Now, the price stuff is in another post.

// Jesper

1
1
Silver badge
FAIL

Re: Re: Well.....

"....It learns! ...." You haven't seeing as you still haven't supplied any counter to the point.

0
3
Silver badge
Happy

Re: Well.....

".....So basically the amount of L3 cache that is in Poulson brings us back to the day of McKinley....." Neatly ignoring the fact that the Poulson cache is faster and has wider buses, and plugs into faster a memory system. See, you're back to your IBM smoke-and-mirrors routine - "look at the numbers, just at the numbers, don't think about what is behind them".

"....For comparision POWER7+ has 80MB L3 cache...." Yeah, you're still dodging the bit about it being slower DRAM cache. Your evasions are becoming boring.

"....And last rumours has it that haswell will using eDRAM....." Power7 already uses eDRAM, all the term means is "embedded DRAM", ie DRAM cut on the same silicon as the CPU cores. It's still slower than SRAM and actually needs a separate controller just to refresh the DRAM periodically (which can mess with performance).

".....Nice attempt at ducking....." Hmmm, so quoting a vendor's own manual about their own kit is "ducking". I guess you just want to call anything that show you up as "ducking", right?

"......You are simply not making sense...." No, it's more like you don't want it to make sense. IBM's own manual says "up to 4.1GHz", not "4.1GHz". Go argue with IBM if you don't believe them.

"......If you specify in the energy policy for the system that it should prioritize performance over energy usage it will try to boost the frequency of the cores when it's needed....." Actually, that's WHEN it can, because it can't provide enough energy to run all the cores at 4.1GHz with max memory and peripherals. Just like with the old IBM blades it's another case of IBM's flakey PSU designs clashing with power-hungry cores. It's a simple fact that hp's Inetgrity servers with Poulsons will not have that issue - when they say 2.53GHz for all eight cores it's what you get, no fudges or compromises.

"..... it's much like what Intel is doing with it's Xeon processors...." Yes, it is similar except for one very key point - Intel state up front the normal speed of the Xeon cores and then explain the boost is a temporary one, whereas IBM try and mislead customers into thinking it is available 100% of the time for all cores, regardless of the system configuration. The Intel approach is honest, whereas the IBM one is... well, IBM's.

0
3
Silver badge
FAIL

Re: Well.....

".....6% price difference per transaction between a HP 2 socket server with a pure windows stack and a hotswap everything, highend POWER system....." Oops, we've lost Jesper, he's hiding in the Benchmark Zone again! But, hold on a sec - Jesper claimed he could make his IBM solution CHEAPER than Windows 2-socket, but IBM can't even with their infamous benchmarking hi-jinks. Jesper is truly an IT guru - not! What makes it worse is that was a bench for the older G7 version of the Proliant, not the latest and faster Gen8. Looks like even the fantasy world of the Benchmark Zone isn't being friendly to Jesper today!

".....There are no service partition in POWERVM, there are optional VIO servers....." Without a Virtual I/O Server instance you cannot virtualise the adapter cards, shared disk, tape drives, etc. In short, without VIOS you're just creating VMs that can't actually connect to anything. The typical partiton layout, with dual VIOS instances for redundancy, is shown on page 370 of the IBM Redbook on PowerVM (http://www.redbooks.ibm.com/redbooks/pdfs/sg247590.pdf). Once again, you want to disagree then go moan at IBM. You could do micropartitions via the PowerVM Hypervisor, but then that's really putting all your eggs in one basket as you cannot have a redundant Hypervisor.

"....*CACKLE*...." I'm getting a bit worried by that, it is some IBM salegrunt standing over you and zapping you with a cattleprod whilst screaming "more FUD, more FUD!"

"......you do know that those guys are trying to solve 1-4 different problems where they have total control 100% control of their software stack right ?...." Yes, with x64 tech, not Power. And the reason is becuase they can do it better and cheaper with x64.

".... THe mechanics of the business is very very different...." Yes, backtrack some more why don't you. So it looks like you can make Power cheaper than x64 right up until the point where actual reality steps in.

0
2


This topic is closed for new posts.