But, will it run Crysis?
...That is all...
Solaris customers looking to upgrade their Sparc server iron will have a second set of choices now that Oracle has admitted that it will be reselling the Fujitsu "Athena" servers that use the Japanese company's Sparc64-X processor. Oracle's launch today of the Fujitsu Sparc M10 server lineup is not really much of a surprise. El …
...That is all...
How many bloody times the same "funny" post!
Surely it is cheaper to migrate onto modern platforms than to buy more of these outdated and over priced boat anchors?
No one these days would dream of deploying Solaris on a green field site unless they worked for Oracle...
1. 32Tbyte shared memory boxes are not common
2. Solaris is STILL a far more robust OS than Linux
3. SPARC is expensive versus x86, but if you have SMP needs, then revert to #1. above
Outdated and overpriced?
Hey, wake up, you seem to think it's still 2010.
Don't be ridiculous.... Oracle wouldn't deploy Solaris, they would deploy Oracle Linux.
Having worked on both x86 and SPARC, I can say one thing --
price differences between the mid-range SPARC and high-end x86 are nominal. I used to pay ~ 100K for the M5Ks and ~ 100K for the M4800s with 256GB RAM. Running Solaris on both, I ran 5-8 zones and Oracle DBs on both. Didn't see much of a performance difference. The M5Ks were a lot more stable in my experience, however, with advanced RAS features.
Did some micro-benchmarks on the T4s and HP DL580/BL460 G6/G7 boxes. HP running VMware + RHEL, T4s of course running Solaris 10 and 11. Performance-wise they were very close.
The T5s might tip the scale in SPARC's favor with the extra clock-speed.
With the STREAM benchmark I got ~ 32GB/s memory throughput on the T4s scaling up to 4 cores. I didn't test beyond that...but that's pretty good. I had to tweak the benchmark and put in parallel hints into the code, but one'd expect modern software to parallel process as much as possible...
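For anyone who has not seen what STREAM actually measures, here is a rough sketch of its Copy kernel. This is not the real benchmark (that is C/Fortran, normally built with OpenMP for the parallel runs mentioned above); it is a single-threaded approximation, and the function name and default sizes are made up for illustration.

```python
import time


def copy_bandwidth_gb_per_s(size_bytes=32 * 1024 * 1024, repeats=5):
    """Best-of-N bandwidth of a STREAM-style Copy kernel (dst = src).

    Hypothetical helper: single-threaded, so it says nothing about
    how bandwidth scales across cores.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one bulk copy: reads size_bytes, writes size_bytes
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # STREAM counts both the bytes read and the bytes written.
    # Guard against a zero reading from a coarse timer.
    return 2 * size_bytes / max(best, 1e-9) / 1e9
```

Calling `copy_bandwidth_gb_per_s()` returns one number in GB/s for the machine it runs on. The buffers need to be much larger than the last-level cache, otherwise you are measuring cache and not memory.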
Also, with virtualization, it is difficult to say which platform has better cost:capacity numbers. As far as scalability is concerned, I find it hard to see how something like an ERP system running on 128 h/w threads would fare on x86 at this point.
Moral of the story is -- There are many ways to solve infrastructure performance/scalability/reliability problems. And there is no "one best way" to do it. The answer, well...it depends.
"I find it hard to see how something like an ERP system running on 128 h/w threads would fare on x86 at this point."
Better than anything with similar specs does on Sparc (with up to 160 hardware threads):
Larry says one week: "Here's T5 and M5, they are the BEST for all workloads!" The next week, we get the Fujitsu chip that is actually what the customers want. It's "Rock" all over again!
More like you CAN'T find anything to say to dispute the obvious - T5 can't do the heavy, single-threaded enterprise workloads Larry claims it can, and sticking a lump of cache on the side without any integration into the chip on M5 is a waste of time. Once again, SPARC64 (and Itanium, and Power) will massively outsell Niagara because it's the only chip Snoreacle can offer that is actually capable of doing the job required. Whilst it's good for Snoreacle that Larry has learned from Ponytail's mistakes, it's still amusing to see how the Sunshiners are still trying to deny the obvious.
I think this is all horses for courses. The T5 chip should be good for heavily threaded workloads provided you don't try to LDOM it heavily. So, there's a market. Database workloads.......that might be a different matter. As you say, cache size is not the only difference between processors that handle big single threads and many smaller threads. The whole processor design is different. So, the M5 simply adding cache will help, to a point, but no more. The SPARC64 is designed much more for heavy single threads.
The question people need to ask themselves is, do they want two different types of servers? Do they want some kit specifically for DB (and the like) workloads and others for heavily threaded app layers? Isn't that a bit inflexible? Mind you, according to Oracle's strategy of putting hardware accelerators into the chip, this would get even worse. Might you end up with a processor for Exadata, a different one for Exalogic, etc.etc.? Unless they waste a lot of space on the silicon, putting on accelerators that won't all be used together, won't that be a risk?
Interesting times, but I can't really follow Oracle's strategy, if it has one. They seem to be suggesting we head back in time towards more and more speciality processors, dedicated to certain functions and that's a direction hardware has been moving away from for some time. Not saying it's wrong, but they're the only people doing it, which must make you wonder. Genius or mad. A fine line.
There are initiatives underway (and have been for a while now) to use FPGAs to provide hardware acceleration. Why is it a bad idea to do something similar on the processor die itself?
If they design it right, it can have a significant (positive) impact on the performance of apps that leverage Java (we all know that can use some help in that regard) and databases. Sun/Oracle have done that with their crypto accelerators (and so has Intel btw). This is just taking that a few steps further.
For computing to evolve and grow, the lines between what was considered software domain vs hardware domain needs to blur from time to time. Also, if you notice, it is a cyclical phenomenon. The cycles are getting closer and closer as technology evolves, miniaturization matures further.
I don't think there is anything inherently wrong as such, and I acknowledge the performance gains. The problem arises in that the processor becomes a single function entity (rather than multi-function) and also you start losing economies of scale, potentially increasing prices. Crypto accelerators are still pretty much multi-purpose in that encryption has enough standards to allow lots of different software etc. to use the same ones.
However, if you have an accelerator solely for Oracle DB, what about when you want to run an application tier, say Weblogic? Plenty of room on that DB box (spare capacity), none on the application box. So, you either have to run sub-optimally on the DB box, wasting some potential there, or buy another application box. You're starting to divide your server base into smaller and smaller chunks, which leads to greater overheads.
Of course, a lot of this depends on exactly what accelerators they implement, how much die space they take up and what performance benefits they give. Then, people will have to run the TCO calculations and see if the multi-purpose servers, whilst having theoretically lower performance, actually have a lower TCO through lower cost and maybe higher utilisation rates etc. We'll wait and see.
I don't think that is the natural outcome of these "embedded" accelerators. It is not cost-effective to have different fabs for different functions. I think they will try and move some functionality into silicon that will be broad-spectrum (some Java acceleration, some DB acceleration). Given that Oracle apps are predominantly Java based and have an Oracle DB backend, the gamut of Oracle apps will fit on these processors.
I also think that they might start building Exa<data/logic/lytics> engineered systems on the SPARC line once they get the hardware accelerators in place.
"There are initiatives underway (and have been for a while now) to use FPGAs to provide hardware acceleration. Why is it a bad idea to do something similar on the processor die itself?...." The question becomes how tightly do you integrate the app to the hardware. Too tightly and the chip becomes too specialized to do any other tasks well. And then you reduce the customer base, which means you lose any economies of scale and have to sell your specialised chips at very high prices. Remember the IBM/Sony/Toshiba Cell? Brilliantly clever, capable of amazingly tight integration, and yet a PITA to develop for due to the tightness of integration. Lack of stomach for the pain of coding it limited the application base, limiting the uses and keeping the costs high, leading to it being overlooked in favour of cheaper, simpler, "less clever" but infinitely more popular and flexible designs like x86. If Larry ties his chips too tightly to Java, or worse still too tightly to just Oracle database software, then the other app vendors will push their software on anything but Larry's chips, and you can't build a business on Larry's software alone.
Matt Bryant posts, "T5 can't do..."
You don't have a rational argument for the statements made.
A rational argument against your statement.
If you don't have a rational argument for your statements, perhaps you can show us the numbers...
I actually think you have a very valid point. There are very clear and relevant differences between T5/M5 and SPARC64 X servers, when it comes to how your workload will perform.
For example the SPARC64 X has a decimal floating point execution unit, the T5 doesn't. Again that will mean that various code will execute very very differently.
"......you don't have a rational argument for your statements....." Try the Oracle server sales figures (http://www.theregister.co.uk/2013/03/20/oracle/ ). Nothing shows the fact that Larry is failing to convince the customers more than the complete lack of penetration of Niagara into even existing Sun accounts, who prefer the SPARC64, whether badged or direct from Fudgeitso. Your staged benchmark figures are completely pointless compared to the all important sales figures.
Please don't talk about the CELL CPU, it was a freak of nature. IBM has killed it, and there will be no more development of CELL. For a reason. It performed awfully in real life workloads. As soon as the workload did not fit into the cache, the performance dropped 95%. Yes, only 5% of the performance remained. Terrible design.
For instance, in string pattern matching, the 3.2GHz CELL was 70% faster than the Niagara T2+ CPU at 1.6GHz. This result was for small workloads. When the workload exceeded the cache, performance dropped radically. You needed 13 (thirteen) CELL CPUs to match one single Niagara T2+ CPU. And, what's worse, the IBM team did heavy optimizing in the string pattern matching benchmark: they used assembler, loop unrolling, etc. The Sun team just implemented the Aho-Corasick algorithm in pure C, and did no optimization. And still, the Niagara T2+ was 13x faster than CELL with too big workloads that did not fit into the cache.
It is funny how Niagara T2+ could perform an order of magnitude better than IBM CELL, with a very small cache. I think the T2+ has something like 2MB cache in total. This benchmark proves yet again, that the Niagara is not cache starved, but instead has a different design which makes it superior to several times higher GHz and several times larger cache.
1.6GHz 2MB cache > 5GHz 20MB cache
does not seem it is cache starved to me.
IBM CELL is not further developed, it was a dead end. For a reason. Don't talk about the CELL, please. Talk about a good CPU instead, like the POWER7. POWER7+ seems to suck, because it has double the number of cores, and is only 20% faster. Sign of a bad design.
POWER7+ double the number of cores? No, the POWER7+ chip has the same number of cores; what is done on servers like the POWER 760 is that you can plug a single dual chip module into a socket.
"..... It performed awfully in real life workloads......" LOL! Didn't you get the memo? The reason Snoreacle has to swallow its pride and badge Fudgeitso servers is BECAUSE the whole CMT range are pants with real workloads outside of the web serving niche. And the main reason is because it chokes on heavy single-thread apps and doesn't have enough cache. Your pointless denial of reality is very amusing but I think it's about time you started facing facts.
If I consider how the crypto accelerators work, it might be transparent to the application. The OS detects conditions that match and offloads the work to the processor's built-in crypto accelerator.
"....The OS detects conditions that match and offload the work to the processors' built in crypto accelerator." Sure, sounds fine on paper, but you need to understand that in circuit design you don't get anything for nothing. The space taken up on the die by additional, specialised circuits usually has to come at the cost of more generalised circuits that can help in more generic uses. In the case of M5, by finally adding the cache the CMT designs need (though not the rest of the cache-handling technology required), half the cores had to be chopped out to make room. So you have a choice - make the CPU bigger so you can add the additional specialised circuits without affecting the general circuits, or add the specialised circuits at the cost of non-specialised performance. Making the chip die bigger means increasing the wattage required and also reducing the yield per wafer, both of which drive up costs. Staying in the same envelope but reducing general performance makes your chip less attractive to those users not running the specialised tasks you have designed for. And - whilst it may be hard for the Snoreacle fanbois to admit - not every server out there is running Larry's database software as its core role.
It would make more sense to design such offload engines onto plug-in PCI-e cards, then they can be added as required without crippling the general performance of the system.
"...And the main reason is because it [Niagara] chokes on heavy single-thread apps and doesn't have enough cache...."
If your claim is true, how can Niagara best much higher clocked cpus then? If Niagara is cache starved, how can four Niagara 1.6GHz cpus equal 14 (fourteen) POWER6 at 5GHz on SIEBEL v8 benchmarks? You have never answered this question. Every time you claim the Niagara is cache starved, I ask this question, and every time you are silent. Something does not add up in your posts. :)
OK, so the POWER7+ does not simply have double the number of cores? OK, thanks for that information, I will not say that again.
I've googled a bit now, and it seems the POWER7+ has a higher clock speed and a 2.5x larger CPU cache. And some hardware accelerators, just like T4 and T5. Is that it? A higher clock speed and a 2.5x larger CPU cache gives 20% better performance? Not that impressive, if you ask me. It is just like Intel Haswell, which is 10-15% faster than Ivy Bridge, that is not impressive if you ask me. 100% better performance - THAT is impressive, if you ask me. The SPARC T6 will be much faster than T5, it will again double throughput, and have 1.5x stronger threads than T5. And in two years, we will see a 16,384-thread SPARC server with 64TB RAM.
I am more interested in the POWER8. POWER7+ is too small an upgrade to be interesting for geeks. All geeks can appreciate a good CPU and cool tech, no matter who does it, IBM or Oracle or HP. But HP has no fast cool servers, instead they have extremely stable OSes. Unix is unstable compared to OpenVMS, and HP-UX is the most stable Unix out there, sysadmins say. I wish OpenVMS was open sourced so we could try it out for free. That would be cool with the best clustering out there: OpenVMS.
I don't think Oracle's trying to sell their hardware of "Other" software. If they can accelerate their own suite of software (which is pretty extensive), that makes sufficient case for new customers (and old) to start buying their hardware.
I'm not sure if you know the full gamut of Oracle's software portfolio, but it is pretty massive. I don't think they need to care about accelerating DB2 or Sybase. Odds are if it is a DB2 shop, they are already on IBM hardware, or some other vendor's if a Sybase shop.
Well actually I'd rather call the POWER7+ accelerators coprocessors after reading a bit more about them; they aren't actually inside the core but on chip.
With regards to POWER7 versus POWER7+, then it depends on the workload.
If we look at the SPECjbb2005 results for the same machine: POWER7+ at 3.5 GHz versus POWER7 at 3.55 GHz.
Now that is a 46% increase with less clock frequency.
We can also look at the POWER 740, 3.55 GHz POWER7 from 2010, compared with a recent POWER 740 with 4.2 GHz POWER7+.
Now that is an increase of 53% with an 18% increase in frequency.
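To separate what the chip does per clock from what the extra frequency buys, the percentages above can be normalised by clock speed. A small sketch, using only the figures quoted in this post and assuming (a simplification) that performance scales linearly with frequency; the helper names are made up.

```python
def percent_increase(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def per_clock_gain(perf_ratio, old_ghz, new_ghz):
    """Performance gain left over after dividing out the clock ratio.

    Assumes performance scales linearly with frequency.
    """
    return (perf_ratio / (new_ghz / old_ghz) - 1.0) * 100.0


# POWER 740: 3.55 GHz POWER7 vs 4.2 GHz POWER7+, 53% faster overall.
clock_gain_740 = percent_increase(3.55, 4.2)    # about 18%
per_clock_740 = per_clock_gain(1.53, 3.55, 4.2)  # about 29%

# Same machine: 3.55 GHz POWER7 vs 3.5 GHz POWER7+, 46% faster overall.
per_clock_same = per_clock_gain(1.46, 3.55, 3.5)  # about 48%
```

So on these two SPECjbb2005 comparisons the per-clock improvement works out to very roughly 29% and 48%, which is why the result depends so much on the workload.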
Now according to those benchmarks, POWER7+ is actually a good upgrade. So there is no doubt that there is quite a potential in POWER7+ compared to POWER7. But if you look at something like the SAP 2-tier benchmarks, then the improvements aren't that great, most likely because the software stack cannot take advantage of POWER7+ specifics yet. Nothing new in that.
".....If your claim is true......" Oh Kebbie, you KNOW it's true, as otherwise Larry wouldn't need to badge Fudgeitso's SPARC64 kit! Duh!
"I don't think Oracle's trying to sell their hardware of "Other" software...." They have to. Sorry to burst your bubble but the only Oracle software that really sells well is the database software, the rest is usually not even in the top three of the relevant software segment. Remember Oracle Collaboration Suite? So, how many Exchange instances did that replace? And the majority of Oracle databases are actually storing data for other companies' software (such as SAP). Larry may think he can sell everyone an Oracle DB appliance but he can't replace the rest of the stack, and if he tries making Slowaris and SPARC too closely tied to Oracle software then he will kill his chances of selling servers for anything other than his own appliances.
[[[Sure, sounds fine on paper, but you need to understand that in circuit design you don't get anything for nothing. The space taken up on the die by additional, specialised circuits usually has to come at the cost of more generalised circuits that can help in more generic uses. In the case of M5, by finally adding the cache the CMT designs need (though not the rest of the cache-handling technology required), half the cores had to be chopped out to make room. So you have a choice - make the CPU bigger so you can add the additional specialised circuits without affecting the general circuits, or add the specialised circuits at the cost of non-specialised performance. Making the chip die bigger means increasing the wattage required and also reducing the yield per wafer, both of which drive up costs. Staying in the same envelope but reducing general performance makes your chip less attractive to those users not running the specialised tasks you have designed for. And - whilst it may be hard for the Snoreacle fanbois to admit - not every server out there is running Larry's database software as its core role.
It would make more sense to design such offload engines onto plug-in PCI-e cards, then they can be added as required without crippling the general performance of the system.]]]
PCI-e cards are of course going to be significantly slower than running the same logic in silicon, on the die itself. Which is why things like TCP offload engines and crypto accelerators were moved into the processor die (they used to be co-processors and add-on cards in the past). That is the natural evolutionary process of miniaturization. If we didn't do that, could we manage to fit an iPhone or an Android in the palm or pocket?
I think protestations against this paradigm are based on fallacious premises. Every vendor should have the freedom to differentiate their product and provide incentives to their customers (to better sell their products). This need not necessarily be from a "cost-savings" perspective in terms of pure capital spent, it could (and in fact should) also be in cost-savings in the form of more work being accomplished (achieved through performance boosts, etc), etc.
"....cost-savings in the form of more work being accomplished (achieved through performance boosts, etc), etc.". Helloooooh, anyone home? What do you think happens with CMT, already too tightly tied to parallelised workloads to run the single-threaded apps customers have now, when you take away even more general performance stuffing "accelerators" onto the die? FFS, go get an adult to explain it to you.
"...Oh Kebbie, you KNOW it's true, as otherwise Larry wouldn't need to badge Fudgeitso's SPARC64 kit! Duh!..."
That was not a very convincing argument, don't you think? It is well known that the best tech does not always win. For instance, Windows has a larger market share and is more profitable than OpenVMS or HP-UX, so by your logic, Windows must be better, right? Wrong.
Can you answer the question? I have asked you this question again here for the umpteenth time. Why is it that a "cache starved cpu" can be 10x faster than POWER6 and CELL running at 3-5GHz? Can you answer?
Real world example from an adult for Matt Bryant. There are plenty of applications that can leverage even the older generation CMT processors. I moved a middleware application running hundreds of perl-based data mungers spread across 2 M5000s (@2.2GHz clock) to a half T5440 (128 threads / clock @ 1.6GHz).
In data warehousing workloads often the throughput is more important than single-thread speed. That unimpressive .5 T5440 did the throughput of the two M5000s in the same window.
The T4s and T5s are a completely different class from those humble T2(+) processors. When I migrated that workload to a T4 (only 12 cores), it ran 2x as fast and completed the work of 2 M5Ks and 0.5 T5440.
You know which applications suck the most? Apps from IBM. Cognos - such a piece of junk. Other horrendous applications are those of the Tivoli suite of products. These are so pathetic that they keep SIGSEGV'ing all over the place. IBM's solution - run on our hardware. Our suggestion to them -- make your damn software work. On a Netcool upgrade a couple of years back, their 1+ year old stable release needed 750 patches to be functional. I've never had to deal with apps from other vendors that were that buggy. It took them 6+ months to make the product stable!
Well according to a guy called Kebbabert the reason why Cell was slower than the T processor, was lack of cache. *cough*
Well it's hardly a surprise if you move a distributed embarrassingly parallel datawarehouse load from 2 old M5000 with PCI-X slots onto a more modern server with PCI-e x8 slots that things run better.
Is there any such thing? :)
Actually the app I was referring to was embarrassingly "singular". No multi-threading...just a humble little perl script munging data from raw files and inserting into a remote DB. The trick was to be able to run enough instances of these perl scripts against different files so that it could achieve the throughput needed.
Guess who the vendor of that wonderful application was? IBM!
"....That was not a very convincing argument...." Oh but it is! There is no way Larry would go to Fudgeitso unless he really had to. Ponytail learnt the hard way when he left it late before going to Fudgeitso for help, Larry is just better at lying about CMT's capabilities whilst covering his a$$.
"....Why is that a "cache starved cpu" can be 10x faster than POWER6 and CELL running at 3-5GHz?...." The only time CMT wins is in wildly-contrived Snoreacle benchmark sessions or when webserving. Not in real enterprise production environments using real production data and stack. If it was otherwise then we'd all be buying CMT servers, and the fact that the VAST MAJORITY of companies do not is all the proof I need to expose your silliness.
Oh, then it's nice to see that IBM can write applications that can exploit the Tx processor architecture.
Actually IBM wrote some pretty crappy code there. It's the massive number of hardware threads on CMT that allows the throughput.
Why even offer the thumbs up or down buttons if you are going to make them so annoying to use that everyone will quit using them.
I'm not really sure what Oracle's game is here, as they seem to change tack every week. They announce the T5 and M5s with loads of fanfare etc. as one would expect. They then, almost immediately, announce the Fujitsu servers.........Why? What is a customer supposed to think? It's like Sun all over again. No perceivable strategy and confusion everywhere. What's the point in Oracle selling the Fujitsu servers unless they have at least some benefits over the T5 and M5 line? Is Oracle likely to admit this though?
Also, as Oracle want to put hardware accelerators into the chips for their software, surely they need to own the processor design? In which case, once again, M5 and T5. Fujitsu might put the same accelerators into their chips, but presumably would have to pay Oracle? What's the status of Oracle Linux and Solaris at Oracle? As Sparc is where they can put the hardware accelerators, you would expect them to promote this architecture heavily and over x86. This would suggest a move towards Solaris. But, what's happening? Oracle consultants still promote Oracle Linux and x86.
It's all very reminiscent of Sun all over again. Left hand doesn't know what right hand is doing. No coherent strategy or direction etc.etc. Maybe Oracle did take over Sun, but maybe Sun 'infected' them with an ill?
Oracle consultants promote Oracle Linux/x86 - they're paid for it & better margins. Solaris/SPARC is not so cheap. More heads. More expensive heads.
Oracle Linux and Solaris (plus SPARC) together? Nope! In different orgs in Oracle.
Work together? No!!! Never!!!
Remember DTrace/ZFS announced for Oracle Linux? Did Solaris know, or get asked first? No! Everyone wondered what the future was...
Solaris & Oracle Linux compete. VP of Oracle Linux wants $300m/quarter not $200m/quarter and VP of Solaris (systems) with $100m/quarter. Get it? VP of Oracle Linux wants to be "Oracle platform" and kill Solaris. VP of Solaris/SPARC just wants to survive. No love inside Oracle.
Mad Mike .....
You figured this out!!!!! I imagine this lack of clear positioning and messaging regarding the differentiation between the new Oracle T5/M5 SPARC servers and the equivalent Fujitsu SPARC M10 servers will make for some very interesting conversations with current SPARC customers, many of which have previous versions of the Fujitsu architected systems installed in their data centers and now need to decide whether to refresh their SPARC install base with Oracle new or Fujitsu new. This is truly Sun type confusion all over again, no surprise here. I had expected Oracle to be much more precise with their forward looking HW strategy though as they have had almost three years to figure it out. Not sure what's going on at the Oracle exec level! I suppose time will tell.
Yes, indeed. Getting salesmen together from each group can be interesting. Pretty easy to get each to insult the other and start a fight. Entertaining, but beyond that, not really useful.
Any decent company would have realised you can't go on like this, but it appears Oracle has not. The x86/Sparc (Solaris v Linux) debate is bad enough, but now we even have in-fighting within Sparc!!
This is why we are moving away from Solaris. Had plenty installed, but for years and years, Sun couldn't come up with anything intelligent around roadmaps (at least ones that stuck!!). Now, Oracle are doing the same. The whole Sparc/Solaris marketplace has been going through a slow slide downwards and Oracle seem to be doing nothing to stop this. Really, really sad. Solaris is a great operating system and Sparc was as good a chip as any. But, over the years, both Sun and now Oracle seem to be throwing it away.
Mad Mike --- Sun couldn't come up with anything intelligent around roadmaps (at least ones that stuck!!) Now, Oracle are doing the same.
It looks like the roadmaps from Sun are pretty much being completed by Oracle... and the Oracle roadmap appears to be getting completed.
While the in-fighting and apparent under-funding of Solaris (vs Oracle Linux) seems to be an interesting discussion (uncertain of the references), it seems pretty clear that the SPARC / Solaris road map has been executed upon very well over the past number of years.
This being said, being the fastest out of all the competition is nothing to sneeze at, especially if it was done on "the cheap"!
When IBM catches up with the Power 8 and Intel releases their 8 socket capable chip - it will make things more interesting. (Oh yes, you can make 8 or more socket Intel platforms, but they are very expensive, with a lot of latency to deal with... and Power 8 was on the roadmap to be released in less than a year, but I suspect it is unfortunately farther off than that.)
Oracle and Fujitsu have released some nice processors recently. It is good to see the competition return to the marketplace!
".......Sun couldn't come up with anything intelligent around roadmaps...... It looks like the roadmaps from Sun are pretty much being completed by Oracle......" So which is it? If Sun didn't have any intelligent road maps and Oracle are completing on them then surely that means Oracle are just completing a not intelligent roadmap. Thanks for clearing that up.
and there are more in this particular blog.