Sun cranks clocks on Sparc T2 and T2+

The executives at server and operating system maker Sun Microsystems have been uncharacteristically quiet since the $5.6bn Oracle deal was announced back in April. And they've been silent since Sun's shareholders approved the deal last Thursday. This - from one of the most aggressive, PR-driven firms on the planet - is a bit …

COMMENTS

  1. Arnold Lieberman
    Headmaster

    Eh?

    "The move from eight-core, four-thread Sparc T1 chips to the eight-core, eight thread T2 chips ***did not much change in clock speed, although the T2 did has twice as many threads*** and could be used in two-way machines, which gave systems about twice the oomph on workloads."

  2. Kebabbert

    Niagara architecture

    "Sun is charging a pretty big premium for the extra Sparc T speed bump. A T5440 server with four 1.4 GHz T2+ chips with all 256 threads activated in the four-socket box, plus 128 GB of memory and two 146 GB disks has a list price of $89,895. Jacking that machine up to four 1.6 GHz T2+ chips with the same hardware otherwise boosts the price to $115,695."

    Yes, that is a fair amount of money for a small speed bump. But you have to know a bit more about the Niagara chip design to understand why.

    According to studies from Intel, a normal x86 server CPU idles 50-60% of the time under FULL load because of cache misses. This means a 3GHz x86 effectively corresponds to a 1.5GHz chip that does useful work and never idles. The reason is that RAM is nowhere near as fast as the CPU; the speed gap is so wide that you get lots of cache misses. Nobody has solved this ancient problem - except Sun, with the Niagara chip.

    The Niagara has several cores, with several threads in each core. As soon as a thread stalls waiting for data, the Niagara switches to the next thread within one clock cycle and continues working instead of idling. In optimal circumstances one core can also run eight threads at once in different stages of the pipeline, which a normal CPU cannot do; nor can a normal CPU switch threads that fast - it takes many, many cycles. Therefore the Niagara idles maybe 5% of the time because of cache misses, whereas an x86 idles 50%. So a 3GHz x86 chip corresponds to a 1.5GHz chip (with few cores), while a 1.6GHz Niagara corresponds to roughly a 1.5GHz chip under full load (with lots of cores and threads). In other words, a 1.6GHz Niagara corresponds to a normal 3GHz x86 chip.
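
    To make that arithmetic concrete, a minimal sketch in Python; the 50% and 5% idle figures are the assumptions above, not measurements.

    # Effective useful clock under the idle figures assumed above.
    def effective_ghz(clock_ghz, idle_fraction):
        """Clock cycles actually spent on work rather than stalled on memory."""
        return clock_ghz * (1.0 - idle_fraction)

    x86 = effective_ghz(3.0, 0.50)       # ~1.50 GHz of useful work
    niagara = effective_ghz(1.6, 0.05)   # ~1.52 GHz of useful work
    print(f"3.0 GHz x86 at 50% idle -> {x86:.2f} GHz effective")
    print(f"1.6 GHz T2+ at 5% idle  -> {niagara:.2f} GHz effective")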

    Also, the next Niagara, the T3, will have 16 cores with 16 threads each, giving 256 threads per chip. There are rumours that Sun is planning a machine with eight T3 chips, for 2048 threads. Luckily, Solaris is known to scale well vertically and can handle that.

    Here we see that one Sun T5440 - the machine you describe in the article, with four of the 1.4GHz Niagara chips - scores 14,000 on the Siebel benchmark, whereas three IBM Power p570 servers with a total of 12 Power6 CPUs at 5GHz score 7,000 combined. Thus one Sun T5440 is twice as fast as three of the IBM Power servers.

    And one IBM Power p570 with four Power6 CPUs and 64GB of RAM costs $413,000. Compare that to the Sun T5440 and you will see that the Sun is in fact mighty cheap and delivers mighty performance.

    Granted, the Niagara is slow on single-threaded work. But servers almost always run in a client-server model, where one server serves many clients, so high throughput across many threads is preferable to a fast single thread.

  3. Jan 7
    Megaphone

    FFS

    * Don't * fucking * compare * anything * to * a * fucking * single * chip * p570. It's a goddamn eight-socket server built for 3/4 TB of RAM, not one of your stupid entry-level toys.

    If you want one socket buy a p510.

    And for benchmarks:

    Yes, of course a T5440 is faster than three p570s, that's why Sun dominates the HPC sector by taking the first ten places in the Top500 year by year. Oh, wait...

  4. Kebabbert

    Calm down

    "* Don´t * fucking * compare * anything * to * a * fucking * single * chip * p570. Its a goddamn eight socket server built for 3/4 TB ram, not one of your stupid entry-level toys. If you want one socket buy a p510."

    Calm down a bit. There is no need to get upset.

    I understand you do not want to compare the p570 against toys. Who does? Let us instead compare enterprise Solaris plus the Sun T5440 - which drives several very large stock exchanges - against the p570. As Oracle has shown in its white paper, one T5440 is twice as fast as three p570s, despite the IBM machines having three times as many CPUs at roughly three times the clock frequency.

    The price of one p570 will get you four of the Sun T5440 boxes, totalling 4 x 14,000 = 56,000 Siebel. According to the Oracle white paper, you would need 24 of the IBM p570s with 5GHz Power6 CPUs to match that number. And one p570 costs $413,000.
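
    That price/performance arithmetic works out as a minimal sketch, using the Siebel scores and list prices quoted above as given (not independently verified):

    # Price/performance arithmetic from the figures above (as quoted, not verified).
    t5440_price, t5440_siebel = 89_895, 14_000
    p570_price = 413_000
    p570_siebel = 7_000 / 3                       # three p570s together scored 7,000

    t5440_count = p570_price // t5440_price       # one p570's price buys 4 T5440s
    total_siebel = t5440_count * t5440_siebel     # 4 x 14,000 = 56,000
    p570s_to_match = total_siebel / p570_siebel   # ~24 p570s to reach the same total
    print(t5440_count, total_siebel, round(p570s_to_match))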

    I don't really understand why you would want to compare a mighty Sun T5440 against a measly p510. The T5440 outperforms the p570 easily, so why pit it against a weak p510? Isn't it better to compare machines so you get a fair fight? I don't like big guys fighting smaller guys; I don't like the T5440 going up against an x86 box or a p510 either. You know, Sun could multiply the price of the T5440 several times over, say ten times. Would it then be suitable to bench it against a p570? Do you really mean Sun has to multiply the price many times before you will consider putting the T5440 against a p570?

    "Yes, of course a T5440 is faster than three p570s, thats why SUN dominates the HPC sector by taking the first ten places in the Top500 year by year. Oh, wait..."

    I don't really understand this statement. Are you arguing that the Niagara does not easily outperform the Power6 for a much lower price? And at the same time you talk about the Top500? I don't get it. The machines in the Top500 are large clusters - basically a bunch of nodes in a network. Big iron normally has 64-128 CPUs. How many does a p570 have? Eight? There is no big iron with 10,000 CPUs as in the Top500. And you do know that the IBM winner in the Top500 is driven by dual-core PowerPC CPUs at 750MHz - it just has very many of them? There are no Power6 machines in the Top500. And those machines have only one purpose: to run a stripped-down Linux kernel doing number crunching.

    They cannot handle many users logged in simultaneously doing work the way an IBM mainframe can; they are not general-purpose big iron. And although the IBM mainframe has its merits, performance is not one of them - it is even worse than the Power6. It is well known that 1 IBM MIPS is roughly 4MHz of x86, so a mainframe CPU at 1,000 MIPS equals a 4GHz x86 CPU. A mainframe with ten of those 1,000 MIPS CPUs, i.e. 10,000 MIPS, corresponds to about five Intel quad-cores: 5 CPUs x 4 cores x 2GHz = 40GHz. Read this Linux expert for more on mainframes:

    http://www.mail-archive.com/linux-390@vm.marist.edu/msg18587.html

  5. Kebabbert

    To clarify

    The higher the clock frequency, the more cycles the CPU wastes on every cache miss, because the gap between memory speed and CPU speed is so wide. Ideally you want a low clock speed to minimize that waiting, and it keeps power consumption down too. A 3GHz x86 idles around 50% under full load; it must be worse for a 5GHz CPU - maybe it idles 70% of the time? The higher the frequency, the worse it gets, because RAM speeds do not keep up.

    And according to the laws of physics, the dynamic power consumption of a CPU is roughly proportional to

    Capacitance x Voltage x Voltage x Frequency

    and since the supply voltage has to rise as you push the frequency up, power grows faster than linearly with clock speed.

    This means the single most important lever for keeping power consumption down is lowering the frequency. If you run a massively large CPU such as the Power6 at 5GHz, maybe it draws 400-500 watts? IBM has never released numbers on this; they keep quiet. Sun, on the other hand, publishes power consumption figures:

    http://blogs.sun.com/bmseer/entry/sun_s_2008_summary_of
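
    As a minimal sketch of that scaling law, with made-up voltage and frequency pairs purely for illustration (these are not vendor figures):

    # Dynamic CMOS power, P ~ C * V^2 * f, with illustrative numbers only.
    def relative_power(voltage, freq_ghz, capacitance=1.0):
        return capacitance * voltage ** 2 * freq_ghz

    low = relative_power(voltage=1.0, freq_ghz=1.6)    # a low-clocked part
    high = relative_power(voltage=1.2, freq_ghz=5.0)   # a 5GHz-class part
    print(f"~{high / low:.1f}x the dynamic power for the 5GHz part")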

    IBM brags about its mainframes but keeps quiet about the fact that you can emulate a 30 MIPS mainframe on your laptop with the emulator called "Hercules". And IBM never releases benchmarks for its mainframe CPUs - why not? If they are really good, why keep quiet?

    I remember when IBM bragged that its Power6 CPU had 200GB/sec of bandwidth or so. Upon closer scrutiny, IBM had added up the bandwidth of the L1 cache, the L2 cache and every other link on the chip. You cannot do that. If there is one bottleneck at 5GB/sec, then the chip's usable bandwidth will never be greater than 5GB/sec; it will never reach 200GB/sec. You cannot simply add up all the bandwidths in a chip. That is just plain wrong. Maybe IBM didn't know that.

  6. Anonymous Coward
    WTF?

    Retard alert ..Big Fing deal.....1.6GHz .....wow...so amazed

    I wonder if Oracle is going to up the per-core licensing factor to 1 instead of 0.75.

    T2+ systems are good at web tier?

    WebLogic pricing => 32 cores x 0.75 = 24 licenses; 24 x $25K = $600,000

    No wonder Oracle is buying Sun....looks like they will look to give these shit boxes away for free if you buy WebLogic licenses..

    You are a retard if you buy four 1.6 GHz T2+ chips for $115,695. That's a 28.7 per cent price hike for 14.3 per cent more clocks.
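
    For what it is worth, the arithmetic in this comment checks out as a quick sketch (the 0.75 core factor, $25K per license and the two system prices are the figures quoted here, not verified):

    # Per-core license and price-per-clock arithmetic from the figures above.
    cores, core_factor, price_per_license = 32, 0.75, 25_000
    licenses = int(cores * core_factor)                                  # 24 licenses
    print(f"{licenses} licenses -> ${licenses * price_per_license:,}")   # $600,000

    price_hike = 115_695 / 89_895 - 1                # ~28.7% more money
    clock_gain = 1.6 / 1.4 - 1                       # ~14.3% more clock
    print(f"{price_hike:.1%} price hike for {clock_gain:.1%} more clock")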

    MOJO

  7. Ysean
    Flame

    None of you get it!

    First, anyone who wants to trash ANY system because it isn't used in the TOP 500 should really go back to school and understand that such a listing REALLY isn't important to typical business use. MOST businesses don't need to do raw number crunching on the scale of the generally quite large clusters in the TOP 500. Meanwhile, building traditional systems capable of crunching numbers like most of the TOP 500 would be HIGHLY cost prohibitive. Clusters are NOT single-system-image deployments; traditional systems are deployed as a single system image. This means that regardless of the number of CPUs/cabinets/server blades, the whole shebang is one system to the hardware/software. DIFFERENT BEASTS FOR DIFFERENT TASKS, TWITS.

    Now, as for the Sun T boxes... I haven't had the privilege to work with one yet. But from what I've read and heard they make excellent boxes for the majority of typical server use scenarios. That does not mean they are best for all. There are things I would still use a Sun E6500 for over one of these Sun T boxes - data warehousing would be a candidate. Again, DIFFERENT BEASTS FOR DIFFERENT TASKS, TWITS.

    So, can we stop this stupid little argument now?

  8. David Halko
    Happy

    Coward: Customers buy performance, not clocks

    Anonymous Coward posts, "You are a retard if you buy four 1.6 GHz T2+ chips for $115,695. That's a 28.7 per cent price hike for 14.3 per cent more clocks.."

    Perhaps - but customers may purchase more performance.

    http://www.sun.com/aboutsun/pr/2009-07/sunflash.20090721.2.xml

    "Running the latest versions of the Solaris 10 OS and Java Server software backed by Oracle Database 11g, the SPARC Enterprise T5440 server delivered... a 21 percent improvement over the previous 1.4GHz-based SPARC Enterprise score"

    21% more performance is being seen for some applications. The 20% increase in memory bandwidth may have done more for the performance of the overall system than the CPU bump of 14%.

    It is interesting how much of a boost Sun and TI got out of the T2 and T2+ CPUs with such a small boost in clock rate.

  9. Anonymous Coward
    Anonymous Coward

    Niagara vs Nehalem

    Surely that's the interesting question?

  10. Anonymous Coward
    Anonymous Coward

    Niagara design is a system on a chip!

    With the T2+ you also get 2x 10Gbit Ethernet (XAUI) on board, and each core has a crypto unit for the ten most popular ciphers. This is very important for SSL traffic, etc.

    Divide that powerful system into logical domains, and inside a logical domain subdivide your workload further with Solaris Containers. This gives you very efficient gear for different workloads. Do not make the mistake of comparing a system-on-a-chip design with a Power6 CPU!

  11. Anonymous Coward
    Paris Hilton

    Re: Niagara v Nehalem

    I'd open it up further - Power6 (and upcoming 7) v Niagara v Nehalem.

    Is this x86's chance to get some of the mid-range action?

    Paris, because she enjoys watching the big boys fight it out.

  12. Kebabbert

    Anonymous Coward

    "You are a retard if you buy four 1.6 GHz T2+ chips for $115,695."

    Why are you a retard if you buy one Sun T5440 with four T2+ chips? Do you mean you should instead buy an IBM p570 for the paltry sum of $413,000? But the T5440 outperforms three p570s, and those cost $1,240,000 in total.

    I fail to see why you shouldn't buy one T5440 if you want the most bang for the buck. But maybe you know something I don't? Please enlighten me!

  13. Anonymous Coward
    FAIL

    Re: Calm down

    "I dont really understand this statement. Are you arguing that the Niagara does not easily outperform the Power6, for a much lower price? And at the same time, you talk about Top500? I dont get it. The machines in Top500 are large clusters. Basically a bunch of nodes in a network. Big Iron has 64-128 CPUs normally. How many has p570? 8? There exist no Big Iron with 10.000 CPUs as in Top500. And you do know that the IBM winner in Top500 is driven by dual core PowerPC CPUs at 750MHz, but it has many of them instead? There are no Power6 in Top500."

    Come on. How about a little due diligence? There are power6 machines on the top 500:

    http://www.top500.org/stats/list/33/procgen

  14. joe 14
    Paris Hilton

    How many?

    How many of the people slamming these boxes have any T2s in their racks? Come on, guys, tell us. I for one have 20 T5220s and T5120s. We love them. A single-CPU T5220 replaced a quad-CPU SF480, and the DBA noticed almost double the performance for OLTP! Java web apps? All I can say is WOW!

    They are the best bang for the buck bar none for web services! Though I'm sure some of you will disagree even though you have never even tried one.

    Paris, cos she loves a good bang for a buck.........

  15. Captain Thyratron
    Headmaster

    Re: Kebabbert

    Before you damage an otherwise reasonable argument about the merits of SPARC by saying something uninformed, let me remind you that present-day IBM or Unisys mainframes are more like 30,000 MIPS than 30 MIPS. Have a look here for an example:

    http://www.serverwatch.com/hreviews/article.php/3737101

    The mainframes people simulate on laptops aren't modern million-dollar zSeries systems--they're old virtual System/370 machines from decades ago. A modern mainframe, between sheer CPU power and monstrous I/O (people seem to forget that I/O is where mainframes really kick ass), is often capable of virtualizing an entire data center. IBM has plenty of bragging rights. (There's also the whole deal about reliability, safely sustainable load at nearly full capacity, and uptime typically on the order of a decade, but I think you were discussing performance.)

    If you are to compare individual SPARC systems to that, then the most suitable comparison would be between a modern mainframe and a Fujitsu SPARC Enterprise M9000, which *might* be a fair fight.

  16. Anonymous Coward
    Anonymous Coward

    Niagara or Power?

    Kebabbert, since a single process can utilize only a single core, it's the per-core performance that matters most. Who cares how many cores are within the CPU? Oracle Database Enterprise Edition is licensed per core. Run a single database query and look at how many cores are utilized.

    And don't believe what the Intel guys say about cache misses - it's true, but only for single-threaded x86 CPUs. Otherwise who would need such high frequencies? For what?

    Anyway, IBM was the first to introduce a dual-core processor (PowerPC and Power4).

  17. lansalot

    horses for courses

    We use them for front-end application servers (Oracle 9iAS) and, despite the much lower clock speed, they can handle far more concurrent connections than their non-CoolThreads brethren here. Would we use them for HPC? No. Would we use them when the load comes from many connections and processes rather than a handful of CPU-thrashers? Yes, because that's what they're designed for.

    Don't focus on the clock speed, or on whether it's at #1 in some HPC chart. It's not there for a reason - the reason being that it's not built for that. But in plenty of everyday server scenarios it will be ideal. Do your homework and assess whether it's the right kit for you and your situation.

  18. Maurice Verheesen
    Thumb Up

    Listening carefully...

    @Kebabbert

    Thanks for the explanation, really interesting and very clear!

  19. Kebabbert

    Anonymous Coward, Captain Thyratron

    ANONYMOUS COWARD,

    "Come on. How about a little due diligence? There are power6 machines on the top 500:"

    Cool. Last time I checked there were no Power6 machines in the Top500. I will not say that again. Thanks for clarifying.

    I quickly lost interest in the Top500 because the supercomputers (i.e. large clusters) are built for one purpose and nothing else. One of the problems is all the excess heat, and keeping the frequency down is one possible solution. The biggest problem with supercomputers is the software (not the hardware): distributing the work to all the nodes in the cluster efficiently. Each node can run at 700MHz, but if you distribute the work well you win anyway. In fifth place in the Top500 we find the famous IBM BlueGene, which uses PowerPC CPUs at 700MHz. Does fifth place mean IBM has superior CPUs? No, it doesn't. It is not a rigorous argument to claim that IBM has superior CPUs because it ranks well in the Top500. I would hardly call a 700MHz PowerPC superior.

    http://www.top500.org/system/8968

    It is similar to the TPC-C benchmarks, where IBM held the record last time I checked. That configuration cost $15 million. It used lots of short-stroked hard disks, and it used 2TB of RAM! Just really silly. The machine couldn't do any serious work, only TPC, and TPC is not representative of normal database work. Who would short-stroke disks in a normal DB server? Who uses 2TB of RAM for a normal database server? No one. When IBM states that it holds the TPC record, it doesn't matter for the normal DB server. This is one of the reasons Sun lost interest in TPC: it has no real-world value. The benchmarks are artificial and of no interest for real work.

    All these benchmark machines are pathological and built for one purpose only; they can do nothing else. A normal server is a totally different kind of animal. I'm not interested in pathological cases.

    CAPTAIN THYRATRON

    "Before you damage an otherwise reasonable argument about the merits of SPARC by saying something uninformed, let me remind you that present-day IBM or Unisys mainframes are more like 30,000 MIPS than 30 MIPS."

    Cool. The new IBM mainframe z10 uses a quad-core 30,000 MIPS CPU. If we accept that 1 MIPS == 4MHz of x86 (a comparison from 2003, when the Pentium 4 ruled, so it should be adjusted for the much stronger Nehalem), then one z10 CPU corresponds to 30,000 MIPS x 4MHz = 120,000 MHz = a 120GHz single-core x86 CPU.

    A fictive 3GHz Intel quad-core from 2003 corresponds to 4 cores x 3GHz = 12GHz, so you would need ten such quad-core CPUs to match one z10 CPU of today. That calculation assumes the Intel Nehalem architecture is as weak as the Pentium 4, which is wrong. If a single Nehalem core is twice(?) as fast as a 2003 Pentium 4, then you would need five Nehalems to match one z10 CPU. And I bet five Nehalems are cheaper than one z10 CPU.
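
    Spelled out as a minimal sketch - the 4MHz-per-MIPS rule of thumb, the 30,000 MIPS figure and the factor-of-two Nehalem guess are all assumptions carried over from the discussion above, not benchmarks:

    # Mainframe MIPS vs x86 clock using the rule of thumb above (assumptions, not benchmarks).
    MHZ_PER_MIPS = 4
    z10_equiv_ghz = 30_000 * MHZ_PER_MIPS / 1000      # 120 GHz of Pentium 4-class work
    quad_2003_ghz = 4 * 3.0                           # a fictive 2003 quad-core: 12 GHz
    quads_needed = z10_equiv_ghz / quad_2003_ghz      # 10 quad-core chips
    nehalems_needed = quads_needed / 2.0              # ~5, if one Nehalem core is worth two Pentium 4s
    print(z10_equiv_ghz, quads_needed, nehalems_needed)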

    Does a z10 CPU alone cost $50,000, or is that far too low? The mainframe starts at $1 million - do you get one CPU for that? And if you want a mainframe with two CPUs, does it cost $2 million? For a decent config you pay $10-20 million? And yet some people say that Sun servers at $50,000 are overpriced. I've heard of one company that migrated from a mainframe to a single Sun T5440; for the right workload it can easily be done.

    "A modern mainframe, between sheer CPU power and monstrous I/O (people seem to forget that I/O is where mainframes really kick ass), is often capable of virtualizing an entire data center. IBM has plenty of bragging rights. (There's also the whole deal about reliability, safely sustainable load at nearly full capacity, and uptime typically on the order of a decade, but I think you were discussing performance.)"

    Yes, this thread is about performance. (For uptime, I've read about OpenVMS machines with uptimes of 17 years, which may be on par with mainframes.) Regarding performance, I've read IBM bragging that one mainframe can consolidate 1,500 x86 servers. That is mighty impressive.

    Upon closer scrutiny, that consolidation requires all the x86 servers to be idling at a few per cent utilization while the mainframe runs at 100%. The claim is just doubtful. Again. IBM says "an x86 server often idles, so this is an OK statement". No, it isn't. By that logic I could claim "my laptop can consolidate 10 servers - buy it for $1,000". But no one would buy my laptop once they found out that the servers have to be idling; as soon as the servers start working, my laptop crumbles. That would not really be a correct statement about my laptop, would it?

    Several of the IBM campaigns remind me of Microsoft's "Get the Facts" campaign. MS stated that Windows has a lower TCO than Linux, but upon closer scrutiny MS had assumed Linux was running on an IBM mainframe while Windows ran on a PC - of course Linux is more expensive then! That claim is just plain wrong. IBM works the same way. Have you followed the mainframe business? IBM has been accused of monopoly behaviour and of starving its competitors, but nobody notices.

    ANONYMOUS COWARD,

    "Kebabbert, since single process can utilize only single core, it's the core performance matters the most. Who cares how many cores are within CPU?"

    This discussion is about which CPU is fastest, yes? It is not a discussion about which core is fastest, or which ALU is fastest, or which integer divider is fastest. If CPU A has faster integer division than CPU B, can you deduce that CPU A is faster? No you cannot, because CPU B as a whole may be faster. If CPU C has one core and that core is faster, can you deduce that CPU D is slower? No you cannot - maybe CPU D has 10,000 cores.

    "And don't believe what the Intel guys say about cache misses - it's true, but to single-threaded x86 CPUs. Otherwise who would need so high frequencies? For what?"

    So the Intel guys are lying? If you were knowledgeable about programming, you would know that on a cache miss a CPU has to go out to RAM, and that takes on the order of 100 nanoseconds (ns).

    Say you have a 1GHz CPU which must run a loop with 100 computations, and at the end of the loop it needs to fetch some data. Each computation takes 1ns, so the CPU spends 100ns on the computations. Then it fetches the data, and if it has to go to RAM the CPU waits 100ns before it can continue. In this case the CPU idles 50% of the time: 100ns of work, 100ns of waiting. (Of course, the Niagara CPU doesn't wait; it immediately switches thread and continues working instead.)

    If you have a mighty fast CPU, say 10GHz, it will still have to wait 100ns because RAM is still slow. The computations now take 0.1ns each, so the CPU spends 10ns on all of them, then fetches data, which takes 100ns. In this case the CPU works for 10ns and waits for 100ns; hence it idles about 90% of the time under full load. Ergo, the higher the frequency, the more it idles. This is common knowledge - maybe you should study a bit before you post again. I don't think you knew this, because if you had known it and still posted that, you would be deliberately spreading FUD about the Sun Niagara, and I don't believe you are that kind of person.
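
    Written out as a minimal sketch with the same illustrative numbers (100 instructions per loop, a flat 100ns trip to RAM):

    # Idle fraction rising with clock speed while memory latency stays fixed.
    MEM_LATENCY_NS = 100.0
    INSTRUCTIONS = 100

    for clock_ghz in (1.0, 3.0, 10.0):
        compute_ns = INSTRUCTIONS / clock_ghz                   # time spent on useful work
        idle = MEM_LATENCY_NS / (compute_ns + MEM_LATENCY_NS)   # fraction spent waiting
        print(f"{clock_ghz:>4} GHz: idle {idle:.0%} of the time")
    # 1 GHz -> 50%, 3 GHz -> 75%, 10 GHz -> ~91%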

    The only way to get around this problem is to have faster RAM. This is why IBM, Intel, AMD and the rest all use gigantic CPU caches and complex prefetch logic. But the only way to completely eliminate cache misses with slow RAM would be for the CPU to use ESP and psychic powers to always fetch the correct data in advance. If you have neither faster RAM nor ESP, you can use the Sun Niagara solution: switch threads at once and continue working instead of waiting. The Niagara has a very small cache and a low frequency, which keeps the number of transistors down, and still it is very fast. It uses very little power, so you save a lot on the power bill. In fact, if you study the Niagara a bit, it is a truly ingenious new solution. Remarkably clever. And it shows in the benches.

    "Anyway, IBM was first to introduce dual core processor (PowerPC and Power4)."

    Good for IBM. Still, IBM has said that Sun's approach of many slow cores is a bad thing and that the correct way is a few fast cores. IBM has mocked Sun for using many cores.

  20. Matt Bryant Silver badge
    Happy

    RE: Kebabbert

    So, they made the cores faster, but didn't fix the real problems like the too small cache. And when will people stop describing any Niagara as being capable of running 256 threads - it can't run them concurrently, at most there are only ever sixteen wheiner threads running, and they are all waiting on the too small cache. I nearly fell off the chair laughing when you tried to attack Intel on cache hit ratios - Intel's cache hit ratios for all their x64 and Itanium range are far better than any Sun CPU's.

    But, more to the point, chips like Nehalem and Power not only have real cores that can handle proper threads, they have larger caches and more bandwidth to the chip courtesy of technology such as DDR3 memory, which keeps the cores spinning more. Niagara is Sun admitting they can't keep the cores spinning, it is a surrender to poor bandwidth design. Making the cores slightly faster is not going to help much other than make them even more starved by the lack of cache. The idea of seriously comparing any T2/T2+ server to a Power6 server is simply laughable, it's like putting a courier's moped up against an articulated lorry. Sure, the moped may get across town faster when all you want to send is a parcel, but for shipping that grand piano you want the lorry.

  21. Billl
    Happy

    Re: RE: Kebabbert

    Matt, stop showing your ignorance.

    The design of CMT makes smaller caches possible. It does not require larger caches. Though it is unlikely that 256 threads will run at once, your assertion that "at most sixteen" threads can run is just plain wrong. On the T5440 there are 4 CPUs, each with 8 cores and 8 threads per core. At least 8 threads per CPU can run at once, and since there are 4 CPUs that means at least 32 threads can run in parallel (see the quick count sketched below). Even that doesn't fully explain it, as there are multiple steps in the pipeline and each thread can be at a different stage of that pipeline, making it possible for more than 8 threads to be in flight per core. Now, go and look at your HP/UX load and see how many CPUs are stalled waiting on memory. That exorbitant cache is not always helping you, except in some very cache friendly situations. A stalled CPU is doing nothing else while waiting for memory; while it waits, a Niagara CPU could actually be getting work done with one of the other threads that is ready to run... Please see this document:

    https://www.sun.com/offers/details/cmt_wp.html

    It will clear up much of your misguided HP FUD. Yeah, yeah, I know you don't work for HP, but you readily repeat their drivel...
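
    To make the thread counting above concrete, a minimal sketch; the "at least one runnable thread per core" floor is the conservative lower bound used above, not a statement about how the pipeline actually schedules threads:

    # Hardware thread contexts on a T5440 as described above.
    sockets, cores_per_socket, threads_per_core = 4, 8, 8
    contexts = sockets * cores_per_socket * threads_per_core   # 256 resident hardware threads
    running_floor = sockets * cores_per_socket                 # at least one thread per core in flight
    print(contexts, running_floor)                             # 256, 32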

    The fact is that Intel is forced to increase cache because they have not done and cannot do any real innovation in the core of the CPU. The majority of Tukwila will be cache, as the EPIC design does not allow for much else. Cache is a workaround, not a feature.

  22. Kebabbert

    Matt Bryant

    "So, they made the cores faster, but didn't fix the real problems like the too small cache."

    Huh? You didn't understand anything I wrote, did you? Sun's Niagara design ALLOWS a small cache. Sun's solution makes having an enormous cache obsolete. Other designs force the CPU to have an enormous cache and complex prefetch logic. You must devote many transistors to these two things. Transistors that can be better spent elsewhere. I've told you that you do NOT want to spend all these transistors on enormous caches and prefetch logic. The fewer transistors, the less heat and the less complex the chip will be. This translates to easier to manufacture, easier to debug, higher quality, better yield, etc etc etc.

    And then you make remarks like "HAHAHAHAHA!! Niagara doesn't have a big cache or complex prefetch logic! HAHAHAHAHA!!!". I suggest you study more. I have a double MSc, one in maths and one in comp sci. You should learn a bit more about computer architecture before making such ignorant remarks.

    "I nearly fell of the chair laughing when you tried to attack Intel on cache hit ratios - Intel's cache hit ratios for all their x64 and Itanium range are far better than any Sun CPU's."

    As I told you, CPUs have got faster and faster, whereas RAM has not evolved in the same way; the difference in clock speed keeps getting larger. It is obvious that a configuration where the CPU and RAM are equally slow will not suffer from cache misses the way a configuration with a very fast CPU and slow RAM does. That is pure logic, yes? The only way the fast CPU could avoid cache misses is if the CPU had extrasensory perception and psychic powers to predict the next data to fetch in advance.

    This is pure logic. Obviously you haven't studied logic at university, but you should. Then you wouldn't make such illogical and wrong remarks. And if you persist in stating that a fast CPU has fewer cache misses than a slow CPU, then I require proof from you. Show me papers supporting your false statement. (Next time, don't make things up. If you want to make things up, at least make sure they have no obvious errors.)

    "But, more to the point, chips like Nehalem and Power not only have real cores that can handle proper threads, they have larger caches and more bandwidth to the chip courtesy of technology such as DDR3 memory, which keeps the cores spinning more. Niagara is Sun admitting they can't keep the cores spinning, it is a surrender to poor bandwidth design. Making the cores slightly faster is not going to help much other than make them even more starved by the lack of cache."

    A large cache is something you want to avoid. It is not a good thing. The Niagara has a small cache, and still it outperforms Power6 at 5GHz. One 1.4GHz Niagara is worth six Power6 CPUs at 5GHz, according to the white paper from Oracle on the Siebel benchmarks. How on earth can a Power6 have fewer cache misses if it is so slow? If its cache functioned perfectly, it would beat a slow 1.4GHz CPU, yes? But the benchmarks show that you are wrong. Again. The IBM Power6 has a really bad, arcane design. You know, very high clock speeds and a large cache with complex prefetch logic were good in their day. But there is a new solution in town now - one that is superior to the old design philosophy from the 1970s.

    "The idea of seriously comparing any T2/T2+ server to a Power6 server is simply laughable, it' like putting a courier's moped up against an articulated lorry. Sure, the moped may get across town faster when all you want to send is a parcel, but for shipping that grand piano you want the lorry."

    Yes, I totally agree with you! This is the only sane thing you've posted. One Sun T5440 scores 14,000 on the Siebel benchmark and three of the IBM p570 Power6 servers together score 7,000, according to Oracle. This shows that it is simply laughable to put one T5440 against three p570s. The Sun Niagara is famous for massive throughput under enormous loads; the IBM Power6 is not. The Power6 is like a small moped, whereas the Sun is like a huge lorry capable of enormous loads, which is what the benchmarks show.

    If the things you write were true about the Niagara being inferior to Power6, how come the Power6 bites the dust by a large margin in the benches? You should apply some logic to your statements first. It helps a lot.

  23. Matt Bryant Silver badge
    Happy

    RE: Billl

    "Matt, stop showing your ignorance...." Anyone still pushing Sunshine after the Sunset, the massive decline in SPARC sales and the current uncertainty as to what current Sun hardware will even be available in a year's time, really hasn't got the right to accuse anyone of ignorance.

    "....The design of CMT makes smaller caches possible...." Nope, the design of T2/T2+ means you have to make do with small cache split between all the cores and being flushed continuously as you switch between stalled threads. And every time there is a cache miss it's off to RAM ( relatively slow), local disk (very slow) or the SAN (extermely slow!). Which is why T2/T2+ can only shine with wheiner-threaded apps or light loads like webserving.

    ".....It does not require larger caches....." Which is a bit like saying "my car doesn't need a bigger engine as I'm happy crawling along at a slow speed". Of course it needs more cache! Any CPU design will benefit from more cache. Actually, I should rephrase that - any CPU design will benefit from a competant cache design that uses good prediction technologies to try to avoid the lag of going to RAM or out to disk. Since Sun can't match IBM or Intel or either of those two points it would be more accurate to say the T2/T2+ designs can't use more cache even though the design desperately needs it. Even after T2/T2+ forces users to recompile their old Slowaris apps to get away from the old heavy threads of UltraSPARC, they still have a tiny cache space for each thread as the cache is shared between so many cores and stalled threads. Compared to Nehalem, Power6+ or Tukwila, T2/T2+ is cache-starved and has poor memory bandwidth.

    The reason it's unfair to compare T2/T2+ to those real enterprise CPUs is because T2/T2+ was never meant to be a real datacenter CPU, it was always meant to be the cheaper option for webserving whilst Rock was supposed to do the heavy lifting. Now, with Rock dead, the Sunshiners are desperately spinning T2/T2+ as some kind of Rock replacement, when the reality is it's like taking a waterpistol to a five-alarm fire when everyone else has a proper firetruck.

    "....The fact is that Intel is forced to increase Cache because they have not and cannot do any kind of innovation in the core of the CPU...." Yeah, that whole EPIC thing was handed to hp and Intel on a plate - not! The facts are that IBM and Intel have moved on from RISC, IBM with a semi-RISC design whilst Intel have taken hp's EPIC and moved it forward. Sun are stuck trying to still peddle RISC, their only "innovation" being to try and strip the RISC core down an make it tiny so they can squeeze more onto a die. The truth is hp/Intel have innovated the most, IBM have followed, and Sun has innovated the least. Rock was supposed to be their innovative design - it failed.

    "....Now, go and look at your HP/UX load and see how many CPU's are stalled waiting on memory...." Actually, with good coding, not as many as you Sunshiners like to claim. I know you like to pretend that everyone else's CPUs are spending most of their time idling, but that's just the daydream you like to believe because it helps you keep faith.

    "....That exorbitant cache is not always helping you, except in some very cache friendly situations...." Like running large enterprise applications such as databases, perhaps? Just because you want to believe having large cache is not an advantage, it does not make it true. Even if there wasn't the evidense of comparing different models of the Intel CPUs with different cache sizes, common sense would let anyone with half a clue about computing realise that large caches are going to help as they cut down on the amunt of calls to memory and disk. For you to say otherwise is just such obvious male bovine manure. You lot are like the Black Knight out of "Monty Python & The Holy Grail", insisting that having your legs and arms cut off is not a problem.

    "....It will clear up much of your misguided HP FUD...." Yes, I'm so sure Sun FUD will help clear it all up - not! By the way, did you notice that big drop in Sun sales? That was because the rest of us don't believe that Sun FUD. You are wasting your time posting it here.

  24. Matt Bryant Silver badge
    FAIL

    RE: Matt Bryant

    ".....Huh? You didnt understand anything I wrote, did you?...." Let's see, looking back at what you wrote.... Yeah, I did, it's just I saw it for the complete Sunshine it was.

    "....SUNs Niagara design ALLOWS a small cache...." Lol, that's like saying a Yugo ALLOWS a smaller engine compared to a Porsche. It's not that the design allows a small cache, it's that the design forces a small cache for two reasons - one, it is supposed to be a cheap design to fight x64, so Sun can't afford to put too much cache on it; and two, it would not be possible for Sun to provide enough cache for all the cores as there is not enough space on the die, and to make it larger would make it both hotter and much more pricey. Sun CHOSE to make the design with small cache as they always thought they would have Rock to take on Power and Itanium.

    "....You must devote many transistors to these two things. Transistors that can be better spent elsewhere....." Yes, like in real cores. The problem for your little bit of fantasy is that Nehalem has proper cores and much more cache at a lower price, and Power and Itanium manage to include real cores with much larger register counts because they are full-scale enterprise designs, not cheap alternatives.

    ".....Ive told you that you do NOT want to spend all these transistors on enormous caches and prefetch logic....." Well, you don't if you're Sun and you're desperately trying to fight off Xeon. Of course, Power and Itanium weren't designed to fight Xeon but each other, and they do so with no holds barred. They have large transistor counts becasue they have larger register counts, fatter pipeslines and properly implemented technologies such as cache prediction, whereas T2/T2+ has always been a compromise to meet a pricepoint in a lower league. If that wasn't true then Sun would never had felt the need to design Rock or use Fujitsu's SPARC64 chips.

    "....The less transistors, the less heat and the less complex the chip will be....." And the less it will be able to do. You can't have it both ways. By making T2/T2+ with such simple cores and so little cache, Sun made a chip that was never going to be competitive in anything other than webserving.

    "....This translates to easier to manufacture, easier to debug, higher quality, better yield, etc etc etc....." Oh yeah, that worked out so well for you on Rock, which also had a simplified RISC core design, though not quite as crippled as T2/T2+. And the new and higher-clocked T2/T2+ parts are suspected of just being from deep sorts of the bins, which means they are zero innovation.

    "....And then you make remarks as "HAHAHAHAHA!!, Niagara doesnt have a big cache and no complex prefetch logic! HAHAHAHAHA!!!"....." Did I post "HAHAHAHAHA"? Nope, must be that imagination of yours. It seems quite fertile, if a little unoriginal.

    "...I suggest you should study more...." Sorry, too busy doing real enterprise work, which included doing comparative benching of Niagara against Xeon. I'm guessing the only benching you do would be in a park.

    "....I have double M Sc degree, one in math, the other in comp sci..." Which just goes to show that education is no protection agaist the idiocy of the Sunshine. Did you also brag about your qualifications earlier this year when you were no doubt telling everyone that Sun wasn't up for sale, that Rock was a sure-fire success, etc, etc? I'm not going to list qualifications or even where I studied, as it would be rather crass, but let's just say you're not winning in that area either.

    ".....You should learn a bit more about computer architectures before making such ignorant remarks?..." Ah, the old Sunshiner standard - "if you don't agree with me it must be because you are ignorant and stupid". Well, it looks like the majority of the market are just as stooooooopid, 'cos we aren't buying Sun. Does it frustrate you that you are just so gosh-darn smart but all the stupid little people don't listen to you?

    "....As I told you, the CPUs has been faster and faster, whereas the RAM has not evolved in the same way....." Of course! That old EDO RAM in my loft is just as fast as DDR3, how stupid of me to spend all that money buying DDR3! I'm so glad there is someone as clever as you to set me straight. Now, if only we could get you to work on World peace.....

    "....The only way the fast CPU could avoid cache misses is if the CPU had Extrasensory Perception and Psychic powers to predict the next data to fetch in advance...." Well, here in reality we use cache algorithms. You may want to take some time of from your next MSc to do a little reading about algorithms such as least frequently used, adaptive replacement, or multi-queue caching. You'll find there is no ESP required, but it may require you to take a step out of your fantasyland.

    "....And if you persist stating a fast CPU has fewer cache misses than a slow CPU, then I require a proof from you. Show me papers supporting your false statement...." Did I say that? Even if I did, how do you prove the statement is false? You didn't. Did you post any links to paper supporting your waffle? No. And you "require" proof from me? What are you, my teacher? Looks like your studying missed out a big section of basic comprehension as well as basic manners. But let's just think about your statement, using that logic you are so hot on. The point I mentioned is that Itanium, Nehalem and Power have MORE cache than T2/T2+, not that they are faster clock speeds. The logical conclusion would be that you want to drag the conversation away from this fact by a brash assertion that I am "lying" about something completely different. Either that or you're just an idiot talking through your rectum.

    "....A large cache is something you want to avoid. It is not a good thing....." OK, you can't seriously expect anyone to swallow that massive lump of male bovine manure! If your requirement is performance, then the larger the cache the better! But then I'm guessing you only want to consider such factors as power consumption, where the T2/T2+ is superior. The reality is companies will pay to get performance, even if it is power-hungry, because they really need it for those enterprise applications.

    "....The Niagara has small cache, and still it outperforms Power6 at 5GHz....." In what way does it outperform it? On what test, using what application, to fit which business requirement? My own experience is that Niagara has zero chance of outperfroming Power6 in any of our business uses, except webserving. And even then we use x64 for webserving as it is easier, more flexible and far cheaper.

    "....If the things you write were true about the Niagara beeing inferior to Power6, how come the Power6 bites the dust by a large margin in the benches? You should apply some logic on your statements first. It helps a lot." You really can't be so stupid as to believe that one carefully crafted bench makes Niagara faster than Power6! I'm sure if I called Armonk they'd supply me with a dozen equally carefully crafted bench sessions showing Power6 walking all over Niagara. The fact is neither has any bearing on my actual business requirements, and that's even after admiting we use Oracle and Siebel. I have benchmarked Niagara, using our apps in our environment, with Sun providing the tuning, and it provided miserable Oracle performance when we compared it to Power5/6, "Montecito" Itanium, "Barcelona" Opteron and "Tigerton" Xeon. You can squeal accusations and your beliefs all you like, I've seen the reality, and my logic is that what I have touched and seen far outweighs what you are daydreaming of in Sunshinerville.

    /SP&L

  25. Justin Stringfellow
    Dead Vulture

    nit pick

    The T1 actually topped out at 1.4GHz.

  26. Kebabbert

    Matt Bryant

    Oh, it is always a pain in the *** trying to explain technology to some business people. The worst thing is when they haven't done the basic computer architecture courses at university. And it doesn't help when they apply unsound logic either.

    Matt, I suggest you talk to some people who know computer architecture. Talk with people at a university; they are probably less biased than the average HP or IBM sales rep you speak to, and they know a lot more. It is quite stupid of IBM to state that "one Power core is faster, ergo the Power CPU is faster". That just doesn't add up, logically. But I understand that it sounds fair to you, and that you have problems using logic, since you have clearly not studied it. Otherwise you wouldn't have said these ignorant things about CPUs.

    Actually, the things you say are so weird it makes you wonder. For instance:

    ----------------------

    ""Matt, stop showing your ignorance...." Anyone still pushing Sunshine after the Sunset, the massive decline in SPARC sales and the current uncertainty as to what current Sun hardware will even be available in a year's time, really hasn't got the right to accuse anyone of ignorance."

    If you really believe that the best technology always wins, then you are quite naive. For instance, have you heard about VHS vs Betamax? No? Have you heard about Windows vs Unix? No? *sigh* If Sun doesn't sell Niagara boxes that cost one fifth of the price of a Power box despite higher performance - what does that mean? That the Sun tech is bad, or that the sales division is bad? Hmmm... Let me see, it must mean that the Sun tech is bad, right? With furious marketing you can put lipstick on a pig and outsell anything else.

    -------------------------------------

    ""....The design of CMT makes smaller caches possible...." Nope, the design of T2/T2+ means you have to make do with small cache split between all the cores and being flushed continuously as you switch between stalled threads. And every time there is a cache miss it's off to RAM ( relatively slow), local disk (very slow) or the SAN (extermely slow!). Which is why T2/T2+ can only shine with wheiner-threaded apps or light loads like webserving."

    You haven't read what I wrote. Or you did, but didn't understand. Well, it is not really that hard to understand (I hope); it doesn't actually require an MSc. I will do my best to explain this again. Listen carefully.

    A CPU will ALWAYS suffer from cache misses. There is no way of avoiding them. The only way to avoid cache misses entirely would be for the CPU to use psychic powers.

    Fact: CPUs will ALWAYS suffer from cache misses. Intel says a normal 2GHz x86 server idles 50% of the time under full load because of cache misses. This idling occurs because RAM is much slower than the CPU.

    Now there are two strategies to deal with this fact.

    A) You try to minimize cache misses by using large caches and complex prefetch logic. Then you can maybe decrease the idling from 50% to 45%. But I doubt even that, as Intel has applied both techniques and an Intel CPU still idles 50%. The higher the frequency, the more idling. A 5GHz CPU idles maybe 70%? I don't know, just a guess - I haven't seen studies at 5GHz frequencies.

    B) You don't try to minimize cache misses at all. You KNOW there is nothing you can do to avoid them, so why not work around the problem instead of trying to minimize misses, which is futile? It is a lost battle; chip makers have been trying to avoid misses for decades now, and the company with the most research resources, Intel, is still stuck at 50%. If Intel's CPUs are idling at 50% despite all that research, there is nothing more you can do against cache misses. The CPU would need psychic powers to foresee the future and avoid them, which is impossible.

    Instead of fighting this fact, work around it. There WILL be cache misses. So use a new, revolutionary technique: mask the cache misses. As soon as there is a cache miss, switch thread at once (this is the key point - at once; a normal CPU takes hundreds of cycles to switch thread, so it might as well just wait for the data from RAM, it is equally slow) and continue working on another thread. Do not try to avoid cache misses; instead do some useful work while you wait. This is a unique and new solution.

    You can never beat cache misses; the laws of probability and mathematics are against you. So don't fight them - do some useful work instead of idling and waiting. This is extremely clever. After decades of research into route A) you are still stuck at 50% idle. Route A) is legacy, and you need a new solution.
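
    A toy model of strategies A) and B), using the same illustrative figures as before (100 cycles of compute between misses, a 100-cycle stall per miss); the round-robin switching here is a simplification for illustration, not a description of the real Niagara pipeline:

    # Core utilisation with N hardware threads hiding a fixed memory stall.
    C, M = 100, 100   # compute cycles per miss, stall cycles per miss

    def utilisation(n_threads):
        # Each thread wants C cycles of core time per (C + M) cycles of its own
        # compute-then-stall loop; the core stays busy once the combined demand
        # from all resident threads covers the stalls.
        return min(1.0, n_threads * C / (C + M))

    for n in (1, 2, 4, 8):
        print(f"{n} thread(s): core busy {utilisation(n):.0%} of the time")
    # 1 thread -> 50% busy (strategy A); 2+ threads -> 100% with these toy numbers (strategy B)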

    Because of this new solution, you also stop caring about cache misses. They will occur; you cannot fight them. And because they will occur anyway, you don't care about large cache sizes or complex prefetch logic either - both of those legacy techniques only matter when you are trying to fight cache misses, and Sun doesn't fight them.

    THIS is the reason the Niagara doesn't have large caches or complex prefetch logic. The Niagara doesn't combat cache misses; it works around them.

    If the Niagara had large caches and complex prefetch logic, there would be little to gain. You would have to spend huge numbers of transistors for no benefit at all, and the Niagara would be a power hog like the Power6, drawing maybe 500 watts and still performing poorly. As it is, the Niagara uses around 100 watts (less than Intel's server CPUs) and still owns the Power6.

    OK, have you finally understood the point of the Niagara approach? It is like a Gatling gun (many, many small, fast bullets) instead of building ever-larger rifles - Magnum, Mega-Magnum, Mega-Mega-Magnum - the way Power6 does. The new solution is many smaller, faster bullets rather than one big bullet.

    The Niagara doesn't fight cache misses; it works around them. It is impossible to fight cache misses. The Niagara doesn't need a large cache because it doesn't fight cache misses. A large cache is only needed when you follow the other strategy: fighting cache misses.

    I suggest you read this again. Slooooowly. Just like a Power6. It surely is a pain to explain tech to someone ignorant, but it is not really hard to understand: the point is that the Niagara uses a new technique, not the old legacy technique of large caches and prefetch logic.

    Matt, besides complaining that the Niagara has no large cache, you could also complain that it has no elaborate prefetch logic. For a legacy CPU, not having complex prefetch logic really is bad. But I hope you understand now that neither is needed in the Niagara's new, unique solution.

    --------------------------------------

    Another thing. You state that you have benchmarked the Niagara and it turned out not to suit your needs. That is fine; the Niagara is not the best route for every problem. If you can use many small, fast bullets, use the Niagara. If you need one large bullet, you are better off with a legacy CPU like Intel or Power6.

    But all benchmarkers need to know this:

    The Niagara CPU has to be loaded EXTREMELY heavily to shine. Several benchmarkers have loaded a Niagara CPU with a small test workload and concluded that the Niagara sucks. But when you load the Niagara with large workloads it never chokes - it continues to work happily, thanks to its new architecture. If you only utilize a few threads, the Niagara will seem slow; I have read several reviews that show this. They used a small test workload and concluded that the Niagara was the slowest in town. But when you load the Niagara far beyond the point where the legacy CPUs choke, the Niagara just doesn't care. It can handle enormous workloads, far more than any legacy CPU. That is the point of using the Niagara: load it up enormously - far more than any legacy CPU - and you will see it performs extremely well. Load it up with a few threads and you have never seen its potential.

    This is the reason it wins over the Power6 CPU on many benchmarks - for instance

    Oracle, SAP, SPECint, Lotus Notes, etc.

    Here are just a few of the world records set with the older T2 CPU, where the Power6 bites the dust:

    http://johnjmclaughlin.blogspot.com/2007/10/utrasparc-t2-server-benchmark-results.html

    And one last thing: the benchmarks above are not a case of "one carefully crafted bench makes Niagara faster than Power6". They are not carefully crafted benches from Sun. They are valid benchmarks, specified by other companies such as Oracle.

  27. Matt Bryant Silver badge
    FAIL

    RE: Matt Bryant

    Oh dear, the separation from reality is strong with this one. Even Novatose didn't quite hit this level of denial!

    "Oh, it is always a pain in the *** trying to explain some business people on technology...." I'm guessing that's because it's not your job to, seeing as you obviously have no clue about what happens in the enterprise computing field. It is my job to, and I do it often to people with real financial power on matters that affect the future of my company. All your theoretical nonsense is just fine until you apply it to the reality of the marketplace, and then the complete lack of penetration of Niagara into the enterprise datacentre is painfull evidence of just how wrong you are. Sun designed Rock to be the heavyweight of the team, and when Rock died they were left with a gaping hole in the range that Niagara just cannot fill. Why do you think even before that Sun brought out the M3000? Because Niagara just does not meet the general UNIX requirements of us customers, it is a niche solution. Go back to your ivory tower, any more time in the real World is likely to cause you mental anguish.

    "....The worst thing is when they havent done the basic computer architecture courses at the University..." But don't tell me, you are the World's greatest mind and silicon design authority. Yeah, right! But let me put your worries to rest - I studied amongst other things many of the electronic theories that govern transistor design, from basics such as what happens at breakdown right through to complex logic gate modelling, so I can quite happilly tell you to get off your high horse and stop assuming you are the cleverest and most knowledgeable person in the room. I'm betting a fair number of the readers here exceed both mine and your knowledge and experience combined. Your display of typical Sunshiner arrogance just makes you sound all the sillier.

    "....And it doesnt help when they apply unsound logic either....." And another typical Sunshinerism - "only my argument/product/solution/logic is sound, everyone else is a liar/newbie/idiot for not agreeing with me". Give it a rest you wannabe Einstein, there's a whole market out there that has shown that they can do their own thinking and that thinking is why they are not buying Sun. You can label and insult everyone that doesn't agree with you however you like, but the reality is you have zero real input to the market because you are so obviously not working in a real enterprise environment. I hope you enjoy yourself, but please remember the rest of us have real work to do, and we base our decisions on a lot more than fantasy.

    I'm scanning through the rest of your diatribe looking for something relevant, but it's just more insults and repeated waffling. I especially like the comparison to VHS and Betamax as it is often used by technical people in an attempt to paint their failed solution as technically superior, when all it does is highlight their lack of understanding of the market. I owned a Sony Betamax player, and later I replaced it with a VHS one. Later still I replaced it with a number of DVD players, but I haven't yet upgraded to Blu-Ray. Now, look at that sequence of products and try and consider it from a consumer perspective rather than the technical one you are so obviously stuck in. Yes, Betamax was a superior technology, which is why I initially purchased it, but my REQUIREMENT was to watch videos, so as soon as VHS made more sense I was happy to surrender the supposed superior technology of the Betamax format as VHS did the job just fine and a lot cheaper with wider choice. And until DVDs came along, VHS carried on meeting my requirement, as it did for the majority of consumers. Some idiots insisted on sticking with Betamax, claiming they were somehow smarter whilst struggling with the declining number of Betamax releases and players. Nowadays, DVD is the norm, with Blu-Ray still not offering enough of an advantage to get the majority of users to switch. That is a PERFECT analogy for Sun, SPARC and Slowaris - the diehard Sunshiners are still insisting Sun's products are the Betamax option because of some theoretical technical benefits the majority of users don't need, and that majority of the market has moved on to VHS and now DVDs. Niagara is like a multi-slot Betamax player - technically interesting, but as relevant to the average buyer as a chocolate teapot. Despite all your much-mentioned education, you really need to get out into the marketplace and get some real World experience, you'll probably learn more real practical and valuable knowledge in your first six months than you did during any of your degrees. I certainly did.

    As to your advice on how to bench Niagara, there's one slight problem - the real World does not craft solutions to meet a CPU's requirements, but solutions to meet business requirements. The reason Niagara is niche is EXACTLY BECAUSE its design does not reflect the requirements of the market. You are correct in describing Niagara as a Gatling gun, as the modern Gatling gun is just the application of modern technology (electric drive and belt feed) to historic technology (the hand-cranked, magazine-fed Gatling). Niagara is a desperate attempt to prolong the historic tech of SPARC and Slowaris by applying the new technology of die-shrinking for multiple cores, the only problem being they had to cripple the cores to get them to fit the price point. That's like taking a modern Gatling and making it fire .25 ACP, then pretending it will shoot to the same range as a Ma Deuce.

    To use your analogy, Sun is offering a Gatling gun (Niagara) as a single-use item, with no mounting, poor support and no roadmap for how they are going to improve the tech, with its big brother armoured system (Rock) having just been cancelled, and a history of telling customers that their backup offering (x64, as in Galaxy) is for the mentally retarded. The rest of the competition are offering MBTs with proven track records in reliability, performance and sheer kick-ass grunt, a flexible platform that allows the same vehicle design to be re-used as the basis of a number of roles, and believable roadmaps based on delivering product, all backed up with x64 partner offerings that make Galaxy look narrow. Guess which offering is more attractive to your average army? I'll give you a clue, seeing as you are obviously unused to thinking outside your neat little Sun-lined academic box - it ain't Sun's Saturday night special, no matter how many barrels they put on it.

    /Ah, Sunshiners - the comedy just keeps on and on!

  28. Kebabbert

    Matt, Oh Matt...

    One good thing is that you don't state those ignorant things about Niagara anymore - that it needs more cache. It doesn't. It only took three long explanations for you to finally understand it. That's pretty fast. But seriously, you should also have complained that Niagara doesn't have complex prefetch algorithms. That would be a major drawback for a conventionally designed CPU. You only talked about the small cache; had you known more, you would also have mentioned the lack of elaborate prefetch logic. But Niagara doesn't need those techniques, because it takes a radically new approach. Good that you understand at last, Mattie Pattie.

    ------------------------------

    "But let me put your worries to rest - I studied amongst other things many of the electronic theories that govern transistor design, from basics such as what happens at breakdown right through to complex logic gate modelling,"

    Excuse my language, but this is just plain bull s**t. If you really had studied something technical, you would not have had such difficulties understanding the radically new approach of Niagara. And you would certainly not have applied that unsound logic of yours. It doesn't add up. It is like saying, "Me have PhD in English literature, yes. Me good English!" If someone claimed that, you would doubt him slightly, yes?

    -------------------------

    "I'm betting a fair number of the readers here exceed both mine and your knowledge and experience combined."

    Actually, I will not say more about what I have done, but regarding academic merits, I know that I am one of the few at one of Europe's best universities. I AM good. But I know that there are better people than me - they are just not that common. Seriously. As for experience, yes, there are surely people here who have more experience than me. But I now work at a large US finance company, well known to everyone, one with the best reputation. Sadly, management won't buy SUN hardware, despite it being faster and cheaper. I hope Oracle will rectify the situation.

    --------------------------------

    Here is more of your Niagara bashing.

    "Niagara is like a multi-slot Betamax player - technically interesting, but as rellevant to the average buyer as a chocolate teapot"

    "All your theoretical nonsense is just fine until you apply it to the reality of the marketplace, and then the complete lack of penetration of Niagara into the enterprise datacentre is painfull evidence of just how wrong you are."

    "That's like taking a modern Gatling and making it fire .25 ACP, then pretending it will shoot to the same range as a Ma Deuce."

    Yadda yadda yadda. If these things you say were true, how come it wins all these benchmarks? If Niagara really sucked - if it suffered from too small a cache, had no intricate prefetch logic, and was of no interest to the average sysadmin - how come it beats the Power servers so easily? I don't get it. You say so many things, and in the end the results say the opposite of what you state: Niagara beats everything else, and it is cheaper than everything else. That should make it interesting to the average sysadmin.

    You say things, but the real-world benchmarks and testimonies say the opposite. You have been disproven. No scholar would keep repeating false statements after being disproved, but you do. Not very scholarly, is it? There are numerous proofs, and still you don't consider them. That is neither scholarly nor academic.

  29. zvonr
    Grenade

    The 10-petaflop computer will use SPARC - also known as the world's fastest CPU:

    The system will adopt Fujitsu's SPARC64™ VIIIfx CPU (8 cores, 128 gigaflops), which is manufactured using the company's 45-nm process technology. As the world's highest-performance general-purpose CPU, the processor offers both performance and energy efficiency, achieving a computational speed of 128 gigaflops per CPU. The inclusion of an error-recovery function also enhances its operability.

    http://www.prdomain.com/companies/F/Fujitsu/newsreleases/200971874149.htm

    We will see:

    whether they can deliver,

    and whether IBM can keep up.

    As for HP, all it can do is watch ...

  30. Matt Bryant Silver badge
    Happy

    RE: Matt, Oh Matt...

    "One good thing is that you dont state those ignorant things about Niagara anymore, that it needs more cache...." Oh, I'm sorry, did you very stupidly mistake my not having to say it over and over again as somehow agreeing with you? OK, again just for you - Niagara doesn't have enough cache. It also has a crappy design that means even if they added in more cache, the cores still wouldn't be able to use it in an optimal manner becasue Sun designed them as crippled cores to meet a pricepoint.

    "....But, Niagara doesnt need those techniques as it has a radically new approach...." Yes it is different, thought not radical, just a complete acceptance of failure. Sun couldn't design a core that could match x64 with proper bandwidth, so they instead decided to rope sixteen wheiner cores into one die in an attempt to fool customers into thinking they could run 264 concurrent threads. They just forgot to say the threads were too small to run any of their current apps, and those stalled threads would mean memory and cache tied up and not available to active threads, which all meant that on anything more stressing than xclock Niagara couldn't keep up with an Atom, let alone Xeon. Customers that tried to run existing Oracle apps from their UltraSPARC Netras couldn't believe how poor the performance was, which was why Sun was forced to take the M3000 with SPARC64 to market, because Niagara just couldn't do the job required.

    "....Excuse my language, but this is just plain bull s**t. If you really had studied something technical, you would had have such difficulties to understand the new radical approach of Niagara...." Your problem is you just can't accept that someone can understand the technical "merits" of Niagara but at the same time see exactly how it doesn't meet the requirements of the market, and the reason is because you are just too obtuse to see the latter. Or maybe that's deliberately obtuse. Either way, since you obviously didn't attend the same educational institutions as myself, you have no idea what qualifications I have.

    "....Actually, I will not say more about what I have done, but regarding academical merits, I know that I am one of few at one of the Europe's best Universities...." And there's the admission we expected - zero practical experience, fresh out of your ivory tower. Mind you it's not surprising you have a problem comprehending my posts as English is not your first language, and it looks like techincalese is not natural to you either. Don't worry, when you've finished in whatever kindergarten you're studying at in Germany or wherever, come over to the UK and we'll find you a proper uni to study at.

    "....Sadly, management wont buy SUN hardware, despite it being faster and cheaper...." Maybe that's becasue they have the experience and market savvy you don't have. Or maybe because Sun just isn't faster and cheaper like you think it is. Did you ever stop to wonder why your bosses don't buy Sun, especially considering the financial market used to be Sun's playground?

    "....If these things you say were true, how come it wins in all the bench marks then?...." Here's a hint - Niagara doesn't win all the benchmarking sessions. Believe me, despite what Sun told you it did, it doesn't. If it did, then they would never have needed Rock or SPARC64 and I'd be recommending to my boss we buy Sun. But we haven't bought any Sun for years. However, we have bought Xeon, Power, Itanium and Opteron.

    /has Ponytail been leafletting the nurseries now???

  31. Kebabbert

    Matt, Oh Matt...

    "OK, again just for you - Niagara doesn't have enough cache."

    Mattie Pattie boy, I've told you that Niagara also lacks complex prefetch logic. Why don't you mention that? It is also a major drawback for a conventional CPU. Don't just bang on about the "lack of a big enough cache" - you have two things to attack here: a small cache and no complex prefetch logic.

    If Niagara doesn't have ENOUGH cache, how can it win all these benchmarks? If it lacks so much of the machinery needed to do its work, how can it be the fastest? I don't get it. Do you mean that all the benches are lies?

    Look, an Intel/IBM CPU may have a high cache hit ratio, say 90%, but whenever it needs to fetch data from RAM it has to wait for an eternity. That is why, in the end, the Intel studies show such a CPU idling around 50% of the time waiting for data - even under full load. A few cache misses cause long stalls and lots of waiting. A Niagara doesn't care: it just carries on with another thread while it waits. Ergo, it doesn't need a large cache to win these benchmarks. Which it does. The rough stall arithmetic below shows why a modest miss rate is enough to halve a conventional core's throughput.
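
    (To make that concrete, here is a minimal back-of-the-envelope stall model in Python. All of the numbers - memory references per instruction, miss rate and miss penalty - are illustrative assumptions, not measurements from any particular chip; the point is only how quickly a long miss penalty eats into a single-threaded core's utilisation.)

        cpi_base     = 1.0    # ideal cycles per instruction with no misses (assumed)
        mem_refs     = 0.3    # memory references per instruction (assumed)
        miss_rate    = 0.02   # fraction of those references missing all caches (assumed)
        miss_penalty = 150    # cycles to fetch a line from DRAM (assumed)

        # average cycles per instruction once stalls are included
        cpi = cpi_base + mem_refs * miss_rate * miss_penalty
        busy = cpi_base / cpi   # fraction of cycles spent doing useful work

        print(f"effective CPI: {cpi:.2f}")          # 1.90 with these numbers
        print(f"core busy {busy:.0%} of the time")  # ~53% - roughly the 50% figure above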

    ---------------------------

    Regarding English not being my first language, that is true. But on the other hand, I doubt you speak my language. Hence, I speak at least two languages, whereas you speak at least one language. 1-0 to me. :o) (In fact, I speak five languages).

    ----------------------------

    Regarding my uni: it is actually one of the best in the world in my highly mathematical corner of Comp Sci. I think Cambridge has the better overall reputation, but my uni has a world-class reputation too, and its researchers are among the best in the world - better than those at Cambridge if you go by the awards they've won and how often their papers are cited. I won't specify the subject, because anyone knowledgeable would immediately know which uni I am talking about.

    -----------------------------

    "Here's a hint - Niagara doesn't win all the benchmarking sessions. Believe me, despite what Sun told you it did, it doesn't. If it did, then they would never have needed Rock or SPARC64 and I'd be recommending to my boss we buy Sun"

    I and others have posted links that show Niagara winning lots of benchmarks. And you claim that the benchmarks are lying? Explain that to me. Prove your claim.

    How can Oracle, for example, publish white papers on their own website showing that Niagara totally demolishes Power6 if those white papers are not true? Either you are lying, or Oracle is lying. Now, whom should we trust? You mean I should trust you: "Believe me, despite what Sun told you it did, it doesn't."

    If you can prove to me that SUN is lying about these benchmarks, I will be deeply disappointed in SUN, and I will no longer hold SUN in high regard. Then SUN is as bad as IBM, despite opening up their tech. The things I don't like are lies, FUD and dishonest people. If SUN is lying, as IBM does, then SUN has lost a supporter. Prove SUN's lies to me - here is your chance.

    "Sun couldn't design a core that could match x64 with proper bandwidth,"

    Well, I think the T2 chip has quite good bandwidth - 60GB/sec is quite good, don't you think? Here is an article on the T2. It describes the two broad CPU approaches - many simple threads at low frequency versus a large cache - and explains why SUN has not taken the larger-cache route.

    http://www.itjungle.com/bns/bns040908-story03.html

    It all boils down to this: I believe SUN's published white papers have some credibility. You say no. Well, it should be quite easy to see who is right - SUN's white papers or you. Whoever is right wins my support. And I will be the first to tell the world that SUN is lying - or that you, Mattie Pattie, are lying.

    --------------------

    OTOH, we know that IBM lies a lot. The Power6 bandwidth of 200GB/sec or so comes from adding up the bandwidths of all the different caches and buses - L1 + L2 + etc. You cannot do that. A chip is no faster than its narrowest link; summing every bandwidth in the hierarchy is just plain silly.

    The same goes for the claim "one Power6 core is faster than Niagara, ergo the entire CPU must be faster" - that is also silly. If one part is faster, that says nothing about the chip as a whole.

    Or when IBM states that a mainframe can consolidate 1,500 x86 servers - which requires all the x86 servers to be idling at a few percent and the mainframe to be fully loaded at 100%. By that logic I can state that my laptop can consolidate 10 x86 servers, if they all idle. Hardly a fair claim. In fact, the old rule of thumb is that 1 MIPS corresponds to roughly 4MHz of x86, so a mainframe CPU has low per-engine performance and can be emulated on an x86.

  32. Matt Bryant Silver badge
    Troll

    RE: Kebabbert & Novatose (a new comedy duo!)

    Oh dear, Kebabbert still hasn't got round to taking a reading class; his lack of comprehension is just staggering! Imagine that - someone that can be obtuse in five languages!

    "....If Niagara doesnt have ENOUGH cache, how can it win all benchmarks? If it lacks lots of functionality to do it's work, how can it be the fastest? I dont get it. Do you mean that all benches are lies?...." <Sigh> It's like talking to the village idiot. Probably a Schwabian village idiot at that! As I have said before, Niagara does not win all the benchmark sessions, are you seriously telling me you have checked EVERY single publically available benchmark and declared Niagara the winner in each? If so then there is only one word for that, and unfortuantely it is the word you throw around with abandon - a lie.

    Please point me to the Niagara benchmark result for a 64-way system. Oh, you can't, because Niagara doesn't scale. How about T2+ versus Itanium or Xeon running MS SQL Server? RHEL and MySQL maybe? Darn, just can't do either of those on one-trick-pony Niagara, so we'll just have to mark that up as another big fail on Sun's part. Oh dear, looks like I've just poked a very big hole in your blindfold, hope I didn't take an eye out in the process. Please feel free to apologise for accusing me of being a liar in your next post.

    "....I and others have posted links that shows that Niagara wins lots of benchmarks. And you claim that the benchmarks are lying? Explain that to me. Prove your claim...." Oh, you mean old Novatose's posts? Don't worry about him, we had a good laugh at him before in this thread http://www.theregister.co.uk/2009/02/27/hp_sun_oem_comment/comments/ when he tried cherry-picking SAP benchmarks. I never said they were "lying" benchmarks, just that they were carefully selected to try and paint a positive picture. All vendors do it, if you'd been in the industry a little bit longer than five minutes you'd know that. But my basis for argument is not someone else's benchmarks, I have done my own, and there is no way I'm going to ignore my own empirical evidence in favour of your fantasy "technical" advice, especially as it is clear you have never benchmarked anything.

    You see, there is actually only one really true benchmark, and it is the market. You can have what you think is the cleverest product in the World, but if the market doesn't believe it or simply doesn't need it, then it will fail. That is what killed Sun - it failed the market bench session because it had unwanted products and unconvincing strategies, so the customers simply bought elsewhere. You may notice that other companies such as IBM, hp and Dell are doing a lot better. Well, you might if you took your head out of the sand.

    /SP&L, especially at Novatose's ickle new friend.

    PS: I do have an agenda. After years of having to suffer Sun's and their partners' selling tactics, FUD and increasingly shoddy kit and ropey support, I take serious affront at anyone trying to tell me that Sun is the greatest/cleverest/best server maker ever.

  33. Anonymous Coward
    Happy

    re: RE: Kebabbert & Novatose (a new comedy duo!)

    The fact is that MB has zero credibility. He claims that the UK has the best unis, yet his complete lack of understanding of technical subjects proves that he has not attended one of these "great" unis... MB makes blanket statements that "cache is good" and then that since "no one buys SPARC, Itanium is better". The fact is that Niagara alone accounts for more sales than all of Itanium. Using your own logic, MB, that means more people want Niagara. Of course, MB will counter that HP makes more money from Itanium... OK... Maybe... But WHAT DOES ANY OF THIS HAVE TO DO WITH THE MERITS OF THE ACTUAL TECHNOLOGY!!!

    You have proven that you lack any technical ability, MB. Please stop accusing others of the same. Everyone that has commented here has shown knowledge of the subject except you, yet you say the most... You know what they say - "They talk most who have the least to say."

  34. Kebabbert

    Matt, Oh Matt...

    Ive told you, "Niagara wins all these benchmarks" and posted some links. And that other guy also posted some links. I talk about the posted benchmarks. Ive explained to you that Niagara is good on some work loads and smokes on those. And Ive told you that Niagara sucks at other workloads. How can you infer that I mean that Niagara wins EVERY possible benchmark? Again. Your logic is unsound. The more you try to argue, the more I gets convinced you havent done any uni course work. No uni would let someone with that flawed logic pass. Seriously. You must have failed at every course, because you reason so strange. The Niagara wins all these benchmarks. Period. And Power6 looses. Thats fact.

    And where is your "OK, again just for you - Niagara doesn't have enough cache"? Have you finally understood why Niagara doesn't need a big cache? Have you finally understood that a server CPU cannot possibly hold all the software in its cache, no matter how big the cache is? Have you finally understood that a server swaps different software in and out all the time, and therefore a large cache is not as useful as it is in a workstation? Have you finally understood why the Power6, with its large cache, is better suited to a workstation? You cannot possibly believe that a Power6 can hold all the data in its cache when acting as a server.

    That is the reason the Power6 is sloooow on server workloads, Mattie Pattie boy! Its large cache is not large at all once you consider server workloads, where different software is swapped in and out all the time. That is why the Power6 behaves even worse on server workloads: it cannot utilise its large cache, it has to swap everything in and out constantly, and its 5GHz cores will idle 70% of the time or more.

    But granted, the Power6 is better suited as a desktop CPU, where it can fit all the data in its cache. As a server CPU it sucks badly, which these benchmarks show.

    Do you really believe a server can fit all the different data serving hundreds of clients into a cache? Do you now understand why the server CPU Niagara relies on fast access to RAM and on masking cache misses, instead of on a large cache? A server CPU gets little use out of a large cache - there will be cache misses all the time. I've tried to explain this to you, and you still don't get it? Seriously?

  35. Matt Bryant Silver badge

    RE: re: RE: Kebabbert & Novatose (a new comedy duo!)

    "The fact is that MB has zero credibility....." The fact is you have posted zero technical argumnet, just indulged in the favourite SUnshiner past-time - slagging off anyone that doesn't agree with you.

    "....But WHAT DOES ANY OF THIS HAVE TO DO WITH THE MERITS OF THE ACTUAL TECHNOLOGY!!!....." Simple answer - if you make more money you can innovate and develop your products and carry on making more money. If you don't, then you end up like Sun - bought for the enterprise equivalent of chump change. Niagara doesn't even make enough profit to keep the server part of the Sun hardware bizz going, so the whole future of Niagara is open to question, especially as Larry likes profit first.

    "....You have proven that you lack any technical ability, MB....." Well, let's see - I correctly identified the faults in the Sun SPARC startegy over five years ago, migrated what apps we did have on SPARC Slowaris off onto Linux and hp-ux, and saved my company money as well as improving their operational capabuility. In the meantime, I'm sure you were one of the Sunshiners running around telling everyone as loudly as possible that only Sun had the answers. Sounds like I'm a darn sight more technically capable than you at least.

    /Usual point-laugh routine for usual Sunshiner low-brow response.

  36. Matt Bryant Silver badge
    FAIL

    RE: Matt, Oh Matt...

    ".....The Niagara wins all these benchmarks. Period. And Power6 looses. Thats fact....." And the big fact I have constantly been telling you, my obtuse little foreigner, is that you are looking at a tiny subset of the available benchmarks, and none of them from real World environments. Niagara wins only a very narrow selection of benchmarks, which is what you are concentrating on and then pretending that they make Niagara a superior design, when the evidence from the marketplace is Niagara is niche, and a low-end niche at that.

    "...Have you finally understood why Niagara doesnt need a big cache?...." What I have come to understand is that the universities on the continent must be really dropping their entry requirements if they even let you in the door. Niagara needs a complete redesign, not just more cache, but there is no way that will happen as there is no-one to pay for it. Sun doesn't have the money, and Larry is concentrating on making a profit, not throwing good money after bad. Rock has already been killed off, I'm pretty sure Niagara will follow soon. Your own business doesn't even listen to you, why should anyone else waste their time with your waffle.

    ".....Have you finally understood that a server swaps different software in and out all the time, and therefore a large cache is not as useful as it is for a workstation?....." It is becoming very obvious that you know nothing about enterprise computing. Cache is very important with large applications, it is one of the reasons Itanium also trounces Niagara, UltraSPANKed and SPARC64. It is frankly astonishing that anyone working in computing could pretend otherwise, but then I'm beginning to suspect that is because you work as the teaboy in your company office.

    "....That is the reason the Power6 is sloooow on server workloads...." Yeah, so slow it just happens to be used for real enterprise power applications like SAP, Siebel, Oracle, etc, etc, whereas Niagara is rarely used outside its little webserving niche. Argue all you like but the facts are there in the market figures, as illustrated by your own company's choices.

    ".....But granted, the Power6 suits better as a Desktop CPU....." Probably the most stupid thing you have posted. Power6 is designed as a large SMP solution - it is designed to work in servers that scale, not in desktops. Please show me a single desktop using Power6, but better still please show me a major enterprise solution, like a telco billing solution or a major stock exchange, that is using Niagara for the core database and applications. I can think of examples for Itanium and Power, but none for Niagara. In fact, neither can Sun. A quick check of the Sun website shows they obviously don't think Niagara is a choice for an enterprise core solution, instead they list the likely key applications as "Media gateway controllers, Telcom operations and maintenance, Signaling gateways, Intelligent networks, MMS/SMS, unified messaging, Shipboard command and control", which are mainly telco edge services. So Sun don't seem to agree with you either, are you now going to call them liars or illeducated?

    /can continental unis really be turning out such poor quality graduates?

  37. Kebabbert

    Mattie Pattie, boy

    Answer me this. Do you really believe that a server CPU is capable of holding the data of thousands of different clients in its cache? Do you really believe that increasing the Niagara cache to 12MB (or whatever the Power6 has) would help with server workloads made up of thousands of different data sets, one per client?

    A legacy-design CPU must hold all its working data in its cache to be effective; otherwise it loses its speed. This design works for desktop CPUs, where few programs run. Power6 fits in here.

    A CPU designed for server usage must serve many different clients with different data sets. That kind of workload makes it impossible to fit all the different data sets into a cache, so a server CPU must not be sensitive to cache misses. Niagara fits in here. This is a radically new approach.

    If you bench a desktop CPU on server usage, the desktop chip will surely lose big time. Likewise, if you bench a server CPU on desktop usage, the server CPU could lose (although Niagara smokes Power6 on spec_int). Ergo, theory says that Power6 will lose big time on all server workloads. Which, in fact, it does if you look at the benchmarks.

    It doesn't matter what you say, Mattie Pattie boy - the results show Power6 sucks badly as a server CPU. That legacy CPU with its large cache is better suited as a desktop CPU. This is also a fact. Power6 is constructed as a desktop CPU that IBM falsely advertises as a server CPU. And that is the reason it is so slooooow for server usage. Which the tests show.

    ----------------------------

    "And the big fact I have constantly been telling you, my obtuse little foreigner, is that you are looking at a tiny subset of the available benchmarks, and none of them from real World environments."

    http://searchenterpriselinux.techtarget.com/news/article/0,289142,sid39_gci1313798,00.html

    Here we see that a Linux shop measured Niagara CPUs and switched. "No real world environments", eh? "Benchmarks carefully crafted by SUN", eh?

    "On a 64-bit AMD processor and Fedora, we could process approximately 200 matches per second of RSS," Whitehead said. "With Solaris 10 on the T1000, this match rate jumped to 10,000 per second."

    You can say whatever you want about Niagara, but none of it is true. Mattie Pattie boy, maybe I should call you a liar and a FUDer? Can you prove anything you have told us? I can prove that I am correct: Niagara's performance shows that it does not suffer from a small cache. You say the opposite - that it DOES suffer. Now prove it, or you are a liar and a FUDer.

    ------------------------------

    Here we see more of how IBM does its marketing (in the same vein as "one core is faster, ergo the entire CPU is faster"):

    http://thestorageanarchist.typepad.com/weblog/2008/10/1028-benchmarketing-badly.html

  38. Matt Bryant Silver badge
    FAIL

    RE: Mattie Pattie, boy

    "Answer me this. Do you really believe that a server CPU is capable of holding all thousands of different client's data in the CPU cache?...." Well, that depends on how big the dataset is for each customer and what else the cache is being used for. If for example you want to consider Power6, ignoring the L1 cache, there is the shared 4MB of L2 cache and you have 32MB of L3 cache shared between the two cores, so in theory you could actually have several thousand sets of customer data if that dataset was small, say a KB, even if the L3 cache was not all being used by one core. Which just goes to show you really didn't think before you typed.

    "....Do you really believe that increasing the Niagara cache size to 12MB (or whatever Power6 has) will help doing server work loads with thousands of different data sets, one for each client?..." Am I susprised you don't know how much cache Power6 has - not really. Like most Sunshiners, you just drink the Sun koolaid and never bother to actualy check what the competition has to offer. In the idael World you like to make believe exists, Niagara only has to contend with applications written with tiny and multiple threads. What happens in reality is Niagara keeps choking because most enterprise apps have heavy single-threaded routines. And then, when Niagara is busy swapping between threads and waiting on RAM, what do you think happens in cache? Are you saying Niagara flushes the cache all the time to make room for all that data coming in for what could be 256 stalled threads? If it does, then it will slow down even more as it has to juggle cache about to meet demand. If it doesn't, if it retains cache between misses, then that means you never have enough cache to actually satisfy 256 threads with only 4MB of L2 cache as on T2. Sun knew this was a problem with T1 as they made the cache bigger for T2, if what you pretend is true then they wouldn't need to, would they? So, obviously, more cache and a better cache handling would benefit Niagara. I'm expecting T3, should it ever arrive, to have a similar growth in cache, maybe as much as 12MB.

    "...Power6 is constructed as a desktop CPU, that IBM that falsely advertise as a server CPU...." Well they seem to "falsely advertise" it very well to a lot of very cautious buyers. You may not relaise this but try-before-you-buy is very common in the enterprise, we don't like splashing out money unless we have a very good feeling about a product. If Power6 really was a desktop CPU then it would be found out pretty soon and not bought. If any CPU is a desktop one then it's Niagara, which only scales to 4-way in T2+! Even Xeon and Opteron scale to twice as many sockets in standard servers, let alone more specilised servers such as from Unisys. T2+ is not only a crippled product, it is the most constrained - one supported OS choice, poor expansion options unless you buy additonal I/O expansion modules, and only four internal disk slots.

    "....Here we see that a Linux shop measured Niagara CPUs...." Yes, against Opteron, not Power or Itanium. I'm guessing that's beacuse you couldn't find any real independent benchmarks where Niagara wasn't caned by either.

    "....It doesnt matter what you say Mattie Pattie boy, facit tells Power6 sucks badly as a server CPU...." Actually, it doesn't matter what you or Sun say about Power6 as the market has obviously made up its mind. Power and Itanium are both taking market share from Sun in the enterprise high-end, which is the most demanding arena and the one that offers the highest margins and pull-through in associated services. With Rock dead, Sun have virtually abandoned the enterprise sector. They are not stupid enough to put T2/T2+ up against Power or Itanium as they know it can't compete, which is why they are desperate to keep Fujitsu making SPARC64s. You whole argument is completely undermined by these simple, proven facts.

    /SP&L

  39. David Halko
    Happy

    Anonymous: Sockets, Cores, CPU's, Threads, MultiChip Modules, and Oracle Licensing

    Anonymous posts, "Kebabbert, since single process can utilize only single core, it's the core performance matters the most. Who cares how many cores are within CPU? Oracle Database Enterprise is licensed per core."

    This is wrong on so many levels.

    - a single process is often composed of many threads, so how does a single process use a single core?

    - a single core may be composed of many hardware threads, so how does a single core get utilized by a single process?

    - from an application perspective, the core performance does not matter the most, the application performance matters the most

    For most Oracle installations, the concern is quite different from what is suggested

    - Oracle Standard licensing is licensed per socket (up to 4 sockets - and those are not multi-chip module sockets!)

    http://netmgt.blogspot.com/2009/03/cost-control-oracle-database-licensing.html

    A lot more people license Oracle by Standard License than by Enterprise License. IBM POWER is a very expensive solution for most consumers, due to Oracle's multi-chip module licensing clause as well as IBM's higher hardware costs.

    Anonymous posts, "I wonder if Oracle is going to up the price per core factor to 1 instead of .75"

    Before Oracle engaged Sun, the per-core factor on Power6 was bumped up:

    http://netmgt.blogspot.com/2009/03/oracle-database-license-change-ibm.html

    The OpenSPARC 1.6GHz processors have been released, and the Oracle processor core factor is still at 0.75 as of today!

    http://www.oracle.com/corporate/contracts/library/processor-core-factor-table.pdf
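
    (As a rough illustration of why the core factor matters so much for a many-core chip, here is the general Enterprise Edition arithmetic: cores times core factor, rounded up to whole licenses. The 0.75 factor is the one quoted above; the per-license list price and the hypothetical 1.0-factor comparison chip are assumptions for the example only.)

        import math

        def ee_licenses(cores: int, core_factor: float) -> int:
            # Oracle EE counts licenses as cores x core factor, rounded up
            return math.ceil(cores * core_factor)

        price_per_license = 47_500   # assumed list price, illustration only

        for name, cores, factor in [("8-core chip at 0.75 factor", 8, 0.75),
                                    ("8-core chip at 1.0 factor (hypothetical)", 8, 1.0)]:
            n = ee_licenses(cores, factor)
            print(f"{name}: {n} licenses, ${n * price_per_license:,}")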

    Oracle has been unfairly punishing SPARC for years, giving IBM business and cutting its own throat.

    http://netmgt.blogspot.com/2009/03/oracle-database-licensing-cuts-own.html

    Can't wait to see the next SPARC64 and OpenSPARC processors released - I hope to see Oracle stop unduly punishing SPARC, now that they own a stake in it.

  40. Kebabbert

    Mattie Pattie, boy

    ""Answer me this. Do you really believe that a server CPU is capable of holding all thousands of different client's data in the CPU cache?...." Well, that depends on how big the dataset is for each customer and what else the cache is being used for. If for example you want to consider Power6, ignoring the L1 cache, there is the shared 4MB of L2 cache and you have 32MB of L3 cache shared between the two cores, so in theory you could actually have several thousand sets of customer data if that dataset was small, say a KB, even if the L3 cache was not all being used by one core. Which just goes to show you really didn't think before you typed."

    Holy cow. I just sat and stared at your answer. This is truly hilarious. Actually, I dont know what to say. Your answer is so... so... ignorant I have never seen anything like it. And the punchline "you really didnt think before you typed"! ROFL.

    Dear Mattie Pattie. A cache has no preference for particular data sets. It simply tries to hold all frequently accessed data: AIX data, GUI data, SAP data, Oracle data, and so on. It does not distinguish between different kinds of data - GUI code, SAP code, Oracle code, AIX code, kernel code - it holds whatever is accessed frequently. I bet the AIX kernel data is accessed more frequently than one particular user's Oracle data, meaning the user's data gets evicted rather than the AIX kernel data. The tiny cache has to try to hold lots of environmental stuff - AIX-related, Oracle-related, printer-queue-related, and what not - as well as the users' data sets.

    And the users don't all touch the same kind of data: one user may update a huge amount of data, which evicts other users' data, while another user updates only a little, and so on. You cannot just assume that every user touches 1KB. That is... weird thinking.

    In short, you are totally wrong on this one too, Mattie Pattie. Didn't it occur to you that the cache must also hold OS data? Didn't you think about that? And your punchline - I love it.

    Seriously, Mattie Pattie, why do you continue this one-sided "discussion"? You don't know anything - you have proved it numerous times - and when you claim things, they are wrong. You have lost. Don't you see? Don't try to discuss things you know nothing about. Do you want to discuss some intricate details of research mathematics, things you have no clue about? You wouldn't be that stupid, right? So why do you continue with this? You have no clue about anything.

    ------------------------------

    "What happens in reality is Niagara keeps choking because most enterprise apps have heavy single-threaded routines. And then, when Niagara is busy swapping between threads and waiting on RAM, what do you think happens in cache? Are you saying Niagara flushes the cache all the time to make room for all that data coming in for what could be 256 stalled threads? If it does, then it will slow down even more as it has to juggle cache about to meet demand. If it doesn't, if it retains cache between misses, then that means you never have enough cache to actually satisfy 256 threads with only 4MB of L2 cache as on T2. Sun knew this was a problem with T1 as they made the cache bigger for T2, if what you pretend is true then they wouldn't need to, would they? So, obviously, more cache and a better cache handling would benefit Niagara. I'm expecting T3, should it ever arrive, to have a similar growth in cache, maybe as much as 12MB."

    Holy cow. You still haven't understood Niagara. What I am trying to say is that Niagara doesn't rely on a cache nearly as much as a desktop CPU does. In fact, I suspect that Niagara would perform almost as well with NO cache at all. Look, Niagara has no need of a large cache. A large cache wouldn't be bad, but it wouldn't help much either, because a large cache can never fit all the thousands of data sets, Solaris kernel data, Oracle kernel data, and what not.

    Read this carefully: Niagara would do almost as well with NO cache. The performance would be almost the same without any cache at all. But that is only speculation, and I haven't seen data on it. (Notice that I say so when I am unsure of things - which you do not; you pretend to know things.)

    The reason is that Niagara doesn't choke. Remove the cache and Niagara would perform almost as well as a Niagara with cache. Why doesn't it choke? Because Niagara doesn't sit waiting for data from RAM the way desktop CPUs do - it always has plenty of data ready for processing in other threads. The other threads take the place of the cache.

    As soon as there is a cache miss, Niagara switches to another thread and continues working while the data for the stalled thread arrives from RAM. Once the data has arrived, Niagara can switch back to the stalled thread and resume execution. The toy simulation below shows how quickly the idle time melts away as hardware threads are added.
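
    (Here is a minimal simulation of that idea - fine-grained multithreading hiding memory latency. The work-burst length, miss latency and single-issue core model are all illustrative assumptions, not T2 parameters; the point is only the trend as the number of hardware threads per core grows.)

        WORK, STALL = 20, 200      # assumed cycles of work between misses / miss latency
        TOTAL = 100_000            # cycles to simulate

        def busy_cycles(n_threads: int) -> int:
            ready_at  = [0] * n_threads      # cycle when each thread's data arrives
            work_left = [WORK] * n_threads   # work remaining before its next miss
            busy = 0
            for cycle in range(TOTAL):
                for t in range(n_threads):   # issue from any thread that isn't stalled
                    if ready_at[t] <= cycle:
                        busy += 1
                        work_left[t] -= 1
                        if work_left[t] == 0:            # this thread misses: park it
                            ready_at[t]  = cycle + STALL
                            work_left[t] = WORK
                        break
            return busy

        for n in (1, 2, 4, 8):
            print(f"{n} hardware threads/core -> busy {busy_cycles(n) / TOTAL:.0%} of cycles")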

    Mattie Pattie, in how many ways do you want me to explain this to you, you bonehead? How many explanations do you need before you comprehend something - 10? 20? One is obviously not enough.

    ---------------------------

    "If any CPU is a desktop one then it's Niagara, which only scales to 4-way in T2+! Even Xeon and Opteron scale to twice as many sockets in standard servers, let alone more specilised servers such as from Unisys."

    Mattie, you are getting tiresome. It doesn't matter how many CPUs it scales to, as long as it can handle bigger workloads than anything else. Suppose you had a desktop CPU whose single thread performed as well as Power6 and which scaled to 64 CPUs, and compared it to one server CPU that beats 128 of them - what then? Would you still claim the desktop CPU is the better fit, just because you can use more of them?

    Look, the Sun T5440 has four of these T2+ chips, and that single machine is twice as fast in SIEBEL as three P570s with 12 Power6 CPUs at 4.7GHz. Obviously it takes several Power6s to match one Niagara - and that is a BAD thing, not a good thing. Each additional CPU draws power and is another source of failure. The fewer components, the better. If you can use one or two CPUs and they outmatch several desktop CPUs, how can that be a BAD thing, as you claim? Why is it better to use many slow desktop CPUs that each use 500W than one server CPU that uses 100W?

    Seriously, how do you reason, Mattie? I am starting to get a bit worried. Life cannot be easy for you.

    ---------------------------------------

    "SUN are not stupid enough to put T2/T2+ up against Power or Itanium as they know it can't compete... You whole argument is completely undermined by these simple, proven facts."

    Maybe you have missed the benchmarks where SUN pits Niagara machines against IBM Power servers? Do you deny those benchmarks exist? For instance, one T5440 being twice as fast as three P570s on SIEBEL? Or a 1.4GHz Niagara getting a better spec_int result than a 4.7GHz Power6? You know, there are lots of benchmarks.

    Earlier you said that the benchmarks Niagara wins are a tiny subset of real-life work and that they were carefully crafted by SUN - hence useless. Do you call spec_int a benchmark carefully crafted by SUN?

    Mattie, you have lost big time. Maybe we should start calling you a liar and a FUDer? Everything you have claimed has turned out to be false.

  41. Kebabbert

    Liar Matt Bryant, where art thou?

    I want to see how you are going to wriggle yourself out of this.

    Liar Mattie, what is the reason for using a cache? Do you know? Let me tell you: the purpose of a cache is to give quick access to data, which requires the data to already be in the cache. Either

    1) because it was accessed earlier, or

    2) because prefetch logic has fetched new data that it expects to be used soon.

    Now, the cache is small compared to all the data that gets accessed all the time (thousands of clients' user data, AIX kernel data, Oracle kernel data, and so on - a server is unlikely to run a GUI, so at least no GUI code will be cached). If the cache is too small, it gets emptied and refilled constantly. The CPU spends its time filling the cache with data that will be evicted again as soon as the next client has to be served. The data gets swapped out all the time; the data changes all the time. It is not a small, nearly static data set.

    How can a CPU make use of a cache under these circumstances, with ever-changing data? It cannot. Do you still not understand why this is the reason a desktop CPU performs so badly on server workloads? The crude capacity model below makes the same point with numbers.
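
    (A crude way to see it: if the hot data is accessed roughly uniformly and is much bigger than the cache, the hit rate falls roughly in proportion to the capacity ratio, whatever the replacement policy does. The cache and working-set sizes below are assumptions chosen only to show the trend.)

        cache_mb = 32                       # assumed large off-chip cache
        for working_set_mb in (16, 64, 256, 1024):
            # uniform-random accesses: hit rate is roughly capped by the capacity ratio
            hit_rate = min(1.0, cache_mb / working_set_mb)
            print(f"{working_set_mb:>5} MB of hot data -> ~{hit_rate:.0%} hit rate")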

    Liar Mattie Laddie, how about you take some basic computer architecture courses, instead of me spending my precious time lecturing you? I have no problem lecturing others, but when the student doesn't understand despite several explanations, you start to wonder. Don't you agree? You explain once, twice, thrice - and still he doesn't understand basic concepts. What would you think of such a student?

    No, I have a better suggestion: instead of you taking several basic courses, why not take the same course several times? I doubt a single explanation will be enough for you.

    Mattie Laddie, what do you say?

  42. Matt Bryant Silver badge
    Happy

    By the way....

    One of the guys here has just told us that a kebabbert is a slang term for a "typical Chihuahua owner" - that is, a rather less-than-manly fashion victim. Can't say I'm surprised!

  43. Kebabbert

    Mattie Pattie Laddie

    Well, that is not true. Kebabbert comes from Kebab + Bert. Bert is taken from Dilbert, as a homage. Kebab is the same thing as kabob, or however you spell it in the UK.

    Regarding my manliness: at least I don't go around spreading lies and FUD as you do. Therefore I consider myself more manly than you. A fair fight and no lies is manly, yes?

    Anyway, I think maybe you should stop claiming that Niagara is slow because it suffers from a small cache. That is simply not true, as benchmarks and many testimonies show (if you google a bit). I've tried to explain why Niagara doesn't need a large cache; it only took ten or so posts of me reiterating the same thing for you to understand. Can you see why people might find you annoying, especially when you claim things about which you have no clue?

    If you were right - if Niagara actually were slower than Power6 - then I would say nothing. But you claim that Niagara is slower despite all the benchmarks showing the opposite. That is just weird of you; I really don't understand how you reason. The proofs show one thing, and you claim the opposite. That is not sound logic. It is like it raining hard outside and you claiming "no, it isn't raining" - when you can look out and SEE the rain. Or a parrot being dead and you claiming "no, it isn't dead, the parrot is only sleeping". That is just strange reasoning.

  44. Matt Bryant Silver badge
    FAIL

    Dear Chihuahua abuser....

    Well, I did post a rebuttal to your previous piece of clueless waffle, but I fear Ms Bee decided it was simply too cruel to post. I'll try and remember the main points and put them down below, hopefully without upsetting Ms Bee!

    Firstly, on the cache point. What happens with Niagara is that it starts a new thread whenever the current thread is stalled waiting on data. When the second thread stalls, it kicks off a third. I'm sure even Kebabbert will agree with that part at least. In the Sunshine fantasyland, this allows the CPU cores to keep spinning and deliver high throughput. In reality, since not many applications fit this model, what happens is that the first thread stalls and the core can't kick off a second because there is no second thread to start, or if there is a second thread then there is no third. This is why Niagara sucks so badly with the current crop of enterprise applications. So the core is stuck waiting for that first thread to come back, which is when you are really hoping for a cache hit - the only problem being that a hit is unlikely, given both the small slice of cache available to each thread and the poor cache techniques used. Remember, even with T2+ we only have 4MB of L2 per socket shared between eight cores and up to 64 threads, and that's before we consider what else the cache has to hold. Odds on you will get a cache miss and have to go off to RAM or disk. Even in the Sunshine fantasyland scenario this is an issue, as the delay means your response time has just gone through the roof.
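
    (The per-thread share is easy to put a number on. The 4MB L2 figure is the one used in this thread; the even split per hardware thread is an assumption, since the L2 is actually banked and shared dynamically.)

        l2_bytes = 4 * 1024 * 1024   # shared L2 on a T2/T2+ socket, per the thread above
        threads  = 8 * 8             # 8 cores x 8 hardware threads per socket
        print(f"{l2_bytes // threads // 1024} KB of L2 per hardware thread")   # 64 KB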

    Now, Sun knew this when they designed T2+, yet they talk a load of hooey about how they can keep all the cores spinning, as if this were somehow what the customer needs. As Kebabbert recommends, they will try to have you restructure your whole testing to keep those wiener cores spinning, even if that in no way reflects what actually happens in your environment. What the customer actually wants is for his usually-single-threaded application to respond as fast as possible. Sun did try to up the amount of cache on T2+ compared to the earlier designs, but in order to keep it anywhere near price-competitive and to avoid pushing the power requirements up, T2+ is capped at 4MB of L2 cache, with a poor means of using even that amount. In the webserving niche Niagara is not too badly handicapped by this, as most people will expect a delay and attribute it to the Internet or to the page's graphics loading, but in a business scenario where one system is talking to another, the delay in handling single-threaded or heavyweight-threaded apps is just not acceptable when the competition will smoke through the task a lot quicker.

    Adding more cache would help Niagara, but a proper cache design plus a larger cache would help it a lot more. Stating that Niagara would perform just as well with no cache is frankly the type of statement that could only be uttered by someone with their head firmly in the sand - the same type of person that just cannot see that a vendor's benchmark is liable to have little bearing on how a server will perform in the real World. Try again, newbie!

    /SP&L

  45. Kebabbert

    Dear Mattie Pattie Laddie

    "I'm scanning through the rest of your diatribe looking for something relevant, but it's just more insults and repeated waffling."

    "Well, I did post a rebuttal to your previous piece of clueless waffle, but I fear Ms Bee decided it was simply too cruel to post. I'll try and remember the main points and put them down below, hopefully without upsetting Ms Bee!"

    I wonder who is insulting whom here? I am not the one who gets his posts blocked because of foul language.

    ----------------------------------

    "In reality, since not many applications fit this model, what happens is the first thread stalls and the core can't kick off a second as there is no second thread to start, or if there is a second thread then there is no third. This is why Niagara sucks so badly when it comes to the current crop of enterprise applications."

    Could you please explain this again? What do you mean by "not many applications fit this model"? Maybe you haven't heard of client-server, but one server serves many clients. It is not as if your application has to be hand-parallelised: each client is served by its own thread, so client-server software is naturally parallel. You don't have to rearchitect your server software - see the sketch after this paragraph.
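
    (A minimal thread-per-connection sketch of that point, in Python: the per-client logic stays completely serial, yet every accepted client adds another runnable software thread for the hardware to schedule. The port number and the trivial echo "work" are arbitrary choices for the illustration.)

        import socket
        import threading

        def handle(conn: socket.socket) -> None:
            with conn:
                while data := conn.recv(4096):   # serve this one client, serially
                    conn.sendall(data)           # trivial echo stands in for real work

        def serve(port: int = 7070) -> None:
            with socket.socket() as srv:
                srv.bind(("", port))
                srv.listen()
                while True:
                    conn, _addr = srv.accept()
                    # one thread per client: hundreds of clients means hundreds of
                    # runnable threads, with no change to the per-client code
                    threading.Thread(target=handle, args=(conn,), daemon=True).start()

        if __name__ == "__main__":
            serve()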

    If Niagara code really had to be rearchitected, then Niagara would suck big time - both in theory and in practice, once you ran some benchmarks - because the cache misses would then stall all the threads and the cache. But what do the results say? Who wins the benchmarks - Niagara, or the slooooow Power6?

    As David Halko writes:

    "The concept that environments with common applications would not benefit from highly threaded hardware is really a myth propagated by DoS trained folks.

    With DNS cache lookups, async I/O, file system syncs, multi-threaded NIC cards, VPN encryption, HTTPS encryption, compression file systems, web browsers, background processes, software update downloads, virus checking, signature checking downloads, de-duplication, backups, RSS feeds, internet radio, MP3's playing, etc. - the common user benefits tremendously as hardware becomes more highly threaded, with a generally more responsive platform.

    Even my Windows XP desktop has 74 processes running, never mind the thousands of threads!"

    Maybe I misunderstood you. Maybe you didn't mean that applications must be rearchitected. In that case, could you explain again why Niagara is supposedly slow, yet wins all these posted benchmarks?

    --------------------------------

    "Adding more cache would help Niagara, but a proper cache design and a larger cache would help it a lot more."

    Maybe you don't know that a server CPU cannot keep all the different data in its cache? So what makes you believe a large cache would help a server CPU? Could you explain this point?

    ------------------------------

    "Stating that Niagara would perfrom just as well with no cache is frankly the type of statement that could only be uttered by someone with their head firmly in the sand. "

    I didn't say that. Why are you lying again? Read my post again. What I tried to say was that a server CPU such as Niagara doesn't need a large cache, because no cache can ever hold a whole server workload - and that therefore Niagara might perform almost as well without a cache. But I pointed out that this was only far-fetched speculation, and that I had not seen data on it.

    -------------------------------------

    "The same type of person that just cannot see that a vendor's benchmark is liable to have little bearing on how a server will perform in the real World. "

    I've posted information about a company that got 50 times more throughput with a Niagara T1 than with an AMD box. Didn't you read my post, or are you deliberately lying?

    -------------------------------------

    "Try again, newbie!"

    Who is the newbie? Someone with a Master's in Comp Sci and a Master's in Maths, or some business person who knows nothing about CPUs - someone who believes a server CPU can hold all its different data in a cache? And these insults all the time - why? Can't we talk like educated, grown-up people without insulting each other? Is it so difficult to stop?

This topic is closed for new posts.
