They are taking competition quite seriously
> Itanium chips were originally at a 0.75 scaling factor, by the way,
> but were reduced at some point,
Well, they were, and they were reduced, too, because everybody in the business believed that Itanium was going to be the next big thing rather than the Itanic.
Unfortunately, while Itanium is a nice all-round CPU, it isn't really good for database work, unless the database is a rather small, rarely accessed dataset (in which case it simply sucks as much as any other CPU).
> and despite the large number of cores in modern x64 chips from
> Intel and AMD (four or six), Oracle has not been tempted to raise
> the scaling factor here. It will be interesting to see what Oracle does
> when AMD crams 12 cores in a socket and Intel starts cramming in
> eight cores.
Nothing will happen. I know AMD is going to make the Magny-Cours a multi-chip module (MCM); is that also true of the 8-core Nehalem? I have read many conflicting reports on that.
Note that Oracle still has an MCM clause regarding IBM Power CPUs, where they are licensed at 2x the cost per socket (they are treated as the two CPUs that they actually are rather than as one package). I would expect Oracle to use that clause against AMD and Intel with their upcoming chips.
This might make T2+-based machines really nice Oracle boxes, given that they are already well suited to that kind of workload.
There were some interesting comments in the last round of SPARC-bashing in the linked article. I would just like to correct some statements by Matt in that discussion:
1. Memory bandwidth does not make up for memory latency -- idle cycles are lost regardless of whether memory serves gigabytes or terabytes per second. Database queries are rarely larger than a few kilobytes, but the latency prevents that data from reaching the CPU quickly. If you have a few cores and all have to wait on a random query, they will stall. A Niagara will stall too, but instead of 8 or 16 threads stalling, you get 64 threads. A small cache has nothing to do with it, because at the speed of a single thread (assuming all threads stall and are switched), the memory latency can be treated as one cycle.
Oh, and the cache of the T2 was enlarged compared to the T1 only because you need to retain more data for more threads. That's quite elementary. If Matt's argument for more cache held any water, Sun's microarchitects would have had to more than double the cache, in keeping with the 2x increase in the number of handled threads; instead they increased it by a measly 33%, from 3 to 4 MB.
By the way, as for the bandwidth, a T2/T2+ chip has four DDR2 controllers on-die. That gives more bandwidth than two or three DDR2 controllers and only 33% less than three DDR3 controllers on-die, so the Niagara chips are definitely not starved for memory bandwidth.
2. DDR3 memory might not be faster than DDR2 memory in some workloads. DDR3 memory might have a CAS latency (CL) of 7 or 9 cycles, whereas typical DDR2 memory has a CL of 4 or 5. Measured in nanoseconds, DDR2-800-CL4 has a lower CAS latency (10 ns) than DDR3-1600-CL9 (11.25 ns), so it can be faster for small random queries even though it has far less bandwidth.
3. If your thread stalls, it doesn't matter if you have a 64 MB cache or a 64 KB cache. The CPU does not work on large sets anyway -- 2 or 3 64-bit data items at the most per cycle, which with a 64-bit instruction adds up to 256 bits or 32 bytes. Some SIMD instructions will take more data, and some data may be larger and working on it may be spread across multiple cycles, but a small cache is never a hindrance if the CPU waits on memory. If a random database access comes, a CPU will not have the data cached (by definition of random data). If the CPU waits, say, 50 nanoseconds for the data, it can either idle (as most CPUs do) or switch to a different thread (as Niagara, Nehalem and some NetBurst chips do). Nehalem and NetBurst cannot switch more than once, but Niagara can then switch another 14 times, and when the data arrives, it can switch to the requesting thread in an instant or cache it and wait for the thread. After that random data is processed, it doesn't need to be kept in cache, anyway.
4. As for the Rock: while it's sad that Sun will not be releasing that CPU, they did not revise their roadmap as much as has been suggested. The Rock was to stay on the market for only two or three years (which is ludicrously short for an enterprise CPU), and all improvements introduced by Rock were to be incorporated in the new VT core of all future Sun CPUs rather than keeping Rock as a separate family.
To the best of my understanding, Sun has agreed with Fujitsu not to duplicate effort, leaving the general-purpose SPARCs to Fujitsu as their SPARC64 line.
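
P.S. For anyone who wants to check the arithmetic behind points 1 and 2, here is a rough sketch. It assumes things the points above do not spell out: DDR2-800 on the four-controller side, DDR3-1600 on the three-controller side, one 64-bit (8-byte) channel per controller, and theoretical peak rates only. The function names are made up for illustration. CAS latency is also only one part of a real random access, so treat the nanosecond figures as a lower bound rather than a measurement.

```python
def cas_latency_ns(data_rate_mt_s: float, cl_cycles: int) -> float:
    """CAS latency in nanoseconds.

    DDR memory transfers twice per I/O clock, so the I/O clock in MHz is
    half the data rate in MT/s. One clock period is 1000 / MHz nanoseconds.
    """
    io_clock_mhz = data_rate_mt_s / 2
    return cl_cycles * 1000.0 / io_clock_mhz


def peak_bandwidth_mb_s(data_rate_mt_s: float, controllers: int,
                        bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in MB/s.

    Assumption: each controller drives one channel that is bus_bytes wide.
    """
    return data_rate_mt_s * bus_bytes * controllers


if __name__ == "__main__":
    # Point 2: latency in wall-clock time, not in cycles.
    for name, rate, cl in [("DDR2-800-CL4", 800, 4),
                           ("DDR2-800-CL5", 800, 5),
                           ("DDR3-1600-CL9", 1600, 9)]:
        print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")

    # Point 1: four DDR2 controllers against three DDR3 controllers.
    four_ddr2 = peak_bandwidth_mb_s(800, controllers=4)
    three_ddr3 = peak_bandwidth_mb_s(1600, controllers=3)
    shortfall = 1 - four_ddr2 / three_ddr3
    print(f"4x DDR2-800: {four_ddr2:.0f} MB/s")
    print(f"3x DDR3-1600: {three_ddr3:.0f} MB/s")
    print(f"shortfall: {shortfall:.0%}")
```

Under those assumptions the DDR2-800-CL4 module answers in 10 ns against 11.25 ns for DDR3-1600-CL9, and four DDR2 controllers land about a third below three DDR3 controllers in peak bandwidth. Different speed grades would shift both numbers.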