@A.C
Yes, that is correct. With many threads AND the ability to switch threads in one clock cycle, you can efficiently hide latencies. Normal cpus need 100s of clock cycles to switch threads, which means they cannot mask latencies that way.
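To make the switch-cost argument concrete, here is a rough sketch of a textbook-style multithreading model. It is not a measurement of any real cpu. The function name and all the cycle counts (20 cycles of work between memory accesses, 200 cycles of memory latency) are assumptions picked only for illustration.

```python
def core_utilization(threads, run_cycles, stall_cycles, switch_cycles):
    """Fraction of cycles doing useful work in a simplified model.

    Each thread runs for run_cycles, then stalls on memory for
    stall_cycles; the core pays switch_cycles to move to another thread.
    """
    # With enough threads the core never idles, but every burst of
    # useful work still pays for one thread switch.
    saturated = run_cycles / (run_cycles + switch_cycles)
    # With too few threads the core sits idle until some thread's data arrives.
    limited = threads * run_cycles / (run_cycles + switch_cycles + stall_cycles)
    return min(saturated, limited)


# Hypothetical numbers, for illustration only.
RUN, STALL = 20, 200
print("1 thread, no switching:     ", core_utilization(1, RUN, STALL, 0))
print("8 threads, 1-cycle switch:  ", core_utilization(8, RUN, STALL, 1))
print("16 threads, 1-cycle switch: ", core_utilization(16, RUN, STALL, 1))
print("8 threads, 200-cycle switch:", core_utilization(8, RUN, STALL, 200))
```

In this model, a switch that costs as much as the memory stall itself buys nothing over a single thread that simply waits, while a one-cycle switch lets extra threads fill almost all of the stall time.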
For instance, studies by Intel show that a typical server x86 cpu waits for data 50-60% of the time under full load. This means an x86 cpu running at 3GHz is actually doing work corresponding to roughly a 1.2-1.5GHz cpu.
That is the reason normal cpus have big caches, complex prefetch logic, etc: to try to minimize latencies. CPUs have reached high GHz, but RAM is still slow. Thus, if you have a 5GHz cpu and the RAM is 1GHz, the cpu needs to wait for RAM all the time. But if both the cpu and the RAM run at 1GHz, then the cpu need not wait. Thus, high clocked cpus are not really meaningful. A 5GHz POWER6 cpu using 1GHz RAM is really pointless. Even IBM seems to understand this now, as IBM has decreased clock speed and increased the number of cores.
So, how successful is the Niagara approach? Well, the Niagara idles only 5-10% of the time under full load, waiting for data. That is much better than 50-60%. Thus, the Niagara at 1.6GHz competes with, and in some cases outperforms, much higher clocked cpus. In fact, Niagara holds several world records today, beating much higher clocked x86 and POWER7 cpus.
http://blogs.oracle.com/BestPerf/entry/20110812_x6270m2_specjenterprise2010
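As a back-of-the-envelope check of the numbers above, here is a small sketch that turns a clock speed and a stall fraction into an "effective" clock. The helper name is made up for illustration, and the model is a deliberate simplification: it ignores instructions per cycle, core counts and everything else that differs between the chips.

```python
def effective_ghz(clock_ghz, stall_fraction):
    """Clock rate left over after subtracting the time spent waiting for data."""
    return clock_ghz * (1.0 - stall_fraction)


# Stall fractions are the ones quoted in this comment: 50-60% and 5-10%.
x86 = [effective_ghz(3.0, stall) for stall in (0.50, 0.60)]
niagara = [effective_ghz(1.6, stall) for stall in (0.05, 0.10)]

print("3GHz x86, effective GHz:      ", [round(v, 2) for v in x86])
print("1.6GHz Niagara, effective GHz:", [round(v, 2) for v in niagara])
```

Under these assumptions, the 1.6GHz chip ends up in about the same effective range as the 3GHz one, which is the point: the clock speed on the box says little once stalls are counted.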
The funny thing is that Niagara has a tiny cache, because it hides latencies very well. Thus, Niagara is the fastest in the world in some cases, without big caches. What does that prove? It proves that Niagara is not cache starved. If it were cache starved, it would never beat 5GHz cpus. You need 14 (fourteen) POWER6 cpus at 5GHz to match four (4) Niagara T2+ cpus at 1.6GHz in official SIEBEL v8 benchmarks. How is that possible if the Niagara is cache starved?
Conclusion: the ability to hide latencies can be very valuable.