805 posts • joined Wednesday 22nd July 2009 09:09 GMT
Re: And this is news? @Kebabbert
"...As for the conspirationist theories..."
It is a fact that these large companies all, simultaneously, bet on immature Linux with a bad license instead of mature FreeBSD with a more suitable license. It is a fact that large companies have together bet on one company or technology instead of something superior. There are no theories involved here.
It is also a fact that only a few companies control the global economy - there is much research, including PhD dissertations, on this. These companies are typically Wall Street investment banks, such as Goldman Sachs, JP Morgan, etc, and they all cooperate tightly - no theories involved here, there is a lot of credible research on this, just read the research papers in my link. If these companies decide to bet on something, for instance shorting more silver than is annually produced, then silver prices will plummet (this has happened, and if you work in finance you know this). Or if they decide to go long on a company, its stock price will rise. These are facts.
Here comes the only theory in my post, again: "It is my theory that if these companies bet heavily on Linux, then Linux will take off. So it might happen that Oracle and IBM and MS and HP and everyone else will bet heavily on Linux and start to buy and sell Linux". This is the only theory in my post, and I agree that it might be considered a "conspiracy theory". The rest of the contents of my post are facts. Just read the research papers if you think it is a bit far-fetched. I will recap here what conclusion one PhD dissertation in my link arrived at:
-The researcher used mathematical models to analyze large financial databases containing a lot of information. In particular, he analyzed which company owned a stake in another company, and which company owned another company which owned another company, etc. And he kept track of all this across millions of companies. It turned out that, consistently, only a few companies owned a company, which owned another company, etc. Roughly 50 companies were the spider in the net; they controlled every other company. And these companies were typically Wall Street investment banks: Goldman Sachs, Barclays, JP Morgan, etc. Just read the new and groundbreaking research in my link. It is very interesting when researchers use mathematical models in other areas, such as in economics. No human can do this, but math and computers can. And we can observe things that were not possible to observe earlier.
Re: And this is news?
"...Why did IBM, Oracle, Google, Facebook and others went with Linux..."
Yeah, why? Back then, they all suddenly jumped ship and bet heavily on Linux, which was very immature at the time. Linux supported 2-4 cores and not much more. FreeBSD was already mature and stable back then. And still every large company jumped ship and went with Linux, which has a much more constrained license than FreeBSD. The FreeBSD license allows anyone to use all the code and even close the source, while GPL Linux forces you to open the code; you cannot close it. Obviously, a competitive company (and all of them are) that prefers proprietary stuff would prefer the FreeBSD license - and still everyone suddenly chose Linux, which was immature and had a bad license for monopolistic, greedy companies. They all chose Linux at the same time. Why?
I worked at a big Fortune 500 company recently and they suddenly said "orders from management: we don't buy HP anymore", and several other companies reported the same thing: they stopped doing business with HP. At the same time.
And I also saw that several big companies recently chose to bet heavily on ARM. Microsoft released Windows for ARM CPUs - that was unheard of! MS had support for non-Intel way back, but today? Why would MS suddenly bet on ARM and release Windows for ARM? And AMD did the same: AMD is now designing ARM CPUs. And Nvidia too. And HP will sell ARM servers. And several other companies. Suddenly they all simultaneously chose to bet heavily on ARM. Or Linux. Or Google. Or Microsoft Windows instead of OS/2. Etc. Why do they act as if one single will is governing them?
You want to know the answer? Here it is: read the hot new research, a PhD thesis and other senior researchers on this subject:
"....> Open VMS lives on in the architecture of Windows NT
Except for a whole bunch of differences that mattered like putting graphics drivers in kernel space which has always been one of NT's Achilles heel (but also necessary for the gamers and partially alleviated with modern WDDM). I am not sure of the numbers but I think people would be amazed how many blue screens are due to poorly written 3rd party drivers (and hardware failures themselves of course) as opposed to poorly written Microsoft code (some there too though especially in past)...."
You are not really up to date. Windows has moved graphics out of the kernel, so the latest incarnations of Windows are a lot more stable than when Windows had graphics in the kernel. For instance, Windows 7 can nowadays update its graphics driver without rebooting.
Funnily enough, Linux has moved its graphics into the kernel. This has made Linux even more unstable; Linux has increased its performance at the cost of stability. How good is an OS if it is fast but unstable?
This technology is far older than 2005. It has been reworked several times, and the last iteration was released in 2005; it stems back to 1999.
Anyway, this Linux tech seems heavily inspired by Solaris Containers. The cool thing about Solaris Containers back then was that they remapped all kernel calls to the Solaris kernel - this was new. So when you installed Red Hat Linux, it would remap all Linux kernel calls to the Solaris kernel, so only one Solaris kernel was active no matter how many Containers you booted. Each container only cloned a few kernel data structures in RAM (40MB RAM) and also cloned the filesystem (100MB) via ZFS. So Containers are really resource-saving; that is the point of Solaris Containers. One guy booted 1,000 Containers on a 1GB PC back then; it worked, but it was really slow. Now, this Linux tech seems awfully similar, just like systemd is a clone of Solaris SMF, Linux Btrfs is a clone of Solaris ZFS, Linux SystemTap is a clone of Solaris DTrace, Linux Open vSwitch is a clone of Solaris Crossbow, etc etc etc.
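To put numbers on that 1,000-containers-on-a-1GB-PC anecdote, here is a quick back-of-the-envelope sketch in Python. The ~40MB RAM and ~100MB disk per container are the figures quoted above, not measurements of mine; the rest is just arithmetic.

```python
# Back-of-the-envelope: footprint of 1,000 Solaris-style containers,
# using the ~40MB RAM / ~100MB disk per container figures from the post.

PER_CONTAINER_RAM_MB = 40    # cloned kernel data structures
PER_CONTAINER_DISK_MB = 100  # ZFS-cloned filesystem
N = 1_000

ram_needed_gb = N * PER_CONTAINER_RAM_MB / 1024
disk_needed_gb = N * PER_CONTAINER_DISK_MB / 1024

print(f"RAM needed:  ~{ram_needed_gb:.0f} GB")   # ~39 GB
print(f"Disk needed: ~{disk_needed_gb:.0f} GB")  # ~98 GB

# On a 1GB PC the working set vastly exceeds physical RAM, so the box
# survives only by paging - which is why it "worked but was really slow".
```

So the experiment needed roughly 40x more RAM than the machine had, and only sharing plus heavy paging made it possible at all.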
It would be really cool if Linux developers did something new themselves, instead of cloning what others have done. For instance, the Linux "RCU", which is presumably cool according to Linux developers, turns out to be patented tech from IBM. So RCU is not an invention by Linux devs either. BTW, did you know that the "Linux" kernel itself is a clone of Unix? Everything in Linux is a clone; nothing is new.
Re: Big memory
Ok, you mean that if Bixby supports 96 sockets, then you cannot use ordinary M6-32 servers; you must use modified M6-32 servers? Maybe you need to insert another card into each M6-32 server? Is that it?
Are you implying that for Bixby to connect 96 sockets, you cannot use three of the normal M6-32 servers, but need another type of server that has not been announced yet? I doubt that, because these large servers are expensive to make. It would be more economical to allow three of the M6-32 servers to connect via some extra hardware, using Bixby. But you don't agree with my guess? You mean there is another type of server coming? Do you have information on this, or is it your guess?
Re: Big memory
Ok, I did not know that. How do you know that? Do you have more information? I mean, Bixby builds a huge 96-socket server from building blocks, and the building blocks are M6-32 servers. So I thought you could use several M6-32 servers to build an M6-96? But this is wrong? What link have you read to learn more?
Three of the new 32-socket Oracle M6 servers will be able to connect via the Bixby interconnect into a huge 96-socket M6 server with 96TB of RAM and 9,216 threads. If you run your database from 96TB of RAM and also compress the data, it will be very fast. I doubt SAP HANA can compete with such a huge server. How much RAM can HANA utilize? Can HANA go higher than 96TB of RAM? I doubt that. Anyone know?
Clone of Solaris Containers
"....This approach beats VMs in terms of resource utilization, as the OS copy is shared across all apps running on it, whereas virtual machines come with the abstraction of separating each OS onto each VM, which adds baggage...."
Solaris Containers have done this for ages. Linux is cloning Solaris tech again, just as it cloned ZFS, DTrace, SMF, Crossbow, etc.
One difference from Solaris Containers is that Solaris allows virtualization of different kernel versions. You can even install Linux in one Solaris Container. So there is only one Solaris kernel running, and all other virtualized kernels just remap API calls to the single Solaris kernel. One Sun guy started 1,000 Containers on a 1GB Solaris PC; it was very slow, but it worked. Each Solaris Container uses something like 40MB RAM and 100MB disk space, by cloning some kernel data structures. They are very efficient.
And now Linux is getting them too.
Re: Other ways to get a back door
"...Strange then that [all code] does get checked..."
Sure it gets checked. But the point is that it is not checked thoroughly. It is only skimmed, and many subtleties are not caught. There are question marks in the code that gets accepted, because the code turnover is so high that no one can thoroughly check all of it. A lot of code that no one really understands gets accepted. Maybe it contains subtle back doors?
"....Lok Technologies , a San Jose, Calif.-based maker of networking gear, started out using Linux in its equipment but switched to OpenBSD four years ago after company founder Simon Lok, who holds a doctorate in computer science, took a close look at the Linux source code.
“You know what I found? Right in the kernel, in the heart of the operating system, I found a developer’s comment that said, ‘Does this belong here?’ “Lok says. “What kind of confidence does that inspire? Right then I knew it was time to switch....”
Other ways to get a back door
The NSA doesn't have to ask Linus Torvalds himself anymore. They can just submit a patch, because there are so many patches going into Linux all the time that it is hard to check all the new code. Apparently, this attempt was blocked. But how many more are not blocked? In Windows, the NSA cannot submit a patch, so the NSA must ask Microsoft to deliberately insert one. But for Linux, there is a very high code turnover, so it is not hard to submit some new code:
"If you were the NSA, how would you backdoor someone's software? You'd put in the changes subtly. Very subtly."
"Whoever did this knew what they were doing," says Larry McVoy, founder of San Francisco-based BitMover, which hosts the Linux kernel development site that was compromised. "They had to find some flags that could be passed to the system without causing an error, and yet are not normally passed together... There isn't any way that somebody could casually come in, not know about Unix, not know the Linux kernel code, and make this change. Not a chance."
Re: Switching from big iron to x86 virtualisation
"...Another point to consider with respect to your assertion that you will get good scaling from unmodified binaries on an M9000 is that it relies on running a huge number of threads to achieve high system throughput at the expense of a big hit in single-thread throughput. Legacy binaries are usually tuned to run on high-single-thread throughput systems, so I would expect to see better scaling and throughput from a system with a high-throughput cores (eg: Xeon) vs low-throughput cores (SPARC Tx)...."
Jesus. You are just totally off. Off. You haven't understood much. It is the opposite: the M9000 is old and uses the old SPARC64 CPU, which has 2 strong threads per core and low throughput. The new "SPARC Tx" has very high-throughput cores, with many threads. Xeon has low throughput and SPARC Tx (T5, etc) has high throughput. Xeon has, in comparison, few strong cores and few threads, while SPARC T5 has many threads to boost throughput. This is just the opposite of what you believe.
Relatively speaking, this holds if you compare them:
Xeon: low throughput, because it has few strong threads
SPARC64: low throughput, because it has few strong threads
SPARC Tx (T5): high throughput, because it has many weaker threads
You are stating the opposite. Just check some of the world-record benchmarks here, and see that SPARC T5 is SEVERAL times faster at high-throughput server workloads (including SPECint2006):
"....Probably because people like Oracle have their customers by the short and curlies. $legacy_vendor gets more margin if they can convince you that buying one of their new boxes will cost less than a new $legacy_vendor license + $competitor's box. The $legacy_vendor can set the license for the competition's box to cancel out the price difference AND they can maintain a nice fat margin because there is no competition (that's the whole point of legacy lock-in)...."
Wrong again. The Oracle database runs on Linux too. In fact, it is mainly developed on Linux lately, I have heard. So it would be very easy to migrate from an Oracle database running on a very, very expensive 32-socket server to a cheap 32/64/128-socket Linux SGI cluster running the Oracle database. But no one is doing that - why? And Oracle is not famous for cutting prices when they have you locked in; they are expensive. Many companies want to migrate to other databases because of the high license costs. If you did not know this, I doubt you have worked at Wall Street as you claim.
Why does no one migrate from a very expensive 32-socket Unix server to a cheap SGI cluster - running the same software? No vendor lock-in exists, because you migrate from Oracle to Oracle. Your explanations for why no one does this are logically unsound. I'll tell you the answer: as the kernel developer explained, these cheap Linux clusters cannot run large Oracle database configurations, which require SMP servers, and that is why no one migrates from expensive Unix SMP servers to cheap Linux clusters. Clusters cannot handle huge database configurations. The worst-case RAM latency is more than 10,000ns in clusters, and the performance of a database would grind to a halt.
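To see why the worst-case latency matters so much for a database, here is a hypothetical Python sketch. The 500ns and 10,000ns figures are the ones discussed in this thread; the transaction shape (a chain of dependent memory accesses, like index lookups and lock words) is invented purely for illustration.

```python
# Hypothetical sketch: a database transaction that chases a chain of
# dependent pointers into remote memory, so each access must wait for
# the previous one. Latencies: ~500ns worst case on the SMP-like box,
# ~10,000ns on the cluster interconnect (figures from the thread).

ACCESSES_PER_TXN = 2_000       # assumed dependent accesses per transaction
SMP_LATENCY_NS = 500
CLUSTER_LATENCY_NS = 10_000

smp_txn_ms = ACCESSES_PER_TXN * SMP_LATENCY_NS / 1e6
cluster_txn_ms = ACCESSES_PER_TXN * CLUSTER_LATENCY_NS / 1e6

print(f"SMP-like server: {smp_txn_ms:.0f} ms/txn")      # 1 ms
print(f"Cluster:         {cluster_txn_ms:.0f} ms/txn")  # 20 ms
print(f"Slowdown:        {CLUSTER_LATENCY_NS // SMP_LATENCY_NS}x")  # 20x
```

Because the accesses are dependent, extra cores cannot hide the latency; the whole transaction simply runs 20x slower on the cluster under these assumed numbers.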
"...I doubt [SGI cluster] is cheap...."
Wrong. Yes, the SGI cluster is cheap. Check the prices. It can in no way compare to the 32-socket IBM P595 server used for the TPC-C record, which cost $35 million list price. I am convinced the SGI cluster costs about as much as a few x86 CPUs and a fast switch and not much more, or maybe twice that. You can buy several SGI clusters for the price of one 32-socket IBM server.
"...[SGI UV1000] it's not a cluster, it runs a *single* instance of the OS against shared memory. A single process can use every single byte of memory in that system. The same is not true of a cluster...."
You are wrong again. For instance, the ScaleMP Linux server with thousands of cores shows the same characteristics: it runs 8192 cores and loads of RAM, and it runs a single Linux kernel image. But it is actually a cluster. Just because it runs a single image does not mean it is not a cluster. It can only run HPC workloads, just like the SGI cluster. Both clusters consist of several smaller nodes, connected to look like one giant server running a single kernel image:
"... vSMP takes multiple physical servers and – using InfiniBand as a backplane interconnect – makes them look like a giant virtual SMP server with a shared memory space....The vSMP hypervisor that glues systems together is not for every workload, but on workloads where there is a lot of message passing between server nodes – financial modeling, supercomputing, data analytics, and similar parallel workloads. Shai Fultheim, the company's founder and chief executive officer, says ScaleMP has over 300 customers now. "We focused on HPC as the low-hanging fruit".... "
Check the SGI and ScaleMP workloads; they are all HPC workloads. For a reason. No customer runs a large database configuration such as Oracle.
"...With SMP all memory is remote, so you are operating at worst case (but uniform) latency all the time, by contrast with NUMA you get the best possible latency for local accesses and the same worst possible latency for remote accesses as you would have with an SMP box...."
True. The Oracle servers are SMP-like and have a very low worst-case latency, 500ns or so. In effect you treat them like true SMP servers. The SGI and ScaleMP clusters have latencies of 10,000ns or much higher - these must be treated as clusters and can only run cluster software. That is why all SGI and ScaleMP customers are running HPC workloads.
"...All the M9000 does is hide the latency from your code by putting your code to sleep and running another thread whenever it has to hit main memory. Xeons achieve a similar trick with HyperThreading..."
Wrong again. You are mixing up the M9000 CPUs with the SPARC Niagara CPUs. The Niagara CPUs hide latency by switching to another thread when the pipeline stalls on a cache miss, and they have many cores and many threads to achieve very high throughput. The M9000 has the old SPARC64 CPU, with only 1-2 threads. The old SPARC64 is very similar to older x86 or older IBM POWER CPUs: few cores, 1-2 strong threads. All CPUs were constructed like this long ago. Then came the SPARC Niagara and changed everything with many cores and many threads, and now every CPU is similar to Niagara, with many cores and many threads: POWER7, Xeon. The SPARC64 is not good at high throughput; it has few strong threads, not many threads.
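Here is a small sketch of the Niagara-style latency-hiding idea, with made-up cycle counts just to show the mechanism: if a thread computes for C cycles and then stalls S cycles on a memory miss, a core that round-robins between hardware threads stays busy as long as it has roughly 1 + S/C threads to switch between.

```python
# Illustrative numbers only - not real SPARC or Xeon figures.
# A core switches to another hardware thread whenever the current
# thread stalls on a memory access.

COMPUTE_CYCLES = 20   # assumed useful work between misses
STALL_CYCLES = 100    # assumed memory-miss penalty

threads_to_hide_stall = 1 + STALL_CYCLES // COMPUTE_CYCLES
print(f"Threads per core needed to hide the stall: {threads_to_hide_stall}")  # 6

# With only 1-2 threads per core (old SPARC64, old x86), the core sits
# idle during the stall. With many threads per core (Niagara, and later
# Xeon HyperThreading and POWER7 SMT), another thread runs instead -
# higher total throughput, at the cost of single-thread speed.
```

The same arithmetic shows why the old 1-2 thread designs get strong single-thread speed but poor throughput: there is simply no other thread to run during the stall.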
So you don't know too much about the M9000 or SPARC64 CPUs. You are mixing them up. That might explain why you are so off with your knowledge.
"....I'm done. I can't see much point in writing stuff for someone who shows no evidence of being able to read and learn...."
I have proved that you are wrong in many (every?) bit of your reasoning. It is you who needs to read and catch up.
Oracle new ZFS servers
The new Oracle ZS3 servers set new world records again, beating NetApp, EMC, IBM, etc.
Re: Switching from big iron to x86 virtualisation
Jesus. Again, that SGI UV1000 is a cluster.
"One can view NUMA as a tightly coupled form of cluster computing."
And, again, there are fundamental differences between a cluster and an SMP server. As a kernel developer explains:
"[HPC clusters] spend a huge majority of their time in "user" and only a minuscule tiny amount of time in "sys". I'd expect to find very, very few calls to inter-thread synchronization (like mutex locking) in such applications....Consider a massive non-clustered database. (Note that these days many databases are designed for clustered operation.) In this situation, there will be some kind of central coordinator for locking and table access, and such, plus a vast number of I/O operations to storage, and a vast number of hits against common memory. These kinds of systems spend a lot more time doing work in the operating system kernel. This situation is going to exercise the kernel a lot more fully, and give a much truer picture of "kernel scalability"
If you really believe such an SGI UV1000 256-socket cluster is going to run an Oracle database for a tenth of the price of a 32-socket SMP server at higher performance, why don't all the large investment banks on Wall Street do that? What are you thinking with? Are you thinking at all? Why is there a market for very, very expensive 32-socket servers if a cheap 256-socket SGI cluster can replace them? Seriously? Don't you think at all?
A question: did you never attend any logic courses or such? Never went to university?
Re: Switching from big iron to x86 virtualisation @ Kebebbert
"...Given the >3x performance advantage of the Xeon cores over the M9000 cores in SPEC rate figures I think that very few people would choose the M9000 for the kind of compute intensive workloads because it would need 3x as many sockets to achieve the same result with perfect linear scaling (ie: not gonna happen)...."
Wrong conclusion. If the Xeon core is 3x faster, it does not follow that the M9000 needs 3x more sockets, because there is a huge difference between a core and a CPU. Maybe the M9000 CPUs have very many cores, maybe thousands of them - then you don't need 3x more _sockets_. Or, if the M9000 CPU had only one core and the Xeon had 1000 cores, then the M9000 would need many more than 3x the sockets to catch up with one Xeon.
But it is true that Xeon is faster at number crunching. The reason is that SPARC is designed for the Enterprise. Which means SPARC has better RAS (if a CPU instruction is erroneous, the CPU can roll back and replay the instruction, just like IBM Mainframes; Xeon cannot do this), and SPARC is better at SMP workloads because it scales better. You can run large databases on the M9000, but not on Xeon servers.
"....I can't help but notice that these benchmarks, that you claim show Linux does not scale well, are running on HP hardware, meanwhile all the examples of great scalability seem to come from the SPARC/Solaris hardware..."
HP has good scalability on the same 64-socket hardware when running their own Unix, called HP-UX. They offer 64-CPU configurations when running HP-UX on the Big Tux server. But when running Linux on the same Big Tux server, the largest supported Linux CPU configuration is 16 CPUs. No 64-CPU Linux configurations are offered. Why is that? Is it because Linux has problems utilizing 16 CPUs? Would larger configurations cause too many support problems? So, where are those superior Linux 32-socket benchmarks?
"....Let me know if you can find some benchmarks that test scalability for Solaris & Linux running on identical SPARC 16/32/64 socket hardware...."
I know that Linux runs on small 1-2 socket SPARC workstations, but it would be far-fetched to believe you can move the same Linux to 32-socket SPARC servers without problems. That would need a lot of tailoring, recompiling and redesigning of the Linux kernel. So don't expect to see Linux running well on larger SPARC servers. SPARC servers have traditionally been targeted at scalability, not number crunching. Many SPARC benchmarks are about scalability; it is their strength.
However, there are Solaris and Linux benchmarks running on nearly identical x86 hardware. With few sockets, 1-2, sometimes Linux wins and sometimes Solaris. But when you go up to 8 sockets (which is still a very small server), Solaris wins easily. The more sockets, the better Solaris scales. On 1-2 socket servers Linux wins most of the time (I suspect). On small handheld devices, Linux wins, I suspect. But on larger servers, 8 sockets and upwards, Solaris wins - because that is Solaris' domain. Solaris has long been made for scalability on large servers, and if Solaris did not win on large servers, that would be an indication something is wrong.
Re: Switching from big iron to x86 virtualisation
"...The Oracle Solaris engineers seem to think the Sparc Enterprise servers are NUMA architectures.... Should I trust the Solaris engineers or Kebabbert? Tough call..."
Please read my posts again so I don't have to repeat myself all the time. But what the heck, let me recap once more.
The M9000 with 32 sockets is not a true SMP server, and neither is the new Oracle M6 server with 96 sockets. But they act like true SMP servers because of good design. There are very few hops to reach memory cells far away; the worst-case latency on the M9000 is 500ns and the best case is 100ns. The spread is tight. So in effect you treat the M9000 as a true SMP server. You don't need to reprogram your software to make sure data is close to the current CPU, etc. No need for this.
In contrast, the SGI Altix cluster is a cluster. It has a worst-case latency of 10,000ns or 70,000ns, I can't remember the number. This means you can only run certain workloads on an HPC cluster. When you program for a cluster, you must make sure the data is allocated close to the current CPU, otherwise performance will be very, very bad. You must reprogram your software if you intend to run on an HPC cluster. For instance, SMP workloads such as databases do not run on HPC clusters. No one is running large Oracle databases on an SGI cluster. For a reason.
However, the Oracle M6 server with 96 sockets, about 10,000 threads and 96TB RAM is specifically designed to run databases and other SMP Enterprise workloads. So you don't have to redesign software to run on the M6 server; just copy your binary over without problems. In effect, the M6 server is an SMP server.
Did you understand why the Oracle M9000 (which is not a true SMP server) behaves like an SMP server in real life? You don't have to redesign all your software. But for an HPC cluster, you must.
Re: Switching from big iron to x86 virtualisation
The SGI Altix UV1000 cluster surely beats any SMP server on HPC number-crunching workloads; the SGI Altix cluster was made for that. I would be very surprised if an Enterprise SMP server such as the Oracle M9000 beat a pure number-crunching cluster at number crunching.
On the other hand, if you benchmarked the SGI cluster on SMP workloads, you would see very bad performance. The M9000 is made for SMP, and SGI is made for HPC. I don't know how many times I have to say this. They are running different workloads; one is doing number crunching, the other is doing SMP. The SMP server can run trivially parallelizable problems such as number crunching, but an HPC cluster cannot run SMP workloads with latencies of 10,000ns to 70,000ns - that would be impossibly slow. That is the reason no one on Wall Street uses HPC servers for databases and other Enterprise workloads.
Re: Switching from big iron to x86 virtualisation
I was trying to explain to you how the Linux fanboys are wrong when they talk about Linux's supreme scalability. I do not mean that you are like them. Sorry if I was unclear.
But I hope you see my point. There has never been a 32-socket Linux server for sale, so the Linux kernel developers could not have improved Linux for 32-socket scaling. You need to test your code on live hardware, and such hardware does not exist. Sure, some people have tried to run Linux on Unix servers and Mainframes, but just because it works badly and scales badly does not make them good options. There is no way in hell that Linux can scale well on 32 sockets, because there are no such benchmarks.
Well, actually, HP did benchmark once on 64 sockets (the Big Tux server), and Linux sucked badly with ~40% CPU utilization. Of those 64 CPUs, around 26 were used and the remaining 38 were idling. More than half of the CPUs were just sitting there, twiddling their thumbs - under full load. That is bad, actually. Very bad. You want to use all the CPUs and all the resources in a server. It can only take a deluded Linux fanboy like "Roo" to think that is a good result. 38 CPUs idling on a 64-CPU server under full load - is that good scaling???
Re: Switching from big iron to x86 virtualisation
"...CPU utilisation is a pretty meaningless statistic. What would you rather have 10 transactions per second with 100% CPU utilisation or 1000tps with 87% utilisation ? You can achieve 100% utilisation with a busy loop ffs..."
Please read my post again. I am saying that Solaris, using slower hardware than Linux, scored higher on SAP benchmarks. Why is that? At the same time, Solaris had higher CPU utilization and Linux lower. Coincidence? Maybe. The point is, in 8-socket SAP benchmarks, Solaris scored higher than Linux. On equivalent hardware. Both used the same Opteron CPU model, but one was clocked at 2.8GHz (Linux) and one at 2.6GHz (Solaris).
CPU utilization is very meaningful - if you run the same benchmarks. If SAP utilizes resources better on one OS than on another, is that "meaningless"? Most people don't agree with you.
"...Repeating a falsehood when the truth has already been pointed out and the facts are easily available to everyone reading the thread is a silly thing to do...."
Ok, show us all the truth then. It is only a matter of linking to an SMP-like Linux server with 32 sockets. No, an IBM Mainframe with 24 sockets running Linux does not count. Neither does an IBM AIX Unix server running Linux. So please show us a Linux server instead of calling me a liar. Can you, or can you not? If not, then don't call me a liar, because that makes you the liar - you lie about me.
The only 32-socket server capable of running Linux today is a Unix server. The IBM Unix P795 is only a couple of years old (3 years or so); before the IBM P795, where was any 32-socket Linux server? None existed. And still, prior to the P795, the Linux fanboys claimed that Linux scaled extremely well. On what hardware? That is just laughable. I mean, where is the proof? Where are the benchmarks? I can never understand how Linux fanboys can go raving about Linux's excellent scalability - when it has never been tested! Well, it has actually been tested once, on the HP Unix server with 64 CPUs, and Linux scaled horribly according to HP's benchmarks.
"The truth has been pointed out" - what truth? Seriously? In what reality do you live? WHERE ARE THE 32 SOCKET LINUX SERVERS??? WHERE???
A major problem with hardware raid: it is unsafe, your data might get corrupted, and it is vulnerable to the write-hole error, which makes you lose all your data. ZFS fixes all these problems.
Re: Hardware raid obsolete
Of course I am serious. There is no reason to use a hardware raid card over software raid such as ZFS. Hardware raid has only disadvantages: it is slow (a server has tons more resources), costs a lot (ZFS is free), vendor lock-in (you cannot migrate your disks to another card; you must buy an identical card from the same vendor), SPOF (if the card breaks you are toast; with ZFS you use several cards so you don't have a SPOF), etc etc, and a dozen more disadvantages. Actually, I cannot come up with a single reason to use hardware raid over software raid such as ZFS.
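Here is a minimal toy model of the RAID-5 write hole in Python, just to show the mechanism. It is a deliberate simplification (real controllers and ZFS are far more involved): a stripe holds two data blocks and an XOR parity block, and updating a data block takes two separate writes, so a power cut between them leaves the stripe inconsistent.

```python
# Toy model of the RAID-5 write hole. Updating d0 requires two writes:
# the new data block, then the new parity. A crash in between leaves
# stale parity on disk.

def parity(d0, d1):
    return d0 ^ d1

# Consistent stripe
d0, d1 = 0b1010, 0b0110
p = parity(d0, d1)
assert d1 == p ^ d0           # parity lets us reconstruct a lost block

# Update d0, but "lose power" before the parity write lands
d0 = 0b1111                   # new data is on disk
# p is now stale - the crash happened before parity was rewritten

# Later, the disk holding d1 dies; we reconstruct it from d0 and p
reconstructed_d1 = p ^ d0
print(reconstructed_d1 == d1)  # False: silent corruption

# A copy-on-write scheme like ZFS never overwrites the live stripe; it
# writes the new blocks elsewhere and flips a pointer atomically, so a
# torn stripe can never be presented as valid data.
```

The point of the sketch: the corruption is silent, because the array happily XORs stale parity with fresh data and returns garbage.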
Re: Linus is totally wrong
Duke Arco of Bummelshausen
"...That will be sufficient for all your needs, believe me on this. Most people will be even OK with an RC4 stream...."
The same RC4 that NSA might have broken?
Re: I think Torvalds is losing it
"...I've seen Torvalds present (on GIT and the failings of the CVS / Subversion model). He began the presentation by saying "You can disagree with me if you want, but if you do then you're stupid and you're ugly". And you know what? It got a good laugh from the crowd...."
You know, they would laugh at anything he said. They are his worshippers, and he is their God. He is flawless in their eyes. Even if Torvalds insulted and humiliated them, they would gladly accept being peed upon. They are brainwashed, a sect.
No sane person would accept Torvalds' behavior, as we can see in this thread.
Re: Switching from big iron to x86 virtualisation
"...I apologize to the readers for not having given the proper icon first way round. After all, the orig comment I replied to was about 32 CPUs, not 32 CPU sockets ... in any case, I agree that "more" of some sort will give you bragging rights amongst a certain audience..."
So, what is the difference between a 32-CPU and a 32-socket server? In each socket there is a... CPU? Right? So shouldn't a 32-CPU server be the same as a 32-socket server? I agree that the core count might differ, but a CPU with 16 cores and a CPU with 4 cores are both one single CPU, and they both sit in a single socket. Or do you have another explanation?
Regarding Linux and 32-socket servers: there are a lot of Linux fanboys who claim Linux scales so well - it is the best, they say. Well, I ask, how do you know that? Can you show me Linux benchmarks on a 32-socket server that prove Linux scales well? And no, they can't, because there has never been a 32-socket Linux server for sale. So how do they know that Linux scales well on large servers with 32 sockets? Pure imagination, I guess. And when I ask for benchmarks on a 32-socket server, they call me names: "dumb", "idiot", etc. At that point I know I have won, because if they really had proof they would have shown us the links. But they have nothing, so they revert to harsh language instead. Just like Linus Torvalds himself.
So the fact is: Linux scales very badly on larger SMP-like servers. It has trouble scaling even on 8-socket servers; just study some benchmarks. For instance, on SAP benchmarks, Linux used higher-clocked CPUs and faster RAM sticks, and still Solaris got a higher score, because Solaris had better CPU utilization at 99%, whereas Linux had 87%. Linux scales like crap on SMP servers, because no kernel developer can make Linux scale well on large SMP servers when no such server exists to test Linux on. However, on clusters Linux scales well; no one denies that.
Linux scaling on SMP servers: it has problems scaling above 8 CPUs, because there is no Linux SMP server larger than 8 sockets. So no Linux kernel developer can tailor the Linux kernel for 16 sockets, or 24, or 32. Or 96 sockets, as the Oracle M6 server has. Solaris, AIX and HP-UX developers have had access to large 32-socket servers for decades of testing, so these Unixes run fine on such servers.
Linux scaling on HPC clusters: very good.
Re: Switching from big iron to x86 virtualisation @ Kebebbert
"....[You can copy your Solaris binary to a SMP server and expect it to perform well without rewriting it] That is only a given if your binary is solving a trivially parallelized problem, and in those cases a cluster will do just as well...."
Not at all. I don't know how much you know about computer science or programming, but these Oracle SMP-like servers are not only running embarrassingly parallel problems. If you believe that, I suggest you study the customers (enterprise companies) and what they use the Oracle servers for: typical SMP workloads such as big databases.
Then I suggest you compare the customers for HPC clusters such as the SGI Altix and their workloads. You will find the customers are researchers, weather forecasters, oil companies, etc, and they all do number crunching. No one runs big databases or other SMP workloads on them.
I don't know how many times I need to say this. Just compare the workloads: see what the SGI Altix server is used for, and see what the Oracle, IBM and HP Unix servers are used for. You will see there is a big difference. Do you understand now, at last? These servers are used for different tasks: one kind for SMP work, the other for HPC number crunching.
"....Ah, so a 32 socket server that is running Linux doesn't count because it can run Windows, HP-UX or AIX too ? You must be a fully paid up member of the Flat Earth Society...."
I am trying to say that these big Unix servers made by IBM, HP and Oracle are Unix servers. Benchmarks from HP show that Linux does not run well on 64-socket SMP servers. My point is that no Linux vendor is designing 32 socket SMP Linux servers. No one. If you need 32 sockets, you need to go and buy a large, expensive Unix server, compile and install Linux for it - and pray Linux does not fall apart on the server. I would hardly classify that configuration as a "Linux server". These Unix servers were built for Unix, and Unix scales well on them. Linux is hardly supported on them: on the HP-UX server, Linux is only supported up to 16 cpus - do you really call that a "Linux server"???
Just because my car can run on race tracks, I don't believe it is a Formula 1 car - do I? Exactly what is it you have difficulty understanding? You don't agree that all these large Oracle/IBM/HP 32-64 socket servers were built for running Unix, and that HP provably has big trouble running Linux above 16 cpus? What is so difficult to understand?
Do you still believe Linux scales well on 32 socket servers? If you do, on which server have you seen those benchmarks? Not on HP's servers, because they have documented bad scaling. Maybe you have seen Linux benchmarks on the IBM P795 server? Can you show us those benchmarks? If you can not show benchmarks on the IBM P795, then on which server have you seen good Linux benchmarks? Not HP. Not IBM. Not Oracle. Hmmm... there are no other 32 socket vendors. I wonder how this "Roo" guy can claim Linux scales well on 32 sockets - because it doesn't. Maybe he is talking right out of his nose, making things up with a vivid imagination?
ZFS more common than NetApp and EMC Isilon combined:
"...We [Nexenta] alone have half as much storage, we figure, under management as NetApp claims. Add Oracle and you’re already bigger than any one-storage file system. Add all Solaris and illumos deployments on top of that and you are 3-5x larger than NetApp’s OnTap. In fact, the number of ZFS users is larger than those using NetApp’s OnTap file system and EMC’s Isilon file system combined...."
Re: Who can tell?
Mixing random generators is never a good idea; it weakens everything if not done correctly. If you had studied the subject you would know this. But a mere Linux developer would of course believe he knows everything.
Re: Linus is totally wrong
Blah, blah. I know the difference. I did some work on group theory and pseudo random generators. It turned out the work was already known, but I did not know that when I started. Do you want to read my thesis on the subject?
Re: Simple h/w device?
That is a good idea, actually. It should have a market, indeed. For instance a small radioactive source, a microphone, or something similar. Another idea would be to record noise from a microphone and extract randomness from it.
A friend at uni had to create random numbers for some software, so he took a photo with the usb camera and hashed the photo to extract random numbers. His software used a usb camera anyway, so it already had access to one.
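The idea can be sketched in a few lines of Python (a toy illustration, not my friend's actual code; the `frame` bytes here are just a stand-in for a real camera capture):

```python
import hashlib

def entropy_from_frame(frame: bytes) -> bytes:
    """Condense the unpredictable sensor noise in a camera frame
    into a 32-byte digest suitable for seeding a generator."""
    return hashlib.sha256(frame).digest()

# Stand-in for a captured photo; a real program would grab
# raw pixels from the usb camera instead.
frame = bytes(range(256)) * 1000
seed = entropy_from_frame(frame)
print(len(seed))   # → 32
```

Hashing works here because the low-order bits of each pixel are dominated by sensor noise, and the hash mixes that noise across the whole digest.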
Re: Linus is totally wrong
Yes, I know all that. I studied cryptography under one of the leading experts in the world. He is world famous; if you have studied cryptography, you have surely heard of him.
Re: Linus is totally wrong
"...I know of the story you're referring to, and you're mis-stating it. First, the "mixed sources" random number generator used linear congruential generators -- no PC noise, ..."
No, you don't. I studied cryptography back then, and I remember that some company - was it Netscape? - used the space left on the hard disk as one of the inputs when creating random numbers. They used "PC noise", that is for sure. It seems you have not read the same story I did.
And what do you mean, "there is no such thing as random"? Have you read professor Chaitin's work on algorithmic information theory, where the very concept of randomness is defined? Read it and then come back.
(I apologize, but it is funny how much you sound like a Linux kernel developer who thinks he knows everything, when in fact he has not studied the subject and knows nothing about it. Hubris is what all Linux kernel developers display. But I am not accusing you of this, I am just saying it sounds a bit funny.)
Re: Switching from big iron to x86 virtualisation @ Kebebbert
Of course you should consider caches and align vectors properly when programming SMP servers. Of course there will be differences between best and worst case latency on SMP, because of cache and pipeline stalls, etc. Who on earth would believe the opposite???
My point is that when you program for a NUMA cluster with a worst case latency of 10.000ns or more, you need to carefully redesign your software. You can not copy your binaries to a NUMA cluster and expect them to perform well; they won't.
But for an SMP-like server such as the Oracle/Sun servers, you can copy your binary over and expect it to perform well. That is why Oracle is building the M6 server with 96 sockets, ~10.000 threads and 96TB RAM. Oracle intends to run databases on it. As Larry Ellison said, he does not believe in SAP Hana in-RAM databases, because Oracle databases will run as well as (if not better than, thanks to more RAM) the Hana RAM database. You will never see anyone run a database on a NUMA cluster; that would drag performance to a halt.
"...You have already been given plenty of examples [of 32 socket Linux servers], you chose to assert they are not Linux servers which is pretty dumb seeing as they are servers that run Linux...."
I have been given two different Linux NUMA servers. Ever. And I myself have given an example of a third NUMA cluster, the ScaleMP server. As we all know, NUMA servers are clusters. And they don't run SMP workloads, such as huge database configurations; they only do number crunching.
I have also been given ONE single example of a 32 socket server, the IBM AIX Unix P795 server. I myself have given an example of a 64 socket server, the HP-UX Itanium Superdome server. But as we all know, these Unix servers are... Unix servers. Linux scales awfully badly on the HP-UX server; it is hardly supported (only up to 16 cpus). On the IBM P795 I expect Linux to scale just as badly. I would be surprised if any customer in the world ran Linux on such an expensive server. For the price of one single POWER7 cpu, you could buy a cheap 4-socket x86 server. Nobody runs Linux on an expensive IBM Unix AIX P795 server, I am convinced. The predecessor, the old 32 socket IBM P595 used for the old TPC-C record, cost $35 million list price. Who would run Linux on a $35 million server? Why not buy a bunch of cheap x86 servers instead?
So again: I invite anyone to show links to a 32 socket Linux server that is not an existing Unix server. We are talking about enterprise, and enterprise runs big databases, not number crunching. Is there any 32 socket Linux server out there? No? And there never has been. No matter how harsh your language, it won't change that fact. Put up or shut up: show us the links, the proof.
Re: Torvalds needs a paranoia transplant
Netscape mixed different random sources and thereby introduced a pattern, so it was breakable. Donald Knuth says never to mix stuff; rely instead on a proven, mathematically strong design. Just because you can not break your own crypto does not mean it is safe. Read my post further down.
As I explain further below, Netscape(?) mixed different random sources (the current millisecond, the space left on the hard disk, etc) with a random number generator - and researchers broke it.
As Donald Knuth explains in his Art of Computer Programming: mixing random sources is never a good idea. His own home-brewed random generator, which mixed lots of stuff, had a remarkably short period before repeating itself. Read his chapter on random number generators. It is obvious the Linux kernel developers have not, nor have they studied cryptography. Donald Knuth says it is better to rely on a proven, mathematically strong system than to make your own. Read my post further down.
Linus is totally wrong
There was a famous example of... Netscape(?), I think, mixing different random sources. They used a random number generator and added the current millisecond, how much space was left on the hard drive, etc, to create a "truly" random number. But researchers succeeded in breaking it: because they knew what the building blocks were, they could infer things such as "a typical hard drive is this big", etc. The researchers could thereby discard a large part of the search space and decipher everything. It was a lot of work, but it was doable. Mixing different sources does not make better randomness. The Linux kernel developers would have known this if they had studied cryptography (which I have).
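The attack can be sketched in Python. This is a toy reconstruction, not the actual Netscape code: the seed formula and the pid/timestamp ranges are made-up assumptions, chosen only to show how guessable inputs collapse the search space to something brute-forceable:

```python
import random

def weak_key(seconds: int, pid: int) -> int:
    """Toy key generator seeded only from guessable values
    (hypothetical seed formula, for illustration only)."""
    rng = random.Random(seconds * 65536 + pid)
    return rng.getrandbits(64)

def crack(target: int, t_lo: int, t_hi: int):
    """Enumerate the collapsed search space: a handful of plausible
    timestamps times a few thousand plausible process ids."""
    for s in range(t_lo, t_hi):
        for pid in range(1, 4096):
            if weak_key(s, pid) == target:
                return (s, pid)
    return None

target = weak_key(1_000_000, 123)           # the victim's "random" key
print(crack(target, 999_990, 1_000_010))    # → (1000000, 123)
```

A 64-bit key should take 2^64 guesses; here the attacker only needs about 80,000, because the inputs were never secret in the first place.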
Donald Knuth has a very interesting story on this in his magnum opus "The Art of Computer Programming". He was supposed to create a random number generator many years ago, so he mixed lots of different random sources, the best he could. And Donald Knuth is a smart mathematician, as we all know. Afterwards he analyzed it and discovered it had a very short period: it quickly repeated itself. That taught Donald Knuth never to try to make a random generator (or cryptosystem) yourself; just because you can not break your own crypto or random number generator does not mean it is safe. Donald Knuth concludes in his book that it is much better to use a single well researched random generator / cryptosystem than to make one yourself. Much better. If you start to mix different sources, you might introduce a bias which is breakable. It suffices for the adversary to be able to discard some numbers in the huge search space to break it.
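You can mimic Knuth's experiment yourself. The generator below is my own toy "super-random" step in the spirit of his cautionary tale (it is not Knuth's actual Algorithm K): it piles several ad-hoc scramblings together, and then we simply iterate until a state repeats:

```python
def homebrew(x: int) -> int:
    """An ad-hoc 'super-random' step: multiply-add, reverse the decimal
    digits, xor-shift. Looks chaotic, has no designed period at all."""
    x = (x * 3141592653 + 2718281829) % 10**10
    x = int(str(x)[::-1])          # reverse the decimal digits
    return (x ^ (x >> 7)) % 10**10

def cycle_length(x: int, limit: int = 1_000_000):
    """Iterate until a state repeats; return the length of the loop."""
    seen = {}
    for i in range(limit):
        if x in seen:
            return i - seen[x]
        seen[x] = i
        x = homebrew(x)
    return None

# Despite 10 billion possible states, an undesigned map like this
# typically falls into a loop after only on the order of sqrt(N) steps.
print(cycle_length(12345))
```

Compare that with a properly analyzed generator: Python's own Mersenne Twister has a proven period of 2^19937 - 1. That is the difference between designing the period and hoping for one.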
So the NSA and the like would be more concerned if Linus used a proven, high quality random generator. As Snowden said: the NSA breaks cryptos by cheating; the NSA has not broken the mathematics. The math is safe, so use a mathematically proven strong random generator instead of making your own. Rolling your own is very bad, as you learn if you study basic cryptography.
The Linux kernel developers seem to have very high thoughts of themselves without knowing the subject. Probably they would also claim their own home-brewed crypto system is safe, just because it is complex and they themselves can not break it. That would also be catastrophic. They should actually study the subject instead of displaying hubris. But with such a leader...
Re: Switching from big iron to x86 virtualisation
"...If your definition of x86 servers can stretch (a lot) the Cray XC-30 might be of interest. It ships with the Cray Linux Environment, a cabinet can hold 384 sockets (3072 Xeon E5 cores), infinband I/O. You can add a lot of cabinets too. At the lower end you have the SeaMicro boxes ranging from 64 to 256 sockets (and I have seen them referred to as 'servers')...."
The Cray XC-30 is a cluster. It is an HPC cluster used for number crunching. Have you seen the workloads the Cray tackles? All embarrassingly parallel workloads, running on lots of nodes, on a cluster. Cray does not make SMP servers (a single big fat server, running for instance databases). Cray makes computing clusters, not enterprise servers running enterprise workloads. You will never see such HPC clusters running big fat Oracle databases, for instance.
"....256 socket NUMA single system image coherent global shared memory. It is not a distributed memory HPC cluster.... The IBM P795 has 32 sockets and SuSE Enterprise Linux is one of the supported systems...."
First, ccNUMA servers are clusters - not SMP servers, nor close to SMP servers. They can not handle SMP workloads; they are clusters.
Second, the IBM P795 is not a Linux server. It is an AIX server that someone has compiled Linux for. I doubt anyone runs Linux on it, because the P795 is so expensive; it is better to run Linux on cheap x86 servers. Besides, Linux would never be able to scale to 32 sockets. HP had their "Big Tux" Linux server, the 64 socket Itanium Integrity (or was it Superdome?) server they compiled Linux for, and Linux had something like ~40% cpu utilization using 64 sockets. Linux scaled so badly on 64 sockets that when HP sold Big Tux, HP only allowed Linux to run in a partitioned server. The biggest supported Linux partition on Big Tux was 16 cpus. If Linux scaled well, HP would have supported 64 socket partitions too. But they didn't. And the 16 cpu Linux partitions did not work that well either. If you look at modern benchmarks of Linux on an 8-socket x86 server, the cpu utilization is quite bad: SAP benchmarks, for instance, show 87% cpu utilization on an 8-socket server. 16 sockets would give... 60% cpu utilization, I guess. And 64 sockets gives ~40% cpu utilization in confirmed benchmarks. I am convinced the IBM P795 Linux offering is very limited in terms of scalability - say, 40% cpu utilization, or only allowing maximum partitions of 8-16 cpus.
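Those utilization figures are, roughly, what Amdahl's law predicts when a small part of the work is serialized. A back-of-the-envelope sketch - the ~2% serial fraction is my own assumption, picked because it roughly reproduces the 8-socket and 64-socket numbers quoted above:

```python
def efficiency(sockets: int, serial_fraction: float) -> float:
    """Parallel efficiency (roughly 'cpu utilization') under Amdahl's law:
    speedup = 1 / (s + (1-s)/N), efficiency = speedup / N."""
    speedup = 1.0 / (serial_fraction + (1.0 - serial_fraction) / sockets)
    return speedup / sockets

# A serial fraction of ~2% (my assumption) gives:
for n in (8, 16, 64):
    print(n, round(efficiency(n, 0.02), 2))
# → 8 0.88, 16 0.77, 64 0.44
```

Note the model gives ~77% at 16 sockets rather than my 60% guess; the point stands either way, since efficiency only falls as sockets are added.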
So, no, there are no Linux SMP-like servers. Some people have compiled Linux for big Unix servers, but that does not make them Linux servers. For instance, you can run Linux on an IBM Mainframe with 24 sockets, but that does not make the IBM Mainframe a Linux server.
If you look at the RAM latency of a true SMP server, it is uniform from every cpu: no matter which cpu you use, it accesses RAM as fast as every other cpu.
SMP-like servers, for instance the Oracle M9000 SPARC server with 64 sockets, have a worst case latency of 500ns, which is quite bad. But the best case is something like 100ns, so the spread is tight; there is no big difference between worst and best case. The Oracle M9000 is not a true SMP server, because there is some difference in latency - but that does not matter, see below.
If you look at the latency of a NUMA cluster like the SGI Altix 256 socket Linux server, the worst case latency is something like 10.000ns - or was it 70.000ns? I can't remember, but I know it was above 10.000ns, which is catastrophic. This changes everything. If you develop for the SGI server, you must allocate data close to the current node, and you design your software differently than for an SMP server - otherwise the performance will be extremely bad. In effect, you design your software exactly as if it were a cluster: you allocate data close to the current node, etc. And if you look at the workloads customers buy SGI for, it is HPC workloads and other clustered workloads. If you do SETI-style number crunching, each node hardly has to talk to the other nodes; that is the typical parallel workload that NUMA clusters handle fine.
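A simple weighted-average model shows why the worst case number matters so much. The latencies below come from the figures above; the remote-access fractions are illustrative assumptions:

```python
def effective_latency(local_ns: float, remote_ns: float,
                      remote_fraction: float) -> float:
    """Average memory latency when some fraction of accesses
    must go to a remote node (simple weighted-average model)."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns

# SMP-like box (100ns best, 500ns worst): even if half the accesses
# hit the worst case, the average stays low.
print(effective_latency(100, 500, 0.5))      # → 300.0
# Big NUMA cluster: just 10% of accesses at 10.000ns already makes
# the average ~11x worse than all-local.
print(effective_latency(100, 10_000, 0.1))   # → 1090.0
```

This is why NUMA-cluster software must be redesigned to keep data node-local, while a tight-spread SMP-like box forgives unmodified binaries.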
If you study the new Oracle M6 server with 96 sockets and ~10.000 threads, it is a mix of 8-socket SMP servers connected with NUMA links. But the worst case latency is again only 2 hops or so, just like in the Oracle M9000 server. If you need to access data, it will not take more than 2 hops to reach it, which is fast. In effect, when developing for the M6 server, you design your program as if it were a true SMP server. You don't need to allocate data to close nodes, etc; just develop your software as normal. So the Oracle M9000 and Oracle M6 run SMP workloads just fine, and you don't have to redesign your software. Just take your current Solaris binary, copy it to the M6 server and it will run fine. Try to do that on the Linux SGI cluster and it will show extremely bad performance unless you redesign your program. The Oracle M6 will run SMP workloads such as the Oracle Database in very large configurations - databases are all Oracle cares about. So, in effect, the Oracle M6 and M9000 behave as if they were true SMP servers, and you don't need to redesign your software to run on them. That is not the case with the Linux SGI clusters; they can never show good performance running large database configurations with a worst case latency of 70.000ns.
So, no, there are no 32 socket Linux servers for sale. Sure, IBM has the AIX P795 server that they offer Linux on, but that does not make it a Linux server. And IBM offers a Mainframe with 24 sockets to run Linux, but that does not make the Mainframe a Linux server either. If someone can show a link to a 32 socket Linux server I would be very surprised, because no one has ever manufactured such a Linux server. And NUMA servers don't count; they are just clusters. Anyone can make a cluster - basically, slap a lot of PCs onto a fast switch and you are done.
So, I invite anyone to show a Linux server with 32 sockets. There has never been one to this day. Why? Because Linux scales very badly, as the benchmarks from HP on their 64-socket HP-UX server show. Linux can not go beyond 8-socket servers today. And Linux does not even handle 8-socket servers well; just look at such benchmarks and read about "cpu utilization".
Re: Switching from big iron to x86 virtualisation
"....Refresh your tech knowledge. There's quite a few options in the x86 space these days that offer 16 or 32 CPU cores. Even if you count chips / sockets, you have a range of choice in the 8-socket space (remind me there, how many CPU sockets did a T5-8 have, again ... ?).
Yes, there's not that many x86 servers out there that can have 32 CPU sockets. If that's what counts for you, go IBM / Fujitsu / Oracle...."
I am glad that you agree with me: there are no 16 or 32 socket Linux SMP servers for sale, and there never have been. The T5-8 has 8 sockets. The Oracle M6 has 96 sockets, the Fujitsu M10-4S has 64 sockets. The IBM P795 has 32 sockets. HP has an Itanium Superdome/Integrity server with 64 sockets. There has never been a 32 socket Linux SMP server for sale. If someone objects, I invite him to post links to one. Good luck; no such server has ever existed. Sure, the SGI Altix 2048 core server is an HPC cluster, but there are no 32 socket Linux SMP servers for sale. The ScaleMP 2048 core Linux server runs a single image kernel on a cluster; the latency is so bad it is only fit for HPC workloads such as number crunching, just like the SGI Altix cluster.
The >32 socket server market is very, very lucrative and high margin. The x86 business is low margin, and Larry Ellison has declared that he does not care if the x86 business at Oracle dies, because the margin is so low. IBM and Oracle do high margin business; that is where the big bucks are. For instance, the old 32 socket IBM P595 used for the old TPC-C world record cost $35 million list price. $35 million. I kid you not. That is some serious money. The largest IBM Mainframe has 24 sockets, and it is also very lucrative and one of IBM's big cash cows.
If you Linux supporters say that I am wrong, please show us a link to a 32 socket Linux server. No one has ever shown me such a link. Never ever. They claim I am wrong, but... no links. Lots of talking, but no proof. If I am wrong, then it is easy to make me shut up: show us a link to a single 32 socket Linux server. :)
Re: Switching from big iron to x86 virtualisation
Linux is fine for low end server work, or for large clusters doing parallel HPC computations such as the SGI Altix server. x86 servers are also getting more and more powerful, so Linux can take over low end workloads. But there are no high end SMP Linux servers for sale and there never have been, so for high end you must go to Unix or Mainframes. You have no choice. Enterprise workloads require large SMP servers with as many as 16 or 32 cpus, and no one has ever sold such large Linux servers. For cluster workloads, HPC Linux servers are fine - but no enterprise uses HPC servers, only SMP servers.
My links got messed up; check them to find NetApp being 10x more expensive, for slower performance.
NetApp does some great storage servers, at a high price. NetApp servers are running FreeBSD.
There are ZFS based storage servers running OpenSolaris that are much cheaper than NetApp. Sure, ZFS is not clustered, but if you don't need clustered storage, then ZFS will provide ample resources at a very low price. Check these ZFS benchmarks vs NetApp servers. ZFS is 32% faster, and NetApp is 10x more expensive:
And also, ZFS beats EMC / Isilon, NetApp, etc:
There are other ZFS vendors as well: Tegile, GreenByte, Nexenta, etc
Hardware raid obsolete
A hardware raid card is essentially a tiny PC on a card, with its own cpu, RAM, BIOS, raid software, etc - that is the reason they are expensive. You are buying a tiny PC.
Long ago a server's cpu was weak and you needed to offload I/O to raid cards. Today a multi core server cpu has plenty of power and you don't need to offload I/O anymore. For instance, running ZFS on a single core costs something like 2-3% of its cpu power. That is nothing. A typical raid card, by contrast, has an 800 MHz 32-bit PowerPC cpu - and yes, they are very weak.
Also, a server might have 16-32 GB of RAM, whereas a raid card might have 256-512 MB RAM. A server is vastly superior in every aspect, so why not run your raid software on the server instead? Use software raid such as ZFS and you get superior protection and performance. No 800MHz raid card cpu can outclass a multicore 2.7GHz server cpu. So who buys raid cards today? They are obsolete. Use software raid instead: sell your raid card, switch to open sourced software raid, and save money and gain performance.
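The core trick a raid card performs - parity - is just XOR, which is exactly why a modern server cpu shrugs it off. A minimal sketch of RAID-5 style parity and rebuild (an illustration of the principle, not ZFS's actual RAID-Z code):

```python
def xor_parity(blocks):
    """XOR equal-sized blocks together, byte by byte (RAID-5 style parity)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

# Three "disks" worth of data plus one computed parity block.
data = [b"disk0data", b"disk1data", b"disk2data"]
parity = xor_parity(data)

# If disk 1 dies, XOR the survivors with the parity to rebuild it.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])   # → True
```

Rebuilding works because XOR is its own inverse: parity ^ d0 ^ d2 cancels everything except the lost d1.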
Re: Catching up on SPARC T5
"...If single-thread performance is the most important thing for a piece of work, a core or set of cores will step down the threading automagically and run it with fewer processor threads...."
This sounds exactly like the critical threads feature introduced in the old SPARC T4, where an important thread can take over a whole core, so the core runs only that single thread. That is why the T4 has strong threads and also extreme throughput when needed - you choose at run time.
Also, POWER8 cpus are finally getting transactional memory, first introduced commercially in the SPARC T5 cpu. Sun's old ROCK cpu had transactional memory years earlier, but it never made it to delivery; the ROCK research shows up in those SPARC Txx cpus instead. Intel Haswell also has transactional memory now. So: SPARC T5 got transactional memory, later Intel Haswell got it, and now IBM is trying to catch up with transactional memory too. Better late than never, though.
As I wrote:
"....Funny how POWER is getting more similar to SPARC for every iteration...."
It would be cheaper for IBM to just license SPARC cpus, so IBM would not have to catch up all the time.
Re: Catching up on SPARC T5
There are lots of inconsistencies and plain wrong "facts" in your post. You should do some reading to catch up. Or are you up to date, and deliberately writing false things?
"...Fast cores -> not sparc...." We have shown official benchmarks: the SPARC T5 has 25% faster cores than POWER7, and the SPARC T5 has twice the number of cores. So yes, the SPARC T5 has faster cores and is also the faster cpu - up to 2.4x faster than POWER7 on TPC-C benchmarks.
"...Wow, after Oracle letting most of the Sun tech die a silent death..." What are you talking about? Oracle is capitalizing heavily on both the SPARC CMT cpus (T5) and the SPARC M6 cpus. And Oracle bets heavily on Solaris too. And Java, etc.
"....OK, you are never out in the field so I will tell you how this works. If you are at Solaris 10 and want to go to Solaris 11 you cannnot upgrade. So one has to find or buy a new server, install your new OS and application there and then you have to replicate what is years of configuration from the old production server. In the end you migrate the data and do a switch. This takes weeks, is risky and has extreme cost. You can call Solaris "the most advanced operating system in the universe"(true, Sun has stated this!) all you want but it is still amateur night...."
You are totally off here. It is obvious you are never out in the field. Let me tell you how it works. If you want to migrate a Solaris 10 server to a Solaris 11 server, you utilize Containers (you know, the tech that IBM AIX copied and named WPAR, just as IBM AIX copied DTrace and named it ProbeVue): you zip up the entire Solaris 10 server and dump it into a container on a Solaris 11 server. Then you are done. You can do the same with Solaris 9 and Solaris 8 servers: dump them onto a Solaris 11 server via Containers.
You have some reading to do. Or you have just not understood what Solaris Containers are good for.
Re: Bandwidth != LAtency
"...I wonder if the compiler / kernel will be able to attempt to 'intelligently' allocate or shift threads to cores where the other threads that need to 'talk' to the first one are a small number of hops away...."
I suspect it is not really necessary. On a NUMA cluster you must design your program like that, because the worst case latency might be 10.000 ns or more. But on this M6 server the worst case is only 2-3 hops away, which makes for good latency. So you just program this server as a true SMP server - normal programming, copy your binaries over and off you go. You don't need to recompile and redesign to make sure your data is in close nodes. Not on this server. Just treat it like a true SMP server, thanks to the good design.
But if, on the other hand, you want to port your Linux applications to a NUMA cluster such as the SGI Altix servers, you must redesign and rewrite the programs. Otherwise performance will grind to a halt, if you do not make sure the data is located in close nodes. In the worst case it is almost like accessing a hard disk, because the nodes holding the memory you need are so far away. That is why these Linux NUMA clusters are directed at HPC parallel workloads; just check the use cases and the benchmarks. It is all HPC cluster stuff. No SMP stuff.
Re: SPARC is not even competitive to Power
"....Clearly the Oracle marketing team had an extra shot of espresso today. Statements about SPARC outperforming Power7/7+, copying SPARC features, SPARC is significantly less expensive than Power, oh yal and the old reliable "Oracle software licensing cost is 2X more on Power, etc, etc is laughable. These statements are misleading at best...."
Oh, this sounds like good old IBM FUD. Lots of unsubstantiated negative claims about a competitor, with no proof - which is the very definition of FUD: spreading negative rumours that have no bearing on reality. "Yes, I heard this horrible rumour, but I can not prove it is true" - because it is not true.
So, if you claim that SPARC does not outperform POWER7, or that SPARC is not cheaper than POWER - can you show us some hard facts that prove your standpoint? Show us some performance benchmarks. And show us some pricing comparisons. Come on, I dare you! :)
If you can not show any hard proof (which you can not, because we are telling the truth - just check the benchmarks we posted), then the IBM camp is being a bit over ambitious again. :)
I remember one of the diligent IBM supporters here, who has not been active of late. I said that x86 is faster than POWER6 in LINPACK benchmarks; in response he said "no, it isn't. The POWER6 is faster in LINPACK because the POWER6 has faster cores", or something equally weird. I can't recall the exact logic he used, because it was wrong; I can't think like that. I pointed out that you need two POWER6 cpus to match one Intel Xeon in LINPACK, but he said "well, the POWER6 has faster cores, so it is faster in LINPACK". Sure, the Xeon had six cores and scored twice as high, and if you count per core the POWER6 was faster, true. But that does not make the POWER6 the faster cpu, does it? I don't understand the logic IBMers use.
I also said "in this benchmark the SPARC T2 has higher throughput than the POWER6 system", to which he replied "it doesn't matter, the POWER6 has lower latency, which is the important thing!". Later I showed another benchmark where the SPARC T2 had lower latency, to which he replied "it does not matter, because the POWER6 has higher throughput, which is the important thing!". I really don't understand the logic IBMers use.
And now here comes this "PowerMan@thinksis", claiming lots of weird stuff without providing any benchmarks or pricing examples. Out of the blue he says we are all wrong, without pointing out the errors. "Trust me on this: you are wrong. I can not tell you where you are wrong, but I know you are wrong. I am a doctor, trust me." I really don't understand the logic IBMers use. Or the lack of logic. Or the marketing aggressiveness the IBMers display.
BTW, have you heard that AIX is going to be killed off? It will supposedly happen sometime in the future, when cheap Intel x86 cpus catch up with expensive POWER cpus. Because IBM only does high margin business, and if x86 is cheaper than POWER at the same performance, then IBM will shut down POWER - and AIX too. AIX runs only on POWER, and without POWER servers AIX can do nothing. So the POWER future looks grim. Yes, it is true, I really heard it. Am I talking out of the blue, spreading false rumours I just made up - or can I prove this? Well, read it here yourself and see that I am not spreading made up rumours like the IBM camp does. No, I back up my claims with hard, undeniable facts, straight from IBM:
"...Asked whether IBM's eventual goal is to replace AIX with Linux, Mills responded, "It's fairly obvious we're fine with that idea...It's the logical successor." A replacement "won't happen overnight," Mills said, but years of experience designing operating systems at IBM and other companies means developers know just where Linux needs to go. "The road map is clear. It's an eight-lane highway."
Re: Catching up on SPARC T5
"...Software is licensed per core and even the old Power7 has more than twice the performance per core of the new T5..."
Yes, I do not rule out the possibility that POWER7 might be faster per core on _some_ benchmarks. However, on database workloads the SPARC T5 is faster both per chip and per core. In fact, the SPARC T5 is the world's fastest cpu today. For instance, in SPECint2006 and SPECfp2006 raw cpu power the SPARC T5 is faster, just check the benchmarks. The SPARC T5 is also 2.4x faster than POWER7 per cpu in real life TPC-C benchmarks, not only faster in theory.
The T5 has 16 cores and the POWER7 has 8 cores, so if the T5 had the same oomph per core, the T5 cpu would be 2x faster. But the T5 is 2.4x faster, which means the T5 cores are faster than the POWER7 cores (2.4x / 2 = 1.2x per core). So there might be some benchmarks where POWER7 is faster core wise (I don't know which, though), and there are benchmarks where the T5 is faster core wise, including databases.
The SPARC T5 beats the POWER7 core for core in database workloads, so you don't need to license as many cores as on POWER7. Plus you have twice the number of cores in the T5, so if you need even more performance you can get it, leaving POWER7 far behind. So the SPARC T5 hardware is cheaper than POWER7, and the SPARC T5 is also cheaper in license costs.
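The per-chip and per-core arithmetic above can be written out explicitly, using only the figures quoted in this thread (2.4x per chip on TPC-C, 16 cores vs 8 — illustrative back-of-envelope numbers, not an official benchmark extract):

```python
# Figures quoted in this thread: T5 is ~2.4x POWER7 per chip on TPC-C,
# with 16 cores on the T5 vs 8 cores on the POWER7.
t5_cores, p7_cores = 16, 8
t5_per_chip = 2.4  # T5 chip throughput relative to one POWER7 chip

# Per-core throughput ratio: 2.4x per chip with 2x the cores -> 1.2x per core
per_core_ratio = t5_per_chip * p7_cores / t5_cores
print(f"T5 per-core throughput vs POWER7: {per_core_ratio:.2f}x")  # 1.20x

# Per-core licensing view: POWER7 cores needed to equal one 16-core T5 chip
p7_cores_needed = t5_cores * per_core_ratio
print(f"POWER7 cores to match one T5 chip: {p7_cores_needed:.1f}")  # 19.2
```

Under these numbers, matching one T5 chip needs about 19 POWER7 cores, which is the licensing argument in a nutshell: with per-core licensing, fewer and faster cores on the machine you actually buy means a smaller license bill for the same throughput.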
What is best: a cpu with 4 strong cores, or a cpu with 64 cores, each slightly slower than the cores of the quad core cpu? For parallel workloads such as a database serving many concurrent queries, the 64-core cpu will be faster, because it has more cores.
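How much the extra cores help depends on how parallel the workload is; a rough Amdahl's-law sketch makes the trade-off concrete. The per-core speeds and parallel fractions below are illustrative assumptions, not measured figures:

```python
def speedup(parallel_fraction, cores, per_core_speed=1.0):
    """Amdahl's law, scaled by per-core speed: throughput relative
    to a single baseline core running the whole job."""
    serial = 1.0 - parallel_fraction
    return per_core_speed / (serial + parallel_fraction / cores)

# 4 strong cores vs 64 cores that are each 30% slower (assumed numbers)
for p in (0.50, 0.95, 0.99):
    strong = speedup(p, 4, per_core_speed=1.0)
    many = speedup(p, 64, per_core_speed=0.7)
    print(f"parallel={p:.2f}: 4 strong -> {strong:.1f}x, 64 slower -> {many:.1f}x")
```

At a 50% parallel fraction the 4 strong cores still win, but at 95% or 99% — typical of a database handling many independent queries — the 64 slower cores pull far ahead, which is the point being made above.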
Re: not a mention on instruction set
"...After a bit of research there seem to be a big step forward: IBM added hardware support for transactional memory! ..."
IBM is the last in line. The SPARC T5 was the first commodity cpu with transactional memory. The old Sun ROCK cpu also had transactional memory, but it never made it to delivery; that ROCK research has found its way into the current SPARC cpus. Intel Haswell has transactional memory too. But it is nice that IBM tries to catch up with the other cpus.
Re: sales drones on here aside
"...Wow somebody doesn't understand the market worth a damn. The total market for proprietary UNIX boxes is less today than SUNs SPARC sales in the late dot com era by a significant amount...."
I agree that Unix is diminishing. But I distinguish between Unix servers (where Linux is getting in, though only at the low end, because no high end Linux servers are for sale) and database servers. Oracle is starting to create extremely fast database servers that happen to run Solaris and SPARC. And all Oracle enterprise customers running huge Oracle databases will be very interested in those fast database servers. They might not be interested in SPARC as such, but they are interested in the database servers. So if even a tiny fraction of all database customers wants better database servers, they must switch to the Oracle database servers, which happen to use SPARC and Solaris.
Re: 256 socket Xeon
"....Yes you're talking about it, but no I'm afraid you don't know the difference between SMP and NUMA. Lets drill a bit deeper into your example, the M9000....Actually, we can start with the diagram on page 22 of the M5000, and the following sentence that says: "SPARC Enterprise M8000 and M9000 servers feature multiple system boards that connect to a common crossbar."
If you have a design where sockets on a system board only have access to limited local memory, and must traverse an interconnect, like a crossbar, to access memory on another system board, then that is a NUMA, or NUMA derived design. It's most certainly not SMP. An SMP design is where all CPUs have equal access to all memory. The problem with that is it doesn't scale well, hence the reason why NUMA was invented...."
Yes, I do know all this. I was the one who brought up NUMA and SMP, wasn't I? You seem to claim that no 32-socket SMP servers exist at all. If that is true, then maybe you accept that no 32-cpu Linux SMP servers exist either. So again I am correct: there are no 32-cpu Linux SMP servers.
The M9000 is not a true SMP, I know. But Sun worked hard to make it act like one. This shows up as memory latency that is quite bad on the M9000, but not catastrophically bad: the latency is quite tight, with a small spread between best case and worst case. A true SMP server would have no spread at all, no difference between best case and worst case latency. So, in effect, the M9000 behaves as an SMP server.
Compare this to a true NUMA system, such as the 8192-core ScaleMP Linux server with 64TB RAM. That server is a cluster running a single Linux image, and like all clusters it has a very wide spread between best case and worst case latency:
"...I tried running a nicely parallel shared memory workload (75% efficiency on 24 cores in a 4 socket opteron box) on a 64 core ScaleMP box with 8 2-socket boards linked by infiniband. Result: horrible. It might look like a shared memory, but access to off-board bits has huge latency..."
So it does not really matter if a server is a mix of NUMA and SMP, as long as the latency is good (because the server is well designed). If a NUMA server had extremely good latency, it would for all intents and purposes act as an SMP server, and could be used for SMP workloads.
-The Sun M9000 has 500ns worst case latency, and best case... maybe(?) 200ns or so. The M9000 did 2-3 hops in the worst case, which is not that bad: you don't have to treat it as a problem when programming. In effect, it behaves as an SMP server.
-A typical Linux NUMA cluster has a worst case of something like 10,000ns or even worse. The worst case numbers were really hilarious and made you jump in your chair (was it even 70,000ns? I don't remember, but it was really bad, and the worst case numbers were representative of a typical cluster). In effect you can not program a NUMA cluster as if it were SMP; you need to program differently. If you assume data will be quickly accessed, and the data is far off in a Linux cluster, your program will grind to a halt. You need to allocate data to close nodes, just like in cluster programming. And if you look at the use cases and all the benchmarks on the Linux NUMA servers, they are all cluster HPC workloads. Not one is used for SMP work.
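A toy model shows why the size of the latency spread decides whether you can program a box as SMP. It uses the round numbers from this thread (200ns local, 500ns worst case on the M9000, ~10,000ns worst case on an InfiniBand-linked cluster) and an assumed 20% remote-access rate — all illustrative, not measurements:

```python
def avg_access_ns(local_ns, remote_ns, remote_fraction):
    """Expected memory access time when a fraction of accesses miss
    the local board and pay the worst-case remote latency."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns

# Assume naive SMP-style code sends 20% of accesses off-board
m9000 = avg_access_ns(200, 500, 0.20)       # tight spread
cluster = avg_access_ns(200, 10_000, 0.20)  # cluster-sized spread
print(f"M9000-like: {m9000:.0f}ns, cluster-like: {cluster:.0f}ns")
```

With the tight M9000-style spread, ignoring data placement costs around 30% extra (260ns vs 200ns), which you can live with. With cluster-scale worst cases, the same naive code pays roughly 10x (over 2,000ns average), so you are forced to place data near the node that uses it, i.e. to program it as a cluster.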
This Oracle M6 server is an island of SMP servers connected with a NUMA interconnect. I am convinced Oracle is building on decades of experience from the Sun server people, so the M6 will have a very small difference between best and worst case latency. It will act like an SMP server, because databases are typical SMP workloads, and Oracle cares strongly about database servers. The Oracle M6 will be heavily optimized so that you never make more than 2-3 hops to reach any memory cell in the entire 96TB RAM server - it will act like an SMP server, fine for databases and other SMP workloads.
I suggest you study the RAM latency numbers for the M9000 and for the Linux NUMA clusters. The differences are huge: 500ns worst case versus 10,000ns (or was it 20,000ns?). One can be programmed like an SMP server; the other needs to be programmed as a cluster.
So, you are wrong again.