Intel is pumping up its virility through proxies like Michael Dell reminding us of an 80-core chip future. It's impressive, but Intel is a company obsessed to distraction with Moore's Law. It's like watching a crack addict do anything to get the next hit, a doubling of processor performance every 18 months, whatever it takes, in …
"Unless I'm missing something" - You are.
One word: Servers.
A quick average of some of my smaller production servers here gives ranges of between 1K and 10K threads.
Not running any HPC apps here, just some databases, ESBs, webservers... nothing really unusual. Have a look at the tuning options for Apache someday.
Per thread execution time is still a problem that doesn't get addressed by just adding more cores, but number of concurrent threads runnable is very important for even fairly basic serverside tasks (proper tuning may be required).
Looks like you have seen that Oracle isn't desperately good at parallel processing - which is semi-true, RDBMSs are hard things to scale across large numbers of cores - but they are getting better and better all the time. Besides, they are mostly IO bound anyway.
I realise this doesn't help you play games any faster - but this is an industry that has only just started to wake up to the fact that they are going to have to learn how to write good multithreaded code. Some companies are learning, some are still resistant to the idea. By the time this chip is a reality, games are likely to be much more multithreaded than they are now.
My machine at the moment is running 681 threads. Given this number I am quite sure the OS (or actually its core component, the scheduler) could find good use for 80 cores, even given that 99% of these threads do nothing most of the time. Of course, this won't improve performance, for the very simple reason that the whole thing will choke the memory bus. Still, if I wanted to write a massively parallel application, I would use Intel Threading Building Blocks and look at parallelisation opportunities. It's not really that difficult, given the right tools - and let's not forget that these tools are actually available for C++, which happens to be the dominant systems programming language.
Anyway, the whole thing is just a dream unless we find a way to get the memory bottleneck out of the way. Not likely in the next 5 years, I'd say, given how little progress has been made in reducing memory latency. Oh, and 10 VMs per core (5 per thread)? Not in my dreams.
To sum it up: the OS scheduler will find good use for any number of cores, whether or not a single app needs them, but before vendors (Intel or any other) put many-core CPUs on the market, they must solve the memory bus problem. A built-in memory controller is just a small first step.
Any kind of Monte Carlo simulation workload can be parallelised. This is an evolutionary programming approach. Many design optimisations and game strategies can benefit from this kind of algorithm. A term used in this connection worth looking up is "simulated annealing". E.g. to lay out the tracks on a PCB with thousands of interconnects and to place the PCB components in the optimal locations, you want to be able to simulate the performance of the product for very many possible layouts and then focus on the better result regions where further, narrower optimisations can be achieved. Think about the value of adding 2p to the value of something you make by the million and shaving 2p off the cost.
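To make the "Monte Carlo parallelises" point concrete, here is a minimal Python sketch. It is not a PCB optimiser: it estimates pi by random sampling, purely because that is the smallest Monte Carlo job I can think of. The function names, the seeds and the batch sizes are all made up for illustration. The point it shows is that each batch of samples is independent, so the batches can be handed to separate workers and the results simply added up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, samples):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # private generator, so workers share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(workers=4, samples_per_worker=20_000):
    """Split the sampling into independent batches and combine the counts."""
    seeds = range(workers)  # one fixed seed per batch keeps the run repeatable
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = pool.map(count_hits, seeds, [samples_per_worker] * workers)
        total_hits = sum(hits)
    return 4.0 * total_hits / (workers * samples_per_worker)


if __name__ == "__main__":
    print(estimate_pi())
```

One caveat: in CPython, threads will not actually run this CPU-bound loop on several cores at once because of the global interpreter lock. Swapping ThreadPoolExecutor for ProcessPoolExecutor (same interface) is the usual way round that; the structure of the code stays the same.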
In my opinion you missed the point.
A hypervisor can do no more to help a given application (by today's definition) than an OS can. Both provide services to distribute the app across multiple hardware threads.
A hypervisor helps when there are multiple apps (as does the OS to a lesser extent), and multiple OSes involved.
"Intel can have any number of Threaded Building Block development efforts to add parallel programming to C++ applications as it likes, but they're not going to boost the speed of the weekly sales order processing run."
This is exactly the sort of activity that will drive improvements in application performance, including the one doing your weekly sales order processing run.
Hypervisors are not magic. The industry has to suck it up and learn new skills to build traditional and new apps on many cores, or it will stagnate. This hardware will soon be everywhere, not just in niche areas like gaming and data centres.
Of course databases like Oracle (or even MySQL) can benefit from a large number of cores, as they are highly multi-threaded applications. Java web applications can also benefit, being highly threaded in the application server, as I am sure .NET application containers are as well.
As to multi-cores versus multi-processors, it's much more efficient to have many cores on a single die than to have to interconnect discrete processors. As well as increased performance, as long as the cores aren't sharing things like the cache in an ineffective manner, it also means lower power consumption.
Of course on boxes with a huge number of cores you're also going to want to run virtualised operating systems under a hypervisor. Having so much scope to juggle resources between OS images on a single machine is great, and lowers the need to migrate images from one machine to another, as is common when running something like VMware on a stack of cheap dual processor, dual core boxes.
Oracle, which you mentioned, was running on 64-way SPARC machines (E10K) almost a decade ago, and yes it scaled based on its workload.
I'm not sure what role a hypervisor would play here, just about any modern OS would use all of those cores; dispatching the single (and multi) threaded apps as required.
Well I'll call you on the Oracle one - we have customers who paid a fortune for 8 core Sun servers to run their Oracle databases, and would happily have more if they were cheaper.
That uses parallelism in a few ways - firstly, individual queries / tasks can be split across cores, and secondly, you've got options to introduce programmatic parallelism by using parallel query, or pipelining. Which of course does mean code change, so no benefit on legacy systems, particularly any that run their batches as single threaded processes.
With desktop apps, what strikes me is that a lot of applications (particularly Cocoa ones on OS X or similar MVC frameworks) already use a model where applications have to deal with asynchronous events - i.e. you try to avoid blocking waits in code, but instead use a message and callback mechanism.
Asynchronicity is the main issue to overcome in making code parallel, and I see no reason why you can't then let the runtime manage the distribution of objects to cores (I'm not underestimating the complexity of that - you'd need to dynamically load balance the cost of processing against the cost of messaging, but I would also put a bigger bet on the framework developers doing a better job of that than developers).
And again, of course, there is the issue that a lot of code out there isn't modern, and does work in a very synchronous way.
Lovely threads dude
>Unless I'm missing something the vast bulk of existing applications, the stuff we want to run more quickly, are single or single-digit threaded applications.
Sounds like you are missing something.....
It's more about RAM and disk, but more threads means all the Oracle processes don't get scheduled off proc (and allows multiple dispatchers rather than shared server).
Not to mention the old Oracle licensing fun and games: more threads and fewer cores means a cheaper licence.
Again, for persistent http/https a hardware thread per software thread keeps the process on proc and makes it very quick.
But you're right, an application has to be written to take advantage of it, and there's not a whole lot out there on the home market. It's chicken and egg. I remember playing Doom on my old dual processor BP6/Celeron machine, and it used both procs. Now that multi-proc/core/thread is here for the home market, the software will be written for it (build it and they will come).
It is not the software, it is the IT industry
The IT industry is infected by MicroClap. 99.9% of the software administrators out there would not consider even for a second the possibility of running two unrelated applications on one server. The average OS's ineptitude in allocating scheduling resources in a defined and predictable manner to more than one app does not help here either. It is the cluelessness of corporate IT and the feebleness of the OS schedulers that drive the hypervisor and virtualisation uptake, not the software industry as such.
You don't need a hypervisor
No need for a hypervisor in a data centre - we've been running on multi-processor/multicore machines for years and are used to shoving large numbers of jobs through at the same time. No need for VMs either - even if they've also been around for donkey's years. Just need a decent OS (NOT WINDOWS), a decent scheduler and a workload manager to stop resource-hungry apps from stealing too much and blocking the rest of the machine up.
The big benefit of virtualisation in a data centre is to insulate apps from machine failures/unavailability, and facilitate maintenance and upgrades. Particularly when you're running a lot of small, relatively unreliable (compared to mainframes or top end unix boxes) machines.
OS X Grand Central
Is this the problem Grand Central sets out to solve?
Lots of single-threaded applications, all at once? Almost like an operating system...
I think you may have described an operating system. Allowing lots of processes to share processing resources efficiently? Wow, not! That is hardly revolutionary, or new, or even thoughtful. Opinion? Not even close.
Solaris and Linux already allow a variety of ways to slice a given machine (from virtual memory/processes through to complete virtualisation). BSD has a few options (like jails, and more I'm sure), and Windows even has more than one (in the works).
An opinion piece on this might be a case for whether virtualisation interfaces are best defined by hardware or software vendors (although I believe some independent standard is best of all, and some work has been done there already).
Virtualization all the way...
I agree entirely. For most people even the current quad core offerings are massively over specced. There are many projects around that don't even take a single core of CPU during their day-to-day operations. Add to this the problems with per-core licensing and multi-core servers become a needless and costly waste, unless you consider virtualization.
VMs don't solve the multi-threading problem...
Oh dear - you don't resolve a multi-threading application problem by running lots of individual copies in their own VMs. You still have the same parallelisation problem - just that you now have to load balance over multiple operating systems. Of course there is still a use for techniques for consolidating lots of individual servers onto one box using VMs, but that should not be mistaken for getting more throughput through a single application. There are ways of exploiting multiple OS images for some types of applications - network load balancing web servers for instance - but generating lots of VMs just to exploit more cores is a wasteful exercise. All those VMs need operating systems, memory and CPU overheads, IP addresses, configurations. It's called VM sprawl. Load balancing over multiple physical servers for resilience reasons is fairly sensible. Load balancing over multiple VMs on a single system when there are better ways is just wasteful.
It's much better natively to use environments which can use multiple threads within a single (or more limited) number of OS images. Any decent commercial database can do this, as will J2EE and any number of other run time environments. For most commercial, multi-user systems you often get sufficient parallelisation through supporting multiple users and hence transactions (a single transaction using multiple threads is rare and not usually necessary).
On the desktop the last thing I want is a dozen VMs. About the only advantage I can think of is limiting the scope of any virus infection (although I now have to manage multiple VMs). Even if I do have multiple VMs there is only so much multi-tasking I can manage in my head. I just can't directly use that many very explicitly separate work threads. Admittedly most desktop applications do not exploit or need multiple threads. For the most part these don't have highly demanding CPU requirements. However, there are some that do, especially in the area of multi-media handling. Video, audio and photographic processing are examples of application areas where modern software does exploit multiple threads. These are, of course, specialist areas but for that very reason there are algorithms and software libraries that do exploit threading.
I'm rather less enamoured of hardware threads though - these are virtual CPUs and, whilst they can increase throughput, you can pay a very large penalty in damaging single thread speed once contention does set in. At the very least it confuses basic system performance numbers through non-linearity of reported CPU usage when compared with core utilisation.
Threads aren't that hard
My first point is that the software industry finds writing multithreaded applications hard largely because engineers don't go seeking out the tools for the job. In the real time operating system world that I've lived in for the past decade or so threads are almost unavoidable for anything but the simplest systems. Thus the development tools are rather better for the job than those found in the *nix and Windows worlds.
A good example is Wind River's (I don't work for them) development suites for VxWorks, specifically the version hosted on Solaris. What this gives you is a mechanism for debugging individual threads without stopping the rest of the app, more than one debug window (so you can debug threads side by side), and something called WindView that is frankly awesome. WindView gives you the ability to find out what thread has run when, why it ran, what it was doing when it ran, what thread has run next and why, where the OS has stolen some execution time, where the interrupts are, etc., all wrapped up in a lovely GUI that is best described as a logic analyser for software, and without introducing a runtime cost.
Mercury's (I don't work for them either) development tools are very good too. TATL is the equivalent to WindView and is truly excellent for their specialist multi CPU system.
For piccies take a look at these links:
My point is that whilst tools like WindView and TATL are designed for fairly specialised hardware and software systems, all these multi core CPUs that we're finding on our desktops are resembling those specialist systems more and more, so it's about time the mainstream development tool creators got with it and looked at what the rest of the world has been doing for 10+ years. Sun's DTrace may be getting there, slowly.
My second point is that no one is taught multithreaded software design anymore. I was, and it's not hard. When I ask round the lads in the office what a pipe is I may get some glimmer of a response from the unix guys muttering dark things about command lines, but that's generally it. Semaphores elicit an even sparser response. No one seems to think of using pipes within applications these days, yet apps built using threads communicating through pipes (as opposed to global variables) and select(), with the odd semaphore thrown in where strictly necessary, are easy to design and not hard to implement given the right tools. Anyone remember Communicating Sequential Processes? Also (and this is a very big also these days), pipes can easily become TCP/IP sockets, which means that converting your single machine app into a multi machine distributed app is straightforward. Threaded building blocks and clever compiler switches are useful in that they take some of the donkeywork out of parallelising standard stuff, but they're no substitute for some coding skill.
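For anyone who hasn't seen the style, here is a minimal Python sketch of threads that talk only through channels and share no globals. As an assumption for the sake of a short example, it uses the standard library's queue.Queue as a stand-in for a pipe (so no select() here), and the stage names and the DONE sentinel are invented for illustration.

```python
import queue
import threading

DONE = object()  # sentinel that tells the next stage the stream has ended


def producer(out_q, items):
    """First stage: feed the raw items into the pipeline."""
    for item in items:
        out_q.put(item)
    out_q.put(DONE)


def squarer(in_q, out_q):
    """Second stage: read from one channel, write results to the next."""
    while True:
        item = in_q.get()
        if item is DONE:
            out_q.put(DONE)  # pass the end-of-stream marker downstream
            return
        out_q.put(item * item)


def run_pipeline(items):
    """Wire the stages together and collect the output in the calling thread."""
    raw, squared = queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=producer, args=(raw, items)),
        threading.Thread(target=squarer, args=(raw, squared)),
    ]
    for stage in stages:
        stage.start()
    results = []
    while True:
        item = squared.get()
        if item is DONE:
            break
        results.append(item)
    for stage in stages:
        stage.join()
    return results
```

Each stage only ever sees its input and output channel, which is what makes the design easy to reason about. It is also why replacing a channel with a socket later is mostly a matter of changing the plumbing rather than the stages.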
My third point is that I don't think it necessary to have a hypervisor to make use of an 80 core CPU. An OS should be able to divvy up apps and the threads they create across cores without having to virtualise anything. They already do, so why can't they in the future? Granted, virtual machines and hypervisors can do some clever things, but mostly they're just allowing app developers to be lazy in not having to build clustering, failover, security, etc. into the app in the first place, leading to what seems to my parsimonious eyes a whole lot of unnecessarily used clock cycles.
Paris, in the fervent hope that she can do more than one thing at a time.
9 ps3 cores?
You may want to check your data there. The PS3 specifically uses 7 out of 8 cores (one being redundant to achieve better yield), of which only 6 are accessible to programmers.
Pardon me if I'm being dense...
...but why do you need a hypervisor to run lots of single-threaded apps in their own VM? Decent operating systems have been doing that for years with less overhead than running an OS under a hypervisor.
The last mainframe OS I was an internalist on supported up to 64 instruction processors, and that was some years ago.
re: The cars on the data centre motorway
The threads ("cars") will still have to wait for other parts of the system, such as hard disk for data storage - so go beyond 4 cores and most of them will be sitting idle for most of the time.
"but Intel is a company obsessed to distraction with Moore's Law. It's like watching a crack addict do anything to get the next hit, a doubling of processor performance every 18 months, whatever it takes, in Intel's case."
From my uni days I remember Moore's Law being the doubling of transistors in a processor rather than the doubling of performance. Performance largely follows, but is not the driving factor. Or am I wrong?
Build it and they will come.
Your argument is akin to going back to 1985 and asking why we should have colour monitors. Every application has been written to run in monochrome, so what's the point of having colour?
Software that makes use of multi-processor machines will become more common as soon as multi-processor desktop machines become more common.
640K ought to be enough for anybody
Moore's Law for the pedant
"Moore's Law" was an observation that the number of transistors roughly doubled every 2 years. This is not performance per se.
Do not pass go, do not collect £200, the bank has lost it playing the stock market.
So what's more likely;
A company with a multi-million dollar research budget hasn't spotted the "obvious flaw" that someone with the "brain of a six year old" could spot?
or that they haven't announced the solution to the press?
or that the chip simply isn't designed for that purpose you're judging it on?
That, and other presumptions (that individual cores won't increase in performance, and that software won't be optimised in the meantime), makes me wonder: given most people realise multi cores aren't an end-all solution, what is your point, caller?
Do apps need to be multithreaded?
Rather than running a single app across all cores/threads, it could be used to run multiple apps. A single-server Citrix farm springs to mind as a classic example. Each user gets their own core (or a higher percentage of a core) in order to reduce latency. Since the apps are independent, there's less of an issue with concurrency. Close to this might be hypervisors which don't split the processing of a single application within a VM across multiple cores, but do load-balance VM instances across cores. This reduces the number of boxes, their redundant power supplies and therefore power consumed.
Equally, you could partition your (eg) billing database. Today we might process A-M on one host and N-Z on a second host so each host only deals with a subset of the total processing. With more cores, you might run multiple MySQL instances, each of which looks after a subset of the total dataset. Of course you might run into various I/O bandwidth problems but that's a separate issue.
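The A-M / N-Z split is easy to sketch. The Python below is only an illustration of the routing idea, under the assumption that you partition on the first letter of a customer name; shard_for and partition are made-up names, and how each shard maps onto a real database instance is left out entirely.

```python
def shard_for(name, shards=2):
    """Map a customer name to a shard index by its first letter (A-Z split evenly)."""
    first = name.strip().upper()[:1]
    if not ("A" <= first <= "Z"):
        # Assumption: anything that doesn't start with a letter goes to the last shard.
        return shards - 1
    return (ord(first) - ord("A")) * shards // 26


def partition(names, shards=2):
    """Group names into one bucket per shard, ready to hand to separate instances."""
    buckets = [[] for _ in range(shards)]
    for name in names:
        buckets[shard_for(name, shards)].append(name)
    return buckets
```

With two shards this gives exactly A-M and N-Z. Obviously a first-letter scheme can never fill more than 26 shards, and real names are not spread evenly over the alphabet, so anyone doing this for 80 cores would probably partition on a hash of the key instead.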
You could also run more processor-inefficient applications. A web-farm may want 80 Java server processes running simultaneously, chomping through the XML to HTML conversions for web pages. You may want to simulate foreign architectures, eg SPARC on x86, to consolidate critical, but perhaps less CPU-intensive functions.
None of this gives you anything you can't do today utilising more rackspace, but if you can shrink your datacentre down to a rack, you've probably saved yourself quite a bit of money.
It is not for the desktop
I agree that you can't use 80 cores on a typical desktop, however a server that is dishing up web pages, DNS results, ... might be able to use them. The big problem is going to be memory bandwidth.
The Cloud and Agents
The network is the computer - A famous company once said that, in the days when one computer had one core and the network provided the resources. However, when everything is in the "cloud" - the computer is the network, and the ability to run 80 VMs on a machine would be very handy. How very green.
Then there is the agent paradigm, intelligent software that is social, proactive and goal orientated. Mini-AI applications that seek out the functionality they need to complete a bigger goal. An agent community is a collection of functions as unique threads. That's 80 full time agents on one core. So the challenge then falls in programming threads as opposed to libraries.
The future is mega-multi-cored. The transition will always be slow, but once the ball starts rolling... Intel's compilers are a good start. The trickle down effect will start, and turn into a torrent.
But the question is where do we go from here?
Our ordinary app
You comment that ordinary apps (DBs) don’t need lots of cores. I can speak from my experience and say you’re wrong.
Our ordinary app has 15 background tasks (each in its own thread).
Every client has its own thread etc etc
Hence a 100-user system requires about 300 threads, meaning threads have to share a CPU.
Multicores is the way to go, even for ordinary apps!
I am also an indie game writer. Multicores are harder to program for in this case, but then again writing games is about the hardest thing you can do!
How many single process computers out there?
So really, how many computers do you know of that run a single process these days? Even my router runs Linux and has several daemon processes. My HTPC is running at least 40 processes.
Of course more cores will help here, you don't have to have multi-threaded apps to see a benefit.
There's more to cores
Actually, I could well imagine multithreaded versions of database and spreadsheet apps. Just because we are not producing them now does not mean it is not possible. Many, if not most, very large databases run on distributed memory systems, using MPI or something similar to take care of interprocess communication. These would simply fly on a multi-core system (or any other shared memory system) because communication is orders of magnitude faster.
Even without explicit multithreading, multi-core chips can be used with fairly simple programming paradigms such as those embodied in OpenMP, in which you can often just add compiler pragmas to gain huge speed increases. Apart from classical HPC apps, think of multimedia processing, which can often be parallelized quite trivially (and almost linearly) using OpenMP. The main worry is the memory bandwidth, which Intel has now finally addressed with its Nehalem architecture. However, memory bandwidth might not scale easily to 80 cores.
Having said that, for many apps, the current processors are fine, and virtualization is certainly a way to use multiple cores with existing apps.
...is to implement some decent workload management technology. There are products out there that will manage the allocation of system resources to different apps in real time without the multiple abstraction layers and multi-OS overheads inherent in machine virtualization products. So you'd run your base OS (whatever that happens to be), and install, for example, Oracle server, other chosen apps, and the workload management technology. Multiple DBs will happily live in multiple Oracle instances, each running as a different executable and allocated a different portion of the available resources. Replicate across multiple hosts, store the DBs on a SAN, virtualize the app itself into a sandbox if you are concerned about contamination of the OS, and you've got the same levels of availability and consolidated system performance for a fraction of the price of a machine virtualization solution.
"The software industry already has problems writing and compiling multi-threaded software so that the threads can be spread across the cores and execute in parallel. The more thread bandwidth there is in a chip the harder the job gets."
Well, there's a reasonably well-known maxim here about how Amdahl's Law is kind of irrelevant. You don't make the same workload go faster. You make a bigger workload go the same speed. This applies perfectly to games and simulations. So I dispute that more threads really make things harder. This is received wisdom but in practice it's kind of wrong for the kinds of applications that actually suit multicore systems.
OTOH, for applications that don't suit multicore systems, the job doesn't get harder, it gets impossible.
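The maxim being referred to sounds like what is usually called Gustafson's law. A minimal Python sketch of the two formulas side by side follows; the 95% parallel fraction used in the note after the code is just an assumed figure for illustration, not a measurement of anything.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Fixed-size workload: the serial part caps how much faster it can go."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def gustafson_speedup(parallel_fraction, cores):
    """Workload grows with the machine: how much more work fits in the same time."""
    serial = 1.0 - parallel_fraction
    return serial + parallel_fraction * cores


if __name__ == "__main__":
    for cores in (4, 16, 80):
        print(cores, round(amdahl_speedup(0.95, cores), 2),
              round(gustafson_speedup(0.95, cores), 2))
```

With 95% of the work parallel and 80 cores, the fixed workload only goes about 16 times faster, while the scaled workload gets through about 76 times as much work in the same time. Note that in the second formula the fraction is measured on the parallel run, so the two numbers are answers to different questions rather than a contradiction.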
"Sony's PS3 has a Cell processor running 9 cores, one a controller core, the other 8 replicated graphics cores which do all the render work stuff, and very well too."
Actually, the Cell processor used in PS3 is only specced at 8 cores and one of those is used for "security" leaving 7 for the application. The 6 available SPUs are not typically used for rendering. PS3 has a traditional GPU for that (from the NV4x family). SPUs can be used for preparing data for the GPU, or at a pinch for doing image processing, but are equally likely to be running physics workloads or even game scripting code.
"radically better data centre application bandwidth"
Using a multicore system to run massive numbers of VMs is a clear non-starter. Everything bottlenecks at the serial devices. You only have a certain amount of disk bandwidth, memory bandwidth, disk IOPs and memory IOPs. Sharing this across 80 VMs is not going to give you 80 full-speed VMs, or anything like. Unless your VMs are all running compute-heavy tasks which require little I/O you're screwed. And if your VMs *are* all running compute-heavy tasks which require little I/O, you're probably in the arena of games and simulations. GOTO 10.
What it does allow - and this goes against the grain somewhat - is for languages to be very very inefficient. If a language now takes a thousand cycles to add two numbers (I'm looking at you, LISP) then it's fine because you're going to be waiting about 10,000 cycles for your last memory transaction to complete. You may as well interpret ASCII source code directly. Parallel graph rewriting languages become viable. It's a good day for computer science.
This same factor allows emulation of other hardware. At the bit level, if you like. These systems will be awesome for Verilog. It's a good day for computer designers.
So Intel's products might just presage a revolution in how software is written and in how hardware is designed.
But I can certainly see why your typical data centre guy is thinking "why bother?"
You are correct, for now.
Firstly, the current software industry needs the proper tools. There are a bunch out there already and more are coming into play and improving every year, but for now most programmers are still thinking top-down and OOP.
Speaking as a coder trying to break into parallel and multithreaded design, it is fucking hard to switch. You find yourself going into old habits and trying to cut corners because that's what you know. It's like trying to learn how to ride two bikes at the same time when your only experience is racing F1. The change needs to start from the beginning. Don't teach BASIC to the kids anymore, teach parallel C.
Once the tools are in place and a new generation of coders comes along who were nurtured with those tools and styles from the get-go, you will see the change. Don't hold your breath though, give it at least another 15-20 years for it to be ubiquitous.
I don't think Intel and the like should stop increasing cores. Absolutely not. Because once the software industry finally catches up, the structure will be in place and all of a sudden everything will be running millions of times faster.
A web server will create a process (or thread) for every concurrent connection from a client - so will a database, file server, mail server, directory server.... etc
So servers will benefit from lots of cores for all those concurrent connections.
Hmm, maybe my brain core is underclocked but I thought that there are a lot of database-type operations which respond very well to parallelisation (with appropriate lock and cache management). Didn't there used to be Oracle Parallel Server? Presume it has now morphed to Oracle 9i10g11xyzpqr :-)
I agree with most of what was said. While a good number of regular apps (including things like Oracle, SQL etc) benefit from multiple cores, the benefits start leveling off after a given number (say, four, but I could be wrong). So, all told, the only benefits are going to be virtualisation at the moment. This will reduce the number of hosts considerably, but I see some interesting things coming out of it:
- Firstly, hypervisor makers will have to support this many cores. At the moment VMware, for example, scales to (if I remember right) 32.
- Does an average company really want 80+ VMs per host? It smacks of putting even more eggs in even fewer baskets - something that scares some places with just today's technology. The new servers will have to be God's gift to resilience as well as king of cores.
- I can see software licensing cost models being questioned again - will a product be licensed per virtual CPU?
- Great, you have so many cores, lots of processing power in one server. You then move the bottleneck out to IO, as all of a sudden your 80 core host has to provide networking services for so much more than its previous incarnations. So it'll be 10GbE and/or 8Gb FC across the board. Nice and cheap then.
Straying beyond virtualisation means some serious redevelopment of apps and operating systems, including a new idea of how to use so many cores effectively.
Paris - 'core' blimey.
New paradigm: trad4
My logic is we need a new programming language and paradigm that does scale on multiple cores. Like this: trad4.sourceforge.net.
Disclaimer: I wrote it, it's in beta but the 2.0 release is scheduled for the end of the year.
Firstly, the reason you want more cores is because no-one uses just one application at once any more, if they ever did. The spreadsheet example you used may use just one core (more likely more) but at the same time, the OS uses a couple, your iTunes playing in the background likes cores, as do your invisible-but-still-running widgets, indexing system, IM client, twitter interface, mail app, browser ...
And another thing, what's wrong with a constant hunt for performance? You mention Moore's Law as if it's a bad thing.
More than one OS?
It's perfect for Solaris zones.
Maybe not Oracle...
but the Informix DBMS is quite capable of running threads in parallel across multiple cores. Sequent may be gone, but their Silver Bullet architecture runs ever on.
A lot of crap spoken about threads
It doesn't matter how many threads you have on your system. Most of them are sleeping. It's number of active threads that matters.
While the Cell processor in the PS3 has one core disabled, and another used for security, full-price Cell processors used in servers have all eight cores available. Also, while a database server handling multiple queries at once is concerned with throughput, and so can effectively use multicore chips, plenty of other applications on desktop computers do face the problem of not being effectively parallelizable.
But until they can make some new kind of chip with transistors that run faster than silicon CMOS, and put enough transistors on it so that the speedup provides a benefit, there is only one other way that the benefits of continuing improvements in chips can be made available.
The price of a single core chip could come down, leading to personal computers packaged and sold like pocket calculators.
A 50 GHz 8-bit microprocessor sporting a parallel 8-bit adder won't outperform a 3 GHz 64-bit microprocessor with advanced arithmetic hardware, and it seems that many of the exotic materials aren't at the point of even making an 8-bit microprocessor out of them.
Concurrency has a lot to do with responsiveness
A lot don't seem to grasp that.
Sure, you can make linear programs run faster if you split stuff up and redesign, but on the whole that is not where you want multiple cores, you want faster cores for that.
But there are not too many places where number crunching is required; instead, what you want is to be able to use the device while number crunching is happening. So a core for each application is helpful, just so that the application responds immediately even when one of the other cores is under load.
"What applications are capable of executing across 80 cores in parallel, with, say, two threads per core, meaning 160 parallel threads?"
How about apache2 with the thread worker MPM?
I'm glad someone mentioned servers
A server not a million miles from here (looks at IRC window) has a load average stretching into the hundreds and growing. Admittedly, this is a test to see what breaks, but it's based on what happens in real life otherwise we wouldn't be trying it.
The server software my previous employer sells also typically runs at a fairly high load average.
What's load average? It's the number of runnable tasks -- not all of them can get a bite of the limited numbers of cores at the same time.
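To put a number on that, here is a minimal Python sketch that compares the 1-minute load average with the number of cores. The helper names (tasks_per_core, current_tasks_per_core) are made up for illustration, os.getloadavg() is Unix-only, and on Linux the figure also counts tasks stuck in uninterruptible sleep, so treat it as a rough indicator rather than an exact count of runnable tasks.

```python
import os


def tasks_per_core(load, cores):
    """Average number of queued-or-running tasks per core for a given load."""
    return load / max(cores, 1)


def current_tasks_per_core():
    """Read the 1-minute load average of this machine (Unix-only call)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return tasks_per_core(one_min, os.cpu_count() or 1)


if __name__ == "__main__":
    # A value well above 1.0 means tasks are waiting for a free core.
    print(f"tasks per core right now: {current_tasks_per_core():.2f}")
```

By this measure a load average of 300 on a dual-core box leaves about 150 tasks competing for each core, while the same load on 80 cores leaves fewer than four.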
Getting lots of cores will help, but servers will still be kicking their heels unless you can get I/O to match the processing power. But all this talk of "we'll have to change programming to make use of all those cores" is rubbish and typical of someone who sits in front of a Windows box. It's as bad as those silly adverts that claim having a dual core processor will allow you to do two things at the same time, as if operating systems haven't been scheduling multiple tasks for years anyway.
Does no-one who read this article know what a core is? Did no-one get the fact that the author was saying that massively-multi-core systems _are_ much better for a server environment than for the desktop?
Go on: someone explain to me how, on a dual-core machine, all of your background threads are going to max out your second core when your first one is trying to keep up with you typing into Word?
The killer app
I, for one, am waiting for the day I'll be able to TALK to my computer. This will be the real killer application. Just imagine a few commands:
Simple : - Call <customer name> tomorrow, about 9AM, and pass me the call.
Powerful: - Find the last report on <some subject> with <some conditions>
Useful: - Answer that article in The Register with .....
Very useful : - Call my mother in law, and tell her I'll be late .... perhaps not coming.
I'm sure you can imagine a lot more ...
This would be heaven for the computer illiterate, visually impaired, etc.
Forget ssh, e-mail, keyboards, mouses, RSI, and so on. Just talk, even over the phone.
Use it at home, just to do by talking what you now have to press a button (or more than one button) for.
Ahh... just dreaming on it.
Who remembers SMP?
That's SMP as in symmetric multiprocessing.
IF (and it's a huge if) your workload is SMP-compatible, and your CPU (alone) is the bottleneck, the time taken to process a given set of work is halved when you double the number of processors.
SMP was industry standard technology and terminology back in the 80s and 90s, but seems to have been forgotten (or is being ignored) because the people now coming out of IT college and/or going into IT management are seemingly Intel-sponsored Microsoft-brainwashed fashion victims, as is illustrated by the VMware-style hype in the base article here.
Multicore is the same as SMP, just cheaper, smaller and faster than back then. Back then, 80 CPUs in an SMP box would generally, with good reason, have been laughed at by those with a clue, and so it should be today, outside of certain niches.
"the software industry finds writing multithreaded applications hard largely because engineers don't go seeking out the tools for the job."
'Engineers'? You've already let the cat out of the bag. Software 2.0 is about prettified GUIs and a "business model" that makes money for the various players regardless of the quality of the end product or service. IT's not about engineering Stuff That Works.
"no one is taught multithreaded software design anymore."
Absolutely. Communicating Sequential Processes is indeed what they need. That's from when, 1978? Not hip, not trendy, nothing to sell (except a mind-numbingly simple sensible design concept), move on.
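For anyone who has not met the idea, the CSP style boils down to independent processes that share nothing and talk only over channels. Below is a rough Python sketch of that shape. It makes several assumptions: queue.Queue(maxsize=1) stands in for a channel (true CSP channels are unbuffered rendezvous), threads stand in for processes, and the names producer, squarer and run_pipeline are invented for this example. In CPython the threads will not speed up pure-Python arithmetic, so this shows the structure only.

```python
import queue
import threading

DONE = object()  # sentinel passed down the channel to signal end of stream


def producer(out_ch, items):
    """Send each item down the channel, then signal completion."""
    for item in items:
        out_ch.put(item)
    out_ch.put(DONE)


def squarer(in_ch, out_ch):
    """Receive values, square them, pass them on; forward the DONE marker."""
    while True:
        item = in_ch.get()
        if item is DONE:
            out_ch.put(DONE)
            return
        out_ch.put(item * item)


def run_pipeline(items):
    """Wire producer -> squarer -> collector using two bounded channels."""
    a = queue.Queue(maxsize=1)
    b = queue.Queue(maxsize=1)
    workers = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for worker in workers:
        worker.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

No stage touches another stage's state, so there are no locks to get wrong; the only synchronisation is the blocking put and get on the channels.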
If civil engineers built buildings the way most software people design systems, they'd be reinventing a different kind of girder/welder/rivet every couple of weeks and spewing rubbish about it in every available industry journal. And a lot more bridges would collapse.
The rest makes a great deal of sense too. Where do you work, have you any jobs going?
"Of course databases like Oracle (or even MySQL) can benefit from a large number of cores, as they are highly multi-threaded applications."
Bollocks factor: 75%. The database might be multiprocess or multithreaded, but sooner or later in any worthwhile real world application (eg anything with transactions) there will be one or more serialisation/synchronisation points where everything has to stop to get in sync before progressing further. And that's in addition to the memory and disk bandwidth issues others have already mentioned.
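The cost of those serialisation points can be estimated with Amdahl's law, which says that if a fraction p of the work can run in parallel on n cores, the best possible speedup is 1 / ((1 - p) + p / n). A small Python sketch follows; the function name amdahl_speedup and the 95% figure are illustrative assumptions, not measurements of any real database.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% parallel, 5% stuck behind sync points.
    for cores in (2, 4, 16, 80):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
```

With only 5% of the work serialised, 80 cores give a speedup of roughly 16, and no number of cores can ever beat 20. Memory and disk bandwidth limits come on top of that.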
"Don't teach BASIC to the kids anymore, teach parallel C."
Indeed. Except Parallel C (and indeed Parallel Fortran) has been around since the days of BigSMP (see above) and DSPs and so on (and indeed the Transputer). Say around the late 80s, for (eg) Parallel C from 3L in Edinburgh. Where has Parallel C been the last twenty years? Waiting for Studio support??
A domestic or business-class desktop PC can use a couple more cores than it's got hard drives. If you've got a core per hard drive, you can parallelise the virus scan *and* maybe do some work/have some fun while it's scanning. Anything beyond a few cores isn't going to help, regardless of how many threads are sitting idly waiting for something to do.
Summary: massively parallel processing hasn't been, isn't, and won't be, relevant to anything except a tiny proportion of the IT problem space. In the problem space where it might potentially be relevant, other constraints (memory bandwidth, disk bandwidth, scheduling issues) often render massive parallelism relatively pointless.
If you want to know more, ask someone who remembers the late 80s/90s, do not ask someone who thinks Visual Studio is a software development tool.
Well there's nothing like being patronising. Of course idle processes don't use cores, although there is always some background activity going on and it is increasing. However, there are desktop applications which do use multiple cores - I know, I have some of them. Image and video editing are the obvious ones. Multi-media replay, "immersive" games, ray tracing, engineering simulations - these are all examples where extra cores can be used.
OK - these cores will be idle most of the time, but with modern power management they don't have to be wasteful, and the cores can be there when they are needed. Some of us do rather more demanding things with PCs than type things into Word. I use a four core machine at home, and there are times - often lasting an hour or more - when it is running almost flat out. I'd much rather use software that can use multiple threads efficiently than burn far larger amounts of power at wastefully high clock rates (power consumption increases disproportionately with clock speed - clever power management and suitable software is vastly more energy efficient).
Me want Power6
And a few of them, too
@Steven Jones: Dead right!
I regularly max out the 4 cores of my machine at home with just one app (I love the vacuum-cleaner sound when this happens). The same app can hog all 16 cores of a 4 socket quad-core per-socket Xeon-based machine, but it does have memory bandwidth issues which limit performance to about 9-10 times speed up (waiting for Nehalem or Opteron kit here). I am also rewriting some other apps to make full use of multiple cores. I can see many photo and video editing cases where 80 or so cores would easily be used, so I can process the same images in a more complex way than before, but with the same speed (a previous comment on Amdahl's law and its irrelevance was spot on).
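Doing more work in the same time, rather than the same work in less time, is the situation usually described by Gustafson's law: if the serial part takes a fraction s of the run time on the parallel machine, the scaled speedup on n cores is s + (1 - s) * n. The sketch below assumes that this is the argument being made here; the function name gustafson_speedup and the 5% serial figure are illustrative, not measured.

```python
def gustafson_speedup(parallel_fraction, cores):
    """Scaled speedup when the problem size grows with the core count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * cores


if __name__ == "__main__":
    # Hypothetical image job: 5% serial set-up, 95% per-pixel work.
    for cores in (4, 16, 80):
        print(cores, round(gustafson_speedup(0.95, cores), 2))
```

Under these assumptions the extra cores buy a more complex result in the same wall-clock time, which is why a fixed-size Amdahl estimate understates their value for this kind of workload. It still ignores the memory bandwidth limit mentioned above.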
Of course I do not max out the 4 cores all the time, but I want a system that can process serious amounts of image data quickly when I need it. You do not buy machinery of any kind for average use, but for peak requirements.