While social networking site Facebook doesn't seem inclined to pull a Google and build all of its own servers, the company's top techies are frustrated enough with the current crop of x64 boxes that they may be giving the idea some thought. Speaking at the Structure 09 conference in San Francisco yesterday, Jonathan Heiliger, …
the almost for sure fact that they are incompetent morons means Intel and AMD suck? c'mon.
They don't get performance gains because they don't have a good design.. if they had it, they would know why.. even I with the somewhat limited resources I had at my disposal, was able to determine performance bottlenecks.. and have a good design.
As is usually the case, they will now spend huge amounts of money to get small improvements: good performance must start with a good design, and then continue with good programming, testing and fine tuning. Don't expect a house made without plans to be of good quality.. and maybe it would be a good idea to tear it down and start from the beginning.
I DO see big improvements with new Intel and AMD tech.
Horrible programmers complaining about the hardware and completely ignoring that their code doesn't scale. Where I work, we run all of our servers as Virtual Machine Hosts and we have been able to run anywhere from 50%-80% more machines per host with the same.
About the power issue, since their application sucks, they aren't actually seeing the improved performance per watt of the new servers. I, for one, am very happy with the new kit that is coming in.
40,000 virtual ZLinux systems running under ZVM on one Zseries box. Power issue solved.
A technical article that apparently includes some technical analysis. Maybe there is still hope for some of the journos on el reg.
I'd like to see how a T1000, a low-end CoolThreads server from Sun, would perform with Facebook's backend software.
If MySQL can only run 4 threads to speed itself up, surely to take advantage of more cores per box, they would run multiple copies of MySQL per server?
Does the writer believe that they _aren't_ doing that, for some reason?
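To put a rough number on the multiple-instances idea, here is a minimal sketch. It takes the "4 threads" figure from the comment at face value and assumes 16 hardware threads for a two-socket Nehalem EP box and 12 for a two-socket Istanbul box; the function name is made up for illustration, and none of this is Facebook's actual setup.

```python
def instances_to_fill(hw_threads: int, threads_per_instance: int = 4) -> int:
    """How many database instances it takes to keep every hardware thread
    busy, if each instance only scales to `threads_per_instance` threads.
    Rounds down, but never returns fewer than one instance."""
    if hw_threads < 1 or threads_per_instance < 1:
        raise ValueError("thread counts must be positive")
    return max(1, hw_threads // threads_per_instance)


# Assumed thread counts per box (illustrative, not measured).
for name, threads in (("2-socket Nehalem EP", 16), ("2-socket Istanbul", 12)):
    print(name, instances_to_fill(threads))
```

Whether that many instances actually help depends on memory and I/O per instance, which this sketch ignores.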
But I want it!
All I heard from those quotes was "Waaaah! I want, I want, I want! Mommy, they won't give me what I want!" We all know that virtually all manufacturers, whether they make CPUs, batteries, or automobiles, will advertise their optimum performance. Having said that, you can't blame a processor, or even a server, for your poor performance without due diligence. You better be damn sure you've ruled out every other possibility. His quotes remind me of the type of idiots who complain that upgrading the memory in their system didn't make it faster when their processor is a Celeron-333.
As for the energy requirements of servers, there's not a hell of a lot server builders can do other than using low-power parts and more efficient power supplies. Even for those builders who design their own motherboards, they don't have a lot of room to work with. Most of a server's power usage comes from the processor. After that, you have the chipsets (north bridge and south bridge), various chips/controllers (network, video, storage, USB), memory modules (let's not forget that each DIMM uses around 2-3W), and power for the USB ports (must provide 2.5W for each port). Then you have storage media and backplanes. Let's not forget the fans which, in all servers I've seen, add considerably to the power usage due to the high RPM rate and the number of fans used. And, of course, there's the inefficiency of AC/DC conversion. In reality, server makers have little control over power usage because they don't make most of the parts. It's the various component makers who need to cut their power requirements.
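For a sense of scale, a back-of-the-envelope sketch using only the two figures quoted above (2-3W per DIMM, 2.5W per USB port). The DIMM and port counts in the example are assumptions picked for illustration, and the function name is hypothetical.

```python
DIMM_WATTS = (2.0, 3.0)   # per-DIMM range quoted in the comment above
USB_PORT_WATTS = 2.5      # per-port budget quoted in the comment above


def fixed_overhead_watts(dimms: int, usb_ports: int) -> tuple:
    """Return the (low, high) power in watts set aside for memory modules
    and USB ports alone, before CPU, chipset, fans or disks are counted."""
    usb = usb_ports * USB_PORT_WATTS
    return (dimms * DIMM_WATTS[0] + usb, dimms * DIMM_WATTS[1] + usb)


# Assumed configuration: 12 DIMMs and 4 USB ports.
low, high = fixed_overhead_watts(dimms=12, usb_ports=4)
print(f"{low:.0f}-{high:.0f} W")
```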
"'I am not sure why the server vendors have failed us,' Heiliger added when asked why Facebook wasn't getting the machines it really wanted to buy."
I could be going out on a limb here, but I would suggest it's because his expectations were too high.
Aitor: Good for you - no-one is saying that this isn't happening, just that it isn't happening for this (and lots of other) load scenarios. Also, there's a difference in scale between what most people are doing and what facebook are putting through their books.
Steven W. Scott: Don't, just don't. One of our grid engineers heard that you could do this, and thought that he could run 3000 grid (CPU bound) instances on the Z series. He seriously thought this would work, and give massive performance benefits!
Ira Downing: Almost definitely "crap". The cool threads stuff is really slow on single thread execution, so they'd need to run containers (or the like) and scale horizontally within the box. All in all, not really a good solution
Finally - @Facebook. Go talk to IBM - they ARE doing custom hardware with as many/few I/O systems as the customer likes. From what I've heard they are non standard racks (15" or something) to give space for cooling. If you want a system that's just CPU, network card, memory and console (of some sort) they will probably make you one.
All in all - fail all round ;) (And yes, I've probably failed to notice something too)
Re: Buy SUN
I was thinking the same thing. The problem, however, as TPM has guessed is that the back-end software probably does not scale well across the available cores/threads. Most opensource software does not... If you threw all of the slow/plentiful cores/threads on the T1000 at the problem, and TPM was right, then you would not do any better.
A Pint for TPM, as this is one article that he seems to have thought through. Perhaps he's right, maybe he's not, but blaming the HW vendors for performance issues before you've completely threshed out the problem is bad style.
The fastest boxes
can only run poorly written applications so fast.
i wish i had a dollar for every time an inexperienced developer told me that the server was too slow.
@Graham Wood - I'd be very surprised if IBM wasn't already in contact with them. Their sales guys are especially good at sniffing out lengthy consulting gigs that lead to hardware sales.
Same old story......
It's the software (not keeping up with hardware development), stupid!
"the server vendors have failed us" ?
"the server vendors have failed us", he says.
But what he actually says, as other comments have spotted and maybe not noted explicitly is:
the <<<x86>>> server vendors have failed us
You want something different than the commodity vendors currently provide, then you don't limit your choices to the commodity vendors.
The underlying principle is not news either way: x86 performance increases by means of clock speed increases have pretty much come to an end. Chip and system vendors now have to "lie" about performance gains, by relying on parallelisation-based scaling, whereas in reality although a few important apps will parallelise the vast majority of apps will see little or no benefit.
"which is written in PHP" and they are blaming hardware for performance issues?
now don't get me wrong i'm a big fan of php and use it almost exclusively for web work, but if large scale performance is the goal then native code is going to run a hell of a lot faster than php scripts, so i'd start by optimising the code on the server by using a language that at least has the potential to perform well
Yeah, servers suck
Actually, servers at the moment do suck, and not just at hyperscale. On the nanoscale side, try and buy a quiet 50W server for DHCP/DNS/etc for a campus edge or for a branch office.
The top of the range isn't any better. Sure the CPU numbers are impressive, but there's nowhere near the RAM and I/O to drive the CPU to capacity. I've had a devil of a time finding any box which will in practice drive a 10Gbps ethernet link to sustained capacity from disks. What is particularly annoying is the number of bugs we have uncovered. Unlike the days of yore, 'manufacturers' are really assemblers, they don't stress test their systems at all. That's left to their customers.
Facebook do run optimized memcached instances
Can't find the link at the moment, but Facebook contributed a large amount of code designed to optimize memcached over more processors, and to make much better use of networks. I suspect this may have been the basis for the commercial hardened-memcached offering mentioned in this article. Why assume that they are just using a vanilla install of the application?
I'd be extremely surprised if MySQL hasn't been heavily optimized by them as well, and certainly running multiple instances per server may help them with this. In most web-application scenarios, though, if you are doing enough data-crunching to make MySQL processor-bound you've probably set things up wrong.
As to PHP, which isn't considered, the whole architecture of the language minimises the sharing of resources across processes, so in principle it is perfectly capable of scaling to lots of cores.
In summary, they have lots of engineers working on this stuff, and are in sum much smarter than you, so why assume they have installed and configured the open source software as badly as you would have done? Facebook may not have a fantastic business model, but it is not run by technical idiots.
"a quiet 50W server for DHCP/DNS/etc"
Look, this isn't exactly up my street, but perhaps you could help me understand why something ARM-based (therefore Linux-based) won't do what you want, for less money up front than an x86, less power to run than an x86, less space than an x86, and without needing fans (and hopefully without needing hard drives).
You could start with any of the army of £30 SoHo routers, that can be reflashed with your choice of readily available easily reconfigurable GPL software. If that doesn't exactly suit there are plenty of slightly more upmarket ARM-based boxes around, you might fancy something based on the IXP4xx "system on chip" family with dual LANS and plenty of Flash and RAM, for example.
If you insist on x86 (why so?), have you thought about reusing an old quiet laptop whose batteries or display or keyboard or other non-critical part have failed and which would otherwise be destined for the scrapheap?
"Unlike the days of yore, 'manufacturers' are really assemblers, they don't stress test their systems at all. That's left to their customers."
Indeed. Companies like DEC (which did do stress testing before announcing, which was reflected in the price) went under, while Microsoft (which regularly leaves all kinds of bugs to be discovered by paying customers) thrive.
Funny old world, unless you want to buy something which actually *works* as advertised.
Let's bitch about opensource
"Both MySQL and memcached are not particularly good at scaling on many cores and threads,"
This would be because the problems that MySQL and memcached solve aren't suited to parallelism. The current generation of multi-processor machines will only show big gains in performance for tasks that can be split into parallel sections that rely on as little IO as possible,.. i.e Do a little bit of IO, LOOOOOOOOOOOOTS of work on the processor, little bit of IO.
Each MySQL or memcache thread is going to need so much synchronisation to stop the threads shitting on each other that it becomes almost pointless.
Yes, let's have machines capable of running 30 zillion threads, 20 zillion of them waiting.
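The usual way to put numbers on this is Amdahl's law, which the comment does not name but which describes the same effect: if only part of a job can run in parallel, the serial part caps the speedup no matter how many threads you add. A minimal sketch follows; the 50% parallel fraction is an assumption for illustration, not a measurement of MySQL or memcached.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup over one worker when only `parallel_fraction`
    of the work can be spread across `workers` threads or cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed workload: half the time is spent in serial or locked sections.
for n in (1, 4, 16):
    print(n, round(amdahl_speedup(0.5, n), 2))
```

With half the work serialised, the speedup can never reach 2x however many threads the box offers, which is the "20 zillion of them waiting" picture in numbers.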
>(I have no idea how PHP is or is not making use of all those extra
> threads in two-socket Nehalem EP and Istanbul servers).
mmmmm I know you're a journo, but really,.. if you're going to write articles about this stuff you should really have a clue about how it works right? PHP isn't an application server, it's an interpreter you <insert generic insult alluding to stupidity here>.
It's up to whatever webserver Facebook are using to maintain/instantiate enough instances of said interpreter to deal with requests... I can't think of any operations in PHP that you would want to be non-serial considering the tasks it does, i.e. generating *streams* of text, so there's no reason to have multiple threads running for one PHP script. Unless you're not writing a web application with PHP...
The reason you would want many MySQL threads running is so that your N number of running PHP scripts don't start to queue up... instantly turning your x-socket (/me wonders why how many connectors something has means anything..) y-cores machine into a FIFO.
If any of the clever code monkeys on here, with their two-pennorth advice, really believe a company the scale of Facebook aren't already optimising their code to the nth degree then they must be living in cloud-cuckoo land (cloud - computing... see what I did there?? :-D)!
@Facebook do run optimized memcached instances
"In summary, they have lots of engineers working on this stuff, and are in sum much smarter than you, so why assume they have installed and configured the open source software as badly as you would have done? Facebook may not have a fantastic business model, but it is not run by technical idiots."
They're also not likely to be as smart as the people at Intel/AMD. If they're so smart why don't they make their own boxes like Google? Chances are because they're not as smart as you think and the problems are largely of their own creation. They're running software that doesn't multi-thread well such as MySQL (and if they've knocked up their own little concoction what's to say they did it well?) and a scripting language - smarter people would use compiled code if speed is of the essence, no?
As for the new architectures I've gotten my grubby mitts on a 16 thread Nehalem server for use with Matlab and it goes like stink.
may be they have a point
i would like to have written (own!) that crap software!
MySQL version what?
"... also why Sun Microsystems has been touting that its future MySQL 8.4 (or rather, Oracle's future MySQL 5.4) will..."
So err, is it version 8.4 (which seems a long way off) or 5.4... hmm, decisions decisions!! Sorry just had to nit-pick.
Why "open source"?
This is really a cheap shot against Free Software: if they were using proprietary programs rather than Free Software, you would never have written "... the limits of the proprietary software ...".
Give and Take
What I do see here is a common problem; that Facebook seems quite happy to use open source applications such as PHP and MySQL without making contributions in terms of development to it. A major company like Facebook should have the resources to target the problems in the Open Source products they're using and contribute modifications to solve their own problems.
You get what you pay for...
Intel, Sun and AMD all have created a current generation of hardware that outstrip the capabilities in the Open Source code being used by Facebook.
TANSTAAFL applies. (There ain't no such thing as a free lunch).
It's well known that MySQL doesn't scale and, as has already been pointed out, synchronization of multiple copies gets very expensive. Oracle's RAC server is an example.
(Oh I know a certain Oracle competitive wonk is going to be pissed that I said that!)
It's a wonder that Facebook hasn't looked into GreenPlum, however that's not really open source even though its roots are in PostgreSQL.
The fact is that there is room for growth in the OS along with a refactoring of applications to take advantage of the parallelism now being offered.
Thumbs up for the article, not the whine coming from Facebook, they are all little bitches, no?
@Mark 65 and others
Well, of course they are not as smart at box/processor design as those at AMD or Intel, otherwise they would be a company that made the same sort of products. What they are is smarter than Intel and AMD in what they do. Duh. They are also not Google. They are Facebook. They have requirements of their hardware, and they appear to be disappointed in what AMD and Intel are providing. AMD and Intel are getting more performance by multicoring and threading, but modifying software to reflect that is a non-trivial task. In fact, writing good MT software is notoriously difficult - difficult to write properly and difficult to debug. There is also no guarantee it will help with performance, as that depends on the type of process to be worked on.
All the above explains why the Opensource code is not yet as multithreaded as it could be.
A couple of ideas
Not my area of expertise, but ideas. If scaling horizontally doesn't run them into some other limit any time soon, here are a couple of ideas for them:
1. Quit using fast boxes and concentrate on maximising crunch per watt. This might point them to buying a vast number of (blade?) Atom servers. Not at all fast, but low power consumption. I can't remember who sells such but I'm sure I read about at least one such system.
2. Keep the fast hot boxes and give VMware a call (and/or one of the competitors). Buy enough extra RAM to split sixteen cores into (say) eight virtual machines with two cores each, or sixteen VMs with one core each. This basically takes some of the CPU's multithreading ability that they can't use up to the hypervisor level, which can use it effectively to multithread multiple VMs.
In passing: VMware enterprise stuff includes the ability to hot-move VMs between physical boxes, and to shut down / reboot boxes when load on the virtual machines drops such that not all the physical boxes are needed. Big power savings here?
If scaling horizontally CAN'T provide a long-term solution they're just going to have to admit that their current system architecture can't scale up enough, bite the bullet, and re-engineer everything. If so, here's wishing them luck!
Virtualise or make "slimmer" servers?
At the moment, the choices with such apps seem to be either stick it in a VM and pack more VMs onto a faster server, or make specialised servers like the Google biscuit-tray type. Given the cost that is involved in producing new designs, until there is a substantial market I don't think we'll see any Atom servers from the main vendors. Which really winds me up as it would really simplify some of our infrastructure if I could put in some low-power Atom blades. Why x86? Because it means I can re-use existing x86 binaries, be they Linux or Windoze. And it always seems much more expensive in time and effort to change the app than to change the hardware.
Architecture, not Open source
Or redesign your software architecture. Facebook isn't that hard a concept all up. But if you treat it like your typical 3 page website multiplied by 10 million times, expect to have an architecture that looks like a massive bloated shared hosting site.
Tim, stop bagging open source. They could have chosen inadequate proprietary software and achieved the same poor result. I did hit the "get more from this author" button, and this article didn't improve. It's not open source that's the problem, it's the choice of technology that meets a scale that Facebook never thought they'd ever get to. All those guys praising Google biscuit trays neglect to say that their pizza boxes are built to support scalable software architectures, using map-reduce and other tricks of the trade. Yahoo use Hadoop to get some of the same scalability and others are working with concurrent languages like Erlang or Stackless Python and abandoning relational databases for more scalable storage structures.
There is plenty of good open source that assists users to get scale, PHP and MySQL solve lots of problems and have been effective at scale. Get the architecture right first for the problem at hand. Otherwise the cost is what Facebook faces now, lots and lots of underperforming boxes.