I recently revisited my all-time favourite consulting experience: the design and construction of a mid-sized render farm. I had never done anything quite like it before; it took a full summer’s worth of research and two months to fully install and test the final design. The render farm consisted of 400 render nodes, a rack’s …
Sounds like fun
So what magical video cards did you use - Nvidia or AMD (ATI)?
fp calcs + ATI = epic fail
Since when has ATI been good for anything but graphics? In fact, I burned up a 1950 Pro doing first-generation Folding@home calculations. Nvidia owns the floating-point calc market for good reason.
At the moment, the client's sysadmin is most familiar with Octane Render as their GPU rendering platform. It can only talk to CUDA cards, and so nVidia was the only choice. So far, despite Octane being in beta…I’m mightily impressed. It is dirt simple to use, and fast as could be desired. The renders done on GPU have all the fidelity of a CPU render; there are no shortcuts taken by this software. (Traditionally, GPU renders would show graininess in shadows and there were frequently issues with glass rendering.)
At the current pace of development, Octane Render should have a fully supported version 1.0 product out the door before we get the datacenter upgrades completed and the new farm installed. Version 1.0 will come with all the scripting gubbins and bobs we need to make the whole thing properly talk to a command and control server, and then we’re off to the races!
Fortunately for me, however, I’m not the one who has to deal with the render software. I am setting up the deployable operating system, designing the network and speccing the hardware that will be used. I get to design the datacenter’s cooling and power systems and oversee the retrofit. I’ll be ensuring that Octane Render is installed properly on the client systems and that the individual nodes grab their configs from a central location…but tying those nodes into the CnC server is the in-house sysadmin’s job.
Overall, it’s a great way to play with new toys in a fully funded environment. More to the point, it’s doing so in a fashion that takes full advantage of my unique skills: instead of simply following a manual someone else has written, I am doing the research and writing the book myself. Doing that which hasn’t quite been done before…but for once, with a proper budget backing it up.
The fact that I am getting paid for it is simply icing on the cake. These types of jobs are so fun...honestly, I'd do them for free. Hurray for the fun gigs!
I think you must have deleted 1/2 the article, because I was expecting a page 2 but there isn't one.....
What were you hoping to read about?
Agree with Robinson
This was a really interesting article, but it kind of cut out just when it was getting going. We want specifications, photographs, layout diagrams, and most importantly, costs!
I'm an end user of such systems, and would really like to know more as our own render farm is looking a bit long in the tooth
I can do some of that. The client wants to preserve his anonymity throughout this process. (He doesn't want to give his competitors an edge in any way.) As such, photographs are out. Design diagrams and floor plans are certainly doable, but only if you are willing to put up with my terrible Visio skills.
As to costs and specifications…some of that should be manageable. I have to ask the guy designing the liquid cooling rig what his thoughts on the whole deal are…but I’ll write up articles on what I can get away with. ;)
Would love to hear more......
Any chance of another blog discussing the design, hardware choices, form factors and other assorted technical treats?
Ask, and ye shall receive. I've put it on my todo list for new articles. :)
Interesting. As well as the obvious cooling and electrical supply issues, what about the space for UPS systems? Might not be an issue for a render farm, where a little scheduled downtime may be tolerable, but not for most server racks, where the priorities seem to be uptime, uptime and uptime.
Battery racks are not small, so you either increase the UPS space (x3), or reduce the backup time (1/3)! Seems obvious to me (being more hardware than software), but I guess it is very easy to be seduced by processing power alone. Well said Trevor, hope you solve it.
You are correct in that the render nodes don't need UPS support. The command and control servers (as well as the storage systems) do have UPSes. The UPSes are APC, and are installed on the same racks as the systems using them. So far, they have handled 2 hour outages with room to spare. Though that's probably because I went to Princess Auto and bought a bunch of very large deep-cycle batteries and added them to the UPSes when I installed them. Works a treat!
500 -> 50
So was there a bottleneck on the machines? Ten machines' worth of files now have to be read and written from one machine. Was this an issue?
Actually, we never had much of an issue here. We simply spammed spindles: big fat RAID 10s running on multiple Adaptec 2820SA controllers. The limits were the controllers themselves, not the drives! It took some work, but we eventually realised that if you staggered the startup of the systems, they wouldn't all be trying to read/write from the storage array at once. Once we mastered staggered startups, booting was no longer an issue.
The control software was also good about this: it could be configured to only hand out jobs to a preset number of nodes at a time. So the first 15 nodes would get jobs and then 30 seconds later another 15 nodes would get jobs, etc. This staggered the requests from nodes reading jobs and writing results enough that it was within the capabilities of the hardware as provided.
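For anyone wanting to picture the staggered hand-out described above, here is a minimal sketch in Python. It is not the actual control software; the function name `staggered_batches` and the node names are made up for illustration, and only the batch size of 15 and the 30 second gap come from the comment.

```python
from typing import Iterator, List, Sequence, Tuple


def staggered_batches(
    nodes: Sequence[str],
    batch_size: int = 15,      # nodes released per wave (figure from the comment)
    interval_s: float = 30.0,  # gap between waves in seconds (figure from the comment)
) -> Iterator[Tuple[float, List[str]]]:
    """Yield (start_offset_seconds, batch) pairs so that only one batch of
    nodes hits the storage array at a time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if interval_s < 0:
        raise ValueError("interval_s must not be negative")
    for start in range(0, len(nodes), batch_size):
        wave = start // batch_size
        yield wave * interval_s, list(nodes[start:start + batch_size])


if __name__ == "__main__":
    # Hypothetical node names, purely for illustration.
    farm = [f"node{n:03d}" for n in range(1, 51)]
    for offset, batch in staggered_batches(farm):
        print(f"t+{offset:>5.0f}s: {len(batch)} nodes, {batch[0]} to {batch[-1]}")
```

A real dispatcher would sleep or schedule on those offsets rather than print them, and would probably also cap the number of nodes writing results at once, but the wave arithmetic is the core of the idea.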
CPU/GPU needs larger than file system
A render farm will have (relatively) little disk activity compared to the CPU/GPU need, so I doubt this would be that much of an issue. I would think network congestion is what will benefit the most from the reduction in the number of machines, but I'm far from as well placed as the author to say so.
I hope they are taking the opportunity
To move to contained air flow (contained hot or cold aisle, it doesn't matter which) to properly sort out their airflow. This would not only allow them to run their air supply temperature higher, and therefore spend less power on cooling, but would also likely release some cooling capacity.
Also, I hope those aren't 1U servers toasting away with 2 video cards in each and peeing away lots of Watts on cooling fans just so they can have some empty racks...
Well, we live in Edmonton, Alberta, Canada. 10 months of the year, the outside temperature is below 20 degrees C. I have never installed a datacenter in this city without an outside air system. It would be unbelievably stupid not to take advantage of the massive source of cold air just on the other side of the wall.
The issue is that for two months of the year, the outside air temperature is often over 30 degrees C. This means that in addition to your outside air system, you need chillers capable of handling the entire datacenter, even if they are only active two months of the year. (Also: the outside air system has to be upgraded to a much higher volume/minute capacity than it currently has.)
The upgrades won’t be particularly hard…but they simply cannot be done right at the moment. Edmonton has had something like a metre of snow in the past five days. We are still trying to clear our streets and walks, let alone having warm bodies to climb up on frozen rooftops to upgrade chillers!
In truth, most of the datacenters I have anything to do with here only need chillers two or at the outside three months out of the year. The rest of it is simply forcing outside air into the building, through the front of the servers and then exhausting the lot of it back outside the building. Not particularly complex, but it does take a sheet metal guy, someone to drill holes in the concrete wall and some dudes up on the roof upgrading the chillers.
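As a rough feel for the "volume/minute capacity" an outside air system needs, the usual sensible-heat relation is airflow = heat load / (air density x specific heat x temperature rise). The sketch below is a back-of-envelope calculation only. The 100 kW heat load and 15 degree rise are hypothetical numbers, not figures from this project, and the air properties are textbook values for roughly 20 degrees C at sea level.

```python
# Textbook approximations for air at about 20 C and sea-level pressure.
# Both are assumptions; cold winter air is denser and moves more heat
# per cubic metre.
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0


def required_airflow_m3_per_s(heat_load_w: float, delta_t_k: float) -> float:
    """Volume of air per second needed to carry away heat_load_w watts
    when the air warms by delta_t_k kelvin passing through the racks."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_load_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)


if __name__ == "__main__":
    # Hypothetical example: 100 kW of IT load, 15 K rise from intake to exhaust.
    flow = required_airflow_m3_per_s(100_000, 15)
    print(f"{flow:.2f} m^3/s, or about {flow * 60:.0f} m^3/min")
```

The useful takeaway is the inverse relationship: halve the allowable temperature rise and the fans and ducting have to move twice the air.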
Hafta say I agree with that. In potentially warmer environments, I'm afraid to go 1U.
(Some clients/companies worry about price more than common sense/LOGIC).
What is the humidity like in the two hot months?
Reason I ask is that if you use a bit of water on your economisers then you can run to the wet bulb temperature instead of the dry bulb.
In more detail, indirect water would involve using a free cooling dry cooler with a water spray for when the dry bulb temperature rises. Indirect air would use an air-to-air heat exchanger whose intake cooling air passes through a regular air washer module of an AHU. These two systems allow you to maintain quite tight internal humidity control, unlike "fresh air" type systems.
Of course, if the client is prepared to accept a wider operating environment, remember almost all current server hardware is compliant with the ASHRAE specs and will run to 27C intake air all year perfectly happily and up to 35C intake air for the couple of months you are concerned with.
The other aspect to consider, given the simplicity of the fan in the wall (fresh air / direct air side economiser) approach is that if the external humidity is low you can use an air washer to bring the humidity of the intake air up and thereby drop the dry bulb temperature.
You should be able to run all year with no chiller; you can in most of Europe.
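To put a number on the wet bulb versus dry bulb point, here is a small Python sketch using Stull's empirical approximation, which estimates wet bulb temperature from dry bulb temperature and relative humidity. Treat it as an estimate only: the fit is for sea-level pressure and is generally quoted as valid for relative humidity between about 5% and 99%, so very dry air sits at or beyond the edge of its range. The function name is my own.

```python
import math


def wet_bulb_stull_c(dry_bulb_c: float, rel_humidity_pct: float) -> float:
    """Estimate wet bulb temperature (deg C) from dry bulb temperature (deg C)
    and relative humidity (%) using Stull's empirical fit.

    Assumes sea-level pressure; accuracy degrades outside roughly
    5% to 99% relative humidity and -20 C to 50 C.
    """
    t = dry_bulb_c
    rh = rel_humidity_pct
    return (
        t * math.atan(0.151977 * math.sqrt(rh + 8.313659))
        + math.atan(t + rh)
        - math.atan(rh - 1.676331)
        + 0.00391838 * rh ** 1.5 * math.atan(0.023101 * rh)
        - 4.686035
    )


if __name__ == "__main__":
    # A 20 C day at 50% relative humidity comes out at roughly 13.7 C wet bulb.
    print(f"{wet_bulb_stull_c(20.0, 50.0):.1f} C")
    # A hot, dry day: the gap between dry bulb and wet bulb is what
    # evaporative cooling has to work with.
    print(f"{wet_bulb_stull_c(35.0, 10.0):.1f} C")
```

The drier the air, the bigger the gap between the two temperatures, which is why a water spray or air washer buys so much in a dry climate.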
Humidity in Central Alberta is roughly 0% - 5% year round. As such, you are quite correct in that it is (in theory) possible to run the whole setup without chillers. Indeed, last year my chillers were only on for 3 weeks of the year. That said, I am unsure that I would ever build a datacenter without adequate chiller capacity. While we do get down to -40 in the winter, we can easily have days of +40 in the summer.
On average, the summer months are 25-30, but the spikes that go up to 40 are enough to drop any datacenter I personally know how to design. (Well, theoretically I could engineer a heat-pump system that would not be a chiller, but I am fairly certain the chillers are actually more power efficient.)
As to hardware meeting ASHRAE specs, I am not 100% sure of that. We whitebox our servers, just as we whitebox our datacenters (and everything else.) It is the reason people call me to do this stuff. Anyone can order a pre-canned (and usually very expensive) server (or even entire datacenter) from a tier 1. Not so many people take the time to look at the available off-the-shelf components from the whitebox world and ask the magical question “what if?” Hewlett Packard can deliver you a datacenter in a sea can that does everything the one I am building will do; tested to meet a dozen different standards and proof against almost anything except a nuclear strike.
I am called in when someone wants to build a datacenter into an awkward space and do it for something like half the cost of a datacenter-in-a-sea-can. (Alternately, if someone wants to make a computer system do something it was not designed to do, I can usually arrange to make it perform that function anyway.) My partner in crime on most projects – and fellow sysadmin at my day job – is the polar opposite. He is so by-the-book he makes my teeth hurt. He tests everything, checks, re-checks and then does it all over again. Every time I approach a problem from an oblique angle, there he is measuring the angle, documenting and ensuring we have enough backups to survive World War III.
In this case, we are likely going to be using some modified Supermicro servers. (I have a guy working on the liquid cooling systems now.) The issue is the video cards. I just don’t know that I can dissipate the heat off the video cards using forced air at or above 25 degrees C. They crank out stupid wattage, and trying to design this tiny little shoebox datacenter to handle 500 units without chillers is hugely outside my comfort zone.
I will naturally try to design out the need for chillers as much as is humanly possible…but I think I would be a fool not to install enough chiller capacity to completely back up the outside air system as a just-in-case measure. Call it the backup cooling system. After all, what do you do if the primary and secondary fans on the outside air system fail simultaneously?
This should be a well-known risk, and should be manageable with the right routine maintenance, but if you're not used to dealing with water evaporation as an air-cooling method, I can see it catching somebody by surprise.
This is one of the main reasons my company sticks with tape backups. We don't move the data tapes off site, so we aren't taking that advantage of tape. However, we do back up over 3PB every night, so the power and cooling, even for disk systems that spin down automatically, would be prohibitive.
Also, we're getting into blade servers in a big way, any power we save anywhere gets eaten up by them and their exotic cooling requirements.
Oh, to be said sysadmin!
Wouldn't it be nice.....ahhh....happy thoughts.
Doing all that seems really cool. I'm a low-level IT guy in a K-8 school district, where I'm kinda like a first responder: I handle the small, easy tasks like forgotten passwords, new users, fixing printer jams, etc. I'd love to have a job where I get to work on big projects like that. You looking for an apprentice or anything?
An alternative to "small" UPSes in the server room adding to the heat load is to go industrial.
Caterpillar make a range of systems based on flywheels/diesel generators. We have two 300kW units supplying our labs and server room (although the server room only eats about 50kW...)
Single biggest advantage - the flywheel setup is a motor/generator, so all incoming power is conditioned. The quality of power feeds is often rotten in terms of spikes/brownouts/etc. Electronic equipment failures have dropped 50% since we installed the setup 8 years ago.
Agreed. The article ends rather abruptly.
He's at the tail end of the design phase right now, so there's no build to talk about. That's when the fun stuff happens - like a metre of snow getting sucked in through the failed forced-air traps after the upgrade!
Did you consider ...
Asking the client if it would be acceptable to degrade the performance at peak times (temperature-wise)?
If your main concern is cooling the active GPUs with high intake temps, could you throttle them down a bit and make a bit less heat? Presumably the temperature drops somewhat at night, and then the systems could ramp up again. Done intelligently, this ought to save a lot of effort (and cost) on the cooling.
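One way to picture "done intelligently" is a simple ramp: full power below some intake temperature, a floor above some higher temperature, and a straight line in between. The Python sketch below is only an illustration of that policy. The 25 C and 35 C thresholds and the 50% floor are assumed values, and the function does not talk to any real GPU or scheduler; something else would have to turn the fraction into a power cap or a reduced node count.

```python
def gpu_power_fraction(
    intake_c: float,
    full_power_below_c: float = 25.0,  # assumed: run flat out at or below this
    min_power_at_c: float = 35.0,      # assumed: fully throttled at or above this
    min_fraction: float = 0.5,         # assumed: never drop below half power
) -> float:
    """Return the fraction of full GPU power to allow at a given intake
    air temperature, ramping linearly between the two thresholds."""
    if min_power_at_c <= full_power_below_c:
        raise ValueError("min_power_at_c must be above full_power_below_c")
    if not 0.0 <= min_fraction <= 1.0:
        raise ValueError("min_fraction must be between 0 and 1")
    if intake_c <= full_power_below_c:
        return 1.0
    if intake_c >= min_power_at_c:
        return min_fraction
    span = min_power_at_c - full_power_below_c
    return 1.0 - (1.0 - min_fraction) * (intake_c - full_power_below_c) / span


if __name__ == "__main__":
    for temp in (18, 25, 28, 30, 33, 35, 40):
        print(f"intake {temp:>2} C -> {gpu_power_fraction(temp):.0%} of full power")
```

Whether the lost render time is acceptable on deadline is a business question rather than an engineering one, so the thresholds would need agreeing with the client.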
I suspect however it's not possible. Summer temps usually only drop by 5 degrees. While that might be good enough for many days…there are entire weeks which could exist outside the temperature range of “running full bore.” You’d think that shouldn’t quite be a problem, excepting that apparently a half day’s rendering can make all the difference when on deadline. That said, it’s worth exploring Amazon’s EC2 or Rackspace’s cloud as potential emergency backups for thermal excursion events.
One of my most interesting gigs was also building a small render farm, five years ago... with an ultra tight budget (25KE for 20 nodes? what the hell!)
Had to use basic consumer stuff (dual core CPUs? Too expensive!), build some homemade ventilation and write the controller software all by myself. And guess what? This thing still works...
Also, I grinned when you talked about the "staggered startup", as I experienced the same issues when testing the controller software :)