The job of a datacentre network is to connect the equipment inside to the outside world, and to connect the internal systems to each other. It needs to be secure, high performance and operate with an eye on energy consumption, with a guiding principle of minimising device numbers and costs, so you end up with a system that can …
My bad ... I'll refrain in the future.
Talk to any of the NoSQL people and they drop the Fibre Channel for replicated data across worker nodes, scatter/gather query dispatch and thin front-end machines running MySQL and memcached. It's roughly the design that Facebook, Google and Yahoo! all use, so it can't be that bad.
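For anyone who hasn't met scatter/gather dispatch, here is a rough sketch of the idea, not anyone's production design. The shard layout, `query_shard` and `scatter_gather` are all made-up names for illustration: a thin front end fans the same query out to every worker, then merges the partial answers.

```python
from concurrent.futures import ThreadPoolExecutor


def query_shard(shard, predicate):
    """Run the query against one worker's slice of the data (hypothetical helper)."""
    return [row for row in shard if predicate(row)]


def scatter_gather(shards, predicate):
    """Scatter the same query to every shard in parallel, then gather and merge."""
    if not shards:
        return []
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Scatter: one task per shard, all given the same predicate.
        partials = pool.map(query_shard, shards, [predicate] * len(shards))
        # Gather: flatten the partial results into one sorted answer.
        return sorted(row for part in partials for row in part)


# Illustrative data only: three workers each holding part of the key space.
shards = [[1, 5, 9], [2, 6], [3, 7, 10]]
result = scatter_gather(shards, lambda row: row > 5)
```

In a real deployment each shard would be a network call to a worker node rather than a local list, which is where the replication and timeouts come in.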
Layers...outside ring...... This is crying out for a diagram.
Ok, so it's not news or anything, but the article isn't a bad little overview. I'll bear it in mind next time I'm asked about this stuff. However, the need for a diagram is so obvious I can't believe it isn't included...
Physical or logical?
These days you can stuff it all in a single chassis (or a pair) with an almost infinitely scalable number of small separate access switches co-located with the servers. All the firewalls, load balancers, SSL offload modules etc. sit in one switch, so the physical diagram is easy. But the logical diagram can be massive when you take into account all the layers with firewalls between each.
Modules such as Cisco ACE do all the load balancing and SSL, other modules do all the firewalling and, wait for it... they can run virtual routing and forwarding (VRF) instances as well as virtual LANs. So a single chassis can run not only several layers but several (well, hundreds of) different customers all at once.
Everything is virtual these days, even networks. Some of the virtual networks I work on now are mind-numbingly massive, yet they all sit in half a rack. Ten years ago all the separate kit would fill half a hall.
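If the VRF bit sounds like magic, a toy model may help. This is only a sketch of the concept, not how any vendor implements it; the `Vrf` class, customer names and next-hop labels are invented for illustration. Each customer gets its own routing table, so two customers can use the very same address range on one box without clashing.

```python
import ipaddress


class Vrf:
    """Toy virtual routing and forwarding instance: one private routing table."""

    def __init__(self, name):
        self.name = name
        self.routes = []  # list of (network, next_hop) pairs

    def add_route(self, prefix, next_hop):
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, address):
        """Return the next hop for the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [(net, hop) for net, hop in self.routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]


# Two hypothetical customers sharing one chassis, both using 10.0.0.0/24.
customer_a = Vrf("customer-a")
customer_b = Vrf("customer-b")
customer_a.add_route("10.0.0.0/24", "fw-a")
customer_b.add_route("10.0.0.0/24", "fw-b")
customer_b.add_route("10.0.0.128/25", "lb-b")
```

The same destination address resolves differently depending on which instance the packet arrived in, which is what lets one chassis carry many customers.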
Also depends on how that data is stored.
Will you have multiple servers hosting the same data for load leveling purposes?
Or will you have each server cache the most used data and have a centralised data core from which all nodes glean their data to keep their cache current?
Some high traffic websites/game servers use a combination of both to provide maximum data throughput/availability.
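A minimal sketch of the second option, assuming a simple least-recently-used cache on each node. `DataCore` and `CachingNode` are hypothetical names, and a real setup would also need some way to invalidate stale entries, which is left out here.

```python
from collections import OrderedDict


class DataCore:
    """Stand-in for the centralised data core; counts how often it is hit."""

    def __init__(self, records):
        self._records = dict(records)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._records[key]


class CachingNode:
    """Front-end node that caches its most recently used keys locally."""

    def __init__(self, core, capacity=2):
        self.core = core
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        if key in self._cache:
            # Cache hit: mark as most recently used, no trip to the core.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: fetch from the core, then evict the oldest entry if full.
        value = self.core.get(key)
        self._cache[key] = value
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
        return value


core = DataCore({"a": 1, "b": 2, "c": 3})
node = CachingNode(core, capacity=2)
node.get("a")
node.get("a")  # served from the node's cache, core is not touched
```

The combination mentioned above would simply run several such nodes, each with its own cache, in front of a replicated core.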
But not one I'd like. Too many layers, too many switches. And where's the structured cabling?
El Reg, either this should be done in much more depth or not at all.
I thought it was a great introductory article. It described the guy's basic structure, and left the field open for follow-up articles fleshing out individual elements. Now, I can’t speak for the author – I’ve never talked to him, so I don’t actually know under what constraints he is working – but I know they hold me to between 500 and 750 words.
The long multi-page articles are apparently not nearly as well read as the simple 500-750 word single-page ones. I think you’ll find that even the really experienced authors such as Lewis, Lester and Andrew write more single-page articles than they do multi-pagers. The multi-pagers they do write are hugely in depth and generally very concise. They have had years of writing experience to learn how to hold a reader’s attention long enough to click the button for the next page.
Consider cutting the author a little slack. I’ve read all of his articles so far and I’ve liked every one. He is doing a good job trying to take a very complex topic - “Datacenters In General” - and reduce it to something that individuals who aren’t familiar with it can grasp. He has only written a few articles for El Reg; perhaps he’s even new to being a writer in general. He’s just hitting his stride with his audience, and frankly he’s doing better than I did when I started!
Try ASKING the author for further elaboration on topic areas you prefer. El Reg’s commenttards are notoriously critical; being offensive, rude or demanding will probably just get you ignored by the author. Rightly so, in my opinion. Asking politely however will probably earn you a smile and a mental “hey, thanks for not being a douche.” If he has the leeway to do so within his contract, I’d bet that the “asking politely” bit would then manifest itself in the form of an article diving further in depth on whatever area you wanted more information on.
A great example of how to do it right is given by a couple of the commenters here: http://forums.theregister.co.uk/forum/1/2011/01/10/datacentre_cooling_and_power_constraints/
They asked very politely for further elucidation on specific areas and I currently have three follow-up articles in draft open on my screen to accommodate them.
Anyways, for my comment to the author:
Manek Dubash: good article, sir! I however have some questions. Perhaps if you have time you could expand upon them for me, please and thank you:
1) You talk about fibre channel as the storage layer, but exclude other technologies such as iSCSI or ATAoE. Any particular reason?
2) Also: you talk about your core network as being “large, high-performance switches consisting of blades plugged into chassis, with each blade providing dozens of ports.” In my setups, I have preferred to go with large numbers of commodity switches that physically break up my subnets and/or physically provide redundant paths. I admit to not having had a datacenter under my care larger than 500 nodes, but I wonder at the reasoning specifically behind “bladed” switches. Is there something about “bladed” switches you feel is inherently superior to standalone stackable switches? (Other than space conservation?) Having not had room to play in a > 500 node datacenter, I am very curious about all the rationale.
Looking forward to the next article!
Whilst I'll admit to schoolboy amusement at Playmobil reconstructions of lesbian Spanish donkeys (fat asses provided by Ralph Lauren) flying paper space-planes armed with Russian recoilless rifles when hunting escaped Paris pantie wearing gyrfalcons threatening 747s over Scotland, all the while dodging Optimus Prime nuking "teh terrorists" at Heathrow, fueled by bacon sarnies, Wensleydale Cheese & a decent pint provided by Sarah Bee (etc.) ... and that *IS* part of the reason I read ElReg ... When I see an article labeled "Datacentre", I kinda expect more ... uh ... technical info. Which is another reason I read ElReg.
And part of the reason I don't read you anymore, Trevor. And why I won't read the author of this particular series anymore. Do carry on, though ... Everyone has to start somewhere.
But please note that I apologized for my initial (nixed) comment, and my follow-up (wimp), when I realized who the (new around here) author was. I made a mistake, I should never have commented to him or her. As I said, my bad. Life goes on ... Learn anything?
So...who is the author? Is it someone I should know? There was no "el reg bulletin: new datacenter articles are written by X." I am going to assume the name on the article is the name of the author unless told differently. If the name is one I am supposed to recognise as "really big in the industry," then I am afraid they play well outside my pay grade. (Actually, that's evident by the Neat Stuff being discussed.)
Also, unless you are Matthew Malthouse, I wasn't talking to you at all in this thread. Are you cyber-stalking my posts in other threads now? I have no idea why you posted "wimp" to this author. I found it bizarre, but figured "meh." Why you felt that my post to Matthew Malthouse was in any way directed at you, I have absolutely no idea.
Are you feeling okay, dude?
Also, @Manek Dubash: I am very sorry to admit publicly to not recognising your name. Google came up with a few possibilities in the IT industry…but I must admit to not having heard it before. Please take that not as a slight against your experience, but rather an example of my not playing in quite the same fields as you. I also apologise that this thread has somehow grown a “jake vs. Trevor” arm. Not remotely my intention. Keep the fantastic articles coming!
Seriously, ma'am, if you ask I'll never post here again.
Trevor: Get over yourself. This isn't about you. It rarely, if ever, is. And it's never about me. Sad for both of us, innit?
None of this is about me, nor do I understand why it should be about you. I don't even understand why we're having this conversation in the first place. This is about a guy who wrote a great article, one that I personally am eager to read follow-ups to. It's about someone who I think did a credible job at bringing a difficult topic "down" to the level of regular folks like me. The OP to this thread was kind of harsh on the author; I felt maybe if the OP was looking to get more info from this author...
...he'd catch more flies with honey than with vinegar. Where and how and why you got involved, I’ve honestly no idea.
Further apologies to the author for the tangential nature this comments thread has taken.
Well I guess some "equipment porn" might have made the article better. Just show photos of the devices you are describing.
Other than that it's not a bad article.
Worth mentioning to newcomers joining datacenter ops. I notice that many newcomers, even though certified, are not aware of many aspects.
It is not clear to me how the defined layers interact. I admit I come from a heavily Cisco based background so I am used to their enterprise campus model (and before that the 3 layer model) which may be clouding my understanding but I still have trouble with this.
I tried diagramming it out, but no matter how I follow the "instructions" I cannot generate anything that is meaningful, sensible and agrees with the statement that all data passes through the core:
External devices talking to DMZ devices don't appear to traverse the core (in fact the author states this is the case) and neither will server to server traffic in cases where server racks contain aggregation switches.
I know this may sound pedantic, and may be a result of a failure of understanding on my part, but this is an important topic that does need a high level of detail and accuracy.
Only one DMZ ?
No mention of Backups at all?
No traffic other than web traffic through the front door?
I don't think 'simplified' or 'overview' quite cut it.