very obvious and often overlooked :-(
IT people start on their journeys through infrastructure provision lacking one fundamental thing: experience. You emerge from school, college or university knowing something about technology (unless of course you did one of those nancy IT degrees that doesn't teach you anything about proper IT, in which case your usefulness to …
More importantly, racks come in more than one sharpness.
The biggest and most important rack differentiator in a heterogeneous install is the amount of meat and blood sacrificed when adding/removing equipment. If you are dealing with a cloud install it does not matter, as you just pile them up and remove the dead bodies when they kick the bucket.
You do not regularly unplug, service, add hardware, remove hardware, etc. In a mixed install you have to do it, so you learn to value racks without very sharp edges very quickly.
"Racks come in more than one sharpness"
They also come in several different external dimensions - most of the ones on the market are slightly wider than 600mm with their side panels on, which is a problem if you need security or are conscious about airflow in a high density environment.
The "expensive" racks aren't always that much more when you have to add in "optional extras" to the cheap ones.
and DON'T SKIMP ON THE POWER BARS!
Back in '96 I learnt that an HP '19" rack' mounting shelf would not fit in a non-HP 19" rack - length was fine, but the width between the posts was both too narrow for the shelf and too wide for the bolts. Cue a visit to the DIY store for a hacksaw and some overlength bolts.
Oh, and re: "Serial connectivity is witchcraft" - knowing the pinouts, crossovers and ties for the port means you don't have to lug every cable variant around in your toolkit. A set of the commonest pre-made connector ends plus a breakout box and/or connector kit will amaze and astound t'kids of today. A DC engineer once called me a sad tosser for hoarding RS-232 bits, but had to grovel 10 years later when he needed a particular cable - I knocked one up in a couple of minutes :-)
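The pinout knowledge above also fits in a text file. Here's a sketch of one common DE-9 full-handshake null-modem (crossover) wiring as a lookup table - plenty of real kit deviates from this variant, which is exactly why the breakout box earns its keep:

```python
# One common DE-9 full-handshake null-modem wiring (a sketch only;
# real kit varies, hence the breakout box and connector kit).
NULL_MODEM_DE9 = {
    1: [4],     # DCD  <- far-end DTR
    2: [3],     # RxD  <- far-end TxD
    3: [2],     # TxD  -> far-end RxD
    4: [6, 1],  # DTR  -> far-end DSR and DCD, tied together
    5: [5],     # Signal ground, straight through
    6: [4],     # DSR  <- far-end DTR
    7: [8],     # RTS  -> far-end CTS
    8: [7],     # CTS  <- far-end RTS
}

def far_end_pins(pin: int) -> list[int]:
    """Which far-end pin(s) a given local DE-9 pin connects to.
    Pin 9 (RI) is left unconnected in this variant."""
    return NULL_MODEM_DE9.get(pin, [])
```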
from the command line.
The more pretty GUIs and layers of abstraction there are between the admin and what is really going on with the underlying OS, the easier it is to employ those who have been taught how to pass an exam to administer one's infrastructure for a nominal fee. Experience is not cheap, and one has to cut costs.
This does present a much broader attack surface to those that wish to own your ass though. And when your ass is owned, those cheap administrators one has hired prove to be no use at all.
Some management modules come with everything enabled, some require licenses for full functionality and others are basically crippleware to make you buy licenses when you realise they do nothing.
Also, some of them are a complete pain to install licenses on, so if you need advanced functionality, either pick a brand that includes all the functionality you need without licensing, or get the licenses pre-installed by the vendor.
You lost me at "buy the management module". There is nothing in a management module that can't be done from ssh and the shell. If you want redundancy, have two machines in one location and use them to "hit off" against each other; it's not difficult, and with the right BIOS settings you won't have problems.
The type of lights-out management he is referring to is more akin to a remote "power on" button than some sort of bash shell available *after* the machine is powered on.
Without lights out management, you need to visit the rack to switch the server back on.
TL;DR ssh doesn't work unless the machine is switched on.
Really! And how do you propose powering up a shut-down server via ssh/command line??
You can use paired servers to act as serial consoles for each other (assuming your hardware still has serial ports), but that won't help you if the target server isn't running an O/S in the first place.
Fully functional DRAC/ILOs? Just say YES, especially if in a remote DC!!
That still won't let you look at the machine between bootup and loading SSH. Also, if your system does not contain ILO/serial control, it probably doesn't contain a watchdog either - so look forward to the occasion where the server is rebooted/powered on with wake on LAN and it hangs at the RAID controller.
Many years ago I was sitting in my home office (recliner downstairs) and I'd left a machine powered off upstairs. I thought about extricating myself from the recliner, but then - when I heard the Whos in Whoville sing - I decided to see if I could script up something relatively quickly and get a magic packet (the packet that initiates a wake-on-LAN event at the node that is powered down) to wake it up. It took me about an hour, but I did it. I was a God for that brief moment.
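For the curious: the magic packet itself is trivial - 6 bytes of 0xFF followed by the target NIC's MAC address repeated 16 times, fired at the broadcast address as UDP. A minimal Python sketch (the MAC and broadcast address are obviously yours to fill in):

```python
import socket

def build_magic_packet(mac: str) -> bytes:
    """A WoL magic packet: 6 bytes of 0xFF followed by the target
    MAC repeated 16 times, 102 bytes in total."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the packet as UDP; the powered-down NIC listens for it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Note the NIC (and usually a BIOS option) has to have WoL enabled, and it only helps on the local broadcast domain unless your routers forward directed broadcasts.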
We used to always have a dumb terminal on a trolley stuffed in a corner for emergencies. Few PCs, and even fewer laptops, come with serial ports these days, so at least make sure you have something. If not an old VT100 then at least a USB serial device for your laptop. And a few 9/25 gender changers. My racks also have a tray with folding screen & keyboard, and flying leads at the back. It only takes up 1U and is very handy to plug into any misbehaving system.
" if you want redundancy have two machines in one location then use them to "hit off" against each other, "
This room has big boys in it - you may want to think twice about being so clever.
We have 1000+ servers (excluding VMs) across multiple countries. You may not want to "hit off" (whatever the fuck that means) for that many servers, for the sake of an LO module.
Explain to a customer why you have to travel to some godforsaken land to press a power switch because the server has become completely unresponsive, and that it will be up again tomorrow once we've pressed the power button.
Guess you've never heard of 5 9's SLAs?
I tried this with 4x IBM blade chassis in 2007; the colo provider said it would be fine... till it started drawing something like 8kW of power, at which point they very swiftly backtracked and I had to take another rack. Thankfully it was the next-door cab and I just fed the power through.
I love blades, but you pay a ridiculous premium for them... after this we went back to pizza boxes, horrible amount of cabling, but when we investigated the cost difference, it was an extra 40-60%.
"I love blades, but you pay a ridiculous premium for them... "
Blades are worthwhile _IF_ and only if you fill the shelf. At anything less than 80% full they almost always cost more than the pizzabox equivalent.
The fact that some blade KVM and switch kit *ahem*Supermicro*ahem* is so fucking atrocious I wouldn't want to inflict it on anyone is another matter. KVMs themselves are another pain point. They're almost all bloody awful.
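The 80%-full break-even point above is simple amortisation: the chassis is a fixed cost spread over however many blades you actually install. A rough sketch - all the prices here are made-up illustration, not vendor quotes:

```python
def per_server_cost(chassis_cost: float, blade_cost: float,
                    blades_installed: int) -> float:
    """Effective cost per server once the chassis is amortised
    over the blades actually installed in it."""
    return blade_cost + chassis_cost / blades_installed

# Hypothetical numbers: a 16-slot chassis at 8000, blades at 3000 each,
# versus a 4000 pizza box doing the same job.
PIZZA_BOX = 4000
half_full = per_server_cost(8000, 3000, 8)     # 4000.0 - dead even
nearly_full = per_server_cost(8000, 3000, 14)  # cheaper than the pizza box
quarter_full = per_server_cost(8000, 3000, 4)  # well above it
```

With these (invented) numbers the crossover is at half full; tweak the chassis premium and the break-even slides towards the 80% mark the post mentions.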
When we bought our blades from IBM it was 2006/07, and IBM had a huge presence locally (North Harbor in Portsmouth); given a support call, an engineer would usually be onsite with spare parts in 69-90 minutes (yes, he was often dressed in a tweed suit).
But since then, IBM all but pulled out of the SME market in 2009/10, and North Harbor is now practically an IBM-free zone (from what I can tell).
When he says label everything, really folks, *literally* label everything...
- Every rack to ensure people know which one
- every power drop, with phase and circuit details
- every UPS with phase and circuit
- every PDU with UPS/phase/circuit
- every power lead with server details, num of PSUs (1/2) - target phase can be invaluable if using two UPS in adjacent racks to split load/risk
- front *and* back of every server/switch/appliance/modem/etc
- every network cable both ends
- every wall wart with equipment details (so MANY ADSL boxen)
- every telco box with number and owner/purpose
- every cable that transits a frame (regardless of end labels)
A good label machine can be a literal lifesaver
We went to a lot of effort for this.. power and data available on both sides of the building. Building Service "helped" by routing the feeds from both boxes through one entry port "to save some money"... yeah.. the landscaping company decided to replace a tree. Down for 2 days. IT took the flak until someone produced an email with BS demanding control of the entry point for "cost effectiveness".
On the other hand, when we built out, we saved money by not routing all power and data via the ceiling but under a raised floor. The cost savings from IT for cabling paid for the floor for BS..... and there was still money in the budget left over.
Corporates can be a real nightmare of turf wars....
The writer obviously moves in better circles than I do. From past experience, the following still needs to be pointed out far too often:
1. A cloak room is not a server room, even if you put servers inside.
2. You cannot power 100A worth of equipment from a 16A wall socket. Not even if there are 2 of them.
3. You cannot cool the above by opening a window and putting two fans in front of the computers. A domestic AirCon unit won't do much good either. Also, don't put drippy things above sparky things.
4. Ground lines are not for decoration. They need to be used and tested regularly for safety.
5. DIY plugs cannot be wired any which way. Not even in countries that allow Line and Neutral to be swapped.
6. The circuit breakers at the end of your circuits are part of your installation.
7. You cannot protect a rack of equipment with a UPS from PC World. If you really need this, you're going to have to buy something which is very big, expensive and very very heavy. And the batteries are only good for 3 to 5 years.
8. Buildings have structural limits. You cannot put several tonnes of densely packed metal just anywhere. Know your point and rolling loads and then check the building.
9. Electrical fires are nasty. Chemical fires are worse. You need stuff to protect the installation.
10. If you want 24h service, you'll need 24h staffing. A guy that "does computers for us" won't do.
11. A 1 Gbps uplink cannot feed 48 non-blocking 1 Gbps ports.
12. Metered and managed PDUs will save you bacon one day. Buy them.
13. Label all the cables BEFORE installing them. (No, you cannot just "follow the cable" afterwards)
14. My favourite: Don't blow your entire equipment grant on computers. All the stuff above costs money.
And that's just off the top of me head.
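Point 11 in the list above is worth quantifying: the oversubscription ratio is just aggregate edge bandwidth over uplink capacity, and anything above 1:1 means the edge can saturate the uplink. A one-liner sketch:

```python
def oversubscription_ratio(ports: int, port_gbps: float,
                           uplink_gbps: float) -> float:
    """Aggregate edge bandwidth divided by uplink capacity.
    Anything above 1.0 means the edge can saturate the uplink."""
    return (ports * port_gbps) / uplink_gbps

# 48 non-blocking 1 Gbps ports behind a single 1 Gbps uplink:
ratio = oversubscription_ratio(48, 1, 1)  # 48.0, i.e. 48:1
```

Whether 48:1 is a disaster or perfectly fine depends entirely on actual traffic patterns, but you should at least know the number before the users find it for you.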
Well said... and kudos for the breakers comment.
Yes, your server hall staff DO need a key to the breaker cabinets, and also need to be able to get access to distribution risers and plant rooms in a hurry.
Also access to all building layout/wiring diagrams...
Plan ahead for all failure opportunities, as surprises are not what you need under pressure
" there is nothing in a management module that can't be done from ssh and the shell."
You can power on the machine remotely. You can go into the BIOS or EFI if there's anything that needs adjusting. If the system doesn't boot, you can access grub and single user mode. Even if a system's working, if it takes a bit longer than you'd expect to boot to an ssh-able state, it may be informative to watch it boot (although there should be reasonable logs in /var/log too.)
I must admit to having never used a system with an integrated management module, but I can see how they'd be quite useful.
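For concreteness, the usual command-line way into such a management module is IPMI-over-LAN via ipmitool. A sketch that just builds the invocation rather than running it - the host and credentials are placeholders, and some BMCs want different interface options than `lanplus`:

```python
import subprocess  # only needed if you actually run the command

def ipmi_power_cmd(host: str, user: str, password: str,
                   action: str = "status") -> list[str]:
    """Build an ipmitool command line for remote chassis power control
    over IPMI-over-LAN (the protocol behind most BMC/iLO/DRAC power buttons)."""
    if action not in ("status", "on", "off", "cycle", "reset"):
        raise ValueError(f"unsupported action: {action}")
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "chassis", "power", action]

# To actually fire it (needs ipmitool installed and a reachable BMC):
# subprocess.run(ipmi_power_cmd("10.0.0.5", "admin", "secret", "on"), check=True)
```

This is exactly the "remote power button" the earlier comments are arguing about: it works with the host OS absent, hung, or half-booted, which ssh never can.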
There is a middle ground between the horrors of a data centre with type 1 token ring cables everywhere, and the most perfectly crimped accurate to the inch cabling.
We wandered into a comms room with a server feeding a few dozen modems - except for the fact they didn't work. On closer examination it turned out the cables between the modems and the phone sockets had been routed within an inch of their lives - it was a thing of beauty. Unfortunately those cables are quite fragile and don't react well to being bent. So: go through and test each modem individually, then try to find new cables (they all look the same, but aren't).
Likewise, what goes in will some day come out. There are some installations that are a joy to work on - it's racked, and undoing the bolts lets the unit slide out as smooth as silk. On others, it's underneath three other heavy systems, and superglued to the bottom..
Biting the hand that feeds IT © 1998–2019