right. people who rave about flash being the death of hdd always seem to forget that lithography and hdd platters are both 2d media, and therefore follow the same moore's-like law: each shrink gives an exponential density gain to both.
Re: Sooo out of date!
it's funny that people often go on about humidity control for datacenters. the fact is that they're easy to keep at modest numbers (say 15-35% RH), a range that also lets you avoid both humidification and dehumidification. in most climates, you'd have to put some effort into driving the humidity down low enough for static to become an issue.
Re: Three pages
and say silly things like "every DC has a PUE of >= 2" (penultimate paragraph).
Re: Just wondering
no, 12-15 kW/rack is no problem with air.
Re: So uptime sometimes doesn't matter. Nor does data integrity. Sometimes.
Integrity is easy - paxos, raft etc: it's not like you have to give up sensible, cheap, commodity features like ECC. It's only worth paying for "Enterprise" features if you can't do it the modern way for some reason: corporate culture, not smart enough, superstition, etc. The only surprising thing here is how long it's taken the Enterprise culture to start withering away.
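a toy sketch of the quorum idea behind paxos/raft-style replication - write to a majority, read a majority, newest version wins. this is a cartoon to show the intuition, not real consensus; the three-node setup and quorum sizes are made up for illustration:

```python
# Toy illustration of quorum-based integrity: write to a majority of
# replicas, read back from a majority, and take the newest version.
# This is NOT paxos/raft - just the quorum intuition they build on.

REPLICAS = [dict() for _ in range(3)]  # three independent "nodes"
W, R = 2, 2                            # write/read quorum sizes (W + R > N)

def quorum_write(key, value, version):
    acks = 0
    for node in REPLICAS:
        node[key] = (version, value)   # a real system tolerates some failures here
        acks += 1
        if acks >= W:                  # stop once a majority has acknowledged
            break

def quorum_read(key):
    # consult R replicas and return the highest-versioned value seen
    answers = [node.get(key) for node in REPLICAS[:R]]
    return max(a for a in answers if a is not None)

quorum_write("x", "hello", version=1)
REPLICAS[2]["x"] = (0, "stale")        # a lagging replica can't win the read
print(quorum_read("x"))                # -> (1, 'hello')
```

because W + R > N, any read quorum overlaps any write quorum, so commodity boxes with ECC plus this kind of replication give you integrity without "Enterprise" pricing.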
When will we get the important performance numbers, such as rates and latency? A variant of IB with 100Gb is only incrementally interesting, but if it's lower latency, or cheaper, or can do cache coherency, that would be news. Similarly, putting 60 cores on a chip is not exactly news unless it's substantially different (remote cacheline put instruction? threads in the ISA proper?)
I cannot understand who gives a damn about this stuff unless it achieves a reasonable price. The basic hardware costs $100-200/TB, so how does this stuff compare? Or is it just another phallic substitute to enhappify the costs-a-lot-so-must-be-good crowd?
Re: We're doomed I tell you....
But that's actually not true: cloud systems require sysadmins, too. Basically, your sysadmin needs will always be proportional to your IT needs, regardless of whether you outsource the physical datacenter (which is all IaaS is...) If you think going Cloud means cutting staff, you're wrong. You might get rid of some box-monkeys when you outsource boxes, but they probably make minimum wage anyway (and each looked after hundreds of servers, so you had very few of them.)
Re: IBM will slide further down
Why is the mustard so hard to cut? Do you mean "that customer" is just pathologically risk-averse?
I'm really curious what you think is difficult about HPC. Sure, there are a lot of details that contribute to a good cluster, but they're nothing magic. Manage reliability while containing cost. Choose enough but not too much cpu/memory/net/disk. Keep packages up-to-date but don't upset users with too much churn. These are all very straightforward ops things, nothing exotic.
tape-ism is a worldview. for instance, many people will say that it's not a real backup or archive if it's not offline (usually their justification is that mistake or malice can more easily kill an online "backup".) if you rarely recover from archive, that colors your expectations as well: you are rarely exercising the tape, so may have an unrealistic estimate of the actual, silent failure rate. obviously if you more frequently recover from archive, you'll be pained by tape's latency (probably offsite, but even libraries are slow relative to disk seeks.)
in reality, people who take tape seriously write two copies. once you plug that in - doubling the media price, the data rate, and the space - and then factor in environment-controlled storage (offsite, of course), the fact that tape drives are expensive and don't last very long, and that you normally need a separate spooling facility, the costs really do pile up.
it can probably still work well for very large, very sparsely-accessed storage. most people don't bite, though, and online, spinning storage for backup and archive really is the norm. simply being able to verify all your data is a powerful argument.
Re: Longevity of SSD as a medium
hmm, flash is rated for far less than a million writes per bit (3k for common MLC, for instance). of course, an ssd virtualizes that and covers the early failures using spare blocks. but it's completely mistaken to think that you can write an ssd a million times (fully, with incompressible, non-dedupable data).
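the back-of-envelope endurance math is simple: total writes = capacity × P/E cycles ÷ write amplification. the 3k cycles is the MLC figure above; the 256 GB capacity and the write-amplification factor of 2.0 are illustrative assumptions:

```python
# Back-of-envelope SSD endurance: total host writes = capacity * P/E / WAF.
# 3k P/E cycles is the common-MLC figure; the 256 GB capacity and the
# write amplification factor (WAF) of 2.0 are illustrative assumptions.

capacity_gb = 256
pe_cycles = 3_000          # rated program/erase cycles per cell (MLC)
waf = 2.0                  # incompressible data -> controller writes extra

total_host_writes_tb = capacity_gb * pe_cycles / waf / 1_000
full_drive_writes = capacity_gb * pe_cycles / waf / capacity_gb

print(f"~{total_host_writes_tb:.0f} TB of host writes")   # ~384 TB
print(f"~{full_drive_writes:.0f} full-drive writes")      # ~1500, nowhere near a million
```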
Re: Longevity of SSD as a medium
flash retention rates depend not only on erase-based wear of cells, but also on crosstalk-like degradation from operations on nearby cells (even reads). in principle, if you wrote data once to flash (archival, like most tape uses), it would last on the order of 10 years. documentation of this seems fairly sparse, though, probably because that's not the main market. (all flash uses quite powerful ECC, which is fundamentally different from checksums...)
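the ECC-vs-checksum distinction in concrete form: a checksum only *detects* corruption, while an ECC code can *locate and fix* it. a minimal Hamming(7,4) sketch makes the point - real flash controllers use far stronger BCH/LDPC codes, so this is illustration only:

```python
# Why ECC differs fundamentally from a checksum: a checksum only detects
# corruption; ECC locates and repairs it.  Minimal Hamming(7,4) demo -
# real flash uses much stronger BCH/LDPC codes.

def hamming74_encode(d):            # d = list of 4 data bits
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4               # parity over codeword positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4               # parity over positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4               # parity over positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]   # codeword positions 1..7

def hamming74_correct(c):           # c = received 7-bit codeword
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # recheck positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # recheck positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # recheck positions 4,5,6,7
    syndrome = s1 + 2 * s2 + 4 * s3 # = position of the flipped bit, or 0
    if syndrome:
        c[syndrome - 1] ^= 1        # flip it back
    return [c[2], c[4], c[5], c[6]] # extract the data bits

code = hamming74_encode([1, 0, 1, 1])
code[5] ^= 1                        # simulate a single bit flip in storage
print(hamming74_correct(code))      # -> [1, 0, 1, 1], silently repaired
```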
many people would not share your confidence of the retention rate for tape. it could be that we've all been warped by horrible performance of old generations of tape, but then again, that was always the explanation. (verify-after-write was a game-changing tape technology, for instance.)
Re: What is AMD up to?
don't read gamer reviews of intel vs amd power consumption and then draw conclusions about either HPC or webscale applications. these are throughput boxes, where the workload is embarrassingly parallel and (for webscale at least) not flops-heavy. such servers are simply never idle, for instance (or they're being used wrong).
Re: fuzzy math?
that's correct: "enterprise" disks only ever use quite narrow bands of the outer part of the disk, since that gives the lowest latency. these disks are sold on iops, not bandwidth. (which is why, more than ever, they sell to a shrinking niche market. think SSD...)
uh, cloud is expensive
you know Amazon's profit margin is HUGE, right?
Re: Accuracy of results
whohasthefastestcomputer.com is just a flash plugin - very little relationship to the true speed of the computer it runs on, and totally unrelated to HPL.
Re: Let me overclock it plz :D
HPC doesn't generally overclock for two main reasons. first, overclocking is, by definition, running the system outside of spec. unless the specs were stupid, that means less reliable or robust - higher FIT, etc. second, overclocking dramatically increases power dissipation, and operating at scale means optimizing for performance/power, which means a strong preference for lower clocks.
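the power point follows from the rough dynamic-power relation P ∝ C·V²·f: overclocking usually needs a voltage bump too, so the power hit is superlinear. the +20% clock / +10% voltage figures below are illustrative assumptions, not measurements:

```python
# Rough CMOS dynamic-power scaling: P ~ C * V^2 * f.  A clock bump
# usually needs a voltage bump, so power grows faster than performance.
# The +20% frequency / +10% voltage figures are illustrative assumptions.

def relative_power(freq_scale, volt_scale):
    return freq_scale * volt_scale ** 2

p = relative_power(1.20, 1.10)
print(f"{(p - 1) * 100:.0f}% more power for 20% more clock")   # ~45% more
print(f"perf/W scales by {1.20 / p:.2f}x")                     # i.e. it gets worse
```

which is exactly why at-scale operators prefer lower clocks: performance per watt moves the wrong way.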
speculate on AWS margins?
I was looking at AWS prices recently, and even comparing to retail prices for servers, space, power, networking, I don't see how AWS could run at less than 20x markup. that's pretty amazing, even compared to, oh, say Apple. could it be that AWS gives incredibly steep discounts to large customers? or could they have some kind of exorbitant hidden costs?
AWS costs between $250 and $700 per year per ECU; purchasing your own servers, running them for 3 years, and throwing them away will cost you somewhere around $50/ECU-year. if you get hardware at wholesale and build/operate your own datacenters, the cost is probably close to half that.
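plugging those numbers in makes the markup explicit. all figures are the rough estimates from this comment, not measured data:

```python
# Markup implied by the estimates above ($/ECU-year); these are the
# comment's rough figures, not measured data.

aws_low, aws_high = 250, 700      # AWS on-demand, per ECU-year
diy_retail = 50                   # buy servers retail, run them 3 years
diy_wholesale = diy_retail / 2    # wholesale hardware, own datacenters

print(f"markup vs retail DIY:    {aws_low/diy_retail:.0f}x-{aws_high/diy_retail:.0f}x")
print(f"markup vs wholesale DIY: {aws_low/diy_wholesale:.0f}x-{aws_high/diy_wholesale:.0f}x")
```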
Hazra needs to work on his rhetoric. simply claiming pcie3 is "necessary" makes him laughable - a simple appeal to authority. _why_ is it necessary? show us the numbers demonstrating realistic cases where it helps.
the best examples I can think of are high-end IB and some kinds of IO-intensive GP-GPU codes. failing to provide an actual example, he looks like a marketing weasel.
if an attacker so 0wns your network that they control DNS and can MITM all traffic, you're basically screwed. but this doesn't mean you need to cache everything - just the root certs. and those should be updated via your OS's standard update mechanism (after all, you have to trust them just as much as you have to trust your kernel, tcp stack, etc)
this is really the way it should always have been - separating ssl from domain mechanisms was just a historic oddity.
the big change here is that the current nasty, parasitic SSL-cert industry goes away. lots of them won't be happy. no customers will regret this though.
possibly the stupidest cloud vapor yet
why haven't people realized that VMs don't improve security or reduce admin load? fussing with hardware is infrequent anyway.
but obviously diebold is worried about NFC and people having secure and easy ways to buy and/or get cash advances. ATMs are today's buggywhip...
the paper seems to be using a deliberately old version of mcd. the tilera version was also pretty extensively hacked (lockless sharding). what do we know that we didn't years ago from FAWN?
you pay for whatever rank on top500 you want. and it has almost nothing to do with the performance of real codes. but you're right: the interconnect does sound interesting, since it's the only novel part. it's a shame there's so little info available about it.
and this is news how?
it's a bit sad that's the best he could manage, and that he thinks it's worth talking about. compare to the current article about google's plan to manage 1e7 servers - probably very few of them in luggage.
2-3 kW limit is a lie
why do you let asshole vendors get away with claims like that one about 2-3 kW/rack? it's absurdly untrue, but if you pointed that out, it would also implode most of IBM's spin.
fact: it's not hard to build rooms at a bit over 10 kW/rack - normal raised floors and standard Liebert chillers. with rack-back radiators or more careful air engineering, much higher is achievable.
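a quick sanity check that 10 kW/rack is air-coolable, from Q = ṁ·c_p·ΔT. the 20 K inlet-to-outlet rise is an assumed figure; the air constants are standard:

```python
# Sanity check: airflow needed to air-cool a 10 kW rack, from
# Q = m_dot * c_p * dT.  The 20 K inlet/outlet rise is an assumption.

rho = 1.2       # air density, kg/m^3 (roughly, at room conditions)
cp = 1005.0     # specific heat of air, J/(kg*K)
power_w = 10_000
delta_t = 20.0  # temperature rise across the rack, K

m_dot = power_w / (cp * delta_t)          # kg/s of air required
cfm = m_dot / rho * 60 / 0.0283168        # m^3/s -> cubic feet per minute

print(f"{cfm:.0f} CFM")   # ~880 CFM - well within what server fans can move
```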
it's the call that's unsafe
it's not the clumsiness of holding up a cellphone that makes calling from the car unsafe. the problem is that the call itself steals enough of your attention that you are no longer a safe driver. please do not give people the mistaken impression that hands-free makes it safe to call while driving!
amdahl's law, for real
do you really think these guys don't intimately understand parallelism? but look up Gustafson's law instead - that's the relevant one here, since the point of this cluster is to scale up the problem, not to solve a small problem really fast...
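the two laws side by side show why weak scaling is the right lens here. the serial fraction s = 0.05 is an illustrative assumption:

```python
# Amdahl vs Gustafson for a cluster built to scale the problem UP.
# The serial fraction s = 0.05 is an illustrative assumption.

def amdahl(s, n):      # fixed problem size: speedup is capped near 1/s
    return 1 / (s + (1 - s) / n)

def gustafson(s, n):   # problem grows with n: speedup grows ~linearly
    return n - s * (n - 1)

n = 1000
print(f"Amdahl:    {amdahl(0.05, n):.1f}x")   # ~19.6x, stuck near 1/0.05 = 20
print(f"Gustafson: {gustafson(0.05, n):.1f}x")  # ~950x - why big clusters pay off
```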
why cardboard at all?
ultimately, any significant cluster winds up racked. so why not ship racks fully installed? the cluster I care for daily certainly arrived like that: ~30 racks, with a minimum of cardboard, no excess power cables, etc. there _were_ 30 skids, but the shipping company took them away. one rack arrived bashed in, but I'd guess that damage rate is comparable to the cardboard-intensive approach. naturally, preconfigured racks work well with putting leaf switches in each rack, for instance.
have these people looked at disk prices recently? raw storage costs about $250/TB, so everyone with half a brain is wondering what's worth the 10x markup. sure, you have to put the disks _in_ something, and yes, there's still some modest value in 15k rpm and 24x7 vs 9-5 duty cycles. but disk is cheap, and particularly at the block level, it may not make sense to try to centralize - especially if you consider performance.

there is significant value in providing multiprotocol, shared file-level access, especially with features like snapshots, replication, multisite caching, etc. but those are largely a small matter of programming, and therefore hard to charge arms and legs for...