460 posts • joined 15 Oct 2008
Normally I would agree, but if you run a big name CMS you are automatically exposed to all the exploits in it as and when they are discovered, and you will be probed for those along with every other site running that CMS.
If you have a site that is based on a home brewed CMS only used by you, it will most likely not bear the signatures of another commonly used CMS and the scanning bots will simply move on after a cursory glance. The only people who will bother to find obscure holes in your custom CMS are the people who are specifically after you, and if you have someone that determined to get you specifically, they will eventually succeed, but possibly still not as easily as by waiting with a finger on the trigger for another big name CMS exploit to be discovered.
Re: We are altering the deal.
The quote that comes to mind is:
Darth Vader: Perhaps you think you're being treated unfairly?
The real reason...
... that Google Glass hasn't gained huge popularity is the price tag. Yes, it's a cool gadget, but not cool enough to jump through beta program hoops and then spend $1000 on.
I thought Buran was fully unmanned. It was designed so it could take a crew, but the only flight was carried out on full auto-pilot (including landing), with the cockpit controls not even installed.
Very deep pipelines were the very reason why hyperthreading (and more generically speaking SMT) were invented. Context switching requires a full pipeline flush. That means that any instruction that hasn't completed gets reset and stacked away. If the pipeline is deep, that means many instructions could have been in it, so resetting is very expensive.
Adding an extra hardware thread means that you halve the number of context switches.
The other reason is memory latency. With internal speeds of the P4 the wait for RAM became very expensive in relative terms. With twice as many processes scheduled to run, there is twice as high a chance that the data for at least one will be in the on-die cache.
Re: Pentium 4 didn't suck. @Gordan
"You make a very good point, but you ignore that compiling for a particular processor, using all of the features of that processor breaks the "compile once run anywhere" ubiquity of the Intel x86 and compatible processors."
This would be an excellent point if it were the case - but it isn't. When I was doing the above testing I found that code in question built with only P3 optimisations using Intel's compiler performs near identically on the P4 as the code optimized for the P4.
P4 was more sensitive to really bad binary code, which happens to be what most compilers produce even today, but if the developers had done their homework during the years of the previous generation of CPUs (P3) it wouldn't have been a problem. Unfortunately, the disappointing reality is that the vast majority of software sucks, compilers included.
Pentium 4 didn't suck.
It is merely the case of developers (including most compiler developers) being too incompetent to leverage its capabilities efficiently.
See here for relevant performance comparison data with well written C code (no assembly) of P3 vs. P4 using different compilers:
Note that with crap compilers the P4 did indeed perform relatively poorly. OTOH, with a decent compiler (which annihilated a crap yet ubiquitous compiler on any CPU), P4 shows a very significant per clock throughput increase over the P3.
The point being that software is written for hardware, not vice versa. Don't blame the hardware manufacturer if you are too incompetent to use the equipment to its full capability.
Forgot to feed...
... the Chaos Monkey?
It's a marketing hype scam. It's not 2x better than the Moto-G; Moto-G is a pretty high end device at a budget price, and it is not being sold at cost. Unless this thing is made of solid gold there is no justification for its price tag.
Re: Sell to ARM?
AMD will soon stand for ARM Micro Devices.
And this is what happens...
... when all those ex-mining high end GPUs that are no longer profitable flood out onto the second hand market.
And the posters above are right - ATI suck particularly badly when it comes to software, even down to the drivers. Things like drivers causing BSODs when used on motherboards with NF200 bridges (but only with R9 GPUs, earlier ones are fine), long standing bugs in desktop stretching across multiple monitors, endless feature removal (no more custom, non-EDID modes) and the fact that after years of virtualization their drivers still don't reinitialize GPUs properly when the guest VM being given the GPU restarts are nothing short of disgraceful.
Re: It's actually Windows OS X
WindOS X, surely?
I'm pretty sure virtual desktops date back _at least_ 30 years. OpenLook Virtual Window Manager (OLVWM) has it and I remember running that on my Sun3 machine in 1994 (which was already well deprecated back then - Motorola 68020 based, pre SPARC). And I remember using similar virtual desktops long before then.
You mean that isn't "normal" for EE?
I've generally had crawling 3G data on mine in most places since I joined the network years ago (2010 IIRC). For example, there is plenty of 3G signal between Waterloo and Clapham Junction but I hardly ever manage to get a single data packet in/out in that area. It improves slightly further away from central London.
I just assumed this was normal, regular crapness of networks being shit. I was with O2 before and the data on their network was just as bad, so I didn't notice things getting any worse.
Is this definitely a new, different issue, over and above the normal level of uselessness?
Re: Want to see the film version?
It cannot possibly be any worse than "Moon 44".
"And it'd be a Corolla (preferably of the AE86 variety) rather than a corollary."
Are you saying the analogy went more than a little sideways? :-)
Re: There are also 128GB and 256GB models available...
4x more expensive per GB than a proper SSD and 100x slower. Maybe it's time for cameras to start switching to mSATA form factor media...
Marketing Hype vs. Desirability
So what you are saying is that the only people who want iProducts enough to queue for them are actually paid by PR and marketing companies to do so?
I've always suspected as much but now it seems official. Nobody could genuinely be such an iDiot.
Re: @Eugene Crosser
Such countries are few and getting fewer. If all the stolen phones were only usable there, the supply in those countries would balloon to the point where even the high end phones would become worthless, and thus not worth the risk of stealing in other countries.
Additionally, some makes of phone (e.g. Motorola, most likely many others) self-erase when the network tells them their IMEI ID is blocked to protect the data (on top of being encrypted), so at least any sensitive data like one's Google password is protected from casual thieves for long enough to change the passwords, even if the phone isn't noticed stolen for a while.
Hold on a second... IMEI Blocking?
Doesn't IMEI blocking effectively already do this? The IMEI block lists are nowadays supposedly more or less globally synchronised. The net effect should be that the stolen phone, once its IMEI number has been blocked, is going to be useless for more than being used as a tiny WiFi-only fondle-slab.
Rhetoric vs. Delivery
HP Moonshot servers were announced many months if not years ago - yet it is impossible to actually buy one as a regular buyer like you can buy their x86 servers. There has certainly been plenty of rhetoric, but so far in terms of availability and delivery this has been pure vaporware. That is really disappointing.
EL6 has been ported to ARM (RedSleeve) and EL7 port is being actively worked on (RedSleeve and CentOS). Debian and Ubuntu are also very committed to supporting ARM machines. But despite the Linux community having rallied, decent hardware (i.e. with more than 512MB of RAM) is scarce and almost entirely limited to Chromebooks (the old dual core model and the newer 8 core model), with the notable exceptions being the Arndale OCTA (similar spec to the new 8 core chromebook) and the Cornfed Conserver (mini-ITX).
Boston Viridis looks quite awesome but the cost is astronomical (many times what a similar spec machine farm made of Chromebooks would cost).
It would be nice if HP moved from words to deeds when it comes to delivering ARM servers.
Oh, the embarrassment
Error establishing a database connection
I hope the figures are wrong - 48/80 R/W IOPS is pitifully bad, worse than spinning rust. I am guessing there's a "K" missing in there somewhere.
Interlan roaming is a GOOD idea
"Mandatory roaming would reward operators who invested the least in their own rural networks, and increase intra-company haggling."
This is patently not true. Operators will charge each other. Yes, there will be haggling, but ultimately, a small operator that wants increased coverage will end up paying a larger part of their subscription fees to other operators that carry their calls. If an operator can accurately work out what their infrastructure cost is per call minute (which they most certainly do), then they can charge a few % more to the operators that are roaming calls to their network.
It means there is still incentive to have your own network with good coverage, and virtual operators, although they have lower costs, will also end up with even lower profits. This is how "cloud" provision works. You get between 1/3 and 1/2 of bare metal performance due to overheads, but you can spin virtual infrastructure up and down easily. The total amount you pay over medium term is considerably more than having your own hardware would have cost you, but there's no capex, only opex. It's a tradeoff, and if anything roaming would increase competition by introducing more virtual operators (which already exist, e.g. giffgaff or Virgin Mobile).
How is this different from what MySQL's InnoDB buffer pool has always done?
How long before...
... auto-troll feature is implemented in online forums and games? And how will we be able to tell it's not humans trolling?
Re: It all depends on whether you'd prefer to pay 80% tax or be hung from a lamppost
I guess it hasn't occurred to you that emigrating is a very real and viable option for most of those that would be required to pay the 80% tax rate.
Re: deja vu
"80% tax for the rich is so France 2012, I think Hollande's aides must have read the book."
And look how well that is working out.
Re: I mainly agree but...
"The tax and national insurance that you might have to pay during your life can be considered to be a very large liability."
By that measure the inequality is even smaller because those on high incomes have enormous liabilities in terms of lifetime tax and national insurance contributions.
Re: 4K screen prices are already dropping
Having tried it in the past, 40" is waaaay too big for desktop use. I find 30" is the comfortable limit. Nowadays I have a pair of 22" 3840x2400 monitors on my desk (separate VMs), as by far the best compromise available.
RBLs generally work in two ways as far as removal goes:
1) Removal requests (many require a payment to process removals)
2) Time based auto-removal
Just about all have 2), and most have 1).
You can try to chase 1) where available, but ultimately some will retain the IP until 2) takes place. So all you can really do is wait 2 weeks or so for the blacklisting to expire.
Re: Gullible Twat Dribbles into Beard
ChromeOS may be of diminished usefulness, but as has been mentioned in several places on this thread, there is nothing preventing you from putting a fuller flavour of Linux on one.
I run RedSleeve on my Exynos Chromebook, and it is at least as good as any other laptop that size, with better battery life on top.
"I put Linux on an old laptop"
An old laptop will have a battery life several times shorter than the ARM Chromebook. For some of us the purpose of a small, lightweight laptop is disconnected use while away from our desks.
I put full fat Linux on my Exynos Chromebook, and while not quite up to the spec of my ThinkPad T60 (2048x1536 screen was hard to give up), the weight saving and 6x the battery life made for a clear win for the Chromebook in the end.
"Linux can be installed to a memory stick"
Good enough for some uses (e.g. small server/firewall), but for general laptop/desktop use USB sticks (with the exception of the ones that are full fat SSDs with Sandforce controllers in a memory stick form factor, with an astronomical price tag) simply aren't up to the task - they make the user experience quite painful due to terrible random-write performance.
I look forward to seeing your numbers, then, when you reproduce the test I originally mentioned. We can then debate observations with some more numbers to compare.
"Again, you assert "worst case" that is, to be blunt...dated. I run huge databases virtualised all the time. Ones that pin the system with no ill effects and no noticeable difference to metal."
Whatever happened to your previous statement that it is only by pushing the system to the redline that we learn things? Which of the two assertions isn't true? :)
"I also strongly disagree with your assertion that you cannot give up an erg of performance in the name of convenience; that may be your personal choice, it certainly isn't mine."
I said that "you don't necessarily have the luxury of being able to sacrifice any performance for the sake of convenience." I didn't say it is always the case, I said it isn't necessarily the case. To give you some real-life examples, if you are already paying £100K+/month on rented bare-metal servers from someone like Rackspace (I have several clients I do database consultancy for that pay at least that much for their server renting from similar providers), losing 40% of performance would also crank up your costs by a similar amount. That's not an insignificant hit to the bottom line.
"Specifically that "features" within most hypervisors to optimize RAM usage create a dramatic overhead on the system and they need to be weeded out."
If you speak of ballooning and deduplication, I always test with them disabled. I mostly use Xen, and with that what happens is that the domU memory gets allocated when the domU is created, and if ballooning is not enabled there will be no memory shuffling taking place. The domU memory is allocated and never freed or in any way maintained by the dom0 - it's all up to the domU kernel to handle.
"There is also the issue that many virtualised systems = many OSes caching to RAM."
Again, I am not sure what difference that would make, since the caching is done within the guest, and the disks are typically not shared.
I'm not saying that memory I/O isn't the problem - I'm saying that I have not yet heard an explanation for it that makes sense. IMO the biggest difference comes from the increase in the cost of context switching. This has been documented by several people who looked into it. I'm sure you can google it, but here are a few links for a start:
http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html (finds context switching is 2-3x more expensive on ESX)
The databases most of my customers present me with almost always start off badly disk I/O bound, and running so poorly that the entire system is falling apart. By the time I'm done with them (making sure they are appropriately indexed for the queries run against them, some query rewriting to work around the more egregious unoptimizable cases, usually using materialized views, and a handful of other tweaks), they are typically running in the region of 10-20x faster, and are purely CPU limited (possibly memory I/O limited, but this is quite difficult to differentiate).
As for my data being out of date - I'm happy to admit that I have not re-tested the virtualization performance with the MySQL load since November 2012 (ESXi 5.0 or 5.1, I am not 100% sure which).
But the most important point I would like to make is this: "Don't listen to my numbers - produce your own based on your workload." By all means, use my methodology if you deem it appropriate, and/or point out the flaw in my methodology. But don't start with the assumption that the marketing brochure speaks the unquestionable truth. Start with a null hypothesis and go from there. Consensus != truth, and from what you said I think we both very much agree on this.
"I would have to conduct my own testing. My lab results consistently show an ability to saturate RAM bandwidth on DDR2 systems. Your results smell like an issue with RAM bandwidth, especially considering that's where you're pulling your I/O. I will look to retry by placing the I/o on a Micron p420M PCI-E SSD instead."
This implies you are asserting that running virtualized causes a substantial overhead on memory I/O, otherwise saturating memory I/O shouldn't matter. I'm open to the idea that the biggest overhead manifests on loads sensitive to memory bandwidth, although measuring memory bottleneck independently of the CPU bottlenecking isn't trivial.
"I also disagree with your assessment regarding near/far cores on NUMA setups. Just because the hypervisor can obfuscate this for guest OSes doesn't mean you should let it do so for all use cases. If and when you have one of those corner case workloads where it is going to hammer the CPUs ina highly parallel fashion with lots of shared memory between then you need to start thinking about how you are assigning cores to your VMs."
In some cases you may not have much of a choice, if you need more cores for a VM than a single physical socket has on it. For other cases, maybe you could get a little more mileage out of things by manually specifying the CPU socket/core/thread geometry - if your hypervisor supports that. I'd be interested to see your measurements on how much difference this makes on top of pinning the cores.
"Hypervisors can dedicate cores. They can also assign affinity in non-dedicated circumstances. So when I test something that I know is going to be hitting the metal enough to suffer from the latency of going across to fetch memory from another NUMA node I start restricting where that workload can play. Just like I would in production."
Sure you can, but testing with a simple base-line use-case where you have one host and one big VM seems like a good place to start assessing the least bad case scenario on the overhead. As you add more VMs and more arbitration of what runs where, the overhead is only going to go up rather than down.
I'm not even saying that the overhead matters in most cases - my workstation at home is dual 6-core Xeon with two GTX780Ti GPUs (and an additional low spec one), split up using Xen into three workstations, of which two are gaming capable. The two gaming spec virtual machines each have a dedicated GPU and 3 pinned cores / 6 threads, both on the same physical socket (but with no overlap on the CPUs). The performance is good enough for any game I have thrown at it, even though I am running at 3840x2400 (T221). So clearly even for gaming type loads this kind of a setup is perfectly adequate, even though it is certainly not overhead-free. It is "good enough".
But in a heavily loaded production database server you don't necessarily have the luxury of being able to sacrifice any performance for the sake of convenience.
"Frankly, I'd also start asking pointed questions about why such workloads are running on a CPU at all, and can't I just feed the thing a GPU and be done with it?"
That's all well and good if you are running custom code you can write yourself. Meanwhile, the real world is depressingly bogged down in legacy and off-the-shelf applications, very few of which come with GPU offload, and most of which wouldn't benefit due to the size of data they deal with (PCIe bandwidth is typically lower than RAM bandwidth, so once your data doesn't fit into VRAM you are often better off staying on the CPU).
"That makes me very curious where the tipping point between my workloads and your simulation is."
Databases are a fairly typical worst-case scenario when it comes to virtualization. If you have a large production database server, you should be able to cobble together a good test case. Usually 100GB or so of database and 20-30GB of captured general log works quite well, if your queries are reasonably optimized. Extract SELECT queries from your general log (percona toolkit comes with tools to do this, but I find they are very broken in most versions, so I just wrote my own general log extractor and session generator that just throws SELECTs into separate files on a round-robin basis). You will need to generate at least twice as many files as you have threads in your test configuration (e.g. 24 files for a single 6-core/12-thread Xeon). You then replay those all in parallel, and wait for them to complete. Run the test twice, and record the time of the second run (so the buffer pools are primed by the first run). Then repeat the same with a VM with the same amount of RAM and same number of CPU cores/threads (restrict the RAM amount on bare metal with mem= kernel parameter, assuming you are testing on Linux). This should give you a reasonably good basis for comparison. Depending on the state of tune of your database, how well indexed your queries are, and how much it all ends up grinding onto disks, I usually see a difference of somewhere in the 35%-44% ball park. Less optimized, poorly indexed DBs show a lower performance hit because they end up being more disk I/O bottlenecked.
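To make the splitting step a bit more concrete, here is a minimal Python sketch of the round-robin part. It is not the extractor mentioned above; the function names are made up for illustration, and it assumes the statements have already been pulled out of the general log as plain strings (parsing the log format itself is not shown). The last helper assumes the same replay is timed on bare metal and on the VM, so throughput is inversely proportional to wall-clock time.

```python
from itertools import cycle


def files_needed(hw_threads):
    """At least twice as many replay files as hardware threads."""
    return 2 * hw_threads


def round_robin_split(statements, n_files):
    """Deal SELECT statements into n_files buckets in round-robin order.

    `statements` is any iterable of SQL strings (hypothetical input;
    extracting them from the general log is left out of this sketch).
    """
    buckets = [[] for _ in range(n_files)]
    targets = cycle(buckets)
    for stmt in statements:
        # Keep read-only queries only; anything else is skipped
        # without advancing to the next bucket.
        if stmt.lstrip().upper().startswith("SELECT"):
            next(targets).append(stmt.strip())
    return buckets


def throughput_drop(bare_metal_seconds, vm_seconds):
    """Fractional throughput loss for the same replay, VM vs bare metal.

    Assumes a fixed workload, so throughput is proportional to 1/time.
    """
    return 1.0 - bare_metal_seconds / vm_seconds
```

Each bucket would then be written to its own file and replayed by a separate client, e.g. 24 buckets for a 6-core/12-thread box.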
As I said, the I/O saturation was a non-issue because the write caching was enabled, the data set is smaller than the RAM used for testing, and the data set was primed into the page cache by pre-reading all the files (documented in the article). The iowait time was persistently at 0% all the time.
I am glad you agree that the machine needs to be pushed to the redline for testing to be meaningful. Many admins aren't sufficiently enlightened to recognize that.
On the subject of leaving resources dedicated to the host/hypervisor, that is all well and good, but if you are going to leave a core dedicated to the hypervisor, then that needs to be included in the overhead calculations, i.e. if you are running on a 6-core CPU, and leaving one core dedicated to the hypervisor, you need to add 17% to your overhead calculation.
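The arithmetic behind that 17% is simple enough to sketch. The function name below is hypothetical, and the only assumption is that all cores are equal, so the capacity given up is just the reserved fraction of the total.

```python
def reserved_core_overhead(total_cores, reserved_cores=1):
    """Fraction of CPU capacity given up by reserving cores for the hypervisor.

    Assumes all cores are equivalent, so the loss is reserved / total.
    """
    if not 0 <= reserved_cores < total_cores:
        raise ValueError("reserved_cores must be in [0, total_cores)")
    return reserved_cores / total_cores


if __name__ == "__main__":
    # One core out of six reserved for the hypervisor.
    print(f"{reserved_core_overhead(6):.1%}")
```

The same reservation costs proportionally less on a bigger box, e.g. one core out of twelve is roughly half the hit.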
In terms of migrations and near vs. far memory in NUMA, if you have, say, a 2x6 core dual socket system, and you dedicate one core to the hypervisor and the other 11 cores to the test machine, you are still facing the same problem of hiding the underlying topology so the guest OS kernel is disadvantaged by not being able to make any decisions on what is near and what is far - it all appears flat to it when in reality it isn't. While pinning cores will still help, the situation will nevertheless put the virtualized guest at a disadvantage.
Heavily parallel loads suffer particularly badly when virtualized because of the extra context switching involved, and the context switching penalty is still a big performance problem on all hypervisors.
I guess you didn't look hard enough. If you Ctrl-F and search for "esx" it should find you the relevant part of the page. Including the first line in the article. ESXi scored second least bad, after PV Xen.
Hyper-V I don't use, so cannot comment on it.
Xen isn't that configuration error-prone - there isn't that much to configure on it. The only things that make any appreciable difference are pinning cores and making sure you use PV I/O drivers. In the case at hand, the I/O was negligible since everything was done in RAM with primed caches, so PV I/O made no measurable difference. Either way, you wanted a reproducible test case - there is a reasonably well documented one.
A few notes on the test case in question:
1) Testing is done by fully saturating the machine. That means running a CPU/memory intensive load with at least twice as many threads as there are threads on the hardware. For example, if the machine has 4 cores, that means setting up a single VM with 4 cores, and running the test with at least 8 CPU hungry threads.
2) Not leaving any cores "spare". If you have, say, a 6-core system, and you leave a core dedicated to the hypervisor (i.e. you give the only VM on the system 5 cores), you are implicitly reducing your capacity by 17%. Therefore, in that configuration, the overhead is 17% before you ever run any tests.
3) Pinning cores helps, especially in cases like the Core2 which has 2x2 cores, which means every time the process migrates, the CPU caches are no longer primed. This is less of an issue on a proper multi-core CPU, but the problem comes again with extra vengeance on NUMA systems, e.g. multi-socket systems with QPI where if your process migrates to a different core, not only do you not have primed CPU caches, but all your memory is 2-3x further away in terms of latency, and things _really_ start to slow down.
However, many VM admins object to core pinning because it can interfere with VM migration. It's a tradeoff between a bigger performance hit and easier management.
You may find overheads are slightly lower on more recent hardware (e.g. if you are using a single socket non-QPI system), since the above tests were done on a C2Q which suffers from the extra core migration penalty if the target core isn't on the same die as the source core.
On something like a highly parallel MySQL test the results tend to be worse than the compile test implies, but you'll have to do your own testing with your own data (replaying the general query log is a good thing to test with) as I don't have any publicly shareable data sets, and I haven't tested how well the synthetic DB benchmarks reflect the real-world hypervisor overheads.
As a matter of fact - I do.
I'll PM you later today with more details on other good test cases, but as a first pass you might want to take a look here:
"I personally see two conflicting assertions here: that VSAN is so much better than fully virtualised server SANs because it runs in the hypervisor, and that the VSAN-less hypervisor is so awesome it can easily handle any workload – like, er, fully virtualised server SANs."
The hypervisor isn't so awesome that virtualized workloads are overhead-free. At full hardware saturation point the performance hit from running virtualized can be up to 40% on some workload/hardware combinations.
"The Seattle-based server was demoed powering a full LAMP stack, running Red Hat Linux, Apache web server, MySQL, PHP, WordPress, and – this being an event for press and analysts – the obligatory self-congratulatory video."
I suspect you mean Fedora rather than RHEL. The latter is not officially available. The only EL6 port available is RedSleeve, and EL7 ports are being worked on by RedSleeve and CentOS. RH has made no announcement about RHEL7 on ARM yet, as far as I am aware.
Mean time to failure
It's not the mean time to failure that's the problem - it's rebuild time after a failure.
Re: I may be wrong but...
For a good overview of the biggest problem with lead-free solder, google "tin whiskers". It is a major problem, especially for hardware that has to have a long life expectancy.
Re: I may be wrong but...
"Well, on price they're about as good as nVidia for gaming, which is the usual use for a graphics card."
I am inclined to agree, if your use-case is a simple single-monitor non-virtualized setup which is what the vast majority of gamers use.
Having been up to my eyeballs in multi-monitor virtualized setups for the past year and a half, my finding is that the Nvidia drivers handle more complex configurations much better. They generally "just work", whereas AMD cards are problematic on most levels.
Sounds suspiciously like...
... 8xxM series is more or less a relabeled 6xxM series.
880M = 1536 shaders, i.e. 680MX
I don't see anything there to imply actual new GPUs coming out, at best it looks like a minor bit of fiddling with power management.
Cleversafe? You must be joking. They lost all credibility when they disappeared their earlier open source implementation from their website. And even if they hadn't, the dispersed data storage only works when you have an incredibly fast interconnect between the nodes, which demolishes most use cases they were touting for it.
For real-world use GlusterFS is a far more sensible solution.
"I guess IBM is talking about a similar thing to ZFS where it it only rebuilds used blocks on the disc."
While that works when the FS is mostly empty or contains only large files in large blocks, rebuilding a mostly full vdev full of small files can take much longer than rebuilding a traditional RAID because you go from linear read/write to largely random read/write (150MB/s linear speed vs 120 IOPS which could be 480KB/s on 4KB blocks). Then again, if your RAID is under load during the rebuild you are going to end up in the random IOPS limit anyway.
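For a rough feel of the gap, here is a back-of-envelope sketch using the figures above (150MB/s linear, 120 IOPS on 4KB blocks). The function names are made up, KB and MB are taken as powers of 1024, and real rebuilds will not be purely linear or purely random, so treat the results as bounds rather than predictions.

```python
KB = 1024
MB = 1024 * KB


def random_throughput_kb_per_s(iops=120, block_kb=4):
    """Throughput when every I/O is a separate seek: IOPS x block size."""
    return iops * block_kb


def linear_rebuild_hours(data_bytes, mb_per_s=150.0):
    """Rebuild time if the data can be streamed sequentially."""
    return data_bytes / (mb_per_s * MB) / 3600


def random_rebuild_hours(data_bytes, iops=120, block_kb=4):
    """Rebuild time if every block costs one random I/O."""
    bytes_per_s = random_throughput_kb_per_s(iops, block_kb) * KB
    return data_bytes / bytes_per_s / 3600


if __name__ == "__main__":
    used = 500 * 1024 * MB  # hypothetical: 500GB of used blocks
    print(f"linear: {linear_rebuild_hours(used):.1f} h")
    print(f"random: {random_rebuild_hours(used):.1f} h")
```

With these numbers the fully random case is 320 times slower than the fully linear one, regardless of how much data there is.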
You do know Nvidia released the source for their Tegra GPU drivers, right? And they had done so months ago.
Re: Add another layer of indirection...
"Not possible if you are already running under a hypervisor..."
I guess you haven't noticed that several hypervisors have been shipping for years with support for nested virtualization.
Re: This way the virt did fly!
"Our testing showed only about a 9% maximum performance drop versus native tin"
On what RDBMS/OS/hypervisor? Throughput of carefully tuned MySQL on Linux under ESXi at full saturation (100% CPU usage with at least twice as many concurrent active threads as there are CPU cores, hot pre-primed caches, read-only) drops by about 40% when you run virtualized on the same hardware with full hardware virtualization support. KVM is a little worse, Xen is a little less bad, but the difference is in the low single-figure % points.
Did your comparison involve pushing the server to the limit with a highly concurrent load or were you just measuring latency under low load?