Heart Internet outage... three days and counting • The Register Forums

Wednesday 31st January 2018 16:32 GMT Professor Clifton Shallot

KVMhost69

KVMhost69 - some kind of head alignment problem, I guess.

29 0 Reply

Wednesday 31st January 2018 17:47 GMT Korev

Re: KVMhost69

For you Prof. Shallot -->

1 0 Reply
Thursday 1st February 2018 07:11 GMT big_D

Re: KVMhost69

It says a lot about their disaster recovery practices!

I thought the whole point of virtual machines is that they aren't restricted to any physical hardware. Why can't they just spin up a new machine and move the VMs over from the read-only array or recover from backups? Or just move the VMs over to other hosts with free capacity?

My previous employer had a power outage and lost our first production server and the backup server (both mainboard failures) when the power came back 2 hours later (complete industrial estate lost power). We managed to recover all VMs from the first server onto the second server within a couple of hours and jury-rig a backup process until 2 new servers and SANs could be installed a couple of days later.

The VMs were a little sluggish as the second production server was carrying its load and the load from the first server for a couple of days, but everything was back up and working.

The IT manager was on his own and managed to get the infrastructure back online in a few hours. Surely an ISP, whose job it is to provide IaaS, must have some sort of disaster recovery procedure in place to deal with such problems? I mean, that is their bread and butter after all; they are not the underfunded IT "department" of a manufacturing concern...

Edit: I posted this against the next thread... :-S

4 0 Reply

Wednesday 31st January 2018 16:36 GMT Alan J. Wylie

failed drives in the Raid-10 array

Plural! If it's not very bad luck, how long had the array been running degraded?
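
To put a rough number on "very bad luck": a minimal sketch, assuming a RAID-10 array built from two-way mirror pairs and assuming the second failure is equally likely to hit any surviving drive. The drive counts are illustrative only (the array size is not stated anywhere in this thread), and the function name is made up for the example.

```python
from fractions import Fraction


def fatal_second_failure_probability(total_drives):
    """Chance that a second drive failure destroys a degraded RAID-10 array.

    Assumptions (not from the thread): two-way mirrors, and the second
    failure is uniformly random among the surviving drives. After the first
    failure, only the dead drive's mirror partner is fatal to lose, i.e.
    one drive out of the (total_drives - 1) survivors.
    """
    if total_drives < 4 or total_drives % 2:
        raise ValueError("RAID-10 with two-way mirrors needs an even count of at least 4 drives")
    return Fraction(1, total_drives - 1)


if __name__ == "__main__":
    # Illustrative array sizes only.
    for drives in (4, 8, 16):
        print(drives, "drives:", fatal_second_failure_probability(drives))
```

Under those assumptions the odds of a random second failure being fatal shrink as the array grows, which is why a long spell of running degraded, or correlated failures within one batch, looks like the more plausible explanation than pure chance.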

16 0 Reply

Wednesday 31st January 2018 18:50 GMT Adam 52

A long time ago I learnt the hard way that if you buy a set of identical drives and subject them to an identical workload then they all fail at the same time.

15 0 Reply
1. Wednesday 31st January 2018 20:21 GMT Anonymous Coward
  
  and they fail even quicker when a company uses cheaper consumer grade disks instead of disks designed for 24/7 use in a data centre environment...
  
  3 0 Reply
  1. Thursday 1st February 2018 17:52 GMT Alan Brown
    
    "they fail even quicker when a company uses cheaper consumer grade disks "
    
    A lot of stats show that if anything the more expensive drives fail faster. It was certainly the case for all our scsi-UW drives.
    
    There's at least one filesystem out there which works on the basis of "Disks are crap. Deal with it" - where loss of a drive or two isn't a big deal, vs systems with expensive raid systems and expensive disks that don't get adequate supervision and where loss of a drive is a performance-sapping event.
    
    In any case, any outfit which doesn't have monitoring set up to send out a distress call when a RAID drive dies isn't fit for hosting other people's VMs.
    
    0 0 Reply
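
The "distress call when a RAID drive dies" idea above can be sketched as follows. This is a minimal illustration, assuming Linux software RAID (md) and its `/proc/mdstat` text format; nothing in the thread says which RAID implementation Heart uses, the sample text is invented, and `degraded_arrays` is a hypothetical helper name. A real setup would feed the result into whatever alerting is already in place.

```python
import re

# Invented sample in the style of /proc/mdstat (Linux md software RAID).
SAMPLE_MDSTAT = """\
Personalities : [raid10]
md0 : active raid10 sdd[3] sdc[2] sdb[1] sda[0](F)
      1953260544 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]

md1 : active raid1 sdf[1] sde[0]
      976630336 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""

# "md0 : active raid10 ..." starts the description of one array.
ARRAY_LINE = re.compile(r"^(md\d+)\s*:")
# "[4/3] [_UUU]" means 4 members expected, 3 active; "_" marks a missing one.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return (array, expected, active) for each array missing members."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded.append((current, expected, active))
            current = None
    return degraded


if __name__ == "__main__":
    for name, expected, active in degraded_arrays(SAMPLE_MDSTAT):
        print(f"ALERT: {name} is degraded ({active}/{expected} members active)")
```

Run from a scheduler against the live file instead of the sample, a check like this would flag the first failed member, before a second failure turns a degraded array into an outage.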
Thursday 1st February 2018 13:08 GMT CrazyOldCatMan

how long had the array been running degraded?

Was also my first thought. So, either they have had an uncommon run of bad luck (which can be mitigated against by doing things like not having all the drives in the array from the same production batch) or their monitoring and supply arrangements are nothing short of shocking.

I've added them to the reasonably-long list of "companies with which to not do business"..

2 0 Reply

Wednesday 31st January 2018 17:03 GMT Anonymous Coward

Status page shows that two kvmhosts had issues. How many more are going to fail due to Heart's inability to maintain its own servers properly? Surely their data centre team have received notifications of a degraded array? Assuming they actually have a data centre team and they haven't all been poached by a rival hosting company...

Obviously they learnt nothing from the last two major incidents they had in 2016 and 2017, or are they aiming to have an incident every year as some kind of twisted anniversary gift?

5 0 Reply

Wednesday 31st January 2018 17:31 GMT Anonymous Coward

glad I moved

Previous employer used to use a Heart server and I convinced them before moving that it was high time to get an upgrade. Managed to get everything moved before all these incidents really kicked off.

0 0 Reply

Wednesday 31st January 2018 17:42 GMT Adam 52

Re: glad I moved

Why would anyone pay Heart £15/month when you can get something better from AWS for $10/month?

2 3 Reply
Wednesday 31st January 2018 20:26 GMT Ken Moorhouse

Re: Managed to get everything moved

A Heart Bypass Operation.

9 0 Reply

Wednesday 31st January 2018 20:28 GMT Ken Moorhouse

KVM Host Failure

Couldn't they have just bought another monitor from PC World?

7 0 Reply

Wednesday 31st January 2018 22:22 GMT Black Rat

KVM host failure

Because you can't blame mum for unplugging the server so she could hoover your room

3 0 Reply

Thursday 1st February 2018 08:58 GMT Anonymous Coward

#Metoo

I've lost a client over this, which is really saddening.

Never been quite so frustrated and upset with their service in over 5 years

0 0 Reply

Thursday 1st February 2018 09:59 GMT Dominion

Re: #Metoo

Every time there's been a buyout, Heart -> Host Europe -> GoDaddy the level of service has degraded. Another week and I'll be off them for good.

1 0 Reply
1. Thursday 1st February 2018 11:17 GMT Anonymous Coward
  
  Re: #Metoo
  
  I can see myself leaving very soon too :/
  
  1 0 Reply
2. Thursday 1st February 2018 13:11 GMT CrazyOldCatMan
  
  Re: #Metoo
  
  Every time there's been a buyout, Heart -> Host Europe -> GoDaddy
  
  Same happens across the industry (and is the reason why I haven't been a customer of Demon Internet for a number of years..).
  
  Also, GoDaddy is one of those acquisition canaries - when a company gets bought by them you know it's well on the way to dying and it's time to bail. Much like Capita..
  
  0 0 Reply
3. Thursday 1st February 2018 17:57 GMT Alan Brown
  
  Re: #Metoo
  
  "Every time there's been a buyout, Heart -> Host Europe -> GoDaddy the level of service has degraded"
  
  When it comes to virtual hosting, redundancy is best attained by hosting with multiple providers in different locations.
  
  Likewise with RAID cloud storage. Each element in a different cloud provider.
  
  0 0 Reply

Thursday 1st February 2018 10:29 GMT Jay 2

Unimpressed

It doesn't give off the impression that their setup/kit/etc is ideal. Things happen, but you shouldn't really ever be in the position of having a disk fail in a RAID 10 to cause such an outage. If it was multiple disks, then either someone wasn't monitoring or they used crappy disks (no excuse for either if hosting is your business!). And that's before you even mention the fact that their estate doesn't seem to support moving virtual machines to other hardware as other commenters have mentioned.

The only bit I have empathy with is when a disk goes, the RAID controller does its best, but a filesystem makes itself read-only until you can run an fsck/check. But still, at worst that's a reboot and a few hours. And if your business is hosting, you should be able to withstand a piece of physical hardware breaking.

0 0 Reply

Thursday 1st February 2018 12:02 GMT Anonymous Coward

Re: Unimpressed

"when a disk goes, the RAID controller does its best, but a filesystem makes itself read-only until you can run an fsck/check"

That'll be news to most commentards on here! I've never had a situation where a single failed disk in a RAID configuration causes that to happen.

2 0 Reply

Thursday 1st February 2018 23:20 GMT Anonymous Coward

If you could sense the conflict of emotions I'm feeling right now ...

Where I used to work, they had a massive push to move everything to Heart and get rid of all in-house "anything". No they (manglement) didn't do any of the things I mentioned in internal discussions - like looking into their capabilities to see if what they promised was something they could provide. Their portal isn't the best I've used, especially for DNS which is (IMO) a right PITA to work with.

Anyway, now I've been made redundant, reading this gives me a right feeling of Schadenfreude. They (my ex-employer) got rid of everyone with a clue and have been busy making screwup after screwup.

On the other hand, I feel bad for any of the customers who have been affected by this. As a professional I really dislike seeing customers screwed up - and then usually lied to to avoid taking the blame.

But there is another hosting outfit that promises "we can move your entire Heart hosting setup to us - automatically". Apparently it was set up by the people who set up Heart before it got sold on.

2 0 Reply

Sunday 4th February 2018 11:51 GMT Anonymous Coward

The Cloud...

Other people's computers you have no control over.

0 0 Reply
