A 43-second loss of connectivity on the US East Coast helped trigger GitHub's 24-hour TITSUP (Total Inability To Support User Pulls) earlier this month. The bit bucket today published a detailed analysis of the outage, and explained that the brief loss of connectivity between its US East Coast network hub and the primary US …

COMMENTS

Post your comment

House rules Send corrections

Add to 'My topics'

Wednesday 31st October 2018 01:17 GMT Martin-73

One of the biggest fails here...

Seems to be the simple "don't rely on an item, to report news about said item.

That being said, they're learning from it.

23 0 Reply
1. Friday 2nd November 2018 01:05 GMT Anonymous Coward
  
  Re: One of the biggest fails here...
  
  "left it with an inconsistency between two MySQL databases."
  
  Hopefully Microsoft will replace that crap with SQL Server AAG and they will never have that problem again.
  
  1 1 Reply
Wednesday 31st October 2018 01:44 GMT Anonymous Coward

Why did GitHub take a day to resync

“To improve performance at scale .. We use Orchestrator to manage our MySQL cluster topologies and handle automated failover.”

Demonstrating the potential instability introduced by excessive complexicity in a system. There's a name for this that escapes me at this time.

15 2 Reply
1. Wednesday 31st October 2018 07:12 GMT Ken Moorhouse
  
  Re: We use Orchestrator
  
  This may be painful to think about, but you do know what rhymes with "orchestrator"?
  
  Your data at the mercy of the orchestrator. Mwah ha ha ha.
  
  1 0 Reply
  1. This post has been deleted by its author
  2. Wednesday 31st October 2018 12:48 GMT StuntMisanthrope
    
    Re: We use Orchestrator
    
    I think it’s about 15 miles apart, though I’ve heard it’s gone up a bit with your own string. d:-)
    
    0 0 Reply
  3. Wednesday 31st October 2018 13:02 GMT StuntMisanthrope
    
    Re: We use Orchestrator
    
    On a plus note, I like the idea of a rolling board, helps to sleep at night especially with regards to day at the same time. #rest
    
    0 0 Reply
    1. Wednesday 31st October 2018 14:44 GMT Danny 14
      
      Re: We use Orchestrator
      
      Ive had fewer brown trouser moments BECAUSE of clustering my resources. We arent large scale but we do use MS clustering, it has kept things going with the core switches wents down due to faulty UPS, clustering on 2nd site failed over nicely and replicated storage back when we fixed the UPS. SQL server has failed over when spod accidentally triggered an update that rebooted one of the custers - this was a "nice" failover but worked just as well.
      
      big boys tend to have bigger problems though.
      
      5 0 Reply
2. Wednesday 31st October 2018 09:44 GMT Anonymous Coward
  
  Re: Why did GitHub take a day to resync
  
  "There's a name for this that escapes me at this time"
  
  Jurassic Park!
  
  5 0 Reply
3. Wednesday 31st October 2018 10:09 GMT Anonymous Coward
  
  Re: Why did GitHub take a day to resync
  
  I'm pretty sure that over the last 10 years I've had more outages, disruption and system down recoveries due to failover & cluster systems than I ever had on a single server running with just RAID and a nightly tape backup.
  
  16 1 Reply
  1. Wednesday 31st October 2018 10:51 GMT Anonymous Coward
    
    Re: Why did GitHub take a day to resync
    
    And I'm pretty sure that over the past 10 years that the engineering complexity that we've invested in at $JOB has saved vastly more downtime than it created when the complexity itself has issues.
    
    3 0 Reply
Wednesday 31st October 2018 02:20 GMT bombastic bob

re: Why did GitHub take a day to resync

"Demonstrating the potential instability introduced by excessive complexicity in a system"

The term I'm thinking of starts with 'cluster' (how aproh-pooh)

Admittedly, I've never had to set one of these up. I would think that a transaction-based system could simply apply the backlogged transactions, right???

[you know, like taking a failed RAID drive off-line and swapping in a new one]

1 10 Reply
1. Wednesday 31st October 2018 02:29 GMT Long John Brass
  
  Re: re: Why did GitHub take a day to resync
  
  DB replication can be a bastard. Think monotonically increasing ID style keys. If both nodes have different data records for the same primary key entry for ID 12345 that are then linked into other data records, then you have to untangle the mess. In dual node systems you can do odds & evens, but in multi node clusters it gets a lot uglier
  
  Not a DBA, but I pretend sometimes when the real DBAs are away.
  
  36 0 Reply
  1. Wednesday 31st October 2018 07:59 GMT Phil O'Sophical
    
    Re: re: Why did GitHub take a day to resync
    
    And this is why you shouldn't have automatic failover in disaster recovery situations. For local HA, with redundant equipment, when a disk, switch or server goes down automatic is fine. For long-distance DR it's well-nigh impossible for an automated system to have a full view of what happened (recoverable network outage versus primary site disappearing in a ball of nuclear fire, for example). With a person in the loop they could have looked at the situation, perhaps called an admin on the other coast, and said "oh, it's just a transient network outage, best solution is wait until it comes back.". Automate the changeover by all means, once the decision is taken, but don't make it automatic.
    
    25 0 Reply
    1. Wednesday 31st October 2018 09:29 GMT Crypto Monad
      
      Re: re: Why did GitHub take a day to resync
      
      What you also need is a mechanism which *guarantees* that there is no split-brain scenario: a provably-correct consensus protocol like Paxos or Raft. You want writes to be committed everywhere or not at all.
      
      Some databases like CockroachDB integrate this at a very low level; whether it is fast enough for Github's use case is another question.
      
      14 0 Reply
    2. Wednesday 31st October 2018 10:13 GMT Anonymous Coward
      
      Having a human to intervene
      
      Sometimes time is of the essence with a failover. If you do it correctly and fast you might see zero disruption and when it's the storage that is at issue then even delaying for a few seconds can cause you to lose the LUN or cause the application to go awry.
      
      Maybe just a bit more intelligence in the failover, done well it could make more intelligent failover decisions with a human just needing to disable the system if there is known engineering works going on that might accidentally trigger it.
      
      2 0 Reply
      1. Wednesday 31st October 2018 11:31 GMT Spazturtle
        
        Re: Having a human to intervene
        
        The big question is, why was the latency between these 2 datacenters 60ms? Were these 2 datacenters really 11k miles apart? Surely nobody would try automatic failover for an SQL database without the 2 sites being directly connected by a fiber cable right?
        
        0 0 Reply
        
        Wednesday 31st October 2018 14:23 GMT Anonymous Coward
        
        Re: Having a human to intervene
        
        Were these 2 datacenters really 11k miles apart?
        
        Latency usually covers round-trip time, and speed-of-light in fibre is only 2x10^8 m/s, so 60mS is more like 6000km of pure fibre separation. That is roughly the distance between US East & West coasts by fibre, but your point is still valid. It's too slow for synchronous replication, so you cannot guarantee that both sides are 100% in sync. Any failover solution has to take that into account.
        
        7 0 Reply
        
        Wednesday 31st October 2018 14:47 GMT Danny 14
        
        Re: Having a human to intervene
        
        your RID master can be in a cluster of course, just have a different set of seeds that will not overlap. We are talking MANUAL failover when your RID master is clustered.
        
        1 0 Reply
    3. Thursday 1st November 2018 14:12 GMT the spectacularly refined chap
      
      Re: re: Why did GitHub take a day to resync
      
      For local HA, with redundant equipment, when a disk, switch or server goes down automatic is fine.
      
      This strikes me as being a LAN issue though at one or the other site. A fault that caused connectivity to rather than on the WAN link to be lost. The reason I suggest this is that in default trim a 43 second outage is suspiciously close to the time traditional STP will take to kick in and rearrange the active links.
      
      These days you would expect people to be using at least RSTP which is almost info infinitely faster to respond but only for some of the more common faults. In particular it doesn't detect the case where a link fails on one direction in a consistent manner, in that case you are still dependent upon the original protocol to sort it out with the delays that come with it.
      
      1 0 Reply
  2. Wednesday 31st October 2018 10:01 GMT Chris Hills
    
    Re: re: Why did GitHub take a day to resync
    
    One concept Microsoft (afaik) came up with is that of a RID master. It gives out blocks of numbers to other servers upon request. When the server passes the watermark it will preemptively request a new block. In the case of a loss of connectivity, it can still create new objects until the block is exhausted. I thought this could well be applied to database replication.
    
    9 0 Reply
    1. Wednesday 31st October 2018 13:35 GMT DaLo
      
      Re: RID master
      
      DO you then cluster the RID master over geographic locations to ensure redundancy? Do you then need a RID master master to oversee your RID cluster?
      
      How will the ID blocks work when they are re-merged, the transactions will be all over the place? If you don't need the ID for anything useful outside of a unique index then you could just do a compound index with the server name or start your index block at different starting points that will never overlap.
      
      In reality it isn't so much about inserting data once or reading data, it is about changing data or a set of transnational commands that needs to be done in a set sequence where some of that sequence may exist on one server and some on another and where the times could be ms out. Or where some data is changed that only exists on one system, or has only made it into one index.
      
      3 0 Reply
Wednesday 31st October 2018 04:01 GMT jamesb2147

Better for democracy

Should have used a distributed NoSQL DB, then they'd never have any problems!

7 12 Reply
1. Wednesday 31st October 2018 06:49 GMT Anonymous Coward
  
  Re: Better for democracy
  
  Right, because NoSQL DB cannot suffer from split brain.
  
  18 2 Reply
  1. Wednesday 31st October 2018 12:34 GMT Andy Landy
    
    Re: Better for democracy
    
    mongo db is web scale!
    
    6 0 Reply
2. Wednesday 31st October 2018 07:58 GMT A Non e-mouse
  
  Re: Better for democracy
  
  I think you forgot the sarcasm tags...
  
  16 0 Reply
3. Wednesday 31st October 2018 09:15 GMT Charlie Clark
  
  Re: Better for democracy
  
  Yeah, with the right webscale blockchain system this could never have happened! :-)
  
  19 0 Reply
  1. Wednesday 31st October 2018 14:30 GMT Anonymous Coward
    
    Re: Better for democracy
    
    "Yeah, with the right webscale blockchain system this could never have happened! :-)"
    
    Technically you're correct - once the webscale blockchain consultants legged it with all the money, you'd be forced to use a single host and bits of string to connect everything together, completely eliminating the possibility of split brain.
    
    9 0 Reply
4. Wednesday 31st October 2018 10:36 GMT el kabong
  
  Better yet, they should have put it on a blockchain
  
  Then they'd never have any of those problems, they'd never get that far.
  
  8 0 Reply
5. Friday 2nd November 2018 01:05 GMT Anonymous Coward
  
  Re: Better for democracy
  
  "Should have used a distributed NoSQL DB, then they'd never have any problems!"
  
  Why not just write to /DEV/NUL - it would have similar data integrity.
  
  2 1 Reply
Wednesday 31st October 2018 07:03 GMT Ken Moorhouse

Database replication is hard

The average business that is hell-bent on sticking data in the cloud needs to bear this in mind. You're buying into whole layers of sophistication that are totally unnecessary. One data repository and backups taken during pauses of data writing is all that many of us really want.

Sure, outages consequently mean inaccessibility to your data, but this is the price paid for being hell-bent on sticking data in the cloud. However, if you count up the risk points of failure of a cloud solution and compare that with the risk points of failure in an on-premises solution and draw your own conclusions.

I'll agree that github, et al are slightly different animals to say, accounts data, but anything basic that can be done in the cloud can be done on-prem. If bells-and-whistles are important then you're incrementing that risk points of failure counter.

12 0 Reply
1. Wednesday 31st October 2018 07:56 GMT A Non e-mouse
  
  Re: Database replication is hard
  
  ~~Database~~ replication is hard.
  
  FTFY.
  
  17 0 Reply
Wednesday 31st October 2018 07:56 GMT A Non e-mouse

No s*** Sherlock

The brief outage ... caused problems in the organisation's complex MySQL replication architecture

If you have to define a system as "Complex", you can guarantee that when it goes wrong, it will go wrong in a manner that will take a long time to clean up afterwards.

I know you're operating at scale, but Keep It Simple, Stupid. Simple is the only way you stand of keeping big things like this running.

19 0 Reply
1. Wednesday 31st October 2018 14:07 GMT el kabong
  
  Complication posing as complexity.
  
  Most of the apparent complexity you see in IT is fake, you can easily complicate trivial matters and make them look complex to the lazy eye, just ask a lawyer how it's done.
  
  Complication is not the same as complexity, complexity, to be real, must be an essential attribute of the problem while complication results from either laziness, ignorance or ill intentions.
  
  5 0 Reply
Wednesday 31st October 2018 08:41 GMT Lee D

I'll explain that problem to you in two words.

Split-brain.

You had two places that both thought they had the "definitive" copy of the database, but didn't, because they didn't have what the other side had, because both were pretending to be in charge and taking any orders that came to them and applying them, even if they could never tell the other side about those orders.

Note that this is perfectly possibly with ANY replication setup that works in a failover mode whereby one place - upon detection that it can't talk to the other place - becomes a full-service node. It starts taking orders from the waiters and giving them to their own chefs, without realising that other places are also taking orders and giving them to their chefs, and then you try to merge the kitchens back together and you just get chaos.

It's so prevalent that you can do it in Hyper-V failover replicas, DFS, MySQL or anything else that tries to "take over" when a site goes down without proper shared "journalling" of some kind, or a forcible master server handing off work.

If you chop your network in two, and expect both halves to get full service, you need a way to resolve split-brain afterwards. That can either be something like DFS or Offline Files does (hey, we have these conflicts, sorry, nothing we can do and you need to manually check what you wanted), or you have to literally put in intermediary services that can handle and clean up the situation.

The job is almost impossible automatically... someone commits something to site A... it times-out because of the fault but gets to site A storage. They retry, they get balanced over to site B, you now have an *almost* identical commit to site B, but they both differ. Or you have one developer commit his branch to site A, another to site B, they conflict and now you've messed both side's entire tree. Leave it for 40 minutes with a crowd of developers and before you know it you have entire trees with two completely conflicting trees that can't be merged because the patches change the same parts of the code and who do you reject now? Plus one of those developers is going to have to rebase their tree but may have done thousands of hours of work based on the deprecated tree and they won't be happy.

And I've tried to explain this to people too... yes, just slap in a failover / replica, magic happens and it all works when you join them back.

No. It doesn't. The only way to do that is to have a load-balanced queuing/transaction system whereby the underlying databases are separate, but there's only ever one "real" master, and that gets committed to by a single ordered list of processes that will always feed that data back in the same order to the same system. Literally, one side "takes orders" but does nothing with them. Until the join is fixed and then they hand them off to the shared kitchen. You don't lose any orders, but they don't get acted upon immediately (i.e. you accept the commit, but on the failed site, it's never reflected in the tree). Even there, you have problems (maybe the commit you accepted wouldn't be valid against what is NOW the master tree that's taken other commits in the meantime).

Such things - and their solutions - introduce all kinds of problems with your "distributed / fail-safe" setup.

And all because you didn't think it through and just assumed it would all carry on working perfectly like magic. If you have a blip, and you failover, the failover will work perfectly. But before you can ever resume service, you have work to do that if you haven't considered it in your design turns into a mess with hours of downtime and potentially accepted-but-then-disappearing commits.

21 1 Reply
1. Wednesday 31st October 2018 08:53 GMT Bronek Kozicki
  
  Agreed, distributed databases is a hard problem. The fact that they used MySQL does not make it any bettter. The solution you mention requires a totally asynchronous client, which may not work for the database users. The other solution is to use Convergent Replicated Data Types, and yet another is to simply fail one side due to the lack of quorum.
  
  4 0 Reply
  1. Wednesday 31st October 2018 09:59 GMT Alan Brown
    
    "The fact that they used MySQL does not make it any bettter."
    
    Indeed. "Doesn't scale well" is what springs to mind, from experience.
    
    MySQL is very good at what it's designed for (read heavy operations) but once you get past a few tens of millions of entries there are better DBs that have smaller memory/CPU footprints, which don't have these kinds of problems with replication.
    
    People become highly resistant to change and attempt to keep banging the square peg into the round hole long after it's apparent it doesn't fit, rather than learn how to drive another DB. PostgreSQL and friends may seem big and scary upfront, but it's actually a helluva lot easier to run than trying to tune a large-scale MySQL installation - instead of lots of knobs, PgSQL tends to just get on with it.
    
    (Personal experience: Same hardware, 500million entry DB, write heavy - PgSQL ended up about 5 times faster than MySQL and using 20% of the resources. Replication was a doddle too, so migrated to it. Next manager along didn't understand any of it, so ordered it ripped it out in favour of MySQL and then spent a year trying to make it work reliably.)
    
    16 0 Reply
    1. Wednesday 31st October 2018 12:47 GMT Pirate Dave
      
      "Next manager along didn't understand any of it, so ordered it ripped it out in favour of..."
      
      The mark of a truly gifted Manager is that they can come into a new job and immediately decide to rip out systems that have been reliably ticking over for 10 years. All because they read a "Best Practices" article on a forum somewhere. That's why they get the big money.
      
      Me, bitter? no...
      
      16 0 Reply
      1. Wednesday 31st October 2018 14:59 GMT Danny 14
        
        failover works perfectly if you dont have automatic failover. Its the automatic bit that makes split brain possible. Our hyper-v site failover is manual. We issue a command to the cluster telling it which site is active. When the main site goes down we issue command to failover to backup site, when main site comes back up then we issue the return and the backup falls back and replicates back. If we lose comms between backup and main then it sucks to be at the backup site (as we dont issue the failover).
        
        Simple.
        
        4 1 Reply
        
        Friday 2nd November 2018 01:05 GMT TheVogon
        
        "failover works perfectly if you dont have automatic failover. Its the automatic bit that makes split brain possible."
        
        That's what cluster witness servers are for. We have only one server in our office location and that's what it is used for. Or you can use Azure.
        
        1 0 Reply
Wednesday 31st October 2018 09:05 GMT Anonymous Coward

I thought this ("split brain"?) was

a solved problem in the dark days long before DevoPS.

I seem to recall one mostly forgotten set of vendors calling it something like "disaster tolerance"; other similar options may have existed elsewhere.

Then the commodity side of the industry decided it was too hard or too expensive or too unfashionable and started sprouting BS like "eventual consistency". I mean, EC probably works ok for presentation-layer stuff where there is no correct answer and the money comes in anyway, but for stuff where things are either right or wrong and there's real stuff at stake, it seems like it might be time to look again at stuff from a couple of decades ago, before the knowledge is lost forever.

9 0 Reply
1. Wednesday 31st October 2018 12:19 GMT Anonymous Coward
  
  disaster-proof .ne. marketing-proof
  
  https://www.youtube.com/watch?v=qMCHpUtJnEI
  
  0 0 Reply
Wednesday 31st October 2018 09:55 GMT Anonymous Coward

Weird timing

Microsoft hasn't as much as *thought* about integrating this setup and the whole show already starts to exhibit Microsoft Windows - alike reliability problems.

Must be their aura..

:)

13 5 Reply
1. Thursday 1st November 2018 09:42 GMT The Original Steve
  
  Re: Weird timing
  
  I'd wager that Active Directory is one of the top multi-node database systems in use across the globe, and whilst I've seen issues in my two decades of experience it's been so incredibly rare compared to this type of SQL split brain I wouldn't count Microsoft out as being able to produce a DB which is top tier in terms of it's resilience during a disaster.
  
  3 0 Reply
  1. Friday 2nd November 2018 01:05 GMT TheVogon
    
    Re: Weird timing
    
    "I wouldn't count Microsoft out as being able to produce a DB which is top tier in terms of it's resilience during a disaster."
    
    What you are looking for is called SQL Server Active Availability Groups.
    
    0 0 Reply
Wednesday 31st October 2018 12:37 GMT SVV

Sounds to me like they tried to be too clever

Why not just use distributed transactions and two phase commit? That's the way to guarantee consistency. Have they just prioritised write speed instead, and added a we'll synchronise later for consistency kludge on top?

The region specific nature of this fuckup will have impressed the new owners though. See, it happens to everyone!

6 0 Reply
This post has been deleted by its author
Wednesday 31st October 2018 12:55 GMT Dave Pickles

Wasn't Split Brain solved 30 years ago?

VMS clustering had the concept of 'quorum'; if there should be N nodes accessible then the cluster would stop if the number fell below int(N/2)+1. This made it impossible for the cluster to partition into two operating sub-clusters.

12 0 Reply
1. Wednesday 31st October 2018 13:56 GMT A Non e-mouse
  
  Re: Wasn't Split Brain solved 30 years ago?
  
  But hipsters don't need any of your ancient magic. Cloud, DevOps, NoSQL DBs, etc are where it's at.
  
  8 0 Reply
2. Wednesday 31st October 2018 14:38 GMT Phil O'Sophical
  
  Re: Wasn't Split Brain solved 30 years ago?
  
  Yes, VMS cluster, Solaris cluster, Veritas cluster, they all do that for local HA, and it works. Disk vendors like EMC and HDS can also do it for their distributed storage.
  
  It gets much harder once you bring geographical distances into the picture. Fully synchronized updates so that both sites always have identical copies of data gets too slow once you go much past a few mS latency, so you have to live with the fact that each site has a different view.
  
  There are solutions, usually requiring more than three sites with a mix of local/remote and sync/async replication. That gets expensive.
  
  Oracle DataGuard (yeah, I know it's the evil empire, but the DB is still good) has a useful feature called "far sync" where the data exists only at the two main sites, but there are additional copies of the redo logs maintained synchronously at nearby sites. If a main site is lost, the survivor will automatically fetch & replay the logs to make sure it has caught up before becoming active. Still needs multiple sites, but the intermediate ones only need small systems & storage for logs, not the big servers & TB disks it would require to keep another copy of the whole DB.
  
  4 0 Reply
  1. Wednesday 31st October 2018 17:37 GMT Martin Gregorie
    
    Re: Wasn't Split Brain solved 30 years ago?
    
    Yes, VMS cluster, Solaris cluster, Veritas cluster, they all do that for local HA, and it works. Disk vendors like EMC and HDS can also do it for their distributed storage.
    
    Don't forget Tandem NonStop systems, which distributed the whole kit and caboodle (CPUs, disks, disk controllers) over geographic distances as a networked set of up to 16 nodes, each containing between 2 and 16 CPUs with all hardware duplicated and hot-pluggable. The whole distributed system was fault tolerant by design with automatic fail-over for all components. Result: 99.99% up time.
    
    ... and that was early '80s technology, so there's really no excuse for current systems supporting worldwide 24x7 services to offer less guaranteed availability than was available in 1984.
    
    3 0 Reply
Wednesday 31st October 2018 19:05 GMT Claptrap314

A bit behind the times?

"There's an effort underway to support “'N+1 redundancy at the facility level'"

1) If you're not running N+1, then you've not learned the first lesson of SRE.

2) For the repos (not all the cruft GitHub has added to them), unless you have a hash collision, resolving "some writes here, some writes there" is trivial by the design of git.

3) For the local cruft, adding shadow git repos to handle their changes should be at worst simple. Again trivial to recover from split brain.

4) For the non-local cruft (ie: webhooks), yes, there would be work to be done to deal with the fact that apparently there are people allowed near keyboards that think that IP is a reliable system for data transport.

What am I missing here?

1 0 Reply

POST COMMENT House rules

Not a member of The Register? Create a new account here.

Topics

Special Features

Vendor Voice

Resources

COMMENTS

One of the biggest fails here...

Re: One of the biggest fails here...

Why did GitHub take a day to resync

Re: We use Orchestrator

Re: We use Orchestrator

Re: We use Orchestrator

Re: We use Orchestrator

Re: Why did GitHub take a day to resync

Re: Why did GitHub take a day to resync

Re: Why did GitHub take a day to resync

re: Why did GitHub take a day to resync

Re: re: Why did GitHub take a day to resync

Re: re: Why did GitHub take a day to resync

Re: re: Why did GitHub take a day to resync

Having a human to intervene

Re: Having a human to intervene

Re: Having a human to intervene

Re: Having a human to intervene

Re: re: Why did GitHub take a day to resync

Re: re: Why did GitHub take a day to resync

Re: RID master

Better for democracy

Re: Better for democracy

Re: Better for democracy

Re: Better for democracy

Re: Better for democracy

Re: Better for democracy

Better yet, they should have put it on a blockchain

Re: Better for democracy

Database replication is hard

Re: Database replication is hard

No s*** Sherlock

Complication posing as complexity.

I thought this ("split brain"?) was

disaster-proof .ne. marketing-proof

Weird timing

Re: Weird timing

Re: Weird timing

Sounds to me like they tried to be too clever

Wasn't Split Brain solved 30 years ago?

Re: Wasn't Split Brain solved 30 years ago?

Re: Wasn't Split Brain solved 30 years ago?

Re: Wasn't Split Brain solved 30 years ago?

A bit behind the times?

POST COMMENT House rules

Enter your comment

Add an icon

Other stories you might like

911 goes MIA across multiple US states, cause unclear

Sacramento airport goes no-fly after AT&T internet cable snipped

US-EAST-1 region is not the cloudy crock it's made out to be, claims AWS EC2 boss

Cyberattack hits Omni Hotels systems, taking out bookings, payments, door locks

Datacenter outages are on the decline, but when they hit, they hit hard

Over 170K users caught up in poisoned Python package ruse

Tech trade union confirms cyberattack behind IT, email outage

GitHub fixes pull request delay that derailed developers

McDonald's ordering system suffers McFlurry of tech troubles

LinkedIn's turn to fall over: Outage hits thinkfluencer hub

World-plus-dog booted out of Facebook, Instagram, Threads

What is GitHub Copilot Enterprise? You and your org just might find out firsthand

About Us

Our Websites

Your Privacy