The best and worst of GitHub: Repos wiped without notice, quickly restored – but why?

Game designer Jason Rohrer has had a bad week, discovering that his 23 code repositories representing 15 years of development and community contributions were wiped from GitHub. "I can't believe how easy it apparently is to have someone's life work taken down from Github," he said on his forum, fortunately hosted elsewhere. " …

  1. Andraž 'ruskie' Levstik

    Backup - ever heard of it?

    Why is this a surprise to people? Why would you ever store something important only on someone else's system?

    Always have backups of anything that's important to you.

    1. Paul Crawford Silver badge

      Re: Backup - ever heard of it?

      The problem here that was pointed out is not the GIT repository as that ought to be local as well (so at least a 2nd copy, if not more should your team or in-use computers number more than 1), but that a lot of discussions and bug-tracking are only held on the github server and (I presume) lack any way to mirror that aspect locally. Something to be fixed?

      1. LDS Silver badge

        Re: Backup - ever heard of it?

        It means you have something on someone else's system without any way to back it up.

        It's obviously something that needs to be fixed - and if GitHub doesn't fix it, you should fix it yourself - before losing the data.

        1. ckm5

          Re: Backup - ever heard of it?

          There are plenty of ways to back up GitHub without syncing the repo to your local machine - sync it to Google or another git hosting service or use something like BackHub which is a service that backs up GitHub repos....

          1. Joe W Silver badge

            Re: Backup - ever heard of it?

            Except for the issue with the bug tracking and discussions, I guess?

            1. Arbee

              Re: Backup - ever heard of it?

              You guess wrong - a 10 second check on their website (https://backhub.co/) reveals:

              > Backups include metadata, such as issues

              I don't understand why lazy comments like this get modded +20/0, while the helpful parent comment gets +5/-3.

    2. Hans Neeson-Bumpsadese Silver badge

      Re: Backup - ever heard of it?

      Why would you ever store something important only on someone else's system?

      Especially if that is provided as a free service. In the absence of an exchange of cold, hard cash and a contract with an SLA attached, I will always assume that the service provider could pull the rug from under my feet with no notice and no comeback.

      You (don't) pay your money, you take your choice.

      1. JohnFen Silver badge

        Re: Backup - ever heard of it?

        "In the absence of an exchange of cold, hard cash and a contract with an SLA attached"

        I think it's best to assume that even if you do have a contract with an SLA.

    3. DavCrav Silver badge

      Re: Backup - ever heard of it?

      "It is not quite as bad as it sounds, since the way it works means that developers have a copy of the repository on their own machine. But it is still pretty bad."

      1. J.G.Harston Silver badge

        Re: Backup - ever heard of it?

        NO NO NO NO NO. The copy isn't on your own system, the ***MASTER**** is on your system. The ****COPY**** is on GitHub.

        1. ckm5

          Re: Backup - ever heard of it?

          Not really - because git is designed to be distributed, there isn't really a 'master' and 'copy'. Also, that doesn't really apply if you have more than one person working on a repo....

          1. MacroRodent Silver badge

            Re: Backup - ever heard of it?

            > because git is designed to be distributed, there isn't really a 'master' and 'copy'.

            Usually teams of developers do not sync directly between each other. In practice some server (or servers) is effectively the master, and developers pull and push code to it. And Github of course provides such a git server + added tools like the bug tracking and a web page that you can use as the starting point for getting into a project. But it is really cool that if Github disappears or "goes evil", the developers still have between them both the latest code version and the full history, and can easily reconstruct the "master" on some other server.

            I'm not particularly a fan of Git, but this resiliency feature is impressive, and compensates for its defects (like the user-friendliness of a rattlesnake).

    4. vtcodger Silver badge

      Re: Backup - ever heard of it?

      Well, yes. But as I understand it, the original reason for git was for collaboration. If one is actually using git in order that multiple folks can work on the code with minimal confusion, aren't backups likely to be somewhat less than complete and current at times? (Caveat -- I know nothing about git other than what I read on the Internet. RCS seems to be adequate for my minimal needs).

    5. martinusher Silver badge

      Re: Backup - ever heard of it?

      Code archiving systems are very useful but as anyone who's been forced to use SourceSafe will know they're not entirely trustworthy.

      1. david 12 Silver badge

        Re: Backup - ever heard of it?

        >anyone who's been forced to use SourceSafe will know <

        Anyone who used VCS will know that SourceSafe was an improvement.

    6. tfb Silver badge
      Pirate

      Re: Backup - ever heard of it?

      It is not possible to back up things like GH's issue tracking. And, almost certainly it will never be possible: if it were possible you could use those backups to seed your own or a competitor's issue-tracking system, thus removing most of GH's competitive advantage.

      1. Doctor Syntax Silver badge

        Re: Backup - ever heard of it?

        "thus removing most of GH's competitive advantage."

        It's only a competitive advantage if it's reliable.

      2. tfb Silver badge

        Re: Backup - ever heard of it?

        I take back some of the above: it looks as if it is possible to back up GH repos including metadata like issues &c. However tools to do this seem to be scarce. BackHub does cloud-to-cloud backups which is (a bit) better than nothing, but it costs money.

        If anyone knows of a decent tool to do this to a local archive, I'd be interested.
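
For a local archive of issues, one possible approach is to page through GitHub's REST API and write each issue to a JSON file. This is a minimal sketch, not a finished tool: the names `backup_issues`, `iter_issues` and `http_get_json` are made up for illustration, it assumes the public `GET /repos/{owner}/{repo}/issues` endpoint with `state=all`, and it ignores rate limits and issue comments (which would need a similar loop over their own endpoint). Note that this endpoint also returns pull requests.

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable, Iterator, Optional

API_ROOT = "https://api.github.com"


def http_get_json(url: str, token: Optional[str] = None) -> list:
    """Fetch one page of JSON from the GitHub REST API."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "issue-backup-sketch",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_issues(owner: str, repo: str,
                fetch: Callable[[str], list] = http_get_json,
                per_page: int = 100) -> Iterator[dict]:
    """Yield every issue (open and closed), one API page at a time."""
    page = 1
    while True:
        url = (f"{API_ROOT}/repos/{owner}/{repo}/issues"
               f"?state=all&per_page={per_page}&page={page}")
        batch = fetch(url)
        if not batch:
            return  # an empty page means we have walked past the last one
        yield from batch
        page += 1


def backup_issues(owner: str, repo: str, dest_dir: str,
                  fetch: Callable[[str], list] = http_get_json) -> int:
    """Write each issue to dest_dir/issue-NNNNN.json; return how many were saved."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    count = 0
    for issue in iter_issues(owner, repo, fetch):
        path = dest / f"issue-{issue['number']:05d}.json"
        path.write_text(json.dumps(issue, indent=2, sort_keys=True),
                        encoding="utf-8")
        count += 1
    return count
```

The `fetch` parameter is there so the paging logic can be exercised without touching the network; in real use something like `backup_issues("owner", "repo", "issues-backup")` run from cron would be the obvious starting point.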

  2. DJ Smiley

    There's a problem with giving 'value' to aged accounts....

    Plenty of 'aged' accounts - i.e. active accounts for years, for various meanings of 'active' get hacked / have credentials dumped, often long after the user who was using them has forgotten about the account too.

    To be fair to github - they restored the account, seemingly without any kind of loss to the user other than some time. He's learnt a lesson (don't rely on free services for something so important) and so I'm not really sure why this is even in the news?

    1. Christoph Silver badge

      Re: There's a problem with giving 'value' to aged accounts....

      It vanished with no notice and no explanation, and was gone for at least some hours. That could be critical in some circumstances, and could easily be expensive.

      1. MachDiamond Silver badge

        Re: There's a problem with giving 'value' to aged accounts....

        "It vanished with no notice and no explanation, and was gone for at least some hours. That could be critical in some circumstances, and could easily be expensive."

        Yes, but..... He was relying on a service that isn't relying on him to stay in business or even turn a profit so they have zero worry about dropping his data down the chute. Whoops, so sorry, our bad.

        The aerospace company I worked at was abusing an SVN system for all of the data we generated which meant nearly everybody in the engineering office had a copy of the repository on their computer. If that server went down, it would have been a pain, but we had data backed up all over the place and could have made do until the server was back up again even if it took a week. The bottom line is that the company could have thrown money at the problem to make it go away. A system such as GitHub doesn't allow for using money to make a problem go away. If the outage was widespread, you wouldn't even be able to find out WTH was going on and if there were an ETA on a fix.

        There should always be a complete backup and a way to use that backup. If you use an online software package and the provider flips on its back and twitches its legs, even if you have your data backed up, you don't have a way to use it. That makes me avoid the "software as a service" crowd if I can't get an "evaluation" version of the software just in case.

        1. tfb Silver badge
          Boffin

          Re: There's a problem with giving 'value' to aged accounts....

          Yes, but..... He was relying on a service that isn't relying on him to stay in business or even turn a profit so they have zero worry about dropping his data down the chute. Whoops, so sorry, our bad.

          Um. How long do you think GitHub would last if they started losing the data of their users' projects regularly? In fact it turns out that they have quite a strong reason not to do that: staying in business.

          I agree with you about the risks associated with software-as-a-service. You can in fact run in-house GitHub instances on your own tin, and if I was working for a big enough organisation (and wanted their added-value stuff over git) I'd be doing that.

      2. Wellyboot Silver badge

        Re: There's a problem with giving 'value' to aged accounts....

        Also having to use the twit-o-sphere to get a response is less than ideal.

      3. big_D Silver badge

        Re: There's a problem with giving 'value' to aged accounts....

        If it is critical, don't outsource it and don't outsource it to a free service or the lowest bidder.

      4. Anonymous Coward

        Re: There's a problem with giving 'value' to aged accounts....

        That could be critical in some circumstances, and could easily be expensive.

        "critical" and "could be expensive" are synonyms for "important" and "worth spending some money on to avoid problems happening"

      5. Alan Brown Silver badge

        Re: There's a problem with giving 'value' to aged accounts....

        "and could easily be expensive."

        Hint: free service. "downtime Could be Expensive" == you need to pay for it.

        At least he wasn't like the stock market daytraders who actually attempted to sue operators of an IRC network when it went down.

        As others have said the critical part is the bugtracking stuff. If you have any sense you mirror that periodically (The "how" part is an exercise for the reader)

    2. DavCrav Silver badge

      Re: There's a problem with giving 'value' to aged accounts....

      "He's learnt a lesson (don't rely on free services for something so important) and so I'm not really sure why this is even in the news?"

      I guess so people who aren't him can learn the same lesson? I know that it often takes damage to oneself personally to learn the following two lessons, but hopefully one or two people are able to learn from others' mistakes:

      1) Always have a backup;

      2) Never rely on companies that offer to host your data online, especially for free.

    3. Aitor 1 Silver badge

      Re: There's a problem with giving 'value' to aged accounts....

      That is unacceptable.

      They have confessed that for them it is ok to have a non published script that by some obviously flawed algorithms happily deletes accounts! no warnings or human checks!

  3. Cuddles Silver badge

    Sing along with me

    It's not a cloud, it's just someone else's computer.

    At least if it's a paid service you might have some grounds to complain when things go wrong, although that still doesn't absolve you from not having a backup plan. If you're not paying, you should assume that everything could disappear at any time. That's not a possibility you should want associated with terms like "life's work" or "business critical".

    1. macjules Silver badge

      Re: Sing along with me

      Quite. If this is all the games he has ever developed then, quite frankly, he should have the original masters backed up elsewhere. Even BitBucket or GitLab would be a start.

      Having someone complain about lack of backup in this day and age really just shows their (lack of) aptitude IMO.

      1. Doctor Syntax Silver badge

        Re: Sing along with me

        The issues seem to have been not the absence of a backup for the main data but the absence of a backup mechanism for the bug-tracker and the fact that GitHub was being used as part of the workflow to manage multiple servers. As others have said, if it's that important, particularly for the workflow issue, a paid service with an SLA sounds reasonable.

      2. Anonymous Coward

        Re: Sing along with me

        "Having someone complain about lack of backup in this day and age"

        I'm just dealing with the aftermath of an incident where someone decided that using scratch disk space on a server - an area that's explicitly NOT backed up (and has warning messages to that effect) - was a good place to put critical archival data instead of on the central fileservers.

        Server got rebuilt, scratch space got reformatted, 2 weeks later user screams the roof off the building when he realises all his precious algorithms are goneski - then demands we send the drives out to a recovery company. Um Hello? They've been scribbled over for the last 2 weeks, what do you think that's going to achieve?

        Anonymous, because I have to work with the idiots responsible for that kind of clusterfuckage. What kind of fumduck thinks that a directory called "scratchpad" is a good place to keep critical data?

  4. big_D Silver badge

    Your data...

    Your servers. I might use a cloud service as a backup, but relying on it as a primary copy... No thanks.

    1. doublelayer Silver badge

      Re: Your data...

      I don't see that it matters that much. As long as you have a backup, you can get some benefits from using the cloud as the primary. For example, I run my website on a cloud service because it's not that important and I don't need that server in my house. Also, the electricity people around here aren't great about getting to my house when their line fails until after several hours. With that said, a server located off site and where the provider handles the power and network means it's less likely to go down. If they should delete my account, I have all the files I need right here to restore it.

      1. Is It Me Bronze badge

        Re: Your data...

        Also I think this is less directly about backup and more about DR.

        If you have a backup and nowhere to restore it to, what use is it?

  5. karlkarl Bronze badge

    Why would discussions and bug tracking be lost if your repo got removed from GitHub? They are safely on your bugzilla / trac server (which is backed up nightly).

    Or do you mean people actually use the GitHub web interface in professional environments? That is absolutely shocking!

    1. Charlie Clark Silver badge

      I'm not a fan of GitHub but the stuff built around it goes above and beyond what Bugzilla and Trac provide. I think the bigger issue is relying on a free service for something that sounds a lot like professional work.

  6. 404 Silver badge

    Github=/=Microsoft

    Who gave us Win10 and the disappearing network drives, disappearing documents, disappearing data, etc.

    Who woulda thunk?

    1. bombastic bob Silver badge
      Meh

      Re: Github=/=Microsoft

      irony acknowledged. heh.

      So, the *FIRST* major change to github is a bot that FAILS to "get it right" with respect to spam filtering, punishing the honest/innocent via brain-damaged AI algorithms, while GROSSLY MISSING the 'bulk' of the problem at the exact same time.

      Sounds like hotmail or anything ELSE that MS "took over". I'll still use it, I suppose. yay.

      (this is probably comment #43 - oh well, so much for having 42 of them. I ruined it.)

    2. Anonymous Coward

      Re: Github=/=Microsoft

      No one should ever trust Microsoft with any of their precious data. (Amazon is much the same).

      They can and will use it for their own requirements. Why else did they buy GitHub? Out of the goodness of their heart naturally...

      The day after that deal was announced I moved all my code off of GitHub. It will never be going back.

      I am determined to remain uncontaminated by MS (I do accept that they contribute to Linux but they have to make their submissions GPL compliant) for as long as possible.

      Downvote me all you like but while I have a choice NOT to use Azure or Github or Orifice then I will do that. The same goes for AWS. You are at their mercy as this article clearly shows. IF you step out of line, you are gone gone and gone. All that lovely work gone.

    3. Carl-Jung

      Re: Github=/=Microsoft

      Don't forget the most dodgy of them all. The disappearing privacy switches. You turn off the fact you don't like being tracked, Microsoft decide they will just switch them all back on again down the road, once they think you have forgotten.

      How else do they fund a "free" Windows 10...

      Mine has reset at least 3 times since Windows 10 launched. Along with stupid Candy Crush re-installing itself by magic... What's weird is the EU don't seem to care about this, since they got on the Microsoft pay-roll (Munich Office/Windows deal anyone???)

      1. Doctor Syntax Silver badge

        Re: Github=/=Microsoft

        How else do they fund a "free" Windows 10.

        Let's not forget that a free Linux (without the quotes) is funded without even needing those privacy switches.

  7. DropBear Silver badge
    Mushroom

    "AI" or any other kind of Machine Police should never be allowed to de-activate accounts. It's welcome to flag them up, down and sideways internally for human review all it wants, and the corp behind it is welcome to improve it until its human mods can cope with the volume of flagging it generates, or hire more of them. But "AI" should never have access to the Big Red Button. If a hacker or IP thief causes an outage or disruption of any length whatsoever they're immediately charged with causing eleventy trillion billion million dollars of "damages" - how come corps are allowed to get away with doing exactly the same without having to have anyone actually accountable for it?!? No, "because you've agreed to it" is not a valid answer. And neither is "assume guilt automatically and shoot deactivate by default, ask questions later reactivate only if and when the Twitter shitstorm hits, then apologize for the mistake of having inconvenienced someone of high enough profile".

    1. Phil O'Sophical Silver badge

      Human oversight costs money. If you want it, be prepared to pay for it. "Free" services will rarely be able to offer more than a primitive rule-based oversight, dressed up as "AI".

    2. Cheshire Cat

      AI should never be able to *permanently* delete accounts. However, when you number your accounts in the hundreds of thousands or millions, you have to have some automatic disabling. Then the rare false positive can be manually corrected.

      Of course, if you have a lot of false positives, you have a different issue and should tune your algorithm better before you give it teeth

      1. shawnfromnh

        False positive fine but if that had been an account more than a week how the hell can it be spam since it would have been reported long ago and not just them. Hell if the account was years old why would they scan it at all unless they had a letter from a TRUSTED source not just an email from a random idiot no one has ever heard of. I think this is their way of clearing space by disabling stuff and seeing if someone complains about it. Stupid sure but it's an MS company now so this seems plausible.

    3. Mike 137 Bronze badge

      Not just GitHub

      We tried to set up a Twitter account to promote our open source software, but within about 20 minutes of struggling to get our logo to display in the avatar (they'd automatically cropped it) by retrying the upload several times, the account was locked, and the "help" line never responded to our complaint other than with boiler plate emails.

      We had to abandon the account, never used and still locked because some silly machine considered we were abusing the service by trying to get it to display our logo, and no human being was even contactable to fix the problem.

  8. buserror

    Command/Control channel flag?

    Given he says he has several nodes pulling from his account automatically, it's likely some sort of automaton at github decided this was a command and control channel for some sort of bot, and pulled it down, possibly 'pending review' -- and they did review and restored it. I can understand that easily enough...

    He might be better off having a VM somewhere to do that sort of scheme: he'd have official sources on github, a mirror on the VM, and the automated nodes pulling from there. Extra backup layer thrown in.

    1. ckm5

      Re: Command/Control channel flag?

      Tons of companies follow this exact pattern for devops - if GitHub starts flagging accounts because people are pulling from them, 1/2 the internet would be taken offline....

      1. buserror

        Re: Command/Control channel flag?

        Huhn seriously? more than complete amateurs would put a third party company in the loop FOR NO REASON WHATSOEVER for their devops needs? It's not like they need the storage or anything, they could EASILY run a VPS and not have that single point of failure.

        It's not 'account pulling from them' here, it's 'machine pulling from a repo, kinda like... a virus would'. I can understand automating stuff, but using what is basically a cloud server for 'half the internet'?

        Giggles.

      2. Doctor Syntax Silver badge

        Re: Command/Control channel flag?

        "if GitHub starts flagging accounts because people are pulling from them, 1/2 the internet would be taken offline"

        OK, GitHub, get on with it.

    2. Adrian 4 Silver badge

      Re: Command/Control channel flag?

      Doesn't bode too well for github's Package Registry system if they don't like bots pulling from it, does it ?

  9. Luke Worm
    Thumb Down

    It's again the 'mericans showing who's the boss.

  10. JohnFen Silver badge

    Once again

    This is yet another of the growing examples of why, as far as possible, nobody should be relying so much on cloud services that they are harmed when the service stops working.

  11. J.G.Harston Silver badge

    NEVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER EVER trust your code/data to other people.

    By all means, use other people's servers as a *DISTRIBUTION* system, but ****NEVER******* as your data repository.

    1. ckm5

      By definition, you are pretty much always trusting your codebase to other people. Other people write the filesystem, OS & VCS. It would be more appropriate to describe it as cloud, offsite, etc. Even then, most hosted services are likely more reliable than most people's desktops....

      1. Adrian 4 Silver badge

        Maybe hosted services are more reliable than a random desktop. But not more reliable than a couple of proper backup systems.

        Github is a distribution system, not an archive. Obviously so, since it's Someone Else's Computer.

        Losing your github archive should be about as exciting as losing your web page : a minor annoyance requiring a few minutes' work to correct.

        If it's not, you're doing it wrong.

  12. adnim Silver badge

    "I can't believe how easy it apparently is to have someone's life work taken down from Github"

    That's the power of a bash shell... rm -R is beautiful bitch.

    It's easy enough to git clone back to the repo from your backup though right?

    Please don't tell me you trust a third party to be your sole back up solution.

  13. StuntMisanthrope

    Life's invisible work and TV.

    It's a bit more than code, the last album or indeed shared security and epigram. #tangibleasset

  14. dmacleo

    does gitlab operate in same manner?

    don't really use either service but wondered if gitlab was less reactive in supposed spam/malware repos.

  15. Mage Silver badge
    Coat

    not locally downloaded

    Backup, backup and backup. Especially anything remotely to do with the cloud,

    Don't assume you understand backups either. Having one USB HDD connected most of the time that you copy stuff to is not a backup solution.

    !

    Loads of folk here know how to do it. But I bet loads more are vulnerable to a fire, flood, ransom ware, the cat deleting it, you deleting it.

  16. Mage Silver badge
    Big Brother

    Musing

    Really Source Forge, Github, Google Docs, Dropbox, Office 360 etc should only be used for temporary collaboration and mirrored distribution servers etc used for delivering to the Public. Your own in house system (or securely hosted accessed by VPN only) for in house distribution. All sources, final runtimes, documentation, discussions, decisions should be backed up according to best practices. Online storage or Flash is not a backup solution.

    1. Doctor Syntax Silver badge

      Re: Musing

      "Github... etc should only be used for temporary collaboration"

      That temporary collaboration on Linux has been running for quite some time.

      1. robmobz

        Re: Musing

        Git != Github

  17. tfb Silver badge
    Alien

    'You always have a local copy'

    Once and for all: no, you don't, unless you consciously arrange to do so. If you maintain a local git repo by an initial clone & then pulls, what you have locally is the commits which are in the ancestry of your remote tracking branches: you don't have commits which aren't.

    Of course you also don't have all the information which is not in the master repo at all such as all the issues &c. That information might matter if you care at all about what bugs your code has, what your future plans are &c &c.
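
One way to "consciously arrange" a complete local copy is a mirror clone, which copies every ref rather than only the branches a normal clone tracks. The following is a rough sketch under that assumption: `mirror_commands` and `sync_mirror` are hypothetical helper names, while `git clone --mirror`, `git -C` and `git remote update --prune` are standard git commands. It covers only what lives in git, so issues and other hosted metadata are still out of scope.

```python
import subprocess
from pathlib import Path
from typing import List


def mirror_commands(url: str, dest: Path) -> List[List[str]]:
    """Return the git commands needed to create or refresh a mirror of url at dest."""
    if (dest / "HEAD").exists():
        # An existing bare mirror: fetch all refs and drop those deleted upstream.
        return [["git", "-C", str(dest), "remote", "update", "--prune"]]
    # First run: a mirror clone copies every ref, not just the default branches.
    return [["git", "clone", "--mirror", url, str(dest)]]


def sync_mirror(url: str, dest, run=subprocess.run) -> None:
    """Create the mirror if it is missing, otherwise bring it up to date."""
    for argv in mirror_commands(url, Path(dest)):
        run(argv, check=True)  # check=True raises if git exits non-zero
```

Run periodically (for example `sync_mirror("https://github.com/owner/repo.git", "/backups/repo.git")` with placeholder values), this keeps a bare repository from which the project could be pushed to another host.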

    1. shawnfromnh

      Re: 'You always have a local copy'

      Unless you created the code on the cloud you always have a local copy. You upload a copy of that to the cloud and keep the other one.

  18. David Given
    Unhappy

    Exactly the same thing on SourceForge

    Exactly the same thing happened to me on SourceForge last year. Account nuked, no email, no communication, nothing. My repositories all still existed but my username had been changed to '<REDACTED>'. Luckily I'd already migrated everything off to GitHub at this point but I still used some of the mailing lists (which bizarrely continued to work).

    I emailed them, got nothing, then complained on twitter and someone finally replied to the email claiming it had been 'overlooked'. Apparently an antispam bot nuked it, just like with GitHub. They did restore my account and, with a bit of pushing, changed the join-up date to 2000 so I retained my seniority, but were unable to update the repositories so I was listed as the author of the commits.

    I never received any kind of apology --- not even a pro forma 'sorry to hear that'.

    Needless to say, I don't feel inclined to use SourceForge for anything much, and I now have a backup script which periodically backs up the raw repositories from GitHub to bluray.
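
A periodic backup script of this kind could look roughly like the sketch below. It is not the poster's script: it assumes the raw repositories have already been fetched into a local directory, and the `snapshot` function, its `keep` parameter and the archive naming scheme are invented for illustration. It writes a timestamped `.tar.gz` that could then be burned to disc, and prunes older archives.

```python
import tarfile
import time
from pathlib import Path
from typing import Optional


def snapshot(source_dir, archive_dir, keep: int = 5,
             stamp: Optional[str] = None) -> Path:
    """Archive source_dir as NAME-STAMP.tar.gz and keep only the newest `keep` archives."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source_dir)
    archives = Path(archive_dir)
    archives.mkdir(parents=True, exist_ok=True)

    # A sortable timestamp, so that name order equals age order.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = archives / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # Delete everything except the `keep` most recent archives of this source.
    existing = sorted(archives.glob(f"{source.name}-*.tar.gz"))
    for stale in existing[:-keep]:
        stale.unlink()
    return target
```

Keeping several generations rather than one matters here: if the hosted copy is emptied and the next sync faithfully mirrors that, an older archive is what saves you.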

  19. shawnfromnh

    Gitlab is all I want to say.

  20. Carl-Jung

    Github is full of 404s

    ever since Microsoft took over, anyone with any sense has deleted their account and moved the repositories elsewhere.

  21. Richard Lloyd

    GitLab is an alternative...

    GitLab is another way to go if you're worried about storing your projects (and bug tracking/discussions) on someone else's servers. You can install a free self-hosted version that has most (but not all) of the features of the paid hosted version and keep all your data local.

    Nothing stopping you opening up that "local" GitLab to the wider public, though you obviously have to take some decent security measures (keep up to date with the monthly releases, enable 2FA, use a secure cert, manually vet new user creation).

  22. Anonymous Coward

    Isn't Github bought or owned by Microsoft nowadays?

    Hmmm... what a coincidence that this happens just as they have problems with "offline synced files in office"...

    Oh wait... am I a conspiranoic now? >:->

  23. Anonymous Coward

    FFS!! Use "People" in takedowns!

    This is the problem when you replace carbon with silicon!

    Sigh ;-(

    Ps. Who is the silicon accountable to?

Biting the hand that feeds IT © 1998–2020