Sysadmin's three-line 'annoyance-buster' busts painstakingly crafted, crucial policy

Monday, bloody Monday. But fear not – Who, Me? has a suitably stressful story to remind you things can always be worse. This time El Reg's weekly column of tech catastrophes comes from "Todd" who worked as an operations sysadmin for a medium-sized regional ISP. His work was consistently plagued by an irritating alert that one …

  1. Anonymous Coward
    Anonymous Coward

    Oh yeah, _that_ fscker.

    The engine that is so overengineered that even proper introductions start with "a security context is made up of these three things, of which we will never use the second and third". (The improper introduction is of course "setenforce permissive".)

    I've never ever seen a tool that was so firmly "unless you are big enough to have two people doing these things full-time, don't bother".

    1. Waseem Alkurdi

      Re: Oh yeah, _that_ fscker.

      It (SELinux) was originally engineered by the American government, so it seems to mirror a governmental bureaucracy.

      1. theblackhand
        Black Helicopters

        Re: Oh yeah, _that_ fscker.

        "It (SELinux) was originally engineered by the American government"

        You mean the NSA...so they can watch everyone's mobile phones... Android even lists it... Apple doesn't because they don't want you to know the truth...

        I think that covers all of the SELinux mobile phone conspiracy.

        1. Anonymous Coward
          Anonymous Coward

          Re: Oh yeah, _that_ fscker.

          iOS is based on Mach/BSD, not Linux, so it is hardly surprising that Apple doesn't list (or use) SELinux!

          1. Anonymous Coward
            Anonymous Coward

            Re: Oh yeah, _that_ fscker.

            “iOS is based on Mach/BSD, not Linux, so it is hardly surprising that Apple doesn't list (or use) SELinux!”

            You read about SELinux conspiracy theories and were worried that the Apple part wasn’t correct? FFS...

            1. Anonymous Coward
              Anonymous Coward

              Re: Oh yeah, _that_ fscker.

              People who will believe that SELinux is an NSA conspiracy are going to believe that no matter what I say, no matter what you say, no matter what anyone says, so addressing that part of it seemed to be pointless.

              The conspiracy theorists will need to come up with a new conspiracy about iOS. Maybe they can find a link between Carnegie Mellon and the NSA and decide that secret NSA code in Mach is how they have p0wned iOS...

              1. Anonymous Coward
                Anonymous Coward

                Re: Oh yeah, _that_ fscker.

                I always thought the "A" in NSA stood for "Apple".

    2. This post has been deleted by its author

      1. Anonymous Coward
        Anonymous Coward

        Re: Oh yeah, _that_ fscker.

        Just replace it with AppArmor and call it a day!

  2. Anonymous Coward
    Facepalm

    Great system...

    ... if it allowed to overwrite policies without any check or warning.... unless they existed and were utterly ignored....

    1. Waseem Alkurdi

      Re: Great system...

      It's been always this way with config files.

      Let's say you have a config file under ~/.config/myprog/config and another under /etc/myprog/config and a default under /usr/share/myprog/default/config.

      Which one am I gonna load?

      1. Remy Redert

        Re: Great system...

        Depends on how prone to configuration errors you want to be. The error-free way is to load the default first always, then check for a system specific configuration and load that over the top, then check for a user specific configuration and load that.

        That way, anything that wasn't specifically modified by the system or user configurations will use the default and the user only has to configure those things he cares about.

        Alternatively, you load the user configuration if it exists, the system configuration if the user configuration doesn't exist and a system one does, and the default only if neither of the previous exists.
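The two approaches described above can be sketched roughly as follows. This is a minimal illustration, assuming INI-style files readable by Python's `configparser`; the `myprog` paths are the hypothetical ones from this thread, and the function names `load_layered` and `load_first_found` are made up for the example.

```python
import configparser
from pathlib import Path

# Hypothetical search order for "myprog", lowest priority first.
DEFAULT_PATHS = [
    Path("/usr/share/myprog/default/config"),  # shipped defaults
    Path("/etc/myprog/config"),                # system-wide overrides
    Path.home() / ".config/myprog/config",     # per-user overrides
]


def load_layered(paths=DEFAULT_PATHS):
    """Load every file that exists, in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    # read() silently skips missing files and returns the names it did read.
    loaded = parser.read([str(p) for p in paths])
    return parser, loaded


def load_first_found(paths=DEFAULT_PATHS):
    """Alternative: use only the highest-priority file that exists."""
    for path in reversed(paths):
        if path.is_file():
            parser = configparser.ConfigParser()
            parser.read(str(path))
            return parser, [str(path)]
    return configparser.ConfigParser(), []
```

With the layered loader, a key set only in the defaults survives a partial system or user file; with the first-found loader it disappears as soon as a higher-priority file exists, which is the configuration-error risk mentioned above.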

        1. Anonymous Coward
          Anonymous Coward

          Re: Great system...

          @Remy Redert

          NOW you tell me!!!!

      2. JulieM Silver badge

        Re: Great system...

        It's probably going to read /etc/myprog/config first, then ~/.config/myprog/config , and ignore the living daylights out of /usr/share/myprog/default/config .

        The canonical order is supposed to be to read configuration from /etc/ first, and then from somewhere under the owner's /home/ folder; that way, users' own options override system-wide ones. (But there are always a few exceptions .....) /usr/share/ is not for loading configuration files from; any "configuration file" you find under there should be just an example.

        But if in doubt, read the Source Code.

      3. John Robson Silver badge

        Re: Great system...

        Order depends whether your loading is 'last time the variable is set' sticks, or 'first time the variable is set' sticks.

        But that's an implementation detail - you load such that command line options override user config overrides system config.
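To illustrate the point about 'last set sticks' versus 'first set sticks', here is a small sketch. The option names and values are invented for the example; the only claim is that both loading styles give the same result as long as the priority is system < user < command line.

```python
from collections import ChainMap

# Hypothetical option layers, purely for illustration.
system = {"editor": "nano", "timeout": 30}
user = {"editor": "vim"}
cmdline = {"timeout": 5}


def last_set_sticks(*layers_low_to_high):
    """Apply layers from lowest to highest priority; later values overwrite."""
    merged = {}
    for layer in layers_low_to_high:
        merged.update(layer)
    return merged


def first_set_sticks(*layers_high_to_low):
    """Look layers up from highest to lowest priority; the first hit wins."""
    return dict(ChainMap(*layers_high_to_low))


a = last_set_sticks(system, user, cmdline)
b = first_set_sticks(cmdline, user, system)
print(a == b)  # both give editor=vim, timeout=5
```

The reading order is reversed between the two functions, but the priority per option is the same.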

        1. Spazturtle Silver badge

          Re: Great system...

          Ah, you are assuming that it loads and merges all the config files; it could just read only one.

      4. Waseem Alkurdi

        Re: Great system...

        That's precisely the point.

        See how we have different answers?

        1. John Robson Silver badge

          Re: Great system...

          Except all the answers say the same priority is taken...

          Which do I load is not a relevant question. Which takes priority per option is - and it’s always system < user < cmdline

        2. Munchausen's proxy

          Re: Great system...

          "That's precisely the point.

          See how we have different answers?"

          Of course, in real life, the system will simply use the best config until the worst possible time, at which point it will switch seamlessly to using the worst one.

          1. phuzz Silver badge

            Re: Great system...

            I was going to say "the one that fucks everything up the most", but you put it more succinctly.

      5. SealTeam6

        Re: Great system...

        The one closest to root, of course. In this case 'etc'.

    2. Saruman the White Silver badge

      Re: Great system...

      The real mistake was not to check the system to make sure that an existing security policy does not have the same name. It does open the question however - were the security policies actually documented?

      Saying that, this is the sort of mistake that all IT admins make sooner or later; once the dust has settled, the screams have been silenced (possibly using the sort of techniques taught at the BOFH school of IT admin) and the system has been restored to what it should have been, you tend to be a whole lot more cautious/paranoid when pushing out any subsequent changes.

      1. Anonymous Coward
        Anonymous Coward

        Re: Great system...

        "It does open the question however - were the security policies actually documented?"

        Of course they were documented. See that dusty cabinet over there with loads of carefully printed documentation that is almost untouched aside from the thick layer of dust? It's in there....probably

        1. DailyLlama

          Behind a locked door...

          with a sign saying "Beware of the leopard"?

          1. imanidiot Silver badge

            Re: Behind a locked door...

            Nah, just inside the BOFH manual, with the heavy steel covers. Wired to the heavy duty inverter downstairs. Remember to use your gloves.

            1. doublelayer Silver badge

              Re: Behind a locked door...

              You don't even have to introduce a negative. Make the documentation very verbose, include something in it approximately 68% of the way through, and watch people lose their mind when they try to find it. Otherwise known as the irritating API reference technique, where you know there should be a function that does what you want, and it's in this category, but this category includes thirty APIs, each of which implement 60 functions. And the search box only searches API names, but not functions. They're watching me to see how long I search, aren't they?

  3. DailyLlama

    But...

    Did it fix the original problem that Todd had?

  4. Waseem Alkurdi

    So Todd wrote a security policy.

    "It was a three-line policy that basically said 'this file can be accessed by this process in this way'," he said.

    SELinux policy?

  5. Sgt_Oddball
    Facepalm

    So it went...

    From Foobar to FUBAR?

    Best I managed was adding a security policy to a remote server.. Forgetting to add the office's IP address to the whitelist. Server suddenly stopped talking to our office. Cue just about every department in the office howling whilst I made an embarrassing call to the hosts to talk them through what they needed to change to get us back onto the server.

    (other customers could see it just fine but because of remote access ports attempts triggering firewall blocks, we got the ban hammer straight away)

    1. Anonymous Coward Silver badge
      Thumb Up

      Re: So it went...

      Don't worry AC, we've all been there. Firewall rules on colo servers, VPN to customer sites, etc etc.

      Sometimes you can cobble together something to get you back in, sometimes you have to phone up and eat humble pie. Sometimes you can just put it off until you're next on-site and fix it discreetly.

    2. Anonymous Coward
      Anonymous Coward

      Re: So it went...

      We screwed up once when converting a telco into an ISP. We had some real good people to do this, but one in particular was a tad too impressed with himself, so the Gods of Technology decided to take him down a peg.

      He was busy configuring the big bad box that was to route it all, and all of a sudden he came and quietly asked if I still had dialup access to my old account - he put the filters the wrong way round and locked himself out. That was the moment he finally fitted into the team - he learned he was just as fallible as everyone else.

      There are two things you do during your work that create experience: screwing up and digging yourself out of that hole (with or without help), and teaching others so you get people asking things you never thought about. You don't become really good at something until you have done both IMHO.

      1. Sgt_Oddball

        Re: So it went...

        I keep a smartphone with remote desktop since about 2005 (yes symbian did actually have a rdp client who would have thought it) for just such instances..

  6. Rich 11

    When was the last time you gave your colleagues and customers an unrequested break?

    Are we talking fingers or legs?

    1. Anonymous Coward
      Anonymous Coward

      "Are we talking fingers or legs?"

      You leave skulls untouched? How very humane.

      1. Anonymous Coward
        Anonymous Coward

        You leave skulls untouched? How very humane.

        When it comes to gambling debts and things like that the Mafia rule is to break a leg because if you break fingers or skulls they won't be able to write the cheque.

        1. Anonymous Coward
          Anonymous Coward

          Re: You leave skulls untouched? How very humane.

          To quote Ramirez from Thief:

          Boys -

          While I appreciate your enthusiasm, I must point

          out that breaking legs is, really, rather inefficient,

          if we wish our clients to be able to pursue their

          rather active lives in order to immediately repay

          their debts. I encourage your initiative in this

          area - perhaps you might take a lesson from the

          gardener and his hedge clippers, who trims off one

          branch while leaving the rest to grow... and most

          of our clients do start with fingers to spare.

          1. Anonymous Coward
            Anonymous Coward

            Re: You leave skulls untouched? How very humane.

            There seems to be some confusion.

            I work in IT and try and keep my assistance of Mr Darwin to that which is strictly necessary to keep important systems running. And avoid unnecessary use of my time. Oh and occasionally when I'm bored on a Friday afternoon...

            I'd never dream of expecting users to pay to evolve...

            1. John Brown (no body) Silver badge

              Re: You leave skulls untouched? How very humane.

              "I'd never dream of expecting users to pay to evolve..."

              You'll forever be a PFY, never a BOFH with an attitude like that!

      2. bpfh

        Don’t break their skulls...

        Odin frowns upon his followers who cannot drink to his health from the skulls of their enemies.

        Lock em’ in the server room in a T-shirt for a couple of hours before releasing the FM200 instead.

  7. stiine Silver badge
    Facepalm

    A long time ago on an OS far away

    Our division had been spun off and therefore had a new name. A new longer name. Much longer than three initials.... So I found where that setting was in the configuration and I changed it from the old 3 letter name to the new 18 character name. Soon thereafter, calls started coming in from the shop that they couldn't log into the product tracking system (which is, as you would imagine, rather important). I went downstairs to the closest one (there were a lot) and logged into my account from their terminal and it worked just fine (including displaying the long company name), but when they logged into their production account, it would disconnect them. I quickly came up with a workaround that would get them connected, but required them to log in twice...they didn't much like that, but it did get them tracking product again. Then I opened a case with support. What it came down to was the 18 character name I had inserted was 10 characters longer than the field into which it was written....which, when downloaded to the FEPs, caused part of the transaction processing code to be overwritten, and crash on login. After shortening it to 6 characters, the transaction processing logins started working again.

    The moral of the story is if there's unclear or no documentation, don't make any changes to a production system until you ask what happens if I change ______, especially if it's never been changed before...

    1. iGNgnorr
      FAIL

      Re: A long time ago on an OS far away

      "The moral of the story is if there's unclear or no documentation, don't make any changes to a production system until you ask what happens if I change ______, especially if it's never been changed before..."

      Er, no. The moral of this story is that user input should *ALWAYS*, no exceptions, be checked, starting with how long it is.

      1. A.P. Veening Silver badge

        Re: A long time ago on an OS far away

        "Er, no. The moral of this story is that user input should *ALWAYS*, no exceptions, be checked, starting with how long it is."

        Wrong, you should start with checking whether it is there at all, length is secondary to that (unless you equate it not being there with a zero length, which I don't, zero length can be acceptable).

        1. Anonymous Coward
          Anonymous Coward

          Re: A long time ago on an OS far away

          State information is separate to user data. Checking whether it's there or not comes before you check the user input.

          (sorry, I'm just in that sort of mood)

      2. David Nash Silver badge

        Re: A long time ago on an OS far away

        Both morals are right; the software should validate user input, and the admin should not have made a change to a prod system without testing it first.
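The checks argued over in this sub-thread (is the value there at all, then is it too long) might look something like the sketch below. The field width of 8 is inferred from the story (an 18-character name was 10 characters too long); the function name `validate_name` and the error messages are invented for illustration.

```python
MAX_NAME_LEN = 8  # assumed field width: 18 characters was 10 too many


def validate_name(value, max_len=MAX_NAME_LEN):
    """Reject missing or over-long names before they reach a fixed-width field."""
    # Presence first: a missing value is not the same as an empty string.
    if value is None:
        raise ValueError("name is missing")
    if not isinstance(value, str):
        raise TypeError("name must be a string")
    # Length second: zero length is allowed here, too long is not.
    if len(value) > max_len:
        raise ValueError(f"name is {len(value)} characters, maximum is {max_len}")
    return value
```

Treating an empty string as valid is a design choice taken from the comment above that zero length can be acceptable; a stricter system could reject it as well.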

  8. Lee D Silver badge

    a) Testing.

    b) Change Management.

    c) Proper naming conventions (which should include the date and/or author). Name a policy with 20190204 on the end when you make it today and it's impossible to get confused with one people wrote years ago called the same but with THAT date. Plus, you instantly know how old that policy is, i.e. how long it's been useless and/or working, and can modify your interference behaviour accordingly.

    It reads like a catalogue of errors from the start.

    Only lucky that the "overwriting" that was being prevented originally didn't overwrite something much more critical when deployed to all the relevant servers and leave you with, say, a blank DNS database for example.

    1. doublelayer Silver badge

      Re: Putting dates in names

      No, don't put dates in names. That just makes the names harder to understand. If the files are called "system_restart" and "system_test_dr_capability", they are less likely to be run by accident than if they're called "20170204_ada_lovelace_at_company_dot_com_fixes20160419_system_test_dr_capability".

      The latter approach not only makes long names that are hard to remember, but it can also result in multiple versions of the file that may or may not do the same thing.

      Put instructions at the top of any file that can handle them, and of course test before you push things into production. Also, if you can, don't use a system that cheerfully replaces one config with another config without asking; if they had gotten a single confirmation box or terminal warning*, this would have been detected before it caused a problem.

      *As it turns out, neither mv nor cp complains about copying a file over another one. I thought they did. Time to become more nervous.

      1. Long John Brass
        FAIL

        Re: Putting dates in names

        No, don't put dates in names. That just makes the names harder to understand. If the files are called "system_restart"

        No no no

        you copy system_restart to system_restart_YYMMDD_<Your initials here>

        Then edit system_restart

        1. Paul

          Re: Putting dates in names

          >> copy system_restart to system_restart_YYMMDD_<Your initials here>

          no, use YYYYMMDD. did you learn nothing at the end of 1999?
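A rough sketch of the copy-before-edit habit with a four-digit year, assuming the `name_YYYYMMDD_initials` pattern suggested in this sub-thread. The helper names `backup_path` and `backup_before_edit` are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_path(path, initials, today=None):
    """Build '<name>_YYYYMMDD_<initials>' next to the original file."""
    today = today or date.today()
    p = Path(path)
    return p.with_name(f"{p.name}_{today:%Y%m%d}_{initials}")


def backup_before_edit(path, initials, today=None):
    """Copy the file to its dated backup name, then return that name."""
    dest = backup_path(path, initials, today)
    # Note: copy2 will overwrite an existing backup made the same day.
    shutil.copy2(path, dest)
    return dest
```

For example, `backup_path("system_restart", "ljb", date(2019, 2, 4))` yields `system_restart_20190204_ljb`, leaving `system_restart` itself as the file to edit.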

      2. Down not across

        Re: Putting dates in names

        *As it turns out, neither mv nor cp complains about copying a file over another one. I thought they did. Time to become more nervous.

        Yes they do. Both of them. Have you tried with -i ?

        Disclaimer: Presence and function of -i option may depend on your implementation.

        1. doublelayer Silver badge

          Re: Putting dates in names

          Perhaps I should clarify my statement. Neither mv nor cp complain about copying a file over another one *by default*, which is how everyone runs them. Since these tools don't do a lot of, to me, more obvious things without my having to tell them with switches, I would think that -i should be on by default. So much did I think this that I assumed that it was.

          1. Nutria

            Re: Putting dates in names

            alias mv='mv -iv'

            alias cp='cp -iv'
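Aliases only help in an interactive shell; a deployment script has to refuse the overwrite itself. Below is a minimal sketch of a copy that fails rather than replacing an existing file, using Python's exclusive-create mode. The function name `copy_no_clobber` is made up for the example.

```python
import shutil


def copy_no_clobber(src, dst):
    """Copy src to dst, raising FileExistsError if dst is already there."""
    with open(src, "rb") as fin:
        # Mode "x" creates the file and fails if it already exists,
        # so there is no gap between checking and writing.
        with open(dst, "xb") as fout:
            shutil.copyfileobj(fin, fout)
```

Had the policy push in the article gone through something like this, the name clash would have surfaced as an error instead of silently replacing the existing policy. Whether the real tooling offers such a switch is not something this thread establishes.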

  9. Nematode

    I'm glad I retired. I don't think I understood any of the above, other than (i) DNS, (ii) broken and (iii) coffee break.

  10. Smoking Man

    SELinux? Sure, can be.

    I would have expected some fun with systemD, the "Grand Microsoftification of Linux"..

    Same sort of PITA as SELinux.

  11. adnim

    suggest improvement to DB

    me:

    The SQL queries that produce KPI reports for the CEO, CFO and other top management use the content of the kpiName column to find data. This means we cannot change names of these KPIs. If there is a typo or we wish to rename them for a more accurate description to those entering data we can't. Perhaps a kpiID column with a lookup table for the kpiName would be the way to go. Then we can change kpi names at any time. A new column called kpiID is all that is needed. I will change the Insert query to send the kpiID as well as the original kpiName.

    Business Analyst and dbAdmin:

    That makes sense, good idea

    A day later

    IT Manager:

    WTF is going on? why do these reports not work?

    CEO is going to tear me a new one.

    I write the code that presents a web interface for users to enter KPIs

    It is obviously my fault.

    Me:

    I will find out why, will update you soon

    I look at the column names. They have been changed... kpiName to kpiID and a new empty kpiName column added.

    An email to the Analyst requesting they revert the changes to column names and bit of SQL jiggery pokery later and it all works again

    me:

    It is sorted now, some column names were changed.

    IT manager:

    Thanks for the quick fix, make sure it doesn't happen again.

    I didn't have access rights to alter the table but I kept my mouth shut.

    IT manager still thinks it's my fault. Perhaps it was, I overestimated the ability of the analyst. Unfortunately I had no rights to change the queries generating the reports.

    Perhaps I should have suggested that the queries that produced the reports needed updating too.

    Was fskn obvious to me. Not so to the Analyst.
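For anyone wondering what the lookup-table idea looks like once the report queries are updated too, here is a small sketch using an in-memory SQLite database. Only kpiID and kpiName come from the post; the table names `kpi` and `kpi_value`, the sample data and the `report` function are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: names live in one place, data rows carry only the ID.
    CREATE TABLE kpi (
        kpiID   INTEGER PRIMARY KEY,
        kpiName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE kpi_value (
        kpiID INTEGER NOT NULL REFERENCES kpi(kpiID),
        value REAL NOT NULL
    );
    INSERT INTO kpi (kpiID, kpiName) VALUES (1, 'Custmer churn');
    INSERT INTO kpi_value (kpiID, value) VALUES (1, 4.2);
""")


def report(conn):
    """Report query keyed on kpiID; the name is looked up only for display."""
    return conn.execute(
        "SELECT k.kpiName, v.value "
        "FROM kpi_value v JOIN kpi k ON k.kpiID = v.kpiID "
        "ORDER BY k.kpiID"
    ).fetchall()


before = report(conn)
# Fixing the typo in the name does not touch the data rows or the query.
conn.execute("UPDATE kpi SET kpiName = 'Customer churn' WHERE kpiID = 1")
after = report(conn)
```

The report keeps working across the rename because it joins on kpiID. The catch, as the story shows, is that the existing report queries have to be switched to the ID in the same change.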

    1. Anonymous Coward
      Anonymous Coward

      Re: suggest improvement to DB

      adnim,

      Your problem was the old familiar one ........ You thought something was obvious so did not mention it.

      Never make assumptions about other people's knowledge or understanding.

      Document *everything* that relates to your problem/process .... even the blindingly obvious.

      Better to 'teach your grandmother to suck eggs' than realise 'Oops !!!' something was missed because you thought someone else was doing it !!! :)

      1. adnim

        Re: suggest improvement to DB

        Yes AC, I thought it was obvious because it fskn is!

        I am self taught, I have no qualifications except a C&G in 'C' applications programming from the early 90's. I am not even sure what I am doing most of the time. And when my code works I am sometimes surprised.

        I expect someone with a university degree who is a database admin to have a clue, and teaching them to 'suck eggs' so to speak seems patronising if not insulting.

        You are right though... I will just add "sorry if this is obvious but make sure that ...." to any future request for change. :-)

        1. W.S.Gosset
          Facepalm

          Re: suggest improvement to DB

          That's just a degree of not just incompetence but brain-beggaring stupidity.

          Process:

          1. run a single line of DDL

          2. do not populate the new columns -- fail

          3. do not resynch the downstream code -- fail

          I mean, to not even populate the Names...

          .

          I'd have sacked the guy on the spot. How the HELL did he blag his way into the job in the first place??

      2. Stevie

        Re: suggest improvement to DB

        "Your problem was the old familiar one ........ You thought something was obvious so did not mention it."

        Tee hee. I've got one of those.

        I once worked for a small bleeding edge firm with a contract to a large cruise line to write a ticketing system. It was not going well. I had just been hired and was rushed to the angry customer with a bunch of their younger, wiser programmer analysts to see what could be done to make them happy.

        I was an Old Mainframe Guy. Customer rep was an Old Mainframe Guy. We locked eyes.

        "Forgive me if I'm covering old ground, but I'm new here and need to fully understand the problem we are trying to solve" I said in my best humble voice. There were rolled eyes from the younger members of my own team but no reaction from the customer rep.

        "Could you tell me, does the SS Saucy Sal always have the same number of staterooms?" I asked. My team groaned and rolled their eyes some more. The customer rep's mouth twitched and, after a suitable pause, he said "No".

        You could have heard a pin drop. We were still eye-locked.

        "I see. Could you explain how and under what conditions you change the number of staterooms on the SS Saucy Sal? Again I'm sorry if this has already been covered." The customer rep proceeded in a level tone to explain how and why this transformation would come to pass. We were still eye-locked (though blinking was allowed) but I could detect a palpable thaw in the customer rep's attitude re: contracted software "specialists" from Somewhere Not Here.

        My own mob were trying not to do the Bonehead Gape Face.

        "That's very interesting. Thank you. Now, when you say you ticket up until the last minute, could you tell me exactly what you mean and how you go about delivering that service?" A very revelatory explanation involving servers loaded on beer trolleys, long extension cords and fistfuls of cash and paperwork was delivered, demolishing the model my own team had laboriously constructed from assumptions and guesswork.

        By the end of it the customer rep was smiling and we were on a much better footing with him and about 3/4 of the staff. (I had to impress someone else in their DBA department early one morning by demonstrating mad skillz on the fly before the other 1/4 came grudgingly around).

        It took a couple more of these before my own team forgave me for being new and old at the same time, and mired in the mainframe world instead of flying high on a balloon made of PCs, Visual Basic and the Light of Jesus in Their Eyes.

        1. W.S.Gosset

          Re: suggest improvement to DB

          I wish it was possible to give more than one thumbs-up.

          Have had essentially the same experience multiple times. Drives you mad. Common Bloody Sense.

    2. Nutria

      Re: suggest improvement to DB

      "Simple" changes breaking Prod is why the hell that is ISO/IEC 20000 was foisted upon poor IT sods.

  12. Anonymous Coward
    Anonymous Coward

    What is a “headend”?

    1. Mage Silver badge

      Re: “headend”

      It CAN mean the system that connects "stuff" to the coax cable that goes to all the houses. Maybe one for a small town long ago. Once it might have been merely TV receivers with aerials and C-band Dish+LNB+analogue FM receivers and then loads of modulators and amps.

      Now it might be a fibre-fed CMTS in a street cabinet for Internet with digital video, switched video, some analogue video and even some Band II VHF radio. Unlike the Cat3 telephone pairs, the coax can have nearly 1000 MHz bandwidth rather than less than 30 MHz for DSL (designed for 3.5 kHz!) and run at higher QAM. Distance can even be a few kilometres with no penalty. Twisted pair phone line speed / bandwidth falls off a cliff beyond 200 m. VDSL2 maybe only 12 Mbps at 1 km.

      1. Anonymous Coward
        Anonymous Coward

        Re: “headend”

        OK, so what is a “DNS headend” then? Seems an odd choice of terminology.

        1. Long John Brass
          Terminator

          Re: “headend”

          Usually you don't have your DNS master serving traffic. You have multiple DNS slaves that serve traffic to the great unwashed. The master is the place you make edits that are then pushed to the slaves.

  13. W.S.Gosset
    Alert

    Could be worse

    You could make an undocumented change in Prod which kills people:

    https://www.brisbanetimes.com.au/world/north-america/behind-the-lion-air-crash-20190204-p50vhf.html

  14. Anonymous Coward
    Anonymous Coward

    switch reconfiguration hell

    Just this evening I was reconfiguring a switch port and knocked everybody offline for about 15 minutes when I made a small mistake changing vlans. Fortunately not many people affected. The biggest nuisance was finding a way to connect to the switch to fix the mistake, given the network was broken!
