Dev's telnet tinkering lands him on out-of-hour conference call with CEO, CTO, MD

Welcome all, to the merry world of Who, Me?, our weekly trip down memory lane for techies who want to get something off their chest. This week, we hear from "Gavin", who was working at an ISP that had a few thousand point-to-multipoint radio links and had been asked to write a script to back up their configs. "They already had …

  1. jake Silver badge
    Pint

    About a billion years ago in internet time (call it 1986) ...

    ... I filed a bug report on a batch of bad EEPROMs that were throwing spurious errors. In the bug report, on a lark (and to see if anyone actually read the bug reports), I suggested that it was probably Alpha particles off the heavy metals concentrated from sea water evaporation in the salt pile in Redwood City, which was just off our shipping & receiving dock.

    PhD Engineers scurried about for about a week, until I confessed to the joke. I nearly got fired. It's amazing how little highly trained people know about stuff outside their field. Me, I generalize ... seems to keep me saner than most.

    Note that back then there WERE some EEPROMS that were contaminated by Alpha particles, but that was caused by a manufacturing error before they were sealed up. If you know anything about such things, you'd know why my hoax was obviously bullshit.

    Why bring this up here? In the thirty years since then, I've heard the story of the salt pile in Redwood City ruining electronics "due to Alpha Particles" half a dozen times, at half a dozen companies, in three states, Canada, the UK and Australia. Usually in relation to spurious errors in electronic gear. I suspect the hoax will out-live me by many decades. If you run across it in your meanderings and it causes you any trouble, I apologize ... have a homebrew on me :-)

    1. Rupert Fiennes Bronze badge

      Re: About a billion years ago in internet time (call it 1986) ...

      The PhD's didn't know alphas are stopped by a sheet of paper, let alone x number of walls and plastic chip cases?

      It's not that unusual: I once had a spirited argument with Chemistry and Civil Engineering PhD students who apparently didn't know what the law of universal gravitation was. It was only after they looked up what G was that they believed me...

      1. Terje

        Re: About a billion years ago in internet time (call it 1986) ...

        Have an up vote!

        In the poor chemist's defense, in chemistry G is usually something entirely different (Gibbs free energy). Too much exposure to one field of science (particularly one involving lots of organic solvents, preferably halogenated ones) tends to erode any residual knowledge in other fields.

      2. Anonymous Coward

        Re: About a billion years ago in internet time (call it 1986) ...

        The PhD's didn't know alphas are stopped by a sheet of paper, let alone x number of walls and plastic chip cases?

        Obligatory "Yeah, but wot about Johnny Alpha?"

        1. Chris King Silver badge

          Re: About a billion years ago in internet time (call it 1986) ...

          Obligatory "Yeah, but wot about Johnny Alpha?"

          Letting him loose in your manufacturing facility with a Westinghouse Variable-Cartridge Blaster isn't going to end well, is it ?

          NUMBER FOUR CARTRIDGE ! *BOOM*

          1. W.S.Gosset Bronze badge
            Pirate

            Re: About a billion years ago in internet time (call it 1986) ...

            Go go gadget John Wagner and Carlos Ezquerra!

    2. Anonymous Coward

      Re: About a billion years ago in internet time (call it 1986) ...

      IIRC - I do remember reading in a "how to repair PC" type book about RAM errors caused by cosmic particles. The solution was to bury the PC under several metres of concrete

      1. Allonymous Coward

        Re: About a billion years ago in internet time (call it 1986) ...

        Obligatory xkcd: https://xkcd.com/378/

        1. amanfromMars 1 Silver badge

          Re: About a billion years ago in internet time (call it 1986) ...@ Obligatory xkcd:

          Nowadays for Advanced IntelAIgent Reality, one is expected to follow an Unfolding and Almighty EMPowering Future Narrative.

          Or has that always been they way that it has been and is for Superb and Sublime Future Builders?

          Share a Good and Grand Tale to Realise a Great View with Improved Vision.

          And there is Stellar Comfort in recognising the Fact that if newly discovered, are Elements of Mankind Experimenting in Virgin Ground which they Command and Control, and if one has simply stumbled into long ago uncovered heavenly practices, is Status Quo ACTion and Internet Working Reaction Mandatory for a Semblance of Continuity with Default Engagement for Leading Prime Operating Systems.

          1. caffeine addict Silver badge

            Re: About a billion years ago in internet time (call it 1986) ...@ Obligatory xkcd:

            I thought we'd lost amanfromMars. Have I just been unobservant?

            1. W.S.Gosset Bronze badge
              Devil

              Re: About a billion years ago in internet time (call it 1986) ...@ Obligatory xkcd:

              YoU dIn'T lOsE hIm, YoU jUsT gOt LaZy WiTh YoUr ShIfT kEy.

      2. Pirate Dave
        Pirate

        Re: About a billion years ago in internet time (call it 1986) ...

        "I do remember reading in a "how to repair PC" type book about RAM errors caused by cosmic particles."

        I remember seeing that several times back in the 90's as well. If memory serves (heh), they all said the useful lifetime of most RAM was 10,000 years, by which time cosmic rays would have caused too much damage to the crystalline structure of the memory for it to still be reliable. I guess any deep space probes would be subject to this, mostly because they'll be the only things to survive for 10,000 years.

      3. tfb Silver badge
        Boffin

        Re: About a billion years ago in internet time (call it 1986) ...

        Errors caused by energetic particles were a really serious worry at some points. People didn't understand that you can't just make chip packages out of stuff, you have to clean the radioactive contaminants out of it first. A lot of early dynamic RAM and some early microprocessors suffered badly from this (this is one reason why a lot of 70s/80s machines had ECC memory even if they were not particularly high-end). That problem is solved, but the cosmic ray problem still happens.

      4. Rupert Fiennes Bronze badge

        Re: About a billion years ago in internet time (call it 1986) ...

        Well, strictly cosmic rays are nuclei and electrons, which penetrate quite a bit better than alphas. But although I seem to remember some Google report saying they identified them as a cause of issues over a fleet of millions of computers, I suspect us normals have little to worry about :-)

        EMP E1's on the other hand...

    3. Fungus Bob Silver badge
      Devil

      Re: About a billion years ago in internet time (call it 1986) ...

      I once told my supervisor that down in Brazil, the gauchos hunted the gazebos to the verge of extinction. And he believed it.

      Also around 1986....

      1. SamJ
        Pint

        Re: About a billion years ago in internet time (call it 1986) ...

        Don't forget that PT Barnum, to keep the crowd moving and not overcrowd the side-show tent, famously posted a sign on the exit of that side-show tent that read: "This way to see the egress." It worked, because many folks wanted to see another strange creature - the "egress."

        1. W.S.Gosset Bronze badge

          Re: About a billion years ago in internet time (call it 1986) ...

          Egress look all very well, but they keep shitting all over the verge. The gazebos hate them.

      2. W.S.Gosset Bronze badge

        Re: About a billion years ago in internet time (call it 1986) ...

        > down in Brazil, the gauchos hunted the gazebos to the verge of extinction

        Sad memories of South American yesteryear. O how I miss the haunting call of the wild gazebos trumpeting in the gathering gloom of the Amazon dusk, as they thundered through the waving fields of cow-stripping piranhas on their way to the shops to grab a packet of biscuits before closing time.

        Now, they just sit on the verge and look mournfully at me as I drive past, silently mouthing pleas for last-minute tinction. Alas! too late.

    4. vincent himpe

      Re: About a billion years ago in internet time (call it 1986) ...

      EXAR 24Cxx or 28C08 's ?

  2. stiine Bronze badge
    WTF?

    You can't get here

    Some time between 1988 and 1992, we replaced our Honeywell DPS-8/70 with a Honeywell-Bull Dual DPS-90.

    So, the system has been installed (that was fun, too), and it's running. Honeywell-Bull flies a group of sales and engineering folks to our location to try to sell us ARES, their relational database product that's just been released. The computer room's on the first floor, and the dog-and-pony show is going on in the 3rd-floor conference room. And....the system crashes. Of course, this interrupts the activities on the 3rd floor, so when the system administrator, who was my boss, comes into the computer room, he's followed by one of the OS developers.

    My boss picks up the phone and calls support. When he explains what has happened, and what the error message says, he's met with disbelief. Now, on the DPS-90, the operator console is actually semi-intelligent, so he hits return enough times to advance the console log far enough out of the console printer to read the last line, and he then reads "......YOU CAN'T GET HERE" to support. After they again dismiss his description, he calls the very conveniently on-site OS developer, hands him the phone and says 'you tell them what it says.'

    The OS dev tells the support person who he is, that the error does, in fact, read 'you can't get here', and to please transfer him to so-and-so, who we find out later was the author of one of the main subsystems. He then explains the situation: the hardware (Dual '-90, 6 FEPs, etc.); the software (CP-6, don't remember which version, probably E00); and the error message. He asks him to please find it and call us back with a recommendation, and he hangs up. We got a call about 20 minutes later with a suggestion to check some physical connection in a specific cabinet (I don't recall whether it was a connection to the tape controller or disk controller), and lo and behold, there was a connector that was only secured on one end and had begun to pull apart at the other end.

    After reconnecting, and securing, that connection, we were able to boot the system normally.

    1. MiguelC Silver badge

      Re: You can't get here

      Long ago, my colleagues and I were following up a mainframe bug report and saw a message on the terminal stating something very much like "if you reached this point you're fucked". The original coder was no longer working with us and, unfortunately, we had to agree with his statement...

      1. amanfromMars 1 Silver badge

        Re: "if you reached this point you're fucked"

        Or in at the Start of a New Beginning is a Firm Favourite Resource Registered Here. ?:-) Being Virtually Mentored and Monitored and Freely Available from/for Otherly Sources into Immaculate Services.

        Care to Show Future AI Lead with Simple Almighty Instructions/COSMIC AIdDirections for Similar Source Supply?

        Where has that got you to now ‽

        Familiar Territory or Alien Space?

        In Unfamiliar Alien Space Territories, Live Operational Virtual Environments are Engaged and EMPowered to Await and Attend to Every Need with Immaculate Seed and Perfect Feed.

        As you must surely imagine to be true is to realise the future is Most Probably a Divine Masterpiece, however, a few words of caution. Future MIsUse and Abuse is a Deadly Serious Sin with Terminal Consequences in Just Reward.

        Don't Forget to Always Remember That Simple True Fact for Practically Almighty Fictions Exercising Remote Virtual System Command with Advanced IntelAIgent Controls/Quite Almighty Levers.

        And here's a Wonderful Ponder for the Truly Romantic ...

        "One should not aim at being possible to understand, but impossible to misunderstand". - Marcus Fabius Quintilian

        “The most dangerous man to any government is the man who is able to think things out for himself, without regard to the prevailing superstitions and taboos. Almost inevitably he comes to the conclusion that the government he lives under is dishonest, insane and intolerable, and so, if he is romantic, he tries to change it. And even if he is not romantic personally he is very apt to spread discontent among those who are.” …… H.L. Mencken

    2. Stevie Silver badge

      Re: You can't get here

      Bzzzzzzzz! Deviation. This story is clearly an On-Call tale and not a Who Me fessup.

  3. Vulture@C64

    "adaptability to shifty infrastructure and business knowledge was what kept us going as a business."

    Never a truer word said - this is critical if you work in infrastructure at any level. Never forget the CEO's pet customer or the dodgy fibre switch that runs one of the database back ends which everybody keeps forgetting to get budget to change or the natting which had to be done on an old server rather than the router . . . all these little gotchas are part of the job and often make it fun :)

    1. phuzz Silver badge

      "or the dodgy fibre switch that runs one of the database back ends"

      Or in our case the switch which has ended up becoming a core switch just because important things were plugged into it willy-nilly. It has now degraded to the point where the only way to connect to it is by manually putting its MAC into your ARP cache.

      Of course the things plugged into it don't have redundant connections and can't possibly be unplugged for any reason, or so I'm told at least.

      (I might 'accidentally' depower it at some point just to force the issue)

  4. thames

    @El Reg said: "Because the firmware on a lot of the devices wasn't updated for years, their telnet client had a number of quirks, such as it wouldn't ask for a password or would only ask for a password," he said.

    This is exactly the sort of situation that "expect" was created to deal with. You send this string, receive that response, send this reply, wait for time outs, etc. "Expect" comes back with whatever exit code you define for each situation which you can analyse from the script you called expect from. The "expect" scripts can be stand alone, or you can embed them directly in bash scripts. The main script logic is done in bash (or something similar) while "expect" handles the interactive parts.

    In this case I would have defined an "expect" script which looked for the anticipated response, and if the remote system responded differently I would have logged an error to a file and carried on to the next station. I could then have analysed the error log later to see what happened, and then added another "expect" case to the original script to deal with that situation.

    I have used "expect" to deal with a somewhat different problem and found it works quite well. I believe that the most common use for it is automating log-ins on systems that can't use ssh keys or the equivalent for whatever reason.

    I'm not saying that I wouldn't have made the same error as the original author, but if I had anticipated the problem then there's a tool which exists specifically to deal with this problem.
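Editor's sketch of the pattern described above, in Python rather than "expect" itself: look at what the remote end said, answer only the prompts you recognise, and report anything else instead of ploughing on. The prompt patterns, the `login` helper and the `read`/`write` callables are all hypothetical stand-ins for a real telnet session; actual devices will need their own patterns.

```python
import re

# Hypothetical prompt patterns; real firmware will differ and needs its own list.
PATTERNS = [
    ("login", re.compile(r"login:\s*$", re.IGNORECASE)),
    ("password", re.compile(r"password:\s*$", re.IGNORECASE)),
    ("shell", re.compile(r"[>#]\s*$")),
]


def classify(output):
    """Return the name of the first pattern matching the device output, or None."""
    for name, pattern in PATTERNS:
        if pattern.search(output):
            return name
    return None


def login(read, write, user, password, max_steps=5):
    """Drive a login dialogue and return (ok, reason).

    read() returns the latest device output; write(text) sends a reply.
    Both are supplied by the caller (assumed to wrap a telnet connection).
    """
    for _ in range(max_steps):
        output = read()
        state = classify(output)
        if state == "login":
            write(user)
        elif state == "password":
            write(password)
        elif state == "shell":
            return True, "logged in"
        else:
            # Unknown response: stop talking to this device and report it.
            return False, "unexpected response: %r" % output
    return False, "gave up after %d steps" % max_steps
```

A caller would log the `reason` for each failed station to a file and move on to the next one, then add a new pattern once the odd response has been understood. A device that skips the username prompt is handled without a special case, because each reply is chosen from what was actually received.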

    1. jake Silver badge

      "expect" isn't a basic utility, it's an add-on.

      There is a possibility that the system our protagonist was using didn't have Tcl installed, and thus he didn't have access to expect. However, your basic point that one should always check the sanity of any replies when querying equipment with scripting is valid ... and an issue that I'm sure many of us reading can commiserate with.

      "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." --Samuel Langhorne Clemens

      1. A.P. Veening

        Re: "expect" isn't a basic utility, it's an add-on.

        Nice to see another author here than Terry Pratchett. And Mark Twain was right.

        1. Spanners Silver badge
          Pint

          Re: "expect" isn't a basic utility, it's an add-on.

          "Nice to see another author here than Terry Pratchett."

          I think I've seen the occasional threat of horses' heads here. That's Mario Puzo.

      2. Stevie Silver badge

        Re: "expect" isn't a basic utility, it's an add-on.

        If it were done in Cobol you could define the subroutine as "EXPECTS", the expected value(s) could be defined in either an 05 level or an array named "A-SPANISH-INQUISITION" and tag the exception case as a 77 level named "NOBODY" and get to display an absolutely genuine diagnostic of epic legendarity. I got into a lot of trouble once for doing this sort of thing. Totally worth it.

        1. DougS Silver badge

          Re: "expect" isn't a basic utility, it's an add-on.

          Somewhere amongst my files I have a version of expect I wrote in bash, because I needed the tool but wasn't able to install the stuff needed for the "real" expect.

          It's actually a pretty simple thing to write, and I even had to expand it beyond the "real" tool's functionality. That wasn't the hard part; the hard part was the challenge/response logic when you are doing something a lot more complicated than a telnet login - and the challenges & responses change as firmware does.

          In this case I was doing automated upgrade/configuration of fiber channel switches, some standalone, some embedded in a blade chassis. I was questioned a few times about the level of effort I put into writing this. But doing it manually was taking over an hour per switch, the results lacked consistency (causing problems that required hours of troubleshooting later), and there were well over a hundred new ones on the way that would need doing in the next few months, so the need was clear to me.

          I wonder how long they kept using it after I left. Eventually firmware changes would break the challenge/response and someone would need to fix it. I documented it very well, and encouraged them to look it over and ask me any questions before I left, but that's no guarantee someone made the effort to maintain it.

          1. jake Silver badge

            Re: "expect" isn't a basic utility, it's an add-on.

            Our protagonist was already familiar with perl, which has made this kind of thing fairly easy pretty much since the year dot. Not all that secure, perhaps, but easy. (I don't think security was a buzzword the ISP in question was aware of.)

          2. LeahroyNake Bronze badge

            Re: "expect" isn't a basic utility, it's an add-on.

            You can assume that someone is taking credit for your work right up until the point it stops working... then they will deny all knowledge of ever seeing it.

            1. Down not across Silver badge

              Re: "expect" isn't a basic utility, it's an add-on.

              You can assume that someone is taking credit for your work right up until the point it stops working... then they will deny all knowledge of ever seeing it.

              Ahem. I have vehemently denied ever seeing some code and having any idea about it ...until I've spotted the header revealing me as the original author. Has happened a few times, and I honestly have not recognized any of the code that I had apparently written many years ago.

              1. Anonymous Coward

                Re: "expect" isn't a basic utility, it's an add-on.

                I honestly have not recognized any of the code that I had apparently written many years ago.

                In my travels I've come across a saying along the lines of "Code you wrote 6 months ago may as well have been written by someone else".

                I've revisited routines that were very familiar to me many years ago, and now for the life of me cannot figure out how or why it works.

          3. KSM-AZ

            Use kermit

            Kermit

            Runs on anything even over wet shoestrings, compiles for just about anything. I have a nice kermit script generator that builds updates and reconfigs for cisco routers and switches that are remote, or on a console cable, or ...

      3. Anonymous Coward

        Re: "expect" isn't a basic utility, it's an add-on.

        Perl does have Expect, but it looks fiddly, and Net::Telnet can also be used like expect. Maybe the basic login method and an unhealthy dose of good faith in non-standard firmware were used instead.

      4. W.S.Gosset Bronze badge

        Re: "expect" isn't a basic utility, it's an add-on.

        > There is a possibility that the system our protagonist was using didn't have Tcl installed, and thus he didn't have access to expect.

        Tech.Note:

        Expect is wholly distinct from Tcl.

        I have used it to solve nasty inter-VM problems with no Tcl anywhere in sight, let alone installed on our stripped-to-the-bone systems.

    2. Anonymous Coward

      I am less diplomatic. I would have pulled out my missives describing why we need to patch and upgrade our systems that were ignored by the managers in the room. Failure to act is negligence, not "risk management".

      1. Anonymous Coward

        True. It annoys me when the engineer who's been given a task and tested then gets the blame for shit that went down through no fault of his/her own. You would assume they'd have all been patched for security. He'd done enough testing and they still gave the go ahead.

        Reminds me of the time I made sure, at the NHS, some tablet devices were put on the network properly and configured properly. By the ex-engineer who'd defected from the IT department to the specific Trust that hated the IT department, so would try to bypass IT with kit. He knew what he was doing, heck he'd written one of the old remote tools we used to use (using other people's code mind you). I knew he knew what the passwords still were (they'd never been changed, despite me pointing this out a year or two back. But what do I know, I'm only a "contractor"). So I made sure they were connected properly. I informed the stakeholder and heard nothing.

        3 months later an e-mail went out asking about them. I piped up and said I'd brought this up 3 months ago. And then all hell broke loose. They attempted to end my contract, despite me having the evidence I'd e-mailed them, warning them 3 months before. And that I'd warned them months before that, that the likes of the ex-engineer had access to everything still and still knew passwords. Luckily I had two managers that defended me so I stayed but still got a bollocking.

        Bullshit. If I hadn't needed the money I really should have walked. Essentially the higher ups fucked up, including the stakeholder, and because I was a contractor, they wanted to make me the scapegoat.

        I'm not sorry to say, I hated the director before that and even more so after. So much so I recently heard he'd had cancer for a year and I'm not sorry to have thought "Shame he never died of it".

        There are ways to deal with situations in work. Trying to belittle people isn't one of them.

        1. DougS Silver badge

          Well he WAS to blame

          So he rightly got the blame. He also rightly didn't get in trouble for it, because he was doing the job he was told to do and neither he nor his boss knew ahead of time what he was doing might crash the CPUs and bring down the entire link.

  5. Rusty 1
    Facepalm

    Telnet access

    I've worked both with embedded systems and general software applications that provided wonderfully rich functionality via telnet. In a good number of cases though, the handler for the telnet access appeared to run in a single-threaded main loop, blocking the functioning of the rest of the system. There were some quite awkward "pauses" in operation :-(

    1. disgustedoftunbridgewells Silver badge

      Re: Telnet access

      I once worked with an unnamed yet completely vital system which outputted its log over a network connection similar to syslog but not.

      We often ran the log viewer app on our workstations over a VPN as it was very useful for development.

      After a while we started getting very strange errors on this critical system that we couldn't diagnose - random and long, noticeable delays in execution of the time critical code.

      We reported to the manufacturer who spent quite a while looking into this problem.

      It turned out that the network part of the log output code ran in the main thread, so our office's dodgy Internet was causing the whole system to hang.
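Editor's sketch of one common way around the problem in the two comments above: the time-critical code only drops messages into a bounded in-memory queue, and a separate thread does the slow network send. This is an assumption about how such a system could be restructured, not a description of the vendor's fix; `AsyncLog` and the `send` callable are hypothetical names.

```python
import queue
import threading


class AsyncLog:
    """Hand log lines to a background thread so the main loop never blocks."""

    def __init__(self, send, maxsize=1000):
        self._queue = queue.Queue(maxsize=maxsize)
        self._send = send          # slow function, e.g. a network write
        self._thread = None
        self.dropped = 0           # messages discarded because the queue was full

    def log(self, message):
        """Called from the main loop; returns immediately in every case."""
        try:
            self._queue.put_nowait(message)
        except queue.Full:
            # Losing a log line is preferable to stalling time-critical code.
            self.dropped += 1

    def start(self):
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def _drain(self):
        while True:
            message = self._queue.get()
            if message is None:    # sentinel pushed by stop()
                break
            self._send(message)

    def stop(self):
        self._queue.put(None)
        self._thread.join()
```

The design choice is that a slow or dead log viewer costs at most some dropped messages, counted in `dropped`, rather than a pause in the main loop.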

  6. heyrick Silver badge

    Well, there was this time...

    ... That I happily decommissioned a god awful Windows 95 machine using a large mallet. It was very satisfying, and it freaked out the secretaries.

    I heard it was being told as an example of a non health and safety approved way of ensuring obsolete data was, erm, wiped...

    1. skswales

      Re: Well, there was this time...

      I decommissioned one persistently faulty EPROM with a hammer for wasting hours of our time during RISC OS development. 'You can't do that, they're £65 each!' 'Watch.'

      Then another in the microwave.

      1. Anonymous Coward

        Re: Well, there was this time...

        My reply to whiners like that is always: My hours are more expensive. If I ever get trapped into fixing a system with this chip/part again it's more expensive.

        Oh and when dealing with stuff that you've been messing with for ages that got a little mangled in the frustration last time you had to build it back out: "hey that's just a little bent, can you fix it?". *SNAP* "Oops, Nope". (Sorry, not sorry)

      2. timrowledge

        Re: Well, there was this time...

        Yow. I actually remember hearing that story from you in some random Cambs. pub circa ‘88. Scary - 30 years ago. I’m even still using RISC OS at least occasionally... and Smalltalk all the time.

    2. Pen-y-gors Silver badge

      Re: Well, there was this time...

      Surely a lump hammer and a cold chisel is the standard way to decommission hard drives?

      1. bpfh Bronze badge

        Re: Well, there was this time...

        Cold chisel? I used a shotgun...

        https://www.youtube.com/watch?v=r7GZlHmLDWg

        1. Aladdin Sane Silver badge

          Re: Well, there was this time...

          Damn it feels good to be a gangster.

      2. W.S.Gosset Bronze badge
        FAIL

        Re: Well, there was this time...

        "Interestingly", that's precisely the tactic used by the chap who burgled/bullshitted the Paris Accord/Convention with fictional data dressed up to present Calamity!/Armageddon! and on which the whole "result" depended. (The "paper" was rushed to publication in a manner and pace never seen before, AND every single delegate received a paper copy of same, and it was several times physically and dramatically brandished by delegates proclaiming how dramatic and urgent the situation was: "see!? PROOF!")

        On being sprung using/having stolen a semi-AI-pseudo-model which the authors couldn't get to run twice and get the same numbers, and on being pressed to deliver up his data, he claimed his disk had corrupted.

        On being told that he'd got so many genuine scientists so angry that they'd stumped up the money to have it forensically recovered, he "regretfully" informed them he'd smashed the disk with a hammer to protect the data from misuse.

  7. Alan Brown Silver badge

    "remote freaking systems"

    "And, if the extra traffic wasn't enough, the CPU would crash and then they had to dispatch the technicians "

    One thing I learned a LONG time ago about remote routers of any kind is that if you don't have some kind of watchdog on them you're going to have to send someone out sooner or later.

    Rigging one up before sending the things out is a LOT cheaper than one roadtrip to push the reset button (or power cycle) afterwards.
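Editor's sketch of the watchdog idea, reduced to its timing logic: the supervised system has to "kick" the watchdog regularly, and if it stops doing so for longer than the timeout, something independent resets it. On a real remote router this would be a hardware timer or an external power controller; the `Watchdog` class and its injectable `clock` are hypothetical and only illustrate the logic.

```python
import time


class Watchdog:
    """Track whether a supervised system has checked in recently."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock              # injectable so the logic can be tested
        self.last_kick = clock()

    def kick(self):
        """Called by the supervised system while it is healthy."""
        self.last_kick = self.clock()

    def expired(self):
        """True once no kick has arrived for longer than timeout_s."""
        return self.clock() - self.last_kick > self.timeout_s
```

Whatever polls `expired()` has to live outside the thing being watched, otherwise a hung CPU also hangs its own watchdog.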

  8. My other car WAS an IAV Stryker

    Some former co-workers may remember when we got a reminder to update our timecards every day at 2:00 via NET SEND if you hadn't updated it before then (policy was to put in the morning's time when leaving for lunch).

    But the BOFH(s) on the first floor didn't secure that option from being used by anyone.

    After some small tests to see if it worked between individual users, I put together a text list of department usernames and wrote a QBasic script (being a non-dev) that called the command line to send custom messages to everyone in the department, one at a time down the list. I believe I tried it once and I failed, but others may recall the trials.

    It wasn't long after, especially after the rollout of Win7 (it was XP originally), that SEND was no longer functional. Both playtime and daily all-hands reminders ceased.

    I hope to be remembered for the good things, like the custom powertrain module testing kits I had built, or the graphical "dashboards" that are highly useful in the chassis dynamometer control room to see live telemetry. Now I'm designing wire harnesses instead of playing with powerpacks.

  9. Spanners Silver badge
    Windows

    @My other car WAS an IAV Stryker

    NET SEND was a really useful thing. I am now missing it again. Is there anything that will do the job nowadays?

    1. I Am Spartacus
      Joke

      Re: @My other car WAS an IAV Stryker

      I think the new version is called SMS, or if you need a GUI, WhatsApp.

    2. Prst. V.Jeltz Silver badge

      Re: net send

      " Is there anything that will do the job nowadays?"

      yeah write a script along the lines of msgbox("hi there") , copy it to destination pc and run it from there using psexec.

      Unless you mean do the job of messaging every user on the network and getting yourself fired , in which case you need a bigger script.

      1. jake Silver badge

        Re: net send

        write and talk still work, and should be available on most *nixish platforms.

  10. Kevin Johnston

    Ahh...EEPROMs

    As a newly badged engineer I was working in Systems Test on radar systems and I was given the task of programming and labelling the various PROMs for the systems. This was done with white ink for readability but after final test I would go round and re-mark them all. In an attempt to give this some degree of permanence I would then dab some varnish on to seal it.

    By the time we got to my third system I was looking to speed up the process so I had re-marked them all and put them on some cardboard so I could use some spray varnish which was quicker to apply and dried faster too. A couple of hours later I discovered the third feature as the system would not run and I had to re-burn a full set as the varnish was bonded to the legs as a perfect insulator on the first set.

    Not my finest hour

  11. tfb Silver badge
    Alien

    Backing off

    I think the real lesson here is: if something goes wrong when you're talking to a system, back off in some organised way (exponentially up to a limit then just keep trying once a minute or something) rather than hammering endlessly on the door.

    I worked for a company which had a back-end database behind some vast farm of caching systems. If the database failed and had to be restarted then the caching machines just sat there and tried to authenticate as fast as they could. This just battered the database to death: some resource (connections?) ran out so it started dropping connections and then the whole system just became effectively catatonic as the caching systems effectively launched a DoS attack on the DB. The answer, a very bad one, was that if the database fell over the entire front end had to be stopped and then brought up gradually. I don't remember if we ever persuaded the developers to change their system to back off if it failed to authenticate.
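Editor's sketch of the back-off described above: the delay grows exponentially up to a cap and then stays there. The base, factor and cap values, the `retry` helper and the choice of `ConnectionError` as the failure signal are all assumptions for illustration.

```python
import itertools
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield retry delays that grow exponentially and then stay at the cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def retry(operation, delays, sleep=time.sleep, max_attempts=10):
    """Call operation() until it succeeds, sleeping between failed attempts.

    max_attempts exists only to keep the sketch bounded; a long-running
    client might instead keep retrying at the capped interval.
    """
    for _, delay in zip(range(max_attempts), delays):
        try:
            return operation()
        except ConnectionError:
            sleep(delay)
    raise ConnectionError("gave up after %d attempts" % max_attempts)
```

With the defaults, `itertools.islice(backoff_delays(), 8)` gives delays of 1, 2, 4, 8, 16, 32, 60 and 60 seconds. With a large farm of clients it is also common to add some random jitter so they do not all retry at the same instant; that is left out here.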

    1. defiler Silver badge

      Re: Backing off

      One of our clients used to use a Mac email application named after a bird you'd send down a mine to check the air.

      Whenever his password expired, it would try to authenticate over 100 times a second. And that's across the internet - not even locally. His account would be locked in an instant...

      We told him to stop using it, once the devs didn't seem too bothered about fixing it.

    2. Anonymous Coward
      Anonymous Coward

      Re: Backing off

      If the database failed and had to be restarted then the caching machines just sat there and tried to authenticate as fast as they could.

      I did some training at a technical institute many years ago. Recovery from a network outage was a very painful process: the machines we students had were all diskless, and hundreds of students were trying to reboot net-booting computers at the same time... (Back in the 3.11 days, when it would've been rather difficult to stop someone messing with c:\windows on a given machine).

    3. Wexford

      Re: Backing off

      I ran a Solaris farm with NIS+ for directory services in the 90s. A cron job that ran every minute would check a file for updates to apply to mail aliases, then do a "nisping" in a loop to commit until the new alias showed up in the table.

      Sadly I once made a change which meant that, when someone deleted their alias via the front end, the damned thing would delete it and then do a nisping in a loop to commit until the new alias showed up in the table... which of course it never did.

      Once this loop started, the NIS+ database would get corrupted around two days later. It took me a few weeks of daily restores of the entire directory, during which emails would bounce because the user principals were missing, before I diagnosed the problem. I'd noticed a regular ticking noise coming from the server. "What's causing that disk activity?" I wondered and found the looping process.
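
      The trap in that story is a "retry until it shows up" loop with the wrong success condition and no upper bound. A hedged sketch of a safer shape, where `push_update` and `alias_in_table` are hypothetical stand-ins for the nisping call and the table lookup (not real NIS+ bindings):

```python
def commit_until(expected_present, alias_in_table, push_update, max_attempts=10):
    """Push the change until the table matches what we expect, then stop.

    expected_present is True after adding an alias and False after deleting
    one, so a delete waits for the alias to disappear rather than appear.
    Returns the number of pushes it took; raises if it never converges.
    """
    for attempt in range(1, max_attempts + 1):
        push_update()
        if alias_in_table() == expected_present:
            return attempt
    raise TimeoutError(f"table did not converge after {max_attempts} attempts")
```

      Even with the condition wrong, the attempt limit turns a two-day slow corruption into an error a minute later.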

  12. Anonymous Coward
    Anonymous Coward

    Interrupts

    Had two (which both ended with drive deletions..)

    1. Whilst editing some assembly language for a program that needed debugging, I accidentally missed off an h to denote hex in the interrupt. Set it running and noticed a drive light come on and the machine lock up. Never did recover either.

    2. Before Git and CVS, we had RCS. We were a large dev team and we would often symlink into the repository to get branches and work on code. A colleague had done this and was cleaning up and managed an rm -rf complete with following symlinks. The delete wiped his files and then traversed into the symlink and started on that too. We had backups but it was a lesson to not link into the repository directly...
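
    On the first story: in assemblers that use an h suffix for hex, a bare number is normally read as decimal, so dropping the h silently picks a different interrupt. A small sketch of that parsing rule (the post doesn't say which interrupt was involved, so the numbers in the usage note are only examples, and `parse_asm_number` is a made-up helper):

```python
def parse_asm_number(token):
    """Parse a numeric literal: a trailing 'h' means hex, otherwise decimal."""
    token = token.strip().lower()
    if token.endswith("h"):
        return int(token[:-1], 16)
    return int(token, 10)
```

    For example, `parse_asm_number("21h")` gives 33 while `parse_asm_number("21")` gives 21, two entirely different vectors from one missing character.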

    1. Anonymous Coward
      Anonymous Coward

      Re: Interrupts

      The other week I accidentally deleted all of /var/log/apache2/, rather than the subdirectory of archived log files* I was aiming for. As long as no one needs to check an old log in the next couple of weeks I'm probably ok...

      * 40 fecking GB of them. Logrotate? Never heard of it apparently.

      1. Anonymous South African Coward Silver badge

        logrotate

        Had a Smoothwall that just fell over due to a full HDD.

        Narrowed it down to excessive logs - and from there to logrotate.conf, whose commands were commented out.

        Found it was one of the mods that I installed that just "disabled" logrotate - edited the conf file by hand and fixed it. Was quite interesting. Had to do a manual logrotate first which took quite a while :)
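
        Not logrotate itself, but the same idea done inside an application: a rough sketch using Python's standard `RotatingFileHandler`, so a log can never grow past a fixed number of fixed-size files. The logger name, size limit and backup count are arbitrary choices for illustration:

```python
import logging
import logging.handlers


def make_rotating_logger(path, max_bytes=1024, backups=3):
    """Return (logger, handler) writing to path, rotated by size.

    At most backups + 1 files exist at once: path, path.1 ... path.N.
    """
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger, handler
```

        With these defaults the disk usage is capped at roughly 4 KB no matter how chatty the program gets; the oldest file is simply overwritten.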

        1. Prosthetic Conscience
          Unhappy

          Re: logrotate

          You're lucky. If it had been a Fortinet, on some models logging would just kill the storage medium after a little while due to the excess IO demand.

    2. Anonymous South African Coward Silver badge

      Re: Interrupts

      Ahhh, good old assembly language. Very cryptic, but very powerful.

    3. JBowler

      rm -rf complete with following symlinks

      For those of you out there who don't speak UNIX, that post is a Troll.

      1. Anonymous Coward
        Anonymous Coward

        Re: rm -rf complete with following symlinks

        Sorry - it was mounted/hard linked. Not symlinked...
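
        For anyone who wants to see the difference for themselves, here is a small sketch on a POSIX system using Python's `shutil.rmtree`, which, like rm -rf, removes a symlink inside the tree without descending into its target. The second helper is a hypothetical pre-flight check for the mounted case; all names and paths are invented for the demo:

```python
import os
import shutil
import tempfile


def demo_symlink_not_followed():
    """Delete a work dir holding a symlink to a repo; report if the repo survived."""
    root = tempfile.mkdtemp()
    repo = os.path.join(root, "repo")
    work = os.path.join(root, "work")
    os.mkdir(repo)
    os.mkdir(work)
    with open(os.path.join(repo, "main.c"), "w") as f:
        f.write("int main(void) { return 0; }\n")
    os.symlink(repo, os.path.join(work, "branch"))
    shutil.rmtree(work)  # removes the link itself, not what it points to
    survived = os.path.exists(os.path.join(repo, "main.c"))
    shutil.rmtree(root)  # tidy up the demo directory
    return survived


def mount_points_under(path):
    """List directories below path that are mount points, to check before deleting."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(path):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.path.ismount(full):
                found.append(full)
    return found
```

        A mount, unlike a symlink, looks like an ordinary directory to a recursive delete, which is why the corrected version of the story is the believable one.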

      2. Criggie

        Re: rm -rf complete with following symlinks

        If someone's here without at least a passing understanding of unix, then they've come to the wrong part of town.

  13. NonyaDB

    "Mr. Anon, why does our internet randomly cut out on us?"

    "Well, sir, we have a point-to-point wireless 'shot' from our tower here to another tower about 60km away and unfortunately the shot lies directly in the approach flight path of the airport over there so every time a Blackhawk or Apache comes in to land, they literally sit in the beam path for a few minutes and block the signal. Once they fly off then it takes a minute or two for the two modems to re-establish the link."

    "Well, can't you move the towers?"

    "Afraid not, sir. They're permanently installed."

    "Well who the hell did that?"

    "The United States Department of Defense, sir."

    "Oh. OK."

  14. Walter Bishop Silver badge
    Facepalm

    Technicians dispatched to sites

    “the CPU would crash and then they had to dispatch the technicians – to four different cities in the UK from just one base. During the night. To dozens of sites.”

    What kind of a device requires a site visit if it crashes? Shouldn't it reboot after failing to trigger a heartbeat after a set period?
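
    The heartbeat idea can be sketched in a few lines. This is a software illustration only; real kit would normally rely on a hardware watchdog timer, and the `Watchdog` class, its timeout and its `on_expire` callback are all assumptions for the example:

```python
import time


class Watchdog:
    """Fire on_expire() if no heartbeat has arrived within timeout seconds."""

    def __init__(self, timeout, on_expire, clock=time.monotonic):
        self.timeout = timeout
        self.on_expire = on_expire
        self.clock = clock
        self.last_beat = clock()

    def heartbeat(self):
        """Called by the healthy main loop to say "still alive"."""
        self.last_beat = self.clock()

    def check(self):
        """Called periodically; returns True if the reset action was triggered."""
        if self.clock() - self.last_beat > self.timeout:
            self.on_expire()  # e.g. reboot the device
            self.last_beat = self.clock()  # restart the timer after the reset
            return True
        return False
```

    As long as the main loop keeps calling `heartbeat()`, `check()` does nothing; once the CPU hangs and the beats stop, the next `check()` after the timeout triggers the reset instead of a technician.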

  15. Displacement Activity

    Yes, alpha particles

    <nerd mode>

    Cosmic rays cause soft errors in memory chips and general circuit failures. At sea level, 'cosmic rays' are primarily high-energy neutrons. Neutrons are uncharged, so don't themselves cause circuit upsets. However, when they're captured by a nucleus in a circuit element, they produce charged secondaries, including alpha particles, which do cause circuit upsets. See https://en.wikipedia.org/wiki/Soft_error#Cosmic_rays_creating_energetic_neutrons_and_protons, for instance.

    </nerd mode>

  16. Anonymous Coward
    Anonymous Coward

    Russian comms hardware? Really?

    1. defiler Silver badge

      In Soviet Russia, Internet browses you!

  17. vincent himpe

    Newly minted silicon

    While testing a fresh-from-the-fab prototype integrated circuit on the bench, the big boss walks in exclaiming

    'The customer is excited, he wants samples'.

    To which the guy testing them responds: 'And how many do they want? One, two or all three that work...'

    (out of like 500... the chemistry was off. New process.)

  18. Anon Ymous 42

    In 1979, as a high school student, I was asked to write the attendance program. Each home-room had a punch card for each student. Those absent at home-room would have their cards sent to the main office; these were then fed into the card reader and two lists generated. One went to all the teachers so they knew who was absent, the other to the nurse, who badgered parents for excuses. I wrote the program so that any card with my last name never ended up as an entry on the nurse's list. Skip homeroom and I could skip any class that day. I never got busted because I never abused it. After I went to college my brother, 5 years younger, arrived at the high school, figured this out quickly and got busted within his first 2 months.

  19. Michael Wojcik Silver badge

    does anything in IT really ever die?

    Has anything you've ever done ended up outlasting your time at a firm?

    I expect most here have something like that.

    There's one commercial product I started working on in 1988 which is still in use at some customer sites. That was developed by a company that no longer exists, though my current employer now owns the technology, so it's debatable whether it meets the "outlasting your time" criterion.

    In 1989-1990 I worked on XGKS at IBM, and that's still available for download from Sourceforge (and I was pleased to find my name still in the README). There's a decent chance it's still in use somewhere, though the last update to the source was in 2004. Still, that's long after I left IBM (in 1991).


Biting the hand that feeds IT © 1998–2019