Machine-learning boffins 'summon demons' in AI to find exploitable bugs

Amid all the hype around AI, it’s easy to sing the praises of machine learning without realizing how easily such systems can be exploited. As governments, businesses, and hospitals begin to explore the use of machine learning for data analysis and decision making, it’s important to bolster security. In a paper [PDF] …

  1. Lee D Silver badge

    The biggest problem with any kind of ML or AI. Unverifiability.

    The reason we used machines in the first place was to give us answers we were certain were correct, not subject to human error or interpretation or carelessness or exhaustion. Elimination of errors past the problem-input phase means that we can even use computers for mathematical proofs, which is the highest rigour of application.

    But ML or AI (which STILL DOESN'T EXIST) - we have absolutely no idea how it arrived at the answer, if the answer is correct (without verifying it against some other more rigorous system), or whether the answer will still be correct once we plug in different starting conditions or change the problem slightly. We are literally clueless.

    So when it comes to security, where people deliberately feed invalid, out-of-bounds, taxing inputs into those same systems and we're expected to predict or bound the results, we stand absolutely no chance.

    Things like ML have a place, but that place is in providing an answer that you can accept is sometimes incorrect. It's almost a form of analogue computing or fuzzy logic. Their place isn't in anything you care about, anything important, anything where inputs are untested and unbounded, or anything that you can't allow to go wrong or which you might need to "tweak" later to account for such.

    Great for the Kinect guessing whether you've made your dance move right or not. But now consider how to view Tesla's Autopilot and similar systems.

    Training anything - dog, cat, AI - on input alone and then "certifying" them for a particular job after a lot of input is ridiculous. Because when they hit unexpected input (which, by definition, is anything you haven't trained them on), their actions are unpredictable. It's why a vast chunk of most modern programs is nothing more than checking inputs, handling exceptions and overflows, and bailing out if things aren't as you expected.

    When you start looking at security, that chunk gets bigger and bigger and bigger and even the humans make mistakes because they DIDN'T SEE an attack vector when the program was written. AI isn't going to change that, it's just going to make it worse by being unpatchable (because we don't understand what it's actually doing, and certainly can't change JUST that bit of their behaviour) and unpredictable even if it appears to pass all the tests.

    There is literally nothing to stop a ML or AI agent from suddenly throwing out a completely random answer purely because the input wasn't in its training, or wasn't in the same kind of pattern as in its training.

    1. Filippo Silver badge

      Very true; "AI" is fundamentally different from traditional software engineering and should never be used to attempt to solve the same problems. You can never rely on the answer of an "AI" system in the same way that you can rely on the answer of a classical algorithm. And this is not something that can be fixed; it's a fundamental property of how such systems work.

      That said, the "AI" system should at least not crash outright or allow arbitrary code execution when encountering weird input. Those are traditional bugs and should be fixed as such.

    2. Little Mouse

      unverifiability

      I'm pretty sure Asimov described this problem with his positronic brains which were orders of magnitude too complex for humans to fully understand.

      1. Destroy All Monsters Silver badge
        Holmes

        Unverifiability: Welcome to the Real World, where things are more complex than they seem.

        That's why we have statistics.

        A book from 1997:

        Empirical Methods for Artificial Intelligence

        ...it's actually a book about statistics and running experiments on your new AI-gorithm to see whether it actually is as good as you think it is.

        (Also look out for the Black Swan, i.e. the falling-down-the-stairs-HAL-9000-level breakdown that you didn't even know was coming)

        1. Charles 9

          Re: Unverifiability: Welcome to the Real World, where things are more complex than they seem.

          Only thing is, statistics are even less truthful than damned lies...

    3. Anonymous Coward
      Meh

      Untrustworthy safety critical AI application

      The biggest problem with any kind of ML or AI. Unverifiability.

      Which would seem to be a bit of a showstopper in many of the safety critical applications that it is being touted for, like self-drive cars and some medical uses.

      1. Pascal Monett Silver badge

        @ Smooth Newt

        "Which would seem to be a bit of a showstopper [..] for, like self-drive cars"

        I think we're going to see about that in the years to come. Then again, I'm against calling that AI. It's just reams of code developed for a specific purpose. Highly complex code to be sure, with a boatload of requirements such as we have never seen before, but specialized code nonetheless. You won't be able to put it up against Kasparov in a game of chess, which is something a true AI could do.

    4. The Man Who Fell To Earth Silver badge
      Mushroom

      AI Mental Illness

      The dirty little secret that no one in the AI community wants to talk about is AI mental illness. People now just flush their AI if they think it is going off the rails. But just as with humans, the process of going off the rails can take years and be so gradual as to be imperceptible until it's too late. And don't think that biological neural systems are magically less deterministic than silicon ones.

      1. amanfromMars 1 Silver badge

        Re: AI Mental Illness's LoveChild and Deep Dark Web Country Cousin

        The dirty little secret that no one in the AI community wants to talk about is AI mental illness. .... The Man Who Fell To Earth

        That is as may be, TMWFTE, however the priceless gem mine that a SMARTR AI keeps secret for stealthy secure silent sale of its wares, is Total Information Awareness with which IT can control and direct the madness for/in the insane?

        Methinks many would tout that ability with facilities, a touch of pure genius.

        Who was it who first said .... "If you can't beat them, join them and play a smarter game to win win in a zero sum game and/or loser environment"

        And these quotes are APT ACTive IT relative too .....“You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.”― R. Buckminster Fuller

        “The most dangerous man, to any government, is the man who is able to think things out for himself…Almost inevitably, he comes to the conclusion that the government he lives under is dishonest, insane, and intolerable.”—H.L. Mencken, American journalist

        O divine art of subtlety and secrecy!

        Through you we learn to be invisible, through you inaudible;

        and hence hold the enemy's fate in our hands. -- Sun Tzu, The Art of War, c. 500bc

    5. Anonymous Coward
      Anonymous Coward

      ML wins when it's better than a human. If it gets stuff wrong occasionally it doesn't matter, because humans make mistakes and physical parts fail.

      Unfortunately, irrational human attitudes might get in the way of implementation.

      1. find users who cut cat tail

        > If it gets stuff wrong occasionally it doesn't matter, because humans make mistakes and physical parts fail.

        It matters unless ‘the machine said so’ ceases to be a universal excuse not only for arbitrarily stupid decisions, but also for their strict enforcement with no possibility to appeal or negotiate.

        Humans make lots of mistakes, so they are aware they can make mistakes (with some notable exceptions). Someday the AI that does not exist yet (TAITDNEY) may be advanced enough that you can explain its mistake to it when it makes one. But currently, those trying to replace humans with AIs seem to be doing so *because* it lets them avoid that possibility.

    6. Kristaps

      Here are some of my ramblings - why think of ML as software and not as of a simple mind? When you learn to drive, certain neural structures are created in your brain, their connection weights adjusted and eventually it's robust enough that you can drive safely, though still susceptible to illness, seizures, tiredness and other biological factors (let's ignore things like other drivers being tits). Can you verify these neural structures when you teach a person to drive? A poorly trained "AI" (please excuse my careless use of this term for the sake of simplicity) will suffer from conceptually similar problems.

      As soon as the given AI can drive at least as safely as an average human, it should be ok to use it in a self-driving car. There's still a chance for it to crash, but so is there when a human is driving. All that you require of the AI is sufficient complexity and learning experience. (You might think that say a chimp's brain is incredibly complex compared to our most powerful computers yet they can't drive a car, but then again a lot of their available computational resources are used for other processes whereas the aforementioned AI can have its sole purpose to be the driving of a particular car).

      1. Charles 9

        "As soon as the given AI can drive at least as safely as an average human, it should be ok to use it in a self-driving car."

        But there's a catch. AIs don't learn the same way we do, and in fact we don't always understand HOW we as humans ourselves learn. For example, there's the concept of intuition: the stuff we learn SUBconsciously, like the very subtle difference between a normal person and a suspicious one, between a car likely to stop and one likely to run the red light, the tells that a huge tree branch is going to fall in my path and I need to get out of the way BEFORE it actually falls (or it'll be too late), or perhaps the hints that the jerk in the corner is just trolling with the self-driving car that can't afford to risk hitting a pedestrian given half the chance. Since we don't know how we ourselves pick up on these subconscious hints, we have no way to teach them to an AI, so it doesn't learn those subtle things that can help prevent accidents without our even thinking about them. If you look up "self-driving cars intuition" you should probably find a few articles that wonder the same thing.

    7. Orv Silver badge

      One of my darker suspicions is that unverifiability is the goal in some applications. You can't put anyone in jail for a decision if they can shrug and say, "the computer did it, and it's a black box."

    8. LionelB Silver badge

      The biggest problem with any kind of ML or AI. Unverifiability.

      A bit like Human L or Human I, then. We definitely shouldn't let humans do anything as risky as driving cars.

      There is literally nothing to stop a ML or AI agent human from suddenly throwing out a completely random answer purely because the input wasn't in its training, or wasn't in the same kind of pattern as in its training.

      Just that.

      1. Charles 9

        There is literally nothing to stop a human from suddenly throwing out a completely random answer purely because he didn't care about the input and just wanted to be a jerk.

        ex. "What's 2 + 2?" - "Gynecology"

  2. David O'Rourke

    Source code not required

    Actually you don't even need access to the source code. Researchers have shown that they can deliberately produce false results from a variety of ML algorithms whilst treating them as black boxes.

    1. Anonymous Coward
      Anonymous Coward

      Re: they can deliberately produce false results

      I think you can get that with people as well.

      1. James Loughner
        Big Brother

        Re: they can deliberately produce false results

        As has been proven by the US election

    2. Lee D Silver badge

      Re: Source code not required

      ML is vulnerable to fuzz-testing as much as human-written code. That's not surprising.

      In fact, some of the best ways to find bugs are by not trying to read the code (which relies on certain assumptions that may not hold in reality, e.g. Rowhammer, compiler constraints, etc.) but by just throwing random but vaguely-valid input at everything you possibly can and seeing if there are any unintentional side-effects.
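
      Something like this toy fuzz loop, purely illustrative (target() here is a made-up stand-in for whatever parser or model you're shaking):

        # Dumb fuzzer: hammer a stand-in target with random but vaguely-valid
        # input and record anything that blows up in an unexpected way.
        import random, string, traceback

        def target(data):
            # hypothetical placeholder for the code or model under test
            return int(data)        # pretend it expects a number

        def random_input(max_len=32):
            alphabet = string.digits + string.ascii_letters + string.punctuation + " "
            return "".join(random.choice(alphabet)
                           for _ in range(random.randint(0, max_len)))

        crashes = []
        for _ in range(10000):
            data = random_input()
            try:
                target(data)
            except ValueError:
                pass                # clean rejection of bad input is fine
            except Exception:
                crashes.append((data, traceback.format_exc()))

        print(len(crashes), "unexpected failures")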

  3. John Smith 19 Gold badge
    Unhappy

    Over the years people have done AI projects in software development.

    I always wondered why the first system they set their new tool to chew on wasn't a copy of itself.

    Self improving?

    1. Lee D Silver badge

      Re: Over the years people have done AI projects in software development.

      Do you mean you don't understand why they don't do that?

      Because the results are generally slow and meaningless. There's no "AI" as you might think it. That just doesn't exist.

      Take genetic algorithms as an example - literally you pitch a load of algorithms against each other, see one which is closest to what you want, and then "breed" from it to put similar code into another generation of algorithms, that you pitch against each other and so on.

      Thousands of generations later, you get something that can do something really quite basic with a very basic level of repeatability. Generally speaking it gets *better* with each generation (not entirely true: sometimes it quite clearly goes backwards!), but it never gets to a point where it's in any way infallible, or quicker than humanly steering it (heuristics), or reliable.

      The big thing in GA is the selection criteria - how do you know who did best, how many of those do you breed from, what kind of breeding crossover size do you allow, etc. If you apply a GA to that, it gets even worse. It's basically a blind, random search. Sure, given a few million years of execution, it might end up somewhere but all you've done is add complexity and increased the time it takes to do anything by an order of magnitude.
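
      For the avoidance of doubt, that's all a GA is. A toy sketch (illustrative only, with the target, population size and crossover picked arbitrarily):

        # Toy genetic algorithm: evolve a bit-string towards a target pattern.
        # The fitness function, the number of parents kept and the crossover
        # point ARE the selection criteria discussed above.
        import random

        TARGET = [1] * 32                 # arbitrary goal: all ones
        POP, GENS, MUTATION = 100, 200, 0.01

        def fitness(genome):
            return sum(g == t for g, t in zip(genome, TARGET))

        def breed(a, b):
            cut = random.randrange(len(a))          # single-point crossover
            child = a[:cut] + b[cut:]
            return [g ^ 1 if random.random() < MUTATION else g for g in child]

        population = [[random.randint(0, 1) for _ in TARGET] for _ in range(POP)]
        for gen in range(GENS):
            population.sort(key=fitness, reverse=True)
            if fitness(population[0]) == len(TARGET):
                break                               # "good enough" reached
            parents = population[:10]               # keep the top 10, discard the rest
            population = [breed(random.choice(parents), random.choice(parents))
                          for _ in range(POP)]

        print("generation", gen, "best fitness", max(map(fitness, population)))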

      Though GA != AI, all the machine learning things you see have the same problem. You know what the criteria for success are, you know what leads there, you can measure all kinds of things (in fact, the more you can measure, the worse it gets!), but applying them to themselves you end up in a "blind-leading-the-blind" situation that just makes everything even worse. And in the end, just tweaking the criteria for success itself achieves the same result quicker (i.e. the criteria for success of the "master" GA gets folded into the criteria for success of each "underling" GA anyway). Except "quicker" is by no means bounded or guaranteed in human terms.

      The problem is that people THINK we have AI. We don't. They think we have machines that can learn. They don't. They think that taking the basics of "learning" machines and scaling them up will just work. It doesn't. They think that once something starts to "learn", we can train it into a HAL-9000 by just throwing more resources at it periodically. We can't.

      Like compressing a compressed file, setting one to teach another isn't going to achieve anything any quicker than you could achieve by just focusing and "nurturing" the target anyway. It's like being an educated person, training an uneducated nanny to then educate your child. You could do a better job by just doing it directly.

      But the biggest problem - machines STILL DO NOT LEARN. Even in the most impressive of demos and achievements (Google's AlphaGo is unbelievably amazing - I know, I studied Maths and Computer Science under a professor who studied machine algorithms for winning Go for his entire life... you have no idea of the leaps AlphaGo has made. But it STILL DOES NOT LEARN).

      1. Charles 9

        Re: Over the years people have done AI projects in software development.

        "But the biggest problem - machines STILL DO NOT LEARN. Even in the most impressive of demos and achievements (Google's AlphaGo is unbelievably amazing - I know, I studied Maths and Computer Science under a professor who studied machine-algorithms for winning Go for his entire life... you have no idea of the leaps AlphaGo has made. But it's STILL DOES NOT LEARN)."

        For clarification, specify what you mean by "learn" and perhaps give a specific example.

        1. Anonymous Coward
          Trollface

          Re: Over the years people have done AI projects in software development.

          Make 1000 "brains". Give them a "task". Measure their "success". Take the top 10 and "breed" 1000 new ones. "Kill" the other 990. Rinse and repeat 1000 times.

          "Learning!"

        2. Lee D Silver badge

          Re: Over the years people have done AI projects in software development.

          "For clarification, specify what you mean by "learn" and perhaps give a specific example."

          No problem.

          Is your child learning by being told to memorise all the exam answers? They will certainly pass tests, but are they "learning"? Will they be able to apply that knowledge, acquire or infer related facts, or step outside the boundaries of their rote-taught curriculum? Most people will argue "No". That's not "learning". It's memorisation. Computers are perfect at memorising. Feeding in a billion games and telling it "this is a good position", "this is a bad position" and making it memorise those is not learning.

          Even simple transforms are not learning - just switching the order of the answers on a multiple-choice exam, so that the child has to memorise the ANSWER, not just the letter assigned to it. The computer equivalent? The same position seen as a rotation, reflection, translation, change of colour, etc. or even seeing a miniature part of it reproduced on a larger board. Is it "learning" to memorise all the positions and then use similarity tests to assign a value? Most people would argue "No."

          I'm using the inferred and standard human definition of learning that most people will not argue with, and which is reflected in the dictionaries:

          "become aware of (something) by information or from observation."

          and "gain or acquire knowledge of or skill in (something) by *study*, *experience*, or being taught."

          To "become aware" that you're about to lose a chess game, that you've NEVER seen before, never played that position before, have no perfectly memorised table of losing positions for, but which you can *infer* you will lose without that specific a knowledge? That's learning.

          Current "AI" does not learn. It adds to a massive database of experience, yes. That database is associated with a key of "desirability", yes. But outside of that, the computer is unable to infer. Feeding it a billion games from masters might give it enough database to win. But humans quite clearly do not require that to learn the game.

          "AI" is also almost entirely heuristical. Humans have told it "this will be a winning position", "this will not", "this is X times more desirable an outcome". Whether by rules implicit in the system, input into the data, or programming which contains such assignment. Though you can feed in a massive games database and it can automatically form an association between "moving into the top-left corner" and "winning 0.184% of matches", that is singularly useless from an "AI player" point of view.

          AlphaGo starts down the routes of finding patterns. This is the data, this is the winning items, this pattern that I've formed from those winning games can be described as a board position looking like X, and in other games that I've never seen before, games including board position X result in a win 12.749% of the time. It's pattern-forming, and pattern-matching. But the patterns it can possibly find are described by humans again, whether coded or parameterised.

          At no point are such machines as Deep Blue or AlphaGo inferring or hypothesising or doing anything unexpected. They don't have an understanding of the position, or the ability to form similar patterns that may improve that association. It has to be coded specifically. AlphaGo is leaps and bounds ahead in this, beating predictions for gaming models of Go by decades. But it's still not inferring as you would need to in order to "learn".

          Left to its own devices, it wouldn't be able to formulate a strategy. It can only take a HUGE database of games and form correlations between their properties.

          I like to think of it as the Fosbury Flop principle. Take this as an analogy, not a strict example! Put into a high-jump contest, even the best of today's AI wouldn't have the insight to suddenly invent a different way of working that was still within the rules but not present in the database of all existing high-jumps before it. Similarly, there's a record in the cricketing world where there was a period of time during which there was no rule specifying a maximum width of a cricket bat - until one guy played with a bat wider than the stumps.

          "AI" isn't capable of that "within the rules, but outside of their own experience" thinking, true learning. They cannot infer. They cannot build a pattern outside of certain set criteria. And they are still, at the end of the day, expert systems and statistical analysers on large databases.

          1. LionelB Silver badge

            Re: Over the years people have done AI projects in software development.

            @Lee D

            I think you miss a crucial element of (machine, and indeed human) learning: the facility to generalise. This is what underpins "deep learning" algorithms such as AlphaGo.

            So a trained deep learning algorithm doesn't just barf when it sees a pattern that it hasn't encountered in training - it's not just looking up a response in a database. Rather, it responds more along the lines of: "well, when I've encountered patterns similar to this during training, then it turned out that response such-and-such seemed to work, so I'll go with that".
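
            A throwaway nearest-neighbour toy makes the point (this is not what AlphaGo does internally, just the crudest possible "respond by similarity"):

              # "Trained" system answering an input it has never seen, by
              # analogy with the closest training patterns rather than by
              # looking the exact input up in a database.
              def distance(a, b):
                  return sum((x - y) ** 2 for x, y in zip(a, b))

              training = [((0.1, 0.2), "losing-ish position"),
                          ((0.9, 0.8), "winning-ish position"),
                          ((0.8, 0.3), "winning-ish position")]

              def respond(pattern):
                  _, label = min(training, key=lambda item: distance(item[0], pattern))
                  return label

              print(respond((0.7, 0.6)))    # unseen input, answered by similarity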

            This seems to me not a million miles away from human learning. Yes, it may take human intervention to set up training sets and even short-cuts/heuristics to successful training - but then isn't that exactly what a good human teacher does? In deriding the human element of designing a machine learning system are we not perhaps setting the bar for machine learning higher than for human learning?

            Also, I don't think your Fosbury Flop example is a good illustration of learning, so much as of creativity - surely not the same thing. I mean, many humans can learn an existing high-jump technique - but it took a rather special creative (as opposed to learning) step to invent a novel technique.

            1. Charles 9

              Re: Over the years people have done AI projects in software development.

              Let me add to the argument here. Going to your chess example, I would suspect a complete novice would not even realize they are losing until their opponent makes the final move and declares checkmate, and usually not even then unless the other player points out why the king is cornered. I know it happens to many a novice. Heck, it happens a lot to novices of Connect Four and Pente, and these are much simpler games.

              What allows humans to improvise is a knowledge base taken from firsthand experience. This is something only time can give to AI systems, just as it takes time for humans to figure out the coordination of legs, hips, arm, and wrist needed to make a very good throw (and because this is different for each person due to body types, it's something that can only be hinted, not necessarily taught; you're on your own for the fine-tuning).

              After all, the batsman who came to the pitch with that bat probably didn't cook the idea up out of whole cloth. He probably watched a tennis game and made the connection (perhaps subconsciously). Just as the guys at St. Louis University who first tried gridiron's forward pass probably thought back to games like baseball and thought, "Why not?" Or the high jumper who thought perhaps an arcing movement of the body could allow some extra inches. Bursts of creativity usually don't just spring out of nowhere. AI needs the knowledge base first, and we're only now getting to that part.

        3. LionelB Silver badge

          Re: Over the years people have done AI projects in software development.

          For clarification, specify what you mean by "learn" and perhaps give a specific example.

          Big fat upvote for that.

          And while we're about it, what exactly is this "AI" (that we DEFINITELY don't have)? How will we know when we've got it? Does it have to be human-like?

    2. Primus Secundus Tertius

      Re: Over the years people have done AI projects in software development.

      Agreed.

      The real future of AI is to understand how AI does or will work.

  4. frank ly

    I thought this was a Laundry Files story

    I was mildly disappointed, at first.

    1. David Austin

      Re: I thought this was a Laundry Files story

      Has someone got Angleton on speed dial?

  5. allthecoolshortnamesweretaken
    Coat

    I thought Al's demons were Peggy, Kelly and Bud?

  6. Spender
    Stop

    open source?

    "People [could do bad things...] It’s a realistic possibility, granted that a lot of machine learning software is open source"

    I'd offer that these statements form a non-sequitur.

    Isn't this what people used to say about encryption? Proprietary = more secure? That didn't work out too well, did it?

    Better that vulnerabilities are out in the open, rather than being quietly exploited by those "in-the-know".

    This kind of research is possible specifically because the algorithms are so accessible.

  7. dgc03052

    Welcome to the grand illusion

    Current ML seems to be at the biological equivalent level of retinas, perhaps with a couple of neurons above.

    Now think about all the optical illusions that we can be tricked with, and how hard it is to discover some of them - every new instance of ML is going to have its own set of illusions/false outputs. They may get better and better, but every one of them is going to have its blind spots and ways to be fooled.

    And that isn't counting any of the basic coding, memory, etc. bugs that can crash things, rather than "just" providing the wrong output. They may be incredibly useful, or even better than a human, but they will never be perfect.

  8. amanfromMars 1 Silver badge

    You can take a horse to water, but does it think?

    There's a low awareness of vulnerabilities in neural networks, say researchers.

    Oh????? What would those researchers be researching then, KQ, for any neural networker worthy of note in the field would tell you the exact opposite and be extremely excited about all of the possibilities and opportunities which their high awareness of all the vulnerabilities in neural networks presents and gloriously conceals and protects from idiot misuse and grand corporate abuse.

    And do you think the fake news networks/media moguls realise they haven’t a clue about what to do about what is discovered and being ever so stealthily uncovered?

    Indeed, why not take IT a great deal further here, and ask El Reg whether they realise all of that which is mushrooming around them?

    1. amanfromMars 1 Silver badge

      Re: Thirsty Horses RFI ….. Realistic Capital Market Valuation on Priceless AI Product[ion]

      Thanks for all the phish, El Regers. Whenever the silence is deafening, is one way out ahead of the madding crowd, and in pristine proprietary and popular primitive populated fields with Clouds, is that more than just heavenly.

      And one can then decide if it* is to remain unknown as a known unknown, above TS/SCI Classified MkUltra Sensitive and the work of Global Operating Devices or released for devils and daemons to plague and create hell, madness and mayhem in SCADA Remote Command and Virtual Control Systems ….. which is the default Planet Earth Macro Management System, is it not?

      And that makes it* AIReal Doozy of a NEUKlearer HyperRadioProActive Weapons System against which there be no physical crack/hack/attack vector. And that is a whole new football game.

      Fake alternative fact news, El Reg, or AI Breakthrough with Quantum Communications a'Leaping Leading?

  9. Bucky 2
    Unhappy

    WONTFIX

    WONTFIX responses aren't limited to ML.

    My favorite is the one where mysqldump produces a corrupt dump if a table contains a binary column. You can add a switch to fix this (--hex-blob), so binary columns are dumped in hex.

    Making blobs dump as hex by default was rejected as WONTFIX.

    Producing an error message that the dump file would be useless because of a blob column was also rejected as WONTFIX.

    The reasoning for broken-by-default is that users are supposed to RTFM and use the proper dump arguments.
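
    For anyone bitten by it, something along these lines works (a rough sketch; the database name and credentials are placeholders, and it assumes mysqldump is on your PATH):

      # Wrap mysqldump with --hex-blob so binary columns survive the round trip.
      import subprocess

      cmd = [
          "mysqldump",
          "--hex-blob",        # the non-default switch that keeps blobs intact
          "--user=backup",     # placeholder account
          "--password",        # prompt for it rather than embedding it
          "example_db",        # placeholder database name
      ]

      with open("example_db.sql", "wb") as out:
          subprocess.run(cmd, stdout=out, check=True)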
