Protecting code's secrets wins ACM prize

Better code obfuscation has attracted the attention of the prestigious Association for Computing Machinery, which has anointed an Indian-born developer working at IBM's TJ Watson Research Centre with an award for his work. Protecting code, even as a binary, from being reverse-engineered is difficult: any solution that encrypts …

COMMENTS

This topic is closed for new posts.
  1. Christian Berger

    OMFG

    The only way this could be used is in malware, since there you want people to not be able to reverse engineer code.

    Otherwise if I'm running code I have every right to reverse engineer it. After all, at least as a German*, I have a basic right to privacy and integrity in computing. If my computer obeys commands, I need to be able to understand them as well, at least in principle.

    *) In 2008 the German constitutional court ruled that such a right exists and applies to everybody. However obviously the German constitution is not necessarily applicable to other countries.

    1. Anakin

      Re: OMFG

      I totally agree.

      Hell will freeze over before I run any code on my computers that can't be disassembled in IDA.

    2. Anonymous Coward

      Re: OMFG

      You're correct - you have the right to reverse engineer. You don't have the right to expect that this will be easy.

      This technology has a number of useful applications, but, sadly, malware is one of the less useful ones.

      German constitution does not apply outside Germany - the EU has however implemented a directive giving all EU citizens a similar right.

      1. Destroy All Monsters Silver badge
        Trollface

        Re: OMFG

        > Otherwise if I'm running code I have every right to reverse engineer it

        Sure, but you may not be ABLE to.

        It's like with these "human 'rights'" that we so often hear about which are just reachable in a platonic ideal of a western liberal society with acceptable economic infrastructure.

  2. Neoc

    If it has to be decrypted for it to work, someone will eventually break it.

    1. Gordon 10

      Exactly - you just move the inspection point from the binary source to a stream monitor post decryption.

      Not saying it would be easy but since you would have full control of the machine the code is running on - you would be the ultimate man in the middle.

      The whole PC code base, from firmware upwards, would have to be globally encrypted for this to be fully secure.

      Edit: foxyshadis made the same point below, but with better in-depth knowledge.

      1. Bronek Kozicki

        Just run it inside a virtual machine, and refuse to buy any proprietary software unable to run inside a VM. Reverse engineering aside, you may have other good reasons for this, e.g. running your own private cloud with live migration for 99.999% uptime, hardware asset management etc.

  3. foxyshadis

    This sounds like missing the point entirely

    Most non-trivial unpackers are already based on tracing and reassembling the code as it executes, or by having completely reverse-engineered the packer. I don't understand what this solves, since all of the routines that programmers will want to protect are also likely to be the ones executed most often. I'm not aware of many obfuscation schemes that are easily beaten by "algebraic methods", so this lands squarely in the land of "fancy tricks that impress programmers but have no real-world applicability".

    1. stanimir

      Re: This sounds like missing the point entirely

      "fancy tricks that impress ACM but have no real-world applicability".

      FTFY

      And I totally agree: at some point the code has to go to the CPU (or GPU) to execute. Assembler is harder to read than C, but it is still very far from being able to hide the real execution/algorithm flow.

      1. david 12 Silver badge

        DRM

        The main point is addressed in other posts, but just note that the objections are also arguably invalid. There already exist hardware devices for taking an encrypted stream and decrypting only the output. The equivalent is obvious: an encrypted program that can be disassembled only on decrypting hardware.

        DRM hardware is only protected by legislation, but that's still good enough for one large industry.

    2. Anonymous Coward

      Re: This sounds like missing the point entirely

      It is not about packing already generated code. It's about generating code that is functionally equivalent to the original source code in terms of inputs and outputs, but does not follow the same logic internally.
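      A toy illustration of that distinction, not the scheme from the paper: two functions with identical input/output behaviour but different internal logic. The names `add_plain` and `add_obscured` are made up for this sketch, and the only trick used is the well-known identity x + y = (x XOR y) + 2 * (x AND y).

```python
import random

MASK = 0xFFFFFFFF  # keep everything in 32-bit unsigned range


def add_plain(x: int, y: int) -> int:
    """Straightforward 32-bit addition."""
    return (x + y) & MASK


def add_obscured(x: int, y: int) -> int:
    """Same result via x + y == (x ^ y) + 2 * (x & y)."""
    return ((x ^ y) + ((x & y) << 1)) & MASK


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        a, b = rng.getrandbits(32), rng.getrandbits(32)
        assert add_plain(a, b) == add_obscured(a, b)
    print("identical outputs on 1000 random inputs")
```

      A rewrite this small is easy to spot and undo by hand; it only shows what "same inputs and outputs, different logic" means in practice.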

  4. Anonymous Coward

    If I understand the paper correctly (quite unlikely):

    You look at the program function as a set of inputs with a corresponding set of outputs. Then replace it with a mathematical function that basically decrypts the correct output for each given input. So an attacker can't reverse-engineer the underlying sequence-condition-iteration logic of the original code. The same decryption function is used for every program function, but with a different "key" that maps the outputs to the inputs.

    So running it in a debugger is a bit like trying to step through interpreter code. In fact it's more like stepping through an emulator for a custom CPU that has unknown opcodes and is running a program compiled with those opcodes.
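    To make the emulator analogy concrete, here is a minimal sketch of a stack machine whose opcode numbers are assigned by a seed. It illustrates the analogy only and is not taken from the paper; `make_opcode_table`, `assemble` and `run` are hypothetical names invented for the example.

```python
import random

OPS = ["PUSH", "ADD", "MUL", "HALT"]  # the "real" instruction set


def make_opcode_table(seed: int) -> dict:
    """Assign each operation a distinct byte value, determined by the seed."""
    rng = random.Random(seed)
    codes = rng.sample(range(256), len(OPS))
    return dict(zip(OPS, codes))


def assemble(program, table):
    """Translate (op, arg) pairs into a flat list of numbers."""
    out = []
    for op, arg in program:
        out.append(table[op])
        if op == "PUSH":
            out.append(arg)
    return out


def run(code, table):
    """Interpret the numeric program on a small stack machine."""
    decode = {value: name for name, value in table.items()}
    stack, pc = [], 0
    while True:
        op = decode[code[pc]]
        pc += 1
        if op == "PUSH":
            stack.append(code[pc])
            pc += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            return stack.pop()


# Computes (2 + 3) * 4
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("HALT", None)]

if __name__ == "__main__":
    table = make_opcode_table(seed=1234)
    code = assemble(PROGRAM, table)
    print(code)              # a bare list of numbers without the table
    print(run(code, table))  # 20
```

    In this toy the opcode table sits in plain view inside the interpreter, so anyone stepping through `run` can recover it; the sketch only shows why single-stepping such a program tells you about the interpreter first and the program second.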

  5. This post has been deleted by its author

  6. Anonymous Coward

    I've come across plenty of programs that are nearly impossible to reverse-engineer even when you have the source code. Ironically, the worst offenders for illegible code are often the textbooks that are supposed to teach coding.

    This is one such textbook gem:

    old_q = Q;
    Q = (uint8)((0x80000000 & R[n])!=0);
    tmp2 = R[m];
    R[n]<<=1;
    R[n]|=(uint32)T;
    switch(old_q){
    case 0:switch(M){
        case 0:tmp0=R[n];
            R[n]-=tmp2;
            tmp1=(R[n]>tmp0);
            switch(Q){
            case 0:Q=tmp1;
                break;
            case 1:Q=(uint8)(tmp1==0);
                break;
            }
            break;
        case 1:tmp0=R[n];
            R[n]+=tmp2;
            tmp1=(R[n]<tmp0);
            switch(Q){
            case 0:Q=(uint8)(tmp1==0);
                break;
            case 1:Q=tmp1;
                break;
            }
            break;
        }
        break;
    case 1:switch(M){
        case 0:tmp0=R[n];
            R[n]+=tmp2;
            tmp1=(R[n]<tmp0);
            switch(Q){
            case 0:Q=tmp1;
                break;
            case 1:Q=(uint8)(tmp1==0);
                break;
            }
            break;
        case 1:tmp0=R[n];
            R[n]-=tmp2;
            tmp1=(R[n]>tmp0);
            switch(Q){
            case 0:Q=(uint8)(tmp1==0);
                break;
            case 1:Q=tmp1;
                break;
            }
            break;
        }
        break;
    }
    T=(Q==M);

    1. Mage Silver badge
      Pint

      Arrrgh My eyes

      You owe me for a doctor and optician's visits now!

    2. no-one in particular
      Joke

      So *that's* where our contractors* learnt how to write code.

      *other contractors are available, your code quality may go down as well as, um,

    3. Colin Wilson 2

      That C's so complicated

      Try Pascal instead - it reads like a book:

      program xmas;
      {$APPTYPE CONSOLE}
      function t_ (t, _ : Integer; a : PChar) : Integer;
      begin
      if 1<t then begin if t<3 then t_(-79,-13,a+t_(-87,1-_,t_(-86,0,a+1)+a)
      ); if t<_ then t_(t+1,_,a); if (t_(-94,-27+t,a)<>0) and (t=2) then if _<
      13 then result:=t_(2,_+1,'%s %d %d'#13) else result:=9 else result:=16 end
      else if t<0 then if t<-72 then result:=t_(_,t,'@n''+,#''/*{}w+/w#cdnr/+,'+
      '{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#;#q#n+,/+k#;*+,/''r :'''+
      'd*''3,}{w+K w''K:''+}e#'';dq#''l q#''+d''K#!/+k#;q#''r}eKK#}w''r}eKK{nl]'''+
      '/#;#q#n''){)#}w''){){nl]''/+#n'';d}rw'' i;# ){nl]!/n{n#''; r{#w''r nc{nl]'+
      '''/#{l,+''K {rw'' iK{;[{nl]''/w#q#n''wk nw'' iwk{KK{nl]!/w{%''l##w#'' i; '+
      ':{nl]''/*{q#''ld;r''}{nlwb!/*de}''c ;;{nl''-{}rw]''/+,}##''*}#nc,'',#nw]'''+
      '/+kd''+e}+;#''rdq#w! nr''/ '') }+}{rl#''{n'' '')# }''+}##(!!/') else if t<
      -50 then if _=Ord(a^) then begin result:=Ord(a[31]); if a[31]=#13 then WriteLn
      else Write(a[31]); end else result:=t_(-65,_,a+1) else result:=t_(Ord(a^
      ='/')+t,_,a+1) else if 0<t then result:=t_(2,2,'%s') else result:=Ord((a^=
      '/') or (t_(0,t_(-61,Ord(a^),'!ek;dc i@bK''(q)-[w]*%n+r3#l,{}:'#13'uwloc'+
      'a-O;m .vpbks,fxntdCeghiry'),a+1)<>0))
      end;
      begin
      t_ (1, 10000, '')
      end.

      1. Duncan Macdonald
        Mushroom

        Re: Try this - in APL

        life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

        This little line of code calculates the next generation in the Game of Life!!!
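        For anyone who doesn't read APL, here is roughly the same rule spelled out in Python. It is a sketch under one stated difference: live cells are held as a set of (row, col) pairs on an unbounded grid, whereas the APL one-liner rotates a fixed-size matrix and so, as far as I can tell, wraps around at the edges. `life_step` is a name made up for the example.

```python
from collections import Counter


def life_step(live):
    """Return the next generation for a set of live (row, col) cells."""
    # Count live neighbours for every cell that touches a live cell.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next turn with exactly 3 neighbours,
    # or with 2 neighbours if it is already alive.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


if __name__ == "__main__":
    blinker = {(1, 0), (1, 1), (1, 2)}
    print(sorted(life_step(blinker)))  # [(0, 1), (1, 1), (2, 1)]
```

        The horizontal "blinker" flips to vertical after one step and back again after two, which makes it a handy sanity check.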

        1. Destroy All Monsters Silver badge
          Headmaster

          Re: Try this - in APL

          EL REG WHEN ARE WE GETTING "tt" TAGS?

  7. John Smith 19 Gold badge
    Unhappy

    This is clearly a computer science paper

    It's got lots of odd symbols in and everything.

    Sadly what it does (or even how it does it) is completely beyond me.

    1. BlueGreen

      Re: This is clearly a computer science paper

      > Sadly what it does (or even how it does it) is completely beyond me.

      I've got a degree in CS and a lot of it baffles me too. The problem isn't the abstraction but the relation of those abstractions back to the real world, basically, relating them back to things I understand already. Also helps if I can understand how they help you do something - that gives me motivation to learn them as tools.

      So on that note, I think I understand what a lattice (a set with orderings leading to a top & bottom element blah blah) is but can anyone explain to me why they are useful?

  8. CommanderGalaxian
    Boffin

    Err, so what gets tested at test time?

    But surely in order to develop, debug and test, you will still have to use unencrypted binaries. And there's the problem, once you switch on encryption, the code you will be running will no longer be the code you have tested.

    But then maybe once you get to test stage you could just use the encrypted version. But then, if there is a problem (and there will be), you will have to switch back to the unencrypted version for debugging...which won't be exactly the same as the encrypted version.

    Then again, maybe there will be an auto-unencrypting decompiler/debugger to go with the compiler.

    1. Destroy All Monsters Silver badge
      Paris Hilton

      Re: Err, so what gets tested at test time?

      Yeah, but so what?

      This operation will be used in low-assurance SW for consumers (and fock them if it doesn't fully work)

      or in SW that "may fall into enemy hands" and for which more money can always be found for "bang on it until it works" operations.

      1. CommanderGalaxian

        Re: Err, so what gets tested at test time?

        Yes, that's basically my point: if the "encryption" method is to run code that is functionally equivalent to - but not actually the actual code - then wtf? Can't see it catching on for avionics and the like.

  9. bigtimehustler

    This is completely pointless. If you run the programme through all of its functionality while intercepting all traffic on its way to the CPU, you will have the unencrypted machine code. From there, you can put it back together, then decompile and inspect it to your heart's content. So it's a lot of effort to solve something that can simply be bypassed. It's pretty much like saying PIN numbers protect you from bank fraud when someone's holding a gun to your head.

  10. druck Silver badge

    The best obfuscation of code I ever saw was an application written in interpreted BASIC for the Archimedes (under Arthur, pre-RISC OS). Every variable and procedure name in the code had been replaced with the word ClaresMicroSupplies - but using different combinations of upper and lower case letters. The computer has no trouble recognising these as different symbols, but it's impossible for a human to tell thousands of these apart.
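    A sketch of how such a renaming could be generated, written in Python rather than BASIC. `case_variant` and `rename_all` are made-up helpers for this example: the bits of a counter decide which letters are upper case, so the 19 letters of the word allow 2^19 = 524,288 distinct spellings. The trick depends on the language treating identifiers as case-sensitive, as the post says that BASIC did.

```python
WORD = "ClaresMicroSupplies"


def case_variant(index: int, word: str = WORD) -> str:
    """Return the index-th upper/lower-case spelling of word.

    Bit i of index decides whether letter i is upper case.
    """
    if not 0 <= index < 2 ** len(word):
        raise ValueError("index out of range for this word")
    return "".join(
        ch.upper() if (index >> i) & 1 else ch.lower()
        for i, ch in enumerate(word)
    )


def rename_all(names):
    """Map each original identifier to its own case variant."""
    return {name: case_variant(i)
            for i, name in enumerate(sorted(set(names)))}


if __name__ == "__main__":
    for old, new in rename_all(["total", "count", "PROCdraw"]).items():
        print(f"{old:10} -> {new}")
```

    Every generated name compares equal once lower-cased, which is exactly why a human reader cannot tell them apart while an interpreter can.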
