Carnegie-Mellon Uni emits 'don't be stupid' list for C++ developers

Carnegie-Mellon University's Software Engineering Institute has followed up its secure C programming rules from last year with a similar set of standards for C++. In the institute's announcement on Wednesday, it says it has put ten years into researching secure coding. The resulting SEI CERT C++ Coding Standard has 83 rules …

  1. Destroy All Monsters Silver badge
    Windows

    So...

    Basically, we need an expert system to tell the developer when he might want to reconsider (best while writing code, otherwise in a post-processing step).

    Actually, these more or less exist, I hope people use them. OO-flavored pseudo-assembler is as nice as a glass jar of elephant's foot.

    1. jake Silver badge

      Re: So...

      Actually, hand-massaging of compiler output prior to assembling & linking is kind of fun, when you approach it with the right attitude. Jars of nuclear waste? Not so much.

      Disclaimer: I code in C and rarely touch the super-set clusterfuck known as C++ ...

      1. Anonymous Coward

        Re: So...

        Now now, jake,

        Good C-style code can be made a lot more efficient using the more expressive C++ compilers. As an assembler man, I present you proof: https://godbolt.org/g/uq6p8o

        If you've not had a play recently you might be surprised.

        Now if you were to complain about bootstrapping a minimal C++ runtime, compared to a minimal C runtime, fair point, but the language is a fine one, in capable hands.

        How's the greenhouse doing? I'd like my own slack powered farm one day, you make it sound a lot of fun.

    2. Robert Carnegie Silver badge

      The expert system should appear as an animated paperclip.

      Then everyone will be careful to never, ever risk summoning it.

  2. DrXym

    Good advice but

    Some of the advice is borderline farcical, not because the advice is wrong but because the language allows those things to be written in the first place.

    - Do not read uninitialized memory

    - Do not access an object outside of its lifetime

    - A lambda object must not outlive any of its reference captured objects

    - Do not rely on the value of a moved-from object

    - Use valid iterator ranges

    - etc.

    The list goes on and on.

    It's no wonder that Rust is gaining such interest. I'd say a very substantial percentage of these issues would be stopped in their tracks by the design of the Rust language / library and by the compiler's lifetime / borrow checks without sacrificing any performance.

    1. gv
      Alert

      Re: Good advice but

      Good fun and interesting times can be had by doing the opposite of those rules.

      1. tiggity Silver badge

        Re: Good advice but

        Indeed, browsing through uninitialized memory can throw up all sorts of interesting information.....

    2. Ogi

      Re: Good advice but

      > Some of the advice is borderline farcical, not because the advice is wrong but because the language allows those things to be written in the first place.

      Any language flexible enough to give you full and total control over the machine is powerful enough to blow your foot off if used incorrectly.

      The concept of C (and C++ presumably) is that the language is your servant. You tell it exactly what to do, and it does it (as long as it is a valid instruction). It doesn't advise you, it doesn't question you, and it doesn't deny you the ability to do something.

      Of course, whether it does what you intended it to do, or goes off and kills a puppy, is an issue of programming ability and/or understanding the problem set you are trying to solve (and the constraints of the environment).

      Like most tools, there is a time and place for it. I am not going to whip up a quick C program to parse a text file, but likewise I am not going to write a kernel (or embedded code) in Python or Bash.

      I think it is a good thing that CERT has done this, as a set of "best practices" for anyone who wants to write more secure, less exploitable code. It is up to the end user whether to follow it, or whether they really need to access unallocated memory for some particular reason.

      No comment on Rust, because I haven't had a look at it myself, but have heard good things from people.

    3. Dan 55 Silver badge

      Re: Good advice but

      You say do not read uninitialized memory, I say that memory's mapped to a hardware device so I don't need to initialise it.

      You say do not access an object outside of its lifetime, I say have you not heard of RAII?

      You say a lambda object must not outlive any of its reference captured objects, I say I know when to pass by value instead.

      You say do not rely on the value of a moved-from object, I say you don't need to, because you've just moved from it to quickly create a new object and can now use this object for something else without all the expensive object creation cost.

      You say use valid iterator ranges, I say there are iterators to do that. If you don't use them, that's your choice.

      You say C++ lets you shoot yourself in the foot, I say be clever enough not to aim at your foot in the first place.
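To make the lambda point above concrete, here is a minimal sketch of capturing by value so that a closure can safely outlive the scope that created it. The names `make_adder` and `adder_demo` are hypothetical, invented for this illustration, and are not taken from the CERT document.

```cpp
#include <cassert>
#include <functional>

// Hypothetical factory: returns a callable that outlives this function's
// locals. Capturing `step` by value copies it into the closure; capturing it
// by reference ([&step]) would leave a dangling reference after the return.
std::function<int(int)> make_adder(int step) {
    return [step](int x) { return x + step; };
}

// Usage sketch: the closure is called after make_adder() has returned.
void adder_demo() {
    std::function<int(int)> add5 = make_adder(5);
    assert(add5(10) == 15);
}
```

The by-value copy costs a little for large captures, which is the trade-off being weighed when choosing between the two capture modes.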

      1. John Smith 19 Gold badge
        Unhappy

        "I say be clever enough not to aim at your foot in the first place."

        Which suggests you practice these rules (or things with equivalent effect) unconsciously.

        These rules are not for you.

        They are for the people who don't do so, who need to be told this is a bad idea, before they do it.

        Of course whether they will bother to read these rules before they make an almighty clusterf**k (is core Windows still written in C++ ?) is another matter.

    4. Suricou Raven

      Re: Good advice but

      C is a powerful and dangerous language. The ability to manipulate memory at such a low level without any form of bounds checking makes it one of the fastest languages around - which is why C and C++ remain the languages of choice for any situation where performance is absolutely critical.

      1. Claptrap314 Silver badge

        Re: Good advice but

        BS. I spent eight years in asm because that was the only language that could handle the problem I was addressing. Going back up to C was astoundingly painful from a performance standpoint. MP computations without the carry bit? Here--let's just throw away half of our register size & reduce speed by a factor of four. Check for integer overflow? Let's see... Nope, that won't work. Naw, that won't either. Good thing I spent a lot of time working with registers--I would hate to think what would happen if someone without such experience tried this!

        For absolutely critical performance, you want access to the parts of the instruction set that C denies you.
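For readers who have not hit this, the usual portable workaround for the missing carry flag looks something like the sketch below. `add_with_carry` and `carry_demo` are hypothetical names, and whether a given compiler reduces the function to a single add-with-carry instruction is not guaranteed, which is rather the point being made above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for multi-precision addition: adds two 64-bit limbs
// plus an incoming carry (0 or 1) and reports the outgoing carry.
std::uint64_t add_with_carry(std::uint64_t a, std::uint64_t b,
                             unsigned carry_in, unsigned& carry_out) {
    std::uint64_t partial = a + b;            // unsigned add wraps modulo 2^64
    unsigned c1 = partial < a;                // wrapped iff result < an operand
    std::uint64_t sum = partial + carry_in;
    unsigned c2 = sum < partial;              // did adding the carry wrap?
    carry_out = c1 | c2;                      // at most one of the two is set
    return sum;
}

// Usage sketch: the top limb overflows and produces a carry.
void carry_demo() {
    unsigned carry = 0;
    std::uint64_t low = add_with_carry(UINT64_MAX, 1, 0, carry);
    assert(low == 0 && carry == 1);
}
```

The carry has to be reconstructed from comparisons because the language exposes no flag register, so the cost depends entirely on how well the optimiser recognises the pattern.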

        1. John Smith 19 Gold badge
          Unhappy

          "For absolutely critical performance,..access to the...instruction set that C denies you."

          I see, you don't use a profiler so can't spot where the 5% that actually matters is being executed.

      2. david 12 Silver badge

        Re: Good advice but

        >one of the fastest languages around<

        Compared to LISP, the other language CS majors were being taught in the 80's.

        On the other hand, Pascal and FORTRAN were always faster and lighter than C (C99 and C11 and C++ made significant changes to partially address the gap), which is why C in particular was never used for any situation where performance was absolutely critical.

        "The ability to manipulate memory ... without any form of bounds checking" was, of course, the least of the disadvantages of C & C++. Once you get outside debug builds, all languages allow compilation without bounds checking.

        1. Ken Hagan Gold badge

          Re: Good advice but

          "Pascal and FORTRAN were [in the 80s] always faster and lighter than C"

          Just to elaborate, for the benefit of the down-voters who are clearly in need of a history lesson...

          Fortran explicitly allows (and has always done so) the compiler to assume that arguments (which are traditionally passed by reference) are not aliases of each other. C has never had such rules and there is plenty of code out there that would be broken by such. With no aliasing, the Fortran compiler has optimisation opportunities that the C compiler does not have.

          This is particularly significant in numeric code, where passing two or more arrays of numbers is an extremely common thing. Consequently, C has always struggled to match Fortran's performance in this area. In mitigation, most C compilers have compiler options or pragmas to let you relax the aliasing rules in particular cases, but obviously that takes you beyond the standard language.

          On the other hand, these days it is less of an advantage because the fastest code is probably written in assembly language using SIMD instructions, or written for the GPGPU. Languages differ and times change.
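A small sketch of why aliasing matters to the optimiser, under the assumption of a made-up routine `add_twice` (not from any real library). Because the two pointers may legally refer to the same object, the compiler has to reload `*src` after the first store. C99 later added the `restrict` qualifier to let the programmer promise otherwise; standard C++ has no equivalent, so this sketch stays with plain pointers.

```cpp
#include <cassert>

// Hypothetical routine: adds *src to *dst twice.
// If dst and src are distinct, this is dst += 2 * src. If they alias, the
// first store changes *src, so the second add sees the updated value.
void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;
}

// Usage sketch: same function, different results depending on aliasing.
void aliasing_demo() {
    int a = 1, b = 2;
    add_twice(&a, &b);   // distinct objects: 1 + 2 + 2
    assert(a == 5);

    int c = 1;
    add_twice(&c, &c);   // aliased: 1 + 1 = 2, then 2 + 2
    assert(c == 4);
}
```

A Fortran compiler is allowed to assume the first case always holds and keep `*src` in a register; a C or C++ compiler must produce code that also gets the second case right.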

          1. swm

            Re: Good advice but

            The original FORTRAN compiler competed with assembly code, and most "real" programmers shunned FORTRAN until it could be demonstrated that it produced code as good as or better than that of an experienced assembly language coder. This explains some of the bizarre rules for subscripts in FORTRAN I. The rules were designed to match the index and test instructions of the IBM 704. This is also why DO loops go through at least once. The three-way branch IF statement also matched the compare and skip instruction.

    5. bombastic bob Silver badge
      Devil

      Re: Good advice but

      not using (...) and not allowing unsigned integers to 'wrap' is short-sighted...

      the implications of 'printf'-like utilities, as well as gcc format checking pragma, were pointed out in the comments. But there are use cases for wrapping an unsigned integer, SUCH AS the calculation of a time interval on a 32-bit unsigned value that calculates milliseconds or microseconds, and wanting to schedule events based on elapsed time. When you EXPECT a wrap-around, you can code around it.

      example:

      uint32_t lTick = millis(); // # of milliseconds since start, using 32-bit unsigned value

      ...

      if ((int32_t)(millis() - lTick) > my_interval) { /* do something */ lTick += my_interval; }

      this pretty much works universally, and is similar to what the Linux kernel does when scheduling things based on 'jiffies'.

      (and in some cases I'll even truncate the math down to 16-bits to make code work faster, such as on a microcontroller like Arduino, where this example might be used a LOT)

      so... maybe NO UNHANDLED unsigned integer wrapping?

      (yeah a FEATURE, not a bug - I like to work WITH the system's limitations, not against them)
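The wrap-around interval check in the example above can be isolated into a testable helper. This is only a sketch: `interval_elapsed` and `wraparound_demo` are hypothetical names, the tick values are passed in rather than read from `millis()`, and it compares in unsigned arithmetic instead of casting to `int32_t`, which is a slightly different variant from the one posted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true once more than `interval_ms` milliseconds have
// passed since `start_ms`, given the current tick count `now_ms`.
// Unsigned subtraction wraps modulo 2^32, so the difference stays correct
// even when `now_ms` has wrapped past zero and `start_ms` has not. The cast
// keeps the result 32 bits wide on platforms where int is wider than that.
bool interval_elapsed(std::uint32_t now_ms, std::uint32_t start_ms,
                      std::uint32_t interval_ms) {
    return static_cast<std::uint32_t>(now_ms - start_ms) > interval_ms;
}

// Usage sketch: the counter wrapped between start and now (512 ticks apart).
void wraparound_demo() {
    assert(interval_elapsed(0x00000100u, 0xFFFFFF00u, 500u));
}
```

This only works while the real elapsed time is shorter than one full wrap of the counter, which is the "expected" wrap-around the post is talking about.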

      1. Richard 12 Silver badge

        Re: Good advice but

        Guidelines are there to make you think when you break them.

        Almost all of these recommended practices will have occasions when they should or indeed must be broken.

        - The classic "access uninitialised memory" reason is when the memory content in question comes from another device, whether by DMA or other means. Initialising it yourself would be a bug - but reading past where the DMA has currently reached is also a bug.

        So yes, break the guidelines - but when you do, think about why you're breaking them. Is it the right thing to do, or just the easy thing?

        1. barbara.hudson
          Unhappy

          Re: Good advice but

          Unfortunately, your point will be lost on the 99.9% of C programmers who never understood the use of, for example, the volatile keyword ...

  3. Anonymous Coward

    The document's also available as a wiki

    Be careful if you use the wiki "copy": it's live, and it is a real pain to work with if you're using tools for automatic enforcement, as they will enforce against a specific version.

  4. jake Silver badge

    Oh, goodie!

    Yet another C++ "How-To" book. That's just exactly what the world needs, judging by the dearth of such tomes on Uni bookshelves.

    1. Anonymous Coward

      Re: Oh, goodie!

      Most of the books that are out there show design patterns giving elegant solutions to common problems. There is very little out there to help stop you from accidentally writing code that contains instances of undefined behaviour.

      CERT C++ (and similar, e.g. MISRA C++) aim to try and eliminate such behaviour from code.

      1. Ken Hagan Gold badge

        Re: Oh, goodie!

        "There is very little out there to help stop you from accidentally writing code..."

        and a 435 page tome is not going to help with that one iota because unless the checking is automated it simply won't happen. No-one has the time and energy to check each line of new code against such a vast list.

        "Once established, these standards can be used as a metric to evaluate source code (using manual or automated processes)."

        As it happens, I already have a compiler that will test quite a number of these things in those cases where it can be checked at compile-time. (Modern compilers do pretty much everything that lint did 20 years ago.) The difficulty is that we still have all the undecidable cases left over and (as hinted by an earlier commentard) any sufficiently powerful language will allow undecidable cases.

        If CERT (or anyone else) has a magic wand that will point at those cases (with sufficiently few false positives that it is actually worth me reading the output and sufficiently many true positives that I can justify splashing the cash on the tool) then I'm all ears. They could prove the worth of their tool by pointing it at some of the millions of lines in popular FOSS projects and submitting bug reports for all the security problems *before* they are posted as 0-days. (And yes, some vendors have tried this, but the fact that everyone didn't immediately go "wow!" and buy their product suggests that the results weren't clear-cut. Perhaps all the low-hanging fruit in this area were eaten by lint when I was in short trousers and even the medium-height stuff has been picked off by language evolution since then.)

        1. Anonymous Coward

          Re: Oh, goodie!

          Coverity is decent.

          Sonar has little value to me over using Clang's static analyser, which is really rather good, combined with the usual suspects: compiling against different compilers / platforms, and valgrind.

          Nothing else I've seen is worth the price or more importantly offers an advance on the open source tools.

          1. Paul Crawford Silver badge

            Re: Coverity is decent

            It is also available free to FOSS projects.

            While there are numerous warnings that can be ignored, the golden rule for all such code-profiling tools is to make sure you understand the nature of the warning before you fix it or ignore it.

            Also worth a mention are some free (at least on Linux, maybe others?) memory checking tools like valgrind and the good old electric-fence library. While not checking your source code as such, they do help with detecting run-time memory errors such as double-free, leaks, etc.

    2. Paul Crawford Silver badge

      Re: Oh, goodie!

      Remember this: C is basically a universal assembler, created to allow an OS to be written in a largely machine-independent manner. As a result it allows all sorts of potentially dangerous actions (in particular pointers, but not helped by some of the more odd/obscure syntax that sticks around).

      Rule #1) If you can't program in assembler with any degree of success then don't use C

      Rule #2) C++ adds some better features, and adds some worse features

      Rule #3) If safety is more important than performance or universal support use another language.

      Rule #3.9999999) Don't use flaky Pentium FPUs

      1. Anonymous Coward

        "If safety is more important than performance or universal support use another language."

        That's why an OS shouldn't be written in C/C++ any longer... <G>

        1. Paul Crawford Silver badge

          Re: That's why an OS shouldn't be written in C/C++

          Oh yes, most of the OS kernel should as it needs that sort of memory wrangling and I/O poking sort of thing.

          Most of the user-land tools and utilities probably not...

      2. Salamamba

        Re: Oh, goodie!

        upvoted as I remember the FPU issues coincided with my conversion to AMD processors

      3. Anonymous Coward

        Re: Oh, goodie!

        >Remember this: C is basically a universal assembler <

        Remember this: FORTRAN is basically a universal assembler

        Remember this: COBOL is basically a universal assembler

        Or, for those of you too young to remember: read up on a little history.

        1. Paul Crawford Silver badge

          Re: Oh, goodie!

          "FORTRAN is basically a universal assembler"

          Not really. While *ALL* compiled languages eventually result in assembly-level instructions, C is a slightly special case in that it allows quite easy means of arbitrarily addressing memory locations and interacting with asynchronous events such as signals/interrupts. It also has many bit-wise sort of options in terms of manipulating integers, bit fields in structures, etc, that are useful for hardware driver I/O, etc.

          That is not part of the usual FORTRAN syntax nor (I presume, not used) COBOL. E.g. FORTRAN 77 had no memory allocation support, you had to define fixed-size arrays at the start.

  5. karlkarl Silver badge

    Pretty much everything here can be avoided by using smart pointers. Unfortunately so many C++ libraries do not use these so the safety breaks at this point.

    And when using C libraries... well Rust will have the same issue, you need to make wrappers, and if these have faults, you get dangling pointers and unhappiness like that :(

    1. Anonymous Coward

      Really? Smart pointers save the world?

      They are a useful tool, but they won't stop you falling into most of the traps that the standard sets for the unwary (think undefined behaviour).

    2. Anonymous Coward

      _

      If you understand how pointers work, 'smart pointers' only obfuscate.

      Eschew obfuscation.

      1. Zippy's Sausage Factory
        Joke

        Re: _

        Eschew obfuscation

        Sounds like the name of a modern death metal track. Seriously, seems bands have a competition among themselves to see how many polysyllabic words they can use per album that haven't been cited since the 1600s.

        I blame Slayer*

        Joke alert because ... er... this isn't a very serious post?

        * last time I looked up abascinate, "Reign In Blood" is cited from 1986 as the first usage for about 300 years, the show offs...

      2. Anonymous Coward

        "Eschew obfuscation"

        In C++, if you're not using smart pointers, you're going to reinvent them yourself, believe me. For example the lack of a "finally" statement for try blocks inevitably leads to issues when you have to ensure some resources are always released.

        The religious faith in RAII creates problems when you create objects which are referenced through pointers because you can't allocate everything on the stack for big, complex applications and stacks are often limited resources - so you need a pointer that can destroy the referenced object when going out of scope - et voilà, a smart pointer... which is smart because the language is not.

        Languages which are using ARC have implied smart pointers, and they share the same issues: you need to have more than one type to address different uses, and ensure you use the right one. Plus some bottlenecks for concurrent accesses when the pointer is shared.

        1. Dan 55 Silver badge

          Re: "Eschew obfuscation"

          The religious faith in RAII creates problems when you create objects which are referenced through pointers because you can't allocate everything on the stack for big, complex applications and stacks are often limited resources

          malloc and new use the heap...

          1. Anonymous Coward

            Re: "Eschew obfuscation"

            Yeah really. IIUC the point of RAII is having the compiler do the new and delete for you, but new and delete nonetheless, so it would sort of _act_like_ objects on the stack-- which are free once execution leaves that scope-- not because it could or would _use_ the stack.

            NB: it's really not my business, I've decided that knowing lots of C++ is a liability :D

            1. Anonymous Coward

              Re: "Eschew obfuscation"

              n/m I misunderstood OP a bit. Of course 'not the stack' and they weren't implying it either.

    3. Anonymous Coward

      Smart pointers themselves are a hack to try to fix deficiencies of the C++ design, which C++ "designers" prefer not to address for religious reasons. Moreover, like all pests, they tend to proliferate and there's the risk of choosing the wrong one if you don't know very well how each of them behaves.

      But many of the issues in that document stem from bad decisions made more than forty years ago in C, borrowed in C++, instead of being deprecated. That's because C programmers like a "loose" language - because it's easier to write it with simple text editors, and that doesn't play well with security. Security works better with more "strict" languages - it's no surprise they become fashionable again - but they also require more discipline - and better tools - to be used.

      1. Brangdon

        re: for religious reasons

        That is, for backwards compatibility and efficiency.

        1. Anonymous Coward

          "for backwards compatibility and efficiency."

          And security issues.

          Backwards compatibility is important - but not when it creates more issues than it resolves. Efficiency too. We have insecure OSes, for example, because someone decided flat, overlapping address spaces for code and data were more "efficient". So efficient that ROP works perfectly.

          When people become so afraid of changes they feel the need to keep alive outdated and risky designs, the only outcome will be huge security breaches. If in 1970 that was not a big issue, today it is, and even C/C++ need to change accordingly. Hundreds of pages about how to write proper code won't be enough.

          It's just like believing better-trained users will cause fewer fatal incidents, so you don't need devices that are safer by design.

      2. Anonymous Coward
        Mushroom

        not-so-smart with pointers

        > Smart pointers themselves are a hack to try to fix deficiencies of the C++ design, which C++ "designers" prefer not to address for religious reasons.

        Actually, no. Smart pointers are for programmers who can't handle memory management correctly.

        1. Anonymous Coward

          Re: not-so-smart with pointers

          Smart pointers are about communicating something to the reader.

          std::unique_ptr<Foo> foo; // I own this, I can move ownership but I can't (share/copy), lifetime is as mine.

          std::shared_ptr<Foo> foo; // I have an interest in this, it might outlive me, but my share keeps it alive.

          std::weak_ptr<Foo> foo; // I have an interest in this, but my interest doesn't keep it alive.

          Foo * foo; // a wild pointer, set to whatever was on the stack last.
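The ownership meanings listed above can be checked with a short sketch. `Foo` is the placeholder type from the post; `make_owned`, `shared_owner_count` and `weak_observer_expires` are hypothetical helper names made up for this illustration. It assumes C++14 for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Foo { int value = 0; };  // placeholder type, as in the post

// unique_ptr: exactly one owner; ownership can be moved but not copied.
std::unique_ptr<Foo> make_owned() {
    return std::make_unique<Foo>();
}

// shared_ptr: every copy is an owner and keeps the object alive.
long shared_owner_count() {
    std::shared_ptr<Foo> first = std::make_shared<Foo>();
    std::shared_ptr<Foo> second = first;  // copying adds a second owner
    return first.use_count();
}

// weak_ptr: observes the object without keeping it alive.
bool weak_observer_expires() {
    std::weak_ptr<Foo> observer;
    {
        std::shared_ptr<Foo> owner = std::make_shared<Foo>();
        observer = owner;
        if (observer.expired()) return false;  // owner still in scope
    }
    return observer.expired();  // the last owner has gone
}

// Usage sketch: moving a unique_ptr leaves the source empty.
void ownership_demo() {
    std::unique_ptr<Foo> a = make_owned();
    std::unique_ptr<Foo> b = std::move(a);
    assert(a == nullptr && b != nullptr);
}
```

Reading a declaration then tells you who is responsible for the object, which is the "communicating something to the reader" part.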

        2. Ken Hagan Gold badge

          Re: not-so-smart with pointers

          "Actually, no. Smart pointers are for programmers who can't handle memory management correctly."

          Actually, no. Smart pointers are for programmers who completely understand resource management, completely understand that different approaches are correct in different situations, and don't see why actually implementing any of those approaches should be their job, or why any of those approaches needed to be implemented more than once.

          Garbage collected languages are for those who get some, but not all, of the above points.

          1. Anonymous Coward

            Re: not-so-smart with pointers

            smart pointers are for programmers that understand when to let the language do their work for them.

            1. DrXym

              Re: not-so-smart with pointers

              "smart pointers are for programmers that understand when to let the language do their work for them."

              Smart pointers aren't part of the language. Nor are strings or a bunch of other things that other languages take for granted.

              Instead they're implemented by templates. While that's better than nothing it's certainly not as good as proper intrinsic types that can be checked by the compiler, generate efficient code and meaningful errors during compilation.

              1. Ken Hagan Gold badge

                Re: not-so-smart with pointers

                "While that's better than nothing it's certainly not as good as proper intrinsic types that can be checked by the compiler, generate efficient code and meaningful errors during compilation."

                If your compiler cannot generate efficient code from source then you need a new compiler. If your language doesn't let you state the constraints that apply for given types, you need a new language. If you need more meaningful errors than "look at this line -> here" then you need a new programmer.

                The advantage of implementing this stuff *outside* the language is that any third party can add to the toolbox. You don't need to wait for your compiler vendor to implement the feature and if you are trying to write portable code then you don't need to wait for *every* compiler vendor to implement the feature.

          2. Anonymous Coward
            Mushroom

            Re: not-so-smart with pointers

            > [ ... ] resource management

            That buzzword was invented specifically for programmers who couldn't handle memory management correctly, and got very offended when this inability was pointed out to them at code review time.

            It's not memory, it's a resource. Your program does not leak memory like a sieve, it is not optimized for efficient resource management.

            Sounds savant and less incompetent.

            1. Ken Hagan Gold badge

              Re: not-so-smart with pointers

              "[resource management] was invented specifically for programmers who couldn't handle memory management correctly"

              Really? I thought it was because files aren't memory, database transactions aren't memory, network connections aren't memory, windows aren't memory, graphics contexts aren't memory, function hooks and callbacks aren't memory, ...

              You are talking utter crap. The word "resource" is used here because exactly the same code can be used to manage just about anything that needs to be properly disposed of at some later point in time and you know, at the time of "acquisition", when and how that disposal ought to be done. That description applies to a huge range of use-cases, only a tiny fraction of which are related to memory management.
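As a sketch of "the same code can manage just about anything", here is a minimal scope guard that runs an arbitrary release action on scope exit. `ScopeGuard` and `open_handles_after_scope` are hypothetical names for this illustration, and the "resource" is just a counter standing in for a file, transaction or connection.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII wrapper: the disposal action is fixed at acquisition
// time and runs when the guard goes out of scope, whatever the exit path.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> release)
        : release_(std::move(release)) {}
    ~ScopeGuard() { if (release_) release_(); }

    // Non-copyable so the release action cannot run twice.
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> release_;
};

// Usage sketch with a stand-in resource: a count of open handles.
int open_handles_after_scope() {
    int open_handles = 0;
    {
        ++open_handles;  // "acquire"
        ScopeGuard guard([&open_handles] { --open_handles; });
        assert(open_handles == 1);  // still held inside the scope
    }
    return open_handles;  // released by the guard's destructor
}
```

Nothing in the guard mentions memory; swapping the lambda for a file close or a transaction rollback reuses the same class unchanged.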
