An introduction to static code analysis

So, by some misfortune, usually instigated by "management" or by "tradition", you are stuck with a C/C++ program to maintain. Not only do you get the positives of speed, you also get the negatives of the lack of memory management. How do you know what is lurking underneath the templates and calls to new and malloc? How do you …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward
    WTF?

    Advertisement, and multi-threading

    Perhaps you should include a disclaimer that this is a thinly-disguised advertisement for the author's products? (The small print at the bottom, "Chris is a Static Code Wrangler at monoidics.com", is not clear and is very hidden).

    The Monoidics tool doesn't work on multithreaded apps, according to their website. Unfortunately, this is a really common limitation in static analysis tools. This is especially annoying as the rest of the world is adding threads to take advantage of multiple cores.

    Can anyone recommend a good static analysis tool for a multithreaded application? I'm mainly interested in finding the deadlocks and race conditions. So the Monoidics tool is useless. I've seen the excellent work Coverity has done on the Linux kernel, but when they demo'd it on our code they didn't have any of the multithreading checks. Any suggestions?

  2. Guus Leeuw

    Next installment, please

    Can't wait!!!

  3. Mike007 Bronze badge

    "prove the absence of bugs"

    and how exactly can a program prove that a program is bug free?

    if clicking exit only minimises when the "minimise instead of exit" option is set, is that a bug? what about if it does it when the "play a tune on startup" option is set, is that a bug? a program has no way to tell the difference between those settings to know, and i would certainly call one of those a bug!

    1. Filippo Silver badge

      bugs

      I think it's more likely that it can find all *memory leaks*. Maybe it can find null references and accesses to unallocated memory too. But, all bugs? Doesn't even make sense.

    2. DZ-Jay

      Don't be dense

      The article does not say that a program can be proved to be "bug free," it says that mathematical proofs can confirm the absence of bugs. That is, some very specific (and common) bugs can be proved to not exist. The absence of all bugs was never suggested.

      The types of bugs that static analysis can prove to not exist are memory leaks, null pointers, and other common errors as such. Violation of business rules can never be generalized to all applications.

      -dZ.

      1. the spectacularly refined chap

        It can't even do that

        No automated tool can do even that in the general case. A lot of these tools fail on precisely the code where they are most needed. Nodes with a variable number of children (e.g. void *next[1] as the last member of a node that is extended at the malloc() stage) come to mind. So do shared structures with reference counting, where a given sub-structure may have multiple parents.

        Ultimately it is an infinitely complex problem that is completely impossible to solve in the truly general case: you can consider it an example of the halting problem if you want, although there are plenty of other cases that are similarly impossible to prove correct in an automated manner.

    3. Anonymous Coward
      FAIL

      "C/C++"?

      I lost all respect for the article well before getting as far as "proving" the unprovable. There is no such language as "C/C++" and people who use that term to imply both are the same and interchangeable are invariably not particularly great at either C or C++.

  4. Anonymous Coward

    Amazing piece!

    Especially since the programming language referenced doesn't actually exist.

    Yes, pedantic. But honestly, even if this is lazy shorthand for "C and C++", the code is essentially C and completely ignores C++, as does the rest of the text apart from being lazy. C++ brings powerful new tools but with them come equally dangerous snags and gotchas, that in turn can be mitigated by careful use of those powerful tools. But not a peep from the author. And, of course, it pretends no other tools than static code analysis existed before now, which isn't true.

    Even a decent libc will provide decent reporting and debugging tools for the memory management this author claims doesn't exist in this fictional language. Well, maybe it doesn't. But I write code in both C and C++, and yes I do know the difference, thank you, and I do have plenty of tools and tricks that help a lot here.

    This is not to say that the buzzword-to-be touted here isn't useful. It merely is not, as in cannot possibly be, nearly as dramatic as portrayed, and the portrayal smacks of betrayal of the tools that came before. That means I now feel, having read the article, I wasted my time. Which is a pity, for a codesmith too needs to keep his toolchest well-filled.

  5. Gerard Krupa
    FAIL

    the ability to prove the absence of bugs?

    http://en.wikipedia.org/wiki/Turing%27s_proof

    I've not seen any static analysis tool that isn't full of controversial rules. Still, it would be nice to see something like PMD telling me where bugs actually are instead of just a vague indication of where they might be based on heuristic patterns.

  6. Anonymous Coward

    Absence of bugs

    To paraphrase Dijkstra, "this shows the presence, not the absence of bugs"

  7. tr7
    Thumb Down

    Comment bait?

    How do you know what is lurking underneath the templates and calls to new and malloc? How do you know if that program is leaking memory or doing other dangerous things?

    In Visual C++, #define CRTDBG_MAP_ALLOC and _CrtDumpMemoryLeaks() have been around for quite some time, which makes detecting memory leaks a no-brainer.

    I hope for Monoidics' sake the product is better than the article.

    1. This post has been deleted by its author

      1. Blue eyed boy
        FAIL

        not bugs in the language implementation

        I have encountered this: a spectacular one in the Microsoft C/C++ compiler version 7. (N.B. NOT Visual C). The construction it choked on was the use of a transient object in an initialiser list:

        class thingy
        {
        public:
            thingy(const char *text);
            int value();
            ~thingy();
        };

        class whatsit
        {
            int number;
        public:
            whatsit(const char *text);
            ~whatsit();
        };

        whatsit::whatsit(const char *text) :
            number(thingy(text).value()) // This is the crucial line: a temporary thingy
        {}

        int main()
        {
            whatsit *z;
            z = new whatsit("Sit back and watch the system crash\n");
            delete z;
            return 0;
        }

        All would go well till we got to the whatsit constructor and the call upon the value() method of a transient thingy to initialise whatsit::number. Microsoft C7 would then call the following, in order:

        (a) thingy *destructor* with a garbage pointer (most likely NULL) for *this

        (b) thingy::value() method with (usually) the same garbage pointer

        (c) thingy constructor with a valid address for a new thingy

        Fix: purge the code of transient objects in initialiser lists. Done, and all was well.
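For comparison, here is a minimal sketch of the order a conforming compiler is expected to produce for the same construction: constructor, then value(), then destructor, all before the body of the enclosing constructor runs. The event log, the `record_order` helper and the string-length body of `value()` are assumptions added for illustration; they are not part of the original listing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the call order observable.
static std::vector<std::string> g_events;

class thingy {
public:
    explicit thingy(const char *text) : length_(0) {
        while (text[length_] != '\0') ++length_;
        g_events.push_back("thingy ctor");
    }
    int value() const {
        g_events.push_back("thingy value");
        return length_;
    }
    ~thingy() { g_events.push_back("thingy dtor"); }

private:
    int length_;
};

class whatsit {
    int number;

public:
    explicit whatsit(const char *text)
        : number(thingy(text).value())  // the temporary dies at the end of this initialiser
    {
        g_events.push_back("whatsit body");
    }
    int get() const { return number; }
};

// Builds one whatsit and returns the recorded sequence of events.
std::vector<std::string> record_order() {
    g_events.clear();
    whatsit w("abc");
    assert(w.get() == 3);  // value() saw a fully constructed thingy
    return g_events;
}
```

The temporary is destroyed at the end of the full-expression that created it, which is why the destructor entry appears before the constructor body entry.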

  8. heyrick Silver badge

    Back in the nineties...

    ...I was teaching myself C, having grown up with BBC Basic. I quickly learned that memory was something I needed to entirely take care of myself.

    To aid in this, I wrote some functions to wrap around calloc(), realloc() and free() (plus an atexit() routine). This simple little thing trapped and reported all sorts of idiotic mistakes, like the one suggested in this article. No maths, no fancy product, just some simple logic capable of raising an error...
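A rough idea of what such wrappers might look like, written here in C++ rather than the original C. This is only a sketch: the `memtrack` namespace and every function name in it are hypothetical, and the real wrappers described above may have worked quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <set>

namespace memtrack {  // hypothetical name, for illustration only

// Pointers handed out and not yet released. The set is heap-allocated and
// intentionally never destroyed, so it is still usable in the atexit handler.
inline std::set<void *> &live() {
    static std::set<void *> *blocks = new std::set<void *>;
    return *blocks;
}

// Number of misuse errors (double free, foreign pointer, bad realloc) so far.
inline int &error_count() {
    static int count = 0;
    return count;
}

// Runs at normal program exit and lists every block still outstanding.
inline void report_at_exit() {
    for (void *p : live())
        std::fprintf(stderr, "memtrack: block %p was never freed\n", p);
}

// Registers the exit report exactly once.
inline void ensure_registered() {
    static const bool registered = (std::atexit(report_at_exit) == 0);
    (void)registered;
}

inline void *tracked_calloc(std::size_t count, std::size_t size) {
    ensure_registered();
    void *p = std::calloc(count, size);
    if (p) live().insert(p);
    return p;
}

inline void *tracked_realloc(void *old, std::size_t size) {
    ensure_registered();
    // Zero-size requests are rejected here because their behaviour varies.
    if (size == 0 || (old && live().count(old) == 0)) {
        ++error_count();
        std::fprintf(stderr, "memtrack: bad realloc request\n");
        return nullptr;
    }
    live().erase(old);
    void *p = std::realloc(old, size);
    if (p) live().insert(p);
    else if (old) live().insert(old);  // on failure the old block is still valid
    return p;
}

// Returns false (and frees nothing) for a pointer this module does not own.
inline bool tracked_free(void *p) {
    if (p == nullptr) return true;  // free(NULL) is a harmless no-op
    if (live().erase(p) == 0) {
        ++error_count();
        std::fprintf(stderr, "memtrack: double free or foreign pointer\n");
        return false;
    }
    std::free(p);
    return true;
}

}  // namespace memtrack
```

Anything still in the set when the program exits normally gets reported, and a second free of the same block is caught before it reaches the real free().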

  9. Anonymous Coward

    silly article.

    > There are a variety of methods for analysis of your program: denotational semantics, axiomatic semantics, operational semantics, abstract interpretation, and separation logic

    Denotational/axiomatic/operational semantics are means of expressing formal (i.e. mathematical) semantics, NOT analysis techniques. Abstract interpretation is an analysis technique (more precisely, it can be used to that end). That the author doesn't understand the difference and the use of each is just incredible and invalidates what he's written.

    > the ability to prove the absence of bugs

    Maybe, but only against a formal specification - which is where the semantic systems mentioned above come in. But mapping from formal spec to all viable programs that implement that spec (which include, of course, the one the user wrote to get the job done) - impossible in general, I guarantee it.

    @The Author: make sure your next article is not cobblers, thank you.

    @AC "Advertisement, and multi-threading", dunno about multithreading, reading up on Spin at the mo <http://spinroot.com/> but this is a model checker not a code checker - tough to get your head round but nifty though!

    @tr7: > ... #define CRTDBG_MAP_ALLOC and _CrtDumpMemoryLeaks() has been around for quite some time which makes detecting memory leaks a no brainer.

    These only work when you run the prog (I assume?), meaning e.g. that you can't rule out missing obscure corners where leaks lurk, unless you run the prog in all possible ways, which is impossible. Static analysis is guaranteed to pick these cases up, under the right circumstances.
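To make the point about obscure corners concrete, here is a small made-up example. The counter stands in for a run-time leak detector, and `process`, `checked_alloc` and the magic value 42 are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter standing in for a run-time leak detector.
static int g_outstanding = 0;

static char *checked_alloc(std::size_t n) {
    ++g_outstanding;
    return new char[n];
}

static void checked_free(char *p) {
    if (p) {
        assert(g_outstanding > 0);
        --g_outstanding;
        delete[] p;
    }
}

// Returns false for the one "malformed" record type.
bool process(int record_type) {
    char *buf = checked_alloc(64);
    if (record_type == 42) {
        return false;  // BUG: buf is never released on this rarely taken branch
    }
    buf[0] = static_cast<char>(record_type);
    checked_free(buf);
    return true;
}

// Number of blocks allocated but not yet freed.
int outstanding_blocks() { return g_outstanding; }
```

A test run that never feeds in record type 42 reports zero outstanding blocks, so a run-time checker stays silent. The leaking branch is visible in the source regardless of which inputs were tried, which is the case a static tool is in a position to flag.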

  10. Anonymous Coward
    FAIL

    Another bug in the program

    The return value of the function is never used, and since no output is ever generated the entire program aside from main returning EXIT_SUCCESS can be optimized away. While a good static analyzer would let you know about the memory leak, a great one would not even check the function for a memory leak since it doesn't matter.

  11. Hungry Sean
    Dead Vulture

    oi

    As someone who would potentially be interested in the technology being discussed here, I can't help but feel that the tone and level of understanding assumed in this article are completely unsuited to the audience. C is alive and well and most good engineers realize that it will be with us for quite some time. The example of where these tools would help shown in the article is something that would also be caught by valgrind.

    While I appreciate that memory leaks are a major source of pain, the things that I find catching me more often are logical errors (e.g. the heap I implemented isn't actually preserving the heap property on insert because somewhere I have a comparison reversed). I don't think these tools will help out there (normally you use unit tests instead), lint is good for style checking, valgrind or purify are good for memory leaks, so why exactly do I care about these static analysis tools?
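The reversed-comparison heap bug mentioned above is the kind of thing a unit test picks up quickly. A minimal sketch, assuming a vector-backed binary max-heap; `heap_insert` and `builds_valid_max_heap` are hypothetical names, and the check uses the standard `std::is_heap`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <utility>
#include <vector>

// Inserts value into a binary max-heap stored in a vector, sifting up with
// the supplied comparison. Passing the wrong comparison models the bug.
template <typename Less>
void heap_insert(std::vector<int> &heap, int value, Less less) {
    heap.push_back(value);
    std::size_t i = heap.size() - 1;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (!less(heap[parent], heap[i])) break;  // parent is not smaller: stop
        std::swap(heap[parent], heap[i]);
        i = parent;
    }
}

// Builds a heap from fixed test data and reports whether the max-heap
// property holds afterwards.
template <typename Less>
bool builds_valid_max_heap(Less less) {
    std::vector<int> heap;
    for (int v : {5, 1, 9, 3, 7, 2}) heap_insert(heap, v, less);
    return std::is_heap(heap.begin(), heap.end());
}
```

With `std::less<int>` the property holds; swapping in `std::greater<int>` reproduces the reversed comparison and the check fails, with no memory error involved at any point.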

    1. Anonymous Coward
      Megaphone

      you need

      ...formal verification: http://en.wikipedia.org/wiki/Formal_verification

  12. Tom Wood

    Not dead languages

    "So, by some misfortune, usually instigated by "management" or by "tradition", you are stuck with a C/C++ program to maintain.

    ...

    While garbage collected languages have become a large part of the programming marketplace, interestingly, C/C++ is still largely used in many critical domains: banking, embedded (automobiles), avionics, networking, operating systems, etc."

    1. Static analysis isn't just about finding memory leaks.

    2. Garbage-collected languages can still benefit from static analysis tools.

    3. As you note, C or C++ are widely used in many domains, and mostly not for reasons of "management" or "tradition". Like device drivers, operating systems, and all sorts of embedded systems (like, for example, at least the lower layers of the stack in a mobile phone, PVR, set-top box, broadband router/modem, ...) where performance on a low-power system is the main criterion. C and C++ are also widely used for lots of high-performance application software, including Firefox, Chrome and OpenOffice.org, and server software like Apache, and heaps of proprietary stuff too. So yeah, C and C++ are far from dead languages.

  13. AbortRetryFail
    Thumb Down

    Rubbish article

    The author states that nobody writes C++ any more and makes a thinly-disguised sneer at the "misfortune" that you may have to maintain it, then goes on to contradict himself by making a long list of market sectors that do actively develop in C++.

    For the record, myself and a large number of others do actively develop in C++. And with judicious use of smart pointers and RAII techniques, there is absolutely no reason to have resource leaks. And all without some bloated garbage collection that cleans up when it feels like it rather than when you want it.

    The whole article seems to be a cross between an advert for a tool and a troll for comments.
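As a sketch of the smart-pointer and RAII style referred to above: the `Widget` type and its live-instance counter are invented for the demonstration, and `std::make_unique` assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical resource type that counts live instances.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
    Widget(const Widget &) = delete;
    Widget &operator=(const Widget &) = delete;
};
int Widget::live = 0;

// Allocates several widgets and then fails part-way through the work.
// No explicit new or delete appears anywhere.
void build_and_fail() {
    std::vector<std::unique_ptr<Widget>> widgets;
    for (int i = 0; i < 3; ++i)
        widgets.push_back(std::make_unique<Widget>());
    assert(Widget::live == 3);
    throw std::runtime_error("simulated failure");
}

// Returns how many widgets are still alive once the exception has propagated.
int live_after_failure() {
    try {
        build_and_fail();
    } catch (const std::runtime_error &) {
        // Stack unwinding has already destroyed the vector and its widgets.
    }
    return Widget::live;
}
```

The widgets are released at the point the vector goes out of scope, exception or not, rather than whenever a collector gets round to it.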

    1. James Thomas

      pfft

      Hear, hear! It's entirely possible to write c++ without memory leaks just by using containers and smart pointers. When i write code the isn't a single `new`, `delete` or `malloc` to be seen in application code, and consequently memory leaks are easily avoided.

      Interestingly the only app i`ve worked on recently that suffered badly from memory leaks was written in .net.

      1. Anonymous Coward
        FAIL

        @James Thomas

        « the [sic] isn't a single `new`, `delete` or `malloc` to be seen »

        With all due respect, this remark shows how little you know about C++ and RAII (smart pointers being just one aspect of this technique). It's true that C++ can somewhat reduce the need for dynamic allocation compared to languages/environments like Java or .Net. However, I guess we'll easily agree that one can hardly avoid dynamic allocation (and thus, 'new' and even sometimes 'delete' keywords) altogether in any decent-sized project. The point you're totally missing is how RAII techniques allow one to tightly encapsulate those mandatory parts, to the point that correctness can be proved quite easily (and often locally, which is even more important), and how this in turn allows one to safely and automatically reclaim the resources whenever an object goes out of scope.

        I won't go into a long explanation of how RAII works precisely, you can search that on the web as there are tons of very good resources (I recall some interesting papers from Herb Sutter, maybe you can start here).
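For readers who want the short version, a minimal sketch of the encapsulation described above: the only `new[]` and `delete[]` live inside one small class, so the pairing can be checked locally. `Buffer`, its counter and `sum_squares` are hypothetical names used purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Owns a heap array. Allocation and release are paired in one place.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) { ++live_buffers; }
    ~Buffer() {
        delete[] data_;
        --live_buffers;
    }

    // Copying is disabled so two objects can never own the same array.
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    int &at(std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }

    static int live_buffers;  // demonstration-only instance counter

private:
    int *data_;
    std::size_t size_;
};
int Buffer::live_buffers = 0;

// Uses a Buffer as scratch space. There is no cleanup code in the caller.
int sum_squares(std::size_t n) {
    Buffer buf(n);
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf.at(i) = static_cast<int>(i * i);
    int total = 0;
    for (std::size_t i = 0; i < buf.size(); ++i)
        total += buf.at(i);
    return total;  // buf's destructor releases the array here
}
```

Application code only ever sees `Buffer`, so the argument that the memory is reclaimed reduces to reading one constructor and one destructor.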

        « the only app i`ve worked on recently that suffered badly from memory leaks was written in .net. »

        Now we have a problem: either you're implying that the .Net VM itself leaks memory (really? a bug in the garbage collector? file a report then!), or your objects were still reachable because of some oversight on your part (and thus they weren't, strictly speaking, memory leaks, since by definition leaks happen when an object becomes unreachable).

        This only proves that each and every programmer out there can (and will, at some point) fail, no matter how many safety belts we use. RAII techniques are no better than garbage collection at protecting one against those classes of logic errors: how can memory be reclaimed, whichever technique is used, if the programmer wrongly tells the program that those objects are still needed?
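The "still reachable, so never reclaimed" case looks the same under reference counting as under a collector. A small sketch using `std::shared_ptr`; the `Session` type, the global registry and `open_and_forget` are made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Session {
    int id;
};

// Hypothetical long-lived registry that nobody remembers to prune.
std::vector<std::shared_ptr<Session>> g_registry;

// Creates a session, records it in the registry and drops the local handle.
// The returned weak_ptr lets the caller observe whether the object is alive.
std::weak_ptr<Session> open_and_forget(int id) {
    auto session = std::make_shared<Session>(Session{id});
    g_registry.push_back(session);
    return session;
}
```

As long as the registry holds its copy, the session stays alive even though no code will ever use it again. Neither RAII nor a garbage collector can know that; the memory only comes back once the program stops claiming the object is needed, for example by clearing the registry.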

        And thus you're missing the point again: what AbortRetryFail was saying is that when using the proper tools, memory management in C++ is just as safe and easy to handle as in any garbage-collected language. Obviously, C++ being what it is, this requires a bit of preparation (encapsulation work for specific cases, mainly) but once this is done it's just basic allocate-and-forget.

        What AbortRetryFail never implied (even though you seem to think he did) is that there is some silver bullet out there that could automagically correct one's failures at programming.

        As a footnote, I'll add that, yeah, I saw the irony and sarcasm in your post. But I'd rather give you the benefit of the doubt and assume that you were wrong in good faith.

  14. Ed
    WTF?

    Games

    Bear in mind that a large proportion of the games industry uses c++ too.

    I thought the article read rather like an advert and was rather condescending at the same time...

  15. Bruce Hoult

    C and garbage collection are not mutually exclusive

    It's a very long time since I wrote a significant C/C++ program without using the Boehm garbage collector.

    It makes life far more pleasant. You have to write a lot less code, think about housekeeping a lot less leaving more time to think about the actual problem you're solving. And it usually makes for faster programs, to boot.

    Yes, faster. The overall execution time is usually lower than programs using malloc&free, and certainly much lower than anything where you'd otherwise resort to reference counting.

    The downside is slightly higher memory use, and the odd pause while the GC runs. Unless your program uses gigabytes of RAM (some do of course, but most don't) the extra memory is inconsequential these days, and the pauses are under 100 ms, which is undetectable unless you're writing a video game.

    1. Anonymous Coward

      Conservative GC works, like the Boehm collector, has a cost?

      It can leak by mistakenly recognising a piece of memory as a valid pointer when it isn't. And in cases where a compiler does optimisations which 'vanish' pointers briefly (or indeed, where the programmer does), in theory a live object can be collected. You need to know what your compiler does.

      It's horrible, hairy stuff and probably tolerable for non-critical desktop apps but don't treat it as a magic black box.

      'ang on, just clicked, are you the Bruce Hoult who was involved in dylan? If so you probably know more about this than me. Still worth pointing out to others I guess.

  16. Anonymous Coward
    Flame

    C++ memory management

    As a fellow commentard duly noted, there is no such language as C/C++. But this is only part of the point.

    I just can't bring myself to believe that the author of this blatant infomercial is that ignorant about the C++ memory management techniques. Granted, he has to sell his product, but deliberately implying that C++ memory management has to be done the old C way won't buy any customers. Any half decent C++ programmer knows about how RAII can (and must) be used for precise garbage collection, and how it fits in with exception safety (which is the basics of C++ programming).

    FFS, how is it possible that in 2011 some people still confuse C++ and "C-with-classes"? Shame on you, author.

  17. This post has been deleted by its author

    1. This post has been deleted by its author

  18. redpola

    C has no memory management?

    Give me hands-on memory management over garbage-collection anyday.

  19. Jolyon Smith
    Grenade

    "Lack of memory management" ... bollocks !

    It's not that there is no memory management, it is that there is no garbage collector. Oh, unless you are using one.

    But even without a GC mechanism, there is always the GC of last resort... the software developer.

    And with that comes (at least the potential for) fine grained, optimised memory management, not coarse, optimistIC memory "management" (management in quotes because a GC doesn't really "manage memory", it manages the mess left behind by NOT managing memory).

    GC is to memory management as airbags and safety belts are to driving a car.

  20. Displacement Activity
    FAIL

    NFTR

    Got to agree with most of the other commentards here. This is a bad place to advertise your product. Lots of (most?) people use "C/C++". We know what we're doing. Memory management isn't magic; all those other languages probably wrote their memory management in "C/C++" anyway.

  21. peyton?
    FAIL

    for me, the fail was

    'At this point, some might suggest that "no one writes in C/C++ anymore"'

    I guess those "some" have never heard of TIOBE?

    http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html

    I also recall a Reg article about C dominating open source projects by like 40%?
