Languages don't breed bugs, PEOPLE breed bugs, say boffins

If you want to spark a religious war, express an unshakeable preference for a programming language, and by preference, make your favourite something relatively obscure, like Erlang. It turns out, according to a study by a bunch of UC Davis boffins, the differences in code quality between languages are pretty small. To be …

  1. Anonymous Coward

    No surprise

    With the exception of course work, I've never used the same language twice in a row. Requirements were the sole driver for how I did things.

  2. Paul Crawford Silver badge
    Joke

    What, no assembly language projects?

    1. yoganmahew

      Is there nothing to be said for another fixed point divide error?

      1. Primus Secundus Tertius

        Dodgy arithmetic

        In the 1970s I worked on various projects that used fixed point arithmetic, as floating point hardware was not available. Very few programmers could "get it".

        Even if the computer operations were correctly coded, there were other algorithmic errors: most often, the loss of precision when subtracting a large number from a nearly equal number.
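
        A minimal sketch of the cancellation problem described above, shown in Python floating point rather than 1970s fixed point (the effect is the same in kind). The function names are invented for illustration; the identity 1 - cos x = 2 sin^2(x/2) is standard trigonometry.

```python
import math

def naive_one_minus_cos(x: float) -> float:
    # For small x, cos(x) is very close to 1, so this subtracts two
    # nearly equal numbers and most significant digits are lost.
    return 1.0 - math.cos(x)

def stable_one_minus_cos(x: float) -> float:
    # Algebraically the same quantity, rewritten so that no
    # subtraction of nearly equal values takes place.
    s = math.sin(x / 2.0)
    return 2.0 * s * s

x = 1e-8
naive = naive_one_minus_cos(x)
stable = stable_one_minus_cos(x)
print("naive :", naive)
print("stable:", stable)
```

        Both functions are "correctly coded" in the sense that each operation does what it says; only the rearranged version keeps its precision for small inputs.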

        PCs now have floating point hardware by default, but I do fear for micro-controller systems.

        1. annodomini2

          Re: Dodgy arithmetic

          Necessary evil, when you are on MCU grade HW with no FPU running at maybe 100MHz and 32-bit if you're lucky.

          A SW floating point implementation can take 30-40x the runtime, in a hard real-time system this is not an option.

          Even with a FPU it's still slow.

          Fixed point arithmetic is not difficult, but does take some comprehending and yes it is possible to make mistakes.
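
          For readers who have not met it, a rough sketch of fixed point arithmetic in Python. The Q16.16 layout and the helper names are assumptions chosen for illustration, not anything from the comment. Python integers never overflow, so this hides one real hazard: on a 32-bit MCU the intermediate product in fx_mul needs a 64-bit type.

```python
FRAC_BITS = 16          # assumed Q16.16 format: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS    # the fixed point representation of 1.0

def to_fixed(x: float) -> int:
    return int(round(x * ONE))

def to_float(f: int) -> float:
    return f / ONE

def fx_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRAC_BITS

def fx_div(a: int, b: int) -> int:
    # Pre-scale the dividend, otherwise the scale factors cancel and
    # the fractional part of the quotient is lost.
    return (a << FRAC_BITS) // b

a = to_fixed(3.25)
b = to_fixed(1.5)
print(to_float(fx_mul(a, b)))   # 4.875
print(to_float(fx_div(a, b)))   # close to 3.25 / 1.5, within one step of 1/65536
print(to_float(a * b))          # the classic mistake: no rescale, result is 65536x too big
```

          The last line is the kind of error the comment alludes to: every operation is legal, but the scale factor has been lost track of.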

  3. Pete 2 Silver badge

    Still playing favourites

    > the differences in code quality between languages are pretty small

    Maybe so. But what about the differences in (language) learning time, ease of code development, the size of the executable and the speed it runs?

    It's also arguable that people who were taught one programming style will be more comfortable and produce better product when using languages which conform to that technique than if they are made to use a different, possibly merely more trendy, method of turning letters into bits.

    It would also be instructive to see whether the IDE (or lack thereof) used, or different coverage/testing techniques employed by different programmers contributed to the buginess of the end result.

    No matter how good / bad the language: the crucial difference is always the quality, documentation and extent of the supporting libraries and learning material.

    1. Alfred

      Re: Still playing favourites

      I've always found the crucial difference is the quality of the programmer, rather than the quality of the supporting libraries and learning material.

      1. Jared Vanderbilt

        Re: Still playing favourites

        True, but we are short on quality programmers. Software development infrastructure has to support the whole community, not just the top 10%.

        IT would grind to a halt if 90% of our programmers went back to pushing brooms.

        1. Trevor_Pott Gold badge

          Re: Still playing favourites

          Only the top 1% ever matter. Don't you follow American politics?

  4. sorry, what?
    Devil

    It's the programmers, stupid

    Obviously. Aptitude, understanding and experience will most impact quality. And sadly there are enough stupid programmers to keep contractors in work for life. As long as they don't mind fixing crap code.

    1. Herbert Meyer
      FAIL

      Re: It's the programmers, stupid

      Or maybe it's the stupid programmers. I always feel stupid when a typo kills a program. Misplaced commas, semicolons, misspelled names and keywords, etc. I usually shoot myself in the head (no serious damage, I was not using it), and go have a drink.

  5. Torben Mogensen

    Erlang and more

    The article mentions three factors that (somewhat) improve code quality: Strong typing, static typing and managed memory. Erlang has strong typing and managed memory, but it is dynamically typed. So by the (somewhat weak) conclusions in the paper, it is no surprise that Erlang is close to the average.

    What would be interesting is factoring out the number of years the programmers have worked with their language: C and C++ programmers have at least potential to have worked longer with their language than Erlang, TypeScript and Haskell programmers, and it is very likely the case that programmers who know their language intimately will make fewer errors. This also makes a case for languages that are sufficiently simple that you can actually manage to learn all of it.

    Also, these days, huge libraries mean that you quickly can produce useful code in almost any language, as long as you stay within the scope of the libraries. This makes the actual properties of the language itself largely irrelevant, again if you stay within the scope of the libraries. So a true test of language (isolated from library) productivity/quality would require giving programmers a task where they can not make any significant use of non-trivial library functions, or where you deliberately forbid use of anything but the most basic libraries, e.g., by limiting the total library code used to, say, 1000 lines.

    1. Charlie Clark Silver badge

      Re: Erlang and more

      C and C++ programmers have at least potential to have worked longer with their language than Erlang, TypeScript and Haskell programmers

      Erlang and Haskell have both been around for quite a while and are well-established in certain domains. TypeScript is so new that it doesn't really count.

    2. Anonymous Coward

      Re: Erlang and more

      It would also be interesting to know if the Erlang projects involved concurrent processing that scaled to fit the machine architecture. This is a hard task for most languages, so if Erlang achieved approximately equal code quality to non-concurrent programs in other languages, that could be a win for Erlang.

    3. Primus Secundus Tertius

      Re: Erlang and more

      Libraries cause as many problems as they solve. The Heartbleed bug was a conspicuous example, but there are many others.

      The only libraries that are nearly reliable are the time-honoured functions in the C library, and their equivalents in Fortran and Cobol. All the new language libraries are rushed out by promoters who just want to get it onto their CVs.

      1. Mage Silver badge

        Re: Erlang and more: C Libs??

        Learning C (or any language for real projects) is mostly learning the Libraries. Especially which ones NOT to use. There are loads of traditional C lib functions you'd be mad to use.

        1. Michael Wojcik Silver badge

          Re: Erlang and more: C Libs??

          There are loads of traditional C lib functions you'd be mad to use.

          I don't know about "loads" - the standard C library simply isn't very large, and if you want to write portable C code for hosted implementations then much of the library is indispensable for many applications. But certainly there are a number of ill-conceived elements in the standard library. gets is the canonical example, but strncpy, sprintf, atoi, and others are nearly always the Wrong Thing. Then there are those that are inappropriate in many circumstances (e.g. assert), or difficult to use correctly (e.g. everything in ctype.h).

    4. Michael Wojcik Silver badge

      Re: Erlang and more

      C and C++ programmers have at least potential to have worked longer with their language than Erlang, TypeScript and Haskell programmers, and it is very likely the case that programmers who know their language intimately will make fewer errors

      In my 30 years of professional C development - and this is, I think, supported ably by the archives of the comp.lang.c newsgroup, for example - years of using C and C++ have little to no correlation with knowing those languages. Often people who have been writing C or C++ code for decades have a severely limited and largely apocryphal understanding of both the language standard and typical implementation decisions.

      Most programmers seem to learn only a significantly restricted subset of a given programming language, often colored by the quirks of a particular implementation.

  6. Charlie Clark Silver badge

    Interesting but no cigar

    The analysis of defects is heavily dependent upon the bug tracker and the quality of that is heavily dependent upon the users.

    Additional information could be achieved through static code analysis, test coverage and penetration testing where possible. Tests can serve as the formal expression of the contract that code is supposed to implement. I'd wager that there is a significant negative correlation between (unit) test coverage and errors regardless of the language. There might be a correlation between language and test coverage, though this might be less necessary for purely functional programming.

  7. Mage Silver badge

    Real Programmers

    can write ForTran programs in any language.

    Yes, programmers may prefer certain languages, in certain domains the choice may be important. It may affect maintainability, but the real problem is you can teach a programming language easily, teaching programming much harder and a minority are good at it.

    1. Charlie Clark Silver badge

      Re: Real Programmers

      I think you're missing an adjective in your last clause.

    2. Primus Secundus Tertius

      Re: Real Programmers

      Very true!

      I saw lots of "engineers' Fortran" written in Coral66, Pascal, and C.

    3. Phil Endecott

      Re: Real Programmers

      > write ForTran programs in any language.

      My favourite was a huge lump of C code, translated from Fortran, that looked like this:

      arguement = a;

      b = sqrt(arguement);

      arguement = b;

      c = sin(arguement);

      Yes, "argument" was spelt wrong throughout. Every function call took arguement as its argument. And it was a global variable! "Thread safe? What's that?"

    4. wdmot

      Re: Real Programmers

      you can teach a programming language easily, teaching programming much harder and a minority are good at it

      I agree. I'm not good at teaching anything, and I admire those who are good at teaching, especially teaching programming.

  8. hammarbtyp

    Why single out Erlang?

    "Even Erlang" !!!???

    I'm sorry but was the author scared as a kid by Erlang's non-imperative syntax, concurrent architecture and nine nines reliability?

    As that great programmer Bill Shakespeare put it..

    If you load us, do we not use memory?

    if you run us, do we not execute?

    if you wrongly implement the functionality, do we not die?

    and if you wrong us, shall we not have a flame war or the benefits of functional programming languages?

    You hurt me, you really do

    1. Tom 7

      Re: Why single out Erlang?

      True - I remember the joy of working my way through the 'Functional Programming' book (from 1982?) implementing the virtual machine (SECD?) and feeding in the page or so of numbers in the back to get my first Lisp interpreter running so I too could enjoy the fun of Functional Programming.

      Pure Joy!

    2. Michael Wojcik Silver badge

      Re: Why single out Erlang?

      To start a religious war. Richard even wrote that explicitly right in that first sentence.

      Looks like he caught half a dozen. Good show.

  9. jake Silver badge

    If you don't know the ins & outs of ANY language + compiler + OS ...

    ... you are going to produce crap code.

    It's not exactly rocket science.

    1. The First Dave

      Re: If you don't know the ins & outs of ANY language + compiler + OS ...

      "It's not exactly rocket science."

      But that is very much the point: it _is_ rocket science, in some cases.

      1. jake Silver badge

        Re: If you don't know the ins & outs of ANY language + compiler + OS ...

        No, it's NOT rocket science.

        Rocket science is moving hardware along an assigned vector.

        Programming is moving ones and zeros through an array of relays.

  10. Anonymous Coward

    Complexity....

    Language/Projects

    C/linux, git, php-src

    JavaScript/bootstrap, jquery, node

    Ruby/rails, gitlabhq, homebrew

    Scala/Play20, spark, scala

    I appreciate that finding comparable complexity is challenging, but comparing the bug tracking effort and complexity of linux to Ruby's rails (however excellent rails may be) or Scala's Play framework doesn't really work does it?

    Pascal(/Object Pascal/Delphi) is a notable absence, I don't think they have a procedural-static-strong-unmanaged in the list.

    1. Anonymous Coward

      Re: Complexity....

      +1 for Pascal/Delphi

      Back in the day code needed not just to work but was also required to match a given style and approach/dogma that was consistent for all professional use of a language. In my day young coders typically started out with BASIC and then progressed to Pascal and assembler with emphasis upon understanding from the bottom up. The OS and hardware of the particular platform were well documented, with this documentation being freely available.

      Sadly PCs have had a history of intentional obscurity and programming has changed from being an understanding of how a computer really works to how to make it do what you want with the minimum amount of understanding/coding time.

      It is hardly surprising then that coders now have no "code".

    2. Charlie Clark Silver badge

      Re: Complexity....

      Lisp is also noticeable by its absence.

      The inclusion of TypeScript and CoffeeScript initially threw me, and I do think they shouldn't be included in the assessment, partly because they're both too new, but mainly because they produce JavaScript. However, a head-to-head of the three for solving common well-understood problems might be interesting.

  11. Anonymous Coward

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."

    I can't remember who (if anyone) this was attributed to, but it is still as true today as the day I first heard it. And I have seen a lot of programming languages betweentimes.

    1. DropBear

      I'm not so sure about that - it's true today because mainstream languages are far more similar to each other than they are different, in how they go about turning text into code. While I'm not expecting that to change in the next decade, I take issue with the stance that it's flat out impossible to devise some method of conveying desired functionality that makes typical programming errors hard if not impossible to commit. It would still be perfectly possible to specify a different functionality than the one you mean to, mind you, but I don't think it's impossible to deliver that functionality in a robust, fault-resistant way. And before you ask just how one would go about doing that - well if I knew, would I be sitting here commenting on El Reg....?!?

      1. TheOtherHobbes

        The study would have been more useful if it controlled for experience and background. E.g. Objective-C, JavaScript and maybe Python coders are more likely to be relative beginners and/or semi-pros - not because pros don't use them, but because a lot of newcomers are likely to start with them.

        Haskell programmers are more likely to be hardcore comp sci types, or pros with a serious comp sci background. Java/C++ are going to cover a spread of experience, from new grads at one end to >25 year lifers at the other.

        So I think it's confusing two questions. One is 'Do certain languages help you write bug-free code?' The other is 'Do programmers who write good/bad code tend to use certain languages?'

        1. ultimate_noobie

          As a more or less professional "verification engineer", I have to totally agree that these are the two big questions one would need to separate and analyze to get any real direction out of the data. Worse yet, you'd need to add a third and fourth to that list: "Does choice of language affect future revision?" and "Does experience as a team affect outcome?", to really get anything meaningful about choice of language for a project, be it open or commercial.

          One interesting point: the most defect-free projects (both in terms of code errors and requirements) I've seen in the last decade were written in functional Ada95, and the worst I've seen were in C++98. Even people with little experience tend to be able to at least read Ada, be able to update it with some googling and keep everything consistent for 20 year old code. It _seems_ that anyone with even 2 years with C++ will try to throw everything into templates and start inlining for "efficiency" reasons. And that holds for both individuals and 20+ person teams. (The 'best' I've seen have been prototyped in Python first since that would give them reliable-ish results before 'hard coding' into a language with a compiler and they could just use the Python as PDL requirements. Not something I expect to see in open software but there's always hope.)

      2. Michael Wojcik Silver badge

        "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."

        ...

        I'm not so sure about that...

        It's trivially true, because "bad code" is a degenerate metric. It's a statement that expresses a subjective impression rather than anything that can be formally falsified.

        Consider for example those "visual" programming environments where you arrange processing modules in a workspace and then wire them together to express data flows. I worked on one of those in the late '80s (IBM's Data Explorer/6000). It was a Turing-complete medium for expressing computation, so a "programming language", but it was certainly difficult to commit many common errors. You could still do things like create graphs that expressed algorithms with abysmal performance characteristics, though; is that "bad code"?

  12. preppy

    Definition of "bug".....or definition of "defect"?

    I'm not sure what the paper is trying to measure, since there is a hierarchy of defect types with the most immediate defect being "fails to compile" or "fails to run"........all the way up to "fails to match the original business requirement". In this latter case, the code might be technically perfect!

    Examining Table 5 in the original paper ("Categories of bugs...") it would appear that failure to match the original business requirement is NOT considered a bug!!!! So perhaps the rather technical focus of the paper misses the main point of having software in the first place.

    ....just saying.

    1. Kristian Walsh Silver badge

      Re: Definition of "bug".....or definition of "defect"?

      You've hit on the Achilles' Heel of FOSS projects. There usually isn't a business requirement to meet, so the code is free to go wherever its devs want it to. That's both good and bad, depending on whether you're a developer or a user...

      The other problem I have with this is that using github as a source will over-represent the work of very inexperienced programmers. These days, everyone's "first code project" ends up on github. If the developer's own inexperience is the major limiting factor, how can you accurately judge the effect of the language they've chosen on the code quality?

  13. Shady
    Mushroom

    ...the differences in code quality between languages are pretty small...

    Even between PHP and C# ?

    [Lights touchpaper, stands well back]

    1. lurker

      Re: ...the differences in code quality between languages are pretty small...

      I've certainly seen plenty of garbage code written in both.

  14. Anonymous Coward

    4GLs

    It's a pity that 4GLs died a death - a comparison of those against 3GLs and functionals would be interesting.

  15. Mage Silver badge

    Modula-2

    Most people used it as a sort of Pascal failing to understand the advantages of

    1) Opaque Modules

    2) Procedure Variables

    3) Anonymous arrays etc of same size not compatible / Stronger typing

    4) dynamic sized variables

    5) Co-routines and Stack frames etc as part of language to create multi-CPU, Multi-threaded, Mutexes etc

    6) default on Compiler features such as array bound checking

    7) Zero need for ANY global variables (i.e. Local Persistent variables for repeated calls of function/procedure in a Module) or "global" only to nested modules

    8) Separate compilation of modules without DLLs.

    and many other features alien to Pascal. The only aspect of C++ better is syntax for objects (as M2 can do them sensibly). The failure of C++ is C compatibility. Too many C++ compiled programs are traditional C style. Doesn't anyone read what Bjarne Stroustrup (father of C++) said about C?

    http://en.wikipedia.org/wiki/Bjarne_Stroustrup

    C# is of course really MS version of Java, and a creditable effort. No point in VB.net if you bother to learn C#

    (The VB6 is another story, if you pretend you are writing C++ or Modula-2 in it and never did BASIC or Fortran).

    1. thames

      Re: Modula-2

      Have an up-vote for mentioning Modula-2. It was my favourite language at one time. Too bad it died many years ago. JPI had a fabulous IDE, compiler, and debugger, which I thought were far better than anything from Microsoft or Borland in any language.

      Today though I think that Python is the bee's knees. It gets the job done in a lot fewer lines of code, and these days I'm far more interested in getting results than I am in the finer points of opaque types (that's the proper name for it, not opaque modules).

      So far as bugs are concerned, most studies say that the number of bugs per line of code tends to be constant, regardless of language, so fewer lines of code means fewer bugs. Good test coverage tends to be the only thing that really works so far as rooting out the remaining bugs is concerned.

      1. Mage Silver badge

        Re: Modula-2

        Been doing C++ since 1988. Last did any Modula-2 about 1997. I've forgotten most of it. What is really horrendous is maintaining or extending a Web Server application. I think I counted something like 9 languages on one project I worked at last year (if you count SQL, Oracle SQL stuff telling the engine what to do, HTML and CSS as "languages"). You are reduced to using a text editor, running on a test server with debug options on and it's like 1979, except you have multiple windows, 8G RAM, 2Tbyte storage and quad core 64bits instead of 6502 or Z80, one screen and 1M byte 8" floppies.

        People writing gadget or desktop GUI applications in one language with a visual GUI editing forms/Windows etc don't know how cushy they have it. Even embedded JAL, C or Assembler for a PIC micro-controller has a better Compile Time environment and is less error-prone than Web Server development. It explains why Microwave ovens and Washing machines mostly work and the Internet is full of semi-broken poor usability websites full of eye candy and security flaws. Why on mailing lists are we STILL seeing almost every week SQL injection, PHP errors, cross site scripting, escalation of privilege etc?
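
        For anyone wondering what the SQL injection complaint looks like in practice, here is a small sketch using Python's standard sqlite3 module. The table, column names and data are hypothetical, and the same idea (placeholders instead of string concatenation) applies to other database drivers under their own placeholder syntax.

```python
import sqlite3

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

def lookup_unsafe(name: str):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(name: str):
    # The ? placeholder passes the value separately from the SQL text,
    # so it is always treated as data and never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()

attack = "nobody' OR '1'='1"
print(len(lookup_unsafe(attack)))  # every row leaks
print(len(lookup_safe(attack)))    # no user has that literal name
```

        The unsafe version is one line shorter and works fine in every demo, which probably goes some way to explaining why it keeps turning up.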

  16. roselan

    Most of us suspected for a long time that the "static vs dynamic type" language debate was less important than environment (time to do your job) for code quality.

    This paper confirms it, I guess.

  17. Tom 13
    Joke

    Wasn't Erlang the language

    the Martians used to write their rocketry programs in Jeff Wayne's musical version of War of the Worlds?

    1. Tom 7

      Re: Wasn't Erlang the language

      No - the Erlang is the measure of simultaneous phone calls a telephone exchange can handle.

      1. Mage Silver badge

        Re: Wasn't Erlang the language

        Erlang can be used for Mobile.

        It's not actually the maximum number of calls on an exchange (though related), but it is the unit of "traffic"

        "When used to represent carried traffic, a value (which can be a non-integer such as 43.5) followed by “erlangs” represents the average number of concurrent calls carried by the circuits (or other service-providing elements), where that average is calculated over some reasonable period of time. "

        See

        http://en.wikipedia.org/wiki/Erlang_%28unit%29

        You need Stochastic analysis to predict what traffic a system can carry. You can use these methods to predict actual speed for a contended backhaul of a known contention ratio (total subscribers x speed sold even if not connecting vs actual backhaul speed)

        See also

        http://en.wikipedia.org/wiki/Erlang_distribution
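
        As a rough illustration of the kind of calculation involved, here is a sketch of the classic Erlang B blocking formula in Python. It assumes Poisson call arrivals and that blocked calls are simply lost, which may not match the contended backhaul case mentioned above; the function names and the example figures are made up.

```python
def offered_traffic(calls_per_hour: float, mean_hold_seconds: float) -> float:
    # Traffic in erlangs = arrival rate x mean holding time,
    # with both expressed in the same time unit (hours here).
    return calls_per_hour * mean_hold_seconds / 3600.0

def erlang_b(traffic_erlangs: float, circuits: int) -> float:
    # Probability that a new call finds all circuits busy, using the
    # standard recurrence B(0) = 1, B(m) = E*B(m-1) / (m + E*B(m-1)).
    b = 1.0
    for m in range(1, circuits + 1):
        b = traffic_erlangs * b / (m + traffic_erlangs * b)
    return b

# Example figures (invented): 870 calls an hour, 3 minutes each.
traffic = offered_traffic(870, 180)
print(traffic)                 # 43.5 erlangs
print(erlang_b(traffic, 50))   # blocking probability with 50 circuits
print(erlang_b(traffic, 60))   # lower with 60 circuits
```

        The recurrence avoids the large factorials in the closed-form expression, so it stays numerically well behaved for big circuit counts.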

  18. Kamal Hashmi

    Programming/Coding is NOT Design

    Oh, for c**p's sake, the language is merely used for implementation.

    You use (and learn) whatever the customer wants.

    A proper study would have looked at bug rates in different language implementations of the same design.

  19. razorfishsl

    You just know some guy in hiking boots and a beard is not going to care about error handling, it's just too pro-establishment.
