So they...
...finally got to the ROOT of the problem...
Someone had to say it.
Tens of thousands of bugs have been eliminated from the program CERN's atom-smashers are using to identify the Higgs boson – just don't expect an answer to life, the universe and everything anytime soon. CERN says it has squashed 40,000 bugs living in ROOT, the C++ framework it relies upon to store, crunch and help analyse …
Slightly polarised for clarity:
The job of a scientist is to do science - developing theories and models and writing code to back them up or prove the concept. Programs written may not even be intended to be used by anybody but the original author and their shelf-life is "until the concept is proved", at which point they drop the code and move onto their new task.
The job of a professional programmer is just that: writing programs to a professional standard. Since this is what they do (roughly) all day, every day, they can get pretty good at it. Their programs are intended for use by other people and the shelf life is "until my company decides not to support it."
Yes, some scientists do write some awful code (but so do some professional programmers), but the two groups write programs for different reasons so a direct comparison seems invalid.
(I write this as a PhD student who writes software for a living).
"The job of a scientist is to do science - developing theories and models and writing code to back them up or prove the concept. Programs written may not even be intended to be used by anybody but the original author and their shelf-life is "until the concept is proved", at which point they drop the code and move onto their new task."
That is the attitude of the staff of the Climate Research Unit, which gave us the memorable reading experience that was the harryreadme file.
Hint. If *your* kind of science can be done by writing a dozen equations on a blackboard, great. The software is merely a quick and dirty demonstrator of your theory and you can write it any way you like.
However, *if* your kind of science involves crunching multiple *large* data sets (and the *way* you process them is *critical* to your whole thesis and/or will be used to make *billion* pound policy decisions), your process should be substantially more methodical.
Since you did me the courtesy of leaving your name on your comment let me explain.
"Why so much asterisk bracketing sir?"
At the risk of being obvious: because there are several points I want to emphasise.
Not everyone here speaks English as a first language (and I sometimes doubt some posters have Earth as a first planet either).
"on random words derails the little voice in my head that I read with."
At the risk of a Humpty Dumpty answer, the words I emphasise make perfect sense to me.
Now *apart* from my commenting style do you have a substantive comment to make?
The C++ code (not ROOT) on LHC experiments has to run day in, day out, over and over again, processing the data, and is written to a much higher standard than the typical academic's code. Most of the code in ROOT is only used in private code used by 2 or 3 individuals at a time for data analysis, not "mission critical" stuff (although a small part of ROOT is used for data persistification to file). The people writing the code that is used to process the data do write code day in, day out and are pretty good at it - this is often 50%+ of their job, even though they trained as scientists in most cases. The rest of them are perhaps as you suggest ;) - I've seen lots of awful coding in LHC experiments.
Well Python is probably preferable to C++, even if it is still imperative. I can't think of a language much less suited to having hundreds of non-programmers collaborating. It's just so dangerous. How on earth can they do proper unit tests? I bet as soon as any particular experiment comes to an end the associated code gets binned straight away due to rot. A better approach would have been to get them trained and using ML from the start.
The Standard Model Higgs (and indeed, there are others, like Supersymmetric Higgs) is just a mathematical consequence of the Standard Model. We know the latter is not a 100% correct description of physics, so it might well prove to have a hole where the Higgs should be.
Indeed, by end of the year there might well be enough data to exclude the Standard Model Higgs "at 95% confidence level". That would make things interesting.
More here: http://www.math.columbia.edu/~woit/wordpress/?p=3960
My experience of C++ -- which goes back to the early 90s -- is that it's a very powerful tool that's responsible for pretty much every large scale screwup in modern software design (plus the inevitable software bloat). I describe giving a typical programmer this tool as "a bit like giving a toddler a chainsaw as a Christmas present". I also think that object methodology is seriously overused; it's all that gets taught so we're stuck with the "if all you've known is a hammer then everything looks like a nail".
Now, rather than making the typical programmer statement "it's buggy because it's got x million lines of code in it" we should be asking why it's so big, why it doesn't break down into testable components and so on. Ordinary, everyday stuff that I will admit seems to be elusive to our Windows brethren (Microsoft doesn't go out of their way to make their stuff easy to work with, IMHO) but absolutely essential if you're doing serious work such as embedded design.
I find professional programmers -- CS majors -- among the worst offenders because they only know their coding abstractions, they see the code as the goal rather than it being a model of some thing or process.
Flame On.....
Well, at the time ROOT development was started, writing a major scientific package at CERN in C++ instead of Fortran 90 was a controversial choice, which surprised me, as an observer in the next building, more than somewhat. My guess is that, yes, OO programming in an unsafe language like C++ would be more error-prone than subroutine-based structured programming in Fortran 90; but if you knew the individual mostly involved in originating ROOT, you would know that making this argument at the time was doomed to failure. The fact is that if ROOT hadn't been written in C++, it would never have been written at all. The modern generation of physics experiments would be using some completely different software for event reconstruction. I expect it would have a different set of 10000 bugs.
I'm looking at the timescales here, and wondering what you could have picked from. I suspect, for one thing, that if Windows was even involved, it was NT, and the reality was mostly some form of Unix.
And you wanted something that, as a programming environment, would be available for a long time.
The world looks a bit different now. I had a few megabytes of storage then. Now we talk in terabytes. I reckon they didn't do so badly, if they only found 40,000 bugs.
"Microsoft doesn't go out of their way to make their stuff easy to work with, IMHO"...
Have you taken a look at F#? Fabulous language - OCaml on interoperability steroids and a real programmer's tool. Not everything that comes out of Microsoft is rubbish. (Just a shame you have to rely on Mono to use it on a proper OS.)
No they aren't. Windows is a mess for the uninitiated. For people touching a computer for the first time it's the absolute worst OS choice out there. It requires the most maintenance and has the least intuitive interface of any OS I've ever had the displeasure of using.
That C++ is dangerous, that C++ is like a chainsaw, etc.
Any language can engender a bloody mess, or a fine piece of digital crafting.
Be it VB6, .NOT, Delphi, Perl, Bash, or C/C++.
I have seen software being written in VB6 which was both well designed and elegant, and I have seen true C++ abortions of nature, and the other way around.
It is the developer who makes the difference here.
Here's what I want to know: why didn't anyone investigate the "false positives" and "correct" them? I have found many problems in porting F/LOSS code to other platforms that, once you clean up all the "warnings", seem to magically disappear.
All programmers should have this drilled into their heads, "Compiler warnings ARE bugs."
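To illustrate the kind of warning that hides a real bug, here is a minimal sketch, assuming a mainstream compiler such as GCC or Clang with `-Wsign-compare` enabled. The function names are hypothetical. Comparing an `int` against `values.size()` triggers that warning because the `int` is silently converted to an unsigned type first; the first function writes that conversion out explicitly so the snippet itself compiles cleanly.

```cpp
#include <cstddef>
#include <vector>

// "Is n smaller than the number of elements?" as the compiler actually
// evaluates `n < values.size()` when n is an int. In real code the cast is
// implicit, and that implicit conversion is what -Wsign-compare reports.
bool below_size_as_compiled(int n, const std::vector<int>& values) {
    return static_cast<std::size_t>(n) < values.size();
}

// Fixed version: handle negative values before converting to unsigned.
bool below_size_fixed(int n, const std::vector<int>& values) {
    return n < 0 || static_cast<std::size_t>(n) < values.size();
}
```

With a three-element vector, the first function says that -1 is not below 3, because -1 becomes a very large unsigned number; the second gives the arithmetically expected answer. Cleaning up the warning forces you to decide which behaviour you actually meant.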
is a pile of utter rubbish. It is terribly written, and its continued use is holding back progress.
The developers don't know a decent inheritance structure from their elbow, and have only grasped the concept of name spaces.
The program encourages the mixing of analysis code and graphical representation code which leads to utter confusion.
Half the methods aren't implemented, the classes are terrible. Everything inherits from a generic TObject.... They have invented their own scripting 'language' CINT, which
Memory management and garbage collection are alien words to these developers, and they prefer to delete / keep 'ROOT' objects at the end of functions, not allowing for scope of ownership.
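For readers wondering what "scope of ownership" could look like in practice, here is a minimal sketch, assuming C++11 and `std::unique_ptr`. `Histogram`, `make_histogram`, `fill` and `Report` are hypothetical stand-ins, not ROOT classes or ROOT's actual API. The idea is that the function signatures themselves say who owns an object and who merely borrows it.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an analysis object; not a ROOT class.
struct Histogram {
    explicit Histogram(std::string n) : name(std::move(n)) {}
    std::string name;
    std::vector<double> bins;
};

// Returning unique_ptr states that the caller owns the result.
std::unique_ptr<Histogram> make_histogram(const std::string& name, std::size_t nbins) {
    std::unique_ptr<Histogram> h(new Histogram(name));
    h->bins.assign(nbins, 0.0);
    return h;
}

// Taking a reference states that this function only borrows the object.
void fill(Histogram& h, std::size_t bin, double weight) {
    if (bin < h.bins.size()) h.bins[bin] += weight;  // out-of-range bins are ignored
}

// Hypothetical container that takes ownership explicitly via std::move.
class Report {
public:
    void adopt(std::unique_ptr<Histogram> h) { owned_.push_back(std::move(h)); }
    std::size_t size() const { return owned_.size(); }
private:
    std::vector<std::unique_ptr<Histogram>> owned_;
};
```

A caller would create a histogram with `make_histogram`, pass `*h` to `fill`, and hand it over with `report.adopt(std::move(h))`, after which `h` is null and the `Report` deletes the object when it goes out of scope. Nobody has to remember whether to call `delete` at the end of a function.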
Have a look at the ROOT source.....
#ifdef private
#undef private
#define private_was_replaced
#endif
// For now explicitly disable copying into the value (i.e. the proxy is read-only).
private:
TImpProxy(T);
TImpProxy &operator=(T);
#ifdef private_was_replaced
#define private public
#endif
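What the snippet above does: if someone has defined `private` as a macro, it temporarily undefines it so the two declarations really are private, then redefines `private` as `public` afterwards. For comparison, here is a minimal sketch of how the same intent (a read-only proxy) can be expressed since C++11 with `= delete`, which does not depend on access control at all. `ReadOnlyProxy` is a hypothetical class loosely modelled on the quoted code, not ROOT's `TImpProxy`.

```cpp
#include <type_traits>

// Hypothetical read-only proxy; not ROOT's TImpProxy.
template <typename T>
class ReadOnlyProxy {
public:
    explicit ReadOnlyProxy(const T& value) : value_(value) {}

    // Reading the wrapped value is allowed.
    operator T() const { return value_; }

    // Writing a T through the proxy is a compile-time error, even if some
    // header has done `#define private public`.
    ReadOnlyProxy& operator=(T) = delete;

private:
    T value_;
};
```

With this, `int v = proxy;` compiles, while `proxy = 5;` is rejected by the compiler with a message that names the deleted function, rather than failing at link time or relying on the access specifier surviving the preprocessor.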
and yes, I am a particle physicist at CERN.
Don't let them fool you into thinking running Coverity is a success... is a flipping nightmare...
...CERN has so much data that nobody can look at it. They therefore filter and sort the data based on their preconceived concepts. So their search is bounded by their current theories.
It's a perfectly valid point.
CERN should be working on some generalized data exploration tools and put it all on the Internet. Call it CERN-Zoo.
Alex24 is spot on. For such a widely-used tool, ROOT is quite spectacularly badly designed. The thing is, it was always known to be such. CERN's computing division bigwigs had a massive fight with the ROOT originators in the late 90s and they were cast into the wilderness for several years. Unfortunately for the field of particle physics and everyone working in it, CERN's "official" attempts to provide new data analysis tools were cack-handed and under-resourced, and people started using ROOT because they literally didn't know better - all they'd had before was home-grown Fortran 77 libraries and scripting languages. After a couple of years CERN's management realised they had lost the war and ROOT was finally anointed as the official tool of high energy physics analysis. We will be living with this unmaintainable, untestable embarrassment for years to come. (Anon for obvious reasons)