Google pits C++ against Java, Scala, and Go
Google has released a research paper closely comparing the performance of C++, Java, Scala, and its own Go programming language. According to Google's tests (PDF), C++ offers the fastest runtime of the four languages. But, the paper says, it also requires more extensive "tuning efforts, many of which were done at a level of …
What *really* happened
The optimisations involved were all performed by humans - rewriting the code with the explicit goal of making it run faster. In the case of the two languages on the Java platform (Java and Scala), the optimisation also involved tuning GC parameters.
Interestingly, all of the changes made in the Scala code to speed it up were available to the Java code (they both compile down to the same bytecode). So what happened here is that Scala took techniques which would just be too verbose and otherwise impractical in Java, and made them more generally accessible.
Now *that's* a result.
Too verbose in Java?
Are there any techniques that are *not* too verbose in Java?
I've used it often enough, but if there was ever a language designed for people who *really* like typing, it's that one.
C++ vs all the other
C++ is the best language ever when I am writing the code.
When I am reading somebody else's code it sucks.
(10+ years experience in offshoring)
@Ken Hagan
"*most* code needs to run faster than it does at the moment"
I would say "some" rather than "most". If computers are too slow, it's only because we continually push them to do new things that weren't previously necessary until they break, in which case they will ALWAYS be too slow by definition. I mean, I'm sure it's very clever that I can write a Unix emulator in a JavaScript interpreter running in a browser running under Linux running in a virtual environment running under Windows running under Mac OS X, but really, that's hardly something we NEED to be able to do :-)
I'm continually amazed at how fast Java is these days. You can do quite serious graphics or scientific programming in Java and effortlessly have it run faster than a heavily optimised native program only a few years previously. My favourite adage is still that CPU cycles are cheaper than developer cycles!
@ Rolf Howarth; Not always...
"My favourite adage is still that CPU cycles are cheaper than developer cycles!"
Not when you're having to build your own power stations to run your data centre they're not.
http://www.wired.com/epicenter/2010/02/google-can-sell-power-like-a-utility/
You don't want to start from here...
You're right, bazza - Google can save a bucket load by optimising C++ code, but for a lot of apps and companies it would be a lot cheaper in the short term to just bang in another processor - moving from a single core to a dual core would cost about 10 minutes of a good C++ programmer's time.
But by doing that you are potentially putting in roadblocks for the future. If you think your app is going to go worldwide then you have to make it 'enterprise' compatible from the start; this may cost you a bit more, but in the long term it will potentially save you billions.
Languages all have their own tricks
An important factor is whether the language allows (or entices) you to use constructs that defeat compiler optimisation. For example, much of the speed of old-fashioned Fortran came from the absence of anything like pointers - so the compiler could more accurately assess the scope within which a variable might be referenced.
C's use of pointers is probably the main thing that makes it slower than old fashioned Fortran, but if you code carefully and don't use pointers, that gap will close. With C++ you're one more step removed from knowing whether the compiler will be able to optimise what you write, so more frequently it can't.
To give another example from Java, garbage collection can be a real problem but can often be mitigated by avoiding unnecessary object creation/destruction. Unfortunately, this is again something that a compiler is unlikely to manage on its own as it's part of the logical design of the software and at a higher level than compilers work at.
So while compiler optimisation is always a good thing, good software design also lies at the heart of run-time efficiency. Those who say you can't beat the compiler at optimisation may be right at the level of loops and method calls, but at a higher level it's easy to design something so it runs slowly in any language you like. Knowing what the language does fast and what it does slowly is where the solution lies. So there's no real substitute for experience with the particular language in question.
Of course, this does make C++ programmers superior beings, as few are able to gain much experience with this language without shooting off both their feet at some point.
@Werner McGoole
"C's use of pointers is probably the main thing that makes it slower than old fashioned Fortran, but if you code carefully and don't use pointers, that gap will close."
Can you elaborate on that? I ask because I am thinking about the use of pointers in C to avoid, e.g., copying large structs when passing them to functions...
Query
Are there published benchmarks to demonstrate that pointers increase speed? My tests showed that they are about 20% slower than arrays with incrementing indices.
I was surprised because pointers look as if they should be faster. Certainly pointers were often closer to the concept, so faster to code.
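For anyone who wants to repeat that kind of test, here is a minimal sketch of the two loop styles being compared. The function names are made up for illustration, and nothing here reproduces the 20% figure: which form is faster (if either) depends on the compiler, the optimisation flags and the target, and many optimising compilers will likely generate much the same code for both.

```c
#include <stddef.h>

/* Array style: walk the data with an incrementing index. */
long sum_indexed(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer style: walk the data by incrementing a pointer until it
   reaches one past the last element. */
long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    for (const int *p = a, *end = a + n; p != end; p++)
        total += *p;
    return total;
}
```

Both functions return the same sum for the same input, so a timing harness around them only measures the difference in how the loop is expressed.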
Re: Can you elaborate on that
Aliasing. If I have a function...
void f(int* a, int* b) { ... }
then a C or C++ compiler has to assume that a and b might point to the same storage. (Or if they are arrays, that their ranges might overlap.) Therefore, whenever it has written through *a and subsequently needs the value of *b, it has to reload it from memory. That reduces opportunities for keeping values in registers, not just values of the function arguments, but any numerical results that were computed from them.
In Fortran, the compiler is allowed by the rules of the language to assume that no overlap exists. If that is not true, you need to code the function differently. The advantage this gives is probably the main reason why Fortran still has the edge on numerical codes, and the motivation for "noalias" style pointer qualifiers such as "restrict", added to C in C99 and available in many C++ compilers as an extension.
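A minimal sketch of the difference, using C99's restrict qualifier. The function names are invented for illustration, and whether a given compiler actually keeps the value in a register is up to that compiler; the point is only what it is allowed to assume.

```c
#include <stddef.h>

/* No promise about overlap: scale might point into dst, so after each
   store to dst[i] the compiler may have to reload *scale from memory. */
void scale_may_alias(int *dst, const int *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}

/* restrict is the caller's promise that dst and scale do not overlap,
   so the compiler is free to load *scale once and keep it in a register.
   Calling this with overlapping arguments is undefined behaviour. */
void scale_no_alias(int *restrict dst, const int *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}
```

Calling scale_may_alias(b, &b[0], 3) on {2, 3, 4} gives {4, 12, 16}, because the first store changes the multiplier used for the rest of the loop. That is exactly the case the compiler has to allow for when nothing rules out overlap.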
^ what he said (Ken Hagan)
But I wouldn't let it put you off using them in C. A lot of people say pointerless languages can be faster because of this possible indeterminacy. But, with my limited experience, I can't say if it is that much of a liability.
Just consider what you're doing. Avoid pointers if you can but realise what power you have in them.
Nothing wrong in passing pointers to large structs, imho.
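As a rough sketch of the struct case (the struct and function names are hypothetical, just for illustration): passing by value copies the whole object at the call, while passing a pointer to const passes only an address and still tells the reader the callee won't modify it.

```c
#include <stddef.h>

#define SAMPLE_COUNT 1024

/* Hypothetical "large struct": 1024 doubles, typically 8 KiB. */
struct samples {
    double values[SAMPLE_COUNT];
};

/* By value: the caller's struct is copied for the call. */
double total_by_value(struct samples s)
{
    double total = 0.0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        total += s.values[i];
    return total;
}

/* By pointer to const: only the address is passed, nothing is copied. */
double total_by_pointer(const struct samples *s)
{
    double total = 0.0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        total += s->values[i];
    return total;
}
```

With a single pointer argument there is nothing else for it to alias inside the function, so the concern above about reloading values does not really come up here.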
the real problem with java is...
did anyone notice, it needed the greatest number of lines of code? Verbose doesn't begin to cover it.
simple used to be faster
Legacy Fortran code is fast because the F77 compiler was really simple: no function stack and no dynamic memory allocation. Just fixed arrays and static libraries for everything. Basic stuff like recursion was not allowed.
Pure C compilers used to be fast too, before all the OOP stuff was added.
And not to mention all the kernel, multitask, GUI, interrupt libraries you have to incorporate.
For many simple tasks today's computers are too complex.
The 10% of C++ programming
We are a C++ shop and we limit the use of C++ language features. We have some 20+ years of developing in C++ and our standards are built from field experience in our application area, and constant profiling of new algorithms. If I were to outline some of the restrictions we put on using language features there'd be a swarm over this post giving it the thumbs down. Yet in our application area, creating tooling for the manufacture of 3D objects, we are the fastest and most accurate in the world.
Hehehehe...
If you guys are doing what I think you are doing, good for you! Less is more.
What I can't stand about C++ is not being able to see in my head what the compiler is likely to be doing. And the messier (or more C++ features used), the bigger my headache gets.
It's all just philosopher's stones
While the academics and the clever folk put lots of work into developing new languages and the frameworks/ecosystems they live in, to me the users of these new systems always seem to go through a very specific cycle: namely, the "wannabes" rush to every new thing in the hope that it will contain the secret that prevents them from having to learn, think and do.
It's the new Philosopher's Stone. The kiddies all think "this will make my code into gold!!" ... nope, sorry. Education, experience, hard work. The language is a tool, not a magic rock.
@John Lilburne
Mr Lilburne:
By all means please post your list of restrictions. I am very curious to see it, and I promise not to flame or down vote you.
NPOV?
"C++ and Java require statements being terminated with a
’;’. Both Scala and Go don’t require that. Go’s algorithm
enforces certain line breaks, and with that a certain coding
style. While Go’s and Scala’s algorithm for semicolon
inference are different, both algorithms are intuitive and
powerful."
I don't think that would pass Wikipedia review... One might argue that inference of syntactic elements from whitespace is ugly and error prone, and enforcement of K&R style doubly so - unless you do it properly and get rid of braces altogether, like Python. Adding semis is like breathing, you don't even know you're doing it; so why mess with it?
Also, in terms of conciseness, it hardly seems fair to compare ISO C++ with something brand new like Scala and Go: why not C++0x, which instantly gets rid of a lot of the verbosity with 'auto'? And Scala's fancy for-comprehension structure was the first thing they threw out when optimising it!
Go's future?
I like the direction where Go is headed, but it lacks the proper tools to do currency right. If it had a BCD or fixed-point type (like every modern CPU supports), then it could be very big in many fields that are still fighting over floating-point money.
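To show what the floating-point money fight is about, here is a small sketch in C rather than Go. The cents_t type and function names are invented for illustration, it assumes IEEE 754 doubles, and it is not a real decimal or BCD library.

```c
#include <stdint.h>

/* Hypothetical fixed-point money type: a whole number of cents. */
typedef int64_t cents_t;

/* Integer arithmetic on cents is exact (overflow aside). */
cents_t money_add(cents_t a, cents_t b)
{
    return a + b;
}

/* 0.10 and 0.20 have no exact binary representation, so with IEEE 754
   doubles their sum does not compare equal to 0.30. */
int double_sum_is_exact(void)
{
    double a = 0.10;
    double b = 0.20;
    return (a + b) == 0.30;
}
```

money_add(10, 20) is exactly 30 cents, while the double version fails the equality test. That is the kind of discrepancy a built-in decimal or fixed-point type would be meant to avoid.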
Comparison to non-OO language?
All the languages in the test are object oriented to some degree -- though C++, Go and Scala less than Java. OO makes a language difficult to compile for fast execution speed and difficult for humans to optimize their code, so I would really like to see a similar experiment include languages with no OO features.
Garbage collection is also less efficient in OO languages than not, partly because updating of old objects to point to newer objects is prevalent and partly because the GC is required to call finalizers on collected objects. So the mentioned problems with GC on JVM need not apply to non-OO languages.
Um, Scala is most certainly not less OO
than Java. And C++ code can be entirely free of objects, so "languages with no OO features" were already included, because C was essentially included. Also, Go isn't even OO, it's only object-based.
Functional languages
Surprise, surprise a functional language comes out best, even a half-hearted one that is crippled by running on the JVM.
Maybe they should have tried a decent functional language like Haskell or ML that would have been a fraction of the code of Scala and yet compiled to speeds close to C++. It could have even auto-parallelized some tasks to run across multiple CPU cores.
Look up 'The Great Computer Language Shootout' for a much bigger comparison of languages across various useless benchmarks.
Scala FTW
Google supposedly hires the smartest of programmers, and Scala supposedly requires programmers to be smart ... it should be a good match. But sadly Google uses a lot of Java and C++ and is focusing on Go, a new language whose design ignores almost everything that we have learned about language design over the last decades, repeating the major mistakes of Java that leave the programmer to do a great deal of the work that the language should be doing, especially in not promoting code reuse. If Larry and Sergey were personally to learn Scala and sit with Martin Odersky and learn just what it offers and how it is so much superior to Go, Java, and C++, Google could revolutionize their practices and knock the programmer productivity ball out of the park.
Buried in the report
"Jeremy Manson brought the performance of Java on par
with the original C++ version. This version is kept in the
java_pro directory. Note that Jeremy deliberately refused
to optimize the code further, many of the C++ optimizations
would apply to the Java version as well"
But no "Java Pro" line in the benchmark table...?
