Analyst: Tests showing Intel smartphones beating ARM were rigged

So does Intel's Atom-based Clover Trail+ platform really outperform ARM processors in smartphone benchmarks? Not so fast, says one analyst. In June, The Reg reported analyst firm ABI Research's claim that it had pitted a Lenovo K900 smartphone based on Intel's Atom Z2580 processor against a brace of devices built around ARM …

COMMENTS

This topic is closed for new posts.

Silver badge

It is too easy to pick on benchmarks.

Would this be from the same processor maker whose performance library and compiler famously "improves" benchmarks when run on its own "genuine" processor family and disables optimizations when run on competing CPUs with the same instruction set?
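For anyone who hasn't seen how that trick works: a dispatcher only needs to read the CPUID vendor string and branch on it. A minimal sketch in C (using GCC's cpuid.h, x86 only; purely illustrative, not Intel's actual dispatch code):

#include <stdio.h>
#include <string.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    char vendor[13] = {0};

    /* CPUID leaf 0 returns the vendor string in EBX, EDX, ECX */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 1;
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);

    /* Dispatching on the vendor string instead of the feature flags
       takes the fast path only on Intel parts... */
    if (strcmp(vendor, "GenuineIntel") == 0)
        puts("fast SSE code path");
    else
        puts("generic fallback path");  /* ...even if this CPU supports SSE */
    return 0;
}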

10
0
Thumb Down

Re: It is too easy to pick on benchmarks.

Saying "it's too easy to pick on benchmarks" implicitly acknowledges the weakness of what they did. They might as well have said "You didn't really trust what we said, did you?"

2
0
Gold badge

Re: It is too easy to pick on benchmarks.

"Would this be from the same processor maker whose performance library and compiler famously "improves" benchmarks when run on it's own "genuine" processor family and disables optimizations when run on competing CPUs with the same instruction set ?"

I don't think so -- last i had heard about someone looking into this (years ago...), they found ICC actually had a *greater* performance increase (compared to contemporary GCC) on AMD processors then it did on Intel processors.

0
4
Bronze badge

So in short....

Don't use AnTuTu for any kind of cross-architecture benchmarking - ever.

Snark over.

Genuine question - is this true of other benchmarks like Geekbench etc, to a greater or lesser degree?

Steven R

3
0

Re: So in short....

Definitely avoid AnTuTu; even Dhrystone is better, more accurate and less easy to cheat...

Geekbench is far more trustworthy as it is done professionally and uses a large set of fairly standard benchmarks. While it is not SPEC, my experience is that it correlates reasonably well with actual CPU performance. It is not without issues, however: two of the FP benchmarks accidentally use denormals, which depresses the scores on CPUs with slow denormal handling. This will be fixed in the upcoming Geekbench 3.
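To illustrate the denormal point: on CPUs that take a microcode assist for subnormal operands, arithmetic that stays in the denormal range can run orders of magnitude slower than the same arithmetic on normal values. A rough sketch in C (the iteration count is arbitrary and the slowdown factor varies by CPU):

#include <stdio.h>
#include <time.h>

/* Time 1e8 additions; 'step' keeps the sum either denormal
   (5e-324 is the smallest double denormal, so the sum stays
   below DBL_MIN throughout) or normal (1.0). */
static double run(double step)
{
    volatile double sum = 0.0;   /* volatile: stop the loop being optimised away */
    clock_t t0 = clock();
    for (long i = 0; i < 100000000L; i++)
        sum += step;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    printf("denormal: %.2fs\n", run(5e-324));
    printf("normal:   %.2fs\n", run(1.0));
    return 0;
}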

In my experience every single benchmark has flaws, and that includes Dhrystone, CoreMark, SPEC, EEMBC. Most of them are easy to spot, others require more analysis, but over the last 20 years I've come to the conclusion that the perfect benchmark doesn't exist. Relying on a single benchmark number is foolish, someone may already have broken the benchmark. It invariably happens...

7
0
Bronze badge

Re: So in short....

Cheers Wilco - pretty much what I suspected.

Real world performance is far more important, but more difficult to benchmark.

Geekbench shows my late 2008 Unibody Macbook as being marginally faster under Linux than it is under OS X or Windows, but TBH all three run stupid fast with an SSD in the mix, so I don't mind either way!

Steven R

0
0
Silver badge

hmm

Honestly, all around this does not look good for Intel. They really needed to be that much better than ARM because of ARM's entrenchment in the market. The fact also remains that Intel, unlike ARM (which licenses IP only), has never shown itself able to survive on the lower-margin, higher-volume production that characterizes the direction the market is going. Price per unit is going to be a real consideration going forward. Considering how lean ARM-based production is, Intel will almost assuredly have to sell at a loss for quite some time to get traction in the market. Time will tell, and Intel does have quite the war chest, R&D capability and a consistent one-generation lead in fab technology, but I see some problems for them for at least the next 5 years.

15
0
Anonymous Coward

Dev[e|i]l's advocate....

I will play Devil's, or rather devel's advocate here:

If I can improve the performance of my program by doing nothing but choosing a different compiler, why shouldn't I? And if that benefits one CPU or system over another, isn't that a legitimate consideration when comparing the two?

If Intel is willing to put the effort into making a compiler that is spectacularly better at generating machine code for their chip from my code, and ARM is not, should that not be a factor in my choices? (so long as their compiler actually does what my code tells it to do, of course.)

On the one hand, it's a shame that Intel won't donate the algorithms they use to make ICC better than GCC back to GCC, but on the other hand, since that very well might benefit all of Intel's competitors, I can see why they don't.

And on the gripping hand, it seems to me that ARM should start really working with either GCC or Clang (more than they do currently, which, granted, is actually quite a bit.)

9
10

Re: Dev[e|i]l's advocate....

It all depends on the reason for the speedup. If it's a general "Intel-optimized compiler", you are perfectly right, this is a very legitimate reason to prefer one platform to the other.

However the past has shown us that hardware manufacturers have been more than willing to game benchmarks, all the way back to the old video cards whose actual hardware recognized testing tools and just skipped a lot of hardware work so the same software would run a lot faster...

12
0
Silver badge

ARM and GCC

Well in general you are correct. The available compilers should be considered part of the architecture.

However, in this case, what happened was that the optimisations seem to have skipped parts of the actual test itself. That is often due to tweaking optimisation settings etc. In this case, the test itself is broken and is no longer valid.

As an extremely simplistic example, take the following "loop test" to test looping:

for(i=0; i < 1000000; i++) {}

If the compiler optimises that away then the loop test itself is no longer a reasonable performance indicator.
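You can check this yourself without a benchmark: assuming a reasonably recent gcc, compiling the empty loop at -O2 and dumping the assembly shows the loop has vanished (the file name is just for illustration):

/* loop.c -- compile with "gcc -O2 -S loop.c" and inspect loop.s:
   at -O2 the function typically compiles down to a bare "ret",
   i.e. the million iterations cost nothing; at -O0 the counting
   loop is still there. */
void loop_test(void)
{
    for (int i = 0; i < 1000000; i++) {}
}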

As for ARM and GCC... Well, ARM put a huge effort into GCC. Just look at http://www.linaro.org/ . Many ARM employees (and much ARM cash) go into making this happen, and the gcc compiler improvements are actually helping all architectures. The gcc on my x86 ubuntu system reports gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

12
0

Re: Dev[e|i]l's advocate....

In this particular case it was unreasonable to switch to ICC as GCC is the only compiler that is used on Android. So even if ICC was genuinely better than GCC (and that is questionable - ICC is regarded as a benchmarking compiler, not as a production compiler as it doesn't beat GCC or VC++ on typical code), then it still wouldn't make any difference as nobody uses it to build Android and its applications.

There was also the issue that Intel somehow persuaded AnTuTu to change the compiler for x86 and to let them tune for the benchmark, including better compilation options and special optimizations - the ones that turned out to optimize much of the benchmark away. The ARM version of AnTuTu is compiled with -Os and without inlining on an old version of GCC, so changing to -O3 and a newer GCC version should make a huge difference.
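To make the scale of that concrete, here is the shape of the two builds as described above (the compiler versions and flags are illustrative, not the actual AnTuTu build settings):

/* bench.c -- the same source, built two ways:
     ARM build:  gcc-4.4 -Os -fno-inline bench.c   (hypothetical old gcc)
     x86 build:  gcc-4.8 -O3 bench.c               (hypothetical new gcc)
   -Os optimises for size and -fno-inline forbids inlining, so a small
   helper like add() stays an out-of-line call inside the hot loop; at
   -O3 a newer gcc inlines it and can unroll or vectorise the loop. */
static int add(int a, int b) { return a + b; }

int hot_loop(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum = add(sum, v[i]);
    return sum;
}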

So to be fair, has ARM been given the opportunity to do the same optimizations as Intel? If not, then it was effectively cheating. If money changed hands then this is something that should be referred to the FTC as it would count as anti-competitive practices - something Intel did in the past.

12
1
h3
Bronze badge

Re: ARM and GCC

RealView is ARM's superior compiler (I think it has got a new name). Thing is, Google is not willing to put the work in to get Android using it.

Intel's compiler is great (as long as you don't use the code on AMD without being very careful; dunno whether they were forced to stop that or not).

I compared an Orange San Diego that I have lying around (the Xolo debranded ROM / root / no other changes) on ICS to my brother's Tegra 4 One X (running Jelly Bean 4.1) and there was no real noticeable difference that we could find (even though some of the apps are using the ARM -> Intel emulation). But the San Diego battery lasts for 3 days.

This is a single-core, really old-tech Atom that Intel is using.

gcc is designed to support as many architectures as possible; any compiler supporting just one, made by the people who design the processor, is always going to be better. (Or it always has been: xlc / sunpro have almost universally been better whenever I have tested them. There is the odd time when they perform the same, when the code is nearly all inline asm, and then a few programs that use specific quirks of gcc that are not worth building with anything else at all; or if you do, it will be exactly the same once you fix the syntax.)

That benchmark is completely useless anyway; it doesn't test anything worthwhile at all. (There are ways to increase the score that make the device work much worse, but people do it anyway - same with that stupid sdcard read-ahead thing that people do.)

0
0
Bronze badge

Re: Dev[e|i]l's advocate....

@David D Hagood,

"On the one hand, it's a shame that Intel won't donate the algorithms they use to make ICC better than GCC back to GCC, but on the other hand, since that very well might benefit all of Intel's competitors, I can see why they don't."

Intel sell ICC for profit, whereas GCC is free. I doubt Intel would want to kill their own market. However, the algorithms that make ICC-compiled code quick(er) on x86 aren't likely to be re-usable on non-x86 platforms. If Intel wanted to boost their chip sales by giving everyone a free compiler that made for better performance they'd just start giving ICC away; far easier than integrating its IPR into gcc.

I'm wondering if auto-parallelisation is playing a role here. ICC and GCC (since 4.3) both do it, but certainly it's something that Intel promote; most GCC users I speak to have never heard of it.
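For what it's worth, the GCC side of auto-parallelisation is easy to try. A minimal sketch (the thread count of 4 is arbitrary; -ftree-parallelize-loops is GCC's flag, -parallel is ICC's):

/* par.c -- auto-parallelisation demo:
     gcc: gcc -O2 -ftree-parallelize-loops=4 par.c   (links against libgomp)
     icc: icc -parallel par.c
   A loop with fully independent iterations like this is exactly what
   the auto-parallelisers can split across threads without source changes. */
#define N 1000000
double a[N], b[N], c[N];

void scale_add(double k)
{
    for (long i = 0; i < N; i++)
        c[i] = k * a[i] + b[i];
}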

0
0
Anonymous Coward

Re: ARM and GCC

> for(i=0; i < 1000000; i++) {}

> If the compiler optimises that away then the loop test itself is no longer a reasonable performance indicator.

Neither would it be a reasonable compiler. Unless it set i=1000000; in its place.

Just saying.

1
0
Facepalm

Re: Dev[e|i]l's advocate....

You did read the part where the Intel processor SKIPPED parts of the test, thereby artificially changing the score, yes?

2
0
Anonymous Coward

Re: Dev[e|i]l's advocate....

And did you read the part of my post where I said:

"{so long as their compiler actually does what my code tells it to do, of course.)"

I've worked with compilers ranging from Microsoft C to GCC on PPC, ARM, MIPS, x86 both 32 and 64, 68000, various microcontrollers ranging from 8 bits through 32 bits, DSPs, GPUs, and probably a few others I'll remember in a bit. I've done hard real time signal processing, graphics, systems programming, robotics - and all of that professionally.

Like I said, "Devil's advocate" - I'm not trying to justify gaming benchmarks. However, saying a compiler is bad for optimizing out

for (int i = 0; i < 1000000; ++i) {}

is WRONG. Anybody who has actually worked with C, studied the specification for the language, and worked with compilers will know that you would have to write that as

for (volatile register int i = 0; i < 1000000; ++i) { volatile register int j = 0; }

to have a loop benchmark that won't be optimized out.

(and, I would hope anybody writing a benchmark would actually have done this, and that the code being bandied about here is a typical oversimplification.)

1
0
Bronze badge

Re: Just saying

If i never gets used again there's no point setting it.

1
0
Silver badge

Re: ARM and GCC

"Realview is arm's superior compiler (Think it has got a new name). Thing is Google is not willing to put the work in to get Android using it."

It does not need Google to do anything. ARM could just do what Intel did - they tweaked the Intel compiler to be able to understand GCC flags and thus compile a straight Linux kernel with the Intel toolchain. Nothing stopping ARM doing that.

As gcc improves, there is less and less benefit in running the ARM/Keil compilers vs gcc. When you get to a certain point it really is not worth the extra cost/drama associated with running the ARM toolchains. From my experience, it seems that point was reached a while back. ARM are actively putting money into GCC development. They want GCC to get better.

I have been using gcc on ARM-based projects for about 14 years. In the beginning the compilers were rather fragile and ropey. Now, not so. The majority of Linux systems are ARM-based and built with gcc.

1
0
Silver badge

Re: Dev[e|i]l's advocate....

"If I can improve the performance of my program by doing nothing but choosing a different compiler, why shouldn't I?"

The problem is, if simply changing the compiler can result in such different results, can you really claim to be testing the processor, or is it the compiler you are testing? If a clever optimising compiler skips bits and reorders others, are you testing the processor or the compiler?

To my mind, this stuff ought to be written in assembler by a clever person. Then we'll see what the system can do, as opposed to how good the compiler is...

0
0
Anonymous Coward

Re: Dev[e|i]l's advocate....

The point is this is a benchmark which is supposed to show a comparable performance using the same source code compiled into binaries for each CPU.

Obviously if GCC was used and produced bad Intel code and good ARM code then this would be a poor test. Same if the Intel code was good and the ARM code bad.

Comparing different CPU families has always been difficult without the added problem of the quality of code generated by compilers. CPU ratings like MHz and MIPS have been misleadingly used for comparisons by people who don't understand the difference in the instruction sets.

RISC vs CISC is also tricky, not that either x86 or ARM is purely CISC or RISC these days.

0
0
Silver badge

Re: Dev[e|i]l's advocate....

"The point is this is a benchmark which is supposed to show a comparable performance using the same source code compiled into binaries for each CPU."

Surely the real point of a benchmark is to measure processing speed, data throughput, number crunching power, etc. There is a need to run supposedly identical code to get an idea of real world performance, however for working out the raw power, it should be assembler.

0
0
Anonymous Coward

Re: Dev[e|i]l's advocate....

"If a clever optimising compiler skips bits and reorders others, are you testing the processor or the compiler?"

A benchmark should be at least vaguely representative of the eventual working environment, otherwise it is a meaningless and unrepresentative waste of effort.

If the chosen working environment is assembler, use that - but you're insane, or exceptionally unrepresentative. If you are that unrepresentative, you'd be well advised to put your own representative benchmark together **in assembler** for all the potential candidates. But outside the world of small microcontrollers etc, what is there left that still uses assembler in any serious way?

If the chosen working environment starts with a high level language, use that.

Does that help at all?

[Assembler coder for 16bit systems and up, for 2+ decades. Not missing it much]

0
0
Bronze badge

Problems with AnTuTu

Well done, clever analyst! The benchmark itself impresses me somewhat less, though; mainly its name reminds me of that Desmond I got after 3 years of study at a university which shall remain nameless.

0
0

AnTuTu rigged the tests, wonder how much $$ were thrown their way?

8
0
Gold badge
WTF?

Build a benchmark using a compiler supplied by *one* of the companies being *tested*

And have the audacity to say it proves they are better?

Are you f**king kidding me?

Cockup theory says amateur hour benchmarking. Not too good for future credibility.

Conspiracy says someone (who I won't speculate on) was told "Make Intel look good."

They did.

Not too good for future credibility either.

16
0
Anonymous Coward

"Asked for comment, a spokesperson for ABI Research told El Reg, "Honestly, we feel [McGregor] totally missed the point. He is focusing on the benchmark and not the power performance we highlighted."

What power performance? They listed amps. Amps is a meaningless number without knowing the volts, so really the best figure to use is watts. If someone sold an ARM processor that used 10 amps but only 0.01 volts, is that any better than a processor that used 0.01 amps but 10 volts? Nope. So amps is useless without knowing the volts. Maybe they should re-release their findings with the figure that actually matters: how many watts the various processors used.
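The arithmetic, spelled out with the made-up figures from the post above: power is volts times amps, so the two hypothetical chips draw exactly the same 0.1 W despite current readings a thousand times apart.

#include <stdio.h>

int main(void)
{
    /* P = V * I, using the two made-up examples above */
    printf("10 A at 0.01 V = %.2f W\n", 10.0 * 0.01);    /* 0.10 W */
    printf("0.01 A at 10 V = %.2f W\n", 0.01 * 10.0);    /* 0.10 W */
    return 0;
}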

9
0
Silver badge
Boffin

That's not strictly true, since the power loss goes with I²R; the losses at 10 A are therefore a million times greater than at 0.01 A.

1
5

Power efficiency

Even comparing Watts is meaningless if the CPUs don't run the same task. You really need to compare energy to complete a task in order to do a fair comparison. A CPU that runs twice as fast while using 50% more power will complete any given task using 25% less energy. This is especially relevant in the ABI comparison as both the A15 and Krait cores used are significantly faster than Atom (unless you rig the benchmark of course).
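A sanity check of that 25% figure, since energy = power x time (the 2x speed and 50% more power numbers come from the post; units are normalised):

#include <stdio.h>

int main(void)
{
    double p_base = 1.0, t_base = 1.0;   /* baseline CPU: power and time per task */
    double p_fast = 1.5, t_fast = 0.5;   /* 2x as fast, 50% more power */
    printf("baseline energy per task: %.2f\n", p_base * t_base);   /* 1.00 */
    printf("faster   energy per task: %.2f\n", p_fast * t_fast);   /* 0.75, i.e. 25% less */
    return 0;
}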

9
0
h3
Bronze badge

Re: Power efficiency

Or you have a compiler that uses SSE properly. (Or someone who can write it properly by hand).

Raw performance doesn't matter much if the code one chip is running is better optimised.

Try running the C version of something in imlib2 on ARM or Intel; the SSE version is fast as hell. (The ARM version probably would be too if it were done optimally using NEON or whatever, but as far as I know it doesn't exist.)

The San Diego gets 3 days battery life. I have not seen any ARM phone get that. (Suppose the Motorola with the huge battery might.)

I would like to see a Haswell-based phone. (Most power-efficient part / big battery / built with ICC: it would kill anything ARM has.)

0
3
Silver badge
Stop

@Eddy Ito

Power loss is only a factor in transmission. Once the power arrives at a device then we're back to the two basic equations V = IR and P = IV (from which I²R is derived). If you increase voltage and reduce current to keep P constant then R increases as the square of the voltage and there's no change in I²R.
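Numerically, with arbitrary starting values: scale V up by a factor k and I down by k at constant P, and R = V/I grows by k² while I²R (which is just P) is unchanged:

#include <stdio.h>

int main(void)
{
    double V = 1.0, I = 2.0, k = 10.0;   /* arbitrary operating point, P = V*I = 2 W */
    double V2 = V * k, I2 = I / k;       /* same power: higher voltage, lower current */
    printf("R:   %g -> %g ohm\n", V / I, V2 / I2);                       /* 0.5 -> 50, i.e. k^2 */
    printf("I2R: %g -> %g W\n", I * I * (V / I), I2 * I2 * (V2 / I2));   /* 2 -> 2, unchanged */
    return 0;
}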

3
0

This post has been deleted by its author

Silver badge

Re: @Eddy Ito

Yes, but at a system level you still have to account for the internal resistance of things like the battery and the traces on the circuit board. Also notice I didn't bother to mention that power also goes with CV²f, although rightly I should have, since it could easily be the dominant term given the theoretical voltage disparity, if we assume the capacitance and clock frequency of both circuits were similar.

That said, I don't believe for an instant that they were measuring the current of the actual chip alone. It looks like they simply took the current of the entire device and subtracted off the current used by the display to get around the obvious disparity in the screen sizes.

0
0
Anonymous Coward

Re: Power efficiency

@Wilco 1,

Certainly showing performance per watt is better, but with the benchmarks that were shown, they showed the Intel processor with better performance and then listed the power used in amps. So no clear comparison could be drawn just from the performance numbers and amps. At least with watts being shown, one can draw one's own conclusions. Different processors use different voltages, so knowing just the voltage or the amps doesn't help. Their entire test appears to have been done to make the Intel chip look better. The Intel processor might have used fewer amps, but the reality is that the voltage used was higher, which negated the appearance of less power actually being used.

Even showing wattage, or performance per watt, might not be a true indicator. If one just measured the "processor" and took other components out of the mix, that wouldn't paint a true picture either. With ARM SoCs having more on the die and Intel having more in the package, they could still make the Intel processor look better. The entire platform needs to be looked at. Benchmarks are like statistics: you can manipulate them. It would be easy to make an ultra-high-performance phone; the question is, how big would the battery be, and what would the runtime actually be?

1
0
Silver badge
Holmes

Heh.

The whole benchmark has looked fishy since it came out, so it is no big surprise that it turns out to have been gamed to favor Intel.

Please, keep that garbage arch out of our mobile devices; it's already doing enough damage in the regular PC market, thank you very much.

10
0

Whoops, caught with pants down

7
0

To claim that the power differences highlighted by the original tests are still valid despite the discrepancy in the testing is not convincing; if the benchmark produced higher performance results because it skipped a bunch of instructions then it also did less work and thus should need less total energy to do it; the fact that it ran for less time also means that the static power consumption (which is a substantial part of all power consumption) is also minimized.

So in short, ignore the test results completely, wait for some more balanced benchmark to be available, run that, discover ARM continues to use fewer joules per state change and then carry on :-)

11
0
Silver badge
Unhappy

Correct, but isn't it sad that these things need to be pointed out nowadays?

0
0
Anonymous Coward

underdogs' fakery

If it's not Nokia faking Lumia picture footage, it's Intel faking benchmarks.

Surely there must be laws against this sorta thing?

10
0
Anonymous Coward

Surely there's no problem with Intel having some "cool proprietary stuff" baked into their compiler if it makes real world applications run faster?

The results just point to the benchmark being a poor benchmark if it can "skip" memory instructions without changing the outcome of the test.

(Assuming there's no benchmark optimisations built into the Intel compiler - of course!)

0
1
FAIL

Oh come on!

Seriously, it's so damn obvious that money influences the comparative results.

It's an old trick that makes many, if not most, results bogus and pointless.

3
0

It is all about efficiency

This issue is really platform efficiency, which is performance/watt, and you have to evaluate both sides of the equation to come up with a proper outcome. Measuring the performance is difficult because the benchmarks do not represent usage models, and in this case were not even testing the intended instructions. Measuring power is a challenge because power is a function of the entire software and hardware implementation, and it is the entire system that determines battery life - the factor important to consumers. I agree that we should examine this in more detail. Jim McGregor/TIRIAS Research

0
0
h3
Bronze badge

Re: It is all about efficiency

So why can the San Diego last 3 days? (And it is pretty thin; it doesn't have a particularly big battery.) Other than the Razr Maxx, with its much bigger battery, nothing else can.

(And Intel is not even trying).

I don't particularly like Intel (I have a Xeon E3 because they seemed to be very cheap / fast / not gimped and I want ECC / vt-d), but from my experience with the San Diego and every ARM phone I have had, it is just more usable. (I can go away for a weekend without having to worry about charging, which no other phone seems to manage.)

0
2
Coat

Re: It is all about efficiency

Well to completely counter your point with overwhelming evidence, my phone lasts at least a week despite its battery dying (it used to be 2 weeks). Yes it has an ARM. So ARM wins again. QED.

Obviously battery life depends on how you use your phone, so either anecdotal data point is useless. Anand's battery life tests showed Intel phones like Lava XOLO having mediocre battery life.

As for Intel not even trying - well if they have to cheat benchmarks and power consumption tests to pretend to be the fastest/most efficient then clearly they are losing the battle. If they were really faster and more efficient then their phones would sell like hot cakes. In 2012 Intel had just 0.2% of the smart phone market, or in total about 1.8 million devices. To put that in perspective, total phone sales were 1.7 billion, and Samsung sold over 1 million phones every single day of 2012.

9
0
Anonymous Coward

Re: It is all about efficiency

Phone battery consumption is not mainly about CPU - it is about how long the screen is on for, how many pixels, how bright. It is also about the standby power used by RAM, and the background processes. If these are not comparable, battery life tells you nothing.

Optimising the CPU for a phone, once you get beyond a day or so, might be more about die leakage than operating power. It depends on the fraction of time for which it is actually working.

0
0
Holmes

battery life and benchmarking

Everyone asking "why can the san diego last 4eva if it is the sux on benchmarking" is missing an important point. These CPU and system benchmarks like AnTuTu or EEMBC or SPEC will tell you about power/perf under heavy utilization. But for a lot of us, the vast majority of the time our phone is sitting around "doing nothing" and burning power at some slow rate. And even when we're using one of these things, it's generally something not terribly heavy (music playback, reading The Register). Yes, there are power-hungry use cases like camera or gaming, but I'd be willing to bet these are far less than 10% of the total time your device is active.

Battery life is all about the integral, and the low-power idle modes dominate the time axis and ultimately the total consumption. So, is it possible that Intel has made a phone that has really, really good idle power and some good power management, even if it isn't that great when you want to do something heavier? Of course it is. In fact, if they are targeting the budget market, whiz-bang features can be less of a design focus, which gives more opportunity to get the system power issues sorted out.
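Putting rough numbers on "the integral" (every figure below is invented for illustration): average power is the duty-cycle-weighted sum of active and idle power, and for realistic duty cycles the idle term dominates.

#include <stdio.h>

int main(void)
{
    /* invented figures: active 2% of the time at 1.0 W, idle at 50 mW,
       and a 2000 mAh battery at 3.7 V, i.e. 7.4 Wh */
    double duty = 0.02, p_active = 1.0, p_idle = 0.05, battery_wh = 7.4;
    double p_avg = duty * p_active + (1.0 - duty) * p_idle;   /* idle: 49 mW > active: 20 mW */
    printf("average power: %.3f W\n", p_avg);                 /* 0.069 W */
    printf("battery life:  %.0f h\n", battery_wh / p_avg);    /* ~107 h, ~4.5 days */
    return 0;
}

Shave the idle power and the days pile up, which is why system power management, not benchmark scores, decides these "days per charge" anecdotes.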

0
0
Silver badge

Re: It is all about efficiency

"So why can the San Diego last 3 days."

Meaningless gibberish argument. I'll explain why. My Android phone, in normal use, needs to be charged daily. If I'm running Navigation with GPS and stuff, it might conk out after 4-5 hours, but this is highly dependent on conditions. I can't be more accurate as I tend to run it off a doohickey in the lighter socket. It is a Sony Xperia U. Oh, and it periodically fetches mail, has an animated front-screen backdrop, all the sorts of stuff that uses processing power.

By contrast, my backup alarm clock is my older Motorola Defy. I have trimmed the software in it, optimised the settings, and it has connectivity turned off. I charge it roughly once every two weeks.

Yes, an Android phone can run for more than two weeks on a single charge. Not exactly the truth as most people would understand it, but not a lie either.

1
0
Anonymous Coward

Intel = Losers

For years benches have been rigged to show better performance than the competition. This has actually been proven by several reviewers. It's a dirty little secret because of those who profit from increased Intel product sales. The unscrupulous will do anything for money.

5
0
Anonymous Coward

So Intel's real mistake was...

...to make the 'improvement' so large that someone got suspicious. If they'd just been willing to settle for a few percent improvement then we'd have been none the wiser.

4
0

Big News...

The sky is blue,

and water is wet.

Intel pays shills,

It's a sure bet.

Rigged benchmarks!

7
0
Sil
Bronze badge

A fishy rebuttal

The rebuttal itself is fishy and only shows that these types of benchmarks only go so far.

Unless Intel explicitly rigged its compiler for artificially higher AnTuTu scores, there is no problem.

One does expect programs to be compiled with the best compilers and the most aggressive optimizations. Same for drivers.

0
4
