NVIDIA has taken the wraps off “Denver”, its long-rumoured 64-bit ARMv8-compatible system-on-chip (SoC), and claims it delivers performance equivalent to some CPUs found in PCs. The company has blogged about the silicon, the successor to its current 32-bit Tegra K1 models. Denver will come in two versions, both …
How quaint... Nvidia reinventing Crusoe via ARM. Only this time it's microcode optimisation of the ARM instruction set instead of run-time optimisation of a VLIW interpretation of x86.
I just choked on my coffee... What's next? Hiring Linus Torvalds?
Deja Vu all around.
The more interesting point here is who owns all of the old Crusoe IPR. Intel bought and perpetually licensed some (but not all) of it. What happened to the rest?
My thoughts exactly.
If you've got your grain of salt handy - the Wikipedia article for Transmeta states that Nvidia had bought a license for some of the IP back in the day. It also says Intel has a perpetual license to ALL Transmeta IP.
What microcode?. ARM ? Have I missed something ?
Or is it GPU microcode ?
Sounds like a hardware-based JIT; could make sense with Android. Every time I get a new version of CyanogenMod all the apps are pre-compiled to make them snappier, so having the JIT on the chip could lead to additional optimisations.
Google are already playing around with an ahead-of-time compiler called ART, which will become the default in Android L (5.0). The idea is to save battery life by compiling an app into native instructions during installation instead of compiling it in memory each time it's run (a JIT).
Apps will launch and run faster as a result and battery life is better because code isn't repeatedly compiled and thrown away during launch. I assume that if Android were running on an ARMv8 chip that it could generate 64-bit instructions where it makes sense to do so.
Not sure why you would need a JIT though, when ARM's Jazelle technology can execute Java bytecode natively.
Jazelle executes easy Java bytecode instructions natively, but bails out and switches back to interpreting when it gets anything difficult. The process of bailing out is so slow that in practice, JITs have always been faster.
But I'm sure this article is actually saying that ARMs are now so complex that they have real microcode interpretation of ARM instructions. Which is interesting.
Caching microcode didn't work very well on the Pentium 4. Hopefully only doing it on 'commonly used code' might work better. It'll be interesting to see how this plays out; brilliant ideas in chip design have a habit of not actually working in practice (see Jazelle and the Pentium 4)
And Wikipedia tells us that Denver is indeed a microcoded CPU designed by engineers poached from various companies, including Transmeta (and possibly licensing some Transmeta tech). The reason it can't do x86 is that Nvidia doesn't have the patent licences.
7-way superscalar, so if it works it'll be very fast.
I await some real independent benchmarks with interest.
"Not sure why you would need a JIT though, when ARM's Jazelle technology can execute Java bytecode natively."
Because Android doesn't use Java bytecode. It has its own register-based bytecode format. I guess a chipset could natively execute that, but perhaps it yields no tangible benefit over standard JIT/AOT strategies.
Though the 64-bit ARM instruction set is very simple (it looks more like MIPS than 32-bit ARM), and could easily be implemented directly on a simple pipelined '80s-style CPU, most modern CPUs are superscalar, which means that internally they are variants of dataflow machines: instructions are executed when their operands are available, not in the order they are written in the code, and they use a much larger internal register set than the visible one, renaming visible registers to internal registers on the fly. Compiling blocks of ARM code to the dataflow machine and storing the result makes sense, as you can skip the decode, schedule and register-renaming stages of the execution pipeline. In particular, that would make mispredicted jumps run faster, as you don't get quite as large pipeline stalls.
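The register-renaming step described above can be sketched in a few lines of purely illustrative Python — the rename table, register counts and three-operand instruction format are all invented for the example, not taken from any real core:

```python
# Toy illustration of register renaming: each write to an architectural
# register is given a fresh physical register, so later instructions that
# reuse the same register name no longer falsely depend on earlier writes
# (the WAW/WAR hazards disappear, leaving only true data dependencies).

def rename(instructions, num_arch_regs=4):
    """instructions: list of (dest, src1, src2) architectural register ids.
    Returns the same instructions with physical register ids substituted."""
    # Rename table: architectural register -> current physical register
    table = list(range(num_arch_regs))   # initial identity mapping
    next_phys = num_arch_regs            # next free physical register
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources read whichever physical register currently holds the value
        p1, p2 = table[src1], table[src2]
        # The destination gets a brand-new physical register
        table[dest] = next_phys
        renamed.append((next_phys, p1, p2))
        next_phys += 1
    return renamed

# Two back-to-back writes to r0: after renaming they target different
# physical registers, so the second can issue without waiting for the first.
prog = [(0, 1, 2), (0, 2, 3)]
print(rename(prog))  # [(4, 1, 2), (5, 2, 3)]
```

Caching the renamed form, as the comment suggests Denver does, means this bookkeeping doesn't have to be repeated every time a hot block is executed.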
>JITs have always been faster.
Not so. Looking at results on JBenchmark, the Nokia S40 VM (which I believe was Jazelle based) regularly out-gunned JITs running on more powerful hardware. Ahead of Time (AOT) on the other hand - no comparison.
I've also heard that another problem with Jazelle was that it was based on Sun's reference implementation, so using it required an ARM licence and a Sun licence.
Look up the old Intel 8086 specs. Microcode has been used in microprocessors since the beginning. Modern processors use much less of it, but it is still used heavily.
C'mon Ubuntu et al. - get your shit working on a phone so I can run anything I want and not have some ad-tainted crap app that I had on Linux (ad-free) 20 years ago.
Crash bang Oops !
Don't be a stupid early adopter: the resulting mix of 32-bit and 64-bit apps will bloat memory and crash more too!
So, little to no gain and lots of pain.
"Performance equivalent to some CPUs found in PCs."
I like that kind of precise statement of performance, a.k.a. marketing speak: it sounds nice, but "some PCs" might be construed to include our old 80286 processor running at 12 MHz. We even did image processing on that PC. Still use some of that code today, and it runs like the clappers on new kit. Somewhere in the corner of my office lurks an 80486 at 60 MHz. I do not think anybody would like that kind of performance in their tablet or phone.
It would sound better if they had used the phrase "some current PCs", or actually stated which processor they meant.
Still, chips with better performance at lower power are always welcome.
Bring on the Mandelbrots
Rendering Mandelbrots and maybe even the Mandelbulb should show off the performance available.
Re: Bring on the Mandelbrots
These are float intensive tasks, there's nothing in the article that suggests they'll be affected. Plus they're embarrassingly parallel, if that makes any difference.
Re: Bring on the Mandelbrots
It's ARM, if you need Mandelbrots then it's easy enough to add some more silicon that is optimised (only) for Mandelbrots. That's the ace in the ARM pack.
Re: Bring on the Mandelbrots
"These are float intensive tasks"
You can calculate Mandelbrots using fixed point/integer arithmetic easily enough.
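As a sketch of that fixed-point approach (the 16 fractional bits and iteration cap are arbitrary choices for the example), the escape-time loop needs nothing beyond integer multiplies and shifts:

```python
# Mandelbrot iteration count using only integer arithmetic: coordinates are
# scaled by 2**FRAC ("fixed point"), and each multiply is shifted back down
# to keep the result at the same scale.

FRAC = 16                # fractional bits
SCALE = 1 << FRAC        # 1.0 in fixed-point form

def to_fixed(x):
    """Convert a float to the fixed-point representation."""
    return int(x * SCALE)

def mandel_iters(cx, cy, max_iter=100):
    """cx, cy: fixed-point coordinates of the point c.
    Returns iterations until |z| > 2, or max_iter if it never escapes."""
    zx = zy = 0
    four = 4 * SCALE
    for i in range(max_iter):
        zx2 = (zx * zx) >> FRAC          # Re(z)^2, rescaled
        zy2 = (zy * zy) >> FRAC          # Im(z)^2, rescaled
        if zx2 + zy2 > four:             # |z|^2 > 4  =>  |z| > 2: escaped
            return i
        zy = ((2 * zx * zy) >> FRAC) + cy  # z = z^2 + c, imaginary part
        zx = zx2 - zy2 + cx                # z = z^2 + c, real part
    return max_iter

# c = 0 never escapes; c = 2.5 escapes on the first check after updating.
print(mandel_iters(to_fixed(0.0), to_fixed(0.0)))   # 100
print(mandel_iters(to_fixed(2.5), to_fixed(0.0)))   # 1
```

Whether this beats a float version on any given chip depends on the relative integer and FPU multiply throughput, so it says little about Denver either way.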
64-bits? Is this where the 'Android is just copying Apple' comments go?
((No seriously, JUST KIDDING!))