Memory specialist Micron has announced a new accelerator processor that it claims outperforms Intel's chips when it comes to dealing with streaming data. The "Automata Processor" was announced by the company on Monday and billed as a device that uses the inherent parallelism of memory architectures to speed the ingestion and …
Everything old is new again
So, essentially, they've put transputers on a DIMM. Maybe they can retarget occam-pi to it too.
Re: Everything old is new again
Yes, this architecture was a good idea 30 years ago - see for example
Re: Everything old is new again
How is a scan line rasterizer the same as processors on a DIMM?
Re: Everything old is new again
And they are not transputers either. Far from it.
more details please
What's the programming model? How are DRAM lines utilised? What's the degree of parallelism available? How did a cluster of 4W devices beat a cluster of 95W Xeons?
Re: more details please
>How did a cluster of 4W devices beat a cluster of 95W Xeons?
The same way a GPU beats a cluster of CPUs. CPUs are not 'great' at massively parallel problems; conversely, it is likely these processors will perform poorly on serial operations.
A new paradigm
and that would be exactly what (other than rapidly escaping hot air)?
Re: A new paradigm
Dilbert's got that Bullshit Bingo one: http://dilbert.com/fast/1991-11-03/
"conventional CPU architectures can have anywhere from 2 to 64 processors"
That's what Micron say, anyway. Maybe someone needs to tell HP - both IA64-based Integrity (RIP) and AMD64-based Proliant (somewhat more "conventional") can beat 64 at the top of the range, assuming we're counting logical processors not processor sockets. Others probably can beat it too. I wonder who makes HP's memory for them...
Source (and further reading):
Why not use an FPGA?
"the AP is a scalable, two-dimensional fabric comprised of thousands to millions of interconnected processing elements, each programmed to perform a targeted task or operation."
Sounds almost like an FPGA,
Re: Why not use an FPGA?
Why use an FPGA when you've got your own spare foundry capacity?
It's just interesting that "memory specialist Micron ... uses a DDR3-like memory interface" - when all you've got is a hammer...
Hahaha wow. What am I reading?
"Automata processor cuts through NP-hard problems like they're butter"
Going overboard with headlines much, El Reg?
What next? Free energy found by combining lifters with homeopathy?
"Planted Motif" problems are only NP-complete, as I read on Jimbo's big bag of trivia (NP-hardness may well mean that the problem is way, way harder). Even Micron's new approach at SIMD processing is not going to crack large NP-complete problems significantly faster - speedup is linear, but the cost still increases exponentially with the problem size, so no joy.
And why do I have to go and google for Micron's documentation?
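To put rough numbers on the linear-versus-exponential point, here is a back-of-the-envelope sketch. It assumes a brute-force cost of exactly 2**n operations for problem size n, which is an illustrative assumption and not a property of any particular problem; the operation rate, time budget and speedup factor are hypothetical figures, not anything Micron has published.

```python
import math


def largest_solvable_size(ops_per_second: float, seconds: float,
                          speedup: float = 1.0) -> int:
    """Largest n for which 2**n operations fit into the time budget.

    Assumes (for illustration only) that solving size n costs 2**n operations.
    """
    budget = ops_per_second * seconds * speedup
    return math.floor(math.log2(budget))


# Hypothetical figures: 1e9 operations per second, one hour of run time.
baseline = largest_solvable_size(1e9, 3600)
# The same machine with a hypothetical 1000x accelerator bolted on.
accelerated = largest_solvable_size(1e9, 3600, speedup=1000)

print(baseline, accelerated, accelerated - baseline)
```

Under these assumptions a 1000x speedup raises the largest solvable size by only about log2(1000), i.e. roughly 10. The constant factor helps, but the exponential wall barely moves.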
I actually like this
It somehow reminds me of Connection Machine's "active memory" idea. I think. Regular expression processing in massively parallel hardware? Time for reading!
Neat but hardly general?
OK, I've read through their full paper - and it's interesting, I don't think it's going to do that much for general computing, but regexp stuff, hmm very neat - I mean ideal for trawling through intercepts at high speed, or Layer 7 switch stuff perhaps, and I'm sure people will come up with funkier uses for it - but it doesn't sound a useful way to do really general stuff, and while there is a lot of talk about 'automata' this is more normal finite state stuff than anything too funky.
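For anyone wondering what "normal finite state stuff" looks like on a stream, here is a minimal software sketch of a nondeterministic finite automaton scanning input one symbol at a time. The transition table is a hypothetical hand-built automaton for the pattern "ab*c", and the names `TRANSITIONS`, `START`, `ACCEPT` and `scan` are made up for this example; this is not Micron's programming model. The inner loop over active states is the part that parallel hardware would presumably evaluate all at once.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical hand-built NFA for the pattern "ab*c":
# state 0 = start, state 1 = seen "a" followed by any number of "b", state 2 = accept.
TRANSITIONS: Dict[Tuple[int, str], Set[int]] = {
    (0, "a"): {1},
    (1, "b"): {1},
    (1, "c"): {2},
}
START = 0
ACCEPT = {2}


def scan(stream: Iterable[str]) -> List[int]:
    """Return the positions at which a match of the pattern ends."""
    active: Set[int] = set()
    hits: List[int] = []
    for pos, symbol in enumerate(stream):
        # Re-arm the start state on every symbol so a match may begin anywhere.
        candidates = active | {START}
        # Every candidate state consumes the same input symbol in this step.
        next_active: Set[int] = set()
        for state in candidates:
            next_active |= TRANSITIONS.get((state, symbol), set())
        if next_active & ACCEPT:
            hits.append(pos)
        active = next_active
    return hits


matches = scan("xxabbcxacab")
print(matches)
```

The work per input symbol is proportional to the number of active states, which is why a software scan slows down as more patterns are loaded, while hardware that advances every state in the same step would not.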
And since I really couldn't see any links in the Register's article, here are the two I found:
linked from http://www.micron.com/about/innovations/automata-processing
a lot of talk about 'automata' this is more normal finite state stuff
Pretty much what I thought.
That said, finite state automata can be powerful abstractions.
But it does sound like the "pixel planes" PEs embedded in memory, from Chapel Hill NC in the mid 80s.
I actually think using an interface common to memory modules is a pretty good idea.
Note: if this is particularly good at NDFA implementation, that *suggests* very fast compilers or (more likely) cross compilers.
Handy if you're going to build compilers for all those interpreted languages currently using the shared runtime that MS encourages everyone else (but themselves) to use.
Tough to program.
"It will also be tough to program for, though Micron says it is working with researchers to ease this issue."
They had better be looking at an OpenCL extension maximizing default usage.
Why not <foo> ?
FPGAs spend a lot of silicon on the regular connectivity fabric and doing logic ops via table lookup. Specialized hardware can often do a specific job in fewer square millimeters of silicon and that ultimately determines the cost of a chip. FPGAs get agility at the cost of chip area. A chip specifically designed for implementing automata (DFSMs) can win the chip area battle.
Logic-in-memory processing, as has been observed here, is not new, although it might have finally found a niche deep enough to make the economics work. The first SIGARCH meeting I attended in the middle 70s at SMU in Dallas had a lot of submissions with various versions of logic-in-memory. The problem was that the logic had to be implemented in 7400 TTL (CMOS was still RCA's greatest parlor trick - cf the COSMAC 1802) and the memory was equally painfully expensive. Several designs worked to integrate processing logic in "processor per disk head" disk drives with the goal of doing various kinds of searches and database operations as the bits flew by under the heads.
Now, however, there is an increasing need to process a transaction stream in real-time (eg, high frequency stock trading) and those also have significant economic value. The advent of "analytics" which do large summarization operations on databases has made "column store" databases a financially successful space, and the ability to hardware assist those operations with the likes of the Micron part certainly has high-quality appeal.
But it's also true that a lot of money (your tax dollars, actually) has been spent on hardware search machines which have proven to be "less than completely successful" for reasons relating more to the designers' understanding of the formal machinery than to chip design issues, but they were unsuccessful anyway.
I commend Micron for taking the risk and trying something New. FusionIO did and was rewarded handsomely, as was Data Domain. I cannot claim to be unbiased on the area in general, but I look forward to learning a lot more about their part and how they see it being used. (And then, of course, there will be how people actually use it!)