Or, of course.
Why not a very large number of much smaller cores: immediate processing, no caching?
Chain a large number of cores together: each carries out a small subroutine, then passes the result on.
100,000 cores processing simultaneously, minimal storage, data always in transit.
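The chain idea can be sketched in ordinary Python, with generators standing in for the tiny cores: each stage does one small job and immediately hands its result downstream, so no stage ever stores more than the item in flight. This is just an illustration of the dataflow shape, not a hardware design; the names are mine.

```python
def stage(fn, upstream):
    """One 'core': apply a small subroutine to each item
    arriving from upstream and pass the result along."""
    for item in upstream:
        yield fn(item)

# A short chain of three cores; a real version would be
# thousands of stages long, each one trivially simple.
source = iter(range(5))
pipeline = source
for fn in (lambda x: x + 1, lambda x: x * 2, lambda x: x - 3):
    pipeline = stage(fn, pipeline)

# Data is always in transit: pulling the end of the chain
# drags each value through every stage with no buffering.
result = list(pipeline)  # each x becomes ((x + 1) * 2) - 3
```

Nothing is computed until the far end pulls, and nothing is cached in between, which is the point of the architecture being described.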
If you want a small core, try a serial single-bit processor:
it handles unlimited word-length calculations with no overflow and no floating point, and a group of cores might perform any one calculation.
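The no-overflow claim is easy to demonstrate. A bit-serial adder sees its operands one bit at a time, LSB first, and carries a single bit of state between steps; the result simply grows as long as it needs to. Here is a minimal sketch (my own helper names, not any real API):

```python
def bit_serial_add(a_bits, b_bits):
    """Add two numbers presented LSB-first as bit lists.
    The only state is a one-bit carry, so word length is
    unbounded and overflow cannot happen."""
    n = max(len(a_bits), len(b_bits)) + 1  # +1 for final carry
    a = a_bits + [0] * (n - len(a_bits))
    b = b_bits + [0] * (n - len(b_bits))
    carry, out = 0, []
    for i in range(n):
        # Classic full adder: sum bit and carry-out.
        s = a[i] ^ b[i] ^ carry
        carry = (a[i] & b[i]) | (carry & (a[i] ^ b[i]))
        out.append(s)
    return out

def to_bits(x):
    """Integer -> LSB-first bit list."""
    bits = []
    while x:
        bits.append(x & 1)
        x >>= 1
    return bits or [0]

def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(b << i for i, b in enumerate(bits))
```

One bit of logic per clock, any width at all: that is the whole trick, and it is why such a core can be tiny.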
Now the cores are so small you could afford millions of them.
They are so cheap you can afford to have many sitting around waiting for more data to crunch.
This is just a different idea.
You're flogging a dead horse with super hot, super big, super complicated processors.
Think of something different.
We're stuck on an architecture over 70 years old that's just been jazzed up to overcome bottlenecks. This is just another bit of jazz: interleaved memory, wide buses, static RAM. Fuck it, who cares, ho hum.