Intel's Broadwell Xeon E5-2600 v4 chips: So what's in it for you, smartie-pants coders

Intel today officially pulls the wraps off its mildly delayed Xeon E5 v4 server processors. These chips follow up 2014’s Xeon E5 v3 parts, which used a 22nm process size and the Haswell micro-architecture. Intel shrunk Haswell to 14nm, and after some tinkering, codenamed the resulting design Broadwell. Server and workstation …

Silver badge

hear that

That's the cost of vSphere on a per-core basis going down again, with 22 cores per socket.

The most important info to me in this article was: 22 cores, and socket-compatible with v3 systems. That is awesome.

0
0

Re: hear that

Yeah, it's pretty obvious now why MS is going to limit/license Server 2016 based on core count. Bet VMware follows along shortly with a new vCPU tax.

1
0

Re: hear that

You mean Microsoft will do "an Oracle". Chip/socket count, core count, MHz, memory, all will be taxed.

4
0
Silver badge

Re: hear that

"That is awesome."

For some workloads, yes. For things like databases, where single-thread speed matters, the 6-core part running at over 4GHz (Turboboost) is still king. If you look closely at Turboboost, a 12-core chip has very similar overall compute to a 6-core chip.

1
0

will do methinks for a new Mac Pro

Almost enough cores I reckon, and the current model has been around nearly 3 years.

1
0

Re: will do methinks for a new Mac Pro

you can never have enough cores!

1
0
Silver badge

Here's loving the virtualization and database tweaks. Time to start banking a shitload of pennies.

0
0
Joke

But....

Can I play solitaire on it?

2
0
Silver badge
Linux

Top ask!

Will it run Linux Mint?

2
3
Silver badge

Yes, but...

Will it rotate my tires, and check my oil??

Lots of nice features to be sure, but what does it take to fully utilize them in current products? Looks like there is a bunch of work ahead for the kernel/VM programmers to get up to speed, and even then it will need a bunch of testing to make sure it works OK. For normal user programs (Solitaire, anyone?), the cards will just shuffle faster.

Of course those mining bitcoins hopefully will consume less power in their quest to make zillions.

0
0
Silver badge

Re: Yes, but...

Of course those mining bitcoins hopefully will consume less power in their quest to make zillions.

If bitcoin miners started using Intel CPUs to perform the hashing, the power cost per bitcoin would skyrocket.

0
0

working TSX?

Quote: "while teasing developers with goodies like posted interrupts, working TSX,"

Surely that should be

"while teasing developers with goodies like posted interrupts, allegedly working TSX,"

Pretty much every Intel chip product of the last decade (and probably longer) has had multiple errata, I suspect most of them found after release. I think claiming TSX is working is a bit premature until it's seen in the wild for a while.

2
0
Silver badge
Meh

Hmmm

Do they still use 1 FPU per 2 cores? If so, for my work it's effectively (slightly less than) half the number of cores.

1
1
Silver badge

Re: Hmmm

Wasn't that AMD...

4
0
Silver badge

Re: Hmmm

They are both at it. I don't remember which one was first with the idea.

It totally screws up low-latency floating-point DSP work. I have a 'standard' dual-core Intel that performs better than sooper-dooper quad cores from both Intel and AMD, with similar clock speeds and an identical OS.

2
1
Anonymous Coward

3.5

So the fastest is 3.5GHz?

Clock speed is far more important for anything I do than the number of cores or any of the other fiddles.

This multi core madness is ruining performance improvement.

2
1
Silver badge

Re: 3.5

Clock speed is far more important for anything I do than the number of cores or any of the other fiddles.

Sun found that out the hard way with Niagara (and later T3). Looked nice on paper, but in the real world (except for some quite specific workloads) it turned out to be not such a great idea. T4 was the first one that actually worked fairly well.

2
1
Silver badge

Re: 3.5

Processor design hit a MHz barrier years ago, at approx. 4GHz.

If you can't make your workload multicore then you are never going to go faster on electronic semiconductor hardware.

Put your effort into finding ways to use those extra cores, because otherwise you will not get more work done per unit time until there is an all-new type of hardware in town.

6
0
Silver badge

Re: 3.5

"So the fastest is 3.5GHz?"

No, the fastest base clock is 3.5GHz. The fastest core is much better than that - I think at the moment it's about 4.3GHz in the "old" Xeon range. You do have to sacrifice cores and use Turboboost but clock speed isn't dead completely.

I think IBM have a 6GHz chip available, but you may have to change architectures to use it :)

1
0
Anonymous Coward

Re: 3.5

I agree that some workloads require super-duper clock speeds on a single thread. I have such a workload, and we would love an 8-10GHz chip for crunching it. It can't be run in parallel (well, it can, but that slows it down). I want a fast-clocked CPU for one big job we run that takes 30-40 secs every 3 minutes. Cutting that time in half would be worth paying money for.

My knowledge of CPU design is third only to my knowledge of popular beat music and fashion (are flares still in?). Could you design a multi-core CPU with perhaps one of the cores running at silly speeds and the other 21 cores at the normal 3.4GHz (or whatever)? Not a troll, simply wondering aloud and hoping a bloke at Intel comes back and says "Look what we have just for you!"

Thanks

1
0
Silver badge

Re: 3.5

Turboboost is based on the thermal envelope for the whole package, so yes, it would be theoretically possible. Turboboost already changes things dynamically depending on the current workload, so in your case one core could run at 4.2GHz while "the process" runs, and then all six could run at 3.6GHz when it finishes.

Of course, you could take the easy route and build a server with one 22-core chip and one 6-core chip, then use various gubbins like NUMA and core affinity to tie virtual machines to the right cores. Or take the really easy route and just buy two servers :)
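
The core-affinity part of that can be sketched in a few lines of Python on Linux. This is only an illustration: `pin_to_cores` is a made-up helper name, and it pins an ordinary process rather than a virtual machine (a hypervisor has its own settings for that). `os.sched_setaffinity` and `os.sched_getaffinity` are real standard-library calls, but they are only available on some platforms, Linux among them.

```python
import os


def pin_to_cores(cores, pid=0):
    """Restrict a process to the given logical cores and return the new mask.

    pid=0 means "the calling process". `pin_to_cores` is a hypothetical
    helper name used for illustration only.
    """
    if not hasattr(os, "sched_setaffinity"):
        # The affinity calls are missing on platforms such as macOS and Windows.
        raise NotImplementedError("CPU affinity is not supported on this platform")
    os.sched_setaffinity(pid, set(cores))
    # Read the mask back so the caller can see what the kernel accepted.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Assumption: we simply pick the lowest-numbered core we are allowed to use.
    # Which logical cores sit on the fast socket depends on the machine.
    allowed = os.sched_getaffinity(0)
    print("allowed before:", sorted(allowed))
    print("allowed after: ", sorted(pin_to_cores([min(allowed)])))
```

Which core numbers belong to which socket is machine-specific, so check the topology (for example with `lscpu`) before hard-coding anything.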

3
0
Silver badge

Re: 3.5

Not going to happen in semiconductors.

The ~4 GHz limit is due to the physics of how the clock is distributed around the chip.

As process size shrinks, the smaller physical distance between gates reduces latency (linearly); however, interference increases (inverse square law), and thus Bad Things happen.

If your workload really can't be done in parallel then you're stuck.

However, it is very unlikely to be genuinely true. Very few workloads are totally serial, and so you can usually find some sections that are independent.

If you find it runs noticeably slower when running in parallel then your architecture for doing it is almost certainly incorrect, and is blocking threads way too often.

At worst it should be slightly slower due to thread context switch.
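
A rough way to see how much those independent sections buy you is Amdahl's law: if a fraction p of the run time can be spread over n cores, the best-case speedup is 1 / ((1 - p) + p / n). A minimal sketch, assuming the parallel part scales perfectly and ignoring locking and context-switch overhead (the function name and the sample fractions are illustrative, not measurements):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speedup under Amdahl's law.

    parallel_fraction: share of the original run time that can run in parallel (0..1).
    cores: number of cores the parallel share is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions only; measure your own workload to get a real p.
    for p in (0.0, 0.5, 0.9, 0.99):
        print(f"p={p:.2f}: 6 cores -> {amdahl_speedup(p, 6):.2f}x, "
              f"22 cores -> {amdahl_speedup(p, 22):.2f}x")
```

The serial part sets the ceiling: a job that is only half parallel can never reach 2x, no matter how many cores you throw at it.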

1
0

Re: 3.5

"...Sun found that out the hard way with Niagara (and later T3). Looked nice on paper, but in the real world (except for some quite specific workloads) it turned out not such a great idea ...."

The Niagara T1-T3 were niche processors. In that niche they excelled. An article said that a Niagara T1 CPU running at 1.2 GHz with 8 cores was 50x faster than an Intel 2.4 GHz dual-core server. No typo, 50x!!! The workload was web serving for many lightweight clients.

Today the SPARC M7 is typically 2-3x faster than the fastest x86 CPU and POWER8. It is up to 11x faster than x86 and POWER8. As the SPARC M7 can encrypt data almost for free, encryption costs 2-3%, whereas on x86 performance will typically be halved or worse, leaving hardly any horsepower to do useful work when using encryption.

Here are 25ish benchmarks where SPARC M7 is 2-3x faster, such as SPEC2006, Hadoop, SAP, Neural Networks, Specjbb2005, etc etc:

https://blogs.oracle.com/BestPerf/

(Funny thing is that if you look a bit on that site, you will see that SPARC M7 is twice as fast at SPECjEnterprise2010 as the stated record.)

0
0

Re: 3.5

I wrote a blog post http://wp.me/p4PWml-1h on this great Intel charade. Performance per core has been flat from the original Nehalem Gainestown through Haswell and Broadwell EP. For workloads priced by the core, especially Oracle, all this means is you pay more per socket for flat performance.

0
0

Re: 3.5

There is Oracle marketing spewing their lies. They go around claiming performance superiority for SPARC M7 on a per-socket level. They will compare their socket to an Intel or POWER one. They won't mention that the SPARC M7 socket consists of 32 cores while they compare it to an Intel server with 10 cores or a POWER8 server with 6 cores. Oracle marketing continues to purposely misstate the facts.

0
0
Silver badge

22nm vs 14nm

Surprising how little extra performance they were able to get from that process shrink.

Only 4 extra cores for the same power draw? I guess some of the new features take up a LOT of silicon, or Intel threw billions of bucks at a lame process node.

Still, looks like a nice bit of kit, I want one to run Crysis.

0
2
Anonymous Coward

teething

Does anybody have any insight into how long it usually takes for initial teething problems to be worked out?

I need a hardware refresh!

0
0
Silver badge

Re: teething

IMO, it is just in time for the next Version of the Silicon with 32 cores.

Gotta keep you drooling about upgrades haven't they?

0
1

Biting the hand that feeds IT © 1998–2018