Intel's new Xeon: Easy to switch between dual memory modes? Uh, no

Intel's new Xeon E7 v2 server chips are capable of two different memory modes, but don't expect to switch between them willy-nilly or to have two VMs running on the same box using different modes. As we noted in our deep dive into the innards of Chipzilla's new top-of-the-line x86 server processors, the Xeon E7 v2 series …

COMMENTS

This topic is closed for new posts.
Silver badge
Paris Hilton

So performance mode is unreliable somehow...

So what happens? Random bit errors (hopefully caught by ECC memory, though the designer may have thrown it out for performance in the first place)? Random bus initializations? Random threads dying because of "bus errors"?

And where are the reliability numbers?

0
0

Re: So performance mode is unreliable somehow...

Bit errors are presumably seen as not allowed, otherwise it's a pointless technology. What's more likely is that RAM transactions have unreliable timings; i.e. it might take longer one time than another which would be the bane of anything latency critical (high frequency trading was mentioned, anything realtime would likewise be affected).

1
1
Anonymous Coward

Re: So performance mode is unreliable somehow...

"And where are the reliability numbers?"

How dare you.

Intel says it's better. That should be more than enough for you and indeed the whole wide world to go out and buy lots.

2
1

I have to admit that I am a little confused. Are you saying that in 1:1 mode (as opposed to 2:1 mode) memory fetches might return data from the wrong place?

As for on-site re-config: assuming you didn't have to actually exchange memory DIMMs, I would think the process is a reboot, loading config code which touches control registers in either the processor or memory and controllers, maybe a memory test, and you're back up. Let's say 2 hours max in the worst case.

Of course, that's just a guess

0
0
Bronze badge

what is new, exactly?

The article left me confused. Memory mirroring with lockstep is not a new technology.

It would seem that the actual news here is double the bandwidth, which can be utilised in one of two ways (neither of which is new). Either you take the lockstep, meaning a pair of DIMMs handling shared duties (mirroring each other) with only half of the installed memory size available to the OS, or you take those as independent modules, exposing the full size of your DIMMs to the OS.

Either way you will use ECC memory, so the only "unreliability" implied here would be when you actually have unrecoverable memory errors, i.e. more than 1 bit flipped. In which case lockstep would presumably allow you to continue working, hopefully with your data intact (thanks to mirroring performed by the CPU), while performance mode will force you to replace memory modules (taking the machine down first). Either way, you would see it coming, since failing ECC memory modules would first flip single bits (recoverable, and logged if you have a decent OS), so you should be able to replace the memory during e.g. a weekly machine bounce, before unrecoverable errors happen. There is nothing new here. Except for the bandwidth, it would seem to me.

2
0
Silver badge

Re: what is new, exactly?

The obvious next step is hotswap: If a module fails the server can illuminate a light on the mainboard to indicate it. Press a button, channel is taken out of service. Swap memory, press the button again, channel comes back up. Zero downtime, if your server case allows you to access the RAM without having to get the whole thing disconnected and out of the rack.

1
0
Bronze badge

Re: what is new, exactly?

Hotswappable memory is well into mainframe territory (some of them have hotswappable CPUs as well).

0
0
Bronze badge

Re: what is new, exactly?

Quite common in 4-socket servers, actually.

0
0

I expect Lockstep Mode will mostly be used by NonStop servers.

0
0
Bronze badge
Meh

So Intel have (re)invented memory dividers then? Overclockers have been using similar features for years (mainly to slow the RAM down to a speed it can handle when the CPU is clocked up).

The worrying part is that they seem to be sowing some doubts as to the reliability of the "Performance" mode. Why would you use a server that had a BIOS/EFI setting that made the memory not reliable? And if it is reliable for production use, why would you ever use it on the non-Performance setting?

0
0
Bronze badge

I suspect this is a misunderstanding. With mirroring (in lockstep or not) the CPU can recover from an otherwise unrecoverable memory error, and without it "only" correction of single-bit errors in a 64-bit word is possible, with the help of ECC.

0
0
Anonymous Coward

Where's the news in this news?

"I expect Lockstep Mode will mostly be used by NonStop servers."

Maybe, but NonStop hasn't needed instruction-level or memory-access-level lockstep for years. Comparisons for correctness were done at the level of logical IO operations (disk, network, etc), which are rather slower than main memory (never mind cache), and therefore rather simpler in terms of designing a comparator that doesn't slow things down too much. They may have moved on again since the days of the Logical Synchronisation Unit.

More generally, I'd welcome a concrete documented example of where in a real x86 box "lock step memory" actually means "mirrored memory". I know it can be done, but who bothers?

[1] is a link to a 2009 HP document which attempts to explain what HP think "lockstep memory" means in the context of their Xeon-based Bladesystems, and it doesn't sound like mirrored memory. Note that this lockstep memory is documented as slowing things down.

So given that HP Bladesystem seems to have been around a few years, what, exactly, is new in this Intel announcement? Where are the slides so we can make our own minds up? [2]

[1] http://h30507.www3.hp.com/t5/Eye-on-Blades-Blog-Trends-in/Information-on-DDR3-memory-lock-step-technology/ba-p/47527#.UwTwsPuEVXs

Correction welcome (in every sense).

[2] http://software.intel.com/en-us/articles/intel-xeon-processor-e7-v2-family-technical-overview

"Lockstep Memory mode uses two memory channels at a time, stores half the cacheline in one DIMM on one channel and the other half on the next, and offers an even higher level of protection. In lockstep mode, two channels operate as a single channel—each write and read operation moves a data word two channels wide. In three-channel memory systems, the third channel is unused and left unpopulated. The Lockstep Memory mode is the most reliable, but it reduces the total system memory bandwidth by one-third in most systems."

Quite similar, on the surface, to the HP 2009 info, but what do I know.
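
To make the quoted description concrete, here is a minimal sketch of a cacheline being split across two channels and put back together on read. The 64-byte cacheline is standard for x86, but the function names and the choice of contiguous halves are assumptions for illustration only; the quote does not say how the hardware actually lays out the two halves.

```python
from typing import Tuple

CACHELINE_BYTES = 64  # x86 cacheline size


def lockstep_write(line: bytes) -> Tuple[bytes, bytes]:
    """Split one cacheline into two halves, one per channel (hypothetical layout)."""
    if len(line) != CACHELINE_BYTES:
        raise ValueError("expected exactly one cacheline")
    half = CACHELINE_BYTES // 2
    return line[:half], line[half:]


def lockstep_read(channel_a: bytes, channel_b: bytes) -> bytes:
    """Rebuild the cacheline; both channels must respond for the read to complete."""
    return channel_a + channel_b


line = bytes(range(CACHELINE_BYTES))
half_a, half_b = lockstep_write(line)
restored = lockstep_read(half_a, half_b)
```

Note that this is a split, not a copy: each channel holds different data, which fits the point above that lockstep is not the same thing as mirroring.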

2
0

The 5500 series Xeons had mirroring

The 5500 series Xeons could do memory mirroring (think memory RAID 1): if a DIMM failed you could schedule downtime to swap it rather than have the server crash.

I doubt that performance mode is unreliable, more that mirrored mode is more reliable. If you've got 96 DIMMs in a server, the chance of one failing is far higher than in a 4-DIMM server.
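
A rough back-of-envelope sketch of that DIMM-count argument, assuming failures are independent and every DIMM has the same failure probability over some period. The 1% figure is a made-up placeholder, not a measured failure rate.

```python
def p_any_failure(n_dimms: int, p_single: float) -> float:
    """Probability that at least one of n independent DIMMs fails."""
    return 1.0 - (1.0 - p_single) ** n_dimms


P_SINGLE = 0.01  # assumed per-DIMM failure probability, for illustration only

for n in (4, 96):
    print(f"{n} DIMMs: {p_any_failure(n, P_SINGLE):.3f}")
```

With these assumed numbers the 96-DIMM box is far more likely to see at least one failure, which is the case for mirroring on big-memory machines.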

As for switching modes: boot into the BIOS, change the setting, reboot. Even on the very slowly booting HP DL360 G6s you should only be down 5-10 minutes.

2
0
Bronze badge

Re: The 5500 series Xeons had mirroring

Agreed.

My motherboard, a SuperMicro X9DA7 (two LGA2011 sockets for E5-26xx), can do mirroring too. It's not a new feature.

Extra bandwidth in connection to mirroring (so apparently there is no performance penalty compared to current generation) seems to be new, to me.

1
0