Linux kernel Spectre V2 defense fingered for massively slowing down unlucky apps on Intel Hyper-Thread CPUs

Linux supremo Linus Torvalds has voiced support for a kernel patch that limits a previously deployed defense against Spectre Variant 2, a data-leaking vulnerability in modern processors. Specifically, the proposed patch disables a particular Spectre V2 defense mechanism by default, rather than switching it on automatically. …

  1. gerdesj Silver badge
    Childcatcher

    Top stuff

    This is a proper nerdy article which has slithered onto el Reg. Me: I absolutely love it. You can use terms like "Linux supremo" to sound a little more user-friendly, but in the end this is a complex subject that will have many readers glazing over before line three. STIBP THBIS NONBSEPNSE is close to genius (OK: I spat wine on my screen!). A well researched and documented article - thanks.

    Now as to the meat: Spectre and Meltdown have yet to really *be* used in compromises, as far as most of us civilians are concerned. We don't yet hear of any S&M compromises, but they surely exist and will be deployed by the clever mob. The not so clever mob (the usual non govt haaxxor nob ends) will eventually come up with something and become a pain.

    Keep patching, kids.

  2. Bitsminer
    Childcatcher

    Spectracular

    Reliability and trustworthiness now have a spectrum of options to select from. Get it wrong and get pwned!

    X86 = Special Executor for Caching Troubles with Revenue Extraction

    1. Voland's right hand Silver badge

      Re: Spectracular

      X86 = Special Executor for Caching Troubles with Revenue Extraction

      Various Spectre variants affect Arm, PPC and other CPUs. So I don't quite see your point here.

  3. Bill Gates

    Hyper-threading was a gimmick anyway. Just get rid of it. It is not needed: we have lots of REAL cores now, so we don't have to pretend.

    1. Gene Cash Silver badge

      Hell, let's actually use the cores we have!

      After watching xosview for a decade, I think the only thing to use all the cores at once is gcc, when it builds kernel modules for VirtualBox, nvidia drivers, etc.

      Other than that, I think KSP now uses about 1.8 cores.

      1. bombastic bob Silver badge
        Devil

        "let's actually use the cores we have!"

        Well, X11 is client/server and you'll see more threading because of it. And the OS has kernel threads that try to make use of multiple cores for IO and other things (well, Linux and BSD anyway; dunno about Windows, but it probably does too).

        But yeah, multi-thread algorithms are still a bit behind the hardware tech, last I looked, except for things that are trivially threaded. Some time ago I did a threaded quicksort as a demo, and a more practical discrete Fourier transform with threads [which is somewhat trivially threaded]. Where I get the most benefit is from a build, which I always invoke with 'make -j' jobs set to twice the number of available cores.
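
        By "trivially threaded" I mean something like this minimal pthreads sketch (not the actual demo I wrote back then): each worker owns a disjoint slice of the data, so there's no sharing and no locking to get in the way.

        /* Minimal sketch of a trivially threaded loop with POSIX threads.
         * Each worker sums its own disjoint slice - no sharing, no locks.
         * Build with: cc -O2 demo.c -lpthread */
        #include <pthread.h>
        #include <stdio.h>

        #define NTHREADS 8
        #define N (1 << 20)

        static double data[N];

        struct slice { int begin, end; double sum; };

        static void *worker(void *arg)
        {
            struct slice *s = arg;
            for (int i = s->begin; i < s->end; i++)
                s->sum += data[i];
            return NULL;
        }

        int main(void)
        {
            pthread_t tid[NTHREADS];
            struct slice s[NTHREADS];
            double total = 0.0;

            for (int i = 0; i < N; i++)
                data[i] = 1.0;
            for (int t = 0; t < NTHREADS; t++) {
                s[t] = (struct slice){ t * (N / NTHREADS),
                                       (t + 1) * (N / NTHREADS), 0.0 };
                pthread_create(&tid[t], NULL, worker, &s[t]);
            }
            for (int t = 0; t < NTHREADS; t++) {
                pthread_join(tid[t], NULL);
                total += s[t].sum;
            }
            printf("sum = %f\n", total);   /* expect 1048576.000000 */
            return 0;
        }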

        /me does not even know if Microsoft's compiler can do that - run simultaneous jobs to build things. BSD make and GNU make have been able to for at least a decade...

        1. Richard 12 Silver badge

          That's multiprocessing, not multithreading

          Microsoft compilers do it that way too, though you have to use the new-ish msbuild or the third-party jom, as Microsoft's nmake doesn't support -j for some reason.

          However it took until MSVC 2017 before msbuild became capable of doing any of the other build steps in parallel.

          And linkers are still mostly single threaded.

          1. Michael H.F. Wilkinson Silver badge

            Re: That's multiprocessing, not multithreading

            True, make -j 32 on our 64-core Opteron machine does absolutely fly through big builds, but that is indeed multi-processing rather than multi-threading proper. We do write code that scales well up to 64 threads (up to 50x speed-up), but that only really works if you have serious compute loads (like multi Gpixel images) to process. Many applications don't use multiple threads very heavily.

            1. Bronek Kozicki Silver badge

              Re: That's multiprocessing, not multithreading

              "Many applications don't use multiple threads very heavily." Yes, because you need either 1) an embarrassingly parallel algorithm (fitting within existing imperative programming paradigms) or 2) a new programming paradigm which limits data sharing between threads. Without either of these, the horizontal scalability of your application is severely limited by Amdahl's law (see the back-of-envelope sketch below). New languages like Go or Elixir, or frameworks like Akka, go some way towards 2), but few programmers can be bothered.
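
              To put rough numbers on that: Amdahl gives speedup = 1 / ((1 - p) + p/n) for parallel fraction p on n threads, so the 50x on 64 threads quoted above needs p of around 0.996. A back-of-envelope sketch in C, nothing more:

              /* Back-of-envelope Amdahl's law: speedup = 1 / ((1 - p) + p / n).
               * Even at p = 0.95, 64 threads only buy you about 15.4x. */
              #include <stdio.h>

              static double amdahl(double p, int n)
              {
                  return 1.0 / ((1.0 - p) + p / n);
              }

              int main(void)
              {
                  const double fractions[] = { 0.50, 0.90, 0.95, 0.99 };

                  for (int i = 0; i < 4; i++)
                      printf("p = %.2f -> speedup on 64 threads: %5.1fx\n",
                             fractions[i], amdahl(fractions[i], 64));
                  return 0;
              }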

            2. katrinab Silver badge

              Re: That's multiprocessing, not multithreading

              "Many applications don't use multiple threads very heavily."

              Maybe, but it is possible to run more than one application, or indeed one virtual machine on a single computer.

      2. Chemist

        "I think the only thing to use all the cores at once is gcc"

        Video transformations often (usually) use many cores.

        Try scaling video using ffmpeg - it uses all 4 cores (8 threads) on this i7 laptop.

      3. Uncle Slacky Silver badge

        Calibre uses multiple cores when converting ebook formats (one core per book), and of course those online cryptocoin miners will use them all if you let them.

        1. phuzz Silver badge

          "Calibre uses multiple cores when converting ebook formats"

          I did not know that, and it's a great feature to have, only it might be more useful if it took more than five seconds to do a conversion in the first place.

          "Surely a virtualisation platform would use all the cores/thread you can throw at the virtual machines?"

          Yes, but then the question is, what multithreaded workloads is your virtual machine running, or is it only (mainly) using one core?

          Perhaps if you were running multiple emulators inside your VM...?

      4. Anonymous Coward
        Anonymous Coward

        Surely a virtualisation platform would use all the cores/thread you can throw at the virtual machines?

        I was thinking the other day how virtualization used to be about sweating your assets as much as possible: using spare CPU time for other servers while still allowing peak performance every now and then on shared resources. However, cores and memory are so abundant now that many suppliers who have a say over how their software runs on your system specify minimum unshared amounts of cores/RAM etc, and don't 'allow' you to set more modest and reasonable requirements. Not so much of a concern in the Linux world, I know.

    2. bombastic bob Silver badge
      Devil

      "lots of REAL cores now, so we don't have to pretend"

      that's a very, very good point. Except for the legacy boxen...

    3. Anonymous Coward
      Anonymous Coward

      Not true. On many workloads you’ll never get the same level of performance per watt and per $ no matter how many cores.

  4. T. F. M. Reader Silver badge

    Hyper-threading itself may be bad for performance.

    Ever since hyper-threading was introduced, it's been pointed out that it may be bad for performance for many (most?) workloads, because the two hardware threads fight each other for the (single) cache. The usual advice (and not just from OpenBSD) was: switch it off unless you really know what you are doing and you've benchmarked your particular workload.

    Now, if you want additional protection from the Spectre haunting your CPU, you will make your computer slower still.

    Just switch hyper-threading off, and make all the SW patches that are only useful when it's on optional.
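
    On a recent enough kernel you don't even need a trip to the BIOS. A minimal sketch, assuming a kernel that exposes the runtime SMT switch in sysfs (and root, of course):

    /* Minimal sketch: flip SMT off (or back on) at runtime via sysfs.
     * Assumes a kernel new enough to expose
     * /sys/devices/system/cpu/smt/control, and root privileges.
     * Usage: ./smtctl [on|off]     (defaults to "off") */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const char *path = "/sys/devices/system/cpu/smt/control";
        const char *state = (argc > 1) ? argv[1] : "off";
        FILE *f = fopen(path, "w");

        if (!f) {
            perror(path);               /* missing file = kernel too old */
            return 1;
        }
        fputs(state, f);
        if (fclose(f) != 0) {           /* write error = value rejected */
            perror("fclose");
            return 1;
        }
        printf("SMT set to %s\n", state);
        return 0;
    }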

    1. Korev Silver badge
      Boffin

      Re: Hyper-threading itself may be bad for performance.

      "Ever since hyper-threading was introduced, it's been pointed out that it may be bad for performance for many (most?) workloads, because the two hardware threads fight each other for the (single) cache."

      Disabling hyper-threading is standard practice in HPC land; in addition to the cache, the tight loops that typically get run in this scenario mean there is competition for the same registers.

    2. Dave K Silver badge

      Re: Hyper-threading itself may be bad for performance.

      B-but I like seeing 16 CPU usage gauges in task manager :(

      Admittedly this would be my only noticeable loss when disabling HT...

    3. owlstead

      Re: Hyper-threading itself may be bad for performance.

      At the time it was introduced, hyperthreading offered a mere 10% or so performance improvement, and it even slowed the CPU down for some workloads.

      Here is a comment that discusses it in technical detail for the Intel P4:

      https://www.reddit.com/r/Amd/comments/7tzum9/does_zen_architecture_receive_single_thread/dthd122

      That's a long way off from what you can do with SMT on current processors, especially when it comes to CPU- and thread-heavy workloads:

      https://www.hardwarecanucks.com/forum/hardware-canucks-reviews/74880-amd-ryzen-7-1700x-review-testing-smt-4.html

      So yeah, switching off SMT (hyperthreading is an Intel marketing term) is generally a very bad idea with regard to performance. Rendering video is a common use case for faster CPUs - if you don't want that, buy a cheaper CPU without SMT.

      If I were on a multi-core chip, I would want to enable it (preferably automatically) only for certain applications. But I don't think we have options for that.

  5. _LC_
    Boffin

    Hyperthreading

    It's like this: when the CPU has to access something from main memory (there are various caches in between), it has to wait an eternity (easily 1,000+ cycles). With "Hyperthreading" it can simply continue executing another thread in the meantime. THIS IS NOT THE SAME as employing another core, as another core consumes A LOT more power (extra caches, etc.).

    Therefore, "Hyperthreading" is a good idea. In fact, it works even better when designed for more parallel threading (like SMT8 on the Power9, for instance).

    The problem here is that they decided to ignore the MMU (memory management/multiuser protection) in favor of speed (speculative execution). Now they pretend that those are all bugs. It's close to impossible to run so many red lights without asking yourself if you missed something.

    These processors should be recalled. Their claim that things are complicated and that it's virtually impossible to get them right is an insult. This reminds me very much of Volkswagen and their diesel scandal...

    Don't buy any new systems until they fix this. So far, they cannot be bothered, it seems. Buy a cheap in-order ARM/Risc-V system instead.

  6. Zolko
    Linux

    uname

    "was added to Linux 4.20 and backported to Linux 4.19.2"

    > uname -r

    4.19.1-041901-lowlatency

    good, I'm safe (icon, obviously)

  7. Anonymous Coward
    Anonymous Coward

    In all of this, there's one thing I don't see....

    ....namely, what access does a bad actor need in order to exploit these hardware defects?

    If a task with non-root permissions can run these exploits - clearly very bad.

    If a task needs root permissions, then maybe not so bad.

    If the bad actor needs physical access to the box, then maybe very unlikely.

    Can El Reg readers help this end user understand? Thanks.

    1. Spazturtle Silver badge

      Re: In all of this, there's one thing I don't see....

      Malicious JavaScript can use these exploits, although there are no current PoCs for this particular variant like there are for the others.

  8. BinkyTheMagicPaperclip Silver badge

    Cheers for the OpenBSD shoutout

    The security situation in OpenBSD is evolving, though. The current situation, unless they've changed it again in a recent snapshot, is that hyperthreading is disabled by default: the scheduler simply never schedules tasks onto the hyperthread siblings.

    Therefore, if you run top and kick off a multi-process compile on a pair of 8-core CPUs with hyperthreading, top will display all 32 'cores' but only show activity on 16 of them.
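
    For the curious, the knob is the hw.smt sysctl, off by default since 6.4; sysctl hw.smt=1 brings the sibling 'cores' back into play. A rough C sketch of querying it, assuming the HW_SMT mib constant from a 6.4-era <sys/sysctl.h>:

    /* Rough OpenBSD-only sketch: read the hw.smt knob. Assumes the
     * HW_SMT mib constant from a 6.4-era <sys/sysctl.h>. */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int main(void)
    {
        int mib[2] = { CTL_HW, HW_SMT };
        int smt = 0;
        size_t len = sizeof(smt);

        if (sysctl(mib, 2, &smt, &len, NULL, 0) == -1) {
            perror("sysctl hw.smt");
            return 1;
        }
        printf("hw.smt = %d (%s)\n", smt, smt ? "enabled" : "disabled");
        return 0;
    }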

    The real danger is that the number of potential exploits is rapidly multiplying. Whilst some of the issues found are difficult to leverage, I wouldn't put money on this being the case for all future design choices that can be taken advantage of.

  9. Rich 2
    WTF?

    That can't be right!

    You quoted Linus several times, but where was the profanity? I can only guess the anger management therapy or whatever it is he said he was doing is working.

    Well, f*ck me!

    1. Rajesh Kanungo

      Re: That can't be right!

      He swore off profanity.

  10. Someone Else Silver badge
    Happy

    "STIBP THBIS NONBSEPNSE"

    Nice!

  11. Rajesh Kanungo

    Intel Hyper threading is an oxymoron anyway

    In general I have associated "hyperthreading" with a large number of threads.

    Intel uses 2 threads per core and calls it hyper.

    I know of companies which have built processors with 64 threads per core.

    Threads were really meant, in these systems, for computational separation but not memory isolation. For example, you establish a pipeline of processes that data has to flow through to end up at a socket endpoint.

    Intel, at some point, may have pushed this as a marketing advantage, selling 'more' CPUs than they really had.

    Are there many applications that get a performance boost IRL from threading? The requirements for cache coherence are extremely tight. I can think of same-instruction, same-data as the basic requirement.

  12. JLV Silver badge

    inquiring minds want to know

    When are we due for Intel chips with the Spectre and Meltdown vuln classes nuked, again? Within reason, of course - nothing remains secure forever - but at least without a whole slew of known theoretical issues that hackers can play with.

    It’s almost as if they didn’t want to jeopardize their current sales by having people put off buying.

  13. JohnFen Silver badge

    Hooray!

    "So a patch in progress will allow admins to turn on STIBP if needed, but not by default."

    This is great news. My long-term mitigation plan is to get rid of my Intel-based machines entirely, and until then I want to pick and choose which mitigations I'm willing to accept. This is one I am not.
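
    Until then, the kernel will at least tell you which Spectre V2 mitigation it picked. A quick sketch, assuming a 4.15-or-later kernel that exposes the vulnerabilities directory in sysfs:

    /* Sketch: print the kernel's view of the Spectre V2 mitigation from
     * /sys/devices/system/cpu/vulnerabilities/spectre_v2 (present on
     * 4.15+ kernels with the sysfs vulnerability reporting patches). */
    #include <stdio.h>

    int main(void)
    {
        const char *path =
            "/sys/devices/system/cpu/vulnerabilities/spectre_v2";
        char line[256];
        FILE *f = fopen(path, "r");

        if (!f) {
            perror(path);
            return 1;
        }
        if (fgets(line, sizeof(line), f))
            printf("spectre_v2: %s", line);   /* e.g. the mitigation list */
        fclose(f);
        return 0;
    }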

  14. Anonymous Coward
    Anonymous Coward

    Confused

    Why does this option exist if it hurts performance so much? Might as well just turn off hyperthreading. Or is this still better than turning it off entirely? Article needs to clarify.

    1. diodesign (Written by Reg staff) Silver badge

      Re: Confused

      It exists because it mitigates the Spectre Variant 2 security vulnerability. It hits performance because the mitigation, in combination with Hyper-Threading, potentially slows down software.

      So your choices are:

      - a: enable mitigation for security reasons, enable Hyper-Threading, take the potential performance hit

      - b: enable the mitigation for security reasons, disable Hyper-Threading because you weren't benefiting from it anyway

      - c: disable the mitigation because you're not worried about the security issue, and enable Hyper-Threading

      - d: disable the mitigation because you're not worried about the security issue, and disable Hyper-Threading because you don't benefit from it anyway

      Most people will decide between a, b and c.

      C.

  15. Anonymous Coward
    Anonymous Coward

    50% performance hit? At this rate, the next advice will be to unplug it at the wall...

  16. ATeal

    The SMT discussions

    SMT is almost always a good thing, and the situations where it isn't ought to be few and far between. Most of us here (except one really weird comment above - WTF m8?) get that it's really tight loops that don't benefit from SMT when there are more of them than physical cores. Linux's scheduler has been aware of hyper-threading for eons (and I'm sure many others have too); the only difficulty is that it's bloody hard for a program to look around and go "ah, this is Intel - I'll just halve that reported core count".

    The difficulty is in software knowing it's dealing with SMT and adjusting itself accordingly (a rough way to do that on Linux is sketched below); it can go all the way with processor affinities easily enough, and it can spin up as few or as many threads as needed.
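
    Here's roughly what I mean: a rough Linux-specific sketch that walks each cpuN directory's topology/thread_siblings_list in sysfs and counts only the CPUs that lead their own sibling set, i.e. the physical cores:

    /* Rough sketch: estimate the physical core count on Linux by reading
     * each cpuN directory's topology/thread_siblings_list and counting
     * the CPUs that appear first in their own sibling list. */
    #include <stdio.h>

    int main(void)
    {
        int cpu, physical = 0;

        for (cpu = 0; ; cpu++) {
            char path[128];
            int first = -1;
            FILE *f;

            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                     cpu);
            f = fopen(path, "r");
            if (!f)
                break;                  /* no more CPUs */
            if (fscanf(f, "%d", &first) == 1 && first == cpu)
                physical++;             /* this CPU leads its sibling set */
            fclose(f);
        }
        printf("%d logical CPU(s), %d physical core(s)\n", cpu, physical);
        return 0;
    }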

    As long as we can hand these programs a non-default thread count via an environment variable, command-line argument or config file (whatever) - leave it on. Those who shouldn't are those running binaries from others - maybe some interpreted languages - but that's another issue for another time; maybe they should use affinities instead - again, another time.

    The modern cores from Intel and AMD (ignoring the Piledriver-grade pounding they gave me, kinda) are very, very super-scalar, to the point that they're extremely hard to keep busy even close to half the time. They're built so that software which leans heavily on various different areas can run fast; no one instruction stream will use *everything*. Someone above mentioned registers: you're looking at around 168 physical integer registers now (maybe 154 back on Sandy Bridge, circa 2012). That's another area with resources to spare - one weird bit of software might exhaust the registers (needing lots of live values, huge spilling or something), but that same software won't saturate whatever other area of the execution engine. I haven't proved the claim "for all instruction sequences, that sequence uses the full resources of at most one partition of the execution resources", but with only around 150 to 180 ops in flight at any one time *max*, using all those registers leaves you something like 12 operations to do anything else - you get my point, I hope.

    Anyway, that's why SMT is good. If you use perf and read the manuals (I've written some device drivers that abuse ioctl to expose model-specific registers - there's all kinds of things these can do, but they're so model-specific...) you can confirm this and see, ish, what's going on, and get the most out of it. A lot of my jobs involve squeezing performance out of Sandy Bridge chips, so trust me on this - there's a lot there. It's just so model-specific that I can see why perf et al went "screw that" WRT supporting it.

    I actually don't like that it's just one extra thread. I think POWER or SPARC - one of them - uses something like 8 or 16 threads per core. Not even they fully utilise it most of the time (you could make it SMT512 if you like, but it won't matter: if nothing you're running is hammering the SIMD floating-point units, those units are going to sit idle...).

    It's a good thing. Dare I say "for as long as the execution units are there to do work, it won't bottleneck" - but this is the problem with hard real-time systems: you're just opening programs that don't coordinate, so that claim is impossible to state or measure. However, we can all see it means, roughly, "don't run that floating-point-heavy stuff with more threads than there are FPUs in total on this system".

    Modulo whatever.

    I'm one of those die-hard hippies that trusts his computer, though. If I ran Windows (not a dig, but all those things bring their own DLLs, phone home, etc.) I can see why you'd be worried. Or if you sell CPU time - you get my point. I trust the software I run, and I don't run software I wouldn't trust without some restrictions - and I'd say I pay a price for that. However, how this affects a database (a great use of SMT, there)...

    You see my point. Generally a very good thing. Spectre is such an issue (see my comment here https://forums.theregister.co.uk/forum/1/2018/07/26/netspectre_network_leak/ - you can't just "jitter the clocks") that any CPU fixed against it (i.e. lying about the current time, no more rdtsc, etc.) could still do SMT and be safe. I've long been thinking about this, but as it's not my job (sadly) I don't know if my "mitigated system" would be practical (it involves lying about the time, pretending everything is deterministic and isolated, yet working much like today) or if it'd be way too slow. But it can easily be shown that if you can do that safely, you can make the SMT system running on top safe.
