Another day, another Spectre fix slowdown: What to expect if you heart ZFS

The widely used ZFS file system software is slowed down in both read IOPS and throughput by Intel CPU microcode fixes for the Spectre processor design flaws, one set of numbers suggests. Systems engineer Jon Kensy blogged about the results of his ZFS testing, based on a VM running ZFS on an Ubuntu 16.04 LTS system on a Dell …

  1. Warm Braw

    Ubuntu patches?

    According to the VMware security advisory, fixing the bug completely requires guest OS patches as well as patches to the CPU microcode and hypervisor. I can't see anything in the blog post about whether these have been applied or are relevant. Anyone have any further details?

    1. A Non e-mouse Silver badge

      Re: Ubuntu patches?

      Haven't VMware pulled their patches?

    2. Anonymous Coward

      Re: Ubuntu patches?

      "fixing the bug completely"

      A bug is something that stops a system from functioning correctly. The big problem with Meltdown/Spectre is that the available fixes, for just about all current-generation hardware, affect system performance, and a system that is running slow is no longer functioning correctly, which is itself a bug.

      Meltdown/Spectre is no longer a security bug - it's now a performance bug, for which no complete fix, except replacement h/w, is, or ever will be, available.

      1. Natalie Gritpants

        Re: Ubuntu patches?

        Think you're getting confused with realtime systems. Unless you have an SLA stating the performance you will get, it's not a bug that your program runs slower, as long as it gets to the correct result (whatever that is).

        1. Adam 52 Silver badge

          Re: Ubuntu patches?

          I don't do real-time but I don't think I've ever worked on anything without a performance SLA. Websites must respond within x ms, databases must do n queries per hour, user interfaces must respond within y seconds, models must run in under z hours.

          Even my current GDPR subject access request project has a target SLA of 50 SARs/week.

      2. Anonymous Coward

        Re: Ubuntu patches?

        Yet another comment lumping Meltdown and Spectre together. Intel have done a brilliant job of getting everyone to confuse the two vulnerabilities.

        Of the two issues, Meltdown is the performance killer......the OS has to perform a switch to a completely separate x86 processing environment. So the application state has to be stored temporarily, the processor switches to a new x86 environment, processes the kernel request, stores the results and then switches back to the original environment. Big Slow Down!!!

        What should happen is that within the x86 environment, the application makes a kernel call, the processor changes state from Ring 3 to Ring 0 to execute the kernel code.

        Spectre requires recompiling code to ensure that the results of code speculatively executed by the CPU are flushed to avoid exploitation. The impact of Spectre would be minor compared to Meltdown.
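
The cost described above is paid on every user-to-kernel transition, so syscall-heavy workloads feel it most. A minimal sketch for timing a cheap kernel call before and after patching follows. The function name `mean_syscall_ns` and the choice of `os.getppid()` as the probe are assumptions made for illustration, and Python interpreter overhead is included in the figure, so only the difference between two runs on the same machine is meaningful.

```python
import os
import time


def mean_syscall_ns(iterations: int = 200_000) -> float:
    """Return the mean wall-clock cost, in nanoseconds, of one os.getppid() call."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getppid()  # cheap call that still has to ask the operating system
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # Run once on an unpatched boot and once on a patched boot, then compare.
    print(f"~{mean_syscall_ns():.0f} ns per call (includes interpreter overhead)")
```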

        1. Crypto Monad Silver badge

          Re: Ubuntu patches?

          > Of the two issues, Meltdown is the performance killer......the OS has to perform a switch to a completely separate x86 processing environment.

          This may be true, but then what does that have to do with *microcode* patches which this article specifically mentions? KPTI is just a kernel feature which you'll get with an updated kernel (and can turn off at the boot prompt), and doesn't require any microcode change.

          Microcode changes are more likely necessary for Spectre, because otherwise you have to recompile all your code with a compiler that knows how to spit out the right mitigations (e.g. Retpoline)
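
To see which mitigations a given Linux guest or host actually has active, kernels that include the mitigation reporting expose one status file per issue under `/sys/devices/system/cpu/vulnerabilities`. The sketch below simply reads that directory. The helper name `read_mitigations` is an assumption for illustration, and on older kernels or non-Linux systems the directory is absent, in which case an empty result is returned.

```python
from pathlib import Path

# Present only on Linux kernels that report mitigation status via sysfs.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(base: Path = SYSFS_DIR) -> dict:
    """Map each vulnerability file name to the kernel's one-line status string."""
    if not base.is_dir():
        return {}  # older kernel or non-Linux system: nothing to report
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


if __name__ == "__main__":
    for name, status in read_mitigations().items():
        print(f"{name:20s} {status}")
```

A `meltdown` entry reading `Mitigation: PTI` indicates that KPTI is on, which is a kernel setting and independent of any microcode update.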

        2. GreenReaper

          Re: Ubuntu patches?

          Recompilation introduces changes designed to frustrate speculative loads and execution, which otherwise might improve performance. There is, therefore, an impact. How big, depends on the precise mechanism of the protection and the software being run.

          But without new microcode, the defence is inadequate, and so will not have the full performance impact. I've seen several graphs of performance diving after the microcode was also applied.

  2. Anonymous Coward

    True risk profile

    What I'm seeing is a real lack of common sense in the wake of this Intel hysteria.

    For private, well secured environments, the risk that a storage appliance becomes exploitable is not your greatest risk. If someone can gain ssh access to your storage appliance to run the exploit, your data is already accessible and exposed. They don't need the exploit, they have the crown jewels of your data already.

    In well secured environments, the risk/reward of applying these patches has to be evaluated holistically and not based on the crowd noise.

    1. Dan 55 Silver badge

      Re: True risk profile

      > If someone can gain ssh access to your storage appliance to run the exploit, your data is already accessible and exposed.

      Just because a locked down user has access via ssh, it doesn't follow that they've got the crown jewels. The hysteria is precisely because Meltdown/Spectre allow privilege escalation and access to what should be non-accessible information.

      1. Anonymous Coward

        Re: True risk profile

        This sounds like chicken or the egg.

        IF YOU ALREADY HAVE SSH ACCESS (which, presumably you would need to run the exploit in the first place) then you certainly have access to move files on the targeted system. You don't need the Meltdown/Spectre exploit to access data at that point.

        My point is that for most storage appliances, your risk profile is broader than the appliance itself. So, the cost/benefit of applying performance-degrading patches on a storage appliance needs to be carefully evaluated.

        1. Wensleydale Cheese

          EH?

          "IF YOU ALREADY HAVE SSH ACCESS (which, presumably you would need to run the exploit in the first place) then you certainly have access to move files on the targeted system."

          Please don't confuse ssh access with privileged access. You could be ssh-ing into a well locked down user account, which could even be running without a command line available.

          1. Anonymous Coward

            Re: EH?

            >Please don't confuse ssh access with privileged access. You could be ssh-ing into a well locked down user account, which could even be running without a command line available.

            This. A lot of people are assuming that if a storage product's interface gives you SSH access, you have the ability to run arbitrary code.

            You don't, because that would be fucking stupid.

            1. GreenReaper
              Devil

              Re: EH?

              Just shows how many people habitually ssh into root.

    2. CheesyTheClown

      Re: True risk profile

      This is the common misunderstanding of the problem. NetApp, EMC and others don’t understand it either, and AMD is affected as well.

      NFS, SMB and web management systems are based on remote code execution. A well written RPC call with a code injection is all that is needed to exploit all threads within a process on any platform of this type.

      iSCSI and FC should be relatively safe, but they have their own problems for which you should avoid them.

      Blob stores should be ok.

      1. Brewster's Angle Grinder Silver badge

        Re: True risk profile

        "NFS, SMB and web management systems are based on remote code execution. A well written RPC call with a code injection..."

        But they don't, AFAIK, allow you to execute arbitrary code; you can only call pre-compiled routines. So if code can break out of an RPC then you've got a patchable bug.

        And that's the point. For spectre and meltdown to come into play the hacker has to have already broken in. Even if spectre and meltdown are patched they will probably have other avenues -- after all they're not supposed to be in there in the first place. Now on any normal day, you wouldn't want to leave that door open. But the performance hit is so serious that it might be worth it, given that the primary reason spectre and meltdown are devastating is they allow legitimately running user processes to circumvent restrictions put upon them.

  3. artem

    This performance "review" is worth shit without knowing the exact hardware configuration.

    Let me give you some real worthy data: I have an Intel Core i5 2500 CPU and Linux kernel compilation (lots of small C files) slowed down by at least ~35% after applying Meltdown/Spectre patches (I'm running Linux kernel 4.11.14).

    Let me remind everyone here that if your CPU is older than Intel Skylake (Kaby and Coffee are the same uArch) or AMD (Ry)Zen then your performance might suffer a lot, a whole lot.

    1. bigphil9009

      And your performance "review" is somehow valid with the exact same lack of hardware configuration details?

      1. artem

        Um, that's gonna be:

        * Intel Core i5 2500

        * 16GB DDR3 1600MHz RAM in dual channel mode

        * Kernel built in RAM disk (which pretty much negates any IO induced slow downs for this particular use case - we're talking about compilation)

        which shows that performance in my case is limited only by the CPU.

        And this "new" "full" info adds mostly nothing to my initial configuration because the only thing which severely affects performance is your CPU and its architecture. I like how people give me thumbs down - it shows how little they've read about Meltdown & Spectre and how different CPU generations and vendors suffer from the fixes.

        Cheers!

        1. Anonymous Coward

          Performance isn't as simple as the speed of your CPU. The CPU is an important part of it, because it is where the application runs, and that application's job is to fetch or store individual I/Os.

          Because the CPU's ability to deal with an I/O is now restricted by the patches available for these bugs, an individual I/O's latency will be increased. This might be important but it depends on the I/O requirements of your workload.

          Will IOPS be affected? Possibly. Depends on how many parallel I/Os can be executed at once, and whether the CPU(s) is/are saturated. But then again, IOPS is the shittest and least relevant measure of performance, unless you are a vendor trying to quote a headline figure of irrelevance.

          Will throughput be affected? If the CPUs aren't saturated, most likely not. Yes, latency will increase but for throughput workloads latency is irrelevant anyway.

          Of course, this assumes that your storage system is just doing I/Os and nothing else. If you're taking snapshots or doing replication etc. etc. then this will also be impacted as this requires CPU time and tends to be latency sensitive. Good luck if your storage still uses copy-on-write.
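
The latency/IOPS/throughput argument above can be put into numbers with Little's law (IOPS = outstanding I/Os / per-I/O latency). This is a rough sketch only: the 0.5 ms baseline latency, the 8 per cent latency penalty and the queue depths are illustrative assumptions, not measurements from the article.

```python
def iops(queue_depth: int, latency_s: float) -> float:
    """Little's law: completed I/Os per second for a fixed number in flight."""
    return queue_depth / latency_s


if __name__ == "__main__":
    base_latency = 0.0005                  # assumed 0.5 ms per I/O before patching
    patched_latency = base_latency * 1.08  # assumed 8% extra latency after patching

    for depth in (1, 32):
        before = iops(depth, base_latency)
        after = iops(depth, patched_latency)
        print(f"QD{depth}: {before:.0f} -> {after:.0f} IOPS")

    # If the CPUs are not saturated, a little more parallelism wins it back.
    print(f"QD35 patched: {iops(35, patched_latency):.0f} IOPS")
```

At a fixed queue depth the IOPS figure drops in proportion to the added latency, while a workload that can keep more I/Os in flight recovers its throughput, which matches the distinction drawn above.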

  4. iOS6 user

    Any fixed CPUs?

    Do Intel/AMD already sell any updated CPU models that are not affected by the recent flaws?

    If so, is there any way to recognize those CPUs?

  5. PlinkerTind

    SPARC immune to Meltdown

    Intel Xeon cpus are susceptible to Meltdown. Other cpus are not.

    1. This post has been deleted by its author

  6. Stevie

    Bah!

    Can't we just reboot the internet to flush out all the 4chans?

  7. JeffyPoooh
    Pint

    "...a 7.6-8 per cent impact is pretty rough..."

    Not really.

    In terms of Moore's Law (assuming it applies), it's about two months worth of performance growth.

    Ref: (1.08)^6 ≈ 1.6x per year, and one-sixth of a year is 2 months.

    Just sayin'.
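
The back-of-envelope sum above can be checked in a few lines. This sketch assumes steady exponential growth; the 18- and 24-month doubling periods are assumptions chosen for illustration (1.6x per year corresponds to roughly an 18-month doubling), and `months_for_growth` is a hypothetical helper name.

```python
import math


def months_for_growth(factor: float, doubling_months: float) -> float:
    """Months of steady exponential growth needed to improve by `factor`."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    print(f"1.08^6 = {1.08 ** 6:.3f}x per year")
    for doubling in (18, 24):
        months = months_for_growth(1.08, doubling)
        print(f"doubling every {doubling} months: 8% takes {months:.1f} months")
```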

    1. Anonymous Coward

      Re: "...a 7.6-8 per cent impact is pretty rough..."

      >In terms of Moore's Law (assuming it applies),

      It doesn't. There is a general assumption that storage growth is exponential. For some it is, for some it isn't.

      One factor does always ring true for storage: it depends.

      1. JeffyPoooh
        Pint

        Re: "...a 7.6-8 per cent impact is pretty rough..."

        AC countered, "...storage growth is exponential. ...isn't."

        To be clear, you know that we're referring to storage SPEED, right?

        Speed. So I'd say that it's not far off Moore's Law, step-wise generations of course. Even if it isn't exactly, then scratch out "2 months" and write in "3 months" (or whatever).

        So, the point remains (more or less) valid.

  8. woodcruft

    ZFS performance hit? Surely he jests.

    I saw the headline and thought: that's funny, I didn't know FreeBSD or illumos, i.e. systems where ZFS is native, had yet produced patches.

    That's because they haven't and this merkin/"systems engineer" is testing his Franken-stack consisting of VMware, Ubuntu, ZFS, NFS plus other arbitrary members of kitchen sink and coming to conclusions about how CPU microcode updates as supplied by VMware affect ZFS performance on his "house built on sand" ... that hopefully resembles no other in the known universe.

    TL;DR: He's an idiot and wait for FreeBSD to produce patches for meltdown/spectre and them to be tested in the real world rather than matey's "lab".
