Microsoft SQL Server 14 man: 'Nothing stops a Hekaton transaction'

Microsoft’s SQL Server 14 will include a new memory optimisation feature code-named Hekaton, with the potential for huge performance gains – provided your data fits in RAM. Hekaton is not a separate product, but will be part of enterprise editions of SQL Server 14, for which no availability date has been announced. The latest …

COMMENTS

This topic is closed for new posts.
  1. Robinson
    Meh

    Question...

    So, as there's a significant lag between committing the transaction in memory and writing it to disk, if your instance crashes, all of those transactions are lost. Sounds more like a method for caching stuff you don't really care about all that much, rather than some important improvement here.

    1. FutureShock999

      Re: Question...

      Most databases use write-behind caches for performance reasons. That is why they are always UPS powered, etc. There is always a slight risk, but you have that for EVERYTHING that has not yet been mirrored to a secure, off-site backup.

      Very interested to see how this plays with High Availability clusters behind it, to your point.

      1. Stephen Channell
        Meh

        Re: Question...

        Log writes are always synchronous; a UPS enables the RDBMS to checkpoint before shutdown, to minimise the delay in instance recovery at restart. Highly critical TP either uses synchronous replication (e.g. SRDF) or App/TPMS multi-site routing.
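
        (In T-SQL terms that pre-shutdown flush is just something like:

        checkpoint;  -- flush dirty pages now, so recovery has less log to replay
        shutdown;    -- a plain shutdown also checkpoints each database first

        ...which is what the UPS buys you time to do.)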

    2. Stephen Channell
      Meh

      Re: Question...

      I don’t think it said that. In all (non-toy) RDBMS the transaction commits once the log entry is written, with data-pages updated on memory-eviction or from the transaction log at instance recovery. Eventual-consistency wouldn’t work with transactions spanning in-memory & on-disk tables.

      Looks like an MS response to SAP HANA, and it will get really interesting when in-memory can be applied to a partition and not just a whole table.

    3. Parax

      Re: Question...

      SQL transactions only commit once the log is physically written. You will not lose data if you crash.

      Log writing is a synchronous write, hence the reason you'd put the log on a separate physical volume (preferably mirrored, not RAID 5 - no parity-calculation overhead is fastest).
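
      Something like this, for illustration (drive letters made up, obviously):

      create database Sales
      on (name = Sales_dat, filename = 'D:\data\Sales.mdf')      -- data on the data volume
      log on (name = Sales_log, filename = 'L:\log\Sales.ldf');  -- log on its own mirrored pair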

      There are many ways this will be welcome: you can build fast-access tables separate from your main data, i.e. contact search fields could be held in memory whilst all non-searchable data resides on disk. This would result in faster front-end searches.

      It would be even better if you could specify which columns of a table were to be memory-resident rather than just whole tables - like having memory-resident indexes for a table held on disk.
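
      For what it's worth, the CTP syntax doing the rounds looks something like this (my own made-up table, and it may well change before release):

      create table dbo.ContactSearch (
          ContactId int not null primary key nonclustered hash with (bucket_count = 1000000),
          SearchName nvarchar(100) not null
      ) with (memory_optimized = on, durability = schema_and_data);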

      1. Anonymous Coward
        Anonymous Coward

        Re: Question...

        We had pinned tables in SQLS2000 (yes, I know that is not the same as Hekaton).
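
        From memory it went something like this (table name invented; deprecated long ago, so don't try this at home):

        use Sales;
        declare @db int, @obj int;
        select @db = db_id(), @obj = object_id('dbo.Contacts');
        dbcc pintable (@db, @obj);  -- pages stay in the buffer pool once read in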

        I don't think my boss will buy off on 500+GB of RAM somehow, or a machine that might house it.

        Will this improve query search speed versus a query on data which is 100% already in the regular buffer (i.e. zero IO anyway)? The code path would be shorter, obviously, but how much better would it be?

        1. Anonymous Coward
          Anonymous Coward

          Re: Question...

          "I don't think my boss will buy off on 500+GB of RAM "

          I get this all the time.

          (http://www.crucial.com/uk/store/partspecs.aspx?IMODULE=BLS4C8G3D18ADS3BEU)

          At £3024 plus VAT, this provides a factor-27 performance improvement. Conservatively, even with a cheap SQL expert, that's only a week's troubleshooting.

          1. BlueGreen

            Re: Question...

            And from that page, the words "NON-ECC" jump out. That could sink your business. Don't buy cheap, don't buy non-server components, don't install it without a soak test, don't omit a UPS, don't put it in a bad environment, do monitor it steadily... don't do stuff unless you know what you're doing; in summary, be professional. Recommending non-ECC mem isn't.

            And btw consider the cost of a machine that can *take* 1/2 a terabyte.

          2. Anonymous Coward
            Anonymous Coward

            Re: Question...

            Well, I sure as hell wish I was billing 3000 squid a week. Oh wait, I got that currency wrong --- convert, convert, convert. It's more, but I digress.

            Curious: how much does the rest of the box (it has to fit in the blade rack) cost, to put 1/2 TB in?

            And if you believe the factor of 27, then I have some great beachside properties in Florida for sale!

      2. Robinson
        Happy

        Re: Question...

        Well, that's kind of the answer. This doesn't affect the latency of the client/server interaction much, if at all; what it does do is improve the parallelism within the server itself when it's managing hundreds or thousands of clients at once.

        I see.

  2. This post has been deleted by its author

    1. Anonymous Coward
      Anonymous Coward

      Re: Nothing can stop a Hekaton transaction

      A megaton transaction can.

      1. Parax

        Re: Nothing can stop a Hekaton transaction

        and a metric fuckton beats that hands down...

      2. JLV
        Trollface

        Re: Nothing can stop a Hekaton transaction

        >A megaton transaction can.

        Thus resulting in an hecatomb.

    2. Philip Lewis
      Facepalm

      Re: Nothing can stop a Hekaton transaction

      Except an out-of-memory exception?

  3. Fat Northerner

    Been asking for it for years for store and forward concepts.

    After I wrote an ecommerce website (my first venture in), I came up with the concept of millions of nodes using an identical code base, with a shared transaction log. I've been waiting forever for this solution.

    1. This post has been deleted by its author

      1. Fat Northerner

        Re: Been asking for it for years for store and forward concepts.

        My idea was a system where each node models, say, "a low-paid shop assistant", and the "transaction log" is merely a "stock added" / "stock sold" / "schema change" list, thus providing a very long feed and audit trail.

        Before each move, the node asks the central server, or one of its neighbours, if there are any more transactions, and applies them before it does anything.

        Thus, if you want more machines, you simply add another one and set its latest-transaction marker to zero. It then rapidly builds the store representation from nothing, after which it announces it is ready to serve.

        Of course, the cloud bypasses lots of the problems it was intended to solve, but only if you trust it.

        1. Anonymous Coward
          Anonymous Coward

          Re: Been asking for it for years for store and forward concepts.

          The Reliable Transaction Router?

      2. Fat Northerner

        Re: Been asking for it for years for store and forward concepts.

        The other thing is that I've started turning up at places where they want to squeeze more performance out of a box, and aside from one or two massive write-only audit tables, the whole of the rest of the DB would fit, indexes and all, into RAM with tons to spare.

  4. Cliff

    Actually rather cool

    RAM is cheap in enterprise terms, and this sounds well architected; well done to the MSSQL team.

  5. Will 28

    I don't get it. How is this different to the PIN functionality from SS7

    I'm sure there's something more flashy going on here, but wasn't there some PIN functionality in SQL Server 7 that loaded a table into RAM? I seem to remember that it was deprecated because of potential instability and little performance gain.

    The concept of shoving the table into RAM isn't new, so what's the big change? It's nearly done anyway by the clustered index cache, so there must be some game changer here I'm not understanding.

    Also, how's it going to work with regard to clustering? Will the pinned state of the table be shared among servers?

    It all seems a bit airy to me, a bit like when they announced the Filestream data type, which turned out to be a complete non-event.

    1. Stephen Channell
      Meh

      Re: I don't get it. How is this different to the PIN functionality from SS7

      64-bit addressing + large block size (32k+ vs 4k) + pointers instead of BTree

    2. david 12 Silver badge

      Re: I don't get it. How is this different to the PIN functionality from SS7

      Yes. Also: "In addition, stored procedures can be compiled to native DLLs for fast access."

      I have to keep reminding people: when MS says something is a feature of a new edition, they HAVE NOT said that it is a new feature.

      1. Linbox

        Re: I don't get it. How is this different to the PIN functionality from SS7

        The "native code" stored procedures is very definitely new and pretty significant.

        After they introduced the (IMHO) disastrous CLR fuck-up-waiting-to-happen, which allowed any owner of "VB for Dummies" to write dogshit procedures, actually going back, admitting CLR is crap and putting the effort into getting T-SQL procedures to compile down to native code is the best news I've heard in weeks. (I'll be heartbroken if this is another "Enterprise-Only" feature. I'm still boo-hoo-ing about Column Store Indexes).
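
        For reference, the CTP examples knocking about look roughly like this (names invented; could all change before RTM):

        create procedure dbo.usp_GetContact @id int
        with native_compilation, schemabinding, execute as owner
        as begin atomic with (transaction isolation level = snapshot, language = N'us_english')
            select SearchName from dbo.ContactSearch where ContactId = @id;
        end;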

        Your mileage may vary.

        1. Anonymous Coward
          Anonymous Coward

          Re: I don't get it. How is this different to the PIN functionality from SS7

          State of the art 1984 (or maybe a bit prior) - welcome.

        2. david 12 Silver badge

          Re: I don't get it. How is this different to the PIN functionality from SS7

          SS7 did not allow CLR or COM objects, only DLLs for extended stored procedures.

          You knew that, right?

  6. Lee D Silver badge

    "If the data becomes too large, we stop inserting rows," Farlee said. There is no failover back to disk-based tables.

    Sigh. So like in-memory caching, but worse. Great. Cheers. I picture thousands of databases all coming abruptly to a halt at some point because someone "memory-optimised" a table that one day grows too large.

    1. AceRimmer
      Holmes

      This is why there is a position called DBA (Database Administrator), which should be filled by a suitably qualified person, not some spiv who thinks it's easy 'cause it's SQL Server.

  7. Anonymous Coward
    Joke

    World+Dog gasps at the innovation

    Who would have thought that loading compiled stored procedures and the entire table into memory would be faster? That definitely deserves a whole article on the Reg ...

    1. Anonymous Coward
      Anonymous Coward

      Re: World+Dog gasps at the innovation

      > Who would have thought that loading compiled stored procedures and the entire table into memory would be faster? That definitely deserves a whole article on the Reg ...

      You may have the joke icon, but that's just what was going through my mind.

      Loading a table into memory for speed is hardly an earth-shattering innovation; that's why we have disk caches, for crying out loud.

      It's a useful feature, and probably worthy of an article for those that are interested, but it is coming to something when this kind of thing provokes such a gush of awe.

      1. Haberdashist

        Re: World+Dog gasps at the innovation

        I'm not an MS guy, but how is this different from MySQL Cluster? Or even just a MEMORY table in MySQL? Storing and operating on entire tables in memory seems like it's been a standard feature in database servers for a very long time. Surely SQL Server already had an in-memory table type before this, right?
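
        (For the non-MySQL folk, it's literally just this - table name made up:

        create table session_cache (
            id int not null primary key,
            payload varchar(255)
        ) engine = memory;

        ...and its contents vanish on restart.)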

        1. JasonLaw

          Re: World+Dog gasps at the innovation

          Yes, SQL Server has pretty much always cached tables and you used to be able to 'PIN' them (not recommended).

          The problem with having tables cached like that is that they were still fundamentally designed and optimised for disk storage. There is a big difference between "designed to operate from memory" and "designed for disk but able to operate from memory".

          How big is that difference? Well, according to the article, about 8x quicker.

          1. Anonymous Coward
            Anonymous Coward

            Re: World+Dog gasps at the innovation

            The code path will be shorter and supposedly the access model for the rows slicker. Who knows whether it means anything in the real world. The benchmark means the code is suitable for the benchmark, nothing else.

            I trust MS and SQLServer about as far as I can throw them (I am a weed).

        2. Anonymous Coward
          Anonymous Coward

          Re: World+Dog gasps at the innovation

          "Or even just a MEMORY table in MySQL"

          It's exactly the same, but for all the MySQL security vulnerabilities - or it would be, were it not for the fact that rows and pointers are stored as objects rather than having loads of infrastructure round them.

    2. Linbox
      FAIL

      Re: World+Dog gasps at the innovation

      Think you might have missed the point of this website. For SQL Server bods, these two things are pretty significant, and whilst it's a fairly specialised subject, that's exactly why I read El Reg instead of the Daily Mail.

      I'm off to post "wow! Amaze-balls" on that article I didn't understand about deep-pipe-L3-caching processors (or whatever) ...

  8. Linbox
    WTF?

    SQL 14 ??!!

    We're only up to version 11.0 at the moment, with the next one presumably being v11.5?

    Did you miss '20' off the front of that version number, or have I fallen asleep in SQL Management Studio and woken up in 2024?

    (please god, tell the muppets not to call it SQL 2012R2).

    1. Richard Gadsden

      Re: SQL 14 ??!!

      Yes, it's SQL Server 2014, or v12.

      I think we can safely assume that SQL 14 is the version after that (Microsoft don't do version 13).

  9. Anonymous Coward
    Anonymous Coward

    Hang on ....

    from the original article ... "The way Hekaton works is that you declare a table as memory-optimised.."

    Emphasis on "you".

    I would have been more impressed if Hekaton decided how best to optimise things.

    Relying on coders to understand the underlying architecture seems a bit 1960s .....

    1. Anonymous Coward
      Anonymous Coward

      Re: Hang on ....

      Relying on coders to understand anything is a stretch!

      However, for the competent data analyst and DBA, understanding is the name of the game. Most competent DBAs know exactly the access footprint of their applications and can read it happening from the numbers in perfmon.

      1. asdf
        Trollface

        Re: Hang on ....

        >competent DBAs

        Eh?

        Yeah, too bad most of the shops I have been in that had them (read: could afford them) usually referred to me by the number on my badge instead of by my name. As for ripping on coders, that may be true for many, but remember it was coders who wrote the database software in the first place. In fact, in one of the shops I worked in, I saw a guy basically solo-build a cut-down in-memory RDBMS (with transaction log) using template metaprogramming code that would melt your face (the shop wanted to avoid runtime fees and open source stuff). Not fair to tar any computer profession as all hacks (except MCSEs and web designers (developers, hahaha) of course).

        1. asdf

          Re: Hang on ....

          I did misspeak in implying competent DBAs aren't worth the cost. They are, and most of the mom-and-pop shops that can't afford them stay startups until they fail. Also, don't get me wrong, I so prefer working on a system where the data schema has been designed by somebody who knows what the hell they are doing (the ultimate value-add of a competent DBA). A crappy schema almost always results in a spaghetti code base, almost regardless of the quality of the developers.

  10. This post has been deleted by its author

  11. BlueGreen

    Some possible answers

    1st off, this is MS advertising itself; note the 29x with pointless bar-chart screenshot. Caveat emptor, unless you're stupid.

    > When you look up an index it’s just a memory pointer to your data, as opposed to traversing a B-tree.

    A btree is a tree structure with an innate ordering. At the leaf level, IIRC, the db has to scan a page (8K) to find a value. This is true whether it's on disk or in memory (so tables in RAM will be so structured, pinned or not). I think they are likely to be using extendible hashing, which fits RAM access better and is just pointers to data (note, @skelband et al). The downside is that EH has no ordering (one reason it's not so commonly used despite having many advantages), which means that certain very common types of queries that work well on btrees don't work well with EH: find the largest widget size smaller than a certain size (say, 30 mm):

    select max(size) from widget where size < 30.

    I guess this is where the "There are also some code compatibility issues with compiling stored procedures" comes from.

    I suspect Teradata uses EH, from what I've read. I doubt there's anything novel here.

    > Compiling stored procedures to native code upped performance further: to 65,000 transactions per second.

    Yeah. Define 'transaction' here. Answer: probably whatever suits MS's setup best. I also think that compiling stored procs may not be of such great value, as after all they are already semi-compiled into query plans, and the cost of hitting some complex structures in memory may dominate the one-off cost of parsing some text. Improving the optimiser would often be far more valuable. I don't know what 'compiled' means here; I suspect deceptive marketing bullshit as usual (happy for MS to point out where I'm wrong).

    > When you amend a row in Hekaton, you create a new version in memory. When no transactions need to reference the old version, we reclaim the memory

    I think this is multiversion concurrency control (MVCC) and may be similar or identical to the snapshot isolation SQL Server already has. I don't know how that fits in with "Nothing can stop a Hekaton transaction", but that sounds like a challenge to me...

    Hah! the very next line says "If the transaction fails to commit, the changes are never written to disk". No contradiction here then; obviously transactions are defined as things that only succeed!

    And MVCC eats memory - all the sooner for mem to fill up, then? (that one's for @Lee D)
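
    (For reference, the snapshot isolation SQL Server already has is just this - any database name will do:

    alter database Sales set allow_snapshot_isolation on;
    set transaction isolation level snapshot;  -- per session

    ...readers see a consistent version while writers create new row versions in tempdb. Sound familiar?)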

    MS would do better fixing the existing problems rather than inventing new bolt-ons which are of specialised use. FYI I don't like MSSQL any more.

    All my own opinion, some of this may be wrong. Except the bit about MS should fix problems, and me not liking MSSQL, which are inviolably true.

    <http://en.wikipedia.org/wiki/Extendible_hashing>

    <http://en.wikipedia.org/wiki/Multiversion_concurrency_control>

    1. BlueGreen

      Re: Some possible answers

      A wee bit more clarification. Given the query "select max(size) from widget where size < 30" in a relational db, if there is no index on size then there must be a scan of all values to find the suitable one. Expensive. If there is an index, it being a btree (to be precise a b*tree, although wiki calls it a b+tree; seems to be the same thing), then the db can work its way down the tree to where 30 would sit, possibly with a quick scan of an 8k page at the end. For a billion items that could mean 'a few' reads (3 or 4 or a dozen, but logarithmically few). Scanning a billion items linearly - no thanks. Point is, EH doesn't have such indexes so the scan would appear inevitable (<http://www.informatik.uni-jena.de/dbis/lehre/ws2005/dbs1/ExHashOrig.pdf> P9 supports this).
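
      To be concrete, it all comes down to whether something like

      create index ix_widget_size on widget (size);

      exists. With it, a btree seek answers the max-below-30 query in a handful of page reads; with only a hash on size, you're back to scanning.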

      Also, MVCC has been used for ages. I believe Postgres has it and I'm pretty sure Interbase did as well (Interbase... shudder). It's very old tech.

      <http://en.wikipedia.org/wiki/B%2B_tree> - missed this one out.

  12. Anonymous Coward
    Anonymous Coward

    uh?

    "Nothing can stop a Hekaton transaction" but that sounds like a challenge to me...

    Hah! the very next line says "If the transaction fails to commit, the changes are never written to disk". No contradiction here then, obviously transactions are defined as things that only succeed!"

    I see no contradiction: commits always being possible, while rollbacks allow for the second clause.

    1. BlueGreen

      Re: uh?

      But if so, *all* transactions cannot be stopped, because if they fail they can always be rolled back, according to your view, so it's effectively a null claim.

      Or perhaps I misunderstand; let's try from another viewpoint: they say "If the transaction fails to commit" but then you say "commits always being possible". If the latter is true then the former can never happen.

      I'm pretty sure I can propose two transactions that interlock such that only one can commit, ergo the other cannot.
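
      Something like this, off the top of my head (made-up table, plain T-SQL, two sessions):

      -- session 1:
      begin tran;
      update acct set bal = bal - 10 where id = 1;
      -- session 2, meanwhile:
      --   begin tran;
      --   update acct set bal = bal - 10 where id = 2;
      --   update acct set bal = bal + 10 where id = 1;  -- blocks, waiting on session 1
      update acct set bal = bal + 10 where id = 2;       -- deadlock: one of us gets killed
      commit;

      One of those two gets picked as the victim and rolled back (or, under optimistic versioning, dies on a write-write conflict). Either way: stopped.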
