How NoSQL graph databases still usurp relational dynasties

Despite being assaulted from all sides, the relational model for databases is still the king of the hill and it looks like it will not only survive, but thrive as well. NoSQL databases have become increasingly popular and have been offering a number of data and deployment modes that have overcome the limitations – real or …

  1. happy but not clappy

    Yes but no but

    This is 50% right, as I have come to understand it. Yes, SQL is crap at expressing these things, as it has no explicit looping and works in sets, not linear chains. There are kludges to fix this, of course, via stored procedures, stored tables and nasty operators (CONNECT BY, anyone?), but it isn't pretty. Impedance mismatch, yes.

    It is, however, pretty easy to encode them in relational databases (e.g. two tables, one of nodes, and one of edges). Directions are also easy to express.
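As a rough illustration of the two-table encoding described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`nodes`, `edges`), the column names and the sample data are assumptions made for the example, not anything taken from the article. Direction is expressed simply by which column a node sits in.

```python
import sqlite3

def build_graph(conn):
    # One table of nodes, one of directed edges (src -> dst).
    # Table and column names are illustrative assumptions.
    conn.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE edges (
            src INTEGER NOT NULL REFERENCES nodes(id),
            dst INTEGER NOT NULL REFERENCES nodes(id),
            PRIMARY KEY (src, dst)
        );
    """)
    conn.executemany("INSERT INTO nodes (id, name) VALUES (?, ?)",
                     [(1, "alice"), (2, "bob"), (3, "carol")])
    conn.executemany("INSERT INTO edges (src, dst) VALUES (?, ?)",
                     [(1, 2), (2, 3)])

def out_neighbours(conn, name):
    # Follow edges in their stored direction from the named node.
    rows = conn.execute("""
        SELECT n2.name
        FROM nodes n1
        JOIN edges e ON e.src = n1.id
        JOIN nodes n2 ON n2.id = e.dst
        WHERE n1.name = ?
        ORDER BY n2.name
    """, (name,))
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_graph(conn)
```

Swapping `e.src` and `e.dst` in the joins would walk the edges backwards, which is all "direction" amounts to in this encoding.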

    So the problem here is one of semantic power, and that is something that is easily added, so I think graph extensions to SQL will appear, and performance will be equivalent if not better, as soon as someone can be bothered to add them to MariaDB. Like all good ideas, it has probably already happened.

    1. Anonymous Coward

      Re: Yes but no but

      Many data models fundamentally do not fit the relational model. For data models that have entities with a large, variable number of attributes, vertical tables and sparse wide tables are hopeless, inefficient kludges that distort the true data structure.

      The relational model does have its place, and probably always will, but the big realisation with the NoSQL movement is that one size doesn't fit all, nor does it have to.

      1. Charlie Clark Silver badge

        Re: Yes but no but

        > The relational model does have its place, and probably always will, but the big realisation with the NoSQL movement is that one size doesn't fit all, nor does it have to.

        Bollocks. In general, an RDBMS is exactly what you want, but you'll have to learn how to configure and use it properly. It grew out of Codd's reasoned arguments against the problems associated with the non-relational databases of the 1960s, many of which plague the NoSQL systems of today: "consistency, who needs it?".

        The NoSQL approach grew out of some niche use cases which the software industry suddenly turned into general problems: volatile document stores, time-series data.

        1. Anonymous Coward

          Re: Yes but no but

          Wow, so many confused ideas: ACID properties, data model, scale, efficiency, redundancy.

          You do realise that the 1960s were over 40 years ago? In computing, that's practically the dark ages.

          We have better understanding, technology and theory now compared to then. That's not to say that there is anything wrong with the relational model and all the theory and tech. surrounding it. I'm really fed up with the religious warring over this subject. It's become somewhat like the vi/EMACS debate.

          > In general, an RDBMS is exactly what you want ...

          Many "so-called" relational solutions to certain data model cases are nothing more than kludges wrapped up in established practice, such as hiving off large numbers of sparse attributes into secondary vertical tables, which makes any kind of join reconstruction of the original entity enough to drive an admin to suicide.

          If all you have is a relational hammer, then all your data modelling problems look like nails.
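To make the vertical-table complaint concrete, here is a small sketch of that pattern in Python with `sqlite3`. The `entity` and `entity_attr` tables and the sample attributes are hypothetical. The point it tries to show is that getting attributes back as columns costs one extra join per attribute, whereas reading one entity row by row is cheap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a core entity table plus a "vertical" table
# holding one row per (entity, attribute) pair.
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE entity_attr (
        entity_id INTEGER NOT NULL REFERENCES entity(id),
        attr TEXT NOT NULL,
        value TEXT,
        PRIMARY KEY (entity_id, attr)
    );
    INSERT INTO entity VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO entity_attr VALUES
        (1, 'colour', 'red'), (1, 'weight_kg', '2.5'),
        (2, 'colour', 'blue');
""")

def load_entity(conn, entity_id):
    """Rebuild one entity as a dict from its vertical attribute rows."""
    row = conn.execute(
        "SELECT name FROM entity WHERE id = ?", (entity_id,)).fetchone()
    if row is None:
        return None
    attrs = dict(conn.execute(
        "SELECT attr, value FROM entity_attr WHERE entity_id = ?",
        (entity_id,)))
    return {"name": row[0], **attrs}

def pivot_rows(conn, attr_names):
    """Return (name, attr1, attr2, ...) tuples: one LEFT JOIN per attribute."""
    n = len(attr_names)
    joins = "".join(
        f" LEFT JOIN entity_attr a{i}"
        f" ON a{i}.entity_id = e.id AND a{i}.attr = ?"
        for i in range(n))
    cols = "".join(f", a{i}.value" for i in range(n))
    sql = f"SELECT e.name{cols} FROM entity e{joins} ORDER BY e.id"
    return conn.execute(sql, list(attr_names)).fetchall()
```

With a few hundred sparse attributes, `pivot_rows` would generate a few hundred joins, which is roughly the situation the comment is grumbling about.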

    2. Anonymous Curd

      Re: Yes but no but

      "It is, however, pretty easy to encode them in relational databases (e.g. two tables, one of nodes, and one of edges). Directions are also easy to express."

      That's great, but it's a complete anti-pattern. You've effectively created a write-only graph store.

      Take a gander back at the article, for example.

      "but writing a SQL statement that will find all my friends' friends that are two nodes away should always cause a program a severe migraine..."

      Finding nodes 2 hops away isn't too complex a SQL query, just two joins really. What about 3 hops? Well, we're suddenly into N+1 territory, and it's swiftly downhill from there. Heaven help you if you want to do something actually Graph-y and iterative like PageRank.
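For a sense of how fixed-depth queries grow, here is a hedged sketch in Python with `sqlite3`, assuming a single hypothetical `edges(src, dst)` table of directed "friend" links. Each additional hop adds one more self-join of the same table, so the query text has to be rebuilt for every depth.

```python
import sqlite3

def k_hop(conn, start, k):
    """Nodes reachable from `start` by a walk of exactly k edges (k >= 1)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # One self-join of `edges` per hop beyond the first.
    joins = "".join(
        f" JOIN edges e{i} ON e{i}.src = e{i - 1}.dst"
        for i in range(2, k + 1))
    sql = (f"SELECT DISTINCT e{k}.dst FROM edges e1{joins} "
           f"WHERE e1.src = ? ORDER BY e{k}.dst")
    return [r[0] for r in conn.execute(sql, (start,))]

conn = sqlite3.connect(":memory:")
# Illustrative data only.
conn.execute(
    "CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL, "
    "PRIMARY KEY (src, dst))")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("me", "ann"), ("me", "bob"), ("ann", "cat"),
     ("bob", "cat"), ("cat", "dan")])
```

Here `k_hop(conn, "me", 2)` is the friends-of-friends case; how badly the longer chains perform depends on the engine and its indexes, which this sketch does not measure.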

      The point of a graph store isn't to store something an RDBMS can't. There isn't any such thing. Throw enough link tables and type tables and joins at a problem and you can model anything. The point is to support efficient, expressive querying of that data in a way an RDBMS will really, really struggle with. That's the point of NoSQL stores in general.

      That's also why you find there are two distinct graph technologies. You've got those concerned with storing graphs and doing rapid retrieval of small subsets (like Neo, Titan etc.), and you've got those models concerned with doing bulk analysis of a whole graph, usually based on Pregel.

      One area where graph systems are hammering RDBMS in enterprise land at the minute is MDM/DQ platforms. Dig under the hood of the latest offerings from any of the big boys there and you'll find their traditional RDBMS backends have been torn out for hipster graph backends. They're just better for that kind of any-entity-to-any-entity model, and that is often what we're dealing with in real life with real data.

  2. a_yank_lurker

    Best Tool

    The real issue is not which is always better but which tool best fits the problem at hand. Databases are evolving into several different types, each best suited to a specific set of problems.

    1. Nick Ryan Silver badge

      Re: Best Tool

      A serious problem is where proponents of one technology or another attempt to force use of it in fields where it's not ideal.

      Yes, a NoSQL or unstructured database can represent users and credentials; however, an SQL database tends to do this better and more efficiently. On the other hand, associating arbitrary data with a particular user sometimes lends itself more to NoSQL than to SQL. Similarly, representing an arbitrary tree structure or membership for a field value is something that neither standard SQL nor NoSQL does particularly efficiently, which is where the flattened reporting databases come into play, sharing features of both SQL and NoSQL.

      Ideally I'd like a seamless NoSQL and SQL database where the most appropriate storage method can be used without needing multiple independent database connections, which effectively prevent transactional functionality.

      1. Charlie Clark Silver badge

        Re: Best Tool

        > Ideally I'd like a seamless NoSQL and SQL database where the most appropriate storage method can be used

        What, you mean like Postgres? JSON/hstore support, vertical column support, parallelism, etc.

  3. BinkyTheHorse

    No mention of OrientDB?

    I find it weird it wasn't at least name-dropped, especially in the context of bridging the "traditional"/relational model gap.

    That DBMS has a real killer feature here - the "main" DSL is basically adapted SQL, to the point where simple, "document-oriented" CRUD queries are syntactically valid in a relational DB. Substantially lowers the learning curve, let me tell you.

  4. Anonymous Coward

    Someday?

    > Relational engines might someday find a way to optimize graph-style queries, but writing a SQL statement that will find all my friends' friends that are two nodes away should always cause a program a severe migraine.

    What an odd statement. Oracle has had 'connect by' queries since at least version 8i, which was released in 1998. All the other main RDBMSs allow recursive queries.

    Ironically your example - finding friends' friends - is a fixed two level depth query and therefore easily solved using standard SQL and could be done using Oracle/Ingres/RDB/Sybase/DB2 at least as early as 1985.

    1. Nick Ryan Silver badge

      Re: Someday?

      > Ironically your example - finding friends' friends - is a fixed two level depth query and therefore easily solved using standard SQL and could be done using Oracle/Ingres/RDB/Sybase/DB2 at least as early as 1985.

      I'm glad I wasn't the only one wondering what was so hard about this query.

      Finding depth at an arbitrary, programmatic level is a little more interesting on the SQL front, but a fixed query of "my friends" or "my friends' friends" is simple - as long as the database hasn't been designed by a muppet, of course.
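For the arbitrary-depth case, one common approach is a recursive common table expression, with the depth passed in as a parameter. The sketch below uses Python's `sqlite3` (SQLite supports `WITH RECURSIVE`); the `edges(src, dst)` table and the sample friendships are assumptions for illustration. The depth bound also keeps the walk finite when the friendship data contains cycles.

```python
import sqlite3

def within_depth(conn, start, max_depth):
    """Map each node within max_depth hops of `start` to its minimum hop count."""
    rows = conn.execute("""
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.depth + 1
            FROM walk w JOIN edges e ON e.src = w.node
            WHERE w.depth < ?
        )
        SELECT node, MIN(depth)
        FROM walk
        WHERE depth > 0 AND node <> ?
        GROUP BY node
        ORDER BY MIN(depth), node
    """, (start, max_depth, start))
    return dict(rows)

conn = sqlite3.connect(":memory:")
# Illustrative data only; note the me <-> ann cycle.
conn.execute(
    "CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL, "
    "PRIMARY KEY (src, dst))")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("me", "ann"), ("ann", "me"), ("ann", "cat"), ("cat", "dan")])
```

The same statement serves "my friends" (`max_depth=1`), "my friends' friends" (`max_depth=2`) and anything deeper, without rewriting the SQL.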

  5. Charlie Clark Silver badge

    > Traditional database vendors, though, are fighting back. Microsoft's SQL server (as of version 2016) offers a way to store and retrieve JSON data in a relatively painless way, although the data itself is stored in the relational engine.

    Does the author only know MS SQL Server? Certainly looks like it.

    JSON support has been in Postgres for a while and Postgres 9.5 adds binary support and indexing.

  6. Anonymous Coward

    As someone else pointed out, graphs can be simply realised in an RDBMS using tables for nodes and edges. Insertion can be managed by a stored procedure of roughly 75 lines, including checks for cycles in a directed acyclic graph. Joe Celko describes just such a pattern in his SQL For Smarties book.
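The stored procedure itself isn't reproduced here, so the following is only a sketch of the general idea of a cycle-checked insert, not Celko's version. It is written in Python against `sqlite3`, with a hypothetical `edges(src, dst)` table: before adding `src -> dst`, it checks whether `src` is already reachable from `dst`, in which case the new edge would close a cycle.

```python
import sqlite3

def add_edge(conn, src, dst):
    """Insert src -> dst unless it would create a cycle.

    Returns True if the edge is allowed (and present afterwards),
    False if it was rejected.
    """
    if src == dst:
        return False  # a self-loop is the smallest possible cycle
    creates_cycle = conn.execute("""
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT e.dst FROM reach r JOIN edges e ON e.src = r.node
        )
        SELECT 1 FROM reach WHERE node = ? LIMIT 1
    """, (dst, src)).fetchone()
    if creates_cycle:
        return False
    conn.execute(
        "INSERT OR IGNORE INTO edges (src, dst) VALUES (?, ?)", (src, dst))
    return True

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL, "
    "PRIMARY KEY (src, dst))")
```

This assumes a single writer; with concurrent writers the check and the insert would need to run inside one transaction with suitable locking.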

    1. Anonymous Coward

      > ...using tables for nodes and edges.

      Maybe so, but following paths through that structure is decidedly iterative, which is fundamentally different from the set-based underpinnings of the relational query model. The alternative is a query that joins the same table against itself n times, which is why it scales so badly.

  7. Bibbit

    PostgreSQL?

    Strange the writer omitted to mention it given it had JSON capabilities the last time I looked.

  8. Justin Pasher

    Graphing nodes

    The ltree module in Postgres (which has been around for over 10 years) pretty much does what you are talking about (finding node siblings, parents, children, etc). How well it does at massive scale, I couldn't say (I've only used it at relatively small scale).

  9. swm

    Relational Databases are Very Powerful

    I manage a web site for square dancing. When I took over this job it was a mess of editing HTML so I thought that this would be a good time to clean things up. At the same time I thought that this would be a good time to learn about databases. I downloaded sqlite3 and a Java interface - best decision I ever made. I can easily add dances, remove dances, add flyers to dances etc. and remake the entire web site in under 10 seconds on a slow archaic machine. I can even sort by club, caller etc. with very little effort. Even the SQL statements that make the web pages are in the database.

    I think one of the problems is many computer scientists' lack of familiarity with relational databases. It is a different way of thinking (like PROLOG is a different way of thinking).

    <plug>

    Look at squaredancingrochester.org to see the web site.

    </plug>

    1. oliversalmon

      Re: Relational Databases are Very Powerful

      I'm hoping that this is a very clever bit of satire

  10. wondersonic

    If I may...

    It's maybe time to discover a new technology: Big Data Spatial and Graph with PGX Parallel In-Memory analytics graph engine

    http://download.oracle.com/otndocs/products/bigdata-spatialandgraph/oracle-bdsg-customer-overview-july-2015.pdf
