Neo4j CEO: We're at 'a huge inflection point for graph databases'

Emil Eifrem, CEO and co-founder of Neo Technology, says the world is at “a huge inflection point for graph databases” as his company, which supports the open source Neo4j graph database management system, releases v3.0 of the software. Ahead of releasing an architecturally overhauled v3.0 of the data management system, the …

  1. TRT Silver badge

    a huge inflection point for graph databases

    I can get you some cream for that...

  2. jake Silver badge

    "Inflection"?

    That word ... I don't think it means what you think it means.

    1. Anonymous Blowhard

      Re: "Inflection"?

      @jake: I don't think he means what you think he means.

      1. jake Silver badge

        Re: "Inflection"?

        My point is that "inflection" can go in either direction ... or orthogonally.

  3. Frederic Bloggs
    Coat

    What goes around, comes around

    I have been doing this indeterminate sentence called "IT" for a long time (with no imminent prospect of release), so it tickles me rather that "graph databases" are being touted as "new". In the early days of "databases", "graph" or "network" databases were the only game in town. Then some upstart, name of Codd, came along and said that they were all wrong and we should embrace some newfangled concept called "relational" databases instead.

    I know that one has to hype one's product up to stand a chance of getting it noticed - but I do wish that at least some acknowledgement of computing history were given, instead of hyping some "new" concept that has been around before - sometimes three or four times.

    I know, I know - I am an old codger and am on my way to get my knackered coat already... I just wish they would let me outside.

    1. Destroy All Monsters Silver badge
      Headmaster

      Re: What goes around, comes around

      Well, yes.

      It sounds to me like we could implement a Graph Retrieval Language (like Gremlin) on top of relational databases.

      Now, we need some good stats on Neo4x + GRL vs. RDBMS + GRL. Numbers, I want to see numbers!

      The interesting point is: how do you handle transactions (and what are the transactions?) in a graph database?

      1. Warm Braw

        Re: What goes around, comes around

        I certainly become instantly skeptical when I see claims such as "sometimes a million times faster than a relational database".

        All a graph is is a series of objects with connecting relationships (like "is a child of", "works for", "bought"), and those relationships may have properties and metrics (like a weight). However, if you imagine how you would store an arbitrary graph database, you come back essentially to a bunch of objects (like database records) and a relational table of tuples of the form <source-object-id, dest-object-id, relation, metric>.

        The problem graph databases purport to solve is to make queries of the form "find me groups of friends and friends of friends who are interested in chess" or "what are the best routes from Birmingham to Coventry" easier to express and quicker to answer. I can certainly buy the "easier to express" argument - there's no easy way to make such queries in SQL.

        But you still have to do essentially the same data lookups - find who is interested in chess, and then scan your relationship table once for each "friend of" relation for every degree of separation. Now you may precompute those lookups by some means - in traditional relational databases that would be an index on a single tuple, but there's no reason it couldn't be an index on related tuples - but each such precomputation makes inserting and deleting information slower (and, if you do it across related records, possibly prone to damaged referential integrity). So I hope someone can enlighten me on the crucial concept I'm missing that yields the magic million times.
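
        A minimal sketch of the scheme described above, assuming made-up names and data (EDGES, friends_within and interested_in are hypothetical and not taken from any product). Relationships are plain <source, dest, relation, metric> tuples, and each extra degree of separation costs one more full scan of the edge table:

```python
# Hypothetical edge table: (source-object-id, dest-object-id, relation, metric).
EDGES = [
    ("alice", "bob", "friend_of", 1.0),
    ("bob", "carol", "friend_of", 1.0),
    ("carol", "dave", "friend_of", 1.0),
    ("alice", "chess", "interested_in", 1.0),
    ("carol", "chess", "interested_in", 1.0),
    ("dave", "chess", "interested_in", 1.0),
]


def friends_within(edges, start, relation, max_hops):
    """Follow `relation` outwards from `start`, one full table scan per hop."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        next_frontier = set()
        # No index: every hop looks at every tuple in the table.
        for src, dst, rel, _metric in edges:
            if rel == relation and src in frontier and dst not in seen:
                next_frontier.add(dst)
        seen |= next_frontier
        frontier = next_frontier
    return seen - {start}


def interested_in(edges, topic):
    """Everyone with an 'interested_in' edge pointing at `topic`."""
    return {src for src, dst, rel, _metric in edges
            if rel == "interested_in" and dst == topic}


# Friends and friends-of-friends of alice who like chess.
chess_friends = friends_within(EDGES, "alice", "friend_of", 2) & interested_in(EDGES, "chess")
```

        The edges here are treated as directed, which is an assumption of the sketch. An index on (relation, source) would replace each full scan with a lookup, at the cost of extra work on every insert and delete, which is the trade-off the comment describes.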

        1. JMiles

          Re: What goes around, comes around

          Typically the relation from a vertex to an edge is stored as the physical location of the on-disk record, so the theory is that you avoid having to create an index and then perform an index lookup.

          In my previous post I mentioned using Postgres ctids - this id in Postgres is the physical location of the record on disk (it's what an index lookup might return). The caveat with ctids is that when a record is updated its ctid will change, so you'd have to update any references to the old ctid.
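
          A toy sketch of that trade-off, assuming a list of slots stands in for on-disk locations. Store, Record, and every method name are hypothetical, and this is not how Neo4j or Postgres actually lay out data. Following a stored slot number skips the index, but moving a record means rewriting every reference to its old slot:

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    neighbour_keys: list = field(default_factory=list)   # logical ids: need an index lookup
    neighbour_slots: list = field(default_factory=list)  # "physical" positions: direct access


class Store:
    def __init__(self):
        self.slots = []   # stands in for on-disk record locations
        self.index = {}   # key -> slot number, i.e. what an index lookup returns

    def insert(self, key):
        self.slots.append(Record(key))
        slot = len(self.slots) - 1
        self.index[key] = slot
        return slot

    def link(self, src_key, dst_key):
        src = self.slots[self.index[src_key]]
        src.neighbour_keys.append(dst_key)
        src.neighbour_slots.append(self.index[dst_key])

    def neighbours_via_index(self, key):
        rec = self.slots[self.index[key]]
        # One index lookup per neighbour.
        return [self.slots[self.index[k]].key for k in rec.neighbour_keys]

    def neighbours_via_slots(self, slot):
        # No index involved: jump straight to the stored locations.
        return [self.slots[s].key for s in self.slots[slot].neighbour_slots]

    def relocate(self, key):
        """Simulate an update that writes the record to a new location."""
        old = self.index[key]
        rec = self.slots[old]
        self.slots[old] = None          # old location is now dead
        self.slots.append(rec)
        new = len(self.slots) - 1
        self.index[key] = new
        # The write penalty: every stored reference to the old slot is rewritten.
        for other in self.slots:
            if other is not None:
                other.neighbour_slots = [new if s == old else s
                                         for s in other.neighbour_slots]
        return new
```

          In this sketch the index-based path only needs self.index updated on a move, while the slot-based path needs a pass over every record that might point at the moved one.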

    2. Doctor Syntax Silver badge

      Re: What goes around, comes around

      "I am an old codger"

      As one old codger to another - I can't remember a time when there were only four databases (presumably he means products). Can you?

  4. JMiles

    Relational databases are king. Except when your schema changes frequently (common in applications these days) - don't let RDBMS DBAs fool you by telling you ALTER TABLE can support these use cases; in most databases it can't.

    As for Gremlin on top of SQL - it's possible but the whole point of Graph databases is to avoid JOINs when traversing a relationship. This can be done in things like Postgres where you can have an array column containing a list of ctids and its performance should be comparable to Graph databases using this approach but write operations will be slower (since you need to keep ctids up-to-date on UPDATE operations).

    1. Charlie Clark Silver badge

      At the end of the day you can trade flexibility for performance. The relational model specifically tries to separate the logical from the physical and this leaves room for specific optimisations.

      RDBMS excel at consistency because this is the most valuable (corrupt data is worthless) and expensive. There are use cases where consistency is less relevant and, as in graph databases, you're more interested in the metadata (relationships) than the data itself. As JSONB shows: you can happily use relational tools to manage this.

      Frequent schema changes shouldn't really be the problem they are. But this is really a problem of tooling and not of the relational model.

      I guess the NoSQL world has highlighted the pain points for some of the use cases at scale. They've been a wake-up call to some fairly complacent RDBMS vendors (Oracle does some amazing stuff but at a price). It's great to see Postgres being the focus for much of the development: FDW, JSONB, Column Stores, parallel queries, etc.

      Back to the article: the guy seems pretty switched on. While we all love our open source tools, the corporate environment sees risks other than the cost of licensing: support, further development, documentation, etc. So, it's nice to see corporates engaging with open source projects that 10 years ago they might have avoided.

    2. Doctor Syntax Silver badge

      "Relational databases are king. Except when your schema changes frequently (common in applications these days) - don't let RDBMS DBAs fool you by telling you ALTER TABLE can support these use cases; in most databases it can't."

      That's why we have all these non-relational databases these days. For the Agile developers who can't get it right first time. Or second. Or third....

      And, lo, we have .............. DevOps.

  5. David Harper 1
    FAIL

    One major flaw with the free version of Neo4j

    >> “Our summary is, whenever you can use MySQL free, you can use Neo4j community for free,” said Eifrem.

    Err, no.

    You cannot perform hot backups on a running instance of Neo4j community edition. You must stop the instance, then back up the files, then restart the instance. For hot backups, you must buy the enterprise edition.

    You CAN perform a hot backup on a running instance of the MySQL community edition, using a free, open source tool such as Percona XtraBackup. Free database server, free hot backup tool.

    You do back up your databases regularly, right?

  6. EvilBanana

    They need to stop stealing my plans to take over the world. I had these ideas 10 years ago.

  7. a_yank_lurker

    Basic Data Model

    The best db is the one that naturally fits the data while providing the required performance/features. No one db type is suitable for all applications, even if they can be shoehorned to sort of work. Graph databases are no exception.

  8. Gary Bickford

    Linked Data and triple stores?

    I'd be interested to see how well Neo4j works as a triple store. The Linked Data/RDF protocols are based on relations in the form subject, predicate, object. This structure generalizes to support every kind of database application easily, but at the cost of cycles and storage. Ultimate flexibility has a price.
