EnterpriseDB cranks up Postgres database scalability

EnterpriseDB, the commercial entity that is helping to push development of the open source PostgreSQL relational database – and profit from it – is trying to keep the heat on database rival Oracle with its latest 9.2 release and a matching 3.0 update of its database management console. The PostgreSQL project put out its beta …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward

    350,000 queries per second until...

    ...the query planner screws up. PostgreSQL relies heavily on statistics to determine the order in which conditions should be applied to queries. The goal is to apply the most selective conditions first and merge results in a manner appropriate for the number of rows involved. Most of the time it works well. On the other hand, there's that rare case where it goes horribly wrong. Anything could be the cause. Maybe your storage is slower because a RAID is rebuilding. Maybe a temporary software bug in your app caused a load glitch. It can simply be that a customer of your app accidentally searches for information dated in the year 212 rather than 2012. PostgreSQL now decides that no indexes are valid for the query, or maybe temporary indexes should be used, or maybe the whole thing should be a filtered cartesian product that won't finish before the death of the Universe.

    It's in those rare cases that you're screwed. You frantically search the Internet for everyone's superstitious tricks: increase the statistics buckets, vacuum analyze, more autovacuum processes, delete indexes, create indexes, disable plans, tune the cost estimates, and on forever. Can you manually specify a query plan to get up and running quickly? Absolutely not! It's forbidden in the PostgreSQL world.

    It happens again, and again. Resources used by malfunctioning queries are causing autovacuum to stall. Queries that used to work are failing one after another. You're not screwed anymore. You're dead. PostgreSQL's data is in a state where only a day of downtime can get the statistics and indexes working again. And after that day of downtime, you still have no guarantee that all the voodoo tuning has made it any better. You can only wait for it to happen again.
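
    For the record, those incantations look roughly like this (the table, column and values are invented for illustration, not a recipe):

        -- bump the statistics target on a column the planner keeps misjudging
        ALTER TABLE orders ALTER COLUMN created_at SET STATISTICS 1000;

        -- refresh the statistics and clear out dead rows
        VACUUM ANALYZE orders;

        -- discourage the plan shape you suspect is the culprit (session only)
        SET enable_nestloop = off;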

    1. Charlie Clark Silver badge

      Re: 350,000 queries per second until...

      You might very well be right. And there is nothing stopping you from proposing a fix and submitting a patch. Try that with Oracle.

    2. Glyn Astill
      FAIL

      Re: 350,000 queries per second until...

      Most of your gripes sound like they're down to bad database design / configuration / management to me. Postgres doesn't have hints; it's been a long-running argument, and the general consensus is that they encourage bad habits and scale worse than the planner alone in all but very exceptional cases. There has been some recent discussion about implementing hints and having an extra GUC parameter to obey or ignore them; I'm not sure where that's going.

      Your RAID issues? Well, that's a hardware problem, but you can easily reconfigure cost variables to smooth things out. Index issues? Again, that sounds like bad design to me.
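
      By "reconfigure cost variables" I mean something along these lines (values are only illustrative, tune them to your own hardware, and they can go in postgresql.conf rather than per session):

          -- random I/O isn't 4x the cost of sequential on decent storage (default 4.0)
          SET random_page_cost = 1.5;

          -- give the planner a realistic idea of how much data can be cached
          SET effective_cache_size = '8GB';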

    3. BlueGreen

      Re: 350,000 queries per second until...

      > PostgreSQL relies heavily on statistics to determine the order in which conditions should be applied to queries

      like all modern DBs I believe. Stats-based optimisers were described as the 2nd biggest step forward in database technology (can't remember what the 1st was. Poss dynamic programming based on the stats).

      > On the other hand, there's that rare case where it goes horribly wrong

      like all modern DBs, no question about it.

      You check your stuff is well designed (as others here have clearly said), and when the inevitable happens, you deal with it. A good start is to simplify the query. This may involve revisiting the DB design, and it may take breaking a query down into two with a materialised table in between (not optimal). Or you could look here if it helps: <http://www.postgresql.org/docs/current/static/runtime-config-query.html>.
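
      By "a materialised table in between" I mean something crude like this (table and column names invented for the sake of the example):

          -- step 1: materialise the selective half of the query once
          CREATE TEMP TABLE recent_orders AS
              SELECT id, customer_id, total
              FROM orders
              WHERE created_at >= date '2012-01-01';

          -- give the planner stats on the new table before using it
          ANALYZE recent_orders;

          -- step 2: run the rest of the original query against it
          SELECT c.name, sum(r.total)
          FROM recent_orders r
          JOIN customers c ON c.id = r.customer_id
          GROUP BY c.name;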

      It can be very hard to debug a large non-performing query, so your chance of fixing one, even if you could specify the query plan, is small. On a really complex query you sometimes can't tell it's significantly suboptimal at all without serious, time-consuming digging.
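
      The digging usually starts with EXPLAIN ANALYZE and comparing estimated rows against actual rows (same invented tables as above):

          -- wherever "rows=" and "actual rows=" diverge wildly, that's where to look
          EXPLAIN (ANALYZE, BUFFERS)
          SELECT c.name, count(*)
          FROM orders o
          JOIN customers c ON c.id = o.customer_id
          WHERE o.created_at >= date '2012-01-01'
          GROUP BY c.name;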

      Back in the article...

      > now you can hit the index and whip right into memory for that bit of the table, thus speeding up access to that data by a factor of ten.

      That is a factor plucked out of thin air. It could be x1 or x1000 depending on the fatness of the index and whether it fits entirely into RAM.
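
      If you want to know which end of that range you're likely at, the numbers are easy enough to check (index name invented):

          -- how fat is the index the covering scan would use?
          SELECT pg_size_pretty(pg_relation_size('orders_created_at_idx'));

          -- and how much memory has PostgreSQL been given?
          SHOW shared_buffers;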

  2. This post has been deleted by its author
