Evolutionary database design

Evolutionary Database Design (EDD, aka database refactoring) is a controversial topic in the software development world. It’s a sore point among some DBAs because it promotes the making of changes to schemas that contain live customer data, sometimes with more than one application using the same database. Given this controversy …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward

    And backups?

    A very developer-centric point of view. Operational matters like backups and restores are the big problem.

    The rest is just a matter of efficient change management. The key point is one of dependencies. As data is often central, with a lot of different clients, any change is complex and therefore as painful as can be.

    Using triggers in a production database is disastrous for performance where performance matters and can lead to peculiar deadlock situations. This all implies that it is a valid approach for brand new stuff, but applying it to a mature system will require significant investment.
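    For context, a minimal sketch (using Python's built-in sqlite3; the table and column names are hypothetical) of the trigger-based transition scheme the comment is criticizing: during an evolutionary rename, a trigger keeps the old and new column in sync so legacy clients keep working — and every write through the old column now pays the trigger's cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT);
-- Evolutionary step: introduce the better-named column...
ALTER TABLE customer ADD COLUMN first_name TEXT;
-- ...plus a trigger so writes through the old column land in the new one.
CREATE TRIGGER sync_fname AFTER UPDATE OF fname ON customer
BEGIN
    UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
END;
""")
conn.execute("INSERT INTO customer (fname) VALUES ('Ada')")
conn.execute("UPDATE customer SET fname = 'Grace' WHERE id = 1")
row = conn.execute("SELECT fname, first_name FROM customer").fetchone()
print(row)  # ('Grace', 'Grace')
```

    The extra UPDATE inside the trigger fires on every write — which is exactly the performance and lock-contention overhead the comment warns about on a busy production system.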

  2. Adrian

    So

    the DB programmer/designer should be part of the development "team", but if dozens of outside "clients" access the system, how is the DB programmer/designer going to work with ALL of the real programmers to keep them all abreast of weekly changes in the DB schema?

    In the real world, DB schemas change all the time (I'm not only the developer but also the DBA for one of my employer's main DB systems, so I can speak from experience). I often change the schema on DBs holding gigabytes of data because, in the real world, what management wanted on day one isn't the same as what management wants on day two.

    Businesses evolve, business requirements evolve, business databases evolve. And if you don't keep evolving, well a bloke called Darwin had this theory......

  3. Craig Smith

    Why not....

    ...abandon SQL altogether and use XML? For example, one engine I have seen outperforms SQL by quite a factor in both storage and retrieval time. You can update the schema on the fly (just load in a new DTD): old data still conforms to the rev1 DTD, and new data can conform to the old DTD or the new one. It also means you can store an entirely differently structured data set within the same silo.

    You may be thinking "what about the overhead?" Well, it actually compresses and strips out the tagging information, meaning you actually save space; I've seen one 8Gig DB converted to raw XML and imported into the engine to the tune of around 2Gig storage requirement. I just think that all these new technologies are available, and yet reliance on an old and extremely resource-intensive database system grows by the day. Want more info? Contact me...

  4. Gaius Hammond

    Optimization?

    Typical developer, believes that his code is the only thing running in production, and that the entire IT staff exists to look after just his tiny corner of the business. Real databases have this little thing called optimized query plans, and you don't make structural changes without good reason. What a chump. And notice that he's basically IBM Marketing, I bet he hasn't written any real code in years.

  5. Lou Gosselin

    AntiThesis

    By claiming that changes aren't a problem in practice, he shows how little experience he has with real world applications.

    The Reg reporter was right to be skeptical of this guy.

    I've done at least a couple of database migrations where data was shared by multiple applications, most of which I did not have source code for. There are so many ways that applications can become tethered to the initial schema. Even with all the SQL flexibility of triggers and views, there is very little they can do to reconcile multiple data normalization methods in use.

    Try migrating data between CRM apps having multiple addresses and phone numbers per customer. So far so good; both apps work, and conceptually a transfer should be possible. Except that app A has 3 fixed fields for bus/home phone and fax numbers tied to the address level, while app B ties an arbitrary number of numbers to the customer level.

    Migration between these schemas becomes difficult or impossible because of the way the applications were initially designed. It's a data problem; no SQL trickery or coding can solve it. This is why it is CRITICAL that the DBA does a good job initially, before the database grows too large to change.
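    The mismatch described above can be sketched in a few lines of Python (the field names and data are hypothetical): app A has three fixed phone slots per record, app B an arbitrary list per customer, so any B-to-A migration is lossy by construction.

```python
# App B's representation: an arbitrary number of phone numbers per customer.
app_b_customer = {
    "name": "Acme Ltd",
    "phones": ["555-0101", "555-0102", "555-0103", "555-0104", "555-0105"],
}

def migrate_to_app_a(customer):
    """Squeeze app B's open-ended phone list into app A's 3 fixed fields."""
    phones = customer["phones"]
    record = {
        "name": customer["name"],
        "bus_phone": phones[0] if len(phones) > 0 else None,
        "home_phone": phones[1] if len(phones) > 1 else None,
        "fax": phones[2] if len(phones) > 2 else None,
    }
    dropped = phones[3:]  # numbers with nowhere to go in app A's schema
    return record, dropped

record, dropped = migrate_to_app_a(app_b_customer)
print(dropped)  # ['555-0104', '555-0105'] — lost regardless of tooling
```

    No view or trigger changes the outcome: the target schema simply has fewer degrees of freedom than the source, which is the commenter's point.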

  6. Aubry Thonon

    The man hasn't got a clue...

    We've had "agile/extreme programming", now we have "agile/extreme DBA"... well, AP/XP doesn't work in large projects, and I have a feeling neither will AD/XD...

    I am a programmer... more importantly, I am a relational-database programmer - I write applications that interface with SQL databases. And I am good at it because I studied the damn things! Come on, it's not that hard... You don't expect your Ford mechanic to be able to fix a Ferrari without first having boned up on it, surely? So why do people think that you can be a DB programmer without knowing how to properly write/implement a database?

    As for the database schema, I personally use ORM rather than ER... I find that ER has what I personally believe to be a major flaw - it requires the person writing the ER diagram to decide what makes up the entities (tables) without any real guidelines. At least with ORM, the tables flow out of the diagram rather than the other way around.

    (I fully expect an ER/ORM flamewar to start now... I'll go get my asbestos underpants)

  7. Donna Burbank

    It's all about balance

    While I would agree with Mr. Ambler's assertion that overly long up-front analysis and development cycles are not efficient, I'd say that that is a fairly well-recognized truth in the industry. The days of high-cost "waterfall" projects for both App Dev and database development are long gone. That was one end of the extreme of the constantly swinging pendulum of buzzwords and methodologies, with Agile being the extreme at the other end. The answer to too much up-front analysis is not no analysis at all, but the proper level of analysis to be efficient.

    Where Agile goes wrong is with statements like "a logical model is just busy work". At the crux of the problem of poor data quality that the TDWI report refers to is often a lack of understanding of the data (i.e. a well-defined logical model). For example, if developers are using different definitions and/or calculations for "total revenue", there will be discrepancies in reports and applications. It's the work done in analyzing these data definitions that leads to efficiency in the long term (i.e. building it right, rather than building it again).
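    The "total revenue" example can be made concrete with a small sketch (the order rows and both definitions are hypothetical): two teams reading the same data but applying different definitions produce discrepant reports, which is exactly what a shared logical model is meant to prevent.

```python
# The same order rows, read by two teams.
orders = [
    {"amount": 100.0, "refunded": 0.0,  "tax": 10.0},
    {"amount": 250.0, "refunded": 50.0, "tax": 25.0},
]

# Team 1's definition: gross sum of order amounts.
revenue_gross = sum(o["amount"] for o in orders)

# Team 2's definition: amounts net of refunds and tax.
revenue_net = sum(o["amount"] - o["refunded"] - o["tax"] for o in orders)

print(revenue_gross, revenue_net)  # 350.0 265.0 — two "total revenues"
```

    Neither calculation is buggy; the reports disagree because the term "total revenue" was never pinned down, which is the analysis work the comment argues for.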

  8. JohnA

    What a load of

    bollix!

  9. paul pareti

    In an ideal world...

    ... every database would be designed right before it's deployed, same as every home you move into should fit your needs perfectly before you drag your mattress across the threshold.

    But humans are quicksilver. We're a fortunate kind of animal that can change its mind when the situation changes. Stands to reason that when someone (or more typically, some new organization) takes over the running of a bunch of databases, the design of them will seldom be perfectly tuned to the needs of the new arrivals.

    Whether you call it agile, extreme, pragmatic or simply living in the real world, it's plainly useful to be able to refactor a database simply, accurately and reliably. It means you can renew or extend its service lifespan rather than start all over from scratch. It means saving money on unnecessary greenfield development projects every time the team lead changes. It also means saving a lot of time and focusing on what matters.

    And you have to say it's also the way things are moving in the industry. Microsoft have dipped a toe in the water with a simple object rename refactoring in their team suite (http://msdn2.microsoft.com/en-gb/teamsystem/default.aspx). And Red Gate have gone further with a dedicated database refactoring tool SQL Refactor (http://www.red-gate.com/products/SQL_Refactor) at a price that makes refactoring even more sensible.

    Personally I think the pragmatic approach to evolutionary database design is really welcome. It's a return to the commonsense attitude of the generation (now in their 80s) who said "If it's broke, let's fix it." Unfortunately it was the boomer generation who swapped this wisdom for, "If it's broke, let's replace it."
