More on databases in academia

Well, my (this is David Norfolk writing) Blog on Databases in Academia excited some robust comment – and, strangely, it didn’t come from the Intelligent Design people I was expecting to upset. As the implications of some of the comments were pretty personal, I felt I should give my main informant, Mark Whitehorn, the “right to …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward

    Have you tried Google-ing for a database?????

    I'm amazed by the lack of research you carried out in trying to find "any reasonable relational database". What about trying to Google for "database" - there at the top (a sponsored link, yes) is the "world's best selling database". And you didn't even know about it? I hope your research methods within your chosen academic subject are a tad better!

    MS SQL easiest to use?!?! God help us! There are millions of users of FileMaker out there, especially within academia - it's a shame you didn't realise this before you took the route of "if it's hard to do it must be powerful" and "if it's MS it's probably OK".

    It's not too late for you, though - I'm sure you must have other projects on which you could actually try the best of breed apps rather than the de facto MS offering.

  2. Michael

    Academics and software

    It is not only in databases that academia seems to lag behind in the use of the best available tools. It is a common fault in many departments to rely on sub-standard software tools purely on the basis of the initial cost, or license cost. This tends to neglect the benefits in performance, time and ease of use, or even newer solutions that have become available since anyone last bothered to check what was on offer.

    As for being pro-Windows or pro-open-source, well, who cares? If the tool works well, that is all that matters; apart from the odd zealot, who cares?

  3. Matt Collins

    Tutorial please!

    Perhaps you could explore some of these 'BI' techniques for the benefit of those of us who are ignorant of them. Am I mistaking BI for 'data mining', by any chance? If so, why the need for a new name? Perhaps this is why academia doesn't appear to use 'BI'!

  4. Anonymous Coward

    Some experiences with larger academic databases

    Two cents from Stanford University in the U.S.

    I've been working for two years in the Political Science Dept. here building several online systems and data warehouses for faculty research projects. All required "real" databases because they each had millions of records (international trade data combined with 50 country-specific variables spread over 50 years, U.S. election campaign contribution data for all elections from 1980 on for all House and Senate seats, and so on). The queries were enormously complex and required industrial SQL implementations. We used Oracle 10g and PostgreSQL successfully in these projects. I anticipate at some point in the future using BI tools for multidimensional queries, but you're quite right: academics don't understand the need for these tools and don't understand the technologies involved. I would say my biggest problem is cost, though: I couldn't afford "real" ETL tools, for example, so had to build them in PL/SQL, and I relied primarily on open-source tools for most things. We could use Oracle because of a Stanford site license arrangement. I won't even discuss the hardware we had to use (all done on a Dell workstation/server in my office with pleas underway for IT services to host things). Security is also a big problem, as we're not behind a firewall due to the academic-style Internet setup (I put my own hardware firewall up in my office as a short-term solution). Thanks for the articles!

  5. David Norfolk

    Response

    Some interesting comments already. And one that's perhaps a little closed-minded....

    A tutorial on BI? That's actually a good idea and we've been talking about it - Mark is a bit of an expert on MDX for a start - http://www.cashncarrion.co.uk/products/16163/0/.

    Is BI the same as "data mining" (also called "Knowledge Discovery In Databases")? Well, BI is the term usually used in business IME (it's hardly "new") and is, eg, "programs and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions". Data mining is, eg, "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" - probably a compsci-oriented intersect with BI (eg, I suspect that BI is sometimes used to provide confirmation or illustration of previously known data and that some mined data doesn't support any business decisions). Data mining is what IBM called it in the Thomas J. Watson Research Labs near New York, when they showed me some leading edge stuff once.

    I don't really care what it is called, I'm interested in whether it includes techniques/technologies that are practically useful and cost effective to pursue - and we're getting some feedback that might confirm that neither "data mining" nor cutting-edge "BI" (some BI is trivial) is as widely used in academia as it might be. But feel free to disagree and post counter examples. And even if I am right in general, there will always be exceptions, of course.

  6. Zachary Berkovitz

    Seriously, FileMaker?

    FileMaker may be a decent product for small business or Mac users, but as far as supportability, scalability and reliability go, it is a far, far cry from MS SQL on Windows, or even Oracle on Windows (Oracle on Unix or Linux would be more reliable and scalable than Oracle on Windows, which is buggy, and often poorly ported). I, and all of the colleagues with whom I have discussed FileMaker, who (like me) have supported and designed database implementations with thousands of GB of data and thousands of concurrent users, have universally panned this product. In my experience with a few implementations, the reliability and scalability are shameful, and Access on a server performed better.

    On their website, FileMaker compares their product with Access, and their server product supports "up to 100 simultaneous users". It isn't even an advanced DB, much less does it have a robust BI package, if it has one at all.

    Businesses generally choose MS SQL or Oracle, and there are plenty of good reasons, mainly that:

    a) budget is generally available and they can spend money on projects or processes which are important to them

    b) efficiency, scalability and stability are usually requirements.

    Hopefully some of the good IT practices used in business will migrate over time into other sectors, such as academia, non-profits, health care, and the public sector.

  7. David Norfolk

    Seriously FileMaker?

    Some good points, but, yes, seriously FileMaker - when it is appropriate <g> It's not designed to compete with mainframe DB2, with thousands of concurrent users.

    As always, you really need to look at the requirements first - and only then choose whatever technology is appropriate...
