Are your file or email archives a black hole? Do you even archive? If not, you might have a black hole in a black hole. Do you have funding for your archiving project? If not, you have a black hole in a black hole in a… you get the idea. Email is embedded in your business processes. It contains vital business intelligence, but …


It's a sales problem

While there are genuine use cases for archiving, most people do not have an archiving requirement. The "Problem" is how solution providers can convince everyone that they need an archiving solution when they don't.

Archiving is more about what you don't archive than what you do.

Not all data is equal; most of it (99.999999999999999%) is worthless junk.

old email may be worse than worthless...

... it may be a legal liability. Even if it doesn't contain anything incriminating, just the cost of retrieving, sorting and collating it can be significant. Hence many companies now have a 'retention policy' which, in practice, is usually a disposal policy that destroys old email.

Use Opera.

Multiple accounts, huge storage of every email in a single SQLite database that you can prune at will and that can do in-browser full-text "as-you-type" search over the entire archive without even struggling.
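For anyone wondering what "as-you-type" search over a single SQLite file might look like, here is a minimal sketch. It is not Opera's actual schema or code: the `mail` table, its columns, and the function names are all made up for illustration, and it assumes your Python's SQLite build includes the FTS5 extension (most do, but not all).

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) a hypothetical mail archive with a full-text index."""
    db = sqlite3.connect(path)
    # FTS5 virtual table: every column is tokenized and indexed for search.
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS mail "
        "USING fts5(account, subject, body)"
    )
    return db


def add_mail(db, account, subject, body):
    """Store one message; the names here are illustrative only."""
    db.execute(
        "INSERT INTO mail (account, subject, body) VALUES (?, ?, ?)",
        (account, subject, body),
    )


def search_as_you_type(db, typed, limit=20):
    """Return (account, subject) rows matching what has been typed so far.

    Each typed word becomes a quoted prefix term, so "pass" matches
    "password". Multiple words are ANDed together by FTS5.
    """
    words = typed.split()
    if not words:
        return []
    query = " ".join('"' + w.replace('"', '""') + '"*' for w in words)
    rows = db.execute(
        "SELECT account, subject FROM mail WHERE mail MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Calling `search_as_you_type(db, "pass")` on each keystroke is the whole trick; the prefix terms let a half-typed word match, and the index is what keeps it fast on a large archive.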

Currently have all my emails for about 10 accounts, the earliest going back to 2002 - and that's only because before then I wasn't sucking stuff into Opera or didn't keep emails. I have another computer somewhere with everything pre-2002 stored in UNIX mailbox format that I could import if I could be bothered.

Tagging, Bayesian spam-filtering, insta-searches over gigabytes of email across multiple accounts, IMAP integration (folders, tagging, etc.) and a lovely interface to browse just one account, just unread, just ones that fit a given criterion, etc. without having to work out how to do it.

But, saying that, haven't needed to bring pre-2002 archives out in the last ten years except maybe once to get some ancient password that I could have just re-issued if necessary anyway.

So where exactly is the problem with that?

I mean last time I checked IMAP was working, and it was working fine even for large mailboxes. And it's stored in Maildir format so as long as I have the files, I will always be able to read them trivially. Plus I can search them "as I type".
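To show what "read them trivially" can mean in practice, here is a rough sketch that walks a Maildir using nothing but Python's standard library. It assumes the usual Maildir layout (one message per file under `new/` and `cur/`); the function names are mine, and a real setup with nested folders would need a bit more.

```python
import email
from email import policy
from pathlib import Path


def iter_maildir(root):
    """Yield (path, message) for each message file in new/ and cur/."""
    for sub in ("new", "cur"):
        folder = Path(root) / sub
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.is_file():
                with path.open("rb") as fh:
                    # Each Maildir file is one complete RFC 5322 message.
                    msg = email.message_from_binary_file(
                        fh, policy=policy.default
                    )
                yield path, msg


def subjects_matching(root, needle):
    """Return subjects that contain needle, case-insensitively."""
    needle = needle.lower()
    hits = []
    for _, msg in iter_maildir(root):
        subject = str(msg["Subject"] or "")
        if needle in subject.lower():
            hits.append(subject)
    return hits
```

No server, no index, no special client: as long as the files are there, a dozen lines will get the mail back out.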

This seems to be yet another problem created by people who install Exchange because they are scared of a command line.

It's a common problem: People make a bad decision, but instead of admitting it was bad and changing it, they just go on spending lots of money to do trivial tasks.

pst? pfffft...

UNIX mbox all the way. I don't have some fancy archiving system, but I do have mbox files going back to ~1994 and I can grep them all I want.
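If plain grep ever gets too blunt (it matches raw text and has no idea where one message ends and the next begins), Python's standard `mailbox` module reads the same files message by message. A minimal sketch, assuming an ordinary mbox file; `search_mbox` is just a name I picked, and it only looks at the subject and non-multipart bodies:

```python
import mailbox


def search_mbox(path, needle):
    """Yield (date, subject) for messages whose subject or body has needle."""
    needle = needle.lower()
    box = mailbox.mbox(path, create=False)  # do not create a missing file
    try:
        for msg in box:
            subject = str(msg.get("Subject", ""))
            # decode=True undoes base64/quoted-printable; multipart gives None.
            payload = msg.get_payload(decode=True) or b""
            body = payload.decode("utf-8", errors="replace")
            if needle in subject.lower() or needle in body.lower():
                yield str(msg.get("Date", "")), subject
    finally:
        box.close()
```

Something like `for date, subject in search_mbox("old.mbox", "password"): print(date, subject)` gives you one line per matching message rather than one per matching line.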

That said, even avoiding the rather terrible .pst file formats, if I were interested in proper indexing and such, this sounds like a good presentation to see.

