How are we going to search our hard disks now?

Regular readers will know my occasional whinges about the sad state of the market for email clients – these generate hundreds of emails and comments. But there is another product category that is looking decidedly shabby these days. It is one which every so often becomes fashionable for a few weeks, and then goes on to suffer …

COMMENTS

This topic is closed for new posts.

Amen

I am going to be a bit stuck without Google Desktop. At the moment I can hit the CTRL key twice to bring up a lightning fast desktop search box which makes equivalent searches via our "official" HTML form seem childlike in comparison.

I'll keep it going as long as I can but agree with Andrew there is realistically nothing to use instead :(

2
0
Linux

Linux or Mac OS X

Just use locate to narrow down a list of files in seconds and grep to look inside them. Okay, it's not perfect with binary files, but it beats anything I've ever found on Windows, and the wife can never find things using Explorer, even with obvious names like C.V.
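
For readers without locate and grep to hand, the two-step idea (shortlist by file name, then look inside the shortlist) can be sketched in Python. This is only an illustration under assumptions: unlike locate it walks the directory tree live instead of consulting a prebuilt database, and the function names are made up for this example.

```python
import fnmatch
import os


def find_by_name(root, pattern):
    """Yield paths under root whose file name matches a shell-style pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)


def files_containing(paths, needle):
    """Return the paths whose contents include needle (case-insensitive)."""
    needle_bytes = needle.lower().encode("utf-8")
    hits = []
    for path in paths:
        try:
            # Read as bytes so binary files do not raise decoding errors.
            with open(path, "rb") as handle:
                if needle_bytes in handle.read().lower():
                    hits.append(path)
        except OSError:
            continue  # unreadable file: skip it and carry on
    return hits
```

Something like files_containing(find_by_name(".", "*.txt"), "vitae") then plays the role of the shortlist-and-grep pipeline.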

5
4
JC_

Another Reason to Upgrade

To 7 or Vista. The search in XP did indeed suck, but isn't half-bad in Vista/7.

4
7
Thumb Down

I thought that too...

Then tried searching for file types; it can't. I searched an indexed location on my PC the other day (using the more-or-less-undocumented-but-none-the-less-official method: type=) for an ISO and it flat out refused to find it. Even XP could do that...

The future is rubbish.

5
0
Gold badge
FAIL

Windows search

However, it still only indexes (or searches in) files that it thinks that you should look in.

If you are looking for a netlist (or text schematic) that contains certain terms, it won't find it. Even if you restrict the file extension to the ones you want, it just says none, without even telling you that it didn't look because it doesn't like .net or .asc files! Of course if you have permissions to roger the registry, you can change a setting, but then that's more work than grep!

5
0
Linux

Re: Linux or ...

Restricting it just to Linux, Catfish is a front end to locate or find, so no text foo is required. IMHO it is more useful than Strigi because it doesn't have to chomp away in the background, though as a result it is slower to display results.

0
0
FAIL

Windows 7 Search?

Now it is quite likely I am doing something extremely idiotic, but I have had loads of frustration with Windows 7 search not finding things I know are there. I even had it search on a term that appeared in some filenames I could see right there in the same folder. Nope, didn't find a thing.

Bring back XP search

2
0

agreed

locate + grep are essential; I never search with anything else. Both are available for Windows too. Invent a file extension for your personal stuff, and if you do have to scan file contents you can feed grep a shortlist. If you have to search your whole drive to find something, you should sort out your bad habit of sticking stuff in silly places!

0
0

However, it still only indexes (or searches in) files that it thinks that you should look in.

Indeed. The old trick of finding the origin of random error junk by searching the program files to find which one contained the mystery text is long gone, for instance.

0
0
Stop

type= Isn't the Correct Syntax

You should use Type:<type>

If you start a search from a Library then you will be offered Type: and Kind: with appropriate drop down lists for selection. It may take a while for the drop down to populate for Type: but you don't have to wait, you can type an extension.

0
0

Yeah, that's an answer

Why don't we just sit you down at my desk for a day, and you can explain it to all the blue-haired old ladies who currently ask me for help finding files.

0
1
Silver badge

Take a course ...

... learn to become a librarian.

Seriously. Your computer is a modern day personal library. If you can figure out how the stacks at Stanford or Berkeley work, your personal storage should be an open book. ::ahem::

9
0
Gold badge

Re: Take a course

Sounds like overkill to me. The most obvious difference between your computer and a library is that you didn't put everything into the library whereas presumably you did put everything into your computer.

Perhaps that's being naive. If so, could someone please explain what is on your computer, who put it there, what makes you so sure it is there, and why (despite that) it is difficult for you to find it?

I'm sure there *are* reasonable answers to those questions that lead us down the path to "you need a private google", but there are also many answers that deserve the response "don't be so careless". I think we need to clarify requirements before we can have a reasonable discussion about procurement.

4
0
Thumb Up

I actually agree

And I don't know why the computer industry never adopted the Dewey system for disk file metadata. I'm serious. It was a stonking idea.

It could have been implemented very easily. I saw experiments being done over 20 years ago. Never saw the light of day. File indexing would have been a doddle if it had been adopted.

1
0
Linux

Librarians...

Of course a librarian "put everything into the library". That is what they do. The key difference between you and a librarian is that they have a system. Not only do they have a system but it's a standardized system that any other librarian or even a civilian can get a handle on.

"Desktop Search" is simply a response to people in general refusing to be organized or refusing to understand what technology can do.

It also leads to silliness where people dump their iPhoto libraries to CD because they became too large to manage (in iPhoto).

2
0
Silver badge

My point, Ken Hagan, is "learn to file properly".

It ain't exactly rocket science.

(Simplistic Librarian course is only 1 Uni unit ... but I recommend the 4 unit version. It'll do you a world of good for the rest of your life ... )

1
0
FAIL

It amuses me that you think the solution is to teach *people* to file things properly

I store all my documents in a well-organized way. That doesn't mean that I can find the document where I cited a particular passage from a book, or track down a particular error message in my log files. Hello?? Libraries don't work in the way that we need to search our computers.

But I agree with you that if you write a document called "My C.V." and put it somewhere idiotic, and then can't find it... well, perhaps you weren't meant for the job.

0
0
FAIL

Oh, and let's not forget...

That in order to correctly identify all the metadata for the *content* of a file, which you're going to need in order to file things so that they are easily locatable by *position* on the hard drive, you'd not only need to see the future and require about 10 times the hard drive space for the metadata and soft links to the file in question, you'd also lose probably about a year of your life per file doing the categorizing and organizing.

Hey, maybe I could employ a librarian to keep my computer organized? Nah, I'll just download Google Desktop. Oh, Sh...

0
0

My point, Ken Hagan, is "learn to file properly".

Actually, there are two types of filing (or storing of objects, for that matter). One is the good old "a place for every thing and every thing in its place" system, which results in pegboards with outlines of tools drawn on them, etc. The other is the good old "heap", where stuff is located by search when needed, with or without indexing depending on size/need. Each system has pros and cons.

0
0
Gold badge

Libraries & Jake

So please jake, tell me how I ought to organise my library.

When I have a file, say "2011-09-06_MG_2342.CR", where do I file it? Myself, I'd file it in 2011-09-06, I'd then use software to find it when I want it.

What would you do? Would you copy it to the directory "Portrait", and also copy it to "Fashion", and also to "Red_dress", and also to "Iman", and also to "AM1178" and also to "Cibeles"......

0
0
Silver badge

@BristolBachelor

Learn what meaningful subdirectory structures and meaningful filenames are. It's YOUR system, after all. That's what the "P" in "PC" is all about ... But I'll bite. In the given example, how about:

~/pseudoPr0n/female/Iman/RedDress/2011-09-06_MG_2342.CR

Or something like that ... I'm not certain what the ".CR" extension is. Nor do I care. If I felt a need (unlikely), I'd link to it elsewhere in my system. I'll leave the methodology as to how that works as an exercise for the reader, just to save the mods a little reading time ;-)
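
One way to read "link to it elsewhere" is a symbolic link per category, which keeps a single copy of the file while letting it show up under several headings. The sketch below assumes a Unix-like system (symlinks on Windows may need extra privileges); the function name and the "by-tag" folder in the usage note are invented for illustration and are not anything the poster described.

```python
import os


def link_into_categories(original, library_root, categories):
    """Create one symlink to original inside each category directory."""
    created = []
    for category in categories:
        folder = os.path.join(library_root, category)
        os.makedirs(folder, exist_ok=True)
        link = os.path.join(folder, os.path.basename(original))
        # lexists() is also true for a dangling link, so nothing is overwritten.
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(original), link)
        created.append(link)
    return created
```

For example, link_into_categories("2011-09-06_MG_2342.CR", "by-tag", ["Portrait", "Fashion", "Red_dress"]) would leave the original in its dated directory and add three links to it.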

0
0
Thumb Up

Copernic 2

Copernic 2 - pre-lobotomized (into vers3) can still be made to work without upgrade nag:

http://forum.oldversion.com/showthread.php?5261-Copernic-2&s=ed0af9c0c47fa401ec8cf807a1c7d76f&p=23623&viewfull=1#post23623

works nice for me.

3
0

Copernic 2

Agreed. I made the mistake of 'upgrading' to ver3 once. What a great leap backwards. Ver2 mostly does what I need, and is very handy for doing full-text searches - if only it would read inside .rar files.

1
0
Go

Locate32

Locate32 doesn't look inside files

but is lightning quick compared to Win7 native search.

http://www.locate32.net/

3
0
Anonymous Coward

... advanced tab?

Did you try the advanced tab?

It has an option "File containing text" that at least works on code files.

1
0

Desktop search

I agree desktop search (with privacy) is highly desirable. I use a Perl script in the absence of better tools (on Linux) but it isn't a patch on proper indexing.

Backup tools could perhaps stand in here. They have to read every file periodically, so could build up a search index almost as a side product.
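
To make the "index as a side product of backup" idea concrete, here is a rough Python sketch, not a description of any real backup tool. It copies a tree and, since every file is being read anyway, records which words occur in which file. The function name, the in-memory dictionary and the crude word splitting are all assumptions, and the destination must sit outside the source tree.

```python
import os
import re
from collections import defaultdict

WORD = re.compile(r"\w+")


def backup_and_index(source, destination):
    """Copy source into destination and return a word -> relative paths map."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        target_dir = os.path.join(destination, rel_dir)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            with open(os.path.join(dirpath, name), "rb") as handle:
                data = handle.read()  # the single read serves both jobs
            with open(os.path.join(target_dir, name), "wb") as out:
                out.write(data)
            text = data.decode("utf-8", errors="ignore").lower()
            for word in WORD.findall(text):
                index[word].add(rel_path)
    return index
```

A real tool would persist the index and handle large or binary files more carefully; the point here is only that the backup pass already pays the cost of reading everything.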

3
0
Silver badge
Happy

Also would make another good reason to back up regularly.

As if we should need one! But we do!

0
0

And in about 1996...

... there was AltaVista Desktop Search. Whatever happened to that?

<ron manager>Small boys, Windows 95 for goal-posts, isn't it?</ron manager>

3
0
Linux

Search, smerch ...

I've always found that something like find . -name 'a*.gif' -exec ls {} \; works just fine. Combined with grep I can look for content as well. You could even wrap it up in a pretty Zenity script to give it a GUI for the command-line phobic.

*nix users don't need anything more, surely?

3
3

Possible alternatives

Some possible alternative to consider:

RECOLL http://www.lesbonscomptes.com/recoll/

and for the more technically oriented

Sphinx http://sphinxsearch.com/about/sphinx

0
0
Paris Hilton

Recoll FTW

Agreed, recoll is awesome... indexes contents of files and can handle quite a few file types. Also does a very good job of appearing to use next to no resources. My only complaint is that it doesn't work on Windows. Combine it with "locate" in regex mode and you have a very effective search solution.

I find Everything (http://www.voidtools.com) is a good alternative to "locate" on Unix (for searching by indexed file name) but I haven't found a decent program which will index file contents yet on Windows.

0
0

@Craiggy

Command line antics with "find" and "grep" won't search your documents, PDFs and spreadsheets. And shoving a binary file into grep will fail and likely bork your terminal session. Gone are the days when we kept our data in text files.

2
2
Anonymous Coward

@ Jim 59

Yes, they will. Excel and Word files do get picked up nicely. As for PDFs, just use something like: acroread /a search='blah blah' *.pdf

2
0
Anonymous Coward

I could also point out ...

That if you do all this as a low-priority cron job you can stuff the results into something like an sqlite database and search that instead. I'm not interested in displaying the file contents via stdout, just knowing they contain a particular string, and then storing the filename and location.

So far this does seem to work extraordinarily well.

Docs and XLS files do contain enough string data to be useful. Admittedly, .docx and their ilk don't but I'm not having to worry about them.
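
A minimal sketch of that approach, for anyone wanting to try it: scan a tree for a fixed list of strings and store (term, path) pairs in SQLite, so later lookups hit the database rather than the disk. The table name `hits`, the function names and the fixed term list are assumptions made for this example, and scheduling it from cron at low priority is left out.

```python
import os
import sqlite3


def build_index(db_path, root, terms):
    """Record which files under root contain each term, as (term, path) rows."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "term TEXT NOT NULL, path TEXT NOT NULL, PRIMARY KEY (term, path))"
    )
    conn.execute("DELETE FROM hits")  # rebuild from scratch on every run
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            for term in terms:
                if term.encode("utf-8") in data:
                    conn.execute(
                        "INSERT OR IGNORE INTO hits (term, path) VALUES (?, ?)",
                        (term, path),
                    )
    conn.commit()
    return conn


def lookup(conn, term):
    """Return the sorted paths recorded for term."""
    rows = conn.execute(
        "SELECT path FROM hits WHERE term = ? ORDER BY path", (term,)
    )
    return [row[0] for row in rows]
```

As in the comment, only the file name and location are stored, not the matching text itself.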

1
0

DIY search

If you are keen, tools are around to dump Office and PDF into text formats which can then be searched. My Perl script uses these and it works, but it's still poor compared to the professional indexed searching we were promised years ago.

0
0

grep searches binary

Obviously, grep can't print a line of text; it says something like "Binary file ... matches", but it finds it. And you don't have to hack the registry to get it to look.

0
0
Silver badge

Simples

pdftotext and antiword will both quite happily dump their respective formats to a text file. I've found pdftotext to be a little more reliable, though I occasionally hit issues with character sets depending on what I'm indexing.

I've had to build a better search because I'm too organised for my own good, but not always consistent. If something isn't where I think I would have put it, I can't usually guess where else it may have been filed.
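
As a rough illustration of wiring those two converters into a home-made indexer, the Python sketch below picks a command by file extension and captures the text it prints. It assumes both tools are installed and on the PATH and that they are invoked as shown (pdftotext with "-" as the output file to write to standard output, antiword with just the file name); the function names are invented for this example.

```python
import os
import subprocess

# Assumed command lines: both tools write plain text to standard output
# when invoked this way ("-" tells pdftotext to use stdout).
CONVERTERS = {
    ".pdf": lambda path: ["pdftotext", path, "-"],
    ".doc": lambda path: ["antiword", path],
}


def converter_command(path):
    """Return the argv list that dumps path to text, or None if unsupported."""
    ext = os.path.splitext(path)[1].lower()
    make = CONVERTERS.get(ext)
    return make(path) if make else None


def extract_text(path):
    """Run the converter for path and return its output as a string."""
    command = converter_command(path)
    if command is None:
        raise ValueError("no converter known for " + path)
    result = subprocess.run(command, capture_output=True, check=True)
    # Character-set trouble is common, so undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

The text returned by extract_text can then be fed to whatever search or index step you already have.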

0
0
Coffee/keyboard

"Gone are the days"

Careful: some people thought XML/text was a good idea. I never did, with all the extraneous punctuation etc., or should I say <etc>etc</etc>? Several bytes more (and yes, I know it's all to do with interoperability).

Dare I finish off with etc? I may just have to XML your phone later.

0
0

Strigi + pet peeve

It's called strigi, without the N, but since it should be stringed up ...

What pisses me off most about all of those annoying desktop search tools (unless someone can point me to one that's different :) ) is that they are always indexing and eating CPU cycles, even though they are only supposed to do this when I'm idle.

With a 2T disk to index continuously, they still eat up a lot of memory even when they aren't doing anything.

My ideal desktop search engine would:

1) combine the search functionality found in picture and music programs (think amarok, digikam)

2) recognise tags

3) be able to be told: go out at 23:55 and start indexing

4) be done indexing all my data well before 7:00 in the morning

5) be smart enough to understand things like ``search in all my pictures'' or ``search all my Word docs''

Until then, I'll stick to mairix to search my email once a night, digikam to search my pictures, and amarok to collect my music.
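
The overnight window in the wish list (start at 23:55, finish before 7:00) is easy to get wrong because it spans midnight. A small Python sketch of the check an indexer could make before doing any work follows; the constant and function names are hypothetical and not taken from any existing tool.

```python
from datetime import datetime, time

WINDOW_START = time(23, 55)  # earliest moment indexing may begin
WINDOW_END = time(7, 0)      # indexing must have stopped by this time


def in_index_window(now=None, start=WINDOW_START, end=WINDOW_END):
    """Return True if now falls inside the window, which may span midnight."""
    if now is None:
        now = datetime.now()
    current = now.time()
    if start <= end:
        # Ordinary same-day window, e.g. 01:00 to 05:00.
        return start <= current < end
    # Window wraps past midnight: late evening or early morning both count.
    return current >= start or current < end
```

An indexer could call in_index_window() between files and pause as soon as it returns False.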

4
0
Gold badge

You forgot...

When you plug in a disk, it may decide to search that too, not only eating CPU/disk cycles, but also then preventing you from disconnecting the disk afterwards (or deleting directories, etc.).

3
0

This post has been deleted by its author

Silver badge

Re: Excuse me?

I assume he was referring to the search process not letting the user delete a directory because it was busy indexing it, nor allowing the user to disconnect an external disk until the indexing is finished.

I ran into the exact same problem just this past weekend with a 2 TB USB drive that was at close to 60% of capacity. After walking away in disgust at the slowness of the virus scan, I came back to find the search indexer busy doing its thing for another infuriating amount of time (I'm going for a coffee down at the pub, boss). We wouldn't want to run a virus scan and indexing at the same time on an external drive that is going to get the same treatment on every computer it gets plugged into, now would we? Some days you really want to find the git who caused all this by bringing a virus into the company on a USB stick and let the computer index and scan him.

0
0
Mushroom

@craiggy: read the question again

with less of an attitude. If the search process is scanning directories, of course your operating system will not let you delete them (while another process is using them). Why on earth would you think that the search process would want to delete directories?

0
0
Anonymous Coward

Yes ... apologies

My mis-reading.

0
0
WTF?

And that is a pity..

"because the search engines built into Windows and Mac OS X"

Personally I find Spotlight works rather well in OS X...

0
0
Gold badge

Spotlight

Erm, yes and no. Ask it where the hosts file is, and it will tell you that there is no such file.

It's ironic, given that it was Apple itself that said to use Spotlight to find it to fix a problem with Bonjour...

0
0
Anonymous Coward

Finding the hosts file

If you're looking for files like that, use Spotlight in the Finder. First click "File Name" to specify that you are searching by name (rather than content). By default, the Finder won't show files that the average user isn't even conscious of, e.g. invisible or system files. To include these items, click the + button to the right to add a search criterion. If you're looking for the hosts file, select "System files" "are included". It finds mine in the blink of an eye. If you frequently want to search system/invisible files, click the "Save" button to retain the settings for future use.

It's a shame the author wasn't more specific about why Spotlight doesn't work for them. I have huge archives of email, specifications, letters, quotes, technical documentation, and all my current work documents, and Spotlight is almost miraculous when I try to find something. I don't get a huge number of results, and the document I'm looking for is almost always at the top.

1
0
Anonymous Coward

And...

Spotlight's extensible arch means you can have plugins written (or try yourself with Xcode).

Not enterprise, but I was recovering files from iPhoto after a HD crash, which is a nightmare due to thumbnails. Those exhaustive metadata options are amazing. I thought I had it with 'pixel-width' but forgot about landscape vs portrait - god bless 'pixel count'.

0
0

I tried a few...

Google desktop never worked for me (too slow and file number limitation) so I won't miss it.

Among the others I tried:

Copernic started crashing on my content

Windows desktop search. No comment.

Exalead that was too slow

etc...

I still use an old version of X1 (last free version).

But I can't understand that there is nothing better widely available.

0
0

For the Mac...

On my Mac I always use EasyFind (it's free) from DEVONtechnologies. It certainly beats Spotlight...

0
0

Biting the hand that feeds IT © 1998–2017