Re: Microsoft wants to parse texts and get patterns.
Well, if they had anti-replication in their file system, it would make more sense than indexing based on text contents. I'd venture to guess that the text indexing is for a basic search algorithm, and probably not one that goes across their entire file system.
It's my personal opinion that an algorithm that simply scans the data live, off of the hard disk, with aggressive disk caching [like Linux and FreeBSD have], outperforms any background "index everything" algorithm and data set. As an example, if I want to find something on any file system, I typically use 'grep'. Even with Cygwin, it seems to be SO much more flexible (and its results more relevant) than trying to use MS's ridiculous "search".
The kinds of things _I_ would search for on a Windows system: "Which header file has THIS function in it" [and considering where Microsoft wants to place header files, it's painful and bad enough already trying to naviguess to that - so I typically make a symlink in a Cygwin environment so I can do it more sensibly with 'find' and 'grep'].
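For anyone who wants the same "scan it live" approach without Cygwin, here is a minimal sketch in Python of the find-plus-grep habit described above. It is only an illustration, not a replacement for grep: the function name find_in_headers, the extension list, and the example path and symbol in the usage note are all my own assumptions.

```python
import os
import re


def find_in_headers(root, name, exts=(".h", ".hpp")):
    """Scan header files under root live and yield (path, line_no, line) per match.

    Hypothetical helper for illustration; no index is built or consulted.
    """
    # Match the name as a whole word, roughly what 'grep -w' does.
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.lower().endswith(exts):
                continue  # only look at header files
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", errors="replace") as handle:
                    for line_no, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, line_no, line.rstrip()
            except OSError:
                continue  # unreadable file: skip it and keep scanning
```

Usage would look something like this, with a made-up include directory and an example symbol:

```python
for path, line_no, line in find_in_headers("/cygdrive/c/some-sdk/include", "CreateFileW"):
    print(f"{path}:{line_no}: {line}")
```

The second run over the same tree is where the OS disk cache earns its keep, since the files are already in memory.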
It's also my experience that with compressed hard drives, compression actually IMPROVES throughput, because less data has to come off the platters and decompressing it is cheap by comparison. SSDs, maybe not so much, but DEFINITELY on a hard drive. It has been so since the '90s, when MS first integrated disk compression into the OS (and got sued by Stac for it).
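A back-of-the-envelope way to see why, as a rough sketch only: assume reading and decompressing overlap, so whichever stage is slower sets the pace. The function name and every number below are made up for illustration, not measurements.

```python
def effective_read_mb_s(disk_mb_s, compression_ratio, decompress_mb_s):
    """Logical MB/s delivered when reading compressed data (simplified model).

    Assumes disk reads and decompression are pipelined, and that
    decompress_mb_s is measured in decompressed (output) MB/s.
    """
    # Each MB read from disk expands to compression_ratio MB of real data.
    from_disk = disk_mb_s * compression_ratio
    # The slower of the two stages limits what the application sees.
    return min(from_disk, decompress_mb_s)


# Hypothetical slow hard drive, fast CPU: the disk is the bottleneck,
# so a 2:1 ratio doubles what the application sees.
print(effective_read_mb_s(100, 2.0, 500))  # 200.0

# Hypothetical fast SSD, same CPU: decompression becomes the bottleneck,
# and the result is below the raw 500 MB/s of the drive.
print(effective_read_mb_s(500, 2.0, 400))  # 400
```

In this toy model the hard drive case wins big and the SSD case actually loses a little, which lines up with the "SSDs, maybe not so much" caveat.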
Anyway, I think it would be an overall 'win' for them (pun intended) to not bother so much with the indexing, and just focus on throughput and performance.