Possibly OT: What the user really wanted...
Sloppy coding, neglecting to RTFM - it's business as usual in scientific research land XD (and you can't even blame the perpetrators, given that they have not normally been properly trained to code, even if coding happens to become their main task on the job; Luke 23:34 fully applies).
But apart from the obvious fault I vaguely smell another issue here, one that is all too common in scientific computing although the core of the problem is rarely fully appreciated: the lack of indexed files in modern OSes. Yeah, I know, ISAM, RMS and the like are sooo seventies (at best), who on earth needs such a thing in this day and age? Nobody, right? Except, of course, when they do: scientists, for instance, who want to store data that does not fit into main memory and need to access individual chunks of it by a string-typed key. Which is quite a common problem, of course.
Oh, yes, proper solutions for this problem abound, from the various dbm-derived key-value stores around right down to HDF5, sure enough. Only, these libraries are not part of any standard API on POSIX-ish operating systems; their installation often needs to be requested through more or less official and less or more slow-to-respond channels, or developers need to bundle their own copy with their code. Plus each of these tools comes with a learning curve of its own - aye, there's the rub!
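For what it's worth, here is roughly what the dbm route looks like when the language happens to ship a binding. This is only a minimal sketch using the dbm module from Python's standard library; the function names store_records and load_record are made up for illustration, and which backend you actually get (gdbm, ndbm or the slow pure-Python fallback) depends on how your Python was built.

```python
import dbm


def store_records(path, records):
    """Write string-keyed byte records into one dbm database at `path`.

    `records` is a dict mapping str keys to bytes payloads (hypothetical
    input format, chosen only for this sketch).
    """
    # "c" opens the database read-write and creates it if it is missing.
    with dbm.open(path, "c") as db:
        for key, payload in records.items():
            db[key.encode("utf-8")] = payload


def load_record(path, key):
    """Fetch a single record by its string key without reading the rest."""
    # "r" opens the existing database read-only.
    with dbm.open(path, "r") as db:
        return db[key.encode("utf-8")]
```

Calling store_records("records", {"sample/alpha": b"..."}) once and load_record("records", "sample/alpha") later is about all there is to it. Mind that some backends tack their own extension onto the file name and some limit the size of a single record, so treat this as an illustration, not a drop-in solution.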
Now what I daresay 8 out of 10 scientists REALLY do is this: they create a directory on a POSIX file system and within this directory they create a file /for each record/ of data, so they have easy access to each of them. The whole seven million or so. Never mind that this ingenious solution will bog down even the most performant parallel file systems when scaled even to medium size and stubbornly resist any attempts to back this mess up in finite time. That's the problem of the IT guys, right...? (Or rather, more to the point, the users in question are not normally aware of any gotcha lurking there. After all, what could possibly go wrong?)
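To make the contrast concrete: getting out of the file-per-record layout can be as dull as the sketch below, which packs every regular file of such a directory into a single keyed database, using the file name as the key. Again this leans on Python's standard dbm module; the function name pack_directory is hypothetical, and the sketch assumes each record is small enough to be read into memory in one go and to fit whatever record size limit the dbm backend at hand imposes.

```python
import dbm
import os


def pack_directory(src_dir, db_path):
    """Copy every regular file in `src_dir` into one dbm database.

    The file name becomes the key, the file content the value.
    Returns the number of records packed. `db_path` should live
    outside `src_dir` so the database does not pack itself.
    """
    count = 0
    with dbm.open(db_path, "c") as db:
        with os.scandir(src_dir) as entries:
            for entry in entries:
                # Skip subdirectories and other non-regular entries.
                if not entry.is_file():
                    continue
                with open(entry.path, "rb") as fh:
                    db[entry.name.encode("utf-8")] = fh.read()
                count += 1
    return count
```

The backup then has to deal with a handful of files instead of seven million directory entries, which is the whole point.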
And all the user really wanted was a modern equivalent of an ISAM file in their OS's standard API...
DISCLAIMER: I have not looked deeper into the library discussed in the article so there might really be a good reason why they store data the way they do and my comment /may/ indeed be OT here. But the problem I describe is real, and I know I'm not the only one fighting with it on a regular basis.