One bit to rule them all? Forget it – old storage types never die

Block storage, file storage and object storage are all frequently bandied about terms in the storage world. They are fundamentally different, and yet inextricably intertwined. Choosing the right storage today means understanding the differences between these different storage classes, and how they can be made redundant and/or …

  1. Ashton Black

    Pointy Haired Boss.

    Horses for courses. Good article.

    Now, the question is how to present this to a PHB who's heard the word "object" somewhere and thinks it's a really good idea. *shudder*

    1. Trevor_Pott Gold badge
      Childcatcher

      Re: Pointy Haired Boss.

      *kzert*

  2. Justin Clift

    10m files?

    "File systems also allow for lots of increased features around security of the individual files, with the downside being that they're largely useless for storing more than 10m files."

    That doesn't sound right at all. There are a *lot* of factors which significantly influence the performance of file access.

    10m though just sounds plain wrong. Maybe if they're all in the root directory and you're doing Weird Shit to the filesystem on purpose... that would probably barf on 10m files.

    1. Anonymous Coward

      Re: 10m files?

      Yes, that struck me as odd too.

      We've run user filestore with over 20m files per volume and not had a problem. (NetApp WAFL, 2yrs old.)

      10m files might be a problem in some scenarios, but it's certainly manageable.

    2. Trevor_Pott Gold badge

      Re: 10m files?

      I am not talking about the ability to store the files, but rather the approximate point at which file access performance starts to degrade significantly. With NTFS this absolutely is 10M files. With ReFS, you can push that a little higher. ZFS, EXT3 and EXT4 all start to see some pretty big drop-offs between 10M and 15M files. NetApp's more recent filers (past three years or so) are some of the better setups; I've seen 30M files before the access starts to degrade.

      Note: you can put more files on these file systems. That doesn't mean that accessing a file on a system populated with 100M files will be anywhere near as fast as accessing a file on a system populated with 10M files. Especially if the system stores these files for more than just cold archival.

      If these files are accessed even semi-regularly (let's say 2M of the 10M files are read or written each day), then the system is spending all its time faffing about with the index. The more files you touch, the more time the system cranks away on the index rather than the data blocks.

      Different file systems handle it differently, but the rule of thumb is that you have to start paying very close attention to your file server designs once you surpass 10M files. (Dramatically upping RAM, for example, or considering using a file system that can offload the index to SSD, etc...)
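The "index versus RAM" point above can be put into rough numbers. What follows is a minimal sketch, assuming each file costs a fixed amount of metadata. The per-file sizes (256 and 1,024 bytes) and the 8 GiB cache are illustrative assumptions, not measurements from any file system named in this thread, and real overhead (directory entries, extents, ACLs) will differ.

```python
def metadata_bytes(num_files: int, bytes_per_file: int) -> int:
    """Rough size of per-file metadata, ignoring directories and extents."""
    return num_files * bytes_per_file


def fits_in_cache(num_files: int, bytes_per_file: int, cache_bytes: int) -> bool:
    """True if the estimated metadata could sit entirely in the given cache."""
    return metadata_bytes(num_files, bytes_per_file) <= cache_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    cache = 8 * GIB  # hypothetical RAM available for metadata caching
    for files in (10_000_000, 30_000_000, 100_000_000):
        for per_file in (256, 1024):  # assumed bytes of metadata per file
            size = metadata_bytes(files, per_file)
            fits = fits_in_cache(files, per_file, cache)
            print(f"{files:>11,} files x {per_file:>4} B = "
                  f"{size / GIB:6.2f} GiB, fits in 8 GiB: {fits}")
```

Under these assumed sizes the estimate stops fitting somewhere in the tens of millions of files, which is at least in the same neighbourhood as the rule of thumb above. Treat it as a prompt to measure, not as a prediction.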

    3. Craig Dunwoody

      Re: 10m files?

      I have definitely seen file systems behaving badly with volumes that are filled near capacity, deep directory hierarchies, large numbers of small files, etc. I expect to see significant improvements in this area, as file system developers continue to enhance management of metadata. Here is an example of a file system that was recently demonstrated to behave well with over 10 *billion* files in a single volume:

      https://community.qumulo.com/qumulo/topics/10-billion-files-and-counting-on-a-qumulo-q0626-cluster

      I will be interested to learn of other examples of file systems with improved metadata handling.

      (Disclaimer: I don't work for Qumulo, but my company works with them.)

    4. macentric1

      Re: 10m files?

      Disclaimer: I work for Quantum.

      Quantum StorNext is a high-performance clustered filesystem that provides a single namespace of up to 5 billion files. With Storage Manager, 1 billion files can be managed in a single namespace, making use of HSM tiering managed directly by the filesystem. Much like the article states, Quantum's Lattus-M Object Storage platform makes use of the StorNext filesystem to provide high-performance, low-latency, durable storage; think fifteen 9s. Lattus makes use of Fountain Erasure Coding to spread objects across disks, nodes, racks and data centers. Objects are also available via RESTful or S3 interfaces.

      See Quantum's website for more details and examples of Extended Online workflows.

      1. Trevor_Pott Gold badge

        Re: 10m files?

        What is important, I think, is not "how many files can you manage", but "how many files can be managed with acceptable performance". With traditional file systems (extX, NTFS, etc) this really does drop off dramatically after about 10M files. After that, adding spindles and controller cards doesn't matter. The issue is the file system: its complexity, its size, how much of it fits into RAM and other such considerations.

        It's the tipping point where "whole system" concerns become far greater than spindle throughput.

        So it's worth a right proper "try before you buy" with file systems. Get the wrong one and you could wind up with a filer that's dog slow and no amount of added disks will make it faster.
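A "try before you buy" test can start very small. The sketch below is one hypothetical way to do it, not a rigorous benchmark: it creates a number of tiny files spread across subdirectories, then times random `os.stat()` calls against them. The function names, the 1,000-files-per-directory layout and the file counts are all assumptions made for illustration.

```python
import os
import random
import tempfile
import time


def populate(root: str, num_files: int, files_per_dir: int = 1000) -> list[str]:
    """Create num_files one-byte files under root, spread across subdirectories."""
    paths = []
    for i in range(num_files):
        subdir = os.path.join(root, f"d{i // files_per_dir:06d}")
        if i % files_per_dir == 0:
            os.makedirs(subdir, exist_ok=True)
        path = os.path.join(subdir, f"f{i:09d}")
        with open(path, "wb") as fh:
            fh.write(b"x")
        paths.append(path)
    return paths


def mean_stat_seconds(paths: list[str], samples: int, seed: int = 0) -> float:
    """Average wall-clock time of os.stat() over a random sample of paths."""
    rng = random.Random(seed)
    chosen = [rng.choice(paths) for _ in range(samples)]
    start = time.perf_counter()
    for path in chosen:
        os.stat(path)
    return (time.perf_counter() - start) / samples


if __name__ == "__main__":
    # Tiny run for demonstration; a real test would target the candidate
    # filer's mount point and use file counts in the tens of millions.
    with tempfile.TemporaryDirectory() as root:
        paths = populate(root, num_files=5_000)
        latency = mean_stat_seconds(paths, samples=1_000)
        print(f"mean stat latency: {latency * 1e6:.1f} microseconds")
```

Freshly created files will mostly be served from cache, so a meaningful run needs a file count large enough that the metadata no longer fits in RAM, or a cold cache, and should be repeated at several sizes to see where latency starts to climb.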

  3. Uncle Ron

    Print Button

    Boy I wish you guys hadn't removed the "print" icon on these nice articles. I could press the print icon and print as a .pdf and have the thing as a reference forever. Now, you want readers to click on every page so you can get the ad counts and eyeballs or whatever it is that floats your boats and paychecks.

    Sad, I have simply stopped reading these nice articles at all.

    1. Trevor_Pott Gold badge

      Re: Print Button

      http://m.theregister.co.uk/2015/05/01/bits_on_a_disk/

    2. Solmyr ibn Wali Barad

      Re: Print Button

      Sticking 'Print' into the URL still works. Which is essentially what the print button did.

      www.theregister.co.uk/Print/2015/05/01/bits_on_a_disk/
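The URL trick described above is easy to script. A minimal sketch, assuming the URL includes its scheme (`http://` or `https://`) and that the site accepts a leading `/Print` path segment as the comment describes; `to_print_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def to_print_url(url: str) -> str:
    """Insert a leading /Print path segment unless one is already present."""
    parts = urlsplit(url)
    path = parts.path
    if not path.startswith("/Print/"):
        path = "/Print" + (path if path.startswith("/") else "/" + path)
    return urlunsplit(parts._replace(path=path))


if __name__ == "__main__":
    print(to_print_url("http://www.theregister.co.uk/2015/05/01/bits_on_a_disk/"))
```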

  4. mckjoe

    good summary

    Overall I think this is a good summary. Having been at a traditional block company, then a traditional NAS company and now at an object company (1: not going to say traditional since it's a somewhat new approach and 2: egads that seems like a long time in the storage business), I find that getting some of these concepts across at various levels of an org can sometimes be challenging.

  5. Frumious Bandersnatch

    OMG, you didn't just do that?

    FOUR WOLVES!
