Working with 60 million files pushes the boundaries of any storage. Windows underpins most of my storage and so the theoretical and practical limitations of NTFS and Distributed File System Replication (DFSR), and the difference between theoretical and practical limits on the number and size of files they handle, are important …
Interesting... how does this compare to other file systems? (i.e. non-Windows)
Dude, just use Linux. If you have so many files, going to the extra work to use ext3 or ext4 or ZFS or something shouldn't be too much of a burden.
So you've tested a similar scenario?
Can you give us some metrics about the performance of other file systems? Thought not...
You're so right! Unfortunately in the real world the sysadmins aren't in charge of such decisions in every scenario. My evil schemes for Linux (or better yet, Solaris) file servers are constantly shot down. I get away with Openfiler VMs for specific purposes only in circumstances where nobody but IT will be using the file system.
CTO doesn’t understand Linux /at all/ and so doesn’t believe that you can secure it (Active-Directory-integrated share and file permissions) the same way that you can Windows. So for myself and others in such situations…there are articles on NTFS.
In teh end though, ZFS > *
...but my understanding is that Linux does not fragment files as much as NTFS, so even if other limits are similar then the lag of fragmentation is a bonus surely?
Please, not ext-anything.
I know it's being worked on, but Linux doesn't have ZFS yet. Also, please don't advocate ext3 or ext4. Both are just terrible hacks bolted on top of ext2, which is itself a terrible hack. Granted, this is Linux you're talking about, so I suppose all this talk of terrible hacks is superfluous.
There are, of course, a few other things that *do* have ZFS already. Perhaps you should go advocate one of those.
Linux would not help
As Linux does not provide a native ZFS. Under Linux ZFS is only available as FUSE.... *BSD and all kinds of current Solaris are much more appropriate, reminds HA.
"Linux does not fragment files as much as NTFS, so even if other limits are similar then the lag of fragmentation is a bonus surely"
Linux not fragmenting files is the worst MYTH/FUD known to man.
If it was true the next gen filesystems (ext4, btrfs et al) wouldn't have online defrag. Oh wait! They do!
Anybody who repeats such rubbish really doesn't have a clue. The fact is that all filesystems fragment, the question is over how much and how good the algos for preventing it are and if there are tools and opportunities for sorting it out. Linux historically has been a bad, no and no respectively. Like you say - no metrics.
Go on, justify why you think ext2 is a terrible hack.
"Linux historically has been a bad, no and no respectively."
Now that's REAL FUD
The "lag of fragmentation"? I meant "lack of fragmentation".
Here is a heaping platter of crow:
Giving an example that's only useful in 1964 doesn't prove anything, move outside data sizes that still work on punch cards and it all breaks down.
Don't confuse "nobody really cares" with "no, it doesn't".
Sure a lot of filesystems make an effort but if you try to write a .5TB file to a disk that only has a bunch of 50GB spaces guess what happens. It gets chopped up into tiny little pieces and scattered around on any filesystem.
Like I said, linux filesystems are getting online defrag tools for a reason - cos there's finally filesystem developers with a clue trying to sort through the fanboy (non-dev) bullshit. I'm not selling any filesystem, I'm just saying there's actually nothing you can do about it in real-world data use.
The question really is how much it matters.
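To put a rough number on the .5TB example, here is a minimal Python sketch of the best case: the fewest pieces a file can end up in when the allocator fills the largest free extents first. The function name `min_fragments` and the free-space layout of twelve 50GB gaps are assumptions made up for illustration, not measurements from any real filesystem, and a real allocator may well do worse than this lower bound.

```python
GB = 10**9  # decimal gigabytes, as in the ".5TB into 50GB spaces" example

def min_fragments(file_size, free_extents):
    """Fewest pieces a file must be split into, filling the largest free extents first."""
    remaining = file_size
    pieces = 0
    for extent in sorted(free_extents, reverse=True):
        if remaining <= 0:
            break
        if extent <= 0:
            continue  # an empty extent cannot hold any data
        remaining -= min(extent, remaining)
        pieces += 1
    if remaining > 0:
        raise ValueError("not enough free space for the file")
    return pieces

# Hypothetical layout: twelve separate 50GB gaps, one 500GB file.
print(min_fragments(500 * GB, [50 * GB] * 12))  # 10
```

Even in the best case the file cannot be contiguous on any filesystem, so the argument is really about how far above that floor each allocator ends up.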
Not for nothing but Microsoft didn't stop writing NTFS after v1 either...
Wrong. ext* allocators are much more sensible than NTFS ones. NTFS still uses braindead DOS-aged allocator mechanisms a lot of the time (e.g. put the file in the first available block). Linux, OTOH, tends to put files in block groups. Files aren't put consecutively on disk, so there is plenty of empty space between where one file ends and another begins (free space allowing, of course). That means that the files themselves aren't internally fragmented, which is the big problem (the fact that files aren't contiguous is much less of a problem, especially on a multi-processing multi-user system). You have to hit FS usage of 90%+ before fragmentation on ext* starts to rise above low single figures. On NTFS, fragmentation will go into double figures much more quickly.
Can you explain what about ext2 is a hack? It is based on the basic principles of UNIX file systems going back to year dot. Its design is actually quite similar to UFS. ext3's journalling works pretty much the same way as UFS's logging (using UFS as an example since your comment indicates some Solaris fanboyishness), and ext4 brings along extent based file allocations. There is nothing at all hacky about the design of ext* file systems.
ZFS also isn't particularly unique. BTRFS is almost cooked, and will be in the next RHEL release. Nor is ZFS particularly performant. Convenient, sure, but not performant. For actually stressful loads (e.g. heavy DB usage), UFS is considerably faster.
But can you use other FS's on Windows? Not in a hacky "yes, it's just about possible" way - but are there production-ready alternatives? Coming from a more Linux-centric world where you can often choose a filesystem to suit the workload - I'm just curious. I seem to remember that Veritas had a version of vxfs for Windows - but I haven't really looked at that for years.
Anyway - just curious :-)
Re: production-ready alternatives
Definitely not. It takes a *lot* of testing over a long period to prove that a file-system is safe, so if there was an alternative FS out there with a user base large enough to make using it on an important server anything other than a career limiting move, then you'd have heard about it already.
Accessing other FS's from Windows
I use the IFS freeware for when I sometimes need to pull stuff from my Linux system when I've dual-booted into XP. It works well and mounts the file system as a new drive letter. It's been fine for me and it's very easy to use, but I don't know if it's good enough for professional purposes or not as I haven't used it that much.
Dual boot is becoming such a pain (I used to use Linux exclusively, but I'm finding the latest MS Office too good to pass up), that I'm actually whacking on a Linux install in a VM on my system and letting *it* mount my /home/h4rm0ny folder.
I want ext4 in Windows 7, or BTRFS at some point. When I have a spare couple of months and the stomach for it, maybe I'll write it myself. ;)
Anyway, check out the IFS plugin at: http://www.fs-driver.org/. It's actually for ext2, but obviously that still has some utility with ext3.
Very briefly (in the days of NT4, IIRC, maybe w2k) you could get VxFS for Windows, I think it was part of the Veritas Foundation Suite. Someone from Veritas told me that the file system was pulled because MS weren't very impressed and Veritas were writing the disk subsystem for MS so didn't want to make any trouble. How true this is is another matter, but I have no reason to believe it isn't. Also Windows is designed to be able to access multiple filesystems, in the days of NT4, it only natively supported FAT16, but you could get FAT32 drivers from Sysinternals (again IIRC).
So the answer seems to be
defragment with consolidate before each backup
use smaller volumes
zip the user's data up
or just ftp the files to a Linux backup server as suggested in the previous article, it's got to be cheaper than buying a Windows-based backup application
rearrange the following phrase "goats. Microsoft blows"
a title is needed
CW moralist boosts fog?
Maybe means something to do with spreading confusion
How about other file systems?
This is really interesting - but what about other file systems? I'm assuming there are vast differences between the RAM needed for NTFS and, say, ext2/3/4; any insights there?
I don't rightly know. One of these days I'll set up a lab and find out...
Linux with 1 Billion files
Linux seems to scale far better than Windows with large numbers of files. This article on LWN covers experiments to put 1 billion files onto Linux filesystems.
The basic conclusion is that you can put 1 billion files on a Linux filesystem, but you require a lot of memory to check the filesystem (10-30 GB depending on filesystem type). That is far less than the requirement listed above for a mere 60 million files on Windows.
ReiserFS ; spindles
The one Linux filesystem, notorious for its capability to work with myriads of small files, is ReiserFS. There are downsides though: the stable ReiserFS v3, included in the vanilla Linux kernel, has a volume size limit of 16 TB. ReiserFS v4 is not in the mainline kernel (in some part due to "functionality redundancy" reasons = inappropriate code structure) and its future is somewhat uncertain - but it is maintained out of tree and source code patches (= "installable package") for current Linux kernel versions are released regularly. Both versions also have other grey corners, just like everything else...
When working with a filesystem that large, I'd be concerned about IOps capability of the underlying disk drives (AKA "spindles"). The question is, how often you need to access those files, i.e. how many IOps your users generate... This problem is generic, ultimately independent of the filesystem you choose.
I feel warm and fuzzy reading this - good stuff
It's been a while since I read something like this (an article about something where the user has learned the product and THEN read the manual to see what it's lying about).
Nice to have real-world experience and observations like this. Really useful and I want more - what about EXT3, JFS... let the FS fight begin here.
URL and tags for this article say NTSF...
Thank you, Mr Pott.
Interesting read, always fun to see how things change once you leave the comfort zone of the OS vendor's lab and try their toys with real data in a real environment.
Does turning off short file name creation help at these extremes? Wouldn't it reduce the size of the MFT?
Yes it does. Disabling the last touched stamp speeds up access a whole bunch. (But is important in my environment so I leave it on.)
You mean, turning ON short file names?
Short file names
No, turning off the useless 8.3 filenames that nobody uses any more.
so you need a bit more than 640K memory then?
No-one will ever need more than 640kb!!!
Screen wipe for Mr. Cowturd please!
Good on the desktop, in its way (well, the users like it), but for servers? Why would one?
Linux? Yeah right.
Do people stop to think before blurting out "use Linux"? If it's a Microsoft shop then introducing Linux into the environment will have additional overheads (staff training, additional monitoring tools, cost of integration). It's cheaper to simply heed the practical advice in the article and manage the file system accordingly going forwards.
This article is a sad example of why M$ admins are just users with more rights
In the days when computer professionals were either System Engineers or users this conversation would never have come up.
If it didn't exist and it was needed you were expected to create it, not bitch about how rubbish the OS development team are.
M$ have managed to create a generation of people who are responsible for maintenance and control of expensive hardware but don't know or care how it really works. In most UK system administration departments internal development is treated as suspect and hence they rarely produce more than scripts or download some badly fitting outside application to almost meet the need.
Well, this is the price industry pays if they go the M$ route, yes support is cheap but as the saying goes "you get what you pay for". M$ compatibility has reduced bespoke software to configuration of M$ products and we all know how reliable they are. If you want it done right get someone in who knows how it works from the bottom up.
It should not come as a shock to anyone that M$ products do not do what it says on the tin. They never did.
RE: "This article is a sad example of why M$ admins are just users with more rights"
My my - looks like somebody is used to having god-control over their network, and likes to be paid obscene amounts of money to look over highly tweaked setups since no-one else would have a clue what they had done (documentation is doubtful, as that would mean someone could replace you!)
Not every company has £60K+ (minimum) for a top-end *nix IT Admin whose main goal in life is reading man pages and writing drivers. And they don't want to outsource it since:
1. They will charge a fortune in the beginning to "inventory" their network
2. Rip-n-Replace with a completely new setup that the customer is unfamiliar with, and is black-box to all but the new incumbent
3. Bugger all once the money for support contracts dries up, leaving them in the sh*t
People use the tools they are comfortable with, with the skills and budget they have, to achieve what they need to do on a daily basis... it's not always the same choice as your scenario - get over it.
RE: RE: "This article is a sad example of why M$ admins are just users with more rights"
FYI documentation was always part of the job and to replace me they would need someone at least equally knowledgeable or a few months later they are buying in expensive consultants. I know the consultants are the result as this has happened, it wasn't because I set them up and hoarded knowledge it was simply that no one else was willing to read and implement my docs. I know that on but one system they paid out £80K over 8 months in consultants before they managed to find 4 "experts" to replace me. Needless to say the manager who thought up the idea of my leaving is no longer with a company.
I do however agree that not every company can afford £60K+ to be certain that IT is not something they have to worry about. I am sure that there are many companies where data loss/theft is not a problem and a web/email presence is not necessary but then again these companies would probably be better using a paper based office.
When it comes to a company of any real size however paper based offices are no longer financially viable. Larger companies should see £60K+ as nothing against the potential losses of bad IT bearing in mind that this cost includes not only the maintenance/security of infrastructure but ongoing individual training for their other employees.
I too agree getting in many of the outsourcing companies is simply throwing money, experience and ownership away, worse most are staffed by the very point and click "experts" I was referring to and are learning the job as they go.
"People use the tools they are comfortable with", true, however, as the UK education system has gone from computer science to computer studies to ICT the majority of people you are referring to have only been exposed to M$ products and this includes many IT professionals. The answer to the problem in companies is training and good software that meets the user's requirements coded by someone who has met the intended audience and implemented their needs. The answer to the educational system is stop training our kids to be M$ data input clerks if they can be something more.
The real question is whether a company of any size can continue to afford not to buy in real expertise along with the other insurance they need to protect them.
Our Unix guy isn't here right now, so I can't get all the grisly details, but our fileserver gave us equally unexpected and unpleasant results when we started tripping over its hidden limits.
We have a Unix box attached to a SAN with approximately 4Tb storage, which at the time we considered ample capability for its job. It didn’t take long before we hit the dreaded inode limit.
As I say, I can’t remember all the precise details, but the EXT filesystem, formatted over a certain size (1 or 2 TB) assumes that the average size of each file will be around 1GB, and therefore allots an appropriate number of inodes for this assumption.
If your average filesize is closer to a few KB, your inodes run out loooong before you reach the drive's space capacity. When he investigated this, he naturally assumed he had chosen an incorrect option when creating the partition. After significant research, he discovered this was the default, and only configuration.
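For anyone wanting to see whether a volume is heading the same way, a minimal Python sketch along these lines compares inode consumption with space consumption. It assumes a Unix-like system where `os.statvfs` is available; the helper name `usage` is made up for illustration, and some filesystems report no fixed inode total, in which case the inode figure comes back as `None`.

```python
import os

def usage(path="/"):
    """Percent of space and percent of inodes in use on the filesystem holding path."""
    st = os.statvfs(path)
    # f_blocks/f_bfree count data blocks; f_files/f_ffree count inodes.
    space = 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks if st.f_blocks else None
    inodes = 100.0 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else None
    return {"space_used_pct": space, "inodes_used_pct": inodes}

print(usage("/"))
```

If the inode percentage is climbing much faster than the space percentage, the volume will refuse new files while still showing free space.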
"EXT filesystem, formatted over a certain size.."
It's not been set up correctly for the type of workload!
Default? Yes. Only? No.
"After significant research, he discovered this was the default, and only configuration."
Specify the bytes/inode ratio. mke2fs creates an inode for every bytes-per-inode bytes of space on the disk. The larger the bytes-per-inode ratio, the fewer inodes will be created. This value generally shouldn’t be smaller than the blocksize of the filesystem, since then too many inodes will be made. Be warned that it is not possible to expand the number of inodes on a filesystem after it is created, so be careful deciding the correct value for this parameter.
No, actually <= 16KB / inode, so you won't run out of them unless you have teeny files
@Psymon: if you can't remember the details correctly, and the person who actually did it isn't around for you to ask, why bother posting at all? Clearly you haven't got useful information.
ext3 (and I think ext2) filesystems default to 16KB per inode for large filesystems, and less for smaller ones. Specifically, to 4 blocks per inode, and 4KB block size (max). Meaning that if your average file size is over 16KB, you will never run out of inodes (since you will fill the disk first instead). If your filesystem was created with 1GB per inode, it must have been created as some bizarre experiment, since after all the disk space saved by pushing the inode limit down is minimal.
Also "After significant research, he discovered this was the default, and only configuration" is utter claptrap. See the -i option of mke2fs.
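The arithmetic is easy to check. A rough Python sketch, assuming one inode per file and ignoring the space taken by the inode tables and other metadata (the 4 TiB volume is borrowed from the "approximately 4Tb" figure above; the function names are made up for illustration):

```python
KIB, GIB, TIB = 1024, 1024**3, 1024**4

def inode_count(volume_bytes, bytes_per_inode=16 * KIB):
    """Approximate number of inodes created at a given bytes-per-inode ratio."""
    return volume_bytes // bytes_per_inode

def space_used_when_inodes_run_out(volume_bytes, avg_file_bytes, bytes_per_inode=16 * KIB):
    """Fraction of the volume filled when the last inode is used (capped at 1.0)."""
    files = inode_count(volume_bytes, bytes_per_inode)
    return min(1.0, files * avg_file_bytes / volume_bytes)

volume = 4 * TIB
print(inode_count(volume))                               # 268435456
print(inode_count(volume, GIB))                          # 4096
print(space_used_when_inodes_run_out(volume, 4 * KIB))   # 0.25
print(space_used_when_inodes_run_out(volume, 64 * KIB))  # 1.0
```

So at 16KB per inode a 4 TiB volume gets roughly 268 million inodes, and only files averaging well under 16KB exhaust them before the disk fills. A 1GB-per-inode ratio would leave room for just 4096 files on the same volume.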
This is just more proof that Microsoft is just not ready for the server room.
At least not MY server room.
Always useful to hear about "real world" IT deployment.
Anybody out there got experience of pushing the limits of Linux or Netware file systems?
MFT size is?
OK, I'll bite. How do I find out the current MFT size of my NTFS drives?
"Analyse" a disk, and then view report. There are about eventeen squillion other ways to do it, but this is my preferred method.
Veritas on windows
Veritas/Symantec offer a free 4-disk limit "basic" product on Unix & Linux - they may do on Windows as well. VXFS handles terabyte filesystems with ease on Unix - I imagine it's very good under Windows too.
GB != Gb.
Paris, because she knows her bits from her bytes. Fnar.