Small files are normal for lots of people – people write apps that use files as a database substitute, a habit carried over from desktop computing. The problem has existed for years, but now those people have discovered HPC and they don’t want to rewrite their programs. Small files are deadly to most file systems – some more than others – and clusters make the problems even worse.
People expect cheap disk at commodity prices, but cheap disk is not fast disk. Virtualization can be deadly too, since each layer of abstraction adds overhead.
An example – an 1,800-compute-node cluster at USC. When the nodes are accessing small files, you have to have ways to coordinate file locking and synchronization across all of them. 3–4 terabits of bandwidth capacity slow to nothing if there’s lots of small-file access going on.
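On a single machine the coordination primitive looks trivial – a minimal Python sketch of per-file advisory locking (an illustration of the pattern, not USC's actual setup; on a cluster file system each of these lock calls becomes a network round trip to a lock manager, which is where the bandwidth goes):

```python
import fcntl      # Unix-only advisory locking
import tempfile

# Every node that touches a shared small file must first take a lock.
# The real I/O is a few bytes, so lock traffic, not data transfer,
# is what dominates with thousands of nodes.
with tempfile.NamedTemporaryFile() as f:
    fcntl.flock(f, fcntl.LOCK_EX)    # exclusive lock; other holders block
    f.write(b"small update")         # tiny amount of actual work
    f.flush()
    fcntl.flock(f, fcntl.LOCK_UN)    # release so the next holder can proceed
    done = True
```

With one big file the lock is amortized over megabytes of I/O; with many small files the lock/unlock pair is paid per file.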
Right now the base file system is Sun’s QFS. Directory metadata lives on separate disks from the data itself, which is great for big files but hard on small files because of the single metadata catalog. There are local parallel file systems on the nodes, which work better for small files. NFS has its own issues with small-file access because of its per-operation overhead. They’ve set up “condo disk” as well as condo nodes, so groups can have their own file space instead of a virtualized environment.
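The single-catalog cost is easy to feel even on a laptop – a rough sketch (file counts and sizes are made up for illustration) comparing one large write against the same bytes spread across thousands of files, where every open/close is a metadata operation:

```python
import os
import tempfile
import time

def write_one_big(root: str, total_bytes: int) -> float:
    """Write a single file of total_bytes: one metadata operation."""
    start = time.perf_counter()
    with open(os.path.join(root, "big.dat"), "wb") as f:
        f.write(b"x" * total_bytes)
    return time.perf_counter() - start

def write_many_small(root: str, count: int, size: int) -> float:
    """Write count files of size bytes each: every create/open/close
    is a separate trip through the metadata catalog."""
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(root, f"f{i:05d}.dat"), "wb") as f:
            f.write(b"x" * size)
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as d:
    big = write_one_big(d, 10_000 * 1024)        # one 10 MB file
    small = write_many_small(d, 10_000, 1024)    # same bytes, 10,000 files
    n_files = len(os.listdir(d))                 # 10,000 small + 1 big
    print(f"one 10 MB file:      {big:.3f}s")
    print(f"10,000 x 1 KB files: {small:.3f}s")
```

On a networked file system with a remote metadata server the gap is far larger, since each of those 10,000 opens is a round trip.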
Some examples of small-file workloads –
Genomics group – tens of thousands of files in a single directory.
Natural language group – 50–250k files in a directory, with many nodes accessing the same dictionaries.
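A standard way to cope with directories this large (not something the talk proposed) is to hash filenames into subdirectory buckets so no single directory holds hundreds of thousands of entries; a minimal sketch, with the bucket count and file name chosen arbitrarily:

```python
import hashlib
import os
import tempfile

def bucketed_path(root: str, name: str, buckets: int = 256) -> str:
    """Map a flat filename to root/<bucket>/<name>, with the bucket
    derived from a hash of the name. A 250k-file corpus then lands at
    roughly 1,000 entries per directory instead of 250,000 in one."""
    digest = hashlib.md5(name.encode()).hexdigest()
    bucket = int(digest, 16) % buckets
    subdir = os.path.join(root, f"{bucket:02x}")
    os.makedirs(subdir, exist_ok=True)
    return os.path.join(subdir, name)

root = tempfile.mkdtemp()
path = bucketed_path(root, "en_US.dic")   # hypothetical dictionary file
```

The mapping is deterministic, so every node computes the same path for the same file without any coordination.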
Backups are slower and harder – you can’t keep the tape spinning if you’re doing lots of directory accesses, so a backup takes hours instead of minutes.
Ways to help –
– faster disk (helps metadata/directory space)
– distributed file access (QFS)
– no free lunch
Next generation –
– NFSv4 doesn’t cut it
– GPFS helps some
– 10 Gbps hosts on the data plane – nothing but jumbo frames, which might make it worse.
– RAM disk for metadata? San Diego does it – might help.
– storage management solutions – performance for small files is in question.