When it comes to data storage solutions, chances are high that you’ll get dizzy from the multitude of options and alternatives, not to mention the mass of terminology that floods this area. To make things easier, in this article we will look at the most popular file systems used to create volumes, compare their differences, see where each one performs best, and find out which best suits your requirements.
A Few Words Before We Start
The term file system refers to the methods and structures your OS applies to manage how your data is stored, organized, and retrieved on a storage device. This covers internal operations such as file naming, metadata, directories and folders, access rules, and privileges.
The thing is that storage devices themselves are intended merely to hold lots of bits; they have no notion of files. Without a file system, data on a storage medium would be nothing but one large, undifferentiated body of information. A file system works much like the table of contents in a book: it allows your files to be broken into chunks and stored across many blocks, which greatly simplifies data management and access. And just as with a book, if you alter a chapter or move it somewhere else, you must update the table of contents, or the pages won’t match.
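The table-of-contents idea can be sketched in a few lines of Python. This is a toy model only (the class and names here are invented for illustration); real file systems use far richer on-disk structures such as inodes and extent trees, but the principle is the same: raw blocks hold the bits, and a separate index maps each file to its blocks.

```python
BLOCK_SIZE = 4

class ToyDisk:
    """A disk is just a flat array of fixed-size blocks plus an index."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks      # raw storage: just bits
        self.free = list(range(n_blocks))   # free-block list
        self.index = {}                     # the "table of contents"

    def write_file(self, name, data):
        used = []
        for i in range(0, len(data), BLOCK_SIZE):
            blk = self.free.pop(0)          # chunks need not be contiguous
            self.blocks[blk] = data[i:i + BLOCK_SIZE]
            used.append(blk)
        self.index[name] = used             # update the ToC, or pages won't match

    def read_file(self, name):
        return b"".join(self.blocks[b] for b in self.index[name])

disk = ToyDisk(16)
disk.write_file("notes.txt", b"hello world!")
print(disk.read_file("notes.txt"))  # b'hello world!'
print(disk.index["notes.txt"])      # [0, 1, 2]
```

Note that the data itself is scattered across blocks; only the index knows which blocks belong to which file, which is exactly why a damaged index can make intact data unreachable.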
Now, while Windows users don’t have much of a choice regarding a file system, since the OS comes with one by default (usually NTFS, FAT32, or exFAT), on Unix-based systems it might be a bit of a challenge…
BTRFS

B-tree file system, or BTRFS, is a file system built on the copy-on-write (COW) mechanism. This means that when you modify a file, the file system won’t overwrite the existing data on the drive with newer information. Instead, the new data is written elsewhere and, once the write operation completes, the file system simply points to the new data blocks (with the old information getting recycled over time). COW also prevents issues like partial writes, which can occur due to a power failure or kernel panic and potentially corrupt your entire file system; with COW in place, a write has either happened or not happened, with no in-between.
BTRFS was originally designed to address the lack of pooling, checksums, snapshots, and integral multi-device spanning in Linux file systems, with a focus on fault tolerance and repair. Its advanced features include subvolumes, self-healing, online volume growth and shrinking, file compression, defragmentation, deduplication (which ensures that only one copy of duplicated data is written to disk), and much more. Finally, BTRFS is easier to administer and manage on small systems compared to other options.
On the other hand, the system is still considered to be quite unstable, is known for issues associated with RAID implementation, and offers much less redundancy compared to ZFS.
BEST FOR: Enterprises that need a handy file system that is easy to manage; good for technologies and projects where high fault tolerance is not required.
ZFS

ZFS (short for Zettabyte File System) is fundamentally different in this arena, as it goes beyond basic file system functionality and can serve as both LVM and RAID in one package. By combining the roles of file system and volume manager, it allows you to add storage devices to a running system and immediately have the new space available on all existing file systems in that pool.
Great scalability is another advantage of ZFS: it supports nearly unlimited (up to a billion terabytes) data and metadata storage capacity. In addition, it provides far more extensive protection against data corruption than other file systems (though this matters less for most home NAS setups, since the risks ZFS safeguards against are very small), as well as efficient data compression, snapshots and copy-on-write clones, continuous integrity checking, and automatic repair. As a result, ZFS can offer significantly greater redundancy than BTRFS supports.
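The combination of continuous integrity checking and automatic repair can be sketched as follows. This is a simplified illustration, not ZFS internals: every block is stored with a checksum and a mirror copy, and a read that fails verification on one copy is transparently healed from the other.

```python
import hashlib

class MirroredStore:
    """Toy mirror with end-to-end checksums and self-healing reads."""

    def __init__(self):
        self.disk_a = {}
        self.disk_b = {}
        self.checksums = {}   # key -> expected SHA-256 of the data

    def write(self, key, data):
        self.disk_a[key] = data
        self.disk_b[key] = data
        self.checksums[key] = hashlib.sha256(data).hexdigest()

    def read(self, key):
        expected = self.checksums[key]
        for disk, mirror in ((self.disk_a, self.disk_b),
                             (self.disk_b, self.disk_a)):
            data = disk[key]
            if hashlib.sha256(data).hexdigest() == expected:
                mirror[key] = data   # self-heal: rewrite the other copy
                return data
        raise IOError("both copies corrupt")

store = MirroredStore()
store.write("block0", b"important data")
store.disk_a["block0"] = b"bit-rotted!!!!"   # simulate silent corruption
print(store.read("block0"))                  # b'important data'
print(store.disk_a["block0"])                # repaired: b'important data'
```

Because the checksum is kept separately from the data it describes, silent corruption on one device is detected on read rather than returned to the application, which is the property that plain mirroring alone cannot give you.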
Along with this, ZFS has its drawbacks. First, plenty of its processes rely upon RAM, which is why ZFS takes up a lot of it; second, ZFS requires a really powerful environment (computer or server resources, that is) to run at sufficient speed. Given that, ZFS is not the best option for working with microservice architectures and weak hardware.
BEST FOR: For the most part, ZFS is intended to work with Sun (now Oracle) products such as mainframes, clustered server environments, supercomputers, etc. Consequently, some of the benefits offered by ZFS won’t matter for small businesses and private users.
EXT4

The acronym ext refers to Linux’s original extended file system, created as early as 1992. As the first file system to use the virtual file system (VFS) switch, it allowed Linux to support multiple file systems at the same time on the same machine. Since then, the family has gone through three major revisions: ext2, ext3, and EXT4, which is the default on most Linux systems today.
Able to handle larger files and volumes than its evolutionary predecessors, EXT4 also extends flash memory lifespan through delayed allocation, which, in turn, improves performance and reduces fragmentation by allocating bigger amounts of data at a time. What’s more, EXT4 employs a number of useful features that greatly increase the reliability and fault tolerance of the system: from the start, EXT4 has used journaling (a system of logging changes to reduce file corruption), along with persistent pre-allocation, journal and metadata checksumming, faster file-system checking, an unlimited number of subdirectories, and more.
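The journaling idea can be shown with a bare-bones sketch (again a made-up class, not ext4 code): each change is logged before it is applied, and recovery after a crash replays only the fully committed entries, discarding anything that was in flight.

```python
class JournaledStore:
    """Toy write-ahead journal: log the intent, apply, then mark committed."""

    def __init__(self):
        self.data = {}
        self.journal = []   # append-only list of change records

    def write(self, key, value):
        entry = {"key": key, "value": value, "committed": False}
        self.journal.append(entry)   # 1. record the intent first
        self.data[key] = value       # 2. apply the change to the data area
        entry["committed"] = True    # 3. mark the transaction complete

    def recover(self):
        """Rebuild state after a crash: replay committed entries only."""
        self.data = {}
        for e in self.journal:
            if e["committed"]:
                self.data[e["key"]] = e["value"]

store = JournaledStore()
store.write("a", 1)
store.write("b", 2)
# Simulate a crash mid-write: logged but never committed.
store.journal.append({"key": "c", "value": 3, "committed": False})
store.recover()
print(sorted(store.data))  # ['a', 'b']  ('c' is discarded)
```

The point is that after a failure the file system never has to guess which half-finished changes to trust: anything not committed in the journal simply never happened.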
However, unlike the previously reviewed BTRFS, which has disk and volume management built in, EXT4 is a “pure filesystem”. This means that even if you have multiple disks — and therefore parity or redundancy from which corrupted data could theoretically be recovered — EXT4 has no way of knowing that, let alone using it to your advantage. It also has limited capacity to handle modern data loads, which is why EXT4 is considered somewhat dated these days.
BEST FOR: Despite the capacity limitations mentioned above, EXT4’s functionality makes it a very reliable and robust system to work with. Given that, EXT4 is the best fit for SOHO (Small Office/Home Office) needs and projects requiring stable performance.
XFS

Extents File System, or XFS, is a 64-bit, high-performance journaling file system that comes as the default in the RHEL family, works extremely well with large files, and is known for its robustness and speed. XFS is particularly proficient at parallel input/output (I/O) operations thanks to its design, which is based on allocation groups. As a result, XFS provides excellent scalability of I/O threads, file system bandwidth, and the size of both files and the file system itself when spanning multiple physical storage devices. In contrast to EXT4, XFS also offers dynamic inode allocation, advanced allocation hinting (in case you need it), and, in recent versions, reflink support.
Among other advantages of XFS, it should also be noted that it provides data consistency by using metadata logging and maintaining write barriers. Allocating space across extents with data structures stored in B-trees also improves overall file system performance, especially when dealing with large files. Delayed allocation helps prevent file system fragmentation; online defragmentation is also supported. Another unique feature of XFS is that I/O bandwidth is pre-allocated at a predetermined rate, which is suitable for many real-time applications.
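The allocation-group design behind that parallelism can be sketched as follows. This is an invented toy model, not XFS internals: the volume is split into independent regions, each with its own lock and free-space accounting, so allocations from different threads need not contend on a single global lock.

```python
import threading

class AllocationGroup:
    """One independent region of the volume with its own free-space state."""

    def __init__(self, start, size):
        self.lock = threading.Lock()   # per-group lock, not one global lock
        self.next_free = start
        self.end = start + size

    def alloc(self, n):
        with self.lock:
            if self.next_free + n > self.end:
                return None            # this group is full
            start = self.next_free
            self.next_free += n
            return start

class Volume:
    def __init__(self, n_groups=4, group_size=1024):
        self.groups = [AllocationGroup(i * group_size, group_size)
                       for i in range(n_groups)]

    def alloc(self, n, hint=0):
        # Try the hinted group first, then fall back to the others.
        order = self.groups[hint:] + self.groups[:hint]
        for g in order:
            blk = g.alloc(n)
            if blk is not None:
                return blk
        raise MemoryError("volume full")

vol = Volume()
a = vol.alloc(100, hint=0)
b = vol.alloc(100, hint=1)   # lands in a different group: no contention
print(a, b)  # 0 1024
```

Two writers hinted into different groups take different locks, which is the essence of why XFS scales so well under concurrent I/O on large volumes.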
Among the drawbacks of the system are a serious lack of protection against silent disk failures and ‘bit rot’, a near-complete inability to recover files in case of data loss, and high sensitivity to large numbers of small files.
BEST FOR: XFS can be exceptionally helpful where large files are involved: huge data storages, large-scale scientific or enterprise projects, etc.
File System Recovery
It’s true that some file system recovery applications may be good at fixing minor logical errors or volume corruption issues. However, diagnosing the problem itself can be very tricky, not to mention the recovery process that follows, which requires considerable expertise and technical mastery.
To prevent the situation from getting worse (think of permanently losing your crucial files), we recommend contacting a credible data recovery team. Whether it’s a physical problem that you’ve encountered — like damage to your storage device — or a logical issue caused by file system failures, at SalvageData we have all the necessary equipment and certified experience to resolve it in the fastest and safest manner possible. Don’t hesitate to contact us for a free examination of your case, and let the professionals do the rest.