Windows IT Pro is the leading independent community for IT professionals deploying Microsoft Windows server and client applications and technologies.
  
  


January 1998

Inside NTFS



MFT Records
MFT records consist of a small header that contains basic information about the record, followed by one or more attributes that describe data or characteristics of the file or directory that corresponds to the record. Figure 1 shows the basic structure of an MFT record. The header data includes sequence numbers that NTFS uses for integrity verification, a pointer to the first attribute in the record, a pointer to the first free byte in the record, and the MFT record number of the file's base MFT record if the record is not the first.
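To make the header fields concrete, here is a minimal Python sketch of the information the header carries. The class name, field names, and the in-memory representation are assumptions for illustration only; this is not the actual on-disk NTFS layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MftRecordHeader:
    """Hypothetical model of the header fields named in the text."""
    sequence_number: int          # used for integrity verification
    first_attribute_offset: int   # byte offset of the first attribute
    first_free_offset: int        # byte offset of the first free byte
    # Record number of the file's base record; None if this IS the base record.
    base_record_number: Optional[int] = None

    def is_base_record(self) -> bool:
        return self.base_record_number is None

    def free_bytes(self, record_size: int = 1024) -> int:
        """Bytes still unused, assuming the common 1KB record size."""
        return record_size - self.first_free_offset
```

For instance, a base record whose first free byte is at offset 400 has 624 bytes left for more attributes in a 1KB record.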

NTFS uses attributes to store all file and directory information. NT 4.0 NTFS has 14 attribute types, which Table 3 lists. On disk, attributes are divided into two logical components: a header and the data. The header stores the attribute's type, name, and flags, and it identifies the location of the attribute's data. NTFS uses a nifty performance-optimizing trick: It stores the attribute data within MFT records whenever possible, rather than allocating clusters elsewhere on the disk. When an attribute has its data stored in the MFT, the attribute is resident; otherwise, it's nonresident. A resident attribute is possible only when the attribute data fits within a record in addition to the record header, the attribute header, and other attribute headers. Thus, on most NT 4.0-formatted drives, resident data is bounded by a hard upper limit of 1KB (the common MFT record size), and in practice must be smaller than that because the headers share the record. If an attribute's data is resident, the attribute header points to the location of the data within the MFT record. By definition, the file name, standard information, and security attributes are always resident.
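The residency rule boils down to a size check. The sketch below is a simplification under assumed numbers: the function name and the header sizes passed in are hypothetical, and real NTFS accounts for alignment and other details that I leave out.

```python
MFT_RECORD_SIZE = 1024  # common NT 4.0 record size, per the text


def can_be_resident(data_len, record_header_len, attr_header_len,
                    other_attrs_len, record_size=MFT_RECORD_SIZE):
    """Return True if the attribute's data fits in the record
    together with the record header, its own attribute header,
    and the space already taken by other attributes."""
    used = record_header_len + attr_header_len + other_attrs_len
    return used + data_len <= record_size
```

With illustrative header sizes of 48 and 24 bytes and 200 bytes of other attributes, a 100-byte data stream can stay resident, whereas a 1024-byte stream cannot, even in an otherwise empty record.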

If NTFS must store an attribute's data outside the MFT, the attribute header, which is in an MFT record associated with the attribute's file or directory, contains information that locates the data on the disk. The data-mapping information is known as run-information because contiguous pieces of the data form one run made up of a starting cluster and length. The run-information, like most NTFS data structures, has a header that identifies which clusters of the attribute's data are mapped in the run-information. This approach is necessary because attributes with large amounts of data may have run-information split across multiple MFT records—each piece of the run-information covers different parts of the file. A run entry contains a virtual cluster number (VCN), which is a relative cluster offset within the attribute's data; a logical cluster number (LCN), which is the location on the disk where the data resides; and the number of contiguous clusters at that position on the disk.
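A short Python sketch shows how run entries translate a VCN into an LCN. The `Run` type, the function name, and the example cluster numbers are assumptions for illustration; the on-disk encoding of run-information is more compact than this.

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    vcn: int     # first virtual cluster the run covers
    lcn: int     # first logical (on-disk) cluster of the run
    length: int  # number of contiguous clusters


def vcn_to_lcn(runs: List[Run], vcn: int) -> Optional[int]:
    """Return the disk cluster that holds the given VCN, or None
    if no run in this piece of run-information maps it."""
    for run in runs:
        if run.vcn <= vcn < run.vcn + run.length:
            return run.lcn + (vcn - run.vcn)
    return None


# Hypothetical fragmented file: 2 clusters at LCN 200, then 4 at LCN 1355.
runs = [Run(vcn=0, lcn=200, length=2), Run(vcn=2, lcn=1355, length=4)]
```

Here `vcn_to_lcn(runs, 3)` lands in the second run, one cluster past its start. A `None` result corresponds to a VCN that this piece of run-information does not cover, which is why the header records the VCN range each piece maps.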

Compressed files are a special case. NTFS supports compression on only the data stream of a file, and it applies the compression algorithm to blocks of 16 clusters. The compressed data still occupies clusters on disk, but the clusters that compression saves within a 16-cluster block are never allocated. The VCNs for these saved clusters have nonexistent LCNs, which NTFS represents with an LCN of -1 in those clusters' run entries.
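As a sketch of the idea, the following Python function builds the run entries for one 16-cluster block, using an LCN of -1 for the clusters that compression saved. The function name and the tuple layout are assumptions; only the block size of 16 and the -1 marker come from the text.

```python
COMPRESSION_UNIT = 16  # clusters per compression block, per the text
NO_LCN = -1            # marks VCNs that have no clusters on disk


def runs_for_compressed_block(start_vcn, start_lcn, stored_clusters):
    """Return (vcn, lcn, length) runs for one 16-cluster block whose
    compressed data occupies `stored_clusters` clusters on disk."""
    if not 0 <= stored_clusters <= COMPRESSION_UNIT:
        raise ValueError("stored_clusters must be between 0 and 16")
    runs = []
    if stored_clusters:
        runs.append((start_vcn, start_lcn, stored_clusters))
    saved = COMPRESSION_UNIT - stored_clusters
    if saved:
        # The saved clusters exist only as VCNs; nothing is allocated.
        runs.append((start_vcn + stored_clusters, NO_LCN, saved))
    return runs
```

A block that compresses down to 6 clusters yields one run of 6 real clusters followed by one run of 10 clusters with an LCN of -1; a block that does not compress at all yields a single ordinary run.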

If a file has too many attributes to fit within one MFT record, NTFS allocates additional records and stores an attribute-list attribute in the base record. The attribute list points at the location of attributes in the additional records and consists of an entry for each attribute.

Let's stop and look at an example that demonstrates some of these concepts. Mark.txt is a relatively large and fragmented file. Mark.txt requires a filename attribute, a data attribute, and a standard information attribute (it also has a security attribute, which I'll ignore to simplify the example). Thus, for mark.txt, the MFT will contain one or more records that contain the headers for these attributes, and in some cases the attribute data. In this example, which Figure 2 depicts, the name and standard information are resident and stored within the first MFT record of the file. The data attribute is nonresident and is split into two attribute headers with runs in two records. The first run entry is two clusters of data at cluster 200 on the disk. The attribute-list attribute contains one entry that points at the data attribute header the second MFT record stores.

Directories
An NTFS directory is an index attribute. NTFS uses index attributes to collate file names. A directory entry contains the name of the file and a copy of the file's standard information attribute (time stamp information). This approach provides a performance boost for directory browsing because NTFS does not need to read the files' MFT records to print directory information.

When the entry data for a directory fits within an MFT record, one attribute type, the index root, describes the location of the entries in the record. As a directory grows, the entries required to describe it can overflow the file's MFT record. In these cases, NTFS allocates index buffers to store additional entries. The index allocation attribute header specifies the location of a buffer. On NT 4.0 NTFS, these buffers are 4KB in size, and the directory entries within them are variable length because they contain file names.

To make file lookups as efficient as possible, NTFS presorts the directory inside the index root and index allocation buffers. The sorting creates a tree structure: Buffers with lexicographically lower entries (e.g., 'a' is less than 'd') are beneath and to the left of higher-valued entries. Index allocation attribute headers contain run information just as nonresident data attributes do.
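The lookup over this sorted tree can be sketched in Python as follows. The `IndexNode` class and its layout are hypothetical, and the sketch compares names with plain string ordering rather than NTFS's own collation rules.

```python
from bisect import bisect_left


class IndexNode:
    """Hypothetical index node (index root or index buffer).

    names    -- sorted file names stored in this node
    children -- None for a leaf; otherwise len(names) + 1 nodes, where
                children[i] holds names lower than names[i] and the
                last child holds names higher than every name here
    """
    def __init__(self, names, children=None):
        self.names = names
        self.children = children


def lookup(node, name):
    """Return True if `name` is in the directory tree rooted at node."""
    while node is not None:
        i = bisect_left(node.names, name)
        if i < len(node.names) and node.names[i] == name:
            return True
        # Descend into the buffer that can still contain the name.
        node = node.children[i] if node.children else None
    return False


# An index root with one entry and two index buffers beneath it.
root = IndexNode(["m.txt"], [IndexNode(["a.txt", "b.txt"]),
                             IndexNode(["x.txt", "z.txt"])])
```

Because every node is presorted, each step discards all buffers that cannot contain the name, so a lookup touches only one buffer per level of the tree.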

Figure 3 shows an example of a directory. As a simplification, the directory contains just a few entries, but the entries are split among the index root and two index allocation buffers. The red arrows show the direction that NTFS scans the entries when it performs a directory lookup. The black arrows show how the runs in the index allocation attribute reference the two allocation buffers.

NTFS Logging
As files and directories (including NTFS metadata files) change, NTFS writes records to the volume's log file. The CHKDSK program uses the log file to make NTFS on-disk data structures consistent and to minimize data loss in the face of a crash. The log file records come in two types: redo and undo. Redo records store information about a modification that must be redone if the system fails and the modified data is not on disk. For example, if a file deletion is in progress, the redo operation signals that the delete must be completed if a failure occurs and only some on-disk data structures have been updated.

NTFS uses an undo operation to roll back modifications that aren't complete when a crash occurs. If NTFS is appending data to a file and the system goes down between the time that NTFS extends the file's length and the time it writes the new data, the undo log file record tells NTFS to shorten the length of the file to its original size during a recovery.
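The interplay of redo and undo can be illustrated with a deliberately simplified Python sketch. Everything here is an assumption made for illustration: the record format (one tuple carrying both the old and the new value), the notion of a set of completed operations, and the dictionary standing in for on-disk structures. It shows the general redo-then-undo idea, not the actual NTFS log format or recovery code.

```python
def recover(log_records, completed, disk):
    """Replay a simplified log against `disk` (a dict) after a crash.

    log_records -- list of (op_id, key, old_value, new_value)
    completed   -- set of op_ids that finished before the crash
    """
    # Redo pass: reapply every logged change, in order, so that no
    # logged modification is missing from the disk.
    for op_id, key, old, new in log_records:
        disk[key] = new
    # Undo pass: walk the log backward and roll back the changes of
    # operations that had not completed when the system failed.
    for op_id, key, old, new in reversed(log_records):
        if op_id not in completed:
            disk[key] = old
    return disk


log = [
    (1, "mark.txt length", 100, 200),          # append, never completed
    (2, "directory entry", "old.txt", None),   # delete, completed
    (2, "MFT record in use", True, False),
]
# State found after the crash: the delete reached the directory only.
disk = {"mark.txt length": 200, "directory entry": None,
        "MFT record in use": True}
recover(log, completed={2}, disk=disk)
```

After recovery the half-finished append is rolled back to the original length of 100, and the deletion is carried through so that the MFT record is marked free as well.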

The log file's size is based on the size of the disk and is usually between 2MB and 4MB. The log file will fill up unless NTFS ensures that the redo and undo records stored in the log file are not required for a recovery. Periodically, NTFS resets the data in the log file by sending all modified data to disk. This checkpointing process takes place about once every 5 seconds.

NT 5.0 NTFS
NTFS is gaining some exciting new features in its NT 5.0 incarnation. One new feature is built-in support for encryption, which will protect sensitive data on disks that might be viewable by programs such as NTFSDOS that ignore NTFS security settings (see "NTFSDOS Poses Little Security Risk," September 1996). NTFS doesn't perform the encryption; instead, an add-on piece (the Encrypting File System—EFS) is tied closely to the NTFS file system driver. EFS communicates with the system security authority and an EFS service, as well as NTFS. When a user wants a file encrypted, NTFS hands the data to EFS, which uses the user's security ID and a Data Encryption Standard (DES) encryption/decryption engine to manipulate the file's data. (The architecture is a little more complex than I've described, and worthy of a future column; the EFS architecture was not final at press time.)

File system quota add-ons for NT have been popular products, but they may lose their niche when Microsoft releases NT 5.0. NTFS has always hinted that it would natively support quotas (the $QUOTA metadata file has existed since NT 3.5), and NT 5.0 will finally implement them. NTFS assigns quotas on a per-user basis, and the $QUOTA file in the new $Extend metadata directory stores quota specifications for the particular volume. NTFS honors this quota information, which can restrict a user to an upper limit on the amount of data on the volume in files that the user owns.
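In spirit, a per-user quota check looks something like the Python sketch below. The function and its arguments are hypothetical; the text does not describe how NTFS stores or evaluates entries in $QUOTA, so treat this purely as an illustration of a per-owner upper limit.

```python
def would_exceed_quota(usage_by_owner, limits, owner, extra_bytes):
    """Return True if `owner` writing `extra_bytes` more data would
    push the total size of the files that user owns past the limit.
    Users without an entry in `limits` are treated as unrestricted."""
    limit = limits.get(owner)
    if limit is None:
        return False
    return usage_by_owner.get(owner, 0) + extra_bytes > limit


# Hypothetical per-volume figures, in bytes.
usage = {"mark": 90_000_000}
limits = {"mark": 100_000_000}
```

With these figures, a further 5MB write by mark is allowed, but a 20MB write would be refused.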

Other features include support for sparse files and reparse points. Sparse files can save space because NTFS allocates disk clusters for only parts of the sparse file that contain valid data—the portions of the file that don't contain valid data are assumed to contain 0s. Reparse points are essentially symbolic links, and can point at files or directories on the same or different disks.
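The sparse-file behavior is easy to sketch in Python. The cluster size and the dictionary of allocated clusters are assumptions for illustration; the point is only that unallocated parts of the file read back as 0s without occupying disk space.

```python
CLUSTER_SIZE = 4096  # assumed cluster size for this example


def read_cluster(allocated, vcn, cluster_size=CLUSTER_SIZE):
    """Return the contents of cluster `vcn` of a sparse file.

    allocated -- dict mapping VCN to the bytes stored on disk, with
                 entries only for clusters that hold valid data
    """
    # Clusters without valid data are not stored; they read as zeros.
    return allocated.get(vcn, bytes(cluster_size))


# A file that is 1000 clusters long but stores only 2 of them.
sparse_file = {0: b"\x01" * CLUSTER_SIZE, 999: b"\x02" * CLUSTER_SIZE}
```

Reading cluster 500 of this file returns 4096 zero bytes, yet the file consumes only two clusters of disk space.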

Microsoft has stated that a "lite" version of Executive Software's Diskeeper defragmentation tool (see NT Internals: "Inside Windows NT Disk Defragmenting," May 1997) will be built into NT 5.0. This claim is a little misleading because Microsoft will just ship defragmentation tools (that use the NTFS defragmentation APIs present since NT 4.0 and that work on NT 4.0 as well) in the box with NT 5.0. At press time, Microsoft had not made any changes to NTFS's defragmentation support for NT 5.0. I'll address other NT 5.0 features in the future article about EFS.

NTFS Resources
As you can see, NTFS is a relatively complex file system—the details about NTFS could fill a book. In fact, Helen Custer has written a good one: Inside the Windows NT File System (Microsoft Press, 1994). It takes you on a comprehensive tour of NTFS.

You can further explore NTFS at the Winternals Web site (http://www.winternals.com). You'll find interesting information about NTFS (e.g., you can find out why NT won't place NTFS on floppy disks), and you'll find several useful NTFS data recovery tools (e.g., NTFSInfo, NTFSDOS, NTRecover).

As the changes coming in NT 5.0 show, NTFS is not a static file system. Each new version of NTFS is different enough that earlier versions won't recognize NTFS drives formatted with later versions. Microsoft has yet to write the final word on NTFS.



Reader Comments
In the January NT Internals column, “Inside NTFS,” Mark Russinovich says that “Microsoft will just ship defragmentation tools in the box with NT 5.0.” He also says that “At press time, Microsoft had not made any changes to NTFS’s defragmentation support for NT 5.0.” These statements are inaccurate.
You can access the manual version of the Diskeeper defragmenter in Windows NT 5.0 Beta 1 by going to the Preview Directory on the CD-ROM, selecting x86 or Alpha, selecting the Defrag directory, and clicking Install. The defragmenter then becomes an operational part of the system. The defragmentation engine exists in beta 1, but Microsoft hasn’t yet included the GUI. The Defrag GUI will be included in beta 2.
--Jobee Knight
Director of Public Relations
Executive Software

Jobee Knight August 10, 1999


This is a correction: at the end of this document, the author gives a wrong HTTP address, which should change from http://www.ntinternals.com to www.sysinternals.com, I think.


dahai April 24, 2000


My NT 4.0 server doesn't work too well. Thanks.

Royce Cornwallis Taylor December 27, 2001


Does NTFS provide the ability for "high water marking" of the disk surface when a file has been deleted? On an NT server, files are downloaded, used temporarily, then deleted. I would like to ensure that no disk recovery utility could be used to make a low-level read of the disk area previously occupied by the deleted file. Please notify me by e-mail if you have any suggestions.

patrickoneil January 23, 2002


I want to recover a hard drive partition from NTFS to FAT32. I could not find the required NTFS utility, or rather, I do not know which NTFS utility will help. May I have your opinion and the required utility by e-mail? I shall be obliged.

hameed March 07, 2002


The document ‘Inside NTFS’ was outstanding. I wanted to take some time to say THANKS for sharing the information.

Larry June 30, 2002


I have been reading up on the attributes of using NTFS. The question I have is, how do you reformat a drive for DOS or FAT32 if you don't like it?

Thanks,


Doug Holifield July 31, 2002


I want to change partitions from NTFS to FAT32.
Can you tell me of an NTFS utility for this change?
thanks

mohammad October 13, 2002


I much appreciate your translation about FAT (File allocation Table ) and NTFS ( New Technology File System)
The french translation of FAT was GRAISSE.
The remaining translation was perfect.

Lionel MARTIN November 17, 2002


I have a problem with NTFS: one of my important directories got the System property (after a reinstallation of WinXP). After some hard work, I managed to access, copy, and back up the files inside it with recovery programs, but its size was nearly 1.5 GB, so now I want to delete it and I couldn't find a way to do it...

May be you can help me?
Thanks.

Erkan ALTAN October 22, 2003


 Windows IT Pro is a Division of Penton Media Inc.
 © 2009 Penton Media, Inc. Terms of Use | Privacy Statement