NTFS Write Support GSoC - Work Summary

by coderTrevor | August 23, 2016

This is a detailed summary of the work I've performed during GSoC.

Highlights

  • Wrote numerous functions which allow for NTFS write-support.
  • Expanded ReactOS' NTFS driver with the ability to overwrite files and change a file's size.
  • Identified and repaired several bugs related to reading files from NTFS.
  • Fixed ReactOS' LargeMCB implementation, facilitating support for four file systems, NTFS included (See CORE-11002).
  • Diagnosed and fixed a regression using log files (See CORE-11707).

Code Submitted:

The following improvements have been submitted to my GSoC branch:

  • r71224 - Added minimal write support from CORE-10998 along with updates as suggested by CR-90. Supplemented by r71616.

    This code started its life as a Jira patch, CORE-10998. I posted that patch in the hope of being selected for GSoC.

    This commit adds only the most basic write support imaginable to ReactOS' NTFS driver. It only allows for overwriting the contents of an existing file. Only non-resident files (those larger than about 700 bytes) can be overwritten, and the file size can't be changed. This minimalism is by design.

    The following functions have been added:
    NtfsLockUserBuffer() - Utility function. Ensures a given IRP has an MDL address.
    NtfsWrite() - Handles IRP_MJ_WRITE I/O Request Packets for NTFS.
    NtfsWriteDisk() - Writes data to the disk.
    NtfsWriteFile() - Writes a file to the disk.
    WriteAttribute() - Writes data to an attribute. Files on NTFS consist of attributes, including an attribute for the file's data.

  • r71494 - Added some fool-proofing to the build script for Windows.

    I lost some development time early on because I needed additional instructions on how to use ReactOS' build script. Once I knew what I was doing wrong, I updated the script to ensure nobody else would stumble on it as I did.

  • r71660 - When writing to a file, increase the file size if trying to write past the end.

    This was a significant improvement over the minimal support added by r71224.

    Introduced functions:
    SetAttributeDataLength() - Allows for altering the data size of an attribute. Supplemented by r71677, r71820, r71832, r71942 and r71968.
    UpdateFileRecord() - Updates a file record in the master file table at a given index. Supplemented by r71920 and r72423.
    AddFixupArray() - Prepares a file record or directory index for writing to the disk. Supplemented by r71662.

  • r71664 - Update a file's size in the relevant $FILE_NAME attribute of the index entry in the parent directory.

    Another improvement. Introduced functions:
    UpdateFileNameRecord() - Searches a parent directory for the proper index entry, then updates the file sizes in that entry.
    UpdateIndexEntryFileNameSize() - Recursively searches directory index and applies the size update.

  • r71680 - Allow for an existing file to be opened with FILE_OVERWRITE, FILE_OVERWRITE_IF, or FILE_SUPERSEDE dispositions, and truncate that file.

    Another incremental improvement. This allows for a file to be opened and saved in ReactOS' Notepad.exe.

  • r71696 - Lays some groundwork for extending allocation size. Supplemented by r71857 and r71858.

    Another incremental improvement. Introduced functions:
    AddRun() - Adds a data run to a non-resident attribute (for increasing a file's size). First implemented in r71942 and supplemented by r71945 and r72422.
    GetLastClusterInDataRun() - Returns the LCN of the last cluster associated with an attribute. Used to prevent file fragmentation when assigning the next cluster.
    NtfsAllocateClusters() - Allocates a series of clusters on an NTFS volume. Supplemented by r71697.

  • r71820 - Add ability to write to resident attributes.

    Another incremental improvement. Introduced function:
    InternalSetResidentAttributeLength() - Sets the length of a resident attribute. Supplemented by r71837.

  • r71897 - Add error-checking to InternalGetNextAttribute(); don't crash if CurrAttr->Length is invalid.

    This fixed a problem with the existing driver that I would sometimes encounter during development, one that led to blue screens.

  • r71921 - Adds support functions.

    Another incremental improvement. Introduced functions:
    NtfsDumpDataRuns() and NtfsDumpDataRunData() - Provide diagnostic output.
    GetPackedByteCount() - Used to encode data runs.

  • r71942 - Implements AddRun() and adds support functions and documentation.

    A fairly significant improvement. Introduced functions:
    ConvertDataRunsToLargeMCB() and ConvertLargeMCBToDataRuns() - Convert NTFS data runs to and from Large memory control block structures (which I fixed earlier in the project). Supplemented by r72422.

  • r71968 - Adds FreeClusters().

    A fairly significant improvement. Introduced function:
    FreeClusters() - To free clusters on an NTFS volume. Supplemented by r72422.

  • r72424 - Adds NtfsDumpFileRecord().

    A small improvement. Introduced function:
    NtfsDumpFileRecord() - Provides diagnostic information about a file record.
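Several of the functions listed above (AddRun(), GetPackedByteCount(), and the LargeMCB conversion functions) revolve around NTFS data runs. The following is a minimal user-mode sketch of how a single run could be encoded, assuming the commonly documented layout: a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field, followed by a little-endian unsigned length and a little-endian signed offset relative to the previous run. The names packed_byte_count() and encode_run() are hypothetical and are not the driver's actual functions or signatures.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the driver's code): the smallest number of bytes,
 * from 1 to 8, that can hold `value`. Signed values must keep their sign bit.
 */
int packed_byte_count(int64_t value, int is_signed)
{
    int n = 1;

    if (is_signed) {
        /* n bytes can hold values in the range [-2^(8n-1), 2^(8n-1) - 1]. */
        while (n < 8) {
            int64_t limit = (int64_t)1 << (8 * n - 1);
            if (value >= -limit && value < limit)
                break;
            n++;
        }
    } else {
        uint64_t u = (uint64_t)value;
        while (n < 8 && (u >> (8 * n)) != 0)
            n++;
    }
    return n;
}

/*
 * Encode one data run into `out`, which needs room for up to 17 bytes
 * (1 header byte + 8 length bytes + 8 offset bytes).
 * length:    run length in clusters
 * lcn_delta: starting LCN relative to the previous run's starting LCN
 * Returns the number of bytes written.
 */
size_t encode_run(uint8_t *out, uint64_t length, int64_t lcn_delta)
{
    int len_bytes = packed_byte_count((int64_t)length, 0);
    int off_bytes = packed_byte_count(lcn_delta, 1);
    uint64_t raw = (uint64_t)lcn_delta;
    size_t pos = 0;
    int i;

    /* Header: high nibble = offset field size, low nibble = length field size. */
    out[pos++] = (uint8_t)((off_bytes << 4) | len_bytes);
    for (i = 0; i < len_bytes; i++)
        out[pos++] = (uint8_t)(length >> (8 * i));
    for (i = 0; i < off_bytes; i++)
        out[pos++] = (uint8_t)(raw >> (8 * i));
    return pos;
}
```

Under these assumptions, a run of 0x18 clusters that starts 0x5634 clusters after the previous run encodes as the four bytes 21 18 34 56. The signed case matters because a later run can sit at a lower LCN than the one before it, giving a negative offset.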

Contributions to Trunk:

The following improvements have already been incorporated into ReactOS by my mentor, Pierre Schweitzer, as a result of my work:

  • r71060, r71156 - Fix invalid read for data shared over two sectors.

    I found this bug while performing an automated test of WriteAttribute(). I realized later that I was allocating extra memory in my first patch.

  • r71155 - Don't leak memory in case of failures in NtfsReadDisk().

    I found this small issue while reviewing the existing code.

  • r71159 - Don't attempt to read a sparse run off the disk.

    I found this issue after adding sparse-run support to my automated tester.

  • r71229 - Don't fail too early in NtfsCreateFile() when a write/supersede operation is requested. This allows the appropriate error to be set in certain cases and makes the driver more consistent.

    My mentor, Pierre, and I found a piece of misplaced code here.

  • r71409 - Rewrote FsRtlGetNextBaseMcbEntry(), FsRtlLookupBaseMcbEntry(), and FsRtlNumberOfRunsInBaseMcb() using simpler logic. This finally fixes broken MCB handling in ReactOS and allows FSDs relying on MCB to properly work in ReactOS!

    I'm very proud of this one! As you can see from CORE-11002, fixing LargeMCBs not only allowed me to advance the NTFS driver, but it also contributed to ReiserFS, FFS, and ext2 support!!

  • r71869 - Fix ReadAttribute() and NtfsReadAttribute() in the case when an attribute contains two data runs; Remove extra if statement that prevents second data run from being read after it's decoded.

    Yet another couple of issues that I found and fixed through the power of automated testing.

  • r72067 - This fixes a regression introduced by the patch above. The regression was found by DopeFishJustin and reported on Jira in CORE-11707.

    Fixing this required only minimal back-and-forth between DopeFishJustin and me. This was the first time I've diagnosed and fixed code based solely on log files.

    The issue was caused by the fact that the function is being fed invalid data, and prior to r71869, it would fail more gracefully. I added additional error-checking and confirmed with DopeFishJustin that he no longer sees the regression. See CORE-11707 for more information.
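Several of the fixes above concern attributes whose data is spread over more than one data run, including sparse runs and invalid input. The following is a minimal user-mode sketch of decoding a run list with bounds checks, then mapping a VCN to an LCN across runs. It assumes the commonly documented run layout (header byte, little-endian length, signed little-endian offset relative to the previous run, a zero byte as terminator, and an offset size of zero for a sparse run). The Extent type and the functions decode_runs() and lookup_lcn() are hypothetical; this is not the code from r71869 or r72067.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one run; lcn is -1 for a sparse run. */
typedef struct {
    uint64_t vcn;     /* first virtual cluster covered by the run */
    int64_t  lcn;     /* first logical cluster, or -1 if sparse   */
    uint64_t length;  /* run length in clusters                   */
} Extent;

/*
 * Decode a run list. Returns the number of extents written to `out`,
 * or -1 if the list is truncated, malformed, or larger than `max_out`.
 */
int decode_runs(const uint8_t *runs, size_t size, Extent *out, int max_out)
{
    size_t pos = 0;
    uint64_t vcn = 0;
    int64_t lcn = 0;
    int count = 0;
    int i;

    while (pos < size && runs[pos] != 0) {
        int len_bytes = runs[pos] & 0x0F;
        int off_bytes = runs[pos] >> 4;
        uint64_t length = 0;
        pos++;

        /* Reject invalid field sizes and reads past the end of the buffer. */
        if (len_bytes == 0 || len_bytes > 8 || off_bytes > 8 ||
            size - pos < (size_t)(len_bytes + off_bytes) || count == max_out)
            return -1;

        for (i = 0; i < len_bytes; i++)
            length |= (uint64_t)runs[pos++] << (8 * i);

        if (off_bytes > 0) {
            uint64_t raw = 0;
            for (i = 0; i < off_bytes; i++)
                raw |= (uint64_t)runs[pos++] << (8 * i);
            /* Sign-extend the offset when its top bit is set. */
            if (off_bytes < 8 && (raw & ((uint64_t)1 << (8 * off_bytes - 1))))
                raw |= ~(uint64_t)0 << (8 * off_bytes);
            lcn += (int64_t)raw;
        }

        out[count].vcn = vcn;
        out[count].lcn = off_bytes ? lcn : -1;
        out[count].length = length;
        count++;
        vcn += length;
    }
    /* Running off the buffer without seeing the terminator is an error. */
    return pos < size ? count : -1;
}

/* Map a VCN to its LCN; returns -1 for sparse or unmapped clusters. */
int64_t lookup_lcn(const Extent *ext, int count, uint64_t vcn)
{
    int i;
    for (i = 0; i < count; i++) {
        if (vcn >= ext[i].vcn && vcn - ext[i].vcn < ext[i].length)
            return ext[i].lcn < 0 ? -1 : ext[i].lcn + (int64_t)(vcn - ext[i].vcn);
    }
    return -1;
}
```

A caller that gets -1 from lookup_lcn() for a sparse cluster would fill the buffer with zeros instead of reading the disk, which is the idea behind r71159.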

Bonus Code

During the course of this activity, I also wrote a considerable amount of code that was useful to me, but never intended to be committed. Only a minimal amount of time was spent writing these programs, so I wasn't distracted from writing driver code. As such, this bonus code has numerous errors: memory is often leaked; file handles are leaked; comments are missing, deceptive, or just plain wrong; things are done in an inconsistent manner; etc. For the most part, these programs can't be interacted with except at the level of their source code.

  • Tester (AKA "Disk_Test_1")

    This automated tester was absolutely crucial to my progress. Not only did it consistently find errors that would have been impossible to reproduce otherwise, it was the only way I could test many of my incremental improvements to write support. Since file creation is not yet supported, you have to first run the program in Windows, where it will create the files it needs.

  • "NTFSbrowse"

    I developed this program as a learning tool for myself. It was intended to be a stand-alone application that lets the user navigate the file structure of a live NTFS volume. In the end, I did not need to finish this feature. In its current incarnation, the program will give you a lot of information about a particular file record, identifying most of the individual fields in its HTML output.

  • "DumpDrive"

    This program is designed to create images of an NTFS partition, then compare these images. If it's able, the tool will keep track of every file that's changed, and display information related to the change.

  • "BuildRosFast.ahk"

    I wrote this AutoHotkey script during the community bonding period and assigned it to a custom toolbar button in Visual Studio. When I click this button, it starts building a livecd of ReactOS, kills any sessions of WinDbg, reboots a VM, launches WinDbg, launches ReactOS with debugging enabled, and finally adds whatever breakpoints I'm interested in before clicking the two buttons a livecd prompts you with before you reach the desktop. This script is pretty crude (it would be better if it paid attention to the build results) but at the same time, it's an extremely powerful time saver!

What's working / What can be used now

Every change mentioned under "Contributions to Trunk" has already found its way into the ReactOS operating system. These changes can be used by simply running a livecd of the latest version of ReactOS and reading an NTFS volume. You can also use my tester; the read tests stress the parts of the driver that used to be broken.

Overwriting a file is mostly finished and works as an end-user would expect, with the caveats I'm aware of listed below. There are two ways to test. For the first, build my branch with the usual method. In a virtual machine, mount an NTFS volume in Windows, preferably Windows Server 2003. Create a text file on the volume. To test writing to resident files, you can leave this blank; to test writing to non-resident files, add at least 800 bytes of text, then save the file.

Reboot into ReactOS (built with my GSoC branch). Open the file with Notepad.exe and change the contents. If testing resident files, make sure you add less than about 600 bytes of data (migrating resident attributes to non-resident isn't implemented at present). If testing non-resident files, there isn't such a restriction. Save the file. Reboot into Windows. You should be able to observe that your file has been updated from within ReactOS.
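If typing 800 bytes into a text editor by hand is tedious, the test files for the manual method can be generated instead. The following is a minimal sketch of a helper that could be compiled and run in Windows to create a text file of an exact size. The function name make_test_file() and any file names passed to it are hypothetical and are not part of my tester.

```c
#include <stdio.h>

/*
 * Hypothetical helper: create a text file of exactly `size` bytes,
 * filled with lowercase letters and a newline as every 64th byte.
 * Returns 0 on success, -1 on failure.
 */
int make_test_file(const char *path, size_t size)
{
    FILE *f = fopen(path, "wb");
    size_t i;

    if (f == NULL)
        return -1;

    for (i = 0; i < size; i++) {
        int c = (i % 64 == 63) ? '\n' : 'a' + (int)(i % 26);
        if (fputc(c, f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling make_test_file() with a size of 0 gives an empty file for the resident test, and a size of 800 gives a file large enough for the non-resident test described above.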

[Screenshot: Overwriting a file]

Alternate testing method: Use my automated tester, Disk_Test_1. The principle is the same. You need to copy the tester onto an NTFS drive from within Windows, and run the program once. Then, you can run the tester from within ReactOS.

[Screenshot: NTFS Tester]

What's left to be done

There's still considerable work to be done before the driver is ready for the end-user. Immediate goals include:

As suggested by Thomas Faber, write support should be disabled by default, and enabled via the registry. This will prevent end-users who misunderstand the experimental nature of the driver from losing data.

For overwriting a file:

  • Support changing a file's [allocation] size via IRP_MJ_SET_INFORMATION.
  • Support migrating a resident file to non-resident, when it grows too large.
  • Add support for creating $ATTRIBUTE_LIST (very rarely needed).

For creating a new file:

  • Commit code to create a new file record (this is working but I didn't have the commit ready in time for the end of GSoC).
  • The file must have an index entry added to the parent directory's index.

From there, creating directories, deleting and renaming files and directories would be next, making the driver fairly usable. However, before the driver can be considered "feature complete," considerable development is still needed beyond even this point. Off the top of my head and in no particular order: we need to make sure we support caching, compression, logging, async IO, memory-mapped files, installing to and booting from NTFS, etc.

Looking Forward

Thank you to my mentors, ReactOS, and Google for this wonderful opportunity!!

I definitely plan to keep contributing to ReactOS! As of right now, the last commit made to my branch by the GSoC deadline is r72424, and newer commits should be considered to be made after GSoC has ended.