
Using TDD to test file system operations

I will refer to the SUT throughout this article, meaning "System Under Test". In this case it will be the synchronize functionality in the FileSystemSynchronizer class.

A friend of mine asked me how I would unit test file system operations. The answer is that System.IO is not very test friendly, so you will have to implement wrappers around it. This is not very hard, only time consuming, and it gives you an additional layer of complexity. I would do this if I had a lot of file operations that needed testing, or if those file operations were important enough.

First we need a purpose. Let's say that we're going to build an application that's going to synchronize files in two file folders by pushing changes from source to target.

public class FileSystemSynchronizer
{
    public void Synchronize(string source, string destination)
    {
    }
}

Before we do anything else we should ask ourselves what we need this routine to accomplish. We do this by defining tests.

public class FileSystemSynchronizerSpecification
{
    [Fact]
    public void ShouldCreateTargetFolderIfDoesNotExist()
    {
    }

    [Fact]
    public void ShouldCopyAnyFilesFromSourceFolderToTargetFolder()
    {
    }

    [Fact]
    public void ShouldCopyAnyDirectoriesFromSourceFolderToTargetFolder()
    {
    }

    [Fact]
    public void ShouldCopyFilesAndFoldersRecursivlyFromSourceToTargetFolder()
    {
    }

    [Fact]
    public void ShouldNotCopySourceFileIfSameFileExistInTargetFolder()
    {
    }

    [Fact]
    public void ShouldCopySourceFileIfNewerThanFileInTargetFolder()
    {
    }

    [Fact]
    public void ShouldRemoveFilesFromTargetFolderNotPresentInSourceFolder()
    {
    }

}

What if we just start implementing these tests and let the wrapper situation resolve itself?

[Fact]
public void ShouldCreateTargetFolderIfDoesNotExist()
{
    const string SourcePath = @"C:\FolderA";
    const string TargetPath = @"C:\FolderB";

    /* Setup */
    var factory = MockRepository.GenerateMock<IFileSystemFactory>();
    var sut = new FileSystemSynchronizer(factory);

    /* Arrange */
    factory.Expect(f => f.PathExists(TargetPath)).Return(false);

    /* Test */
    sut.Synchronize(SourcePath, TargetPath);

    /* Assert */
    factory.AssertWasCalled(f => f.MakeDirectory(TargetPath));
}

Do you find it hard to write tests before writing the code? Well, it is meant to be hard, because you should consider why you write the code in the first place.

I've created something I call the IFileSystemFactory, an abstraction that knows how to check whether a path exists and how to create directories. In this test I set up PathExists to return false, and then I verify that a directory is created.
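The test only pins down two members on IFileSystemFactory. A minimal sketch of the interface at this stage, inferred from the calls in the test rather than copied from the finished sample, could look like this (the parameter names are my own assumption):

```csharp
// Sketch only: members inferred from the test above,
// not taken from the finished sample.
public interface IFileSystemFactory
{
    // True when a directory already exists at the given path.
    bool PathExists(string path);

    // Creates a directory at the given path.
    void MakeDirectory(string path);
}
```

The interface stays this small on purpose. New members only show up when a test asks for them.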

As you run this test it will turn red, but as you write the implementation it should turn green.

public class FileSystemSynchronizer
{
    private readonly IFileSystemFactory fsFactory;

    public FileSystemSynchronizer(IFileSystemFactory fsFactory)
    {
        this.fsFactory = fsFactory;
    }

    public void Synchronize(string sourcePath, string targetPath)
    {
        if (!fsFactory.PathExists(targetPath))
        {
            fsFactory.MakeDirectory(targetPath);
        }
    }
}


The last step is to refactor, but I think I will wait until I have something to refactor. This code is still quite simple and clean.

Let's implement that next test.

[Fact]
public void ShouldCopyAnyFilesFromSourceFolderToTargetFolder()
{
    /* Setup */
    var factory = MockRepository.GenerateMock<IFileSystemFactory>();
    var sut = new FileSystemSynchronizer(factory);

    // The stub stands in for the source directory, since it is
    // what GetDirectory(SourcePath) returns below.
    var sourceDirectory = MockRepository.GenerateStub<IDirectory>();

    var file1 = MockRepository.GenerateMock<IFile>();
    file1.Name = "first.txt";

    var file2 = MockRepository.GenerateMock<IFile>();
    file2.Name = "second.txt";

    var fileList = new[] { file1, file2 };

    /* Arrange */
    factory.Expect(f => f.PathExists(TargetPath)).Return(true);
    factory.Expect(f => f.GetDirectory(SourcePath)).Return(sourceDirectory);
    sourceDirectory.Files = fileList;

    /* Test */
    sut.Synchronize(SourcePath, TargetPath);

    /* Assert */
    foreach (var file in fileList)
    {
        file.AssertWasCalled(f => f.CopyTo(TargetPath));
    }
}

And the SUT to make it green.

public class FileSystemSynchronizer
{
    private readonly IFileSystemFactory fsFactory;

    public FileSystemSynchronizer(IFileSystemFactory fsFactory)
    {
        this.fsFactory = fsFactory;
    }

    public void Synchronize(string sourcePath, string targetPath)
    {
        if (!fsFactory.PathExists(targetPath))
        {
            fsFactory.MakeDirectory(targetPath);
        }

        foreach (var file in fsFactory.GetDirectory(sourcePath).Files)
        {
            file.CopyTo(targetPath);
        }
    }
}

And we're green, but .... THIS IS CRAP!

Overspecification is the hell of unit testing

What I did just now was not testing the function, but specifying the internals of the function. This is very dangerous, because I can't refactor without changing my tests. You should always try to test only the public API and not bother with the internals. Instead, look at the output after the SUT has run.

That means that we'll have to rethink and refactor our tests.

Let's create a virtual simulation of our file system instead, and fill it with files. Our virtual directory as source should be replicated into our virtual target path. This means a bit more implementation in the test, but we can limit the testing to Input/Output without overspecifying internals.

public class VirtualFileSystemFactory : IFileSystemFactory
{
    public static readonly IDirectory FolderA;
    public static readonly IDirectory FolderB;

    private const string FolderAPath = @"C:\FolderA";
    private const string FolderBPath = @"C:\FolderB";

    private readonly IDictionary<string, IDirectory> fileSystem = new Dictionary<string, IDirectory>
        {
            { FolderAPath, FolderA },
            { FolderBPath, FolderB },
        };

    static VirtualFileSystemFactory()
    {
        FolderA = new VirtualFolder(FolderAPath);
        FolderB = new VirtualFolder(FolderBPath);
    }

    public bool PathExists(string targetPath)
    {
        return fileSystem.ContainsKey(targetPath);
    }

    public void MakeDirectory(string targetPath)
    {
        fileSystem.Add(targetPath, new VirtualFolder(targetPath));
    }

    public IDirectory GetDirectory(string sourcePath)
    {
        if (!this.PathExists(sourcePath))
        {
            throw new DirectoryNotFoundException("Folder was not registered in VirtualFileSystemFactory: " + sourcePath);
        }

        return fileSystem[sourcePath];
    }

    public void Copy(IFile file, string targetPath)
    {
        this.GetDirectory(targetPath).Add(file);
    }
}

public class VirtualFile : IFile
{
    public VirtualFile(string name)
    {
        Name = name;
    }

    public string Name { get; set; }
}

public class VirtualFolder : IDirectory, IEnumerable<IFileSystemItem>
{
    private readonly string path;
    private readonly IList<IFileSystemItem> items;

    public VirtualFolder(string path)
    {
        this.path = path;
        items = new List<IFileSystemItem>();
    }

    public IEnumerable<IFile> Files
    {
        get { return items.Where(f => f is IFile).Cast<IFile>(); }
    }

    public void Add(IFile file)
    {
        items.Add(file);
    }

    public IEnumerator<IFileSystemItem> GetEnumerator()
    {
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

That is a lot of code, but it is testing code. This will actually give us the power to not overspecify our tests, but work with the results of the method we're testing.

Look how this beautified the tests that were previously a mocking hell.

[Fact]
public void ShouldCreateTargetFolderIfDoesNotExist()
{
    const string UnknownTargetPath = @"C:\FolderC";

    /* Setup */
    var factory = new VirtualFileSystemFactory();
    var sut = new FileSystemSynchronizer(factory);

    /* Test */
    sut.Synchronize(SourcePath, UnknownTargetPath);

    /* Assert */
    Assert.True(factory.PathExists(UnknownTargetPath), "Target path should be created if it does not exist");
}

[Fact]
public void ShouldCopyAnyFilesFromSourceFolderToTargetFolder()
{
    /* Setup */
    var factory = new VirtualFileSystemFactory();
    var sut = new FileSystemSynchronizer(factory);

    // Create files in source folder
    var sourceFolder = factory.GetDirectory(SourcePath);
    sourceFolder.Add(new VirtualFile("first.txt"));
    sourceFolder.Add(new VirtualFile("second.txt"));

    /* Test */
    sut.Synchronize(SourcePath, TargetPath);

    /* Assert */
    var targetFolder = factory.GetDirectory(TargetPath);
    foreach (var file in sourceFolder.Files)
    {
        Assert.Contains(file, targetFolder.Files);
    }
}

These tests are great, because they won't break when we refactor our SUT. They are great because they are readable and you don't have to be Ayende to figure out how the mocking works. 

What about wrapping the file system?

If I finish writing my tests and implementing my system, I will end up with a file system wrapping that looks like this.

[Figure: model of a virtual file system wrapper]

We did wrap the file system. The wrapping layer grew forth from what my tests needed. This means that it would probably not look the same if we had a different problem to solve. Then the wrapping layer would be suited for that problem instead.
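On the production side, the same members end up delegating to System.IO. Here is a rough sketch of what that might look like for the two members from the first test. The class name PhysicalFileSystemFactory is hypothetical and not from the downloadable sample, and I've left out the interface declaration so the snippet compiles on its own.

```csharp
using System.IO;

// Hypothetical production counterpart to VirtualFileSystemFactory.
// In the real project this would implement IFileSystemFactory; the
// interface is omitted here to keep the snippet self-contained.
public class PhysicalFileSystemFactory
{
    // Delegates straight to System.IO; true only for existing directories.
    public bool PathExists(string path)
    {
        return Directory.Exists(path);
    }

    // Creates the directory, including any missing parent directories.
    public void MakeDirectory(string path)
    {
        Directory.CreateDirectory(path);
    }
}
```

A class like this is thin enough that it does not need unit tests of its own. All the logic worth testing lives in the synchronizer, which only sees the abstraction.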

Starting from the problem description and letting the API grow from our tests gave us a wrapping layer that both looks and feels natural to the problem at hand. I could never have anticipated this design. It has to be hand grown, and it has to be done with TDD.

You can download the complete sample from here. Do I need to mention that it worked flawlessly on the first run?
