If you recall a while back when I was demonstrating some Functional Data Structures, I mentioned that some of the functions were not tail recursive, and that this was something we would probably want to address. Which raises the question: how exactly do we go about making a function tail-recursive? I am going to attempt to answer that question here.

One of the first problems with creating a tail-recursive function is figuring out whether a function is tail recursive in the first place. Sadly this isn’t always obvious. There has been some discussion about generating a compiler warning if a function is not tail recursive, which sounds like a dandy idea since the compiler already knows how to optimize tail-recursive functions for us. But we don’t have that yet, so we’re going to have to try to figure it out on our own. So here are some things to look for:

  1. When is the recursive call made? And more importantly, is there anything that happens after it? Even something simple like adding a number to the result of the call can keep a function from being tail-recursive (there are a couple of quick sketches of this right after the list)
  2. Are there multiple recursive calls? This sort of thing happens quite a bit when processing tree-like data structures. If you need to apply a function recursively to two sub-sets of elements and then combine the results, chances are the function is not tail-recursive
  3. Is there any exception handling in the body of the function? This includes use and using declarations. Since there are multiple possible return paths, the compiler can’t turn the recursive call into a tail call.
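
To make the first and third points a bit more concrete, here are a couple of quick sketches (my own, not from the original post) of recursive calls that are not tail calls. In the first, the addition has to happen after the call returns; in the second, control has to come back to run the exception handler, so the stack frame can’t be thrown away:

let rec length items =
    match items with
    | [] -> 0
    | _::rest -> 1 + (length rest)

let rec countDown n =
    try
        if n > 0 then countDown (n - 1)
    with
    | _ -> ()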

Now that we have a chance of identifying non-tail-recursive functions, let's take a look at how to make a function tail-recursive. There may be cases where it's not possible, for various reasons, to make a function tail-recursive, but it is worthwhile trying to make sure recursive functions are tail-recursive, because a StackOverflowException cannot be caught and will cause the process to exit, regardless (yes, I know this from previous experience).

Accumulators

One of the primary ways of making a function tail-recursive is to provide some kind of accumulator as one of the function parameters, so that you can build the final result in the accumulator and simply return it when the recursion is complete. A very simple example of this would be creating a sum function over a list of ints. A simple non-tail-recursive version of this function might look like:

let rec sum (items:int list) =
    match items with
    | [] -> 0
    | i::rest -> i + (sum rest)

And making it tail-recursive by using an accumulator would look like this:

let rec sum (items:int list) acc =
    match items with
    | [] -> acc
    | i::rest -> sum rest (i+acc)
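
One side effect of this approach is that callers now have to supply the starting value for the accumulator (sum [1;2;3] 0). A common idiom, shown here as a quick sketch rather than part of the original example, is to hide the accumulator behind an inner helper so the public signature stays the same:

let sum (items:int list) =
    let rec loop items acc =
        match items with
        | [] -> acc
        | i::rest -> loop rest (i + acc)
    loop items 0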

 

Continuations

If you can’t use an accumulator as part of your function, another (reasonably) simple approach is to use a continuation function. Basically the approach here is to take the work you would be doing after the recursive call and put it into a function that gets passed along and executed when the recursive call is complete. As an example, we’re going to use the insert function from my Functional Data Structures post. Here is the original function:

let rec insert value tree =
    match (value,tree) with
    | (_,Empty) -> Tree(Empty,value,Empty)
    | (v,Tree(a,y,b)) as s ->
        if v < y then
            Tree(insert v a,y,b)
        elif v > y then
            Tree(a,y,insert v b)
        else snd s

This is slightly more tricky since we need to build a new tree with the result, and the position of the result will also vary. So let's add a continuation function to this and see what changes:

let rec insert value tree cont =
    match (value,tree) with
    | (_,Empty) -> Tree(Empty,value,Empty) |> cont
    | (v,Tree(a,y,b)) as s ->
        if v < y then
            insert v a (fun t -> Tree(t,y,b) |> cont)
        elif v > y then
            insert v b (fun t -> Tree(a,y,t) |> cont)
        else snd s |> cont

For the initial call of this function you’ll want to pass in the built-in id function, which just returns whatever you pass to it. As you can see the function is a little more involved, but still reasonably easy to follow. The key is to make sure the new continuation you build applies the continuation you were given to the rebuilt tree; if you instead apply it to the result of the recursive call after that call returns, the call is no longer in tail position and things will fall apart pretty quickly.
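
To make the initial call concrete, here is a small sketch (the existingTree and insert' names are mine, not from the original post) showing id being passed as the starting continuation, and a wrapper that hides the extra parameter from callers:

let newTree = insert 42 existingTree id

let insert' value tree = insert value tree id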

These two techniques are the primary means of converting a non-tail-recursive function to a tail-recursive function. There is also a more generalized technique known as a “trampoline” which can also be used to eliminate the accumulation of stack frames (among other things). I’ll leave that as a topic for another day, though.

Another thing worth pointing out is that the built-in fold functions available in the F# standard library are already tail-recursive. So if you make use of fold you don’t have to worry about how to make your function tail recursive. Yet another reason to make fold your go-to function.

Let’s say that you’ve been working hard on this really awesome data structure. It’s fast, it’s space efficient, it’s immutable, it’s everything anyone could dream of in a data structure. But you only have time to implement one function for processing the data in your new miracle structure, so what would it be?

Ok, not a terribly realistic scenario, but bear with me here, there is a point to this. The answer to this question, of course, is that you would implement fold. Why, you might ask? Because if you have a fold implementation then it is possible to implement just about any other function you want in terms of fold. Don’t believe me? Well, I’ll show you, and in showing you I’ll also demonstrate how finding the right abstraction in a functional language can reduce the size and complexity of your codebase in amazing ways.

Now, to get started, let’s take a look at what exactly the fold function is:

val fold : folder:('State -> 'T -> 'State) -> state:'State -> list:'T list -> 'State

In simple terms it iterates over the items in the structure, applying a function to each element along with an accumulator; the function processes the element and returns an updated accumulator. Ok, maybe that didn’t come through quite as simply as I would have hoped. So how about we start with a pretty straight-forward example: sum.

 
let sum (list:int list) = List.fold (fun s i -> s + i) 0 list

Here we are folding over a list of integers, but in theory the data structure could be just about anything. Each item in the list gets added to the previous total. The first item is added with the value passed in to the fold, so for items [1;2;3] we start by adding 1 to 0, then 2 to 1, then 3 to 3, the result is 6. We could even get kinda crazy with the point-free style and use the fact that the + operator is a function which takes two arguments, and returns a third…which happens to exactly match our folding function.

let sum (list:int list) = List.fold (+) 0 list

So that’s pretty cool right? Now it seems like you could also very easily create a Max function for your structure by using the built-in max operator, or a Min function using the min operator.

let max (list:int list) = List.fold (max) (Int32.MinValue) list 
let min (list:int list) = List.fold (min) (Int32.MaxValue) list

But I did say that you could create any other processing function, right? So how about something a little trickier, like Map? It may not be quite as obvious, but the implementation is actually just as simple. First let’s take a quick look at the signature of the map function to refresh our memories:

val map : mapping:('T -> 'U) -> list:'T list -> 'U list

So how do we implement that in terms of fold? Again, we’ll use List because it’s simple enough to see what goes on internally:

let map (mapping:'a -> 'b) (list:'a list) = List.fold (fun l i -> mapping i::l) [] list

Pretty cool right? Use the Cons operator (::) and a mapping function with an initial value of an empty list. One caveat worth mentioning: building the result with cons like this produces the list in reverse order, so a real implementation would pipe the result through List.rev (or use List.foldBack instead). So that’s pretty fun, how about another classic like filter? Also pretty similar:

let filter (pred:'a -> bool) (list:'a list) = List.fold (fun l i -> if pred i then i::l else l) [] list

Now we’re on a roll, so how about the choose function (like map, only returns an Option and any None value gets left out)? No problem.

let choose (chooser:'a -> 'b option) (list:'a list) = List.fold (fun l i -> match chooser i with | Some i -> i::l | _ -> l) [] list

Ok, so now how about toMap?

let toMap (list:('a*'b) list) = List.fold (fun m (k,v) -> Map.add k v m) Map.empty list

And collect (collapsing a list of lists into a single list)?

let collect (list:'a list list) = List.fold (fun l li -> List.fold (fun l' i' -> i'::l') l li) [] list

In this case we’re nesting a fold inside a fold, but it still works. And now, just for fun, here are exists, tryFind, findIndex, and a few others:

let exists (pred:'a -> bool) (list:'a list) = List.fold (fun b i -> pred i || b) false list
let tryFind (pred:'a -> bool) (list:'a list) = List.fold (fun o i -> if pred i then Some i else o) None list
let findIndex (pred:'a -> bool) (list:'a list) = List.fold (fun (idx,o) i -> if pred i && Option.isNone o then (idx + 1,Some idx) else (idx + 1,o)) (0,None) list |> snd |> Option.get
let forall (pred:'a -> bool) (list:'a list) = List.fold (fun b i -> pred i && b) true list
let iter (f:'a -> unit) (list:'a list) = List.fold (fun _ i -> f i) () list
let length (list:'a list) = List.fold (fun c _ -> c + 1) 0 list
let partition (pred:'a -> bool) (list:'a list) = List.fold (fun (t,f) i -> if pred i then i::t,f else t,i::f) ([],[]) list

It’s worth pointing out that some of these aren’t the most efficient implementations. For example, exists, tryFind and findIndex would ideally have some short-circuit behavior so that once the item is found the list isn’t traversed any more. And then there are things like rev, sort, etc. which could be built in terms of fold, I guess, but the simpler and more efficient implementations would be done using plain recursive processing. I can’t help but find the simplicity of the fold abstraction very appealing, it makes me ever so slightly giddy (strange, I know).
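
For comparison, here is a quick sketch (mine, not from the original post) of the kind of plain recursion just mentioned: this version of exists stops walking the list as soon as the predicate matches, which the fold version can’t do:

let rec exists (pred:'a -> bool) (list:'a list) =
    match list with
    | [] -> false
    | i::rest -> pred i || exists pred rest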

So here we are at part 2 in the series of posts looking at Functional Data Structures from the book of the same name by Chris Okasaki. Last time we looked at what is perhaps the simplest of the functional data structures, the List (also useful as a LIFO stack).  Up next we’ll continue in the order that Chris Okasaki used in his book, and take a look at implementing a Set using a Binary Tree.

Diving right in, here is an implementation of a Set using a binary tree in F#:

module Set

    type Tree<'a when 'a:comparison> =
        | Empty
        | Tree of Tree<'a>*'a*Tree<'a> 

    let rec isMember value tree =
        match (value,tree) with
        | (_,Empty) -> false
        | (x,Tree(a,y,b)) ->
            if value < y then
                isMember x a
            elif value > y then
                isMember x b
            else
                true

    let rec insert value tree = 
        match (value,tree) with
        | (_,Empty) -> Tree(Empty,value,Empty)
        | (v,Tree(a,y,b)) as s -> 
            if v < y then
                Tree(insert v a,y,b)
            elif v > y then
                Tree(a,y,insert v b)
            else
                snd s

This is pretty simple; like the List, we’re working with a Discriminated Union, this time with an Empty case, and a Tree case that is implemented using a 3-tuple (threeple?) with a Tree, an element, and a Tree. There is a constraint on the elements that ensures they are comparable, since this is going to be an ordered tree.

We only have two functions here: isMember, which says whether or not the element exists in the set, and insert, which adds a new element. If you look at the isMember function, it’s not too difficult, a recursive search of the tree attempting to find the element. Since this is a sorted tree, each iteration compares the element being searched for with the element in the current node of the tree. If it’s less than the current node we follow the left-hand side of the tree, otherwise we follow the right-hand side. If we find an empty tree, the element doesn’t exist. Insert is a little more difficult…it’s recursive like isMember, but it is also copying some of the paths. The bits that are copied are the bits that are not being traversed, so in reality the majority of the tree returned from the insert function is actually shared with the source tree, its root is just new. Take a hard look at that for a moment, and see if the pain begins to subside…then we’ll look at the C# version.
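
Before the C#, here is a small sketch (the t1/t2 bindings are mine, not from the original post) that makes the sharing visible: inserting 9 only rebuilds the nodes along the path it walks, so the left subtree of the new root is literally the same object as the old one:

let t1 = insert 8 (insert 3 (insert 5 Empty))
let t2 = insert 9 t1

let sharesLeftSubtree =
    match t1, t2 with
    | Tree(l1,_,_), Tree(l2,_,_) -> obj.ReferenceEquals(l1, l2)  // true
    | _ -> false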

public static class Tree
{
    public static EmptyTree<T> Empty<T>() where T: IComparable
    {
        return new EmptyTree<T>();
    }
}

public class EmptyTree<T> : Tree<T> where T: IComparable
{
    public override bool IsEmpty { get { return true; }}
}

public class Tree<T> where T: IComparable
{
    public Tree<T> LeftSubtree { get; internal set; }
    public Tree<T> RightSubtree { get; internal set; }
    public T Element { get; internal set; }
    public virtual bool IsEmpty
    {
        get { return false; }
    }
}

public static class Set
{
    public static bool IsMember<T>(T element, Tree<T> tree) where T: IComparable
    {            
        if (tree.IsEmpty)
            return false;
        var currentElement = tree.Element;
        var currentTree = tree;
        while(!currentTree.IsEmpty)
        {
            if (element.CompareTo(currentElement) == 0)
                return true;
            if (element.CompareTo(currentElement) > 0)
            {
                currentTree = currentTree.RightSubtree;
            }
            else
            {
                currentTree = currentTree.LeftSubtree;
            }
            currentElement = currentTree.Element;
        }
        return false;
    }

    public static Tree<T> Insert<T>(T element, Tree<T> tree) where T: IComparable
    {
        if (tree.IsEmpty)
            return new Tree<T> { LeftSubtree = Tree.Empty<T>(), Element = element, 
                                 RightSubtree = Tree.Empty<T>() };
        var comparison = element.CompareTo(tree.Element);
        if (comparison == 0)
            return tree;
        if (comparison < 0)
            return new Tree<T> { LeftSubtree = Set.Insert<T>(element, tree.LeftSubtree), Element = tree.Element, 
                                 RightSubtree = tree.RightSubtree };
        return new Tree<T> { LeftSubtree = tree.LeftSubtree, Element = tree.Element, 
                             RightSubtree = Set.Insert<T>(element, tree.RightSubtree) };
    }
}

This is a reasonable chunk of code, so let's work it from the top down. We start off by defining the Tree data structure. We use inheritance in this case to make an Empty tree, since we don’t have Discriminated Unions in C# (if I were a good person I would update that right now to return a singleton of the EmptyTree class, but alas, I’m lazy). The static Tree class provides the convenience method for creating the empty tree, and the Tree<T> type is our parameterized tree.

The methods in the Set class do the work of checking for an existing member in the set, and inserting a new member in the set.  I took the opportunity to convert the recursive isMember function to a looping construct in C# (which is what the F# compiler will do for you).  This is not really possible with the Insert method because it is not tail recursive.  The logic is the same in both versions, but the C# version is a bit more verbose (though having LeftSubtree and RightSubtree makes things a little clearer in my opinion).  Again, the biggest difference between the two is the amount of code (since we don’t have Discriminated Unions and Pattern Matching in C# land)

Summing Up Persistent Structures

Interestingly this is where the first section of Okasaki’s book ends (it’s actually chapter 2, but chapter 1 is more of a foundational thing…no code).  These two implementations show the basic ideas behind what are described as “Persistent” data structures…meaning bits of the structures are re-used when creating new structures as part of an operation that would mutate the original structure in a non-functional (mutable) data structure.  In the case of a List/Stack we are referencing the old list as the “Tail” of the new list, so each time we add a new item we are simply allocating space for the new item.  In the case of the Tree/Set we create a new root tree on Add, and then reference all paths except for the new node that gets added (or, if the item already exists, we just have the new root…this is actually something Okasaki suggests the reader should solve as an additional exercise).  These concepts are fundamental to the more complex data structures that follow, and present the basic ideas that are employed to make the structures efficient within the context of functional programming.

Up next in the book is a look at how more traditional data structures, such as heaps and queues, can be converted to a more functional setting.  Expect more goodness in the area, but I would also like to revisit some of the basics here.  The more observant readers may have noticed that the majority of the functions used on these simple types were not Tail Recursive, which means the compiler and JIT cannot optimize them, which ultimately means they are going to cause your stack to blow up if you’re dealing with large structures.  It might be worth exploring how to go about converting these to make them Tail Recursive.

As you may have guessed from the title, I’ve started doing some work with F#.  Initially I was somewhat reluctant to go down the F# path because some of the more interesting aspects of the other functional languages I’ve been exploring are not present…specifically the type systems behind Scala and Haskell, the laziness of Haskell, and the concurrent programming model of Erlang.  In spite of these perceived downfalls, there were some definite plusses, namely interoperability with everything .Net, immutability by default, and the wonderfully concise programming model of a functional language.

So with these benefits in mind I set about figuring out what F# was all about.  The language itself is based strongly on OCaml, and I’ve not had any experience with OCaml, so I was unsure what to expect.  I decided to find a book on the subject, and I wish I could tell you for sure which one it was, but it was long ago, and for some reason when I look at all of the F# books on Safari none of them seem to fit the bill…The closest seems to be Expert F# 2.0, so we’ll assume that one was it for now.  Regardless, I read the entire thing over the course of about 3 days (started on a Friday, and had made my way through by Sunday).  I didn’t go through any exercises, or really try to write any code along the way, since I really just wanted to figure out what the language was all about.  I should point out that I’ve tried at least once before to make my way through an F# book, and didn’t have much luck…this time round it was smmoooooth.  I think the biggest reason was that I already had a pretty solid grasp of the core concepts in functional languages.  Things like functional composition, pattern matching, and working with immutable data types are central to just about every functional language, and F# is no different, so my learning experience was really just a matter of mapping those concepts onto the correct syntactic elements in my head.  By the time it was all over I felt pretty comfortable with the basics of the language.

Shortly after reading the book I decided to actually try writing something real and useful…this proved to be a bit more of a challenge.  There are a few reasons for this…a big part was that organizing a functional project is different than organizing an OO project.  This was complicated by the fact that the first task I set myself on was re-writing something I had in C# in F#.  This was supposed to be more than just a syntactic translation, but also an attempt to test my hunch that the problem being solved was effectively a functional problem, and so would lend itself well to a real functional language.  The problem was I was used to thinking about the problem in terms of the classes I had already created, and in F# those concepts were not there.  Before long, though, I had adjusted my thinking, and the more time I spent working on the problem the more I found myself enjoying F#.  After that initial experience (which was mostly academic, in that it was not intended to go “live”) I found myself wanting to explore more with the language, and so I’ve been looking for reasons to use it.  I’m not going to go into all of the ways I’ve managed that here, but I did want to share some observations:

  • My initial reluctance based on the perceived drawbacks was largely my own naivety.  While it is true that there are no higher-kinded types, and therefore no type constructors, this does not make the programming experience that much worse.  Granted there are some kinds of things that will be duplicated, which folks using Haskell would be able to do away with by harnessing the power of the type system, but this does not make F# useless by any stretch.  As a matter of fact F# exposes some capabilities of the CLR that C# does not, including being able to specify wildcard types, which allow you to say “I have a parameterized type, but I don’t care about the specific type of the parameter”, and even some Structural Typing, which provides a way to constrain types by specifying the methods those types should have (there’s a small sketch of this right after the list).
  • The let construct is deceptively simple when you first encounter it.  Initially it seems like just a way to specify a variable or function name…it becomes interesting though when you realize that the fact that there is a single construct for both means that the two are effectively the same thing. Combine with this the fact that they can be nested, and you have an extremely versatile construct.  I assume this comes directly from the OCaml heritage of F#
  • Pattern matching is just awesome.
  • Working with Object Oriented concepts is jarring, and feels….awkward.  I have no proof, but I can’t help but think this is intentional. While F# is not a “pure” language like Haskell, it still tries to be “functional by default”.  The standard types that you work with all the time, like tuples and lists, are immutable, as are the let bindings.  You have to be specific if you want the mutable versions of any of these.  I can’t help but think the fact that it is easier (or should I say more natural) to work with pure functional types and immutable data structures is a design feature of the language.
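
As a quick illustration of that structural typing point, here is a small sketch (mine, not from the original post) using a statically resolved type parameter: lengthOf accepts any type that exposes a Length property, with no shared interface required:

let inline lengthOf (x: ^a) : int = (^a : (member Length : int) x)

lengthOf "hello"                              // 5
lengthOf (System.Text.StringBuilder("abc"))   // 3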

The biggest problem I have with F# at this point is that it is clear that it is still a second-class citizen in the VisualStudio world.  While it shipped with VS 2010, a lot of the other tooling doesn’t support it.  Things like the built-in analysis tools just don’t work.  Even the syntax highlighting is less impressive than C#.  There is also the fact that there are no built-in refactorings for F#.  Even third-party tools like Resharper and CodeRush don’t have support.  This is really sad, since the language itself is really a joy to work with.  There is still a perception that it is largely academic, and you can’t do any real work in it.  This is unfortunate, since in our normal day-to-day programming lives there are some problems that are just functional in nature.  In general, functional programming is all about asking questions and getting answers.  Contrast this with OO, which stresses a “Tell don’t ask” paradigm.  If you divide your application into sections which are suited to “telling” vs “asking” then you may find that you can write certain parts functionally very easily, and others OO equally easy.  Wouldn’t it be amazing if people started choosing their languages based on the nature of the problem to be solved, rather than simply because “I’m a C# developer”.

Tentatively subtitled: “How scale can make fools of us all”

This is going to be a real life war story…cause I haven’t done one of those in a while, and this particular case really ticked me off.  Here’s the scoop:  I’ve got a “service” which is called by other parts of the system.  And by “service” I don’t mean something running in its own process and waiting for SOAP/REST requests or messages, I simply mean something that has a defined entry point (a static method in this case), where you pass in some data, and get something back.

Like many others, I’m sure, I’m using an IoC container to wire up bits so that I can have a big ball of interfaces “to make testing easier” (one of these days I’ll break that rather nasty habit and figure out a better way to do things, but I’m getting off topic).  Specifically, I’m using Windsor for my dependency injection because it seems to have become the container du jour among the devs that actually are using containers at work (StructureMap was in there for a while too, but it seems to have faded).  As many of you may know, Windsor is one of those containers that tracks instances for you so that it can use a Lifecycle rule to decide whether to give you an already existing instance of an object, or create a new one for you. It will also automatically call Dispose() on IDisposable objects that it may be tracking, thus helping ensure proper cleanup of resources.

In my case I had everything set up using the Transient lifestyle, because each request was essentially stateless, and there really wasn’t a lot of expense involved in creating a new instance of the objects.  Because I’ve done my homework, I know that if you’re using Transient objects in Windsor, you should explicitly call Release on the container to release the object when you’re done with it, otherwise you’re likely to get a memory leak, since the container would be holding on to an instance of the object, not letting the GC do its thing.  So, I made sure I did that, and my code looked something like this:

var myService = _container.GetService<IMyService>();
try
{
    myService.DoWork();
}
finally
{
    _container.Release(myService);
}

The one thing to point out here is that my reference to _container was a singleton, so I would get it set up the first time and then use the pre-configured container after that. So, where is the problem? Anyone? Well, I didn’t see anything wrong with it. And neither did the person doing the code review.  But, as you might guess from the fact that I’m writing about this, there was a problem, and here’s how it manifested itself:

Approximately 6 days after this went to production, one particular set of servers in one of our data centers (let’s say for the sake of this post we have 2) started kicking out OutOfMemoryExceptions during calls to the service.  My first thought was, “strange, but I’m doing the right thing here and releasing, so it’s probably just something else eating up memory and my code is suffering”.  To help demonstrate this I even set up a test running 1000 calls to the service in a while loop and watching the memory…nothing unusual, hovered around 33MB.  So I fired up the most excellent dotTrace memory profiler, and it confirmed as much.

4 more days go by and our operations folks come and beat the crap out of me because they have had to reboot production servers every couple of hours.  Ok, they didn’t beat the crap out of me, but they wanted to, and they did send along a dump, which one of the other devs who is a wiz with windbg was able to translate into something meaningful for me.  The dump showed thread contention in ReaderWriterLockSlim.WaitOnEvent(), and about 200MB worth of an object called Castle.Microkernel.Burden.  And here are some other interesting details:  The service is called by all kinds of different servers; Web servers, SOAP servers, REST servers, but none of these were showing problems.  The only one that was having issues was a server that was set up to process asynchronous SOAP requests (don’t ask).  And each server could process up to 20 at a time.

Armed with this information I did some googling, and discovered that the Burden object is the thing you leak when you don’t call Release() on the container in Windsor….But I was calling release!  I found a blog post by Davy Brion that talked about getting leaks when using your own Windsor container with NServiceBus, and how to deal with it….seemed interesting, but it also seemed like something that didn’t apply, since the problem there was that NServiceBus didn’t know about calling Release() since it was written with a container that didn’t keep references.  It did lead me to the source code for the release policy, which showed me something very interesting.

The Windsor object tracking is basically doing some reference counting.  The ReaderWriterLockSlim is being used to manage the count of instance references, so when you create a new instance it is incremented, and when you release an instance it is decremented.  In either case you’re doing a write, so you’re calling a ForWriting() method on a lock wrapper, which is effectively trying to do a write lock (at some point down the call stack)….very interesting.  At this point I decided to see if I could reproduce the problem, and so I took my earlier test running 1000 calls in a loop, and kicked it up a few notches on the concurrency scale, and set it up to run calls in a while loop until the thread was canceled. I fired up 25 threads to do this, launched the little console app and waited.  Sure enough I was able to see in process monitor that memory was rising….there were some spots where a large collection was taking place, but it wouldn’t release everything, and so soon my little app which started at around 40 MB was using 50 MB, then 60 MB.  It was the concurrency!  The multiple requests were stacking up new instances of objects, and new instances of the Burden object, faster than they could be collected because the whole thing was bottle-necked by the ReaderWriterLockSlim!

So I plugged in a version of Davy’s code to fix the NServiceBus issue, only I decided since I was managing this container local to my service, and I was also dealing with any Disposables myself, that I would not let it track anything (there is actually a built-in policy object for not tracking anything…just realized that).  Plugged it in, fired up the test, and I had a little console app that ran for about an hour and hovered at about 40MB of memory in use.

We actually did an emergency deployment to push this to the affected set of servers in production, and I’m happy to say that so far I’ve not seen an issue….of course our logs stopped showing the OutOfMemory exceptions about 24 hours before we pushed the fix, so that does leave some lingering doubt about whether the issue is really resolved.  And even though I could create something suspicious locally, we were never able to recreate the production issue in QA.  One of the interesting things about our environment is that we have a lot of customers who do things that we don’t exactly expect, or want, them to do.  It looks like in this case we had some customers who were doing a lot of asynchronous calls and they just managed to stack up in a way where things got ugly.

If it walks like a giraffe and talks like a duck then what is it?  Maybe a duhk?  Who knows, but it certainly is not a duck.  So if that is the case, then you can probably guess what the Duhking library is all about…or maybe you can’t.  In terms of programming, Duck Typing refers to the ability of some languages to allow you to treat an object of one type as an object of a different type, provided the methods/properties needed exist on both objects.  Statically typed languages are usually not very good at this sort of loosey-goosey type inference, which is why this behavior is typically restricted to languages with less stringent rules on typing.

The Duhking library is an attempt to provide a very limited view of Duck Typing to .Net 3.5 applications.  It allows you to graft an interface type onto an object that has matching properties and/or methods but does not actually implement that type.  Why would you do that?  Well, there were a couple of use-cases that drove the development of this library.  One is a case where you want to wrap an API that you have no control over in an interface so that you can test your consuming code.  This is common for things like HttpContext or SmtpClient where you want to utilize the functionality of those libraries in your code which you work so hard to make testable.  A standard approach to doing this is to create an interface which defines the methods you need, and then create a “wrapper” class that implements the interface, but then calls through to the real, un-testable class to do the work.  So my thought was this:  “Since we’re just calling through to matching method signatures in a class that already exists, why not abstract the whole thing so I don’t need all of these crazy Wrapper classes everywhere?”.

The other use-case came up when dealing with anonymous types.  We all know that you can create basic data objects as an anonymous type, and then use them within the scope they are created.  But what happens if you want to pass an anonymous type to another method?  Well, you have two choices.  You can either move the data in your anonymous type to another class/struct and pass that, or you can resort to some reflection trickery to get the values out of a plain old object.  It seemed like you should be able to create a new anonymous object as a particular interface type, and then pass it around as that interface type.

So the Duhking library allows you to do both fairly easily by use of some simple extension methods on object, and the magic of the Castle project DynamicProxy2 library.  Now that I’ve told you the secret, surely you can see how things work.  The library simply creates a proxy of the specified interface, and then intercepts calls to the interface methods, and in turn calls the matching methods on the object we are “Duhking”.  There is some checking going on to ensure that your object is compatible with the interface you are wanting to Duhk, which involves checking method signatures (this is surprisingly complicated, considering it is the basis for compiler based interface implementation verification….but then I may be doing it the hard way), but beyond that it’s just passing calls off to the proxy.  But enough idle chit-chat, let’s see a sample.

Okay, so let’s take the first use-case where we want a wrapper class for a sealed framework class so we can test our consumers.  Let’s go with the SmtpClient as an example because it is fairly common to want to send emails for various reasons.

First off we need our wrapper interface:

public interface ISmtpClient
{
    string Host { get; set; }
    int Port { get; set; }

    void Send(string from, string recipients, string subject, string body);
}

You could add in additional properties or methods, but this is enough to get you going.  So now you can use this interface in place of the standard SmtpClient in the framework, and write your tests against it without major pain.  So the next step is to Duhk the real SmtpClient so it implements your interface when you’re ready to do the “real” work.

// Some code here getting ready to call your class that needs the client
var myClass = new ClassNeedingSmtpClient(new SmtpClient().AsType<ISmtpClient>());
// and now you do something with it

Pretty cool huh?  You can also check to see if the given concrete type can be Duhked by using the CanBe extension method

// Some code here getting ready to call your class that needs the client
var realClient = new SmtpClient();
if(realClient.CanBe<ISmtpClient>())
    var myClass = new ClassNeedingSmtpClient(realClient.AsType<ISmtpClient>());
// and do something else if it doesn't work

So now let’s look at the other usage scenario, wrapping anonymous types in an interface so you can pass them around.  The first thing we need is an interface to hold the data:

public interface INamedSomething
{
    string Name { get; }
    int Id { get; }
    string SomethingElse { get; }
}

Note here that we are only specifying getters. That is because the properties of anonymous types are read-only, and right now the duhking code doesn’t differentiate anonymous types from other types (more on that later). Ok, so with this we can now create an anonymous type and return it as an INamedSomething

public INamedSomething MethodThatReturnsSomething()
{
    // Some work goes here
    return new { Name = "Sam", Id = 1234, SomethingElse = "Hah!", SomethingElseNotInTheInterface = "Foo" }.AsType<INamedSomething>();
}

And this works fine. Notice I threw an extra property in there to show you that when we are checking for matching signatures we’re only checking the methods/properties in the interface we’re trying to Duhk. You can have as many additional properties or methods as you want in your concrete type, doesn’t matter.

Now, as for that whole read-only thing.  Right now the code that checks compatibility between your class and the target interface is ensuring that each method in the target interface has a matching method in the class.  This includes the compiler-generated methods for getting and setting properties.  That means that if you’re using an anonymous type as your class, you will never be able to Duhk it to an interface that has setters on its properties.  While technically correct there is something about this behavior that bugs me…it just doesn’t seem flexible enough.  So most likely what I am going to do is add some special handling for anonymous types that will allow the target interface to have both getters and setters.  This will in effect provide a way to stub out an interface implementation, and use the anonymous type to set the initial values of the interface.  This does change a bit the purpose of what I’m trying to do with this library, and gives it the added ability to stub out interfaces, so I’ve held off on doing this.  I think, though, that adding this functionality will actually increase the utility of the library, so it’s probably worth doing.

Right, so now that you have all of the grueling details, go get it, and let me know what you think.

As of right about now, you should be able to mosey on over to the DxCore Community Plug-ins page, and grab a copy of CR_MoveFile.  This is a plug-in I created primarily as a tool to aid in working in a TDD environment, but which certainly has uses for non-TDD applications.  It does basically what the name suggests, it allows you to move a file from one directory in your solution/project structure to another, even one in a different project.  I implemented this as a code provider (since it could change the functionality if you move the file from one project to another), so it will appear in the Code menu when you have the cursor somewhere within the beginning blocks of a file (“using” sections, namespace declaration, or class/interface/struct declarations).  Once selected you are presented with a popup window which has a tree that represents your current solution structure, with your current directory highlighted.  You can use the arrow keys to navigate the directories and choose a new home for your file.

If you move files between projects, the plug-in will create project references for you, so you don’t need to worry about that.  When the file is moved the file contents remain unchanged, so all namespaces will be the same as they were originally.  I did this mostly to keep the plug-in simple, but also because I could see situations where this would be good, and situations where this would be bad, and it seemed like this was a bad choice to make for people.  I’ve been using this plug-in on a day-to-day basis for a while now, and things seem pretty clean.  I did run into a small issue, however, using it within a solution that was under source control.  At this point you need to make sure the project files affected by the move are checked out, otherwise the plug-in goes through the motions, but doesn’t actually do anything, which is quite annoying.  There is also no checking going on to make sure the language is the same between the source and target project, so if you work on a solution that contains C# and VB.Net projects, you have to be careful not to move files around to projects that can’t understand what they are (oh, and the project icons used on the tree view are all the same, so there is no visual indication of what project contains what type of files).

That’s pretty much it.  Clean, simple, basic.  Used with other existing CodeRush/Refactor tools like “Move Type To File” and “Move to Namespace”, this provides for some pretty powerful code re-organization.  Just make sure you run all of your tests :).

Anyone who has been around me for more than a few hours while coding, or who pays any attention to me on Twitter will know that I am a huge fan of CodeRush and Refactor Pro! from DevExpress.  I consider these sorts of tools essential to getting the most out of your development environment, and I think CodeRush is one of the best tools available for a number of reasons, not the least of which is its extensibility.  CodeRush is built on top of DxCore, which is a freely available library for building Visual Studio plug-ins (incidentally, DevExpress also has a free version of CodeRush called CodeRush XPress, which is built on the same platform).  DxCore provides any developer who wants it access to the same tools that the folks at DevExpress have for building plug-ins and extensions on top of VisualStudio, and several developers (including yours truly) have done just that.

One of the more recent additions to the CodeRush arsenal is the CodeIssues.  As of the v9 release, CodeRush included an extensive collection of these mini code analyzers which will look at your code in real time and do everything from letting you know when you have undisposed resources, to suggesting alternate language features you may not even be aware of.  A lot of these are also tied in to the refactoring and code generation tools that already exist within CodeRush and Refactor Pro! so that not only do you see that there is an issue or suggestion, but in a lot of cases you can tell the tool to correct it for you.  Pretty impressive stuff.

So what I would like to do is dig in to how the CodeIssue functionality works within CodeRush by creating a custom CodeIssue Provider.  Because I’m a TDD guy, one of the things I’ve been trying to do is build in some tooling around the TDD process to make it that much easier to write code TDD.  So based on that I’m going to show you how to implement a CodeRush CodeIssueProvider which will generate a warning whenever you have created a Unit Test method with no assertions (which would indicate that you are either dealing with an Integration Test, or your test is not correctly factored).  Note: Since the CodeIssue UI elements are part of the full CodeRush product, and not CodeRush XPress, this plug-in will not do anything unless you are running the full version of CodeRush.

Okay, so the first thing to do is to create a new Plug-In project.  This can either be done from the Visual Studio File –> New Project menu, or by selecting the New Plug-in option from the DevExpress menu in visual studio (if you are using CodeRush XPress and you don’t have the DevExpress menu, my man Rory Becker has a solution for you).  Regardless of which way you go, you will get a “New DxCore Plug-in Project” window, which will ask you what Language you want to write your plug-in in (C# or Visual Basic .Net), and what kind of plug-in you want, along with the standard stuff about what to name the solution and where to store the files.  For our purposes we’re going to go with C# as the Language, a Standard Plug-in, and we’ll call it CR_TestShouldAssert (the CR_ is a naming convention used by the CodeRush team to indicate it’s a CodeRush plug-in, as opposed to a Refactoring or DxCore plug-in).


Next up is the “DxCore Plug-in Project Settings” dialog.  This allows you to give your plug-in a title, and set some more advanced options which deal with how the plug-in gets loaded by the DxCore framework.  We’ll just leave everything as-is and move on to the good stuff.


Once your project loads you will be presented with a design surface, this is because a large number of the components that are available via DXCore can actually be found in the Visual Studio toolbox, and you can just drag them out onto your plug-in designer to get started.  The CodeIssueProvider is an exception, though, so we will have to crack open the designer file to add it to our plug-in.  So open up the PlugIn1.designer.cs file, and add the following line of code under the “Windows Form Designer Generated Code” section:

CodeIssueProvider cipTestsShouldAssert;

You’ll need to add a using statement for the DevExpress.CodeRush.Core namespace as well.  Next we need to instantiate it, so we need to do this in the InitializeComponents method.  When you are finished your InitializeComponents method should look like this:

this.components = new System.ComponentModel.Container();
cipTestsShouldAssert = new CodeIssueProvider(this.components);
((System.ComponentModel.ISupportInitialize)(this)).BeginInit();
((System.ComponentModel.ISupportInitialize)(this)).EndInit();

Now if we switch back over to the designer, we will see our new provider on the design surface.  At this point we can use the Properties window to configure the provider.  The things we need to worry about filling out are the Description, DisplayName, and ProviderName properties.  The Description is the text that will be displayed in the Code Issue catalog, so it needs to clearly explain what the CodeIssueProvider is intended to do.  Let’s go with something like: “A Unit Test should have at least one explicit or implicit assertion.”  As for DisplayName, lets say something like “Unit Test Method Should Assert”, and make the ProviderName the same.

Ok, so now it’s time to actually do the work of finding a TestMethod that violates this condition.  So we need to switch over to the Events list for our provider, and Double-Click in the CheckCodeIssues drop-down so it generates an event handler for us.  You will now be taken to the code editor and presented with an empty handler that looks something like:

private void cipTestsShouldAssert_CheckCodeIssues(object sender, CheckCodeIssuesEventArgs ea)
{

}

This looks pretty much like your normal event handler, we’ve got the sender object (which would be our provider instance), and then we have a custom EventArgs object. Looking at this event args object, you can see quite a few methods, and a couple of properties.  The first few methods you see deal with actually adding your code issue, if it exists, to the list of issues reported by the UI.  You’ve got one method for each type of CodeIssue (AddDeadCode, AddError, AddHint, AddSmell, AddWarning), and then one method (AddIssue), which allows you to specify the CodeIssue Type.  Now this is where things start to get interesting because basically we’re at the point where the good folks who wrote DxCore have said “All right, go off and find your problem and report your finding back to me when you’re done”.  So from here we have to figure out whether or not there are any test methods without asserts floating around anywhere.  The good news is that there are a few tools in the CodeRush bag of tricks that can help us.

Perhaps the best tool for figuring out this sort of thing is the “Expression Lab” plug-in.  You can open this up by going to the DevExpress menu, opening the Tool Windows->Diagnostics->Expressions Lab.  This shows you in real time what the AST that CodeRush produces for your code looks like as you move about in a file.  You can also see all of the properties associated with the various syntax elements, and view how things are related.  This is a very handy tool to have.  Before we dig too deep into the Expressions Lab, let’s get a start on finding our CodeIssue.  We know that we are going to be looking at methods here, since we are ultimately searching for test methods, so the first thing to do is to limit the scope of our search to just methods.  The CheckCodeIssues event is fired at a file level, so you are basically handed an entire file to search by the DxCore framework.  We need to filter that down a bit and only pay attention to the methods contained in the current file.  To do that we’re going to use the ResolveScope() method of the CheckCodeIssuesEventArgs object.  Calling the ResolveScope() method gives us a ScopeResolveResult object, which doesn’t sound very interesting, but this object has a wonderful little method on it called GetElementEnumerator().  This method will allow you to pass in a filter expression, and return all of the elements that match that filter expression as an enumerable collection. So to get to this, let’s add the following to the body of our event handler:

var resolveScope = ea.ResolveScope();
foreach(IMethodElement method in resolveScope.GetElementEnumerator(ea.Scope,new ElementTypeFilter(LanguageElementType.Method)))
{
}

This looks pretty straightforward, but there are a couple of things I want to point out. First is the ea.Scope property that we are passing in to the GetElementEnumerator() method. This is the AST object that represents the top of the parse-tree that we are going to be searching for code issues in. Typically this is a file-level object, but I don’t know that you can count on that always being the case (changing the parse settings could potentially affect how much of the code is considered invalid at a time, and so you could get larger or smaller segments of code).  The other interesting bit is the ElementTypeFilter().  This allows us to filter the list of AST elements given to us in our enumerable based on their LanguageElementType (LanguageElement is the base class for syntax elements within the DxCore AST structure.  All nodes have an ElementType property which exposes a LanguageElementType enum value). In our case we’re only interested in methods, so we’re using LanguageElementType.Method.  The result is a collection of all of the methods within our Scope.

Now that we have all of our methods, we need to figure out if they are Test methods.  To do this we’ll have to look for the existence of an Attribute on the method.  Taking a look at Expressions Lab, we can see that a Method object has an Attributes collection associated with it. So we should be able to search the list of attributes for one with a Name property of “Test”.  Using Linq, we can do this pretty easily like this:

method.Attributes.OfType<IAttributeElement>().Count(a => a.Name == "Test")

This will give us a count of the “Test” attributes on our method. We can put this into an if statement like so:

if(method.Attributes.OfType<IAttributeElement>().Count(a => a.Name == "Test") > 0)
{
}

A quick note; I’m using the OfType<T>() method to convert the collection returned by the Attributes Property into an enumerable of IAttributeElements just as an easy way of enabling Linq expressions against the collection. Since DxCore is written to work with all versions of VisualStudio, there really isn’t any official Linq support. As a matter of fact, using the expression we did limits the plug-in to only those people with .Net Framework 3.5 installed on their development machines. I think that in this day and age, this is a fairly safe assumption, so I’m not that worried about it. I would like to point out also, that having this expression in place does not prevent the plug-in from working with Visual Studio 2005, as long as the 3.5 framework is installed.

 

Ok, so now we have a list of methods, and we’re filtering them based on whether or not they are Test methods (defined by the existence of a Test attribute).  The next thing to do is look for an Assert statement within the text of our method.  This is another place where the Expressions Lab proves invaluable.  Looking at Expressions Lab we discover that our Assert statement is in fact an ElementReferenceExpression and is a child node of our Method object.  With this knowledge in hand we can use the FindChildByName method on our Method object to look for an Assert reference:

var assert = method.FindChildByName("Assert") as IElementReferenceExpression;

Now all we have to do is test whether or not our assert variable is null, and we know whether or not this method violates our rule. Once we do that test we can add the appropriate Code Issue Type to the CodeIssues list using our event args. The last piece of the puzzle then will look something like this:

if(assert == null)
{
    ea.AddIssue(CodeIssueType.CodeSmell,(SourceRange)method.NameRanges[0],"A Test Method should have at least one Assert");
}

With this in place we should now be able to run our project and try it out. Using F5 to debug a DxCore plug-in will launch a new instance of Visual Studio. From there if you create a new project, or open an existing project, and write a test method which does not have an Assert, you should see a red squiggle underneath the name of the method. Hovering over that with your mouse you’ll see our Code Issue text presented. Adding an Assert will make the Code Issue disappear.


Well, things are looking good here, we’ve got code that is searching for an issue, and displaying the appropriate warning if our condition is met.  There is one other condition we should probably consider, however.  The one case I can think of when our rule does not apply is when we are expecting the code under test to throw an exception.  In that case there would be an ExpectedException attribute on the test method.  To make our users happy we should probably implement this functionality.

The good news is we already know how to accomplish this, since we are using the same technique to determine if the method we’re looking at is a test method.  All we need to do is change the test condition in our Count() method so it looks for “ExpectedException” instead of “Test”.  While we’re at it it seems like a reasonable thing to get an instance of the attribute and then check it for null, similar to how we’re handling the assert.  With all of this done the code should look like this:

var assert = method.FindChildByName("Assert") as IElementReferenceExpression;
var expectedException = method.Attributes.OfType<IAttributeElement>().FirstOrDefault(a => a.Name == "ExpectedException");
if (assert == null && expectedException == null)
{
    ea.AddIssue(CodeIssueType.CodeSmell, (SourceRange)method.NameRanges[0], "A Test Method should have at least one implicit or explicit Assertion");
}

So now we should be able to run this, and see that the code issue disappears if we have a test method with either an assert statement, or an expected exception attribute. Pretty cool. You’ll notice that I also updated our issue message so it reflects the fact that we are able to handle implicit assertions (in the form of our ExpectedException attribute).  For the sake of completeness, here is what our finished CheckCodeIssues method looks like:

private void cipTestShouldAssert_CheckCodeIssues(object sender, CheckCodeIssuesEventArgs ea)
{
    var resolveScope = ea.ResolveScope();
    foreach (IMethodElement method in resolveScope.GetElementEnumerator(ea.Scope, new ElementTypeFilter(LanguageElementType.Method)))
    {
        if (method.Attributes.OfType<IAttributeElement>().Count(a => a.Name == "Test") > 0)
        {
            var assert = method.FindChildByName("Assert") as IElementReferenceExpression;
            var expectedException = method.Attributes.OfType<IAttributeElement>().FirstOrDefault(a => a.Name == "ExpectedException");
            if (assert == null && expectedException == null)
            {
                ea.AddIssue(CodeIssueType.CodeSmell, (SourceRange)method.NameRanges[0], "A Test Method should have at least one implicit or explicit Assertion");
            }
        }
    }
}

And that’s it. Granted there are some things here I would like to change before releasing this into the wild. We are specifically looking for NUnit/MbUnit style test method declarations for one, and we are also looking only for the short version of the attribute names, but this should give you a good idea of how things work.

If you are interested in seeing a more polished final version, you can either download the finished source for this post, or have a look at my CR_CreateTestMethod (admittedly poorly named) plug-in on the DxCore Community Plug-In’s site.

I ran into this odd problem recently working with some Linq2SQL based persistence code.  There is some code someone put together to commit a list of changed entities to the database as part of a single transaction, which simply iterates through the list and performs the appropriate action.  The problem I was having was that I had an object referenced by another object that needed to be persisted first, otherwise there was a foreign key violation.  To add to the strangeness there seemed to be some magic going on (most likely utilizing the INotifyPropertyChanged goodness), so that even if I tried to persist just my dependent object first, both were still showing up in the list, and always in exactly the wrong order.  Now, I’m okay with magic.  Magic makes a lot of things a lot easier.  The problem arises whenever the magic is incomplete, and doesn’t follow through to take care of all of the operation.  It’s like someone coming up to you and saying “Pick A Card”, at which point you do, and put the card back, and they say “I know what your card was” and walk away.  Not real convincing.  This is what was going on here.  There was the smarts to know that changes were being made to more than one entity, and there were even attributes to define what properties contained dependent objects, but no smarts to actually deal with a case when you would want to save more than one object in an object graph at a time.

So it occurred to me that I should be able to do some linqy magic and create some sort of iterator that would return dependent objects in the appropriate order, so the least dependent of the objects get moved to the beginning of the list. My first step, since I wasn’t really sure how to do this, was to write a test. And I made it more or less mirror the issue I was facing: a list of two items, one of which is a dependency of the other. I don’t know if there is a lot of value in posting all of the test cases here, but the end result was rather nice. Sure, it took several iterations, and there was plenty of infinite looping and stack overflows (which does some fun things to studio when you’re running your tests with TestDriven.Net), but I think this is a reasonable solution to the problem:

public static IEnumerable<T> EnsureDependenciesFirst<T>(this IEnumerable<T> items, Func<T, IEnumerable<T>> selector)
{
    // Nothing to reorder with fewer than two items, and this also stops the recursion
    if(items.Count() < 2)
        return items;
    // Skip the leading items that still have a dependency somewhere in the list
    var firstPass = items.SkipWhile(t => items.Intersect(selector(t)).Count() > 0);
    // Everything that was skipped still needs to be ordered
    var remainingItems = items.Except(firstPass);
    // If nothing was filtered out we have a circular dependency, so just return what is left
    if(items.Count() == remainingItems.Count())
        return remainingItems;
    return firstPass.Concat(remainingItems.EnsureDependenciesFirst(selector));
}

Ok, so what do we have here? Well, to start out I’m checking the item list to see if there are at least two items in it; if not, I just return the list. This provides a means to avoid an infinite loop due to the recursive call, and provides a shortcut for a scenario with only one item. Next, I use the SkipWhile() method, combined with the user-supplied selector function, to iterate through each item, retrieve its list of dependencies (which is what the selector function does), and check whether the current list contains any of the dependencies for the object. The results of this first pass are the objects which have no dependencies at all, and therefore need to be first in the list. The next logical step is to run the operation again on a list that does not contain the items filtered out by the first pass. This is done via a recursive call back to the EnsureDependenciesFirst extension. You will notice we’re checking the count of the remaining items against the current list, and returning the list if they are the same. This is another safety precaution for dealing with infinite loops. If we have a circular dependency, this bit will just return the items that are interdependent.
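
To make the behavior concrete, here is a minimal usage sketch. The Node class and the sample data are hypothetical stand-ins for illustration only, and I’m assuming the EnsureDependenciesFirst extension above is compiled into a static class that’s in scope:

using System;
using System.Linq;

public class Node
{
    public string Name { get; set; }
    public Node DependsOn { get; set; }
}

public static class Example
{
    public static void Main()
    {
        var b = new Node { Name = "B" };
        var a = new Node { Name = "A", DependsOn = b };

        // The selector returns the (at most one) dependency of each node
        var ordered = new[] { a, b }.EnsureDependenciesFirst(
            n => n.DependsOn == null ? Enumerable.Empty<Node>() : new[] { n.DependsOn });

        // Prints "B, A": the dependency is moved ahead of the item that needs it
        Console.WriteLine(string.Join(", ", ordered.Select(n => n.Name).ToArray()));
    }
}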

You will note that this is a generic function that has really nothing at all to do with the entities that I am dealing with. This was largely due to the fact that this was built TDD-style, so I just used a simple class which had a property that could take another instance of itself. To use this to overcome my entity committing problem, I would have to write a not-too-small function to retrieve the list of dependent objects from the entity (since there would need to be some reflection magic to look at attributes on the properties to determine which properties contain dependencies), but it should pretty much drop into the foreach statement that is currently being used to persist the entities.
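
For what it’s worth, that selector might end up looking something like the sketch below. The DependencyAttribute marker is purely hypothetical; the real code would look for whatever attribute the persistence layer actually uses to flag dependent properties:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker attribute, standing in for whatever actually decorates
// the properties that hold dependent entities
[AttributeUsage(AttributeTargets.Property)]
public class DependencyAttribute : Attribute { }

public static class DependencySelector
{
    // Returns the dependent objects hanging off an entity's attributed properties
    public static IEnumerable<object> GetDependencies(object entity)
    {
        return entity.GetType()
                     .GetProperties()
                     .Where(p => p.GetCustomAttributes(typeof(DependencyAttribute), true).Any())
                     .Select(p => p.GetValue(entity, null))
                     .Where(value => value != null);
    }
}

Plugged into EnsureDependenciesFirst (with T as object), that would give the persistence loop a dependency-aware ordering without knowing anything about the specific entity types.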

Incidentally, I learned from my dev team that the “official” way of dealing with this is a “ReorderChanges” method, which takes two entities in the order in which they should be persisted. I think I like my solution better, mostly because it should mean I don’t have to worry about it again.

I was pleased to find recently that Roy Osherove’s Art of Unit Testing was available on Safari.  I have been following Roy’s blog for a while now, and was quite excited at the prospect of him writing a book on Unit Testing.  It was only my personal cheapness that kept me from shelling out the $25 to get the E-Book version from Manning ahead of time.  I have to say, now that I have read it, that it would have been well worth the money.  Before I get too deep I want to provide some context for what I am about to say.

I consider myself an experienced TDD practitioner and Unit Test Writer
That means I was reading this book hoping to gain some insight. I wanted to find out how to write better, more readable, more maintainable tests. I was also hoping for a little bit of “atta-boy” affirmation that the way I do things is the “right” way. The astute reader may be able to tell that in order for my first hope to be true, the second may have to get some points taken away. This was in fact the case, and to be honest, coming out of it I feel like I got more value from the things I learned than from whatever ego stroking may have occurred over the things I am already doing right.

So let’s get started….
I was expecting the book to start out essentially as it did: some brief history about the author and an introduction to Unit Testing for those who may not be familiar with it. I have to say I was expecting the book to be a little more TDD-centric than it was, but I think most of that was my own bias for TDD as “The Only Way To Write Software”. Roy actually explained what TDD was, and also why he wasn’t going to harp too much on it throughout the book. I have to say, I can see why he made the decision that he did. I can also say that it seemed perfectly clear to me that TDD is a technique he feels has a lot of value, which made me happy. Since this is supposed to be a review from the perspective of an experienced practitioner of TDD and Unit Testing, I’m not going to go into anything that was touched on in the early chapters, apart from noting that they contained a general introduction to the tools, techniques, and philosophy of unit testing. I can also say that, though I was already familiar with the material, I didn’t mind reading through it at all. Overall, Roy’s writing style is light and quite pleasant, even for a technical book.

And now into the meat of the book…
For me, things started getting interesting in Part 3 of the book. This is where issues of test design and organization are addressed. This is one of those areas I feel I need some guidance on, mostly because I developed my testing idioms largely through habit and trial and error. I look back on tests I have written in the past (which could be as little as two days ago) and I wonder how I could have come up with such a brittle, unmaintainable nightmare. I feel like I need guidance from the experts on what I can do better when writing my tests. Roy delivered on these items in chapter 7, “The pillars of good tests”. One of the lessons I took away from this was the value in testing one concept per test. I had heard this as “one assert per test” in the past, and scoffed at the idea. But Roy presents a very compelling argument for why this is a good idea: if you are testing multiple concepts, you don’t know the extent of the problem when your test fails. And let’s face it, the failing test is the reason we’re doing this whole thing. I’ve seen personally the failing test that just keeps failing. You tackle the issue from one failed assert only to rebuild, and find one right after it which fails as well. One of the issues I’ve had with this is the redundant setup and configuration that could be required for exercising this concept, but this issue is also addressed by the straightforward recommendation of creating clear and understandable configuration methods. In the past I have generally not been very good about applying DRY to my test setup, which, I know, is another case of treating tests differently from regular code. Having someone in a position of authority (like Roy) say, “put your setup in separate methods so you can re-use them and make your tests more readable” made it okay to do the thing that I knew I should be doing anyway. The key concepts covered are making tests readable, maintainable, and an accurate portrayal of the author’s intent.

Even more in depth….
Section 4 goes even further and talks about how to integrate unit testing into an organization which is not already doing it. This is an interesting subject to me, as I have recently moved to a company which has not been doing unit testing and TDD as part of their regular development process. Roy draws on his experiences as a consultant to provide some really good advice for how to go about enacting this sort of change in an organization. I was particularly pleased with his candor when he describes his failed attempts at integrating unit testing. It would have been quite easy to simply say “Based on my considerable expertise, these are the things you need to do”, but he chooses instead to share some real-world experience in a straightforward way that only adds to my respect for him as a professional. In addition to this, he touches on techniques for integrating testing into “legacy” code (i.e. code which is not tested). He does a good job of introducing some techniques for testing what is essentially untestable code, with a very large nod to Michael Feathers’ “Working Effectively with Legacy Code”.

The book ends with three appendices: one discussing the importance of testability in the design process, one listing various testing tools (both Java and .Net), and the last listing guidelines for conducting test reviews. This last one is nice, because it presents a concise view of all of the guidelines presented throughout the book, and provides page references where you can get the “why” behind each.

All in all…
This is a really good book, which should be part of any agile development library. It doesn’t matter if you are writing your first unit tests or you’re a seasoned pro; there is going to be something here for you. I think it is great that Roy has chosen to share his experience with the developer community in this way. I came into this book with some rather high expectations, and I think they were met.

A note on TypeMock….
I remember seeing some criticism floating around on twitter suggesting the book was rather pro TypeMock. There was also the comment that Roy’s affiliation with TypeMock was not made clear early on. I can’t say I saw either of these things when I was reading it. For starters, I already knew Roy worked for TypeMock, so perhaps that skewed my ability to objectively judge whether the disclosure was done in a timely manner or not. I can say that the places in the book where there seemed to be a preference for TypeMock were places where he stated things like “I feel TypeMock has a better syntax in this case”, or “TypeMock is the only tool which provides these capabilities”. The first is a statement of preference. Sure, Roy helped design the API for TypeMock, so it seems only natural that he would prefer it to other frameworks, but having used it I would have to agree with the statement. It is a great API, and an example of a fluent interface done well. The second comment is also plain fact. Of the mocking libraries available in the .Net space, TypeMock is the only one that allows you to swap instances of objects in place, without making changes to the classes using them. You can argue over whether or not this is a good or a bad thing, but the fact remains that it is a feature specific to TypeMock. Maybe I was expecting something more blatant and obvious, but I just didn’t see it.