Friday, October 26, 2007

Handoff 1.0

Having used it on my current project for the last few months and improved the unit test coverage, I've released version 1.0.0 of the Handoff gem. You can get the gem file at RubyForge or just wait a while and do

sudo gem install handoff -v 1.0.0

My first priority for future development is to improve the output when specifying delegation that hasn't been implemented yet. (Currently you just get one NoMethodError.) After that I may work on a couple of features requested by Ali, but it's a real hassle TDD'ing a library for making test assertions, so the going will likely be slow.

For those of you who don't work with me, Handoff is a tiny little gem providing a fluent interface for asserting on simple delegation. It aims to make asserting the behavior as simple as implementing it (with Forwardable). See the rdocs for examples.
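For context, here's the implementation side that sentence refers to: plain delegation with Forwardable from Ruby's standard library. (The Library class is invented for illustration; Handoff's own assertion syntax isn't reproduced here, so see its rdocs for that.)

```ruby
require 'forwardable'

# An invented example class that forwards a couple of methods
# to its internal catalog object.
class Library
  extend Forwardable
  def_delegators :@catalog, :size, :include?

  def initialize(catalog)
    @catalog = catalog
  end
end

library = Library.new([:hamlet, :ulysses])
library.size              # => 2, forwarded to @catalog
library.include?(:hamlet) # => true
```

Handoff's job is to let you assert that Library delegates those methods in roughly as few lines as the def_delegators call that implements it.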

Friday, October 19, 2007

Some Rambling on xUnit Testing Style

On our drive back to Brooklyn from the client site last night, Patrick and I were talking about testing style.

I mentioned that it had bothered me for a long time that the base class we extend in (most) xUnit frameworks is called "TestCase" even though any good test class describes multiple test cases. [To help fend off semantic confusion, here's my idea of test case identity: One test case is a set of inputs and stimuli to the code under test. There may be several things to assert about what goes on during a test case or the state of things after its run, but the inputs and stimuli are constant. When you vary them, you have another test case.]

Since we try to minimize the number of assertions in a given test (ideally keeping it to one, though in contrast to Jay I'm personally fine with that one assertion being complex) we often have multiple test methods that assert on the same test case (by the definition above) and could therefore share setup and teardown code. But since we also need tests to exercise other scenarios, we can't use the framework's setup method to create the scenario unless we're willing to set up all the stuff we need in all our tests, what you might call a "superset fixture," which feels wrong.

Incidentally, I think the reason not many people use the term "fixture" for the stuff you set up for your tests is that it's always been a pretty weak concept in practice: either you have a mess of objects that various tests will use in various ways or you're setting up so little that it's barely worth talking about.

It may be practical to set up a superset fixture when doing simple state-based testing, but if you're dealing with a "wide" object, it means a really noisy setup.* When using mocks it's actually impossible to fully set up more fixture than you'll need in every test: a mock set up but not used will fail the tests that don't satisfy its expectations.
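To make that concrete, here's a hand-rolled sketch of the mechanism (a toy, not any particular mocking library's API): an expectation that is set up but never exercised fails when the mock is verified.

```ruby
# A toy mock object, invented for illustration. Real mocking
# libraries work similarly: expectations recorded but never
# satisfied cause verification to fail.
class TinyMock
  def initialize
    @expected = []
    @received = []
  end

  def expects(name)
    @expected << name
    self
  end

  def method_missing(name, *args)
    @received << name
    nil
  end

  def verify
    unmet = @expected - @received
    raise "unmet expectations: #{unmet.inspect}" unless unmet.empty?
    true
  end
end

mailer = TinyMock.new
mailer.expects(:deliver)
# A test body that never calls mailer.deliver...
begin
  mailer.verify
rescue RuntimeError => e
  e.message # "unmet expectations: [:deliver]"
end
```

Put a mock like this in a superset fixture and every test that doesn't happen to call deliver fails, which is why the superset approach is a non-starter with mocks.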

So besides having the name 'TestCase' that doesn't seem to make sense, we have these facilities for setting up and tearing down that we don't leverage much.

Then I had what I thought might be an important insight right there in the car. Maybe the class was called a TestCase and had just one setup method because it was originally intended to describe just one case, with each test method just asserting something different about the scenario created in the setup. If so, the setup could even include the stimulus of the code under test, reducing test methods to nothing but assertions. Maybe what's made both the naming and the use of shared fixture-setup seem awkward all this time is that we've tied ourselves to creating one TestCase class per production class, when all along we could have had a TestCase class for each scenario we wanted to test, with most having a very small number of test methods.

Here's a super-simple example of the sort of thing I was imagining, though my imaginings were a lot more abstract.

require 'test/unit'
require 'set'

class Set::EmptyTest < Test::Unit::TestCase
  def setup
    @set = Set.new
  end
  
  def test_size_is_zero
    assert_equal 0, @set.size
  end
  
  def test_empty
    assert @set.empty?
  end
end

class Set::AdditionTest < Test::Unit::TestCase
  def setup
    @set = Set.new
    @set.add 5
  end
  
  def test_size_is_one
    assert_equal 1, @set.size
  end
  
  def test_contains_added_item
    assert @set.include?(5)
  end
  
  def test_not_empty
    assert !@set.empty?
  end
end

class Set::DeletionTest < Test::Unit::TestCase
  def setup
    @set = Set.new [:abc, 5]
    @set.delete 5
  end
  
  def test_size_is_one
    assert_equal 1, @set.size
  end
  
  def test_no_longer_contains_deleted_item
    assert !@set.include?(5)
  end
  
  def test_still_contains_other_item
    assert @set.include?(:abc)
  end
  
  def test_not_empty
    assert !@set.empty?
  end
end

All these little test cases could be a maintenance headache if they each lived in their own file, but it might not be too bad if you gave up the one-class-per-file convention. Although I'm not a good student of history, I knew xUnit frameworks started with one in Smalltalk, and it seemed like this one-TestCase-class-per-scenario approach might have been really convenient in a development environment where all code was organized hierarchically without the bother of source files that might need to be moved, renamed, etc. when changing tests. I've only run Squeak long enough to build a trivial Seaside application, so I was speculating, but I could imagine it being pretty handy to organize tests with one package per class under test, then a class per test case, each with a setup, then a test method for each assertion to be verified.

In Ruby we would also do some metaprogramming to reduce the noise and make the test code more intentional. Maybe something like this.

testcase_for 'an empty set' do
  
  setup { @set = Set.new }
  
  test('size is zero') { assert_equal 0, @set.size }
  
  test('empty') { assert @set.empty? }
end

testcase_for 'adding an item to a set' do
  setup do
    @set = Set.new
    @set.add 5
  end
  
  test('size is one') { assert_equal 1, @set.size }
  
  test('contains added item') { assert @set.include?(5) }
  
  test('not empty') { assert !@set.empty? }
end
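The metaprogramming needed for such a helper is modest. Here's one possible sketch; to stay self-contained it uses a plain stand-in base class and raises on failure, where a real version would generate subclasses of Test::Unit::TestCase and use its assertions.

```ruby
require 'set'

# Stand-in base class for the sketch. Each call to `test` defines a
# method that runs the stored setup block, then the assertion block,
# both in the context of the instance.
class ScenarioCase
  class << self
    attr_reader :setup_block

    def setup(&block)
      @setup_block = block
    end

    def test(name, &block)
      define_method("test_#{name.gsub(/\W+/, '_')}") do
        instance_eval(&self.class.setup_block) if self.class.setup_block
        instance_eval(&block)
      end
    end
  end
end

# `testcase_for` just creates an anonymous subclass and class_evals
# the body, so `setup` and `test` are available inside the block.
def testcase_for(description, &body)
  Class.new(ScenarioCase, &body)
end

EmptySetCase = testcase_for 'an empty set' do
  setup { @set = Set.new }
  test('size is zero') { raise unless @set.size.zero? }
  test('empty')        { raise unless @set.empty? }
end

# Run each generated test method on a fresh instance:
EmptySetCase.instance_methods.map(&:to_s).grep(/\Atest_/).each do |name|
  EmptySetCase.new.send(name)
end
```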

You probably noticed this looks a lot like RSpec contexts, which gets at why I was so excited. I wondered if Kent Beck's original intent had been something much closer to BDD, and it had just taken the rest of us a long time to catch up.

So when I got home I went digging around for articles about unit testing style and found surprisingly little. I also looked for anything about the original intent of the framework. (Googling these topics was a little depressing because of all the weak information, plagiarism, and content spam.)

My search stopped when I found the Kent Beck article where he originally presented the unit testing framework pattern we now know so well. (I believe that was first published in The Smalltalk Report in October 1994. Thanks to Farley for digging up that obscure nugget.) I was disappointed to find that the example in that first article conforms pretty much exactly to the classic form of unit test we've all seen before, including a setup method that sets up more fixture than any one test method uses.

In case you find the Smalltalk a little painful to read, here's my translation of Beck's example test case to Ruby. (The examples above, you'll now see, are based on Beck's.)

require 'test/unit'
require 'set'

class SetTest < Test::Unit::TestCase
  
  def setup
    @empty = Set.new
    @full = Set.new [:abc, 5]
  end
  
  def test_add
    @empty.add 5
    assert @empty.include?(5)
  end
  
  def test_delete
    @full.delete 5
    assert @full.include?(:abc)
    assert !@full.include?(5)
  end
  
  def test_illegal
    begin
      @full[0]
      fail
    rescue NoMethodError
      # expected
    end
  end
end

So that was a bit of a letdown. On the other hand, I did finally learn why the base class is called a TestCase when it represents so many different test cases.

As a test writer, you tend not to think of your TestCase subclass as a normal class. All the instantiation and running is in the framework, and none of your code ever interacts with TestCase instances, so their life-cycle (their very existence as normal objects) is usually irrelevant to you as a user of the framework. From the framework's point of view, however, their life-cycle is central.

As you may or may not know, your xUnit runner creates one instance of your TestCase class for each test method, passing to the constructor the name of the test method the new instance will run. Sure, well written setup and teardown methods would allow the runner to use one instance for all the test methods, but that would require the test writer not to accidentally leave state hanging around in the instance. Why put that burden on the framework user when the framework can just as easily start with a completely clean slate every time?
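A toy version of that life-cycle (a sketch of the pattern, not the real Test::Unit internals) makes the clean-slate point concrete:

```ruby
# A toy xUnit-style runner, invented for illustration: one instance
# per test method, constructed with the name of the method to run.
class TinyCase
  def initialize(method_name)
    @method_name = method_name
  end

  def setup; end

  def run
    setup
    send(@method_name)
  end
end

class CounterTest < TinyCase
  def setup
    @count = 0
  end

  def test_increment
    @count += 1
    raise "stale state!" unless @count == 1
  end

  def test_increment_again
    @count += 1
    raise "stale state!" unless @count == 1
  end
end

# The runner builds a fresh instance per test method, so no instance
# state can leak between tests even if a test forgets to clean up:
CounterTest.instance_methods.map(&:to_s).grep(/\Atest_/).each do |name|
  CounterTest.new(name).run
end
```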

So the framework creates one TestCase instance per test method, each of which is a test case of its own. It works in the intuitive sense of "test case" as well as in OO terms. Score one for the forefathers of agile software development!

I'm interested to hear what styles people have used or seen in unit test/spec suites. Have you tried creating multiple TestCases per class to keep individual test case classes more maintainable? Did it work out? What about BDD specs? Do you find having the structure of the test or spec map directly to the purpose of the code (as opposed to having private test helper methods scattered around your source file) advantageous? How far have you gone with keeping test setup all inline in the tests themselves? Let me know.

* Yes, wide objects that are hard to set up for test are a smell that the code under test may be poorly factored, but if there's anything we've learned driving back and forth from Brooklyn to central New Jersey every week, it's that you often have to live with bad smells, though you should always be on the lookout for a route that avoids them without making you late.

Sunday, October 14, 2007

"Sit on this and rotate!"

--Suggestion from a project manager to two relatively senior developers on a team in need of a tech lead [1]

Last week I finished my second eight-week stint as technical lead of a medium-sized team of ThoughtWorkers. Patrick Farley and I alternated in the role over the last seven months or so, switching between development and tech leading at each production release. As we're both approaching one year on the project (and therefore roll-off) I've now handed the reins over to someone new (Paul Gross) who will likely rotate with another team member if the team continues to release on a similar schedule.

Rotating the tech lead position isn't a standard practice on ThoughtWorks projects, but it's worked out really well for us, so I thought I'd document it a bit.

Pro

The simplest and most obvious benefit of rotating the role is knowledge sharing. The tech lead will be exposed to aspects of the release process and production deployment (including lots of people and groups in the organization) that the rest of the team shouldn't be concerned with if they're to continue developing new functionality at maximum velocity. While it's important to protect the team from distractions where practical, it's also important not to create a one-person knowledge silo. (What's your truck number?) Rotating the role means that you've likely got more than one person available who's been through the whole cycle, without having had to simultaneously dedicate multiple capable developers to work that's largely about always being available for a context switch to ask or answer the right questions.

Besides specifically spreading knowledge relevant to the current project, rotating the leadership role also creates more well-rounded developers, giving them some of the experience necessary to lead other teams. Whether in the context of a consulting organization or a company employing developers for its own projects, this is hugely valuable for everyone involved. The developers don't just benefit from being in multiple roles: they also get to see how a number of other people deal with the same issues. (Even the devs who don't rotate into the lead role benefit from that.)

Rotation also helps keep the lead's technical tools from getting dull. This goes for general development chops as well as knowledge of how the team's systems work. During my runs I knew a lot about what was coming up in terms of requirements, and I was always there for team tasking sessions when the shape of the implementation was generally hashed out, but once something was being implemented I generally only learned more about things that came up in team-room discussions or where the pair working the story sought my advice. No good developer wants to become an ivory-tower architect, and rotating back to a role where you can focus on making the rubber meet the road is a great way of making sure it doesn't happen.

There's also a therapeutic advantage to rotation. The tech lead, like the project manager but to a lesser extent, bears responsibility for the work of a whole lot of people, and that can be exhausting and frustrating. I was extremely fortunate to have an incredible team behind me, so when it came to their work I felt comfortable trusting them to deliver on reasonable commitments. But our application was in production, with new releases rolling out regularly, and that meant lots of external groups playing a vital role in our team's delivery.

Or not.

It's a sad fact of enterprise software consulting that we're often brought in because an organization's had difficulty delivering reliably. They probably haven't identified all the causes of their difficulties, and many are likely outside the scope of what our team can remedy. The development group and business sponsors may be completely aligned on maximizing delivery value while minimizing process overhead for the project, and we may achieve incredible results in developing working software that meets the business need. But that only gets you so far before you need to coordinate with the release management group, the configuration management group, the database development group, the database infrastructure group, the quality assurance group, the network design group, the network support group, the network monitoring group, etc. Chances are the organization's difficulty in delivering has had a lot to do with the way these organs have worked together (or against each other). Our software has to work its way through the same system to deliver actual value. Certainly we can try to influence these groups to streamline their processes, but if our charter is to develop software, this is a delicate balance. The orchestration of dozens of informal conversations, formal meetings, ticketing system requests, and emails to achieve seemingly simple goals can be hilarious and even fun (when it all goes right) but it's decidedly not technical leadership, and it's easy to burn out.

Rotating the lead role could also pay off if someone doesn't work out. There are a couple ways rotation mitigates the pain and project risk introduced by a bad technical lead. First, the struggling lead has at least one former leader whose experience and advice they can draw on to try to improve as well as to directly supplement their own efforts. Failing that, the worst case is that a former lead is reinstated, which will likely be a far greater comfort to the customer than having someone new brought in. The security of having past leaders available makes this a great way to let someone stretch into a more senior role than they've had before without betting an entire project on their success.

Con

What are the downsides to rotating developers through the role? The only one we've seen is that you have to bring someone new up to speed more frequently than you otherwise would. Though the cost here is not negligible, I think it's justified, particularly in a consulting situation where eventual roll-offs and transitions are inevitable.

How

As far as how to do the transition, we tried it two ways.

The first time we rotated, Patrick was focusing on our initial release to production, but we also had a bunch of new stories for upcoming releases that needed to be estimated and have some open questions resolved. It was too much for one person, so I focused on the upcoming stories while he dealt with the production roll-out. As support work on Release 1 quieted down, we shifted developers over to work stories for Release 2. The upside was that there were enough hours in the day for Patrick to get production straightened out without collapsing from exhaustion, while stories for Release 2 got the attention they needed. The most immediate downside was that I hadn't been exposed to much of the release and deployment process, so as we began dealing with production support, I had to lean on him more than I would have liked, sometimes pulling him out of normal development work.

When the next rotation came up, we felt it would be inefficient having us once again simultaneously acting as tech leads for different releases, so we did more of a hard cut-over. This reduced confusion among the client and the team regarding who to talk to about what, but it may not have been practical if we hadn't both already had deployments under our belts. The cut-over felt good, so we've stuck with that since.

Try it?

Here's some background on the scenario where it worked for us. Our team size has varied: during the time we've been rotating we've had between seven and ten developers, two to three business analysts, a dedicated iteration manager only briefly, and a project manager/client principal mostly but not entirely dedicated to our project. Before we started, the client organization already had a large body of software in production and established processes for release and deployment.

Another thing to keep in mind that might be important for any team looking to try this is that Patrick and I had worked together for months before the team down-sizing that led to the rotation. We'd developed a strong mutual respect, each feeling that the other was capable of succeeding in the role. We were also both happy to take a normal development role on a team with the other leading. It's hard for me to see the rotation succeeding without that, and it may be the most difficult element to recreate on another team.

Obviously there are lots of other factors that may have helped this work for us. Your mileage will vary. For what it's worth, both our Client Principal and Patrick have said they want to try something similar on other projects, and I do too, assuming we seem to have the right ingredients.

Have you been in situations where rotating team members through leadership positions might have helped? Have you tried it? Let us know. Thanks.

[1] No, no one really said that.