The Testing Mindset

by Noel Rappin on September 30, 2010

This guest post is contributed by Noel Rappin, who is a Senior Consultant with Obtiva, and has been a professional web developer for a dozen years. He is the author of four technical books. The most recent, Rails Test Prescriptions, is available for purchase at http://www.pragprog.com/titles/nrtest/rails-test-prescriptions. Noel has a Ph.D. from the Georgia Institute of Technology, where he studied educational technology and learner-centered design. You can find Noel on Twitter as @noelrap.

If you want your Test-Driven Development (TDD) process to be effective, you need to have a testing mindset. Choosing the right tools helps, but the difference between using the TDD process well and using the TDD process poorly is much larger than any difference between tools.

If you’ve heard anything about testing, you’ve been told that the process is “red, green, refactor”, that is “write a failing test, make it pass, clean it up”. In order to use that process effectively, you need to get in the habit of “thinking TDD” as you attack a problem. Features of thinking TDD are:

  • The ability to break a task down into tiny, component pieces.
  • An embrace of the simplest solution to the problem.
  • A willingness to live with code that may feel unfinished until you write more tests.

The first one is not hard, and gets easier with practice. The second and third are the ones that I struggle with. The temptation to move your code just a little beyond your tests, or to add that little code filigree, is very compelling, and giving in to it almost always winds up being regretted.

I want to look at the testing mindset in solving a small problem that I first presented as a small Ruby Kata:

Find all the unique sequences of digits, like [1, 1, 2, 3, 8] that have the following properties:

  • Each element of the sequence is a digit between 1 and 9
  • The digits add to 15
  • There is at least one digit that appears exactly twice
  • No digit appears more than twice
  • Order is irrelevant: [1, 1, 2, 3, 8] and [1, 3, 2, 1, 8] are the same sequence and only count once.

There are 38 sequences.

At first glance, it seems like this problem would be hard to write in a test-driven way. If you start from the end result and try to do the whole thing in one try, then you wind up trying to test the final count of 38. That’s a hard way to go. The first step is to try to find a part of the problem that is small and verifiable.

After a little bit of analysis, I broke the problem into three parts. The first, and probably the easiest to understand, is how to verify whether a given sequence meets the parameters of the problem. The parameters are reasonably simple and easy to verify.

The second problem is to determine which sequences of numbers to test. I decided to solve this problem with a search through the potential sequences using a mechanism where each sequence would be able to say which sequence to test next. In other words, the program starts by testing [1], determines that it does not meet the requirements, then asks the sequence what to test next. The way I thought the problem through, that sequence is [1, 1]. This is far from the only way to solve the problem. However, by walking through the sequences this way, we have a simple, easy-to-verify fact about each sequence: what comes after it in order.

The third part of the problem is a loop that walks through the sequences and verifies each one in turn. We’ll come back to that in a moment.

I start by setting up an RSpec file, and write out my first specs. The first one is an easy one, just to make sure I can create my object and do the basics.

  it "should be able to sum its items" do
    sequence = Sequence.new(1, 2, 3, 4)
    sequence.sum.should == 10
  end

Getting there is pretty easy.

class Sequence

  attr_accessor :elements, :status

  def initialize(*elements)
    @elements = elements
  end

  def sum
    elements.inject { |sum, n| sum + n }
  end

end
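One detail of the `inject` call is worth knowing: without an initial value, `inject` returns `nil` for an empty array, so `Sequence.new.sum` would be `nil` rather than 0. The kata never needs to sum an empty sequence, so the following is only a minimal sketch of an alternative, assuming a caller would want 0 in that case.

```ruby
class Sequence
  attr_reader :elements

  def initialize(*elements)
    @elements = elements
  end

  # Seeding inject with 0 makes an empty sequence sum to 0 instead of nil.
  # (The seed is an assumption of this sketch, not part of the original code.)
  def sum
    elements.inject(0) { |total, n| total + n }
  end
end

puts Sequence.new(1, 2, 3, 4).sum  # prints 10
puts Sequence.new.sum              # prints 0
```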

The verification steps are all laid out in the problem. The first one, of course, is the sum.

  it "is valid if the sum is 15" do
    sequence = Sequence.new(1, 1, 7, 6)
    sequence.should be_valid
    sequence = Sequence.new(1, 1, 7, 7)
    sequence.should_not be_valid
  end

Two things to note. One is that I’m testing both the positive and negative conditions (strictly speaking, those should probably be in two different tests). Two is that I’ve picked a positive sequence that I know will stay valid even after I add the other constraints. That makes the test more stable as I add new requirements.

At this point, we pass with:

  def valid?
    sum == 15
  end

Gotta admit, that’s pretty simple. This is where the other two parts of the testing mindset come in. If you are a conscientious developer, then part of you is screaming on the inside that this is too simple, and it’s incomplete. The temptation to add the next piece of the code right now is pretty strong. Resist. Things are better if you go one step at a time.

The next step is this test:

  it "is invalid if there is no pair" do
    sequence = Sequence.new(8, 7)
    sequence.should_not have_digit_pair
    sequence.should_not be_valid
  end

I don’t need a positive test here, because my first test already covers it, though if you wanted to insist that I need a separate test for should have_digit_pair, I wouldn’t argue too strenuously.

At this point, I cheat a little bit, because I have this histogram method pre-tested from other projects, which makes the digit pair method easy.

  def histogram
    @histogram ||= begin
      @histogram = Hash.new(0)
      elements.each do |digit|
        @histogram[digit] += 1
      end
      @histogram
    end
  end

  def has_digit_pair?
    histogram.values.any? { |x| x == 2 }
  end

In this case, histogram is an implementation detail — if I didn’t have it, I would probably write it within the has_digit_pair? method and then refactor it to a separate method on the next step.
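As a rough illustration of that inline first draft, here is one way it might look before the counting logic is extracted. This is a sketch under my own assumptions, not the code from the original solution; the local variable `counts` is a name chosen for the example.

```ruby
class Sequence
  attr_reader :elements

  def initialize(*elements)
    @elements = elements
  end

  # First draft: count occurrences right inside the predicate.
  # A later refactoring step could move the counting into its own method.
  def has_digit_pair?
    counts = Hash.new(0)
    elements.each { |digit| counts[digit] += 1 }
    counts.values.any? { |count| count == 2 }
  end
end

puts Sequence.new(1, 1, 7, 6).has_digit_pair?  # prints true
puts Sequence.new(8, 7).has_digit_pair?        # prints false
```

Because the spec only talks about has_digit_pair?, pulling the counting out into histogram afterward would not require touching the tests.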

The digit trio test and code are similar:

  it "is invalid if there is a digit trio" do
    sequence = Sequence.new(1, 1, 1, 5, 7)
    sequence.should have_digit_trio
    sequence.should_not be_valid
  end

  def has_digit_trio?
    histogram.values.any? { |x| x > 2 }
  end

At this point, the valid? method has gotten really, really complicated. Okay, not really:

  def valid?
    !has_digit_trio? && has_digit_pair? && sum == 15
  end

The need to test digit pair and digit trio as separate entities has eliminated any impulse I might have to write those methods inline in the valid method, which is an example of how writing in small pieces tends to give you short and independent code methods.
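For readers following along, the verification half can be assembled into one runnable sketch. The method names come from the article; building the histogram with `each_with_object` and using `attr_reader` are substitutions made for brevity in this example.

```ruby
class Sequence
  attr_reader :elements

  def initialize(*elements)
    @elements = elements
  end

  def sum
    elements.inject { |sum, n| sum + n }
  end

  # Maps each digit to the number of times it appears.
  def histogram
    @histogram ||= elements.each_with_object(Hash.new(0)) { |digit, counts| counts[digit] += 1 }
  end

  def has_digit_pair?
    histogram.values.any? { |count| count == 2 }
  end

  def has_digit_trio?
    histogram.values.any? { |count| count > 2 }
  end

  def valid?
    !has_digit_trio? && has_digit_pair? && sum == 15
  end
end

# The four sample sequences used in the specs so far.
[[1, 1, 7, 6], [1, 1, 7, 7], [8, 7], [1, 1, 1, 5, 7]].each do |digits|
  puts "#{digits.inspect}: valid=#{Sequence.new(*digits).valid?}"
end
# prints:
# [1, 1, 7, 6]: valid=true
# [1, 1, 7, 7]: valid=false
# [8, 7]: valid=false
# [1, 1, 1, 5, 7]: valid=false
```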

At this point, I’m comfortable with the verification and it’s time for the order of the sequences. One great feature of TDD that helps with ordering is that I can specify behavior without caring about implementation. I have no idea, as I start, what the final implementation is, but I can specify the behavior in a way that I can verify whatever implementation I wind up with.

The basic idea here is that each sequence will know what sequence to test next, and eventually we’ll cover the whole space with something like a while loop. It took a couple of tries to get the rules for moving to the next sequence correct; I’ll spare you my back and forth. The first couple of rules are easy.

  it "should start with 1" do
    sequence = Sequence.new
    sequence.next.elements.should == [1]
  end

  it "a sequence with a low sum should spawn a new element" do
    sequence = Sequence.new(1)
    sequence.next.elements.should == [1, 1]
  end

The implementation at this point is also pretty easy:

  def next
    return Sequence.new(1) if elements.empty?
    Sequence.new(*(elements << elements[-1]))
  end

So if the sequence is not empty, we append a duplicate of the last element to the end of the sequence. This helps guarantee that the digits in the sequence are in numerical order, which keeps us from having to worry about duplicate sequences in different orders, but that’s beside the immediate point (if we think there’s a problem later, we’ll write a test). The immediate point is that the specs pass, and it’s time to think up the next example of navigation behavior.

It actually might be helpful to code up the actual loop at this point. As much as I love TDD, it’s helpful to see the big picture — this example is as much an exploration of the problem as it is a normal feature, and having the loop in place makes it easy to see how the ordering method is behaving.

In Ruby 1.9 you can use an Enumerator to walk the sequence. While building the code, you can augment this with print statements to see what’s going on in a big picture sense, while using TDD to add new features. Technically, since I’ve refactored this into a class method, you could even write a test to verify the enumerator, although I didn’t when I originally solved the problem.

def self.sequences
  Enumerator.new do |y|
    sequence = Sequence.new
    while sequence
      sequence = sequence.next
      break unless sequence
      y << sequence
    end
  end
end

What’s cool about that is that invoking the enumerator is very simple:

if __FILE__ == $0
  result = Sequence.sequences.select { |s| s.valid? }
  p result.size
  p result.map(&:elements)
end
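Because the enumerator produces sequences on demand, it can also be used to peek at just the first few while the next rules are still incomplete. The sketch below uses the two-rule version of next, with one deliberate change that is not in the original: it builds a new array with `+` instead of appending with `<<`, so that a sequence already handed out is not modified when its successor is generated.

```ruby
class Sequence
  attr_reader :elements

  def initialize(*elements)
    @elements = elements
  end

  # Two-rule version of next. elements + [...] returns a new array
  # (a change made for this sketch; << would mutate the receiver's array).
  def next
    return Sequence.new(1) if elements.empty?
    Sequence.new(*(elements + [elements.last]))
  end

  def self.sequences
    Enumerator.new do |y|
      sequence = Sequence.new
      while (sequence = sequence.next)
        y << sequence
      end
    end
  end
end

# With only two rules the walk never ends, but first(3) stops after three.
p Sequence.sequences.first(3).map(&:elements)
# prints [[1], [1, 1], [1, 1, 1]]
```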

Okay, the logic at this point goes like this: following the rule we have so far, the first sequence we test is [1], the next one is [1, 1], and the next one is [1, 1, 1]. They all fail, which is fine. The relevant point is that there’s no need to then test [1, 1, 1, 2], because we know it will fail from the trio of 1's. We can move directly on to [1, 1, 2].

We’re looking for a small, verifiable step, so to put it another way:

  it "a sequence with a trio should increase its last element" do
    sequence = Sequence.new(1, 1, 1)
    sequence.next.elements.should == [1, 1, 2]
  end

Which makes our developing program:

  def next
    return Sequence.new(1) if elements.empty?
    if has_digit_trio?
      elements[-1] += 1
      return Sequence.new(*elements)
    elsif sum < 15
      return Sequence.new(*(elements << elements[-1]))
    end
  end

Hey, we got a nice little side benefit from writing the digit trio method separately: we get to use it here.

At this point the program will hum along merrily until it gets to a sequence whose sum is over 15. (You can verify this by running the program via the enumerator). What we want when a sequence sum gets too high is to again stop, back up, and take the next to last digit and increase it. Or:

  it "a sequence whose sum is too high should back up an element" do
    sequence = Sequence.new(1, 1, 2, 2, 3, 3, 4)
    sequence.next.elements.should == [1, 1, 2, 2, 3, 4]
  end

Our next method is starting to get kind of complex:

def next
  return Sequence.new(1) if elements.empty?
  if has_digit_trio?
    elements[-1] += 1
    return Sequence.new(*elements)
  elsif sum < 15
    return Sequence.new(*(elements << elements[-1]))
  elsif sum >= 15
    new_elements = elements[0 .. -2]
    new_elements[-1] += 1
    return Sequence.new(*new_elements)
  end
end
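To see the three rules interact, it can help to trace a few steps by hand. The sketch below restates the same rules as a function on plain arrays; `next_elements` is a hypothetical helper written for this trace, not part of the Sequence class, and it returns new arrays rather than modifying its argument.

```ruby
# Three-rule successor on plain arrays (hypothetical helper for tracing).
def next_elements(elements)
  return [1] if elements.empty?
  if elements.any? { |digit| elements.count(digit) > 2 }
    # Trio: skip ahead by increasing the last digit.
    elements[0..-2] + [elements.last + 1]
  elsif elements.sum < 15
    # Sum too low: append a copy of the last digit.
    elements + [elements.last]
  else
    # Sum at or above 15: drop the last digit and increase the one before it.
    shorter = elements[0..-2]
    shorter[0..-2] + [shorter.last + 1]
  end
end

current = [1, 1, 2, 2, 3, 3]
7.times do
  current = next_elements(current)
  puts "#{current.inspect} sum=#{current.sum}"
end
# prints:
# [1, 1, 2, 2, 3, 3, 3] sum=15
# [1, 1, 2, 2, 3, 3, 4] sum=16
# [1, 1, 2, 2, 3, 4] sum=13
# [1, 1, 2, 2, 3, 4, 4] sum=17
# [1, 1, 2, 2, 3, 5] sum=14
# [1, 1, 2, 2, 3, 5, 5] sum=19
# [1, 1, 2, 2, 3, 6] sum=15
```

The last line of the trace is a sequence that satisfies all of the kata's constraints, reached without ever testing a permutation of an earlier sequence.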

Two more pieces, both of which work toward ending the sequence. We need to make sure that no digit goes over 9, and we need to make sure the sequence ends.

  it "a sequence should not let a digit get over 9" do
    sequence = Sequence.new(1, 9, 5)
    sequence.next.elements.should == [2]
  end

  it "the last sequence should end" do
    sequence = Sequence.new(9, 6)
    sequence.next.should be_nil
  end

Which we can solve with:

def next
  return Sequence.new(1) if elements.empty?
  if has_digit_trio?
    elements[-1] += 1
    return Sequence.new(*elements)
  elsif sum < 15
    return Sequence.new(*(elements << elements[-1]))
  elsif sum >= 15
    new_elements = elements[0 .. -2]
    if new_elements[-1] == 9
      return nil if new_elements.size == 1
      new_elements = new_elements[0 .. -2]
    end
    new_elements[-1] += 1
    return Sequence.new(*new_elements)
  end
end

It’s a little ugly, but I actually don’t see all that much in the way of necessary refactoring. I admit that the code as it stands is a little goofy in terms of whether Sequence objects are mutable or not, but it doesn’t interfere with solving the problem. Which is the point. If I need to worry about Sequence objects being immutable, I need to write additional tests to drive that change in logic. If I want to just refactor without changing logic, I can clean up as much as I want with some confidence that the tests will catch any errors.
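If immutability did become a requirement, one possible shape for the change is sketched below. It keeps the same rules but never modifies an existing array; freezing `@elements` and the private `bump_last` helper are assumptions made for this example, not something taken from the original solution.

```ruby
class Sequence
  attr_reader :elements

  def initialize(*elements)
    @elements = elements.freeze  # any in-place modification now raises
  end

  def sum
    elements.inject(0) { |total, n| total + n }
  end

  def histogram
    @histogram ||= elements.each_with_object(Hash.new(0)) { |digit, counts| counts[digit] += 1 }
  end

  def has_digit_pair?
    histogram.values.any? { |count| count == 2 }
  end

  def has_digit_trio?
    histogram.values.any? { |count| count > 2 }
  end

  def valid?
    !has_digit_trio? && has_digit_pair? && sum == 15
  end

  # Same rules as before, but every branch builds a fresh array.
  def next
    return Sequence.new(1) if elements.empty?
    if has_digit_trio?
      Sequence.new(*bump_last(elements))
    elsif sum < 15
      Sequence.new(*(elements + [elements.last]))
    else
      shorter = elements[0..-2]
      if shorter.last == 9
        return nil if shorter.size == 1
        shorter = shorter[0..-2]
      end
      Sequence.new(*bump_last(shorter))
    end
  end

  def self.sequences
    Enumerator.new do |y|
      sequence = Sequence.new
      while (sequence = sequence.next)
        y << sequence
      end
    end
  end

  private

  # Returns a copy of digits with the last digit increased by one.
  def bump_last(digits)
    digits[0..-2] + [digits.last + 1]
  end
end

start = Sequence.new(1, 1, 1)
following = start.next
p start.elements                      # prints [1, 1, 1] (unchanged)
p following.elements                  # prints [1, 1, 2]
p Sequence.sequences.count(&:valid?)  # prints 38
```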

Now, running the entire sequence will return the 38 sequences that solve this problem, with nearly the entire program under test. We were able to isolate small pieces of a more complex problem and turn them into concrete, verifiable pieces of code that enabled us to organically build the larger program.
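One way to cross-check the count independently of the Sequence class is a brute-force pass over every possible choice of how many times each digit appears. This is a different technique from the test-driven walk above, offered only as a sanity check; `brute_force_sequences` is a hypothetical name for the sketch.

```ruby
DIGITS = (1..9).to_a

# Each digit is used 0, 1, or 2 times, which gives 3**9 = 19,683 candidates.
def brute_force_sequences
  results = []
  [0, 1, 2].repeated_permutation(DIGITS.size) do |counts|
    next unless counts.include?(2)  # at least one digit appears exactly twice
    total = DIGITS.zip(counts).sum { |digit, count| digit * count }
    next unless total == 15
    # Expand the counts back into a sorted list of digits.
    results << DIGITS.zip(counts).flat_map { |digit, count| [digit] * count }
  end
  results
end

puts brute_force_sequences.size  # prints 38
```

Since each candidate is described by its per-digit counts, reorderings of the same digits never show up as separate results.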

If you want to see my complete solution, the tests are at http://gist.github.com/602077 and the program code is at http://gist.github.com/602076. Other solutions, with different assumptions and methods, are referenced in the original blog post at http://railsrx.com/2010/09/27/a-quick-ruby-kata/.

I hope you found this article valuable and that it gives you an insight into the “Testing Mindset”. Feel free to ask questions and give feedback in the comments section of this post. Thanks!

Posted by Noel Rappin

{ 9 comments… read them below or add one }

John Tantalo September 30, 2010 at 9:34 am

> it "is valid if the sum is 15" do

Don’t you mean the inverse?

> it "is invalid if the sum is not 15" do

Also, this test doesn’t actually test if the sum is 15. Shouldn’t you include the antecedent as part of the test itself? I noticed have inverted the test and included the antecedent in the “is invalid if there is a digit trio” test.

Noel Rappin September 30, 2010 at 9:49 am

I’m not sure I understand the question. The test for verification doesn’t need to test the sum because the previous test verifies that the sequence class can sum, so I’m testing that a sequence that I know adds to 15 is classified as valid. Could you be more specific about what you think the test should look like?

John Tantalo October 5, 2010 at 7:46 pm

Your test should show that the sum is 15 if the sequence is valid, and not the other way around, right? The way you phrased the spec is confusing because it’s not something you should be testing for, because it’s not true.

Here’s my version of the test,

  it "is invalid if the sum isn't 15" do
    sequence = Sequence.new(1, 1, 7, 6)
    sequence.sum.should == 15
    sequence.should be_valid
    sequence = Sequence.new(1, 1, 7, 7)
    sequence.sum.should_not == 15
    sequence.should_not be_valid
  end

As somebody reading your code, I shouldn’t have to trust that the sample sequences have the property you stated; you should prove it in your test. You followed this pattern in your “is invalid if there is no pair” and “is invalid if there is a digit trio” tests.

Noel Rappin October 6, 2010 at 8:45 am

Okay, I think I understand this now — you left out the word “you” in the last sentence of your post, and I partially misread it.

The difference between the 15 test and the digit pair tests, at least as I was building them, was that the summing functionality was already there. Therefore, strictly from a TDD perspective, I don’t need “sequence.sum.should == 15” because that test doesn’t add any new logic to the code.

If I wanted to add an “equal_to_desired_sum?” method, which would be reasonable and consistent with the other two methods, then I’d add sequence.should be_equal_to_desired_sum as a test to force the creation of the new method.

Prakash Murthy September 30, 2010 at 11:43 am

Thanks for an awesome blogpost illustrating in detail how to develop Test-first!

I had solved the Quick Ruby Kata earlier in my own way with print statement validation & no formal tests whatsoever. So it was a good learning experience to go through the step-by-step details in the blog post.

Quite a few other learning points for me on this post – wrote them up on my blog: http://zero2railshero.tumblr.com/post/1214370090/excellent-ruby-learning-blog-post-the-testing-mindset

slawosz September 30, 2010 at 7:53 pm

Hi,
I see one question with this approach:
Method has_digit_pair? for me is not a part of public api – we need it only internally. So it could be private. In this case, it is no problem, but what if it would change state of object? It would be better to make it private. But then TDD is more complicated….
How to deal with it?

Noel Rappin September 30, 2010 at 11:48 pm

I disagree with the premise — why would I make has_digit_pair? private? Why would I think it’s not part of the public API?

I could imagine this class as part of a system where there’s a visual display that displays the result of each sequence, in which case that method could easily be public. My bar for making a model method private is really high, especially in Ruby, where there’s really nothing stopping a determined other class from calling the private method anyway.

If it happens that has_digit_pair? changes the state of the object, we would find that out because it would break a test.

In the general case, you wouldn’t need to test a private method explicitly, you would test the public methods that call it — the existence of a private method is basically an implementation detail.

Tony Schneider October 1, 2010 at 10:23 pm

excellent post, thanks!

b.b. September 2, 2011 at 5:43 pm

You did more or less thorough up-front analysis: “a little” and “back and forth” and “a couple tries” later on, which is just error-prone. Your decision to go with “a mechanism where each sequence would be able to say which sequence to test next” was actually a guess. There is simply no evidence why starting “from the end result” should be harder than your solution. The noted ugliness comes down to a noisy mixture of concerns within the Sequence-object. Never mind, E. Evans himself once said that TDD/BDD/DDD is not very design friendly.

In this very Kata to generate good enough sequence-candidates it is in fact cheaper to just count up eliminating 0-containing-numbers and order-variations.

For I don’t speak a single word Ruby, pseudo-code:

has0(0).Should.Be.True
has0(1).Should.Be.False
isDuplicateSequence(21).Should.Be.True
isDuplicateSequence(12).Should.Be.False
sumIs15(78).Should.Be.True
sumIs15(0).Should.Be.False
hasDuplicate(11).Should.Be.True
hasDuplicate(0).Should.Be.False
hasTrio(111).Should.Be.True
hasTrio(0).Should.Be.False

for (int i = 0, results = 0; results < 38; i++) {
    if (has0(i) || isDuplicateSequence(i))
        continue;
    if (sumIs15(i) && hasDuplicate(i) && !hasTrio(i)) {
        // there it is: i – print it, collect it, whatever …
        results++;
    }
}

That's it.
