How does your code smell?

by Kevin Rutherford on November 8, 2010

How does your code smell?

This guest post is by Dr. Kevin Rutherford, a UK-based agile/XP coach, developer and project leader, with over 25 years experience in software development. He is also the founder of AgileNorth and XP-Manchester. Contact him via http://www.kevinrutherford.co.uk.

Dr. Kevin Rutherford I expect you’re a conscientious modern developer. You write your code using test-driven or behaviour-driven development, and you never write a line of production code unless it is needed in order to make a test pass. But there’s more to TDD than RED-GREEN-RED-GREEN; you have to REFACTOR each time the tests go green. Specifically, we write a test and see it fail; then we write code in the most straightforward way we possibly can to make it pass; then we refactor to clean up any mess we just made. But in that third step, just what should we refactor, and why? And if we need to refactor a pile of code that wasn’t developed in a test-driven manner, how do we get a handle on where to begin?

Refactoring is defined as improving software without changing what it does. The intention is to take bad code and replace it with something better. The next question must then be: What is “good code” and what is “bad code”? How can we tell them apart and recognize which is which? And when we can recognize bad code, what should we do to eliminate it? The best guiding principle here is the four rules of Simple Design, first described by Kent Beck in the early descriptions of Extreme Programming. As re-expressed by J.B.Rainsberger, the design of a piece of code is “simple” when it:

  1. Passes its tests
  2. Minimizes duplication
  3. Maximizes clarity
  4. Has fewer elements

So that’s it: First ensure that your code passes all its tests, then remove all duplication, then make sure the design intent is clear, and then get rid of anything that doesn’t need to be there. Easy! But “duplication” and “clarity” are non-trivial concepts. For example, it is usually easy to see when a chunk of code has been copied and pasted, but far less easy to see when a piece of knowledge, or a responsibility, exists in more than one place. Textual duplication can be detected by tools, but the more subtle kinds cannot. Similarly, the use of clear names can appear subjective; maybe a name I believe to be clear confuses you, and vice versa. Our goal is Simple Design, but the path to simplicity may be very unclear.

To help with this difficulty, the agile community has developed informal names for the most common kinds of “bad” code. These are anti-patterns, and are often called “code smells,” after Kent Beck’s analogy between code and babies: “If it stinks, change it”. Reflecting the rules of Simple Design, code smells come in two major varieties: they either make the code harder to understand, or something in the code is duplicated (which more than doubles the risk involved in making future changes). However, until recently there has been no catalog of the kinds of smell you might find in Ruby code, and Bill Wake and I set about remedying that problem with our book Refactoring in Ruby. Here’s a list of the major types of code smell we identified, grouped approximately according to their impact on the rules of Simple Design:

Duplication Names
Alternative Modules with Different Interfaces Comments
Combinatorial Explosion Complicated Boolean Expression
Control Coupling Divergent Change
Data Class Dynamic Code Creation
Data Clump Global Variable
Derived Value Implementation Inheritance
Duplicated Code Inappropriate Intimacy
Feature Envy Incomplete Library Module
Greedy Method Inconsistent Names
Greedy Module Long Parameter List
Large Module Nil Check
Lazy Class Runaway Dependencies
Long Method Shotgun Surgery
Message Chain Simulated Polymorphism
Middle Man Special Case
Open Secret Speculative Generality
Parallel Inheritance Hierarchies Temporary Field
Procedural Code Type Embedded in Name
Refused Bequest Uncommunicative Name
Reinvented Wheel Utility Function
Repeated Value

For each different code smell identified, we give a set of tell-tale symptoms so you can spot it easily, an indication of what refactorings you can do to remove the smell, and an indication of what other smells you might reveal as you do so. For example, here’s a brief summary of one code smell from the catalog:

Feature Envy
What To Look For:

  • Code references another object more than it references itself, or
  • Several clients manipulate a particular type of object in similar ways.

What To Do:

  • Extract Method to isolate the envious code
  • Move Method to put it with the referenced object

What To Look For Next:

  • Duplication: Look for further duplication around the clients
  • Communication: Review names for the receiving class for consistency

Refactoring in Ruby also provides hundreds of challenges, exercises that will help you practice detecting and removing smells from working Ruby code. For example, take a look at the following code; which smells can you see?

# Creates a report showing machine activity
class Report
  def report(machines)
    statuses = machines.map { |machine| status(machine) }
    "FACTORY REPORT\n#{statuses.join("\n")}\n"
  end

  def status(machine)
    activity = machine.bin == nil ? '' : " bin=#{machine.bin}"
    "Machine #{machine.name}#{activity}\n"
  end
end

As a companion to the book I have also created a very naive code smell detector tool called Reek. Reek cannot perform enough static analysis to detect every code smell listed, but it does a reasonable job of detecting some forms of Class Variable, Control Couple, Data Clump, Duplication, Irresponsible Module, Large Class, Long Method, Long Parameter List, Feature Envy, Utility Function, Nested Iterators, Simulated Polymorphism and Uncommunicative Name. For example, consider the little source file above; Reek reports the following code smells:

$ reek simple_report.rb
simple_report.rb -- 3 warnings:
  Report#status calls machine.bin twice (Duplication)
  Report#status doesn't depend on instance state (LowCohesion)
  Report#status refers to machine more than self (LowCohesion)

Reek can also produce a YAML report giving extra details such as line numbers:

$ reek --yaml simple_report.rb
---
- !ruby/object:Reek::SmellWarning
  location:
    lines:
    - 9
    - 9
    context: Report#status
    source: simple_report.rb
  smell:
    class: Duplication
    occurrences: 2
    subclass: DuplicateMethodCall
    call: machine.bin
    message: calls machine.bin twice
  status:
    is_active: true
- !ruby/object:Reek::SmellWarning
  location:
    lines:
    - 8
    context: Report#status
    source: simple_report.rb
  smell:
    class: LowCohesion
    subclass: UtilityFunction
    message: doesn't depend on instance state
  status:
    is_active: true
- !ruby/object:Reek::SmellWarning
  location:
    lines:
    - 8
    context: Report#status
    source: simple_report.rb
  smell:
    class: LowCohesion
    subclass: FeatureEnvy
    receiver: machine
    message: refers to machine more than self
    references: 3
  status:
    is_active: true

As you can see, this report is a little rough around the edges, and there are still many more code smells that Reek could detect. But even in its current draft state many people find Reek helpful, either as a part of their build process (the Reek gem includes an easy-to-use Rake task) or on an ad hoc basis. If you try it please let me know how well it works for you.

I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Thanks!

Do also read these awesome Guest Posts:

Technorati Tags: , ,

Posted by Kevin Rutherford

Previous post:

Next post: