How do I smell Ruby code?

by Timon Vonk on March 1, 2011

How do I smell Ruby code?

Understanding the worst of code

This guest post is by Timon Vonk, who is a self-employed Ruby enthusiast and standard nerd with an edge. He has worked with Ruby for several years, but is well-known with many other (programming) languages. Also likes martial arts, loud music, varying quantities of booze and a good scotch.

Introduction

Timon Vonk Writing bad code isn’t a bad thing. Not understanding the problem you’re trying to solve any better after having written that piece of code is. Fortunately, that happens far less often. In this article I hope to give a better understanding of Ruby code by going into Ruby specific code smells. We’ll start with some simple examples that are common in all programming languages – they just need to be covered – and then dive into some Ruby specific smells.

So what is that smell?

The term was coined in the 90s by Kent Beck on WardsWiki (one of the first wikis around) and has been popularized ever since. A code smell is a hunch, not necessarily measurable, that the code you’re looking at can be improved in some way. This process of improvement is called refactoring, as you might know. And as far as refactoring is concerned, there is no time like now, don’t leave open ends; it’s a bad habit.

The Basics

Let us give a quick rundown on the more basic code smells:

  • Duplicated code, if you see any, is almost always a bad thing. We’ll get into this part a little later.
  • Multiple method / class responsibility is always a bad thing. Try factoring out your solution in multiple methods. It will make you’re code more readable and a lot better maintainable. Large methods and classes are a dead giveaway for this as well.
  • A class should never use more methods from other classes than from itself. Why is it even there?
  • A child class should always honor the contract of the parent class, i.e., be a kind of that class. Check out the Liskov substitution principle for more information.
  • If a class hardly does anything, why is it there?
  • Does your solution have a more simple approach? Complexity can be a reason of pride for some – if not most – coders, but too much makes it terrible to understand, especially later on.
  • Non-descriptive or too long identifiers or names are a good sign that either you can’t define the responsibility of the code, or you have a hard time with naming conventions.

Simple, easy smells should give a good start on fine tuning your code. However, every language has its own specific quirks. Let us take a look at Ruby.

Calling eval on user input or unchecked code

input = "'rm -rf /'"
klass =  eval input

Of course this is bad. I hope it doesn’t need any explanation. Try not to use eval, but instance_eval instead. And if you use either, make sure that the code you eval is secure — never eval user input directly.

Nested blocks without added value

array = [["banana", "apple"],["pineapple","beer"]]
# And I want to call reverse on each element, I could do
array.collect { |sub_array| sub_array.collect { |subsub_array| subsub_array.reverse }}
# But this is much nicer
array.collect { |sa| sa.collect(&:reverse) }
# => [["ananab", "elppa"],["elppaenip","reeb"]]

The reason is simple enough. Your code is more readable, and that’s what we all want. So what about nested multi-line blocks? Check it out, big chance your solution is the root of this particular evil.

Code similarity

def post_to_site(data)
  url = build_url(data)
  response = RestClient.post(url)
end

def get_from_site(data)
  url = build_url(data)
  response = RestClient.get(url)
end

def delete_from_site(data)
  url = build_url(data)
  response = RestClient.delete(url)
end

You can easily solve this lump of code by introducing some meta-programming:

def  response_from_site(data, method = :get)
  url = build_url(data)
  response = RestClient.public_send(method, url)
end

This gives you a clean, nicer method. And it’s readable too! Isn’t that nice?

Long, repetitive and cluttering statements

Often enough you have similar parameters that call similar methods. For instance, you might need to check on some parameter and call the associated method. Or even simpler, you might need to check if a certain user input matches your criteria. I prefer a simple rule of thumb, if you’re working with any sort of collection or set, the functional approach is always cleaner, more simple and most definitely faster. The point is not to dictate when and whether you should prefer the functional approach, just that you understand that long lines of repetitive clutter screw up your code.

Take the following example:

input = "english"
case input
when "english"
  puts "English, ***, do you speak it?"
when "french"
  puts "Baguette!"
when "dutch"
  puts "I only smoke *** when it's free."
else
  puts "Dunno"
end

You can imagine in complex applications that this goes on and on. I actually see it happen a lot and it’s not necessarily bad. However, it’s hard to maintain and in more complex situations it can get really hard to read through. I’ve seen loads of Rails applications where they use just this to check on a particular parameter. Really ugly!

Since it seems like you’re white listing, one way to solve it would be to use a hash with input:result.

whitelist = {
  "english" => "English, ***, do you speak it?",
  "french" => "Baguette!",
  "dutch" => "I only smoke *** when it's free.",
  "other" => "Dunno."
}

if whitelist.has_key?(input)
  puts whitelist[input]
else
  puts whitelist[other]
end

It’s always important to be proud of the code you write. It really helps if it doesn’t smell. And I hope this article helped you do that.

Feel free to ask questions and give feedback in the comments section of this post. Thanks!

You might want to read a related article:

Technorati Tags: , ,

Posted by Timon Vonk

{ 18 comments… read them below or add one }

Cedric March 1, 2011 at 11:58 am

Very nice article. Having recently started developing in Ruby, where I used to develop mostly in Java, I am familiar with the Java ‘smells’. So it is good to see some thought being put in to detect the ‘smells’ in Ruby code. Ruby being a lot less ridged than Java in that there are many more ways to do the same thing may result in a greater potential for the code to ‘smell’.

Reply

Maurice March 1, 2011 at 12:37 pm

Good, points that are given in “Basics” should be treated under design category. People who has discipline in OO design and programming, eventually follows this. Considering the rapid growth of high level languages Java, Scala and even Ruby, I sincerely believe programmer should have bare minimum fundamentals of OO concepts. Duplicate code, heavy /thick class [too many methods], modules should be loosely coupled and highly cohesive, contract between classes etc., are fundamentals of any OO concepts. Thanks for this nice article.

Reply

Buddy Lindsey March 1, 2011 at 1:50 pm

I like this post there is a lot of good information in it, for me especially since I am only 6 months into using ruby.

I know I have a problem with naming conventions. However, I can’t seem to find any good resources for this. Can you recommend something to read on it? I hate trying to figure out what kind of casing or names to add where and when. It would be nice if there was a generalized set of guidelines to at least get you started on proper naming.

Reply

Timon Vonk March 1, 2011 at 3:01 pm

Good that you like it! The Ruby Pickaxe has some good bits on naming conventions, but honestly, if you stick to the lowercase-underscore format for methods, keep them human readable and descriptive, you’re doing a great job.

Reply

Kerry Buckley March 1, 2011 at 6:29 pm

You can make the last example even simpler:

whitelist = Hash.new(“Dunno”)
whitelist.merge!({
“english” => “English, ***, do you speak it?”,
“french” => “Baguette!”,
“dutch” => “I only smoke *** when it’s free.”,
“other” => “Dunno.”
})

puts whitelist[input]

Reply

BiHi March 1, 2011 at 6:41 pm

The code doesn’t work as it is, +other+ being undefined. If you want to go the hash route instead of using case… when, you might want to use the ability to set a default value to your hash, I’d say what you did here is a code smell to me ;)

whitelist = Hash.new(“Dunno.”).merge({
“english” => “English, ***, do you speak it?”,
“french” => “Baguette!”,
“dutch” => “I only smoke *** when it’s free.”,
})
puts whitelist[input]

Reply

Timon Vonk March 1, 2011 at 6:47 pm

Ah yeah, well noted, forgot a quote. I suppose your example works fine too, but it seems a bit excessive to use merge just so you can set the default value nicer.

Reply

BiHi March 1, 2011 at 7:21 pm

Maybe more expressive, and not using merge, you can set +default+ directly on the hash too.

whitelist = {
“english” => “English, ***, do you speak it?”,
“french” => “Baguette!”,
“dutch” => “I only smoke *** when it’s free.”,
}
whitelist.default = ‘Dunno.’
puts whitelist[input]

Using a key to set a default value doesn’t seem right, what if you need that key later on (not applicable to this example)? At least, I’d use a symbol key like :default for the default, which will avoid allocating a string each time you need to access the default value.

Reply

Steve Schwartz March 4, 2011 at 11:19 am

Or how about:

whitelist = ({
“english” => “English, ***, do you speak it?”,
“french” => “Baguette!”,
“dutch” => “I only smoke *** when it’s free.”
})

puts whitelist[input] || “Dunno”

Reply

Julio Cesar Villasante March 1, 2011 at 9:42 pm

I like this post, the hash refactoring is a very good one. Normally this is a work for inheritance and polymorfism but as you said the simple solution is always the best one.

Keep up the good work.

Reply

Patrick Tulskie March 1, 2011 at 10:21 pm

In your nested blocks example, I’d advise against using that technique in anything less than Ruby 1.9. Using the symbol is about 2-3x slower in Ruby 1.8.

I whipped up a gist to show that: https://gist.github.com/da98ab48dbe41986a2aa

In 1.9.2, the performance difference between the two is not as great.

Reply

Timon Vonk March 3, 2011 at 2:33 am

That’s very true, thanks for pointing that out.

Reply

Eric Hodel March 1, 2011 at 11:28 pm

All forms of eval are code smell. instance_eval may be less smelly but you still probably shouldn’t be using it in joyous everyday code.

Reply

Jarmo Pertman March 2, 2011 at 12:28 am

I’d just use ‘or’ there:
whitelist = {
“english” => “blah”,
“other” => “dunno”
}
whitelist[input] || whitelist["other"]

No merge and default hash value complexity. For me it seems more readable at least.

Reply

Timon Vonk March 3, 2011 at 2:35 am

I actually quite like this one, maybe even use it next time. No issues with readability or complexity, simple and to the point :-)

Reply

Steve Schwartz March 4, 2011 at 11:21 am

You don’t really need the “other” key/value in the hash. Just leave it out and do:

whitelist[input] || “dunno”

Reply

François Beausoleil March 2, 2011 at 2:38 am

There exists a Hash#fetch method, which is especially suited to this: https://gist.github.com/849890

whitelist = {
“english” => “English, ***, do you speak it?”,
“french” => “Baguette!”,
“dutch” => “I only smoke *** when it’s free.”,
}

puts whitelist.fetch(input) { “Dunno” }

When using #fetch, if the key does not exist, either KeyError is raised, or the block is called. The block may raise an alternate exception, if it would make things more explicit.

There’s one thing #fetch does not do which would help a lot: if the message included the key which generated the error. Still, #fetch is immensely useful.

Reply

Patrick Ma March 24, 2011 at 12:21 pm

Have you heard about rails_best_practices?

Reply

Leave a Comment

{ 38 trackbacks }

Previous post:

Next post: