How do I build DSLs with yield and instance_eval?

by Michael Bleigh on November 30, 2010

This guest post is by Michael Bleigh, a Rubyist developing web applications and more for Intridea from his hometown of Kansas City. He is a prolific member of the open-source and Ruby communities, releasing such projects as OmniAuth and Hashie. In addition, he has presented at many Ruby events including RubyConf 2010, RailsConf 2009/2010 and more. While he spends much of his time writing Ruby code, he also enjoys graphic design and user experience work.

Ruby provides some fantastic built-in features for creating Domain Specific Languages (DSLs). A Domain Specific Language is, for our purposes today, like a miniature specialized programming language within a programming language. It is a way to expose functionality in a simple, readable format for other programmers (or yourself) to use. One of the most commonly used DSLs in the Ruby world is Sinatra:

require 'rubygems'
require 'sinatra'

get '/hello' do
  "Hello world."
end

Sinatra is a Domain Specific Language for building web applications. Its syntax is built around HTTP verbs such as GET, POST, and PUT. Exposing functionality this way makes the code much more readable than a more complex, programmatic API like this one:

app = NoDSL::Application.new

app.on_request(:get, :path_info => '/hello') do |response|
  response.body = "Hello world."
end

This is far less readable than Sinatra’s code, but in many programming languages this would be a perfectly acceptable design for a library. However, because Ruby has powerful facilities for metaprogramming and first-class functions, it is not only common practice but essentially expected for libraries to provide clean, readable APIs and leverage DSLs when necessary to do so.

Yield to Oncoming Code

The yield statement is a very important concept to understand when building a Ruby DSL. The functionality provided by yield allows a developer to pass off control temporarily to allow for configuration or advanced functionality. Yielding is a pattern that completely pervades the Ruby language, including the Ruby standard library (the functionality included with the language itself). If you’ve ever used the Array#map (or Array#collect) functionality, that’s one example of a yield pattern. An example use to increment all the items in an array would look like this:

[1, 2, 3].map{|i| i + 1} # => [2, 3, 4]

So how would we re-implement the map functionality if it weren’t provided for us? It’s actually quite simple using the yield statement:

class Array
  def my_map
    result = []
    self.each do |item|
      result << yield(item)
    end
    result
  end
end

[1, 2, 3].my_map{|i| i + 1} # => [2, 3, 4]

The yield statement essentially stops the evaluation of the method and evaluates the block passed into the method, calling it with any arguments supplied in the yield statement itself. So if I had a method that simply yielded its argument, it would look like this:

def parrot(argument)
  yield argument
end

parrot("Polly want a cracker.") do |argument|
  puts argument
end

# Output: "Polly want a cracker."
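
Two details of yield are worth seeing in code before we move on: the yield expression evaluates to whatever the block returns, and the method picks up right where it left off afterwards. Here is a minimal sketch; the twice method is a made-up name for illustration, not something from the standard library.

```ruby
# Hypothetical helper: yields twice and collects what the block returns.
def twice(value)
  return [] unless block_given? # no block was passed, so there is nothing to run

  first  = yield(value) # yield evaluates to the block's return value
  second = yield(first) # the method resumes here and can yield again
  [first, second]
end

p twice(2) { |n| n * 10 } # prints [20, 200]
p twice(2)                # prints []
```

The block_given? guard is what keeps the second call from raising a LocalJumpError when no block is supplied.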

Using yield for DSLs

Now, using yield, we have the facilities to build a simple DSL. Let’s say we want to create a Domain Specific Language for describing kitchen recipes. We want to be able to add ingredients as well as steps, then print out the result. Our basic class would look something like this:

class Recipe
  attr_accessor :name, :ingredients, :instructions

  def initialize(name)
    self.name = name
    self.ingredients = []
    self.instructions = []
  end

  def to_s
    output = name.dup # copy the name so that << does not modify it
    output << "\n#{'=' * name.size}\n\n"
    output << "Ingredients: #{ingredients.join(', ')}\n\n"

    instructions.each_with_index do |instruction, index|
      output << "#{index + 1}) #{instruction}\n"
    end

    output
  end
end

Now we can build a recipe:

mac_and_cheese = Recipe.new("Mac and Cheese")

mac_and_cheese.ingredients << "Noodles"
mac_and_cheese.ingredients << "Water"
mac_and_cheese.ingredients << "Cheese"

mac_and_cheese.instructions << "Boil water."
mac_and_cheese.instructions << "Add noodles, boil for six minutes."
mac_and_cheese.instructions << "Drain water."
mac_and_cheese.instructions << "Mix in cheese with noodles."

The output of ‘puts mac_and_cheese’ will look like this:

Mac and Cheese
==============

Ingredients: Noodles, Water, Cheese

1) Boil water.
2) Add noodles, boil for six minutes.
3) Drain water.
4) Mix in cheese with noodles.

While this works, the code doesn’t seem to be very elegant at all! We need a way to make it look more like you would see on a recipe card. Let’s add some functionality using yield. First, we’ll rewrite the initializer to use yield:

def initialize(name)
  self.name = name
  self.ingredients = []
  self.instructions = []

  yield self if block_given?
end

Upon initialization, a new Recipe will now yield itself, meaning that the caller can modify it within a block context. Next, we need to add some friendly methods for adding ingredients and instructions to the class:

def ingredient(name, options = {})
  ingredient = name.dup # copy so the caller's string is not modified
  ingredient << " (#{options[:amount]})" if options[:amount]

  ingredients << ingredient
end

def step(text, options = {})
  instruction = text.dup # copy so the caller's string is not modified
  instruction << " (#{options[:for]})" if options[:for]

  instructions << instruction
end

This lets us create a recipe in a much more natural way:

mac_and_cheese = Recipe.new("Mac and Cheese") do |r|
  r.ingredient "Water", :amount => "2 cups"
  r.ingredient "Noodles", :amount => "1 cup"
  r.ingredient "Cheese", :amount => "1/2 cup"

  r.step "Heat water to boiling.", :for => "5 minutes"
  r.step "Add noodles to boiling water.", :for => "6 minutes"
  r.step "Drain water."
  r.step "Mix cheese in with noodles."
end

Once again, if we run ‘puts mac_and_cheese’ we can see the results of our handiwork:

Mac and Cheese
==============

Ingredients: Water (2 cups), Noodles (1 cup), Cheese (1/2 cup)

1) Heat water to boiling. (5 minutes)
2) Add noodles to boiling water. (6 minutes)
3) Drain water.
4) Mix cheese in with noodles.

Great! Not only do we have more functionality (allowing the user to specify amounts of ingredients and durations for instructions), but this looks a lot closer to something you might see on a recipe card.
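
Since the class has been shown in pieces, here is a sketch of how the whole thing might look once assembled. Two details are my own assumptions rather than part of the walkthrough: the strings are copied with dup before being appended to, and the yield is guarded with block_given? so that Recipe.new still works without a block. The recipe is shortened to keep the example small.

```ruby
class Recipe
  attr_accessor :name, :ingredients, :instructions

  def initialize(name)
    self.name = name
    self.ingredients = []
    self.instructions = []

    yield self if block_given? # assumption: a block is optional
  end

  def ingredient(name, options = {})
    ingredient = name.dup # assumption: copy instead of mutating the argument
    ingredient << " (#{options[:amount]})" if options[:amount]
    ingredients << ingredient
  end

  def step(text, options = {})
    instruction = text.dup
    instruction << " (#{options[:for]})" if options[:for]
    instructions << instruction
  end

  def to_s
    output = name.dup
    output << "\n#{'=' * name.size}\n\n"
    output << "Ingredients: #{ingredients.join(', ')}\n\n"
    instructions.each_with_index do |instruction, index|
      output << "#{index + 1}) #{instruction}\n"
    end
    output
  end
end

mac_and_cheese = Recipe.new("Mac and Cheese") do |r|
  r.ingredient "Water", :amount => "2 cups"
  r.step "Heat water to boiling.", :for => "5 minutes"
end

puts mac_and_cheese
```

Because to_s works on a copy of the name, calling it more than once gives the same text each time.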

Using yield is a great way to provide a simple configuration DSL and it takes almost no extra effort. However, to really take a DSL to the next level, you may be interested in utilizing another piece of the Ruby language called instance_eval.

Kicking It Up A Notch With instance_eval

While many dynamic programming languages provide an eval function for evaluating a string as though it were source code, Ruby’s powerful blocks allow you to do this in a much cleaner and more readable fashion in some specific cases. For our purposes today, we’ll be using instance_eval. The instance_eval method takes either a string or a block and evaluates it in the context of the object on which instance_eval is called (the receiver). You can do this with any object in Ruby, even a String:

"Hello.".instance_eval{ size } # => 6

This provides a distinct advantage, in some ways, over yield by actually changing the evaluation context so that there’s no need to specify the object in question for each statement (e.g. r.ingredient). You can see an instance_eval based DSL in action if you’ve used the Rails 3 Router. However, the Rails 2.3 router was based on yield (thus map.resources instead of just resources).
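
To see what "changing the evaluation context" means in practice, here is a small sketch. Counter is a hypothetical class made up for this illustration. Inside the block, self is the receiver, so its instance variables and even its private methods are reachable without naming the object.

```ruby
class Counter # hypothetical class for illustration
  def initialize
    @count = 41
  end

  private

  def bump
    @count += 1
  end
end

counter = Counter.new

# Inside the block, self is the counter, so instance variables and
# private methods can be used without an explicit receiver.
result = counter.instance_eval do
  bump
  @count
end

p result                                         # prints 42
p counter.instance_eval { self }.equal?(counter) # prints true
```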

Caveat Eval

While instance_eval may be a good option (and even the correct one) for a specific DSL you are working on, it is not a universally useful tool. Because instance_eval changes the evaluation context, you will lose access to methods on the calling context (because self changes) as well as expose private methods of the evaluating object that you may not have intended to be accessible. Remember that whenever you use instance_eval, the code passed in is treated as though it were being written into a method body of the object. A simple example of this:

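class YieldDSL
  attr_accessor :name
  def initialize
    yield self
  end
end

class EvalDSL
  attr_accessor :name
  def initialize(&block)
    instance_eval(&block)
  end
end

# me has to live in a class to show the problem; a method defined at
# the top level would be visible to every object, including the DSLs.
class Author
  def me
    "Michael Bleigh"
  end

  def build_with_yield
    YieldDSL.new do |d|
      d.name = me # self is still the Author, so me is found
    end
  end

  def build_with_eval
    EvalDSL.new do
      self.name = me # self is now the EvalDSL, which has no me method
    end
  end
end

Author.new.build_with_yield
# => #<YieldDSL:0x101771bc0 @name="Michael Bleigh">

Author.new.build_with_eval
# EXCEPTION: NameError (undefined local variable or method `me')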

So it is wise to be careful when providing an instance_eval based DSL, as it may not always be more beneficial for the user. A simpler syntax comes at the cost of changing evaluation context.
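
One thing that softens the blow: blocks are closures, so local variables from the calling code stay visible inside an instance_eval block even though methods on the caller do not. The sketch below shows one possible workaround built on that. The Author class and its method names are hypothetical.

```ruby
class EvalDSL
  attr_accessor :name

  def initialize(&block)
    instance_eval(&block)
  end
end

class Author # hypothetical caller that uses the DSL from inside an object
  def me
    "Michael Bleigh"
  end

  def broken
    EvalDSL.new { self.name = me } # me is looked up on the EvalDSL and fails
  end

  def working
    author_name = me # call the method before entering the block
    EvalDSL.new { self.name = author_name } # locals survive the context switch
  end
end

begin
  Author.new.broken
rescue NameError => e
  puts "raised #{e.class}"
end

p Author.new.working.name # prints "Michael Bleigh"
```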

Building Recipes with instance_eval

In our case for building Recipes, however, there isn’t danger in switching context. We’re mostly passing in strings and it’s unlikely that any complex context is going to be associated. So let’s upgrade it! All we need to do is redefine the initializer once more:

def initialize(name, &block)
  self.name = name
  self.ingredients = []
  self.instructions = []

  instance_eval(&block)
end

In Ruby, any method can be passed a block, and the method can capture that block as a Proc by declaring a final parameter prefixed with an ampersand (&). In this way, we have direct access to the block (whereas before with yield we were making an implicit call to the block). You can also use the built-in block_given? method to check whether or not a block was passed into the method you’re currently evaluating. This should be done instead of checking for block.nil? or similar, because block_given? also works when the method does not declare a &block parameter.
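
Here is a quick side-by-side of the two ways of detecting a block. Both method names are hypothetical and exist only for this comparison.

```ruby
# Hypothetical methods for illustration.
def with_explicit_block(&block)
  # block is a Proc, or nil when no block was passed
  block ? block.call(:explicit) : :no_block
end

def with_implicit_block
  # There is no &block parameter here, yet block_given? still detects the block
  block_given? ? yield(:implicit) : :no_block
end

p with_explicit_block { |x| x } # prints :explicit
p with_explicit_block           # prints :no_block
p with_implicit_block { |x| x } # prints :implicit
p with_implicit_block           # prints :no_block
```

Capturing the block with & is only necessary when you need to hand it to something else, which is exactly the case with instance_eval.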

So what can we do with our fancy new instance_eval DSL? We can define a recipe with an even prettier syntax!

mac_and_cheese = Recipe.new("Mac and Cheese") do
  ingredient "Water", :amount => "2 cups"
  ingredient "Noodles", :amount => "1 cup"
  ingredient "Cheese", :amount => "1/2 cup"

  step "Heat water to boiling.", :for => "5 minutes"
  step "Add noodles to boiling water.", :for => "6 minutes"
  step "Drain water."
  step "Mix cheese in with noodles."
end

And if we run ‘puts mac_and_cheese’, we get the same results as before.

Finishing Up

So now you should have some basic idea of how to build DSLs in Ruby using yield and instance_eval. The ability to expose functionality in a concise, easily-readable way is a very useful weapon for your programming arsenal. Before we wrap up, let’s take a look at a couple more things:

Having AND Eating Cake

There’s no reason that yield and instance_eval DSLs need to be mutually exclusive. Far from it! In Ruby we encourage options, and it’s actually quite easy to provide a way to yield OR instance_eval based on the block passed in:

def initialize(&block)
  if block_given?
    if block.arity == 1
      yield self
    else
      instance_eval(&block)
    end
  end
end

What this snippet does is check the arity (number of arguments) of the block that’s passed in. If it’s one (meaning that they’re asking for something to be passed to the block) then we use the yield DSL strategy. Otherwise, we use the instance_eval strategy. That wasn’t so hard, was it?
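
Applied to a cut-down Recipe, the arity check could look like the sketch below. This trimmed class is an assumption made to keep the example short: it has only a name and a list of ingredients.

```ruby
class Recipe
  attr_reader :name, :ingredients

  def initialize(name, &block)
    @name = name
    @ingredients = []
    return unless block_given?

    if block.arity == 1
      yield self            # the caller asked for the recipe as |r|
    else
      instance_eval(&block) # the caller wants the bare-word style
    end
  end

  def ingredient(name)
    ingredients << name
  end
end

yielded = Recipe.new("Toast") { |r| r.ingredient "Bread" }
evaled  = Recipe.new("Toast") { ingredient "Bread" }

p yielded.ingredients # prints ["Bread"]
p evaled.ingredients  # prints ["Bread"]
```

Callers who need methods from their own context can opt into the |r| form, and everyone else gets the shorter syntax.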

Advanced DSLs with Treetop

The DSLs covered in this article so far have been internal DSLs, that is, DSLs that are executed inside the context of Ruby code. However, it is also possible to build external DSLs that do not have to contain any Ruby code at all! For example, Cucumber, the integration testing framework, is an external Natural Language DSL. Rather than being wrapped in Ruby idioms, it actually defines its own language that is executed entirely outside the context of the Ruby programming language.

The most popular library for building Natural Language DSLs in Ruby is Treetop, which lets you create grammars upon which a new Domain Specific Language can be crafted. It’s a highly interesting library with some amazing facilities, so be sure to check it out!
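
To give a rough feel for what "external" means, here is a deliberately tiny hand-rolled parser in plain Ruby. It does not use Treetop, and it is nowhere near a real grammar. The text format and the parse_recipe name are invented purely for this illustration.

```ruby
# A hand-rolled parser for a made-up recipe format. This does NOT use
# Treetop; the format and the parse_recipe name are hypothetical.
RECIPE_TEXT = <<~TEXT
  recipe Mac and Cheese
  ingredient Water
  ingredient Noodles
  step Heat water to boiling.
TEXT

def parse_recipe(text)
  recipe = { :name => nil, :ingredients => [], :steps => [] }

  text.each_line do |line|
    # The first word is the keyword, the rest of the line is its value.
    keyword, rest = line.strip.split(' ', 2)
    case keyword
    when 'recipe'     then recipe[:name] = rest
    when 'ingredient' then recipe[:ingredients] << rest
    when 'step'       then recipe[:steps] << rest
    end
  end

  recipe
end

p parse_recipe(RECIPE_TEXT)[:ingredients] # prints ["Water", "Noodles"]
```

The input here is plain text rather than Ruby code, which is the defining trait of an external DSL. A grammar library earns its keep once the language grows beyond one keyword per line.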

Conclusion

I hope that this introduction to yield and instance_eval has shown you just how easy it can be to build Domain Specific Languages in Ruby. The next time you find yourself repeatedly building the same kind of objects over and over, you might consider making a DSL to streamline the process and improve readability.

Feel free to ask questions and give feedback in the comments section of this post. Thanks and Good Luck!

Posted by Michael Bleigh

{ 21 comments… read them below or add one }

r4ito November 30, 2010 at 11:17 am

I love this kind of article! I’ve been wondering how the DSL syntax was accomplished, now I know the basics, thanks.

Aurelien November 30, 2010 at 4:01 pm

Thanks a lot :-)

Joe November 30, 2010 at 8:01 pm

This article was very well written. Thanks!

Joe November 30, 2010 at 8:18 pm

Running the code under caveat eval in irb I get:
#
not:
# EXCEPTION: NoMethodError

I assume you are alluding to the “don’t overuse instance_eval” controversy?

Guille Carlos November 30, 2010 at 8:21 pm

Great post thank you!

Raphael Almeida November 30, 2010 at 8:23 pm

That’s why I love ruby.
It lets me write code readable to humans.

Joe November 30, 2010 at 8:23 pm

hmmm… comments ate my code
Ran the code under caveat eval in irb and EvalDSL works just like YieldDSL. I don’t get:
# EXCEPTION: NoMethodError

Martin November 30, 2010 at 11:44 pm

The example that should fail with:

# EXCEPTION: NoMethodError

works fine for me. !? Ruby 1.8.7.

Michael Bleigh December 1, 2010 at 7:03 am

Whoops! You’re absolutely right. The example should have been executed inside a class to display the errant behavior.

rasenplanscher December 18, 2010 at 2:43 am

It would be nice, if this recognition were reflected in an update to the article. I came up with this little example illustrating your point:

[code]
class Yield
  def self.this
    yield self
  end
  def self.hidden
    "secret hello kitty cult id"
  end
  private_class_method :hidden
end

class Eval
  def self.this &block
    instance_eval &block
  end
  def self.hidden
    "secret hello kitty cult id"
  end
  private_class_method :hidden
end

Yield.this {|s| puts s; puts self; puts s.hidden}
Yield
main
NoMethodError: private method `hidden' called for Yield:Class

Eval.this {|s| puts s; puts self; puts hidden}
Eval
Eval
secret hello kitty cult id
[/code]

Cary Swoveland December 1, 2010 at 12:29 am

What a great article! After having recently completed the Metaprogramming course, it helped me better understand when to use instance_eval instead of blocks, and how both can be used in DSL’s, something I had not thought about before. I also found the discussion very well-organized and clear. Many thanks, Michael. Now, if I can just sort out procs and lambdas…

c00lryguy December 1, 2010 at 4:56 am

Or you can just use my DSL gem which makes defining DSLs as easy as defining a class! Check it out: https://github.com/c00lryguy/dsl

Myron Marston December 3, 2010 at 10:54 pm

You wrote “You can also use the built-in block_given? method to check whether or not a block was passed into the method you’re currently evaluating. This should be done instead of checking for block.nil? or similar.”

Why should block_given? be used instead of block.nil? (or even instance_eval(&block) if block)? It seems like personal preference to me, but if there’s a specific reason to favor block_given?, I’d like to know.

Michael Bleigh December 4, 2010 at 6:58 am

I believe there is a reason, but it’s one of those things I knew once and then forgot. If “block.nil?” or “if block” work all right, I don’t know the reason off hand not to use them, but I really feel like there was one. Sorry for the complete lack of help. :)

Nando Vieira December 7, 2010 at 4:36 am

The reason you should use `block_given?` is that the `&block` construction is optional, and if you skip it `block_given?` will still be able to recognize the block.

Skipping `&block` is also faster, and if you need a reference to a block you can do `def some_method; Proc.new.call; end` as Proc.new will use the method’s block.

Semyon December 12, 2010 at 7:04 pm

Great article, thank you!

There is a little mistake. The first recipe step differs from its definition in the code:

mac_and_cheese.instructions << "Boil water."

1) Heat water to boiling.

Ahmed El.Hussainittp April 8, 2011 at 5:28 am

Great article, thanks a lot

John Weir May 5, 2011 at 2:21 am

The trick for yielding sometimes and instance_eval in others was exactly what I was looking for. Thank you

Marc October 31, 2012 at 6:36 pm

So, it’s now the tail end of 2012. For folks coming across this page today looking for how to build a Ruby DSL, check out the Docile gem, which makes it easy and convenient.

Alex January 9, 2014 at 6:08 pm

Great article, this was exactly what I was looking for.

One correction: “Heat water to boiling.” should be “Boil water.”

Alex January 9, 2014 at 6:44 pm

Or the other way around, as in the rest of the article it’s “Heat water to boiling.”.
