“AkitaOnRails” On Anatomy of Ruby Blocks/Closures

by on November 30, 2007

Fabio Akita’s “AkitaOnRails” series at RubyLearning.com, for would-be Ruby developers, has been quite a hit. Today, in another article, Fabio talks in depth about Ruby’s Blocks/Closures. This is a rather long article, but well worth the time invested in reading it. The entire source code for the programs in this article is available here.

Fabio Akita is a Brazilian Rails enthusiast, also known online as “AkitaOnRails”. He regularly writes posts on his own blog and has published the very first book tailored for the Brazilian audience, called “Repensando a Web com Rails”. He is now a full-time Ruby on Rails developer working as Brazil Rails Practice Manager for the Utah company Surgeworks LLC.

Note: You may want to brush up on Ruby Blocks and Procs before going through Fabio’s article.

Fabio>> There is no Rubyism more difficult to explain, than a Closure. I could write an entire book about it. Instead of boring everyone to death with academics, I’ll try to focus only on what one really needs for day-to-day activities.

Let’s start with an example:

# program1.rb
for i in [1,2,3,4]
  puts i
end

a = 0
b = [1,2,3,4]
while a < b.length
  puts b[a]
  a += 1
end

These are very simple iterators, similar to what we have in several languages. The first one with a ‘for‘ construct and the second with the familiar ‘while‘ statement. Nothing fancy here. But let’s see another way of accomplishing the same thing in Ruby:

# program2.rb
[1,2,3,4].each { |i| puts i }

[1,2,3,4].each do |i|
  puts i
end

Not too shabby: simpler and more elegant. But that’s where people start gasping. The pipes notation is particularly intimidating for newcomers. Both the curly braces and the do..end notations define an enclosed piece of code that we call a ‘block’ or ‘closure’. What’s between the pipes acts like the parameters of a method. It ‘feels’ like this pseudo-code:

def unnamed_method(i)
  puts i
end

[1,2,3,4].each(unnamed_method)

This is not valid Ruby code, of course. It’s similar to what we would do in C# with delegates. We have something similar in JavaScript (using the Prototype library, http://www.prototypejs.org/api/enumerable/each):

[1,2,3,4].each(function(i) {
  alert(i);
});

In JavaScript, functions are first-class citizens of the language: they can be defined anonymously (without a name) and then passed as parameters to another function. We can manipulate and move functions all over the place.

Ruby doesn’t have methods as first-class citizens. We actually can extract a method from an object and wrap it around a ‘Method‘ object, but it holds the context of its object, so it is not independent.

# program3.rb
class Test
  def initialize
    @hello = "Hello!"
  end
  def say
    @hello
  end
end

m = Test.new.method(:say)
puts m.call # => "Hello!"
puts m.class # => "Method"

Here, we extracted the :say method from a Test instance. Notice that we can now manipulate the method as a normal object. Whenever we send the ‘call’ message to the method object, it runs as if it were being executed in the context of the original object (Test.new.say). In the above example, the second-to-last statement prints “Hello!”, the value stored in the instance variable @hello.
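
A quick sketch of how tightly a Method object stays tied to the object it came from. The Greeter class and the variable names are made up for illustration; ‘receiver’, ‘owner’ and ‘to_proc’ are standard methods of Ruby’s Method class.

```ruby
# Hypothetical class, used only for this illustration.
class Greeter
  def initialize(greeting)
    @greeting = greeting
  end

  def say
    @greeting
  end
end

greeter    = Greeter.new("Hello!")
say_method = greeter.method(:say)

# The Method object remembers the object it was extracted from...
say_method.receiver.equal?(greeter)  # => true
# ...and the class that defines the method.
say_method.owner                     # => Greeter

# Even after converting it to a Proc, it still reads the receiver's @greeting.
say_proc = say_method.to_proc
say_proc.call                        # => "Hello!"
```

So the extracted method can travel around like any other object, but it always drags its original receiver along with it.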

Although simple, we don’t do this very often. That’s because this method is bound to the original object’s context and we don’t usually want that: it would be nice to have an independent block of code. So, let’s create one very simple block of code referenced by a variable:

# program4.rb
c = lambda { |i| puts i }
c.call(1) # => 1
c.call(2) # => 2

The ‘lambda’ method (it is defined in Kernel; it is not a keyword) wraps the code within the curly braces in a block object, an instance of the Proc class. This object responds to the ‘call’ method. In the last two statements we pass an argument to ‘call’ and it is assigned to the ‘i’ parameter defined between the pipes inside the block. So, it acts as an independent entity, detached from any particular class. Let’s test it:

# program5.rb
c = lambda { |i| puts i }

class Test
  def say(block)
    block.call(self.class)
  end
end

c.call(self.class) # => Object
Test.new.say(c)    # => Test

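We’re using the same block defined above in the variable ‘c’. After the definition of the Test class, we call the block object passing ‘self.class’, and it prints ‘Object’. Then, we call the :say method on an instance of the Test class. The :say method calls the block, giving its own ‘self.class’ as the block parameter, so this time it prints ‘Test’ instead of ‘Object’. Note that the output changes only because a different argument is passed in: ‘self.class’ is evaluated by the caller before the block runs. The block itself stays bound to the scope where it was defined, with that scope’s ‘self’ and local variables. A Method object carries only its receiver, while a block also carries the local variables that surround its definition; that is the real difference between a block and a detached object method.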
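
To check where a lambda actually looks up ‘self’ and its local variables, here is a small sketch. The names build_reporter, Caller and origin are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical names throughout; the point is only where `self` and
# local variables are looked up when the lambda runs.
def build_reporter
  origin = "defined in build_reporter"
  # The lambda captures `origin` and the current `self` right here.
  lambda { [self.class, origin] }
end

class Caller
  def run(block)
    origin = "local to Caller#run"  # a different variable; the lambda never sees it
    block.call
  end
end

reporter = build_reporter
direct   = reporter.call             # called from the top level
through  = Caller.new.run(reporter)  # called from inside a Caller instance

# Run as a plain script, both are [Object, "defined in build_reporter"].
direct == through                    # => true
```

Both calls return the same pair, so the lambda keeps the ‘self’ and the locals of the place where it was written, no matter which method ends up calling it.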

In many ways, Blocks resemble anonymous functions from JavaScript, anonymous delegates from C#, and anonymous inner classes from Java. This is a very useful construct that was primarily created to better handle iterators. For instance:

# program6.rb
[1,2,3,4].reverse_each { |i| puts i }
# => 4 
# => 3
# => 2
# => 1

Now, this is different from the Array’s ‘each’ method we used before. ‘reverse_each’ navigates backwards through the Array’s elements. It takes each element and passes it into the ‘i’ variable, the parameter of the block defined within the curly braces.

In languages like Java, everything has to be defined through an interface. Enumerators are no different, and we have interfaces like ‘Iterator’. This interface defines simple methods such as ‘hasNext()’ and ‘next()’. But what if we actually need something like a reverse iterator? Now we’re on our own. What if we need something more complicated, like an iterator that only walks through the elements at even positions? In Ruby we can define such a method like this:

# program7.rb
class Array
  def even
    i = 0
    while i < self.size
      yield(self[i]) if i % 2 == 0
      i += 1
    end
  end
end

[1,2,3,4,5,6].even { |i| puts i }
# => 1
# => 3
# => 5

First of all, remember that Ruby’s classes are all open, so we can easily reopen the standard Array class and add new methods to it. Now pay attention to the ‘even’ method, which walks through the elements at even positions (indexes 0, 2 and 4, which is why it prints 1, 3 and 5). We implement it the usual way with a ‘while’ loop. But the interesting bit is the ‘yield’ keyword. Pretend that it works like a *wildcard placeholder* for the block. In the example, when we pass the ‘self[i]’ value as its argument, we’re actually passing this value to the ‘i’ parameter of the block.
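
One detail worth seeing in code: ‘yield’ is also an expression, and it evaluates to whatever the block returns. A minimal sketch, using a hypothetical helper named transform_each rather than reopening Array:

```ruby
# Hypothetical helper: yields each element and collects what the block returns.
def transform_each(items)
  results = []
  i = 0
  while i < items.size
    results << yield(items[i])  # the value of `yield` is the block's return value
    i += 1
  end
  results
end

transform_each([1, 2, 3]) { |n| n * 2 }     # => [2, 4, 6]
transform_each(%w[a b]) { |s| s.upcase }    # => ["A", "B"]
```

This is essentially how methods like ‘collect’ can be built on top of the same mechanism.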

We can rewrite this method in a slightly different way but with the same behavior:

# program8.rb
class Array
  def even(&code)
    i = 0
    while i < self.size
      code.call( self[i] ) if i % 2 == 0
      i += 1
    end
  end
end

So, now we explicitly declare that the ‘even’ method expects to receive a block, which is converted into the ‘code’ parameter. Then, inside the method, we send the ‘call’ message to ‘code’ and pass ‘self[i]’ as its argument. The result is exactly the same as using the yield keyword.
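
To see what the ampersand parameter really receives, here is a small sketch. The helper names capture and run_twice are hypothetical.

```ruby
# Hypothetical helpers showing what an ampersand parameter holds.
def capture(&code)
  code  # a Proc when a block was given, nil otherwise
end

def run_twice(&code)
  [code.call(1), code.call(2)]
end

captured = capture { |x| x + 1 }
captured.class            # => Proc
captured.call(41)         # => 42
capture                   # => nil, because no block was given

run_twice { |n| n * 10 }  # => [10, 20]
```

Because the block is now an ordinary object, the method can store it, pass it along to another method, or call it several times.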

We can still do it differently:

# program9.rb
class Array
  def even(block)
    i = 0
    while i < self.size
      block.call( self[i] ) if i % 2 == 0
      i += 1
    end
  end
end

Now we’re doing it without the ampersand in the parameter. In the previous example, the ampersand operator ‘captures’ a block into a Proc instance object. In the latest example, the ‘even’ method expects to directly receive a Proc object, like this:

# program10.rb
class Array
  def even(block)
    i = 0
    while i < self.size
      block.call( self[i] ) if i % 2 == 0
      i += 1
    end
  end
end
c = lambda { |i| puts i }
[1,2,3,4,5,6].even( c )

Let’s go back to the Array’s ‘each‘ method as we displayed before:

c = lambda { |i| puts i }
[1,2,3,4].each( &c )

This is a little bit different, because the ‘each’ method doesn’t expect a Proc object as a parameter, but an actual Block. So we use the ampersand before the Proc instance variable and it kind of ‘expands’ it back into a ‘raw’ code block, so that the ‘each’ method can ‘yield’ to it inside, instead of executing it through the ‘call’ method.
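
The same Proc can be handed, with the ampersand, to any method that yields. A short sketch (the method ‘twice’ and the variable names are made up for this example):

```ruby
double = lambda { |i| i * 2 }

# `&double` hands the lambda over as if it were a literal block.
[1, 2, 3, 4].map(&double)     # => [2, 4, 6, 8]

# It works just as well with our own methods that use `yield`.
def twice
  [yield(1), yield(2)]
end

twice(&double)                # => [2, 4]
twice { |i| i * 2 }           # => [2, 4], the equivalent literal block
```

Inside ‘twice’ there is no way to tell whether the caller wrote a literal block or passed a Proc with the ampersand.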

This usage of a Proc object is not as elegant as just passing a Block, but with this construct we are storing code within an object. We can define a method that receives as many blocks as we need, for instance:

# program11.rb
def foo(name, block1, block2)
  block1.call
  puts name
  block2.call
end

foo "Fabio", lambda { puts "Hello" }, lambda { puts "World" }

This example receives a normal parameter and 2 blocks instead of one. We can pass blocks as enclosed Proc objects in the parameters list as we would do with any other kind of object. We usually don’t need that many discrete blocks inside a single method. The most usual style is:

# program12.rb
def foo( param1, param2 )
  # do something
  some_param = 1
  yield( some_param ) if block_given?
end

foo(1, 2) do |some_param|
  # do something
end

So we define a normal method, with normal parameters. But inside it we ask ‘block_given?’. If it returns true, the method ‘yields’ to its block, passing some parameter to it (of course, parameters are optional, and you can pass as many as you want to a block, or none at all).
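
A minimal sketch of why the ‘block_given?’ check matters. The methods greet and always_yields are hypothetical examples.

```ruby
# Hypothetical method: returns a greeting, letting the caller customise
# the formatting with an optional block.
def greet(name)
  if block_given?
    yield(name)
  else
    "Hello, #{name}!"
  end
end

greet("Fabio")                      # => "Hello, Fabio!"
greet("Fabio") { |n| "Oi, #{n}!" }  # => "Oi, Fabio!"

# Without the check, calling `yield` when no block was passed
# raises LocalJumpError.
def always_yields
  yield
end

begin
  always_yields
rescue LocalJumpError => e
  e.class                           # => LocalJumpError
end
```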

We call the defined method as usual, passing a block at the end of the method call. By the way, here’s another way of defining a block: using the do .. end construct. There is no strict rule, but we reserve the curly braces notation for small blocks that fit in a single line, and the do .. end notation for blocks with multiple statements inside.

There’s a gotcha:

foo a, b do |some_param|
  # do something
end

foo a, b { |some_param|
  # do something
}

Both curly braces and do .. end constructs define blocks, so at first glance the 2 statements above seem to do the same thing. But the gotcha is that in Ruby parentheses are optional, and we’re not using them here.

In the first statement the block is passed to the ‘foo’ method as expected, with ‘a’ and ‘b’ as normal parameters. But the second statement assumes that ‘b’ is a method and tries to pass the block to it. The recommendation is: if you have a method that needs both parameters and a block, enclose the parameters within parentheses to avoid ambiguities.
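
The gotcha can be observed directly. In this sketch ‘foo’ and ‘b’ are hypothetical methods that simply report whether they received a block; ‘b’ has to be a method here, because with a local variable named ‘b’ the braces form would not even parse.

```ruby
# Hypothetical methods that report who received the block.
def foo(*args)
  { :args => args, :got_block => block_given? }
end

def b
  block_given? ? yield : "b called without a block"
end

a = 1

with_do = foo a, b do
  "block body"
end
with_do[:got_block]      # => true, the do..end block went to foo
with_do[:args]           # => [1, "b called without a block"]

with_braces = foo a, b { "block body" }
with_braces[:got_block]  # => false, the braces block went to b
with_braces[:args]       # => [1, "block body"]

with_parens = foo(a, b) { "block body" }
with_parens[:got_block]  # => true, parentheses remove the ambiguity
```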

We now understand that Blocks are pieces of code that can be exchanged between method calls, as parameters or returned values. But there is more to it:

# program13.rb
c = lambda { |i| puts i }
c = Proc.new { |i| puts i }
c = proc { |i| puts i }

The above 3 statements all instantiate a block object. In Ruby 1.8, ‘proc’ is an alias for ‘lambda’, and both work slightly differently from ‘Proc.new’. From Ruby 1.9 onward, ‘proc’ behaves like ‘Proc.new’ instead.
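
The differences between ‘lambda’ and ‘Proc.new’ show up mainly in argument checking and in what ‘return’ does. A short sketch, with made-up method and variable names:

```ruby
strict = lambda   { |x, y| [x, y] }
loose  = Proc.new { |x, y| [x, y] }

strict.lambda?  # => true
loose.lambda?   # => false

loose.call(1)   # => [1, nil], missing arguments simply become nil
begin
  strict.call(1)
rescue ArgumentError
  # lambdas check the number of arguments, like methods do
end

def return_from_lambda
  lambda { return :from_block }.call
  :after_call  # reached: `return` only left the lambda
end

def return_from_proc
  Proc.new { return :from_block }.call
  :after_call  # never reached: `return` left the whole method
end

return_from_lambda  # => :after_call
return_from_proc    # => :from_block
```

In short, a lambda behaves like an anonymous method, while a plain Proc behaves like a piece of the method that created it.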

Constructs to keep in mind are:

  • lambda/Proc.new – encloses a bunch of code inside a Proc instance.
  • & – ampersand, either captures a ‘raw’ code block into a Proc object or expands the Proc object as a ‘raw’ block.
  • {}/do..end – defines a code block.
  • || – pipes, defines the parameters of a block. If you don’t need any, just omit the pipes altogether.

So, some people misinterpret Blocks as a simple function pointer, or something like Java’s anonymous inner class. That’s not the case: and here we finally boil down to “Closures”. Ruby Blocks are Closures. The words ‘blocks’ and ‘closures’ mean the same thing in Ruby.

Ruby Blocks can enclose not only code and their own inner local variables, but also the variables of the surrounding context. That’s why they are called ‘closures’. Let’s see an example:

# program14.rb
def greetings_factory(prefix)
  Proc.new { |name| "#{prefix}, #{name} !"}
end

birthday = greetings_factory("Happy Birthday")
xmas = greetings_factory("Merry XMas")

puts birthday.call("David") # => "Happy Birthday, David !"
puts xmas.call("Matz")      # => "Merry XMas, Matz !"

The first thing is a method definition for ‘greetings_factory’. It gets a prefix as a parameter and returns a Proc object, whose inner parameter is a name. So far so good.

The second part defines 2 Proc instances, one for birthdays and another for Christmas. Notice that we pass 2 different prefixes into the ‘greetings_factory’ method. The different values are ‘closed’ within each Block. So, when we later call them, we see how differently they behave: each one kept the state of its surroundings at the time it was created. Each block stored the ‘prefix’ variable passed before, while still accepting the ‘name’ parameter within the Block.
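
The captured variable is not a frozen copy: the block keeps the variable itself and can change it between calls. A minimal sketch with a hypothetical counter_factory method:

```ruby
# Hypothetical factory: each returned lambda owns its private `count`.
def counter_factory(start)
  count = start
  lambda { count += 1 }
end

first  = counter_factory(0)
second = counter_factory(100)

first.call   # => 1
first.call   # => 2
second.call  # => 101, a separate `count` from the one in `first`
first.call   # => 3
```

Every call to counter_factory creates a fresh ‘count’, and that variable lives on for as long as the lambda that closed over it.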

Keep in mind that every Ruby Block is a Closure, that’s why this construct actually works:

# program15.rb
list = []
[1,2,3,4].each do |i| 
  list << i * 2 
end
puts list.inspect # => [2, 4, 6, 8]

So, we defined a ‘list’ array *before* we create the iterator block. Then, inside the block we refer to the external ‘list’ array and populate it. In Java, a local variable used this way from an anonymous inner class would have to be declared final, but in Ruby there is no such limitation.

You’d want to be very careful about the surroundings of your block: do not define variables that are going to be used inside the block too early in the code. Try to keep dependencies nearby, like in the above example where the ‘list’ Array is defined right before the iterator itself.
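
Here is a sketch of the kind of accident this advice guards against, with hypothetical method names. The second method uses block-local variables (the ‘; total’ part of the parameter list), which assumes Ruby 1.9 or later.

```ruby
def overwrite_demo
  total = 100
  [1, 2, 3].each { |i| total = i }         # assigns to the outer `total`
  total
end

def block_local_demo
  total = 100
  [1, 2, 3].each { |i; total| total = i }  # `; total` declares a block-local variable
  total
end

overwrite_demo    # => 3, the block silently changed the outer variable
block_local_demo  # => 100, the outer variable is untouched
```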

Iterators get a big boost because we’re not limited to a hard Interface. We can add whatever methods we need, like ‘each’, ‘reverse_each’, ‘collect’, ‘select’ and so on. Each one of them can receive a block and pass one element at a time to the user-defined block.

Another very important usage is to enclose widely used code patterns. For example, Rails has the following construct to use database transactions:

User.transaction do 
  u = User.new(:login => 'admin')
  u.save!
end

‘User’ would be an ActiveRecord model class. The model’s ‘transaction’ class method could be structured roughly like this:

class ActiveRecord::Base
  def self.transaction
    begin
      ActiveRecord::Base.establish_connection
      yield if block_given?
    rescue => e
      RAILS_DEFAULT_LOGGER.error e
    ensure
      ActiveRecord::Base.remove_connection 
    end
  end
end

This means: open the database connection, then ‘yield’ to the block if one was provided. If anything goes wrong, catch the error and log it. Finally, ensure that the connection is dropped after all this.

ActiveRecord doesn’t actually open and close connections this often, but you get the idea. This is the big picture of one way to avoid repetition by extracting common code patterns: it’s clever to encapsulate the common functionality and leave a placeholder for user-defined code in between.
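
The same pattern can be tried without Rails. In this sketch, with_resource is a hypothetical method and the ‘log’ array stands in for a real resource such as a database connection; it only records what happened, in order.

```ruby
# Hypothetical stand-in for a resource such as a database connection.
def with_resource(log)
  log << "open"
  yield if block_given?
rescue => e
  log << "error: #{e.message}"
ensure
  log << "close"
end

ok_log = []
with_resource(ok_log) { ok_log << "work" }
ok_log   # => ["open", "work", "close"]

bad_log = []
with_resource(bad_log) { raise "boom" }
bad_log  # => ["open", "error: boom", "close"]
```

The caller writes only the part that varies; opening, error handling and cleanup live in one place and always run in the same order.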

The File.open method does the same thing: it takes responsibility for properly opening the file, yielding to the user’s block and ensuring that the file is closed, without the user having to do this kind of housekeeping manually.
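
A small sketch that checks this behaviour. The method name file_open_demo and the file name are made up; the file is written in a temporary directory created with Dir.mktmpdir from the standard library.

```ruby
require "tmpdir"

def file_open_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, "greeting.txt")
    handle = nil

    File.open(path, "w") do |file|
      file.puts "Hello"
      handle = file  # keep a reference so we can inspect it afterwards
    end

    # The block has finished, so File.open has already closed the file.
    [handle.closed?, File.read(path)]
  end
end

file_open_demo  # => [true, "Hello\n"]
```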

The most important concept (see http://martinfowler.com/bliki/Closure.html) is that Blocks help hide implementation details. We don’t want to know the inner details of a list iterator, or how a transaction works. We just need to focus on the business logic itself, trapped within a Block.

We covered a lot of ground here, and I think it covers the basics: enough to actually read some of the Rails source code and get acquainted with the closure’s modus operandi. Hope you enjoyed this article!

Thank you Fabio for showing us a different perspective on Ruby Blocks. In case you have any questions on this article, kindly post them here and Fabio will be glad to answer.

Posted by Fabio Akita

{ 18 comments… read them below or add one }

Lucas Húngaro November 30, 2007 at 6:11 pm

Great!

Congrats to Akita and Satish.

Chris November 30, 2007 at 8:24 pm

Cheers very interesting article, I learned a lot.

Note your even methods actually return odd numbers, not that it affects the thrust of the article at all.

AkitaOnRails November 30, 2007 at 9:31 pm

You are right :-) I meant ‘even position within the array’ and it turns out that in the even positions there are odd numbers, which may confuse you. I should’ve been clearer.

ch December 2, 2007 at 8:36 am

Great! Cheers, very interesting article, I learned a lot.

Anil December 4, 2007 at 7:13 pm

Nice post! Very crisp description.
Inspired me to start playing with blocks!

mwsd December 11, 2007 at 7:23 pm

Hello,

a really nice article, however i am not sure if this
is correct, perhaps a typo?
———————————–
So we use the ampersand before the Proc instance
variable and it kind of ‘expands’ it back into a ‘raw’
code block, so that the ‘even’ method can ‘yield’ it
inside, instead of executing through the ‘call‘ method.
———————————–

I believe you mean:
“[...] so that the ***‘each’*** method can ‘yield’ it inside, instead of executing through the ‘call‘ method.”

Thanks for the article!
Greetings,
mwsd

AkitaOnRails December 12, 2007 at 8:57 am

Hm, you’re probably right, it is a typo all right. This particular snippet is about the ‘each’ method, not the ‘even’ that I showed right before it. Good catch.

Bradley Andersen December 28, 2007 at 12:49 am

Re: comments [2] and [3]. While the thing works, it does return the value stored in array[0], when 0 is not an even number. Aside from this and me wanting to make every string without interpolation '' instead of "" (as suggested by Satish!), this is very good. Very good tutorial, very informative.

louis juska December 31, 2007 at 8:05 am

thanks fabio, it was a very helpful read and complimented satish’s article on blocks/methods/procs. now i see the purpose of proc objects!

Chad Lester February 2, 2008 at 9:36 pm

Thanks for the article! However, program5.rb does
NOT demonstrate the differences between a block and a detached method. You could replace the first part with a Method and the test will run the same.

class AnyClass
  def my_method(i)
    puts i
  end
end
c = AnyClass.new.method(:my_method)

More importantly, the scope with which the block runs is actually NOT different as you stated. For example if you create a local variable in each scope, you can see that the block is running in the context that it was created! See the following:

# program5.rb
scope = "OUTER SCOPE"
c = lambda { |i| puts "#{i}, #{scope}"; scope = "MARK" }

class Test
  def say(block)
    scope = "INNER SCOPE"
    block.call(self.class)
    puts "Inner: scope=#{scope}" # => INNER SCOPE
  end
end

c.call(self.class) # => Object
puts "Outer: scope=#{scope}" # => MARK
scope = "OUTER SCOPE" # reset the scope variable
Test.new.say(c) # => Test
puts "Outer: scope=#{scope}" # => MARK

Maybe one thing you could mention is that there is an instance_eval method, but that only gives access to instance variables, not local variables. At this time, I don’t know of a way to truly run a block in a different scope unless you write the block as a string and use “eval”.

One other minor nitpick from another example:
foo a, b { |some_param| # do something }
Because the “}” is on the same line as the # mark, it
is commented out. So if copied as written, this results in an unclosed block (probably an unexpected $end error)

Chad Lester February 2, 2008 at 9:39 pm

Ugh… I wish I could have formatted the above comment better. I forgot to use the code tags. Sorry.

andoy August 18, 2008 at 8:23 am

This article was very helpful. Thanks!

Woof Powers September 25, 2008 at 12:05 am

“Fabio Akita is a Brazilian Rails enthusiast …”

What is Brazilian Rails? Is that a rougher form of Rails which specializes in closures which subdue? Or is it Rails with most of the hair removed? If it’s the latter, wouldn’t that be Merb?

Luiz Gustavo March 19, 2009 at 8:52 pm

Congratulations for the article Fábio! It’s very clear and strait.
I’m a Java programmer learning Ruby, and this kind of approach (Blocks, Closures) are not part of my day-to-day job, but Satish examples and this article made understand these concepts very easily.
Thanks!

soluch October 19, 2010 at 10:57 pm

Chad Lester is right. There is a conceptual mistake in the article. Ability to accept parameters doesn’t differentiate Proc from Method, and even if it would, it wouldn’t affect context saving.

The difference between Method and Proc class is that Proc can additionally to instance context save local context of a method, where it is defined. In case of Method there is simply no such local context, there is only instance context (self auto-variable).

Ruby Method and Proc work a lot like C# delegates and anonymous delegates, only in C# delegate constructor in some cases may be called implicitly, and in Ruby it is explicit.

Fabio Akita October 20, 2010 at 7:38 pm

Yes, you and Chad are correct. Looking back I really don’t remember what I meant to say when I first wrote this article and I agree that it feels strange reading it now. Maybe I should try to rewrite or omit this particular portion. Thanks for the feedback.

Ochronus May 5, 2011 at 6:53 pm

I’ve written a tutorial/review on ruby blocks and closures with code examples, be sure to check it out: ruby blocks and closures

NewToRuby April 4, 2012 at 4:06 pm

—-
# program15.rb
list = []
[1,2,3,4].each do |i|
list << i * 2
end
—-
You told: "In Java, this would’ve been a final variable"
I'm not sure. I'm not a Java superprogrammer, but I don't think it is necessary to use a final variable.
The only limitation is the fixed length of a Java array, but it can be resolved with some simple tricks.
