Does Ruby Have Too Many Equality Tests?

by Eric Anderson on November 17, 2010

This guest post is by Eric Anderson, who develops web-based applications for small businesses through his company Pixelware, LLC in Atlanta, GA. He also runs SaveYourCall.com, which allows people to record phone calls from any phone without the need for any complicated hardware.

You probably started using == out of habit from other languages. It seems to work, and that seems good enough. But then you start seeing ===, =~, eql? and equal? and wonder what the heck those are about. How are they different, and why does Ruby drive you insane with so many equality tests?

This article will explain the purpose of each test. It will help you understand the intention of each test by looking at how the standard library defines it. Once you understand the intent you will know how to define that method in your own objects and what to expect when you are using standard library and third-party objects.

==

== is your main equality test. Most of the time when you want to test for equality this is what you will use. The intent of this method is “do the objects have the same value regardless of class?” The following are a few examples that will illustrate the behavior:

a = Object.new
a == a                             # true
a == Object.new                    # false
"foo"           == "foo"           # true
"foo".object_id == "foo".object_id # false
1 == 1.0                           # true
1.class == 1.0.class               # false

The first two examples show the default behavior when comparing instances of class Object. If the two objects are the same object in memory it will return true. But if it is compared to another instance of Object it will return false.

The next example compares two String instances with the same value. The comparison returns true even though they are different objects in memory (their object ids differ). So although the default behavior is fairly strict, subclasses should loosen it whenever they can determine that two objects have the same value.

The final example shows us that the value comparison carries across different classes. 1 is an Integer (a Fixnum in Ruby versions before 2.4) while 1.0 is a Float. But since they still have the same value they are considered equal. This means when implementing your own objects you should ignore the class of the other objects and concentrate solely on the value of the objects.

Also when implementing == it is important to ensure the equality test is reversible. a == b should return the same value as b == a.
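As a minimal sketch of these guidelines, assuming a hypothetical Point class (invented here for illustration, not part of any library), == can compare coordinates by value while staying reversible:

```ruby
# Hypothetical value class used only for illustration.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points are == when their coordinates are ==.
  # Anything that is not a Point is simply not equal, which keeps
  # point == other and other == point in agreement.
  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
end

a = Point.new(1, 2)
b = Point.new(1.0, 2.0)

same_value     = (a == b)       # true: 1 == 1.0 and 2 == 2.0
reversed       = (b == a)       # true: the test is reversible
different_type = (a == "1, 2")  # false: a String is not a Point
not_equal      = (a != b)       # false: != gives the opposite of ==
```

The is_a? check is one possible design choice; a class that wants to compare equal to unrelated classes would need those classes to cooperate, otherwise the test stops being reversible.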

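One last note on ==. You may occasionally see the != operator. In Ruby 1.8, != is not a method you can define but a language feature: it calls your == method and returns the opposite boolean value. Since Ruby 1.9, != is an ordinary method whose default implementation negates the result of ==, so you rarely need to define it yourself.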

eql?

eql? is slightly more restrictive than ==. Like ==, it is concerned with the value of the object and not with whether two objects are the same instance. But unlike ==, eql? does care about the class. Let’s look at some examples:

a = Object.new
a.eql? a                             # true
a.eql? Object.new                    # false
"foo".eql? "foo"                     # true
"foo".object_id == "foo".object_id   # false
1.eql? 1.0                           # false

As you can see eql? starts out like ==, but when comparing the Integer to the Float the result is false even though they have the same value. An object is data plus behavior. Since == is intended to be ignorant of class, it is basically ignoring differences in behavior and is only concerned with differences in data. By using eql? you are indicating that you care not only that the value is the same but also that the behavior of the object is the same. This is also the test Hash uses, together with the hash method, to compare keys.
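The most visible consumer of eql? is Hash, which uses it (together with hash) to decide whether two keys are the same. The sketch below first shows this with numbers and then with a hypothetical Version class, invented here for illustration, that defines ==, eql? and hash together:

```ruby
# Hash lookup uses eql? (together with hash), not ==.
lookup     = { 1 => "integer one" }
by_integer = lookup[1]    # "integer one"
by_float   = lookup[1.0]  # nil, because 1.eql?(1.0) is false

# Hypothetical value class used only for illustration.
class Version
  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # Loose comparison: components only need to be ==.
  def ==(other)
    other.is_a?(Version) && major == other.major && minor == other.minor
  end

  # Strict comparison: same class and eql? components.
  def eql?(other)
    other.class == self.class &&
      major.eql?(other.major) && minor.eql?(other.minor)
  end

  # Objects that are eql? must return the same hash value.
  def hash
    [self.class, major, minor].hash
  end
end

releases  = { Version.new(1, 9) => "stable" }
found     = releases[Version.new(1, 9)]               # "stable"
not_found = releases[Version.new(1.0, 9)]             # nil: 1.0 is not eql? to 1
loose     = Version.new(1, 9) == Version.new(1.0, 9)  # true
```

Whenever eql? is overridden, hash should be overridden with it; otherwise two keys that are eql? may land in different buckets and never be compared.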

equal?

equal? is the strictest of the equality tests. It tests whether the objects being compared are the same instance in memory. This means their object ids will be the same, as they are literally two references to the same instance.

a = Object.new
a.equal? a                           # true
a.equal? Object.new                  # false
"foo".equal? "foo"                   # false
"foo".object_id == "foo".object_id   # false
1.equal? 1                           # true
1.object_id == 1.object_id           # true

So the default behavior is like the other equality tests, but the similarities end there. When you compare two different instances of String with the same value you get false. This is because despite having the same value they are two different instances in memory.

The last example is an interesting quirk of Ruby. For performance and memory optimization reasons there is only one instance of each Fixnum value (in Ruby 2.4 and later, of each small Integer). This means that even though we wrote the literal twice, both have the same object id and therefore equal? returns true when they are compared.

Ruby requires that equal? conform to this intent. You should not override this method in your subclasses. Doing so might cause unpredictable behavior in the Ruby runtime.
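To make the identity test concrete, here is a small sketch (the variable names are illustrative) that contrasts a second reference to an object with a copy of it:

```ruby
original  = "foo"
alias_ref = original      # a second reference to the same object
copy      = original.dup  # a new object with the same value

same_object  = original.equal?(alias_ref)  # true
same_value   = (original == copy)          # true
other_object = original.equal?(copy)       # false

# equal? agrees with comparing object ids.
ids_match  = (original.object_id == alias_ref.object_id)  # true
ids_differ = (original.object_id != copy.object_id)       # true

# Small integers are immediate values, so every 1 is the same object.
one_is_one = 1.equal?(1)  # true
```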

===

=== lets you write code in a very concise and readable way. It compares two objects using a single operator, with the intent changing depending on the context of the comparison. Let’s look at some examples:

a = Object.new
a === a                             # true
a === Object.new                    # false
"foo"           === "foo"           # true
"foo".object_id == "foo".object_id  # false
1 === 1.0                           # true
1.class == 1.0.class                # false
Integer === 1                       # true
(1..10) === 5                       # true
/o/ === 'foo'                       # true

As we can see this starts out looking a LOT like ==. The comparison is concerned with value, not the instance in memory, and it also doesn’t care about class. The last three examples really show this method’s uniqueness. When comparing a class with an instance of that class we get true. When comparing a range with a value in that range we get true. When comparing a regexp with a string we get true if the regexp matches the string. Why do Class, Range and Regexp define === this way? Because the === operator is used by the case control flow statement. Review the following example:

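case obj
when "foo"          then # simple value comparison, like ==
when Integer, Float then # obj is an instance of either class
when 1..10          then # obj falls within the range
when /o/            then # the regexp matches obj
end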

This is the magic of ===. It lets the intent change depending on the context of the comparison. So sometimes you want a simple value comparison like ==. Other times you want to know if an object is an instance of the given class. Other times you want to know if a value is in a range and sometimes you want to know if a regexp matches an object. You have one operator the case statement can use that works for all these types of comparisons.

It is important to note that a === b does not imply b === a. The order of the comparison is very important. For example 1 === Integer returns false while Integer === 1 returns true.
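Your own classes can take part in case statements the same way. The sketch below assumes a hypothetical MultipleOf matcher and describe helper, both invented here for illustration; the point to notice is that case calls === on the pattern in each when clause, not on the value being tested:

```ruby
# Hypothetical matcher class used only for illustration.
class MultipleOf
  def initialize(divisor)
    @divisor = divisor
  end

  # case calls pattern === value, so the matcher is the receiver.
  def ===(value)
    value.is_a?(Integer) && (value % @divisor).zero?
  end
end

def describe(obj)
  case obj
  when MultipleOf.new(15) then "fizzbuzz"
  when MultipleOf.new(5)  then "buzz"
  when MultipleOf.new(3)  then "fizz"
  when Integer            then "some other integer"
  when /o/                then "a string containing an o"
  else                         "something else"
  end
end

results = [30, 10, 9, 7, "foo", nil].map { |obj| describe(obj) }
# ["fizzbuzz", "buzz", "fizz", "some other integer",
#  "a string containing an o", "something else"]
```

Because the when clauses are tried from top to bottom, the most specific patterns go first.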

=~

=~ is a pretty special equality operator. In Ruby 1.8, Object defines it to always return false (even when comparing an object with itself); Ruby 1.9 changed that default to nil, and Ruby 3.2 removed Object#=~ altogether. Where this operator gets interesting is with the Regexp class. Take a look at the following examples:

a = Object.new
a =~ a                                          # false on Ruby 1.8, nil on 1.9 through 3.1
/o/   =~ 'foo'                                  # 1 (index of the match, which is truthy)
'foo' =~ /o/                                    # 1
Mime::Type.new('application/xml') =~ 'text/xml' # true

Regexp defines =~ to run the match and return the index where it starts, or nil when there is no match (the related match method returns a MatchData object instead). This provides the familiar =~ syntax found in languages like Perl while still providing a fully object-oriented regexp library. Note that String defines =~ to reverse the operands, so 'foo' =~ obj will execute obj =~ 'foo'. This allows =~ to be reversible when comparing a String even though =~ is not generally reversible.

Regexp is the primary place =~ is used but the last example shows where =~ is defined by a class in the Rails framework to allow mime type aliases to match a specific mime type object.

Like ==, =~ has an inverse, written !~. In Ruby 1.8 it is a language feature and not a method that can be overridden: when you write !~, the =~ method is called and the inverse of the result is returned. Since Ruby 1.9, !~ is an ordinary method whose default implementation does the same thing.
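A short sketch of what these operators actually return (the sample strings and variable names are illustrative):

```ruby
position = (/o/ =~ "foo")   # 1, the index where the match starts
reversed = ("foo" =~ /o/)   # 1, String#=~ hands off to the Regexp
no_match = (/x/ =~ "foo")   # nil, which is falsy
negated  = ("foo" !~ /x/)   # true, the inverse of =~

# Regexp#match returns a MatchData object rather than an index.
match  = /(\d+)-(\d+)/.match("call 555-1234 now")
prefix = match[1]  # "555"
line   = match[2]  # "1234"

# Because =~ returns nil or an index, it reads naturally in a condition.
verdict = if "foo" =~ /o/ then "matched" else "no match" end
```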

Conclusion

So does Ruby have too many equality tests? I think not! == is obviously the workhorse equality test. But the case statement would not be nearly as elegant without ===. Testing a regexp would not be nearly as concise without =~. == is great most of the time but sometimes it is too broad. eql? and equal? are great ways to be more precise.

When creating your own classes try to make sure that these methods are conforming to their intent and not just inheriting the default behavior of Object. This can often make your API much more elegant.

I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Thanks!

Stefan Kanev November 17, 2010 at 2:53 pm

As a note, `eql?` is actually used for comparing hash keys. Even if you have the nice effect of ‘same classes, same value’ in most cases, `eql?` sends a different message to me. When I see it redefined, I think “All right, we’re using this as a hash key”. When I see it used directly, I go “Huh?”.

Also, you are saying that “when implementing [==] your own objects you should ignore the class of the other objects and concentrate solely on the value of the objects.”. That is not true. While `1 == 1.0`, you don’t have the PHP/JS semantic of `"1" == 1`, where you can argue that the type is different, but the value is the same. `==` doesn’t ignore the class; what it does is that it defines mixed-type comparison within a hierarchy. You can see that both `Integer` and `Float` share a common parent, `Numeric`.

banister November 17, 2010 at 11:15 pm

@Stefan, I don’t think anyone in their right mind would argue "1" and 1 have the same value but are of different type – "1" is (if you want) the ascii value of the _character_ "1", whereas 1 is a straight-up integer. Perhaps if you’re from a PHP background you might have such a mental model, but I think most programmers from other languages would absolutely not agree with you on this point.

Further, my view of #== is that it is a duck-typed version of #eql?. I agree completely with the poster’s analysis on this matter; the book “The Ruby Programming Language” (co-authored by Matz) also gives this definition of #eql? that seems to bear out the poster’s understanding of things: “Classes that override [eql?] typically use it as a strict version of == that does no type conversion” (p 77)

Eric Anderson November 18, 2010 at 8:10 pm

Thanks for your perspective!

Regarding eql? I don’t know if I would necessarily say eql? is only for comparing hash keys (although that is good info to know for someone new to Ruby). The behavior is an important aspect of an object and there are other use cases where using eql? to verify you are getting an object with the same value AND behavior is important.

Regarding ==, perhaps my saying that == should be completely class ignorant is going a bit too far. It should not be concerned much with class but it should be at least aware enough of the class to know how to interpret the value.

I don’t think they have to be in the same hierarchy to be == but if the classes are so different that it doesn’t make sense to compare them then it should return false. Even if their values are superficially similar (like in the case of "1" and 1).

I think with all these methods there are not hard and fast rules. It is partly up to the object implementer to decide what ==, eql?, etc means. But the implementer should be able to justify that his implementation falls in line with the intent that the standard library seems to be conveying.

Jonas Elfström November 17, 2010 at 5:57 pm

>!= is not a method you can define but a language feature.

Except in Ruby 1.9 it seems it is.


> Object.methods.include?(:!=)
=> true

class Test
def !=(o)
true
end
end

t1 = Test.new
t1!=t1
=> true

I kind of agree with http://www.zenspider.com/Languages/Ruby/QuickRef.html about this.

Eric Anderson November 18, 2010 at 7:57 pm

I was not aware of this change for 1.9. Good info! I agree that there are unlikely to be many use cases for overriding !=. I assume they did this just to make it like every other operator (i.e. a method instead of a special case language feature).

Maurício Szabo November 26, 2010 at 7:31 pm

I agree there are quite few uses for overriding != and !, but sometimes this grants some marvelous things.

I’m working on a project using Arel, https://github.com/mauricioszabo/arel_operators, which works by overriding operators, so I can construct a Query using plain-ruby, like:

Person.where { name != 'Foo' }
or
Person.where { !(name == 'Foo') }

On Ruby 1.8, for example, I need to use something like:
Person.where { |q| q.not(q.name == 'Foo') }

So, while it’s unlikely you’ll need to override !=, it’s quite useful, really.

Jarmo Pertman November 18, 2010 at 12:20 am

It seems that there is a typo in:
/o/ =~ foo # true

Shouldn’t the foo be a 'foo' as a String instead?

Eric Anderson November 18, 2010 at 7:55 pm

You are correct, I have updated this post to fix the typo. Thanks!

Rob Nichols May 8, 2013 at 2:53 pm

I’m a little surprised considering the conclusion of this article, that the <=> operator is not mentioned.
<=> b == 0 is yet another equality comparison.

More importantly, defining how <=> works in your classes is often better than defining == directly. It is therefore, a good starting point when customising the comparison behaviour of your classes, and equality is just a type of comparison.

Rob Nichols May 8, 2013 at 2:59 pm

a <=> b == 0 is yet another equality comparison.

Limits on editing time meant I was unable to correct the original.
