Ruby

1 Variables and basic blocks

1.1 Variables : local, global, instance, class, constants
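The five kinds of variables named in this heading can be told apart by the first character of the name. The following is a minimal sketch; the class Greeter and all the variable names in it are hypothetical and chosen only for illustration.

```ruby
$request_count = 0            # global variable: name starts with $
MAX_NAME_LENGTH = 3           # constant: name starts with an uppercase letter

class Greeter                 # hypothetical class, for illustration only
  @@instances = 0             # class variable: shared by the class and its instances

  def initialize(name)
    @name = name              # instance variable: one copy per object
    @@instances += 1
  end

  def greet
    greeting = "Hello"        # local variable: visible only inside this method
    $request_count += 1
    "#{greeting}, #{@name[0, MAX_NAME_LENGTH]}"
  end

  def self.instances
    @@instances
  end
end

puts Greeter.new("Matz").greet   # Hello, Mat
puts Greeter.instances           # 1
puts $request_count              # 1
```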

1.2 begin stmt end

begin
  hello_string = "Hello World"
  puts hello_string
end

1.3 begin stmt rescue stmt1 ensure stmt2 end

#Program to copy one file to another
#with exception handling

from_filename, to_filename = ARGV

from_file = File.open( from_filename )
from_data = from_file.read()
puts      "Copying #{from_data.length()} bytes..."

begin
  to_file_status = File.exist?( to_filename )

  # prompt the user before overwriting files
  if  to_file_status == true 
    puts "Output file exists. Will overwrite the existing file."
    puts "Continue (Y/N)?"
    answer = $stdin.gets().chomp()
    if answer != 'Y'
      raise "Try Again"
    end
  end
rescue
  # try copying again with a new filename
  puts "Enter another filename:"
  to_filename = $stdin.gets.chomp()
  retry
end


to_file   = File.open( to_filename , 'w' )
to_file.write(from_data)

from_file.close()
to_file.close()
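The heading of this subsection also mentions ensure, which the program above does not use. The following is a minimal sketch of the full begin/rescue/else/ensure form, assuming a hypothetical helper named copy_file. The ensure clause runs whether or not an exception was raised, so the files are closed in both cases.

```ruby
# copy_file is a hypothetical helper, not part of the program above.
def copy_file(from_filename, to_filename)
  from_file = nil
  to_file   = nil
  begin
    from_file = File.open(from_filename)
    to_file   = File.open(to_filename, 'w')
    to_file.write(from_file.read)
  rescue SystemCallError => e
    # for example Errno::ENOENT when from_filename does not exist
    puts "Copy failed: #{e.message}"
    false
  else
    # runs only when no exception was raised
    true
  ensure
    # runs on success and on failure alike
    from_file.close if from_file
    to_file.close   if to_file
  end
end
```

The method returns true when the copy succeeded and false when a system call failed; the value of the ensure clause is discarded.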

2 Loops and Conditionals. main!?

2.1 if elsif else end

2.2 stmt if condition

2.3 stmt unless condition

2.4 case

2.5 while condition [do] stmt end

2.6 until condition [do] stmt end

2.7 stmt until conditional

2.8 for i in 1..5 do stmt end

2.9 (expression).each do |loopvariable| code end

(The difference is that the for loop does not create a local scope for the variables)
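The scoping difference can be observed with defined?. The following is a small sketch; the method name scope_demo and the variable names are hypothetical.

```ruby
def scope_demo
  for i in 1..3 do
    squared_in_for = i * i
  end
  # the loop variable and the variable assigned in the body are still visible
  after_for = [defined?(i), defined?(squared_in_for)]

  (1..3).each do |j|
    squared_in_each = j * j
  end
  # the block parameter and the block-local variable are gone
  after_each = [defined?(j), defined?(squared_in_each)]

  { for_loop: after_for, each_block: after_each }
end

result = scope_demo
p result[:for_loop]     # ["local-variable", "local-variable"]
p result[:each_block]   # [nil, nil]
```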

2.10 break and next

2.11 begin stmt end and retry

2.12 Iterators

3 methods

3.1 arguments optional!

3.2 variable length argument list (*args)

3.3 multiple return values!!

3.4 method alias

3.5 defined? and undef
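The features listed in 3.1 to 3.5 can be seen together in a short sketch. The method names greet, total, min_max and welcome are hypothetical and exist only for this illustration.

```ruby
# 3.1 optional argument: greeting falls back to its default value
def greet(name, greeting = "Hello")
  "#{greeting}, #{name}"
end

# 3.2 variable-length argument list: the arguments arrive as an Array
def total(*numbers)
  numbers.inject(0) { |sum, n| sum + n }
end

# 3.3 "multiple return values" are one Array that the caller can destructure
def min_max(*numbers)
  return numbers.min, numbers.max
end

# 3.4 alias gives an existing method a second name
alias welcome greet

smallest, largest = min_max(3, 1, 2)
puts greet("Ruby")                 # Hello, Ruby
puts welcome("Ruby", "Hi")         # Hi, Ruby
puts total(1, 2, 3)                # 6
puts "#{smallest} #{largest}"      # 1 3

# 3.5 defined? describes an expression without evaluating it; undef removes a method
puts defined?(welcome).inspect     # "method"
undef welcome
puts defined?(welcome).inspect     # nil
```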

4 Containers - Arrays (heterogeneous), hashes

5 Variables and methods for classes and objects

5.1 Instance variables and instance methods

We will look at how objects and classes work in Ruby. First, we will try to understand how class and objects are defined and created.

Every object is an instance of its class. Let us take a simple example.

class A
  def initialize(str)
    @name = str
  end
  def expand
    puts @name
  end
end

Now, when we create an instance of A by a_obj = A.new("a"), the instance a_obj has an instance variable @name within it.

If we create another instance b_obj = A.new("b"), then the instance variables of a_obj and that of b_obj are different.

So it is clear that instance variables (those which begin with the @ symbol) must be stored as part of the object instance.

How about instance methods? For example, the above class defines two instance methods - initialize and expand. Should we store these as part of the object instance? This is not necessary. The methods are common to all the object instances of A. Hence we can save storage by storing the instance method definitions in a common structure shared by all instances of A.

This common structure is what the "class" A really is.

So abstractly, we have the following picture.

a_obj                             A
   +-----------+                  +-----------------+
   |           |               /--/ initialize(str) |
   |  @name    |          /----  /| expand          |
   |           |      /---     /- |                 |
   +-----------+ /----        /   |                 |
   |  class    |-            /    +-----------------+
   +-----------+            /
                          /-
b_obj                    /
                        /
   +-----------+      /-
   |  @name    |     /
   |           |    /
   |           |   /
   |           | /-
   +-----------+/
   |  class    /
   +-----------+
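The picture can be checked from Ruby itself. The following sketch repeats the definition of A so that it runs on its own, and then asks where the variables and the methods live.

```ruby
class A
  def initialize(str)
    @name = str
  end
  def expand
    puts @name
  end
end

a_obj = A.new("a")
b_obj = A.new("b")

# each object carries its own @name
p a_obj.instance_variables               # [:@name]
p a_obj.instance_variable_get(:@name)    # "a"
p b_obj.instance_variable_get(:@name)    # "b"

# the method definitions live in the class, not in the instances
p A.instance_methods(false)              # [:expand]
p a_obj.method(:expand).owner            # A
```

initialize does not appear in the list because Ruby makes it a private method.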

The following code searches the methods available on a_obj for expand. (Note that a_obj.methods on its own also lists the instance methods provided by the superclass of A, namely, Object. We will examine superclassing in a later section.)

puts a_obj.methods.grep(/expand/)

Thus the instance methods are stored in the classes. There are two kinds of methods that the class has - instance methods and class methods. The first are those which can be called on an instance of the class, as in a_obj.expand. We will look at the second kind, class methods, next.

5.2 Class variables and class methods

"Everything in Ruby is an Object". What are classes? They must also be objects in Ruby. If classes are objects, what is a class an instance of?

puts a_obj.class.class
puts "a String".class.class
puts Class.class

In the above example, you see that different classes like A and String are instances of the class Class. The class Class is an instance of itself. (This is the first circularity that we encounter in Ruby's object model.)

For the sake of remembering, you can visualize the .class method as "moving to the right" in the hierarchy of objects in Ruby. What we have seen can be visualized as follows.

a_obj            A                 Class
    +----+       /--------+       /+----------\
    |    |   /---|        |    /-- |          |\
    |    |/--    |        | /--    |          | \
    +----+       +--------+-       |          |  X
                                   |          | /
                                   /----------+/
                String           /-
"a String"       /---------+    /
   +-------+   /-|         |   /
   |       |  /  |         | /-
   |       |/-   |         |/
   +-------/     +---------/

Since classes are objects, they should have some variables within themselves, and they should also have some methods which can be called on them. These are called class variables and class methods. How do we define these?

We will first see the most common syntax for defining class methods. Later, we will see that instance methods and class methods are not defined by two different mechanisms, but by the same underlying mechanism.

class A
  def A.identification
    puts "class A"
  end
end

# identification already prints, so no extra puts is needed
A.identification
a_obj = A.new("a")               # A#initialize(str) was defined in 5.1
a_obj.class.identification
b_obj = A.new("b")
b_obj.class.identification

In summary: every object responds to some instance methods. To find the instance method, you look "one step to the right". Classes are also objects, so to find their methods (that is, class methods) you step one more to the right.
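The "one step to the right" rule can be checked with respond_to?. The sketch below repeats the definition of A, with both kinds of methods, so that it runs on its own.

```ruby
class A
  def initialize(str)
    @name = str
  end
  def expand
    puts @name
  end
  def A.identification
    puts "class A"
  end
end

a_obj = A.new("a")

p a_obj.respond_to?(:expand)           # true: found in a_obj.class, that is, A
p a_obj.respond_to?(:identification)   # false: it is not an instance method of A
p A.respond_to?(:identification)       # true: a class method of A
p A.respond_to?(:superclass)           # true: found in A.class, that is, Class
p Class.instance_methods(false).include?(:superclass)   # true
```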

Question: Is new a class method or an instance method? Where will you find the definition of new?

For defining class variables, you prepend @@ to an ordinary variable name. The following example illustrates the use of class variables.

class A1
  @@count = 0                   # no instances have been created yet
  def initialize(str)
    @@count = @@count+1
    @name = str
  end

  def A1.print_count
    puts @@count
  end

end
a_new = A1.new("a")
A1.print_count                  # prints 1
b_new = A1.new("b")
A1.print_count                  # prints 2

5.3 A consistent viewpoint: Eigenclass

"A change in perspective is worth 80 IQ points."

– Alan Kay, designer of Smalltalk

To add methods to every instance of A, we looked up the class definition of A and added methods to it. For consistency, it would be nice if we could do the same for the class methods of A: that is, look up its class, and add those methods in that place.

However, there is a problem: we know from a previous example that A and String are instances of the same class, that is, Class. How is it then that A and String have different class methods? To elaborate: two instances of the class Class, namely, A and String have different methods. We need to know how this is possible.

To do this as consistently as possible, Ruby introduced the nice concept of a "singleton class" or an "eigenclass" associated with each object. This eigenclass can hold methods which are available only to one particular instance of a class, and to no other. We begin with an example.

a_obj = A.new("a")
b_obj = A.new("b")

def a_obj.respond
  puts "#{@name} says hi."
end


a_obj.respond
#b_obj.respond  #will raise NoMethodError exception
a_obj.singleton_methods                 #will include respond

That is, by defining a_obj.respond, we add a method that is available to only one instance of A, namely a_obj, which is unavailable to every other instance of A.

That is, respond is a method of a_obj that is not shared with any other instance of A.

This gives us a clue as to how class methods are defined. What will the following code do? How exactly does it work using the notion of eigenclasses? How do you write it in the "nice" syntax of the previous subsection?

def String.abbreviate
  self.to_s[0..2]          #What is self here?
end

"abc".class.abbreviate

The above code adds the method abbreviate to the String class' eigenclass. Thus even though String and A are both instances of Class, we now have the method abbreviate which is available to String, but not to A.

Ruby also supports the following strange syntax for singleton classes.

class << a_obj
  def new_respond
    puts 'Hello'
  end
end
a_obj.singleton_methods         #should include new_respond
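The same syntax, applied to a class, defines class methods. The following is a minimal sketch: inside the body of class A, self is A, so class << self opens the eigenclass of A. Here identification returns its string instead of printing it, so that the result can be inspected.

```ruby
class A
  class << self                # self is A here, so this opens A's eigenclass
    def identification
      "class A"
    end
  end
end

p A.identification                                               # "class A"
p A.singleton_methods.include?(:identification)                  # true
p A.singleton_class.instance_methods(false).include?(:identification)   # true
p String.respond_to?(:identification)                            # false
```

String is also an instance of Class, but it has its own eigenclass, so it does not see identification.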

For more, please read "Understanding Ruby Singleton Classes".

6 Method dispatch

Every method call in Ruby has a receiver. This is the object on which the method is called. For example, in the following method dispatch,

a_obj.expand

the receiver is a_obj.

In Ruby, there is a default receiver, called self. In a source file, when you type puts "Hello World", there is a receiver which receives this message - in this case, self points to an object named main. (Note that main is an object, not a method.)

The way to understand method dispatch in Ruby is to keep track of self. self changes when either of the following two things happens (Dave Thomas, "The Ruby Object Model"):

  1. When you call a method with an explicit receiver
  2. When you define a new class or module (modules will be explained in the section on mixins.)

That is, in the following code, the self changes are annotated.

puts 'Hello World'                  # self is main
puts self
class A
  puts self                         # self is now the class A

  def name=(str)
   @name = str                      # instance method. when it is
                                    # called, self will be the receiver
   puts self                 
  end
end

puts 'Hi there'                     # self is main
puts self
a = A.new

a.name="Hello"                      # when a.name= executes, self
                                    # changes to a. Check the output 
                                    # to see that self changes during
                                    # the method call.

This explains the concept of the receiver object. In order to understand the method dispatch, it is enough to remember that when the receiver is explicitly specified, as in a.name=, the self changes to a. Hence it suffices to understand how method lookup works for one object, namely self.

How does method dispatch actually take place once we have changed self?

6.1 Instance methods: look in self

We will explain the simple case of method lookup, where the called method is found in self's class itself, without looking up the superclass and so on. We will look at the full protocol once we understand inheritance in Ruby.

For now, assume that the method called on self is available in self's class. That is, in terms of the figure

self                              A
   +-----------+                  +-----------------+
   |           |               /--| initialize(str) |
   |  @name    |          /----   | expand          |
   |           |      /---        |                 |
   +-----------+ /----            |                 |
   |  class    |-                 +-----------------+
   +-----------+            
                          

we "step one to the right" in order to find the method. Suppose the method call is self.expand. We start from the object self, and find its class A by stepping one to the right. We then search for the method named expand in the class A. We do find this method.

Now, we will execute this method, with self maintained as it is when the method call started. Let us look again at the definition of expand.

def expand
  puts @name
end

What does @name refer to when the method starts executing? It refers to self's instance variable name. When the method call started, let's say self was a_obj. Then, @name will refer to the instance variable name of the object a_obj.

Suppose you now expand the definition of class A as follows. (You can add to the definition of any class in Ruby, including the standard classes. These are called open classes.)

class A
  def expand_more
    puts object_id
    expand
  end
end

How does the object_id method call work? What will be self if you initiate the method call a_obj.expand_more?

Wait.

What about the instance-specific methods that you could add to each instance? See a_obj.respond. How are they looked up?

The correct protocol for method lookup within an object's class looks therefore like this:

self                              A
   +-----------+                  +-----------------+
   |           |                  | initialize(str) |
   |  @name    |                  | expand          |
   |           |                  |                 +---------+
   +           |                  |                 |         |
   |           +----    c         +-----------------+         | sc
   +-----------+    \---------     self's eigenclass          |
                              \---+----------------+          |
                                  |                |          |
                                  |   respond      +----------+
                                  |                |
                                  +----------------+

That is, the method is first looked up in self's eigenclass. If the method is found there, then that method is executed without resetting self. If the method is not present in the eigenclass, then the lookup goes "up" to the class of the self object. If it is present there, then the method is executed without resetting self.
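The order of this lookup can be observed by giving one object a singleton method with the same name as an instance method of its class. The sketch below redefines A so that it runs on its own; expand returns a string here instead of printing.

```ruby
class A
  def initialize(str)
    @name = str
  end
  def expand
    "A#expand for #{@name}"
  end
end

a_obj = A.new("a")
b_obj = A.new("b")

# a singleton method with the same name as an instance method of A
def a_obj.expand
  "eigenclass expand for #{@name}"
end

p a_obj.expand     # "eigenclass expand for a"  (found in the eigenclass first)
p b_obj.expand     # "A#expand for b"           (b_obj has no such singleton method)

p a_obj.method(:expand).owner == a_obj.singleton_class   # true
p b_obj.method(:expand).owner                            # A
```

In both calls @name is resolved against self, which is why the two results differ in the name as well.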

Once we understand this, it is easy to understand the full protocol for method dispatch in presence of inheritance. We will now understand how inheritance and "mixins" work in Ruby, and then describe the full method dispatch protocol.

7 Inheritance and mixins

We now look at how inheritance works.

Consider a simple example.

class A
  def foo1
    puts "A::foo1"
  end

  def foo2
    puts "A::foo2"
    foo3
  end

  def foo3
    puts "A::foo3"
  end
end

class B < A
  def foo1
    puts "B::foo1"
  end
  def foo3
    puts "B::foo3"
  end
end

a_obj = A.new
b_obj = B.new

What does the expression B < A ensure? It makes sure that B is a subclass of A - that is, every instance of B will also be an instance of A. All of the following method calls return true.

a_obj.is_a?(A)
b_obj.is_a?(B)
b_obj.is_a?(A)
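The superclass chain, and the place a mixin takes in it, can be inspected with superclass and ancestors. These notes do not show a mixin of their own, so the following is only a sketch under that assumption: the module Describable and its method describe are hypothetical.

```ruby
module Describable                 # hypothetical module, for illustration only
  def describe
    "an instance of #{self.class}"
  end
end

class A
end

class B < A
  include Describable              # mixes the module's methods into B
end

p B.superclass                     # A
p B.ancestors.take(3)              # [B, Describable, A]
p B.new.describe                   # "an instance of B"
p B.new.is_a?(Describable)         # true
p A.new.respond_to?(:describe)     # false
```

The included module is placed between B and its superclass A in the ancestor list, so instances of B see its methods but instances of A do not.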

8 Method dispatch with inheritance

a_obj.foo2()
b_obj.foo2()

a_obj
+------------+                      +---------------+
|            |                      | A             |
|            |                      |  foo1         +-------+
|            |                      |  foo2         |       |
+------------+                      |  foo3         |       |
                                    |               |       |
                                    +---------------+       | superclass
                                                            |
                                                            |
                                                            |
                                     +---------------+      |
                                     | B             +------+
                                     |  foo1         +-------
                                     |  foo3         |      |
                                     +---------------+      |
 b_obj                                                      | superclass
                        class        +---------------+      |
 +----------------+               /--+ b_obj's       |      |
 |                |         /-----   | eigenclass    |      |
 |                |   /-----         |               -------+
 |                +---               |               |
 +----------------+                  +---------------+

b_obj.foo2 is not found in the eigenclass of b_obj or in B, so the
lookup goes up the chain of superclasses until the method is found in A.

Single inheritance ensures that there is only one superclass
to look up at each step. If the method is not found even in the superclass,
then the lookup goes up into the superclass' superclass and so on.
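The lookup can be followed in code. The sketch below repeats the classes A and B of the previous section, except that the methods return strings instead of printing, so that the chain of calls is visible in one value.

```ruby
class A
  def foo1; "A::foo1"; end
  def foo2; "A::foo2 -> " + foo3; end   # foo3 is sent to the implicit receiver self
  def foo3; "A::foo3"; end
end

class B < A
  def foo1; "B::foo1"; end
  def foo3; "B::foo3"; end
end

p A.new.foo2                       # "A::foo2 -> A::foo3"
p B.new.foo2                       # "A::foo2 -> B::foo3"
p B.new.method(:foo2).owner        # A
p B.instance_method(:foo3).owner   # B
```

foo2 is found in A, but self is still the instance of B while it runs, so the call to foo3 inside it starts a fresh lookup from B and finds B's foo3.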

9 self and super

The super keyword, used inside a method, searches for the next method of the same name along the object's ancestor chain. This includes modules as well as superclasses.

Ruby Docs for super keyword

This is different from some other object-oriented languages. In Smalltalk, super is a reference to the current object that forces the method lookup to start from the superclass of the class in which the current method is defined.

If arguments are not given, then super will call the superclass' method with the same arguments as those passed to the subclass method. Notice that this may cause an ArgumentError exception if the subclass method takes more arguments than the superclass method. The correct way in this case is to explicitly pass the arguments to the superclass method, as in the following example.

#-------------------------------------------------------------------
# Class Student: has a name and a hall number
#  can be compared
#-------------------------------------------------------------------

class Student

  def initialize(name, hall)
    @name = name
    @hall = hall
  end
  def to_s
    "#{@name} residing in Hall #{@hall}"
  end
end

#------------------------------------------------------------
# a TA is a Student with an additional responsibility
#------------------------------------------------------------
class TA < Student
  def initialize ( name, hall, course )
    super name, hall
    @course = course
  end

  def to_s
    super + " and a TA for #{@course}"
  end
end
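Both behaviours of super can be tried out as follows. The sketch repeats Student and TA so that it runs on its own. The class BrokenTA, the name "Asha" and the course "CS350" are hypothetical, added only to show the failure when super is called without arguments.

```ruby
class Student
  def initialize(name, hall)
    @name = name
    @hall = hall
  end
  def to_s
    "#{@name} residing in Hall #{@hall}"
  end
end

class TA < Student
  def initialize(name, hall, course)
    super(name, hall)      # explicit arguments: Student#initialize takes two
    @course = course
  end
  def to_s
    super + " and a TA for #{@course}"   # bare super: to_s takes no arguments
  end
end

class BrokenTA < Student   # hypothetical class showing the failure mode
  def initialize(name, hall, course)
    super                  # forwards all three arguments to Student#initialize
    @course = course
  end
end

puts TA.new("Asha", 4, "CS350")   # Asha residing in Hall 4 and a TA for CS350

begin
  BrokenTA.new("Asha", 4, "CS350")
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```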

10 Meta Object Hierarchy

Ruby does not have a "meta-object protocol". However, we will consider the circularities involved in the Ruby hierarchy a little more closely.

10.1 Circularity in the instance of relation

Start with any Ruby object. Continue taking the class of the objects till you reach a fixpoint.

o    = Object.new
c    = o.class
while c != c.class               # stop when c is its own class
  puts c
  c = c.class
end
puts c                           # the fixpoint: Class

Everything in Ruby is an object, and every object is an instance of a class. Taking the "instance of" relationship is similar to stepping "sideways" in the class hierarchy. We see a circularity when we notice that the "class" of the class "Class" is itself.

What does this mean? Note that a class is essentially a container for the common methods of all its instances. If every class in Ruby is an instance of "Class", then "Class" contains all the common methods of all the classes in Ruby, including itself.

# methods defined in Class itself, and hence available on every class
Class.instance_methods(false).each {|foo| puts foo}

10.2 Climbing up the superclass relation

The superclass of Object is BasicObject.

Object.superclass
Object.ancestors
BasicObject.superclass

There is no circularity here. The superclass of BasicObject is nil.

The class of BasicObject, as expected, is Class. Class has a superclass method, so BasicObject, being an instance of Class, responds to the superclass call.

However, there is a strange phenomenon here - climbing up via the superclass relationship need not always end in a class.

Object.superclass.is_a?(Class)       #true, BasicObject is a class
BasicObject.superclass.is_a?(Class)  #false, nil is not a class.

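Why does Ruby provide a BasicObject above Object? This allows the possibility of defining classes whose instances are not Objects, that is, which do not inherit the methods of Object.

class Trait < BasicObject
  # code
end

Trait.ancestors.include?(Object)     #false, Object is not above Trait
Object === Trait.new                 #false, the instance is not an Object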

10.3 The two relationships together

The picture is as follows: the red lines indicate a superclass relationship, and the blue ones an instance-of relationship. By default, red lines are upward, and blue lines point to the right.

./Ruby_1_9.png

Ruby 1.9

is_a?(class) → true or false

Returns true if class is the class of obj, or if class is one of the superclasses of obj or modules included in obj.

  • Ruby Class:Object documentation

What is the problem with this? is_a? checks both the instance-of relationship and the superclass relationship. So we see the following problem.

Class.is_a?(Module)        #true because Class is an instance of Class, and Module is a superclass of Class
Module.is_a?(Class)        #true because Module is an instance of Class

This is a famous error in logic, introduced by Porphyry in what is called the Tree of Porphyry. A critique of this is found in the following passage -

"The relation of a genus to a species is clearly illustrated by the device known as the 'Tree of Porphyry'. The following is the traditional illustration, and has evoked from Bentham the characterization of the 'matchless beauty of the Tree of Porphyry'.

./Porphyrian_Tree.png

The reader will note, however, that the relation between the genus "animal" say, to its species "man" is different from the relationship of the species man to its individual members. The first is a relation between a class and its subclass, the second between a class and its members. Porphyry, who considerably modified Aristotle's theory of the predicables, also confused it irreparably."

  • Morris Cohen and Ernest Nagel, "An Introduction to Logic and Scientific Method", 1936.

A personal opinion: is_a? should have been bifurcated into two methods: is_a_type_of? (to denote the subclass relationship) and is_an_instance_of? (to denote the instance relationship). This confusion is not specific to Ruby; it is found in most object-oriented languages. It would have been more appropriate to think about is_a_type_of?, is_an_instance_of? and has_a?, rather than just the two questions is_a? and has_a?.
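Ruby's core library does allow the two questions to be asked separately, under different names: instance_of? tests only the direct instance-of relationship, and the comparison operator <= between two classes or modules tests only the subclass relationship. A short sketch on the same pair of classes:

```ruby
# instance-of relationship only (superclasses are not considered)
p Module.instance_of?(Class)   # true
p Class.instance_of?(Class)    # true
p Class.instance_of?(Module)   # false

# subclass relationship only, a comparison between two classes or modules
p Class <= Module              # true
p Module <= Class              # false
p String <= Class              # nil, the two are unrelated

# is_a? mixes the two relationships, which is why both of these are true
p Class.is_a?(Module)          # true
p Module.is_a?(Class)          # true
```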

10.4 Summary

  1. There is no circularity in the superclass relationship in Ruby. This is implied by the fact that only Classes respond to the superclass method, and the superclass of BasicObject is nil, which is not a Class.
  2. There is a circularity in the instance-of relationship, because Class is an instance of itself.
  3. Considering both the superclass and the instance-of relationships together introduces more circularities - e.g. via the NilClass class and via the Module class.

Are these circularities essential? Do they cause inconsistency in the way programmers design and think of systems?

11 Reflection. Case study: Unit-testing framework

11.1 Examining existing definitions

Ruby provides methods to examine an object's public, protected and private methods, and its instance variables. Since classes are themselves objects, it is possible to get the class methods of a class as well.

a = "abc"
a.methods                       #public methods
a.singleton_class
class << a
  def reverse
    "bca"
  end
end
a.singleton_methods

This is enough to implement a unit-testing framework.

A typical unit test-case:

require "test/unit"

class A
  attr_accessor :name
end

class TestA < Test::Unit::TestCase
  def test_get
    a = A.new
    assert_nil a.name
  end

  def test_set_get
    a = A.new
    a.name = "a"
    assert_equal "a", a.name
  end

  def foo
    assert_equal true, false
  end
end

How does the unit testing framework know which tests to run? Let us look at a simple method which collects all the tests in a test case class (from Ruby 1.4 source code)

# Rolls up all of the test* methods in the fixture into
# one suite, creating a new instance of the fixture for
# each method.
def self.suite
  method_names = public_instance_methods(true)
  tests = method_names.delete_if { |method_name| method_name !~ /^test.+/ }
  suite = TestSuite.new(name)
  tests.each do
    |test|
    catch(:invalid_test) do
      suite << new(test)
    end
  end
  if (suite.empty?)
    catch(:invalid_test) do
      suite << new(:default_test)
    end
  end
  return suite
end

Thus the code looks up self's public instance methods, removes any which do not start with "test", and creates one instance of the test case per remaining method; that instance runs the test.
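The same idea fits in a few lines. The following is a minimal sketch and not the code of test/unit: the classes MiniTestCase and TestArithmetic and the methods test_method_names and run_all are hypothetical names introduced for this illustration.

```ruby
class MiniTestCase                 # hypothetical base class
  # reflection: collect the public instance methods whose names start with "test"
  def self.test_method_names
    public_instance_methods(true).grep(/^test.+/).sort
  end

  # run each test on a fresh instance and record whether it raised
  def self.run_all
    results = {}
    test_method_names.each do |name|
      begin
        new.public_send(name)
        results[name] = :passed
      rescue StandardError
        results[name] = :failed
      end
    end
    results
  end

  def assert_equal(expected, actual)
    raise "expected #{expected.inspect}, got #{actual.inspect}" unless expected == actual
  end
end

class TestArithmetic < MiniTestCase
  def test_addition
    assert_equal 4, 2 + 2
  end

  def test_wrong
    assert_equal 5, 2 + 2
  end

  def helper                       # ignored: the name does not start with "test"
    assert_equal true, false
  end
end

p TestArithmetic.test_method_names   # [:test_addition, :test_wrong]
p TestArithmetic.run_all
```

As in the test case above, the method that does not start with "test" is never run, even though it would fail.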

11.2 Modifying existing definitions

Can we add new methods to an existing object using the reflection capability? A key ingredient of the reflection capability is the eval family of methods.

11.2.1 eval

eval evaluates a string containing any Ruby code.

eval "('abc'+'def').length"

# Ruby expression to define a new method
eval """
def hi
  puts 'Hello'
end
"""

hi

a = "abc"

eval """
class << a
  def bye
    puts 'bye'
  end
end
"""

a.bye

11.2.2 instance_eval

This is a method that takes a piece of Ruby text or a block, and evaluates it in the context of the receiver, that is, with self set to the receiver.

a = "abc"
a.instance_eval { length }      # calls the method on receiver a

a.instance_eval {def reverse; "bca" end}

a.instance_eval{ reverse }

a.reverse

(Note that this can be done even if we do not know the class of the object.)

11.2.3 class_eval

This takes a block and evaluates it in the context of the receiver, which must be a class or a module. A def inside the block defines an instance method of that class.

a = "abc"
a.class.class_eval { def abbreviate; self[0..1] end }
b = "bcd"
b.abbreviate
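The difference between the two methods shows up in where a def inside the block ends up. The following is a sketch under stated assumptions: the class Account and the methods balance and vip? are hypothetical.

```ruby
class Account                      # hypothetical class, for illustration only
  def initialize(balance)
    @balance = balance
  end
end

acct = Account.new(100)

# instance_eval: self is the receiver, so its instance variables are visible
p acct.instance_eval { @balance }        # 100

# class_eval: def inside the block creates an instance method for every Account
Account.class_eval do
  def balance
    @balance
  end
end
p acct.balance                           # 100
p Account.new(5).balance                 # 5

# instance_eval on one object: def creates a singleton method for it alone
acct.instance_eval do
  def vip?
    true
  end
end
p acct.singleton_methods                 # [:vip?]
p Account.new(5).respond_to?(:vip?)      # false
```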

12 Closures

instance_eval { puts 'Hello' }     # eval accepts only a String; instance_eval accepts a block
(0..4).collect { |i| i*i }


def binop2(a,b,f)
  f.call(a,b)
end

puts binop2(3, 2, lambda {|i,j| i*j})

def binop(a,b)
  yield(a,b)
end

puts binop(3,2){|i,j| i+j}

With closures and Ruby's convention of optional parentheses, we can define "Domain-Specific Languages".
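As an illustration, here is a minimal sketch of such a language for writing recipes. The class RecipeBuilder, the method recipe and the words add and stir are hypothetical. The block is evaluated with instance_eval, so the bare words inside it are method calls on the builder, and leaving out the parentheses makes them read like statements.

```ruby
class RecipeBuilder                # hypothetical builder, for illustration only
  attr_reader :steps

  def initialize
    @steps = []
  end

  def add(ingredient)
    @steps << "add #{ingredient}"
  end

  def stir(minutes)
    @steps << "stir for #{minutes} minutes"
  end
end

def recipe(&block)
  builder = RecipeBuilder.new
  builder.instance_eval(&block)    # inside the block, self is the builder
  builder.steps
end

minutes = 5                        # a local variable captured by the closure below
steps = recipe do
  add "flour"
  add "water"
  stir minutes
end
puts steps
```

The block still sees the local variable minutes from the place where it was written; only self is changed by instance_eval.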

13 Links

13.2 Reference

Ruby String Class documentation

Paolo Perrotta, "Metaprogramming Ruby", The Facets of Ruby Series, Pragmatic Bookshelf, 2010.


Date: 2014-11-08T13:40+0530

Author: Satyadev Nandakumar
