Chapter 4 Subclasses
Every object in the Smalltalk-80 system is an instance of a class. All instances of a class represent the same kind of system component. For example, each instance of Rectangle represents a rectangular area and each instance of Dictionary represents a set of associations between names and values. The fact that the instances of a class all represent the same kind of component is reflected both in the way the instances respond to messages and in the form of their instance variables.
- All instances of a class respond to the same set of messages and use the same set of methods to do so.
- All instances of a class have the same number of named instance variables and use the same names to refer to them.
- An object can have indexed instance variables only if all instances of its class can have indexed instance variables.
The class structure as described so far does not explicitly provide for any intersection in class membership. Each object is an instance of exactly one class. This structure is illustrated in Figure 4.1. In the figure, the small circles represent instances and the boxes represent classes. If a circle is within a box, then it represents an instance of the class represented by the box.
Lack of intersection in class membership is a limitation on design in an object-oriented system since it does not allow any sharing between class descriptions. We might want two objects to be substantially similar, but to differ in some particular way. For example, a floating-point number and an integer are similar in their ability to respond to arithmetic messages, but are different in the way they represent numeric values. An ordered collection and a bag are similar in that they are containers to which elements can be added and from which elements can be removed, but they are different in the precise way in which individual elements are accessed. The difference between otherwise similar objects may be externally visible, such as responding to some different messages, or it may be purely internal, such as responding to the same message by executing different methods. If class memberships are not allowed to overlap, this type of partial similarity between two objects cannot be guaranteed by the system.
The most general way to overcome this limitation is to allow arbitrary intersection of class boundaries (Figure 4.2).
We call this approach multiple inheritance. Multiple inheritance allows a situation in which some objects are instances of two classes, while other objects are instances of only one class or the other. A less general relaxation of the nonintersection limitation on classes is to allow a class to include all instances of another class, but not to allow more general sharing (Figure 4.3).
We call this approach subclassing. This follows the terminology of the programming language Simula, which includes a similar concept. Subclassing is strictly hierarchical; if any instances of a class are also instances of another class, then all instances of that class must also be instances of the other class.
The Smalltalk-80 system provides the subclassing form of inheritance for its classes. This chapter describes how subclasses modify their superclasses, how this affects the association of messages and methods, and how the subclass mechanism provides a framework for the classes in the system.
A subclass specifies that its instances will be the same as instances of another class, called its superclass, except for the differences that are explicitly stated. The Smalltalk-80 programmer always creates a new class as a subclass of an existing class. A system class named Object describes the similarities of all objects in the system, so every class will at least be a subclass of Object. A class description (protocol or implementation) specifies how its instances differ from the instances of its superclass. The instances of a superclass cannot be affected by the existence of subclasses.
A subclass is in all respects a class and can therefore have subclasses itself. Each class has one superclass, although many classes may share the same superclass, so the classes form a tree structure. A class has a sequence of classes from which it inherits both variables and methods. This sequence begins with its superclass and continues with its superclass's superclass, and so on. The inheritance chain continues through the superclass relationship until Object is encountered. Object is the single root class; it is the only class without a superclass.
Recall that an implementation description has three basic parts:
- A class name
- A variable declaration
- A set of methods
A subclass must provide a new class name for itself, but it inherits both the variable declaration and methods of its superclass. New variables may be declared and new methods may be added by the subclass. If instance variable names are added in the subclass variable declaration, instances of the subclass will have more instance variables than instances of the superclass. If shared variables are added, they will be accessible to the instances of the subclass, but not to instances of the superclass. All variable names added must be different from any declared in the superclass.
If a class does not have indexed instance variables, a subclass can declare that its instances will have indexed variables; these indexed variables will be in addition to any inherited named instance variables. If a class has indexed instance variables, its subclasses must also have indexed instance variables; a subclass can also declare new named instance variables.
If a subclass adds a method whose message pattern has the same selector as a method in the superclass, its instances will respond to messages with that selector by executing the new method. This is called overriding a method. If a subclass adds a method with a selector not found in the methods of the superclass, the instances of the subclass will respond to messages not understood by instances of the superclass.
To summarize, each part of an implementation description can be modified by a subclass in a different way:
- The class name must be overridden.
- Variables may be added.
- Methods may be added or overridden.
An Example Subclass
An implementation description includes an entry, not shown in the previous chapter, that specifies its superclass. The following example is a class created as a subclass of the FinancialHistory class introduced in Chapter 3. Instances of the subclass share the function of FinancialHistory for storing information about monetary expenditures and receipts. They have the additional function of keeping track of the expenditures that are tax deductible. The subclass provides the mandatory new class name (DeductibleHistory), and adds one instance variable and four methods. One of these methods (initialBalance:) overrides a method in the superclass. The class description for DeductibleHistory follows.
|class name||DeductibleHistory|
|superclass||FinancialHistory|
|instance variable names||deductibleExpenditures|
transaction recording
    spendDeductible: amount for: reason
        self spend: amount for: reason.
        deductibleExpenditures ← deductibleExpenditures + amount
    spend: amount for: reason deducting: deductibleAmount
        self spend: amount for: reason.
        deductibleExpenditures ← deductibleExpenditures + deductibleAmount
inquiries
    totalDeductions
        ↑ deductibleExpenditures
initialization
    initialBalance: amount
        super initialBalance: amount.
        deductibleExpenditures ← 0
In order to know all the messages understood by an instance of DeductibleHistory, it is necessary to examine the protocols of DeductibleHistory, FinancialHistory, and Object. Instances of DeductibleHistory have four variables--three inherited from the superclass FinancialHistory, and one specified in the class DeductibleHistory. Class Object declares no instance variables.
Figure 4.4 indicates that DeductibleHistory is a subclass of FinancialHistory. Each box in this diagram is labeled in the upper left corner with the name of the class it represents.
Instances of DeductibleHistory can be used to record the history of entities that pay taxes (people, households, businesses). Instances of FinancialHistory can be used to record the history of entities that do not pay taxes (charitable organizations, religious organizations). Actually, an instance of DeductibleHistory could be used in place of an instance of FinancialHistory without detection since it responds to the same messages in the same way. In addition to the messages and methods inherited from FinancialHistory, an instance of DeductibleHistory can respond to messages indicating that all or part of an expenditure is deductible. The new messages available are spendDeductible:for:, which is used if the total amount is deductible; and spend:for:deducting:, which is used if only part of the expenditure is deductible. The total tax deduction can be found by sending a DeductibleHistory the message totalDeductions.
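The relationship between the two classes can be sketched in Python as a rough analogy. The body of FinancialHistory below is an assumption: only the parts this chapter mentions (cashOnHand, expenditures, spend:for:, totalSpentFor:) are modeled, and the Python method names are illustrative renderings of the Smalltalk selectors.

```python
class FinancialHistory:
    """Assumed stand-in for the Chapter 3 class; only what this example needs."""

    def __init__(self, initial_balance=0):
        self.cash_on_hand = initial_balance
        self.incomes = {}
        self.expenditures = {}

    def total_spent_for(self, reason):
        return self.expenditures.get(reason, 0)

    def spend(self, amount, reason):
        self.expenditures[reason] = self.total_spent_for(reason) + amount
        self.cash_on_hand -= amount


class DeductibleHistory(FinancialHistory):
    """Adds one instance variable and the deduction-tracking methods."""

    def __init__(self, initial_balance=0):
        super().__init__(initial_balance)   # reuse the inherited initialization
        self.deductible_expenditures = 0    # the one added instance variable

    def spend_deductible(self, amount, reason):
        # The whole amount is deductible
        self.spend(amount, reason)
        self.deductible_expenditures += amount

    def spend_deducting(self, amount, reason, deductible_amount):
        # Only part of the amount is deductible
        self.spend(amount, reason)
        self.deductible_expenditures += deductible_amount

    def total_deductions(self):
        return self.deductible_expenditures


history = DeductibleHistory(1000)
history.spend(200, "rent")                  # inherited method
history.spend_deductible(50, "charity")     # new method
history.spend_deducting(100, "office", 40)  # new method
print(history.cash_on_hand, history.total_deductions())
```

The inherited spend method works unchanged on the subclass instance, which is the sense in which a DeductibleHistory can stand in for a FinancialHistory.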
When a message is sent, the methods in the receiver's class are searched for one with a matching selector. If none is found, the methods in that class's superclass are searched next. The search continues up the superclass chain until a matching method is found. Suppose we send an instance of DeductibleHistory a message with selector cashOnHand. The search for the appropriate method to execute begins in the class of the receiver, DeductibleHistory. When it is not found, the search continues by looking at DeductibleHistory's superclass, FinancialHistory. When a method with the selector cashOnHand is found there, that method is executed as the response to the message. The response to this message is to return the value of the instance variable cashOnHand. This value is found in the receiver of the message, that is, in the instance of DeductibleHistory.
The search for a matching method follows the superclass chain, terminating at class Object. If no matching method is found in any class in the superclass chain, the receiver is sent the message doesNotUnderstand:; the argument is the offending message. There is a method for the selector doesNotUnderstand: in Object that reports the error to the programmer.
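The search just described can be modeled with a small Python sketch. This is not how a Smalltalk-80 implementation stores classes; ClassModel, lookup, and send are hypothetical names, and methods are simplified to functions that take no receiver.

```python
class ClassModel:
    """A toy class: a name, a superclass link, and a selector-to-method table."""

    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}


def lookup(start_class, selector):
    """Walk up the superclass chain; return (class, method), or None at the root."""
    cls = start_class
    while cls is not None:
        if selector in cls.methods:
            return cls, cls.methods[selector]
        cls = cls.superclass
    return None


def send(receiver_class, selector):
    """Run the matching method, or fall back to doesNotUnderstand:."""
    found = lookup(receiver_class, selector)
    if found is None:
        # No class in the chain matched, so doesNotUnderstand: is sent instead
        _, handler = lookup(receiver_class, "doesNotUnderstand:")
        return handler(selector)
    _, method = found
    return method()


object_class = ClassModel("Object", methods={
    "doesNotUnderstand:": lambda selector: f"error: {selector} not understood",
})
financial = ClassModel("FinancialHistory", object_class, {"cashOnHand": lambda: 100})
deductible = ClassModel("DeductibleHistory", financial, {"totalDeductions": lambda: 0})

found_in, _ = lookup(deductible, "cashOnHand")
print(found_in.name)  # FinancialHistory
```

Sending an unknown selector through send ends in the method for doesNotUnderstand: held by the root class, mirroring the fallback in the text.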
Suppose we send an instance of DeductibleHistory a message with selector spend:for:. This method is found in the superclass FinancialHistory. The method, as given in Chapter 3, is
spend: amount for: reason
    expenditures at: reason put: (self totalSpentFor: reason) + amount.
    cashOnHand ← cashOnHand - amount
The values of the instance variables (expenditures and cashOnHand) are found in the receiver of the message, that is, in the instance of DeductibleHistory. The pseudo-variable self is also referenced in this method; self represents the DeductibleHistory instance that was the receiver of the message.
Messages to self
When a method contains a message whose receiver is self, the search for the method for that message begins in the instance's class, regardless of which class contains the method containing self. Thus, when the expression self totalSpentFor: reason is evaluated in the method for spend:for: found in FinancialHistory, the search for the method associated with the message selector totalSpentFor: begins in the class of self, i.e., in DeductibleHistory.
Messages to self will be explained using two example classes named One and Two. Two is a subclass of One and One is a subclass of Object. Both classes include a method for the message test. Class One also includes a method for the message result1 that returns the result of the expression self test.
|class name||One|
|superclass||Object|
test
    ↑ 1
result1
    ↑ self test

|class name||Two|
|superclass||One|
test
    ↑ 2
An instance of each class will be used to demonstrate the method determination for messages to self. example1 is an instance of class One and example2 is an instance of class Two.
example1 ← One new.
example2 ← Two new
The relationship between One and Two is shown in Figure 4.5. In addition to labeling the boxes in order to indicate class names, several of the circles are also labeled in order to indicate a name referring to the corresponding instance.
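The following table shows the results of evaluating various expressions.
|expression||result|
|example1 test||1|
|example1 result1||1|
|example2 test||2|
|example2 result1||2|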
The two result1 messages both invoke the same method, which is found in class One. They produce different results because of the message to self contained in that method. When result1 is sent to example2, the search for a matching method begins in Two. A method is not found in Two, so the search continues by looking in the superclass, One. A method for result1 is found in One, which consists of one expression, ↑ self test. The pseudo-variable self refers to the receiver, example2. The search for the response to test, therefore, begins in class Two. A method for test is found in Two, which returns 2.
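The same behavior can be reproduced in Python, whose method lookup on self also restarts in the receiver's class. This is an analogy to the Smalltalk-80 classes above, not a translation of the system itself.

```python
class One:
    def test(self):
        return 1

    def result1(self):
        # The lookup for test restarts in the class of the receiver
        return self.test()


class Two(One):
    def test(self):
        return 2


example1 = One()
example2 = Two()
results = [example1.test(), example1.result1(), example2.test(), example2.result1()]
print(results)  # [1, 1, 2, 2]
```

Both result1 calls run the single method defined in One; only the class of the receiver differs.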
Messages to super
An additional pseudo-variable named super is available for use in a method's expressions. The pseudo-variable super refers to the receiver of the message, just as self does. However, when a message is sent to super, the search for a method does not begin in the receiver's class. Instead, the search begins in the superclass of the class containing the method. The use of super allows a method to access methods defined in a superclass even if the methods have been overridden in subclasses. The use of super other than as a receiver (for example, as an argument) has no different effect from using self; the use of super affects only the initial class in which messages are looked up.
Messages to super will be explained using two more example classes named Three and Four. Four is a subclass of Three, and Three is a subclass of the previous example Two. Four overrides the method for the message test. Three contains methods for two new messages--result2 returns the result of the expression self result1, and result3 returns the result of the expression super test.
|class name||Three|
|superclass||Two|
result2
    ↑ self result1
result3
    ↑ super test

|class name||Four|
|superclass||Three|
test
    ↑ 4
Instances of One, Two, Three, and Four can all respond to the messages test and result1. The response of instances of Three and Four to messages illustrates the effect of super (Figure 4.6).
example3 ← Three new.
example4 ← Four new
An attempt to send the messages result2 or result3 to example1 or example2 is an error since instances of One or Two do not understand the messages result2 or result3.
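The following table shows the results of sending various messages.
|expression||result|
|example3 test||2|
|example4 result1||4|
|example3 result2||2|
|example4 result2||4|
|example3 result3||2|
|example4 result3||2|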
When test is sent to example3, the method in Two is used, since Three doesn't override the method. example4 responds to result1 with a 4 for the same reason that example2 responded with a 2. When result2 is sent to example3, the search for a matching method begins in Three. The method found there returns the result of the expression self result1. The search for the response to result1 also begins in class Three. A matching method is not found in Three or its superclass, Two. The method for result1 is found in One and returns the result of self test. The search for the response to test once more begins in class Three. This time, the matching method is found in Three's superclass Two.
The effect of sending messages to super will be illustrated by the responses of example3 and example4 to the message result3. When result3 is sent to example3, the search for a matching method begins in Three. The method found there returns the result of the expression super test. Since test is sent to super, the search for a matching method begins not in class Three, but in its superclass, Two. The method for test in Two returns a 2. When result3 is sent to example4, the result is still 2, even though Four overrides the method for test.
This example highlights a potential confusion: super does not mean start the search in the superclass of the receiver, which, in the last example, would have been class Three. It means start the search in the superclass of the class containing the method in which super was used, which, in the last example, was class Two. Even if Three had overridden the method for test by returning 3, the result of example4 result3 would still be 2. Sometimes, of course, the superclass of the class in which the method containing super is found is the same as the superclass of the receiver.
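A Python analogy of the four example classes shows the same results. With single inheritance, Python's zero-argument super() also starts the search in the superclass of the class that contains the method, not in the superclass of the receiver.

```python
class One:
    def test(self):
        return 1

    def result1(self):
        return self.test()


class Two(One):
    def test(self):
        return 2


class Three(Two):
    def result2(self):
        return self.result1()

    def result3(self):
        # Search starts in Two, the superclass of the class holding this method
        return super().test()


class Four(Three):
    def test(self):
        return 4


example3 = Three()
example4 = Four()
print(example3.test(), example4.result1())     # 2 4
print(example3.result2(), example4.result2())  # 2 4
print(example3.result3(), example4.result3())  # 2 2
```

The last line is the interesting one: example4 overrides test, yet result3 still answers 2 because the search skips both Four and Three.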
Another example of the use of super is in the method for initialBalance: in DeductibleHistory.
initialBalance: amount
    super initialBalance: amount.
    deductibleExpenditures ← 0
This method overrides a method in the superclass FinancialHistory. The method in DeductibleHistory consists of two expressions. The first expression passes control to the superclass in order to process the initialization of the balance.
super initialBalance: amount
The pseudo-variable super refers to the receiver of the message, but indicates that the search for the method should skip DeductibleHistory and begin in FinancialHistory. In this way, the expressions from FinancialHistory do not have to be duplicated in DeductibleHistory. The second expression in the method does the subclass-specific initialization.
deductibleExpenditures ← 0
If self were substituted for super in the initialBalance: method, it would result in an infinite recursion, since every time initialBalance: is sent, it will be sent again.
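The difference can be observed in a Python sketch. The one-line FinancialHistory body and the class BrokenHistory are assumptions made for illustration. Python stops runaway recursion by raising RecursionError rather than looping forever, which makes the failure easy to detect.

```python
class FinancialHistory:
    def initial_balance(self, amount):
        # Assumed minimal body; the real method is given in Chapter 3
        self.cash_on_hand = amount


class DeductibleHistory(FinancialHistory):
    def initial_balance(self, amount):
        super().initial_balance(amount)   # search skips DeductibleHistory
        self.deductible_expenditures = 0


class BrokenHistory(FinancialHistory):
    """Hypothetical variant that sends the message to self instead of super."""

    def initial_balance(self, amount):
        self.initial_balance(amount)      # finds this same method again
        self.deductible_expenditures = 0


def recurses_without_end(history):
    """Return True if initializing the history never reaches the superclass."""
    try:
        history.initial_balance(100)
    except RecursionError:
        return True
    return False


print(recurses_without_end(DeductibleHistory()))  # False
print(recurses_without_end(BrokenHistory()))      # True
```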
Abstract superclasses are created when two classes share a part of their descriptions and yet neither one is properly a subclass of the other. A mutual superclass is created for the two classes which contains their shared aspects. This type of superclass is called abstract because it was not created in order to have instances. In terms of the figures shown earlier, an abstract superclass represents the situation illustrated in Figure 4.7. Notice that the abstract class does not directly contain instances.
As an example of the use of an abstract superclass, consider two classes whose instances represent dictionaries. One class, named SmallDictionary, minimizes the space needed to store its contents; the other, named FastDictionary, minimizes the time needed to locate names. Both classes use two parallel lists that contain names and associated values. SmallDictionary stores the names and values contiguously and uses a simple linear search to locate a name. FastDictionary stores names and values sparsely and uses a hashing technique to locate a name. Other than the difference in how names are located, these two classes are very similar: they share identical protocol and they both use parallel lists to store their contents. These similarities are represented in an abstract superclass named DualListDictionary. The relationships among these three classes are shown in Figure 4.8.
The implementation description for the abstract class, DualListDictionary, is shown next.
|class name||DualListDictionary|
|superclass||Object|
|instance variable names||names values|
accessing
    at: name
        | index |
        index ← self indexOf: name.
        index = 0
            ifTrue: [self error: 'Name not found']
            ifFalse: [↑ values at: index]
    at: name put: value
        | index |
        index ← self indexOf: name.
        index = 0 ifTrue: [index ← self newIndexOf: name].
        ↑ values at: index put: value
testing
    includes: name
        ↑ (self indexOf: name) ~= 0
    isEmpty
        ↑ self size = 0
initialization
    initialize
        names ← Array new: 0.
        values ← Array new: 0
This description of DualListDictionary uses only messages defined in DualListDictionary itself or ones already described in this or in the previous chapters. The external protocol for a DualListDictionary consists of messages at:, at:put:, includes:, isEmpty, and initialize. A new DualListDictionary (actually an instance of a subclass of DualListDictionary) is created by sending the class the message new. The new instance is then sent the message initialize so that assignments can be made to the two instance variables. The two variables are initially empty arrays (Array new: 0).
Three messages to self used in its methods are not implemented in DualListDictionary--size, indexOf:, and newIndexOf:. This is the reason that DualListDictionary is called abstract. If an instance were created, it would not be able to respond successfully to all of the necessary messages. The two subclasses, SmallDictionary and FastDictionary, must implement the three missing messages. The fact that the search always starts at the class of the instance referred to by self means that a method in a superclass can be specified in which messages are sent to self, but the corresponding methods are found in the subclass. In this way, a superclass can provide a framework for a method that is refined or actually implemented by the subclass.
SmallDictionary is a subclass of DualListDictionary that uses a minimal amount of space to represent the associations, but may take a long time to find an association. It provides methods for the three messages that were not implemented in DualListDictionary--size, indexOf:, and newIndexOf:. It does not add variables.
|class name||SmallDictionary|
|superclass||DualListDictionary|
accessing
    size
        ↑ names size
private
    indexOf: name
        1 to: names size do:
            [ :index | (names at: index) = name ifTrue: [↑ index]].
        ↑ 0
    newIndexOf: name
        self grow.
        names at: names size put: name.
        ↑ names size
    grow
        | oldNames oldValues |
        oldNames ← names.
        oldValues ← values.
        names ← Array new: names size + 1.
        values ← Array new: values size + 1.
        names replaceFrom: 1 to: oldNames size with: oldNames.
        values replaceFrom: 1 to: oldValues size with: oldValues
Since names are stored contiguously, the size of a SmallDictionary is the size of its array of names, names. The index of a particular name is determined by a linear search of the array names. If no match is found, the index is 0, signalling failure in the search. Whenever a new association is to be added to the dictionary, the method for newIndexOf: is used to find the appropriate index. It assumes that the sizes of names and values are exactly the sizes needed to store their current elements. This means no space is available for adding a new element. The message grow creates two new Arrays that are copies of the previous ones, with one more element at the end. In the method for newIndexOf:, first the sizes of names and values are increased and then the new name is stored in the new empty position (the last one). The method that called on newIndexOf: has the responsibility for storing the value.
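The division of labor between the abstract class and SmallDictionary can be sketched in Python. The method names are illustrative renderings of the Smalltalk selectors, and two details are simplifications: Python lists grow in place with append instead of being copied into new arrays, and a KeyError stands in for self error:. Indices stay 1-based, with 0 meaning "not found", to match the text.

```python
class DualListDictionary:
    """Framework methods; size, index_of and new_index_of are left to subclasses."""

    def __init__(self):
        self.names = []
        self.values = []

    def at(self, name):
        index = self.index_of(name)
        if index == 0:
            raise KeyError("Name not found")
        return self.values[index - 1]

    def at_put(self, name, value):
        index = self.index_of(name)
        if index == 0:
            index = self.new_index_of(name)
        self.values[index - 1] = value   # the caller of new_index_of stores the value
        return value

    def includes(self, name):
        return self.index_of(name) != 0

    def is_empty(self):
        return self.size() == 0


class SmallDictionary(DualListDictionary):
    def size(self):
        return len(self.names)

    def index_of(self, name):
        # Linear search over the contiguous list of names
        for index, candidate in enumerate(self.names, start=1):
            if candidate == name:
                return index
        return 0

    def new_index_of(self, name):
        self._grow()
        self.names[-1] = name            # store the name in the new last position
        return len(self.names)

    def _grow(self):
        # Make room for exactly one more element in each list
        self.names.append(None)
        self.values.append(None)


ages = SmallDictionary()
ages.at_put("Brett", 3)
ages.at_put("Dave", 30)
print(ages.includes("Sam"), ages.includes("Brett"), ages.at("Dave"))  # False True 30
```

Every call to at, at_put, or includes runs a method defined in DualListDictionary, while the messages those methods send to self are answered by SmallDictionary.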
We could evaluate the following example expressions.
|expression||result|
|ages ← SmallDictionary new||a new, uninitialized instance|
|ages initialize||instance variables initialized|
|ages at: 'Brett' put: 3||3|
|ages at: 'Dave' put: 30||30|
|ages includes: 'Sam'||false|
|ages includes: 'Brett'||true|
|ages at: 'Dave'||30|
For each of the above example expressions, we indicate in which class the message is found and in which class any messages sent to self are found.
|message selector||message to self||class of method|
FastDictionary is another subclass of DualListDictionary. It uses a hashing technique to locate names. Hashing requires more space, but takes less time than a linear search. All objects respond to the hash message by returning a number. Numbers respond to the \\ message by returning their value modulo the argument.
|class name||FastDictionary|
|superclass||DualListDictionary|
accessing
    size
        | size |
        size ← 0.
        names do: [ :name | name notNil ifTrue: [size ← size + 1]].
        ↑ size
initialization
    initialize
        names ← Array new: 4.
        values ← Array new: 4
private
    indexOf: name
        | index |
        index ← name hash \\ names size + 1.
        [(names at: index) = name]
            whileFalse: [(names at: index) isNil
                ifTrue: [↑ 0]
                ifFalse: [index ← index \\ names size + 1]].
        ↑ index
    newIndexOf: name
        | index |
        names size - self size <= (names size / 4)
            ifTrue: [self grow].
        index ← name hash \\ names size + 1.
        [(names at: index) isNil]
            whileFalse: [index ← index \\ names size + 1].
        names at: index put: name.
        ↑ index
    grow
        | oldNames oldValues |
        oldNames ← names.
        oldValues ← values.
        names ← Array new: names size * 2.
        values ← Array new: values size * 2.
        1 to: oldNames size do:
            [ :index |
                (oldNames at: index) isNil
                    ifFalse: [self at: (oldNames at: index)
                                put: (oldValues at: index)]]
FastDictionary overrides DualListDictionary's implementation of initialize in order to create Arrays that already have some space allocated (Array new: 4). The size of a FastDictionary is not simply the size of one of its variables since the Arrays always have empty entries. So the size is determined by examining each element in the Array and counting the number that are not nil.
The implementation of newIndexOf: follows basically the same idea as that used for SmallDictionary except that when the size of an Array is changed (doubled in this case in the method for grow), each element is explicitly copied from the old Arrays into the new ones so that elements are rehashed. The size does not always have to be changed as is necessary in SmallDictionary. The size of a FastDictionary is changed only when the number of empty locations in names falls below a minimum.
The minimum is equal to 25% of the size of names.
names size - self size <= (names size / 4)
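The hashing scheme and the growth rule can be sketched in Python. This is a sketch under several assumptions: the framework methods at and at_put are inlined so the block stands alone, slots are 0-based, None plays the role of nil (so None cannot be used as a name), a missing name is reported as None rather than 0, and Python's built-in hash replaces the hash message.

```python
class FastDictionary:
    """Open-addressing sketch with linear probing and doubling growth."""

    def __init__(self):
        self.names = [None] * 4
        self.values = [None] * 4

    def size(self):
        # Count occupied slots; the lists always keep some empty entries
        return sum(1 for name in self.names if name is not None)

    def index_of(self, name):
        index = hash(name) % len(self.names)
        while self.names[index] != name:
            if self.names[index] is None:
                return None                       # reached an empty slot: absent
            index = (index + 1) % len(self.names)
        return index

    def new_index_of(self, name):
        # Grow when the empty slots are at most 25% of the list size
        if len(self.names) - self.size() <= len(self.names) / 4:
            self._grow()
        index = hash(name) % len(self.names)
        while self.names[index] is not None:
            index = (index + 1) % len(self.names)
        self.names[index] = name
        return index

    def _grow(self):
        old_names, old_values = self.names, self.values
        self.names = [None] * (len(old_names) * 2)
        self.values = [None] * (len(old_values) * 2)
        for name, value in zip(old_names, old_values):
            if name is not None:
                self.at_put(name, value)          # rehash into the larger lists

    def at_put(self, name, value):
        index = self.index_of(name)
        if index is None:
            index = self.new_index_of(name)
        self.values[index] = value
        return value

    def at(self, name):
        index = self.index_of(name)
        if index is None:
            raise KeyError("Name not found")
        return self.values[index]


ages = FastDictionary()
ages.at_put("Brett", 3)
ages.at_put("Dave", 30)
print(ages.at("Dave"), ages.size(), len(ages.names))  # 30 2 4
```

Because growth is triggered before the last empty slot is used, every probe loop is guaranteed to reach either the name or an empty slot.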
Subclass Framework Messages
As a matter of programming style, a method should not include messages to self if the messages are neither implemented by the class nor inherited from a superclass. In the description of DualListDictionary, three such messages exist--size, indexOf:, and newIndexOf:. As we shall see in subsequent chapters, the ability to respond to size is inherited from Object; the response is the number of indexed instance variables. A subclass of DualListDictionary is supposed to override this method in order to return the number of names in the dictionary.
A special message, subclassResponsibility, is specified in Object. It is to be used in the implementation of messages that cannot be properly implemented in an abstract class. That is, the implementation of size and indexOf: and newIndexOf:, by Smalltalk-80 convention, should be
    self subclassResponsibility
The response to this message is to invoke the following method defined in class Object.
subclassResponsibility
    self error: 'My subclass should have overridden one of my messages.'
In this way, if a method should have been implemented in a subclass of an abstract class, the error reported is an indication to the programmer of how to fix the problem. Moreover, using this message, the programmer creates abstract classes in which all messages sent to self are implemented, and in which the implementation is an indication to the programmer of which methods must be overridden in the subclass.
By convention, if the programmer decides that a message inherited from an abstract superclass should actually not be implemented, the appropriate way to override the inherited method is
    self shouldNotImplement
The response to this message is to invoke the following method defined in class Object.
shouldNotImplement
    self error: 'This message is not appropriate for this object.'
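Both conventions can be mimicked in Python. In this sketch the class Base stands in for Object, the exception types are a choice made for the example, and FrozenDictionary is a hypothetical subclass invented to show shouldNotImplement; none of these names come from the Smalltalk-80 system.

```python
class Base:
    """Stand-in for Object: holds the two error-reporting helpers."""

    def subclass_responsibility(self):
        raise NotImplementedError(
            "My subclass should have overridden one of my messages.")

    def should_not_implement(self):
        raise TypeError("This message is not appropriate for this object.")


class DualListDictionary(Base):
    def size(self):
        return self.subclass_responsibility()

    def new_index_of(self, name):
        return self.subclass_responsibility()

    def is_empty(self):
        return self.size() == 0


class SmallDictionary(DualListDictionary):
    def __init__(self):
        self.names = []

    def size(self):
        return len(self.names)

    def new_index_of(self, name):
        self.names.append(name)
        return len(self.names)


class FrozenDictionary(SmallDictionary):
    """Hypothetical subclass that refuses to accept new names."""

    def new_index_of(self, name):
        return self.should_not_implement()


def error_raised(action, error_type):
    """Return True if calling action raises error_type."""
    try:
        action()
    except error_type:
        return True
    return False


print(SmallDictionary().is_empty())                                         # True
print(error_raised(DualListDictionary().is_empty, NotImplementedError))     # True
print(error_raised(lambda: FrozenDictionary().new_index_of("x"), TypeError))  # True
```

Every message the abstract class sends to self now has an implementation somewhere in the chain, and the error text tells the programmer which kind of mistake was made.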
There are several major subclass hierarchies in the Smalltalk-80 system that make use of the idea of creating a framework of messages whose implementations must be completed in subclasses. There are classes describing various kinds of collections (see Chapters 9 and 10). The collection classes are arranged hierarchically in order to share as much as possible among classes describing similar kinds of collections. They make use of the messages subclassResponsibility and shouldNotImplement. Another example of the use of subclasses is the hierarchy of linear measures and number classes (see Chapters 7 and 8).
Summary of Terminology
|subclass||A class that inherits variables and methods from an existing class.|
|superclass||The class from which variables and methods are inherited.|
|Object||The class that is the root of the tree-structured class hierarchy.|
|overriding a method||Specifying a method in a subclass for the same message as a method in a superclass.|
|super||A pseudo-variable that refers to the receiver of a message; differs from self in where to start the search for methods.|
|abstract class||A class that specifies protocol, but is not able to fully implement it; by convention, instances are not created of this kind of class.|
|subclassResponsibility||A message to report the error that a subclass should have implemented one of the superclass's messages.|
|shouldNotImplement||A message to report the error that this is a message inherited from a superclass but explicitly not available to instances of the subclass.|