Smalltalk80LanguageImplementation:Chapter 03
Classes and Instances
Objects represent the components of the Smalltalk-80 system--the numbers, data structures, processes, disk files, process schedulers, text editors, compilers, and applications. Messages represent interactions between the components of the Smalltalk-80 system--the arithmetic, data accesses, control structures, file creations, text manipulations, compilations, and application uses. Messages make an object's functionality available to other objects, while keeping the object's implementation hidden. The previous chapter introduced an expression syntax for describing objects and messages, concentrating on how messages are used to access an object's functionality. This chapter introduces the syntax for describing methods and classes in order to show how the functionality of objects is implemented.
Every Smalltalk-80 object is an instance of a class. The instances of a class all have the same message interface; the class describes how to carry out each of the operations available through that interface. Each operation is described by a method. The selector of a message determines what type of operation the receiver should perform, so a class has one method for each selector in its interface. When a message is sent to an object, the method associated with that type of message in the receiver's class is executed. A class also describes what type of private memory its instances will have.
Each class has a name that describes the type of component its instances represent. A class name serves two fundamental purposes: it is a simple way for instances to identify themselves, and it provides a way to refer to the class in expressions. Since classes are components of the Smalltalk-80 system, they are represented by objects. A class's name automatically becomes the name of a globally shared variable. The value of that variable is the object representing the class. Since class names are the names of shared variables, they must be capitalized.
New objects are created by sending messages to classes. Most classes respond to the unary message new by creating a new instance of themselves. For example,
OrderedCollection new
returns a new collection that is an instance of the system class OrderedCollection. The new OrderedCollection is empty. Some classes create instances in response to other messages. For example, the class whose instances represent times in a day is Time; Time responds to the message now with an instance representing the current time. The class whose instances represent days in a year is Date; Date responds to the message today with an instance representing the current day. When a new instance is created, it automatically shares the methods of the class that received the instance creation message.
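As a small illustration of these instance creation messages, the expressions below could be evaluated together in an interactive system. This is only a sketch: the temporary variable name items is invented for the example, and the vertical bar declaration it relies on is explained later in this chapter. The left arrow printed in this book is typed := in most current Smalltalk dialects.

```smalltalk
| items |
items ← OrderedCollection new.   "a new, empty collection"
items isEmpty.                    "true, since nothing has been added yet"
Time now.                         "an instance of Time representing the current time"
Date today                        "an instance of Date representing the current day"
```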
This chapter introduces two ways to present a class, one describing the functionality of the instances and the other describing the implementation of that functionality.
- A protocol description lists the messages in the instances' message interface. Each message is accompanied by a comment describing the operation an instance will perform when it receives that type of message.
- An implementation description shows how the functionality described in the protocol description is implemented. An implementation description gives the form of the instances' private memory and the set of methods that describe how instances perform their operations.
A third way to present classes is an interactive view called a system browser. The browser is part of the programming interface and is used in a running Smalltalk-80 system. Protocol descriptions and implementation descriptions are designed for noninteractive documentation like this book. The browser will be described briefly in Chapter 17.
Protocol Descriptions
A protocol description lists the messages understood by instances of a particular class. Each message is listed with a comment about its functionality. The comment describes the operation that will be performed when the message is received and what value will be returned. The comment describes what will happen, not how the operation will be performed. If the comment gives no indication of the value to be returned, then the value is assumed to be the receiver of the message.
For example, a protocol description entry for the message to a FinancialHistory with the selector spend:for: is
spend: amount for: reason | Remember that an amount of money, amount, has been spent for reason. |
Messages in a protocol description are described in the form of message patterns. A message pattern contains a message selector and a set of argument names, one name for each argument that a message with that selector would have. For example, the message pattern
spend: amount for: reason
matches the messages described by each of the following three expressions.
HouseholdFinances spend: 32.50 for: 'utilities'
HouseholdFinances spend: cost + tax for: 'food'
HouseholdFinances spend: 100 for: usualReason
The argument names are used in the comment to refer to the arguments. The comment in the example above indicates that the first argument represents the amount of money spent and the second argument represents what the money was spent for.
Message Categories
Messages that invoke similar operations are grouped together in categories. The categories have names that indicate the common functionality of the messages in the group. For example, the messages to FinancialHistory are grouped into three categories named transaction recording, inquiries, and initialization. This categorization is intended to make the protocol more readable to the user; it does not affect the operation of the class.
The complete protocol description for FinancialHistory is shown next.
FinancialHistory protocol
transaction recording | |
receive: amount from: source | Remember that an amount of money, amount, has been received from source. |
spend: amount for: reason | Remember that an amount of money, amount, has been spent for reason. |
inquiries | |
cashOnHand | Answer the total amount of money currently on hand. |
totalReceivedFrom: source | Answer the total amount received from source, so far. |
totalSpentFor: reason | Answer the total amount spent for reason, so far. |
initialization | |
initialBalance: amount | Begin a financial history with amount as the amount of money on hand. |
A protocol description provides sufficient information for a programmer to know how to use instances of the class. From the above protocol description, we know that any instance of FinancialHistory should respond to the messages whose selectors are receive:from:, spend:for:, cashOnHand, totalReceivedFrom:, totalSpentFor:, and initialBalance:. We can guess that when we first create an instance of a FinancialHistory, the message initialBalance: should be sent to the instance in order to set values for its variables.
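The protocol description alone is enough to write a short session. The sketch below assumes that FinancialHistory behaves as its protocol says; the temporary variable history and the source name 'pay' are invented for the example.

```smalltalk
| history |
history ← FinancialHistory new initialBalance: 350.
history receive: 1000 from: 'pay'.
history spend: 32.50 for: 'utilities'.
history spend: 100 for: 'food'.
history totalSpentFor: 'food'.    "100"
history cashOnHand                "1217.5, that is 350 + 1000 - 32.50 - 100"
```

Nothing in this session depends on how the instance stores its totals; only the selectors and comments from the protocol are used.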
Implementation Descriptions
An implementation description has three parts.
- a class name
- a declaration of the variables available to the instances
- the methods used by instances to respond to messages
An example of a complete implementation description for FinancialHistory is given next. The methods in an implementation description are divided into the same categories used in the protocol description. In the interactive system browser, categories are used to provide a hierarchical query path for accessing the parts of a class description. There are no special character delimiters separating the various parts of implementation descriptions. Changes in character font and emphasis indicate the different parts. In the interactive system browser, the parts are stored independently and the system browser provides a structured editor for accessing them.
class name | FinancialHistory |
instance variable names | cashOnHand incomes expenditures |
instance methods |
transaction recording
    receive: amount from: source
        incomes at: source
            put: (self totalReceivedFrom: source) + amount.
        cashOnHand ← cashOnHand + amount
    spend: amount for: reason
        expenditures at: reason
            put: (self totalSpentFor: reason) + amount.
        cashOnHand ← cashOnHand - amount
inquiries
    cashOnHand
        ↑ cashOnHand
    totalReceivedFrom: source
        (incomes includesKey: source)
            ifTrue: [↑ incomes at: source]
            ifFalse: [↑ 0]
    totalSpentFor: reason
        (expenditures includesKey: reason)
            ifTrue: [↑ expenditures at: reason]
            ifFalse: [↑ 0]
initialization
    initialBalance: amount
        cashOnHand ← amount.
        incomes ← Dictionary new.
        expenditures ← Dictionary new
This implementation description is different from the one presented for FinancialHistory on the inside front cover of this book. The one on the inside front cover has an additional part labeled "class methods" that will be explained in Chapter 5; also, it omits the initialization method shown here.
Variable Declarations
The methods in a class have access to five different kinds of variables. These kinds of variables differ in terms of how widely they are available (their scope) and how long they persist. There are two kinds of private variables available only to a single object.
- 1. Instance variables exist for the entire lifetime of the object.
- 2. Temporary variables are created for a specific activity and are available only for the duration of the activity.
Instance variables represent the current state of an object. Temporary variables represent the transitory state necessary to carry out some activity. Temporary variables are typically associated with a single execution of a method: they are created when a message causes the method to be executed and are discarded when the method completes by returning a value.
The three other kinds of variables can be accessed by more than one object. They are distinguished by how widely they are shared.
- 3. Class variables are shared by all the instances of a single class.
- 4. Global variables are shared by all the instances of all classes (that is, by all objects).
- 5. Pool variables are shared by the instances of a subset of the classes in the system.
The majority of shared variables in the system are either class variables or global variables. The majority of global variables refer to the classes in the system. An instance of FinancialHistory named HouseholdFinances was used in several of the examples in the previous chapters. We used HouseholdFinances as if it were defined as a global variable name. Global variables are used to refer to objects that are not parts of other objects.
Recall that the names of shared variables (3-5) are capitalized, while the names of private variables (1-2) are not. The value of a shared variable will be independent of which instance is using the method in which its name appears. The value of instance variables and temporaries will depend on the instance using the method, that is, the instance that received a message.
Instance Variables
There are two types of instance variables, named and indexed. They differ in terms of how they are declared and how they are accessed. A class may have only named instance variables, only indexed variables, or some of each.
❏ Named Instance Variables An implementation description includes a set of names for the instance variables that make up the individual instances. Each instance has one variable corresponding to each instance variable name. The variable declaration in the implementation description of FinancialHistory specified three instance variable names.
instance variable names | cashOnHand incomes expenditures |
An instance of FinancialHistory uses two dictionaries to store the total amounts spent and received for various reasons, and uses another variable to keep track of the cash on hand.
- expenditures refers to a dictionary that associates spending reasons with amounts spent.
- incomes refers to a dictionary that associates income sources with amounts received.
- cashOnHand refers to a number representing the amount of money available.
When expressions in the methods of the class use one of the variable names incomes, expenditures, or cashOnHand, these expressions refer to the value of the corresponding instance variable in the instance that received the message.
When a new instance is created by sending a message to a class, it has a new set of instance variables. The instance variables are initialized as specified in the method associated with the instance creation message. The default initialization method gives each instance variable a value of nil.
For example, in order for the previous example messages to HouseholdFinances to work, an expression such as the following must have been evaluated.
HouseholdFinances ← FinancialHistory new initialBalance: 350
FinancialHistory new creates a new object whose three instance variables all refer to nil. The initialBalance: message to that new instance gives the three instance variables more appropriate initial values.
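To see that each instance has its own set of instance variables, two instances can be created and only one of them changed. This is a sketch using the FinancialHistory methods given earlier; the temporary variable names home and office are invented for the example.

```smalltalk
| home office |
home ← FinancialHistory new initialBalance: 350.
office ← FinancialHistory new initialBalance: 20.
home spend: 50 for: 'food'.
home cashOnHand.      "300"
office cashOnHand     "20, unaffected by the message sent to home"
```

Both instances share the same methods, but the name cashOnHand inside those methods refers to a different variable depending on which instance received the message.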
❏ Indexed Instance Variables Instances of some classes can have instance variables that are not accessed by names. These are called indexed instance variables. Instead of being referred to by name, indexed instance variables are referred to by messages that include integers, called indices, as arguments. Since indexing is a form of association, the two fundamental indexing messages have the same selectors as the association messages to dictionaries--at: and at:put:. For example, instances of Array have indexed variables. If names is an instance of Array, the expression
names at: 1
returns the value of its first indexed instance variable. The expression
names at: 4 put: 'Adele'
stores the string 'Adele' as the value of the fourth indexed instance variable of names. The legal indices run from one to the number of indexed variables in the instance.
If the instances of a class have indexed instance variables, its variable declaration will include the line indexed instance variables. For example, part of the implementation description for the system class Array is
class name | Array |
indexed instance variables |
Each instance of a class that allows indexed instance variables may have a different number of them. All instances of FinancialHistory have three instance variables, but instances of Array may have any number of instance variables.
A class whose instances have indexed instance variables can also have named instance variables. All instances of such a class will have the same number of named instance variables, but may have different numbers of indexed variables. For example, a system class representing a collection whose elements are ordered, OrderedCollection, has indexed instance variables to hold its contents. An OrderedCollection might have more space for storing elements than is currently being used. The two named instance variables remember the indices of the first and last element of the contents.
class name | OrderedCollection |
instance variable names | firstIndex lastIndex |
indexed instance variables |
All instances of OrderedCollection will have two named variables, but one may have five indexed instance variables, another 15, another 18, and so on.
The named instance variables of an instance of FinancialHistory are private in the sense that access to the values of the variables is controlled by the instance. A class may or may not include messages giving direct access to the instance variables. Indexed instance variables are not private in this sense, since direct access to the values of the variables is available by sending messages with selectors at: and at:put:. Since these messages are the only way to access indexed instance variables, they must be provided.
Classes with indexed instance variables create new instances with the message new: instead of the usual message new. The argument of new: tells the number of indexed variables to be provided.
list ← Array new: 10
creates an Array of 10 elements, each of which is initially the special object nil. The number of indexed instance variables of an instance can be found by sending it the message size. The response to the message size
list size
is, for this example, the integer 10.
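The following sketch puts these indexing messages together for the Array created above. The exact form of the error report for an out-of-range index depends on the system, so it is described here only loosely.

```smalltalk
| list |
list ← Array new: 10.
list at: 1.                  "nil, the initial value of every indexed variable"
list at: 10 put: 'last'.     "stores into the last legal index"
list at: 10.                 "'last'"
list size.                   "10"
list at: 11                  "reports an error, since legal indices run from 1 to 10"
```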
Evaluating each of the following expressions, in order,
list ← Array new: 3.
list at: 1 put: 'one'.
list at: 2 put: 'two'.
list at: 3 put: 'three'
is equivalent to the single expression
list ← #('one' 'two' 'three')
Shared Variables
Variables that are shared by more than one object come in groups called pools. Each class has two or more pools whose variables can be accessed by its instances. One pool is shared by all classes and contains the global variables; this pool is named Smalltalk. Each class also has a pool which is only available to its instances and contains the class variables.
Besides these two mandatory pools, a class may access some other special purpose pools shared by several classes. For example, there are several classes in the system that represent textual information; these classes need to share the ASCII character codes for characters that are not easily indicated visually, such as a carriage return, tab, or space. These numbers are included as variables in a pool named TextConstants that is shared by the classes implementing text display and text editing.
If FinancialHistory had a class variable named SalesTaxRate and shared a pool dictionary whose name is FinancialConstants, the declaration would be expressed as follows.
instance variable names | cashOnHand incomes expenditures |
class variable names | SalesTaxRate |
shared pools | FinancialConstants |
SalesTaxRate is the name of a class variable, so it can be used in any methods in the class. FinancialConstants, on the other hand, is the name of a pool; it is the variables in the pool that can be used in expressions.
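As an illustration, a method in FinancialHistory could refer to the class variable directly by name. The selector spend:plusTaxFor: below is hypothetical and is not part of the class as described in this chapter; the sketch also assumes that SalesTaxRate has already been given a numeric value, which this chapter does not show how to do. The left arrow used elsewhere in this book is typed := in most current dialects.

```smalltalk
spend: amount plusTaxFor: reason
    "Hypothetical method: record an expenditure of amount plus sales tax.
     SalesTaxRate is a class variable, so every instance sees the same value."
    self spend: amount + (amount * SalesTaxRate) for: reason
```

Because SalesTaxRate is shared, changing its value once changes the result of this method for every instance of the class.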
In order to declare a variable to be global (known to all classes and to the user's interactive system), the variable name must be inserted as a key in the dictionary Smalltalk. For example, to make AllHistories global, evaluate the expression
Smalltalk at: #AllHistories put: nil
Then use an assignment statement to set the value of AllHistories.
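The two steps might look as follows. This is a sketch assuming the FinancialHistory class from this chapter; the first expression is evaluated on its own so that the name AllHistories is already declared when the remaining expressions are compiled.

```smalltalk
"Evaluate this expression first, so that the name is declared."
Smalltalk at: #AllHistories put: nil.

"Then evaluate the rest; AllHistories can now be used like any other shared variable."
AllHistories ← OrderedCollection new.
AllHistories add: (FinancialHistory new initialBalance: 350).
AllHistories size                 "1"
```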
Methods
A method describes how an object will perform one of its operations. A method is made up of a message pattern and a sequence of expressions separated by periods. The example method shown below describes the response of a FinancialHistory to messages informing it of expenditures.
spend: amount for: reason
    expenditures at: reason
        put: (self totalSpentFor: reason) + amount.
    cashOnHand ← cashOnHand - amount
The message pattern, spend: amount for: reason, indicates that this method will be used in response to all messages with selector spend:for:. The first expression in the body of this method adds the new amount to the amount already spent for the reason indicated. The second expression is an assignment that decrements the value of cashOnHand by the new amount.
Argument Names
Message patterns were introduced earlier in this chapter. A message pattern contains a message selector and a set of argument names, one for each argument that a message with that selector would have. A message pattern matches any messages that have the same selector. A class will have only one method with a given selector in its message pattern. When a message is sent, the method with matching message pattern is selected from the class of the receiver. The expressions in the selected method are evaluated one after another. After all the expressions are evaluated, a value is returned to the sender of the message.
The argument names found in a method's message pattern are pseudo-variable names referring to the arguments of the actual message. If the method shown above were invoked by the expression
HouseholdFinances spend: 30.45 for: 'food'
the pseudo-variable name amount would refer to the number 30.45 and the pseudo-variable name reason would refer to the string 'food' during the evaluation of the expressions in the method. If the same method were invoked by the expression
HouseholdFinances spend: cost + tax for: 'food'
cost would be sent the message + tax and the value it returned would be referred to as amount in the method. If cost referred to 100 and tax to 6.5, the value of amount would be 106.5.
Since argument names are pseudo-variable names, they can be used to access values like variable names, but their values cannot be changed by assignment. In the method for spend:for:, a statement of the form
amount ← amount * taxRate
would be syntactically illegal since the value of amount cannot be reassigned.
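When a method needs a modified copy of an argument, the usual remedy is to compute the new value into another variable and leave the argument untouched. The sketch below uses a hypothetical selector, spend:taxedAt:for:, and a temporary variable named total (temporary variables are described later in this chapter); neither is part of FinancialHistory as presented here.

```smalltalk
spend: amount taxedAt: taxRate for: reason
    "Hypothetical method: record amount plus tax without assigning to an argument name."
    | total |
    total ← amount + (amount * taxRate).
    self spend: total for: reason
```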
Returning Values
The method for spend:for: does not specify what the value of the message should be. Therefore, the default value, the receiver itself, will be returned. When another value is to be specified, one or more return expressions are included in the method. Any expression can be turned into a return expression by preceding it with an uparrow (↑). The value of a variable may be returned as in
↑ cashOnHand
The value of another message can be returned as in
↑ expenditures at: reason
A literal object can be returned as in
↑ 0
Even an assignment statement can be turned into a return expression, as in
↑ initialIndex ← 0
The assignment is performed first. The new value of the variable is then returned.
An example of the use of a return expression is the following implementation of totalSpentFor:.
totalSpentFor: reason
    (expenditures includesKey: reason)
        ifTrue: [↑ expenditures at: reason]
        ifFalse: [↑ 0]
This method consists of a single conditional expression. If the expenditure reason is in expenditures, the associated value is returned; otherwise, zero is returned.
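The difference between the default return value and an explicit return can be seen from the sender's side. The expressions below are a sketch using the FinancialHistory methods from this chapter; the temporary variable history is invented for the example.

```smalltalk
| history |
history ← FinancialHistory new initialBalance: 100.
(history spend: 30 for: 'food') cashOnHand.   "70; spend:for: returned its receiver, which then received cashOnHand"
history totalSpentFor: 'food'.                 "30, returned by the ifTrue: branch"
history totalSpentFor: 'rent'                  "0, returned by the ifFalse: branch"
```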
The Pseudo-variable self
Along with the pseudo-variables used to refer to the arguments of a message, all methods have access to a pseudo-variable named self that refers to the message receiver itself. For example, in the method for spend:for:, the message totalSpentFor: is sent to the receiver of the spend:for: message.
spend: amount for: reason
    expenditures at: reason
        put: (self totalSpentFor: reason) + amount.
    cashOnHand ← cashOnHand - amount
When this method is executed, the first thing that happens is that totalSpentFor: is sent to the same object (self) that received spend:for:. The result of that message is sent the message + amount, and the result of that message is used as the second argument to at:put:.
The pseudo-variable self can be used to implement recursive functions. For example, the message factorial is understood by integers in order to compute the appropriate function. The method associated with factorial is
factorial
    self = 0 ifTrue: [↑ 1].
    self < 0
        ifTrue: [self error: 'factorial invalid']
        ifFalse: [↑ self * (self - 1) factorial]
The receiver is an Integer. The first expression tests to see if the receiver is 0 and, if it is, returns 1. The second expression tests the sign of the receiver because, if it is less than 0, the programmer should be notified of an error (all objects respond to the message error: with a report that an error has been encountered). If the receiver is greater than 0, then the value to be returned is
self * (self - 1) factorial
The value returned is the receiver multiplied by the factorial of one less than the receiver.
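A few sample messages show how the recursion unwinds. This sketch assumes the method exactly as given above; the factorial method in a particular running system may be written differently and may word its error report differently.

```smalltalk
0 factorial.          "1, from the first expression of the method"
3 factorial.          "6, computed as 3 * (2 * (1 * 1))"
(3 - 4) factorial     "the receiver is -1, so error: is sent and no product is computed"
```

In the expression self * (self - 1) factorial, the unary message factorial is sent before the binary message *, so the recursive result is available as the argument of the multiplication.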
Temporary Variables
The argument names and self are available only during a single execution of a method. In addition to these pseudo-variable names, a method may obtain some other variables for use during its execution. These are called temporary variables. Temporary variables are indicated by including a temporary variable declaration between the message pattern and the expressions of a method. A temporary declaration consists of a set of variable names between vertical bars. The method for spend:for: could be rewritten to use a temporary variable to hold the previous expenditures.
spend: amount for: reason
    | previousExpenditures |
    previousExpenditures ← self totalSpentFor: reason.
    expenditures at: reason
        put: previousExpenditures + amount.
    cashOnHand ← cashOnHand - amount
The values of temporary variables are accessible only to statements in the method and are forgotten when the method completes execution. All temporary variables initially refer to nil.
In the interactive Smalltalk-80 system, the programmer can test algorithms that make use of temporary variables. The test can be carried out by using the vertical bar notation to declare the variables for the duration of the immediate evaluation only. Suppose the expressions to be tried out include reference to the variable list. If the variable list is undeclared, an attempt to evaluate the expressions will create a syntax error message. Instead, the programmer can declare list as a temporary variable by prefixing the expressions with the declaration | list |. The expressions are separated by periods, as in the syntax of a method.
| list |
list ← Array new: 3.
list at: 1 put: 'one'.
list at: 2 put: 'four'.
list printString
The programmer interactively selects all five lines--the declaration and the expressions--and requests evaluation. The variable list is available only during the single execution of the selection.
Primitive Methods
When an object receives a message, it typically just sends other messages, so where does something really happen? An object may change the value of its instance variables when it receives a message, which certainly qualifies as "something happening." But this hardly seems enough. In fact, it is not enough. All behavior in the system is invoked by messages; however, not all messages are responded to by executing Smalltalk-80 methods. There are about one hundred primitive methods that the Smalltalk-80 virtual machine knows how to perform. Examples of messages that invoke primitives are the + message to small integers, the at: message to objects with indexed instance variables, and the new and new: messages to classes. When 3 gets the message + 4, it does not execute a Smalltalk-80 method. A primitive method returns 7 as the value of the message. The complete set of primitive methods is included in the fourth part of this book, which describes the virtual machine. Methods that are implemented as primitive methods begin with an expression of the form
<primitive #>
where # is an integer indicating which primitive method will be followed. If the primitive fails to perform correctly, execution continues in the Smalltalk-80 method. The expression <primitive #> is followed by Smalltalk-80 expressions that handle failure situations.
Summary of Terminology
class | An object that describes the implementation of a set of similar objects. |
instance | One of the objects described by a class; it has memory and responds to messages. |
instance variable | A variable available to a single object for the entire lifetime of the object; instance variables can be named or indexed. |
protocol description | A description of a class in terms of its instances' public message protocol. |
implementation description | A description of a class in terms of its instances' private memory and the set of methods that describe how instances perform their operations. |
message pattern | A message selector and a set of argument names, one for each argument that a message with this selector must have. |
temporary variable | A variable created for a specific activity and available only for the duration of that activity. |
class variable | A variable shared by all the instances of a single class. |
global variable | A variable shared by all the instances of all classes. |
pool variable | A variable shared by the instances of a set of classes. |
Smalltalk | A pool shared by all classes that contains the global variables. |
method | A procedure describing how to perform one of an object's operations; it is made up of a message pattern, temporary variable declaration, and a sequence of expressions. A method is executed when a message matching its message pattern is sent to an instance of the class in which the method is found. |
argument name | Name of a pseudo-variable available to a method only for the duration of that method's execution; the values of the argument names are the arguments of the message that invoked the method. |
↑ | When used in a method, indicates that the value of the next expression is to be the value of the method. |
self | A pseudo-variable referring to the receiver of a message. |
message category | A group of methods in a class description. |
primitive method | An operation performed directly by the Smalltalk-80 virtual machine; it is not described as a sequence of Smalltalk-80 expressions. |