- Chapter 9 Protocol for All Collection Classes
Protocol for All Collection Classes
Object Magnitude Character Date Time Number Float Fraction Integer LargeNegativeInteger LargePositiveInteger SmallInteger LookupKey Association Link Process Collection*** SequenceableCollection LinkedList Semaphore ArrayedCollection Array Bitmap DisplayBitmap RunArray String Symbol Text ByteArray Interval OrderedCollection SortedCollection Bag MappedCollection Set Dictionary IdentifyDictionary Stream PositionableStream ReadStream WriteStream ReadWriteStream ExternalStream FileStream Random File FileDirectory FilePage UndefinedObject Boolean False True ProcessorScheduler Delay SharedQueue Behavior ClassDescription Class MetaClass Point Rectangle BitBit CharacterScanner Pen DisplayObject DisplayMedium Form Cursor DisplayScreen InfiniteForm OpaqueForm Path Arc Circle Curve Line LinearFit Spline
A collection represents a group of objects. These objects are called the elements of the collection. For example, an Array is a collection. The Array
#('word' 3 5 $G (1 2 3))
is a collection of five elements. The first one is a String, the second and third are SmallIntegers, the fourth element is a Character, and the fifth is itself an Array. The first element, the String, is also a collection; in this case, it is a collection of four Characters.
Collections provide basic data structures for programming in the Smalltalk-80 system. Elements of some of the collections are unordered and elements of other collections are ordered. Of the collections with unordered elements, Bags allow duplicate elements and Sets do not allow duplication. There are also Dictionaries that associate pairs of objects. Of the collections with ordered elements, some have the order specified externally when the elements are added (OrderedCollections, Arrays, Strings) and others determine the order based on the elements themselves (SortedCollections). For example, the common data structures of arrays and strings are provided by classes that associate integer indices and elements and that have external ordering corresponding to the ordering of the indices.
This chapter introduces the protocol shared by all collections. Each message described in this chapter is understood by any kind of collection, unless that collection specifically disallows it. Descriptions of each kind of collection are provided in the next chapter.
Collections support four categories of messages for accessing elements:
- Messages for adding new elements
- messages for removing elements
- messages for testing occurrences of elements
- messages for enumerating elements
A single element or several elements can be added or removed from a collection. It is possible to test whether a collection is empty or whether it includes a particular element. It is also possible to determine the number of times a particular element occurs in the collection. Enumeration allows one to access the elements without removing them from the collection.
Adding, Removing, and Testing Elements
The basic protocol for collections is specified by the superclass of all collection classes, named Collection. Class Collection is a subclass of class Object. The protocol for adding, removing, and testing elements follows.
|add: newObject||Include the argument, newObject, as one of the receiver's elements. Answer newObject.|
|addAll: aCollection||Include all the elements of the argument, aCollection, as the receiver's elements. Answer aCollection.|
|remove: oldObject||Remove the argument, oldObject, from the receiver's elements. Answer oldObject unless no element is equal to oldObject, in which case, report that an error occurred.|
|remove: oldObject ifAbsent: anExceptionBlock||Remove the argument, oldObject, from the receiver's elements. If several of the elements are equal to oldObject, only one is removed. If no element is equal to oldObject, answer the result of evaluating anExceptionBlock. Otherwise, answer oldObject.|
|removeAll: aCollection||Remove each element of the argument, aCollection, from the receiver. If successful for each, answer aCollection. Otherwise report that an error occurred.|
|includes: anObject||Answer whether the argument, anObject, is equal to one of the receiver's elements.|
|isEmpty||Answer whether the receiver contains any elements.|
|occurrencesOf: anObject||Answer how many of the receiver's elements are equal to the argument, anObject.|
|Collection instance protocol|
In order to demonstrate the use of these messages, we introduce the collection lotteryA
(272 572 852 156)
and the collection lotteryB
(572 621 274)
We will assume that these two collections, representing numbers drawn in a lottery, are instances of Bag, a subclass of Collection. Collection itself is abstract in the sense that it describes protocol for all collections. Collection does not provide sufficient representation for storing elements and so it is not possible to provide implementations in Collection of all of its messages. Because of this incompleteness in the definition of Collection, it is not useful to create instances of Collection. Bag is concrete in the sense that it provides a representation for storing elements and implementations of the messages not implementable in its superclass.
All collections respond to size in order to answer the number of their elements. So we can determine that
is 4 and
is 3. Then, evaluating the messages in order, we have
|expression||result||lotteryA if it changed|
|lotteryA includes: 572||true|
|lotteryA add: 596||596||Bag (272 572 852 156 596)|
|lotteryA addAll: lotteryB||Bag(572 621 274)||Bag (272 274 852 156 596 572 572 621)|
|lotteryA occurrencesOf: 572||2|
|lotteryA remove: 572||572||Bag(272 274 852 156 596 572 621)|
|lotteryA removeAll: lotteryB||Bag(572 621 274)||Bag(272 852 596 156)|
Note that the add: and remove: messages answer the argument rather than the collection itself so that computed arguments can be accessed. The message remove: deletes only one occurrence of the argument, not all occurrences.
Blocks were introduced in Chapter 2. The message remove: oldObject ifAbsent: anExceptionBlock makes use of a block in order to specify the behavior of the collection if an error should occur. The argument anExceptionBlock is evaluated if the object referred to by oldObject is not an element of the collection. This block can contain code to deal with the error or simply to ignore it. For example, the expression
lotteryA remove: 121 ifAbsent: 
does nothing when it is determined that 121 is not an element of lotteryA.
The default behavior of the message remove: is to report the error by sending the collection the message error: 'object is not in the collection'. (Recall that the message error: is specified in the protocol for all objects and is therefore understood by any collection.)
Included in the instance protocol of collections are several enumeration messages that support the ability to list the elements of a collection and to supply each element in the evaluation of a block. The basic enumeration message is do: aBlock. It takes a one-argument block as its argument and evaluates the block once for each of the elements of the collection. As an example, suppose letters is a collection of Characters and we want to know how many of the Characters are a or A.
count ← 0. letters do: [:each | each asLowercase = = $a ifTrue: [count ← count + 1]]
That is, increment the counter, count, by 1 for each element that is an upper or lowercase a. The desired result is the final value of count. We can use the equivalence test (= =) rather than equality since objects representing Characters are unique.
Six enhancements of the basic enumeration messages are specified in the protocol for all collections. The description of these enumeration messages indicates that "a new collection like the receiver" is created for gathering the resulting information. This phrase means that the new collection is an instance of the same class as that of the receiver. For example, if the receiver of the message select: is a Set or an Array, then the response is a new Set or Array, respectively. In the Smalltalk-80 system, the only exception is in the implementation of class Interval, which returns a new OrderedCollection, not a new Interval, from these enumeration messages. The reason for this exception is that the elements of an Interval are created when the Interval is first created; it is not possible to store elements into an existing Interval.
|do: aBlock||Evaluate the argument, aBlock, for each of the receiver's elements.|
|select: aBlock||Evaluate the argument, aBlock, for each of the receiver's elements. Collect into a new collection like that of the receiver, only those elements for which aBlock evaluates to true. Answer the new collection.|
|reject: aBlock||Evaluate the argument, aBlock, for each of the receiver's elements. Collect into a new collection like that of the receiver only those elements for which aBlock evaluates to false. Answer the new collection.|
|collect: aBlock||Evaluate the argument, aBlock, for each of the receiver's elements. Answer a new collection like that of the receiver containing the values returned by the block on each evaluation.|
|detect: aBlock||Evaluate the argument, aBlock, for each of the receiver's elements. Answer the first element for which aBlock evaluates to true. If none evaluates to true, report an error.|
|detect: aBlock ifNone: exceptionBlock||Evaluate the argument, aBlock, for each of the receiver's elements. Answer the first element for which aBlock evaluates to true. If none evaluates to true, evaluate the argument, exceptionBlock, exceptionBlock must be a block requiring no arguments.|
|inject: thisValue into: binaryBlock||Evaluate the argument, binaryBlock, once for each element in the receiver. The block has two arguments: the second is an element from the receiver; the first is the value of the previous evaluation of the block, starting with the argument, thisValue. Answer the final value of the block.|
|Collection instance protocol|
Each enumeration message provides a concise way to express a sequence of messages for testing or gathering information about the elements of a collection.
Selecting and Rejecting
We could have determined the number of occurrences of the character a or A using the message select:.
(letters select: [:each | each asLowercase = = $a]) size
That is, create a collection containing only those elements of letters that are a or A, and then answer the size of the resulting collection. We could also have determined the number of occurrences of the character a or A using the message reject:.
(letters reject: [ :each | each asLowercase ~~ $a]) size
That is, create a collection by eliminating those elements of letters that are not a or A, and then answer the size of the resulting collection.
The choice between select: and reject: should be based on the best expression of the test. If the selection test is best expressed in terms of acceptance, then select: is easier to use; if the selection test is best expressed in terms of rejection, then reject: is easier to use. In this example, select: would be preferred.
As another example, assume employees is a collection of workers, each of whom responds to the message salary with his or her gross earnings. To make a collection of all employees whose salary is at least $10,000, use
employees select: [ :each | each salary > = 10000]
employees reject: [ :each | each salary < 10000]
The resulting collections are the same. The choice of which message to use, select: or reject:, depends on the way the programmer wishes to express the criterion "at least $10,000."
Suppose we wish to create a new collection in which each element is the salary of each worker in the collection employees.
employees collect: [ :each | each salary]
The resulting collection is the same size as employees. Each of the elements of the new collection is the salary of the corresponding element of employees.
Suppose we wish to find one worker in the collection of employees whose salary is greater than $20,000. The expression
employees detect: [ :each | each salary > 20000]
will answer with that worker, if one exists. If none exists, then employees will be sent the message error: 'object is not in the collection'. Just as in the specification of the removing messages, the programmer has the option to specify the exception behavior for an unsuccessful detect:. The next expression answers one worker whose salary exceeds $20,000, or, if none exists, answers nil.
employees detect: [ :each | each salary > 20000] ifNone: [nil]
In the message inject:into:, the first argument is the initial value that takes part in determining the result; the second argument is a two-argument block. The first block argument names the variable that refers to the result; the second block argument refers to each element of the collection. An example using this message sums the salaries of the workers in the collection employees.
employees inject: 0 into: [ :subTotal :nextWorker | subTotal + nextWorker salary]
where the initial value of 0 increases by the value of the salary for each worker in the collection, employees. The result is the final value of subTotal.
By using the message inject:into:, the programmer can locally specify temporary variable names and can avoid separate initialization of the object into which the result is accumulated. For example, in an earlier expression that counted the number of occurrences of the Characters a and A in the collection letters, we used a counter, count.
count ← 0. letters do: [ :each | each asLowercase = = $a ifTrue: [count ← count + 1]]
An alternative approach is to use the message inject:into:. In the example expression, the result is accumulated in count, count starts at 0. If the next character (nextElement) is a or A, then add 1 to count; otherwise add 0.
letters inject: 0 into: [ :count :nextElement | count + (nextElement asLowerCase = = $a ifTrue:  ifFalse: )]
In the beginning of this chapter, examples were given in which new collections were expressed as literals. These collections were Arrays and Strings. For example, an expression for creating an array is
#('first' 'second' 'third')
where each element is a String expressed literally.
The messages new and new: can be used to create instances of particular kinds of collections. In addition, the class protocol for all collections supports messages for creating instances with one, two, three, or four elements. These messages provide a shorthand notation for creating kinds of collections that are not expressible as literals.
|with: anObject||Answer an instance of the collection containing anObject.|
|with: firstObject with: secondObject||Answer an instance of the collection containing firstObject and secondObject as elements.|
|with: firstObject with: secondObject with: thirdObject||Answer an instance of the collection containing firstObject, secondObject, and thirdObject as elements.|
|with: firstObject with: secondObject with: thirdObject with: fourthObject||Answer an instance of the collection, containing firstObject, secondObject, thirdObject, and fourthObject as the elements.|
|Collection class protocol|
For example, Set is a subclass of Collection. To create a new Set with three elements that are the Characters s, e, and t, evaluate the expression
Set with: $s with: $e with: $t
Note that the rationale for providing these four instance creation messages, no more and no fewer, is that this number satisfies the uses to which collections are put in the system itself.
Conversion Among Collection Classes
A complete description and understanding of the permissible conversions between kinds of collections depends on a presentation of all the subclasses of Collection. Here we simply note that five messages are specified in the converting protocol for all collections in order to convert the receiver to a Bag, a Set, an OrderedCollection, and a SortedCollection. These messages are specified in class Collection because it is possible to convert any collection into any of these kinds of collections. The ordering of elements from any collection whose elements are unordered, when converted to a collection whose elements are ordered, is arbitrary.
|asBag||Answer a Bag whose elements are those of the receiver.|
|asSet||Answer a Set whose elements are those of the receiver (any duplications are therefore eliminated).|
|asOrderedCollection||Answer an OrderedCollection whose elements are those of the receiver (ordering is possibly arbitrary).|
|asSortedCollection||Answer a SortedCollection whose elements are those of the receiver, sorted so that each element is less than or equal to(< =) its successors.|
|asSortedCollection: aBlock||Answer a SortedCollection whose elements are those of the receiver, sorted according to the argument aBlock.|
|Collection instance protocol|
Thus if lotteryA is a Bag containing elements
272 572 852 156 596 272 572
is a Set containing elements
852 596 156 572 272
is a SortedCollection containing elements ordered (the first element is listed as the leftmost one)
156 272 272 572 572 596 852