Smalltalk80LanguageImplementation:Chapter 13

Chapter 13 Implementation of the Basic Collection Protocol

Implementation of the Basic Collection Protocol

The protocol for the classes in the Collection hierarchy was presented in Chapters 9 and 10. This chapter presents the complete implementation of class Collection and the implementation of the basic protocol for instance creation, accessing, testing, adding, removing, and enumerating for each subclass of Collection. These implementations make effective use of a framework in class Collection that is refined in its subclasses. Messages in Collection are implemented in a very general way or as self subclassResponsibility. Messages are implemented as

self subclassResponsibility

if the method depends on the representation of the instances. Each subclass must override such messages to fulfill any "subclass responsibilities." Subclasses may override other messages, for efficiency purposes, with a new method that takes advantage of the representation. A subclass may implement some messages with

self shouldNotImplement

which results in a report that the message should not be sent to instances of the class. For example, SequenceableCollections cannot respond to remove:ifAbsent:; therefore the method is implemented as self shouldNotImplement.

Class Collection

❏ Collection instance creation protocol

In addition to the messages new and new:, an instance of a Collection can be created by sending any one of four messages made up of one, two, three, or four occurrences of the keyword with:. The messages new and new: are not reimplemented in Collection; they produce an instance that is an empty collection. Each of the other four instance creation methods is specified in Collection in a similar way. First an instance is created (with the expression self new) and then the arguments, in order, are added to the instance. The new instance is returned as the result. The instance is created using self new, rather than super new or self basicNew, because a subclass of Collection might reimplement the message new. Any subclass of Collection that represents fixed-size objects with indexed instance variables must reimplement the following instance creation messages since such a subclass cannot provide an implementation for new.

class name

Collection

superclass

Object

class methods

instance creation
    with: anObject
        | newCollection |
        newCollection ← self new.
        newCollection add: anObject.
        ↑newCollection
    with: firstObject with: secondObject
        | newCollection |
        newCollection ← self new.
        newCollection add: firstObject.
        newCollection add: secondObject.
        ↑newCollection
    with: firstObject with: secondObject with: thirdObject
        | newCollection |
        newCollection ← self new.
        newCollection add: firstObject.
        newCollection add: secondObject.
        newCollection add: thirdObject.
        ↑newCollection
    with: firstObject with: secondObject with: thirdObject with: fourthObject
        | newCollection |
        newCollection ← self new.
        newCollection add: firstObject.
        newCollection add: secondObject.
        newCollection add: thirdObject.
        newCollection add: fourthObject.
        ↑newCollection

The implementation of each of the instance creation messages depends on the ability of the newly-created instance to respond to the message add:. Class Collection cannot provide implementations of the following messages because they depend on the representation used by a subclass:

add: anObject
remove: anObject ifAbsent: aBlock
do: aBlock

All other messages in the basic collection protocol are implemented in terms of these three messages. Each subclass must implement the three basic messages; each can then reimplement any others in order to improve its performance.

❏ Collection adding protocol

The protocol for adding elements to a collection is implemented in class Collection as follows.

adding
    add: anObject
        self subclassResponsibility
    addAll: aCollection
        aCollection do: [ :each | self add: each ].
        ↑aCollection

Notice that the implementation of addAll: depends on both do: and add:. The order of adding elements from the argument, aCollection, depends on both the order in which the collection enumerates its elements (do:) and the manner in which the elements are included into this collection (add:).

❏ Collection removing protocol

The messages remove: and removeAll: are implemented in terms of the basic message remove:ifAbsent:, which must be provided in a subclass. These methods report an error if the element to be removed is not in the collection. The method remove:ifAbsent: can be used to specify different exception behavior.

removing
    remove: anObject ifAbsent: exceptionBlock
        self subclassResponsibility
    remove: anObject
        self remove: anObject ifAbsent: [self errorNotFound]
    removeAll: aCollection
        aCollection do: [ :each | self remove: each].
        ↑aCollection
private
    errorNotFound
        self error: 'Object is not in the collection'

As usual, the category private refers to messages introduced to support the implementations of other messages; it is not to be used by other objects. Most error messages that are used more than once will be specified as private messages in order to create the literal message string once only.

❏ Collection testing protocol

All the messages in the protocol for testing the status of a collection can be implemented in Collection.

testing
    isEmpty
        ↑self size = 0
    includes: anObject
        self do: [ :each | anObject = each ifTrue: [↑true]].
        ↑false
    occurrencesOf: anObject
        | tally |
        tally ← 0.
        self do: [ :each | anObject = each ifTrue: [tally ← tally + 1]].
        ↑tally

The implementations of includes: and occurrencesOf: depend on the subclass's implementation of the basic enumerating message do:. The block argument of do: in the method for includes: terminates as soon as an element equal to the argument is found. If no such element is found, the last expression (↑false) is evaluated. The response to isEmpty and includes: are Boolean objects, true or false. The message size is inherited from class Object, but is reimplemented in Collection because size, as defined in Object, is only nonzero for variable-length objects.

accessing
    size
        | tally |
        tally ← 0.
        self do: [ :each | tally ← tally + 1].
        ↑tally

This is a low-performance approach to computing the size of a collection which, as we shall see, is reimplemented in most of the subclasses.

❏ Collection enumerating protocol

An implementation of all of the messages that enumerate the elements of collections, except do:, can be provided in class Collection.

enumerating
    do: aBlock
        self subclassResponsibility
    collect: aBlock
        | newCollection |
        newCollection ← self species new.
        self do: [ :each | newCollection add: (aBlock value: each)].
        ↑newCollection
    detect: aBlock
        ↑self detect: aBlock ifNone: [self errorNotFound]
    detect: aBlock ifNone: exceptionBlock
        self do: [ :each | (aBlock value: each) ifTrue: [↑each]].
        ↑exceptionBlock value
    inject: thisValue into: binaryBlock
        | nextValue |
        nextValue ← thisValue.
        self do: [ :each | nextValue ← binaryBlock value: nextValue value: each].
        ↑nextValue
    reject: aBlock
        ↑self select: [ :element | (aBlock value: element) = = false]
    select: aBlock
        | newCollection |
        newCollection ← self species new.
        self do: [ :each | (aBlock value: each) ifTrue: [newCollection add: each]].
        ↑newCollection

In the methods associated with collect: and select:, the message species is sent to self. This message was not shown in Chapter 9 because it is not part of the external protocol of collections. It is categorized as private to indicate the intention for internal use only. The message is implemented in class Object as returning the class of the receiver.

private
    species
        ↑self class

Thus the expression

self species new

means "create a new instance of the same class as that of the receiver."

For some collections, it may not be appropriate to create a "similar" instance in this way; a new collection that is like it may not be an instance of its class. Such a collection will override the message species. In particular, an Interval responds that its species is Array (because it is not possible to modify an Interval); the species of a MappedCollection is the species of the collection it maps (since the MappedCollection is simply acting as an accessor for that collection).

If a collection cannot create an instance by simply sending the class the message new, it must reimplement messages collect: and select:. Since reject: is implemented in terms of select:, it need not be reimplemented.

The method for inject:into: evaluates the block argument once for each element in the receiver. The block is also provided with its own value from each previous evaluation; the initial value is provided as the argument of inject:. The final value of the block is returned as the value of the inject:into: message.

The reason for the introduction of two messages, detect: and detect:ifNone:, is similar to the reason for the two removing messages, remove: and remove:ifAbsent:. The general case (detect:) reports an error if no element meeting the detection criterion is found; the programmer can avoid this error report by specifying an alternative exception (detect: ifNone:).

❏ Collection converting protocol

The protocol for converting from any collection into a Bag, Set, OrderedCollection, or SortedCollection is implemented in a straightforward way--create a new instance of the target collection, then add to it each element of the receiver. In most cases, the new instance is the same size as the original collection. In the case of OrderedCollections, elements are added at the end of the sequence (addLast:), regardless of the order of enumerating from the source.

converting
    asBag
        | aBag |
        aBag ← Bag new.
        self do: [ :each | aBag add: each].
        ↑aBag
    asOrderedCollection
        | anOrderedCollection |
        anOrderedCollection ← OrderedCollection new: self size. 
        self do:[ :each | anOrderedCollection addLast: each]. 
        ↑anOrderedCollection
    asSet
        | aSet |
        aSet ← Set new: self size.
        self do: [ :each | aSet add: each].
        ↑aSet
    asSortedCollection
        | aSortedCollection |
        aSortedCollection ← SortedCollection new: self size.
        aSortedCollection addAll: self.
        ↑aSortedCollection
    asSortedCollection: aBlock
        | aSortedCollection |
        aSortedCollection ← SortedCollection new: self size.
        aSortedCollection sortBlock: aBlock.
        aSortedCollection addAll: self.
        ↑aSortedCollection

❏ Collection printing protocol

The implementations of the printOn: and storeOn: messages in Object are overridden in Collection. Collections print in the form

className (element element element )

Collections store themselves as an expression from which an equal collection can be constructed. This takes the form of

((className new))

or

((className new) add: element; yourself)

or

((className new) add: element; add: element; yourself)

with the appropriate number of cascaded messages for adding each element, depending on whether the collection has no, one, or more elements. The message yourself returns the receiver of the message. It is used in cascaded messages to guarantee that the result of the cascaded message is the receiver. All objects respond to the message yourself; it is defined in class Object.

The general methods for printing and storing are

printing
    printOn: aStream
        | tooMany |
        tooMany ← aStream position + self maxPrint.
        aStream nextPutAll: self class name, '('.
        self do:
            [ :element |
                aStream position > tooMany
                    ifTrue: [aStream nextPutAll: '...etc...). ↑self].
                element printOn: aStream.
                aStream space].
            aStream nextPut: $)
    storeOn: aStream
        | noneYet |
        aStream nextPutAll: '(('.
        aStream nextPutAll: self class name.
        aStream nextPutAll: 'new)'.
        noneYet ← true.
        self do:
            [ :each |
                noneYet
                    ifTrue: [noneYet ← false]
                    ifFalse: [aStream nextPut: $;].
                aStream nextPutAll: 'add:'.
                aStream store: each].
            noneYet i↑false: [aStream nextPutAll: '; yourself'].
            aStream nextPut: $)
private
    maxPrint
        ↑5000

These methods make use of instances of a kind of Stream that acts as an accessor for a String. The method printOn: sets a threshold for the length of the String to be created; a long collection may print as

className (element element ...etc... )

The threshold is determined as the response to the message maxPrint which is set at 5000 characters. Subclasses can override the private message maxPrint in order to modify the threshold.

Note that this technique of using a method rather than a variable is a way of providing a parameter in a method. A variable cannot be used as the parameter because the variable, to be accessible to all instances, would have to be a class variable. Subclasses cannot specify a class variable whose name is the same as a class variable in one of its superclasses; thus if a subclass wants to change the value of the variable, it will do so for instances of its superclass as well. This is not the desired effect.

The printing format is modified in several subclasses. Array does not print its class name; Intervals print using the shorthand notation of the messages to: and to:by: to a Number. A Symbol prints its characters (without the ~ prefix of the literal form of a Symbol); a String prints its characters delimited by single quotes.

The storeOn: message is reimplemented in ArrayedCollection and several of its subclasses because instances are created using new: anInteger rather than simply new. Arrays, Strings, and Symbols store in their literal forms. Intervals use the shorthand notation of messages to: and to:by:. MappedCollections store using the converting message mappedBy: that is sent to the collection that is indirectly accessed.

Subclasses of Collection

For each subclass of Collection, we show the methods that implement the three required messages (add:, remove:ifAbsent:, and do:) and the messages in the adding, removing, testing, and enumerating protocols that are reimplemented. New collection protocol for a particular subclass as specified in Chapter 9 will generally not be presented in this chapter.

Class Bag

Bag represents an unordered collection in which an element can appear more than once. Since the elements of Bags are unordered, the messages at: and at:put: are reimplemented to report an error.

Instances of Bag have an instance of Dictionary as a single instance variable named contents. Each unique element of a Bag is the key of an Association in contents; the value of an Association is an Integer representing the number of times the element appears in the Bag. Removing an element decrements the tally; when the tally falls below 1, the Association is removed from contents. Bag implements new, size, includes:, and occurrencesOf:. A new instance must initialize its instance variable to be a Dictionary. The reimplementation of size is made efficient by summing all the values of elements of contents. The arguments of the testing messages are used as keys of contents. In implementing includes:, the responsibility for checking is passed to contents. In order to answer the query occurrencesOf: anObject, the method checks that anObject is included as a key in contents and then looks up the value (the tally) associated with it.

class name	Bag
superclass	Collection
instance variable names	contents
class methods	instance creation new ↑super new setDictionary
instance methods	accessing at: index self errorNotKeyed at: index put: anObject self errorNotKeyed size \| tally \| tally ← 0 contents do: [ :each \| tally ← tally + each]. ↑tally testing includes: anObject ↑contents includesKey: anObject occurrencesOf: anObject (self includes: anObject) ifTrue: [↑contents at: anObject] ifFalse: [↑0] private setDictionary contents ← Dictionary new (in Collection) private errorNotKeyed self error: self class name, 's do not respond to keyed accessing messages'

To add an element is to add it once, but Bags can add multiple times. The implementation of add: calls on add:withOccurrences:. Removing an element checks the number of occurrences, decrementing the tally or removing the element as a key in contents if the tally is less than 1.

adding
    add: newObject
        ↑self add: newObject withOccurrences: 1
    add: newObject withOccurrences: anInteger
        contents at: newObject
            put: anInteger + (self occurrencesOf: newObject).
        ↑newObject
removing
    remove: oldObject ifAbsent: exceptionBlock
        | count |
        count ← self occurrencesOf: oldObject.
        count = 0 ifTrue: [↑exceptionBlock value].
        count = 1
            ifTrue: [contents removeKey: oldObject]
            ifFalse: [contents at: oldObject put: count - 1]].
        ↑oldObject

Enumerating the elements of a Bag means selecting each element of the Dictionary and evaluating a block with the key of that element (i.e., the actual Bag element is the key of the Dictionary). This has to be done multiple times, once for each occurrence of the element, as indicated by the value associated with the key.

enumerating
    do: aBlock
        contents associationsDo:
            [ :assoc | assoc value timesRepeat: [aBlock value: assoc key]]

Class Set

The elements of Sets are unordered like those of Bags, so the messages at: and at:put: produce an error report. A Set may not contain an element more than once, therefore, every insertion of an element must, in theory, check the entire collection. To avoid searching all elements, a Set determines where in its indexed instance variables to start a search for a particular element by using a hashing technique.

Each Set has an instance variable named tally. Maintaining this tally of the number of elements avoids the inefficiencies involved in determining the size of the Set by counting every non-nil element. Thus new, new:, and size are reimplemented; the first two in order to initialize the variable tally and the last simply to respond with the value of tally.

class name	Set
superclass	Collection
instance variable names	tally
class methods	instance creation new ↑self new: 2 new: anInteger ↑(super new: anInteger) setTally
instance methods	accessing at: index self errorNotKeyed at: index put: anObject self errorNotKeyed size ↑tally private setTally tally ← 0

In the method for new:, super is used in order to avoid recursion. A private message of Set, findElementOrNil:, hashes the argument to produce the index at which to begin the probe of the Set. The probe proceeds until the argument, anObject, is found, or until nil is encountered. The response is the index of the last position checked. Then the testing messages are implemented as

testing
    includes: anObject
        ↑(self basicAt: (self findElementOrNil: anObject)) ~~ nil
    occurrencesOf: anObject
        (self includes: anObject)
            ifTrue: [↑1]
            ifFalse: [↑0]

The number of occurrences of any element in the Set is never more than 1. The three basic messages must make use of basicAt: and basicAt:put: because Sets report an error if at: or at:put: is used.

adding
    add: newObject
        | index |
        newObject isNil ifTrue: [↑newObject]. 
        index ← self findElementOrNil: newObject. 
        (self basicAt: index)isNil
            ifTrue: [self basicAt: index put: newObject, tally ~ tally + 1]. 
        ↑newObject
removing
    remove: oldObject ifAbsent: aBlock
        | index |
        index ← self find: oldObject ifAbsent: [↑aBlock value]. 
        self basicAt: index put: nil.
        tally ← tally - 1.
        self fixCollisionsFrom: index.
        ↑oldObject
enumerating
    do: aBlock
        1 to: self basic Size do:
            [ :index |
                (self basicAt: index) isNil
                    ifFalse: [aBlock value: (self basicAt: index)]]

The private message find:ifAbsent: calls on findElementOrNil:; if the element, oldObject, is not found, the argument aBlock is evaluated. In order to guarantee that the hashing/probing technique works properly, remaining elements might need to be compacted whenever one is removed (fixCollisionsFrom:). These methods are good examples of when the accessing messages basicAt:, basicAt:put:, and basicSize must be used.

Class Dictionary

A Dictionary is a collection of Associations. Class Dictionary uses a hashing technique to locate its elements which is like that of its superclass, Set, but hashes on the keys in the Associations instead of on the Associations themselves. Most of the accessing messages for Dictionary are reimplemented to treat the values of the Associations as the elements, not the Associations themselves.

Dictionary implements at: and at:put:, but redefines the argument associated with the keyword at: to be any key in the Dictionary (not necessarily an Integer index). The argument of includes: is the value of one of the Associations in the Dictionary, not one of the Associations themselves. The message do: enumerates the values, not the Associations. The argument to remove: is also a value, but this is an inappropriate way to delete from a Dictionary because elements are referenced with keys. Either removeAssociation: or removeKey: should be used. Thus the messages remove: and remove:ifAbsent: should not be implemented for Dictionary.

Much of the work in the accessing protocol is done in private messages, either those inherited from Set or similar ones for finding a key (findKeyOrNil:).

class name

Dictionary

superclass

Set

instance methods

accessing
    at: key
        ↑self at: key ifAbsent: [self errorKeyNotFound]
    at: key put: anObject
        | index element |
        index ← self findKeyOrNil: key. 
        element ← self basicAt: index. 
        element isNil
            ifTrue:
                [self basicAt: index put: (Association key: key value: anObject).
                tally ← tally + 1]
                " element is an Association. The key already exists, change its value." 
            ifFalse:
                [element value: anObject]. 
        ↑anObject
    at: key ifAbsent: aBlock
        | index |
        index ← self findKey: key ifAbsent: [↑aBlock value].
        ↑(self basicAt: index) value

testing
    includes: anObject
        "Revert to the method used in Collection."
        self do: [ :each | anObject = each ifTrue: [↑true]].
        ↑false

adding
    add: anAssociation
        | index element |
        index ← self findKeyOrNil: anAssociation key. 
        element ← self basicAt: index.
        element isNil
            ifTrue: [self basicAt: index put: anAssociation. 
                tally ← tally + 1]
            ifFalse: [element value: anAssociation value]. 
        ↑anAssociation

removing
    remove: anObject ifAbsent: aBlock
        self shouldNotImplement

enumerating
    do: aBlock
        self associationsDo: [ :assoc | aBlock value: assoc value]

private
    errorKeyNotFound
        self error: 'key not found'

Notice the similarity between at:put: and add:. The difference is in the action taken if the element is not found--in the case of at:put:, a new Association is created and stored in the Dictionary; in the case of add:, the argument, anAssociation, is stored so that any shared reference to the Association is preserved.

The message collect: is reimplemented in order to avoid the problems of collecting possibly identical values into a Set which would result in throwing away duplications. The message select: is reimplemented in order to select Associations by applying their values as the arguments to the block.

enumerating
    collect: aBlock
        | newCollection |
        newCollection ← Bag new.
        self do: [:each I newCollection add: (aBlock value: each)]. 
        ↑newCollection
    select: aBlock
        | newCollection |
        newCollection ← self species new. 
        self associationsDo:
            [ :each |
                (aBlock value each value)ifTrue: [newCollection add: each]].
    ↑newCollection

IdentityDictionary overrides at:, at:put:, and add: in order to implement checking for identical keys instead of equal keys. An IdentityDictionary is implemented as two parallel ordered collections of keys and values, rather than as a single collection of Associations. Thus do: must also be reimplemented. The implementation is not shown.

SequenceableCollections

SequenceableCollection is the superclass for all collections whose elements are ordered. Of the messages we are examining, remove:ifAbsent: is specified as being inappropriate for SequenceableCollections in general, since the order of elements might have been externally specified and it is assumed that they should be removed in order. Because SequenceableCollections are ordered, elements are accessed using at:; the implementation is provided in class Object. The message do: is implemented by accessing each element at index 1 through the size of the collection. SequenceableCollections are created using the message new:. Therefore, collect: and select: must be reimplemented to create the new collection using new: rather than new. The methods for collect: and select: shown next use a WriteStream in order to access the new collection, and the message at: in order to access elements of the original collection.

class name

SequenceableCollection

superclass

Collection

instance methods

accessing
    size
        self subclassResponsibility

removing
    remove: oldObject ifAbsent: anExceptionBlock
        self shouldNotImplement

enumerating
    do: aBlock
        | index length |
        index ← 0.
        length ← self size.
            [(index ← index + 1) < = length]
                whileTrue: [aBlock value: (self at: index)]
    collect: aBlock
        | aStream index length |
        aStream ← WriteStream on: (self species new: self size). 
        index ← O.
        length ← self size.
        [(index ← index + 1) < = length]
            whileTrue: [aStream nextPut: (aBlock value: (self at: index))]. 
        ↑aStream contents
    select: aBlock
        | aStream index length |
        aStream ← WriteStream on: (self species new: self size). 
        index ← O.
        length ← self size.
        [(index ← index + 1) < = length]
            whileTrue:
                [(aBlock value: (self at: index))
                    ifTrue: [aStream nextPut: (self at: index)]]. 
        ↑aStream contents

Notice that size is declared as a subclass responsibility in SequenceableCollection. The method inherited from the superclass Collection uses do: to enumerate and thereby count each element. But the method for do: as specified in SequenceableCollection determines the limit for indexing by requesting the size of the collection. Therefore, size must be reimplemented in order not to be stated in terms of do:.

Subclasses of SequenceableCollection

❏ Class LinkedList

Elements of LinkedList are instances of Link or of one of its subclasses. Each LinkedList has two instance variables, a reference to the first and to the last elements. Adding an element is assumed to be interpreted as adding to the end (addLast:); the method for addLast: is to make the element the next link of the current last link. Removing an element means that the element's preceding link must reference the element's succeeding link (or nil). If the element to be removed is the first one, then its succeeding link becomes the first one.

class name	LinkedList
superclass	SequenceableCollection
instance variable names	firstLink lastLink
instance methods	accessing at: index \| count element size \| count ← 1. element ← self first. size ← self size. [count > size] whileFalse: [count = index ifTrue: [↑element] ifFalse: [count ← count + 1. element ← element nextLink]]. ↑self errorSubscriptBounds: index at: index put: element self error: 'Do not store into a LinkedList using at:put:' adding add: aLink ↑self addLast: aLink addLast: aLink self isEmpty ifTrue: [firstLink ← aLink] ifFalse: [lastLink nextLink: aLink]. lastLink ← aLink. ↑aLink removing remove: aLink ifAbsent: aBlock \| tempLink \| aLink = = firstLink ifTrue: [firstLink ~ aLink nextLink. aLink = = lastLink ifTrue: [lastLink ←nil]] ifFalse: [tempLink ← firstLink. [tempLink isNil ifTrue: [↑aBlock value]. tempLink nextLink = = aLink] whileFalse: [tempLink ← tempLink nextLink]. tempLink nextLink: aLink nextLink. aLink = = lastLink ifTrue: [lastLink ← tempLink]]. aLink nextLink: nil. ↑aLink enumerating do: aBlock \| aLink \| aLink ← firstLink. [aLink isNil] whileFalse: [aBlock value: aLink. aLink ← aLink nextLink]

A nil link signals the end of the LinkedList. Thus the enumerating message do: is implemented as a simple loop that continues until a nil is encountered in the collection.

❏ Class Interval

Intervals are SequenceableCollections whose elements are computed. Therefore, messages for adding and removing cannot be supported. Since elements are not explicitly stored, all accessing (at:, size, and do:) requires a computation. Each method checks to see if the last element computed is to be incremented (positive step) or decremented (negative step) in order to determine whether the limit (stop) has been reached.

class name	Interval
superclass	SequenceableCollection
instance variable names	start stop step
class methods	instance creation from: startInteger to: stopInteger ↑self new setFrom: startInteger to: stopInteger by: 1 from: startInteger to: stopInteger by: stepInteger ↑self new setFrom: startInteger to: stopInteger by: stepInteger
instance methods	accessing size step < 0 ifTrue: [start < stop ifTrue: [↑O] ifFalse: [↑stop -- start // step + 1]] ifFalse: [stop < start ifTrue: [↑O] ifFalse: [↑stop -- start // step + 1]] at: index (index > = 1 and: [index < = self size]) ifTrue: [↑start + (step * (index - 1))] ifFalse: [self errorSubscriptBounds: index] at: index put: anObject self error: 'you cannot store into an Interval' adding add: newObject self error: 'elements cannot be added to an Interval' removing remove: newObject self error: 'elements cannot be removed from an Interval' enumerating do: aBlock \| aValue \| aValue ← start. step < 0 ifTrue: [[stop < = aValue] whileTrue: [aBlock value: aValue. aValue ← aValue + step]] ifFalse: [[stop > = aValue] whileTrue: [aBlock value: aValue. aValue ← aValue + step]] collect: aBlock \| nextValue i result \| result ← self species new: self size. nextValue ← start. i ← 1. step < 0 ifTrue: [[stop < = nextValue] whileTrue: [result at: i put: (aBlock value: nextValue). nextValue ← nextValue + step. i ← i + 1]] ifFalse: [[stop > = nextValue] whileTrue: [result at: i put: (aBlock value: nextValue). nextValue ← nextValue + step. i ← i + 1]]. ↑result private setFrom: startInteger to: stopInteger by: stepInteger start ← startInteger. stop ← stopInteger. step ← stepInteger

❏ ArrayedCollections--Array, ByteArray, String, Text, and Symbol ArrayedCollection is a subclass of SequenceableCollection; each ArrayedCollection is a variable-length object. All instance creation methods are reimplemented to use new:, not new. ArrayedCollections are fixed-length so add: is disallowed; in its superclass, remove: was already disallowed and do: was implemented. only size, therefore, is implemented in ArrayedCollection--it is a system primitive that reports the number of indexed instance variables.

Of the subclasses of ArrayedCollection, Array, and ByteArray do not reimplement any of the messages we are examining in this chapter. Accessing messages for String--at:, at:put:, and size--are system primitives; in Text, all accessing messages are passed as messages to the instance variable string (which is an instance of String). Symbol disallows at:put: and returns String as its species.

❏ OrderedCollections and SortedCollections

OrderedCollection stores an ordered, contiguous sequence of elements. Since OrderedCollections are expandable, some efficiency is gained by allocating extra space for the sequence. Two instance variables, firstIndex and lastIndex, point to the first and the last actual elements in the sequence.

The index into OrderedCollection is converted to be within the range of firstIndex to lastIndex for accessing messages (at: and at:put:) and the size is simply one more than the difference between the two indices. Adding an element is interpreted to be adding to the end; if there is no room at the end, the collection is copied with additional space allocated (makeRoomAtLast is the private message that does this work). The actual location for storing an element is the computed index position after lastIndex. If an element is removed, then the remaining elements must be moved up so that elements remain contiguous (removeIndex:).

class name	OrderedCollection
superclass	SequenceableCollection
instance variable names	firstIndex lastIndex
class methods	instance creation new ↑self new: 10 new: anInteger ↑(super new: anInteger) setIndices
instance methods	accessing size ↑lastIndex - firstIndex + 1 at: anInteger (anInteger < 1 or: [anInteger + firstIndex - 1 > lastIndex]) ifTrue: [self errorNoSuchElement] ifFalse: [↑super at: anInteger + firstIndex - 1] at: anInteger put: anObject \| index \| index ← anInteger truncated. (index < 1 or: [index + firstIndex - 1 > lastIndex]) ifTrue: [self errorNoSuchElement] ifFalse: [↑super at: index + firstIndex - 1 put: anObject] adding add: newObject ↑self addLast: aLink addLast: newObject lastIndex = self basicSize ifTrue: [self makeRoomAtLast]. lastIndex ← lastIndex + 1. self basicAt: lastIndex put: newObject. ↑newObject removing remove: oldObject ifAbsent: absentBlock \| index \| index ← firstIndex. [index < = lastIndex] whileTrue: [oldObject = (self basicAt: index) ifTrue: [self removeIndex: index. ↑oldObject] ifFalse: [index ← index + 1]]. ↑absentBlock value private setIndices firstIndex ← self basicSize // 2 max: 1. lastIndex ← firstIndex - 1 max: 0 errorNoSuchElement self error: 'attempt to index non-existent element in an ordered collection'

The enumerating messages do:, collect:, and select: are each reimplemented--do: in order to provide better performance than the method provided in SequenceableCollection.

enumerating
    do: aBlock
        | index |
        index ← firstIndex. 
        [index < = lastIndex]
            whileTrue:
                [aBlock value: (self basicAt: index).
                index ← index + 1]
    collect: aBlock
        | newCollection |
        newCollection ← self species new.
        self do: [:each | newCollection add: (aBlock value: each)]. 
        ↑newCollection
    select: aBlock
        | newCollection |
        newCollection ← self copyEmpty.
        self do: [ :each | (aBlock value: each) ifTrue: [newCollection add: each]].
        ↑newCollection

In the method for select:, the new collection is created by sending the original collection the message copyEmpty. This message creates a new collection with enough space allocated to hold all the elements of the original, although all the elements might not be stored. In this way, time taken in expanding the new collection is avoided.

SortedCollection is a subclass of OrderedCollection. The message at:put: reports an error, requesting the programmer to use add:; add: inserts the new element according to the value of the instance variable sortBlock. The determination of the position for insertion is done as a "bubble sort." collect: is also reimplemented to create an OrderedCollection rather than a SortedCollection for collecting the values of the block. The code is not shown; a bubble sort looks the same in Smalltalk-80 as it would in most programming languages.

Class MappedCollection

Instances of MappedCollection have two instance variables--domain and map. The value of domain is either a Dictionary or a SequenceableCollection; its elements are accessed indirectly through map. The message add: is disallowed. Both at: and at:put: are reimplemented in MappedCollection in order to support the indirect access from map to the elements of domain. The size of a MappedCollection is the size of its domain.

class name	MappedCollection
superclass	Collection
instance variable names	domain map
class methods	instance creation collection: domainCollection map: mapCollection ↑super new setCollection: domainCollection map: mapCollection new self error: 'use collection:map: to create a Mapped Collection'
instance methods	accessing at: anIndex ↑domain at: (map at: anIndex) at: anIndex put: anObject ↑domain at: (map at: anIndex) put: anObject size ↑map size adding add: newObject self shouldNotImplement enumerating do: aBlock map do: [ :mapValue \| aBlock value: (domain at: mapValue)] collect: aBlock \| aStream \| aStream ← WriteStream on: (self species new: self size). self do: [ :domainValue \| aStream nextPut: (aBlock value: domainValue)]. ↑aStream contents select: aBlock \| aStream \| aStream ← WriteStream on: (self species new: self size). self do: [ :domainValue \| (aBlock value: domainValue) ifTrue: [aStream nextPut: domainValue]]. ↑aStream contents private setCollection: domainCollection map: mapCollection domain ← domainCollection. map ← mapCollection species ↑domain species

Notes

Smalltalk80LanguageImplementation:Chapter 13

Contents

Implementation of the Basic Collection Protocol

Class Collection

Subclasses of Collection

Class Bag

Class Set

Class Dictionary

SequenceableCollections

Subclasses of SequenceableCollection

Class MappedCollection

Notes

Navigation menu

Smalltalk80LanguageImplementation:Chapter 13

Implementation of the Basic Collection Protocol

Class Collection

Subclasses of Collection

Class Bag

Class Set

Class Dictionary

SequenceableCollections

Subclasses of SequenceableCollection

Class MappedCollection

Notes

Navigation menu

Search