- Chapter 12 Protocol for Streams
Protocol for Streams
Object Magnitude Character Date Time Number Float Fraction Integer LargeNegativeInteger LargePositiveInteger SmallInteger LookupKey Association Link Process Collection SequenceableCollection LinkedList Semaphore ArrayedCollection Array Bitmap DisplayBitmap RunArray String Symbol Text ByteArray Interval OrderedCollection SortedCollection Bag MappedCollection Set Dictionary IdentifyDictionary Stream*** PositionableStream*** ReadStream*** WriteStream*** ReadWriteStreamm*** ExternalStream*** FileStream*** Random File*** FileDirectory*** FilePage*** UndefinedObject Boolean False True ProcessorScheduler Delay SharedQueue Behavior ClassDescription Class MetaClass Point Rectangle BitBit CharacterScanner Pen DisplayObject DisplayMedium Form Cursor DisplayScreen InfiniteForm OpaqueForm Path Arc Circle Curve Line LinearFit Spline
The collection classes provide the basic data structure for storing objects together as nonlinear and linear groups. The protocol for these classes makes it possible directly to access (store and retrieve) individual elements. It also, through the enumeration messages, supports noninterruptible accessing of all of the elements, in order. However, it does not support intermingling the two kinds of accessing operations-enumerating and storing. Nor does the collection protocol support accessing individual elements, one at a time, unless an external position reference is separately maintained.
Unless an easily computed external name for each element exists, interruptible enumeration of individual elements can not be carried out efficiently. It is possible, for example, sequentially to read elements of an OrderedCollection using a combination of first and after:, as long as the elements of the collection are unique. An alternative approach involves the collection itself remembering, in some way, which element was last accessed. We call this the "position reference" in the discussion that follows. The possibility of shared access to a sequence of elements, however, means that it is necessary to maintain a separate, external memory of the last element accessed.
Class Stream represents the ability to maintain a position reference into a collection of objects. We use the phrase streaming over a collection to mean accessing the elements of a collection in such a way that it is possible to enumerate or store each element, one at a time, possibly intermingling these operations. By creating several Streams over the same collection, it is possible to maintain multiple position references into the same collection.
There are a number of ways to maintain a position reference for streaming over a collection. A common one is to use an integer as an index. This approach can be used for any collection whose elements are externally named by an integer. All SequenceableCollections fall into this category. As we shall describe later, such a Stream is represented in the Smalltalk-80 system by the class PositionableStream. A second way to maintain a position reference is to use a seed for a generator of objects. An example of this kind of Stream in the Smalltalk-80 system is class Random which was already presented in Chapter 8. And a third way is to maintain a non-numerical position reference, such as a reference to a node in a sequence; this approach is illustrated in this chapter by an example class that supports streaming over a linked list or a tree structure.
Class Stream, a subclass of class Object, is a superclass that specifies the accessing protocol for streaming over collections. Included in this protocol are messages for reading (retrieving) and writing (storing) into the collection, although not all the subclasses of class Stream can support both kinds of accessing operations. The basic reading message is next; its response is the next element in the collection that is referenced by the Stream. Given the ability to access the next element, more general reading messages can be supported. These are next: anInteger, which responds with a collection of anInteger number of elements; nextMatchFor: anObject, which reads the next element and answers whether it is equal to the argument, anObject; and contents, which answers a collection of all of the elements.
|next||Answer the next object accessible by the receiver.|
|next: anInteger||Answer the next anInteger number of objects accessible by the receiver. Generally, the answer will be a collection of the same class as the one accessed by the receiver.|
|nextMatchFor: anObject||Access the next object and answer whether it is equal to the argument, anObject.|
|contents||Answer all of the objects in the collection accessed by the receiver. Generally, the answer will be a collection of the same class as the one accessed by the receiver.|
|Stream instance protocol|
The basic writing message is nextPut: anObject; this means to store the argument, anObject, as the next element accessible by the receiver. If both read and write messages are possible, a next message following a nextPut: anElement does not read the element just stored, but rather the one after it in the collection. Writing messages also include nextPutAll: aCollection, which stores all of the elements in the argument into the collection accessed by the receiver, and next: anInteger put: anObject, which stores the argument, anObject, as the next anInteger number of elements.
|nextPut: anObject||Store the argument, anObject, as the next element accessible by the receiver. Answer anObject.|
|nextPutAll: aCollection||Store the elements in the argument, aCollection, as the next elements accessible by the receiver. Answer aCollection.|
|next: anInteger put: anObject||Store the argument, anObject, as the next anInteger number of elements accessible by the receiver. Answer anObject.|
|Stream instance protocol|
The reading and writing messages each determine if a next element can be read or written and, if not, an error is reported. The programmer might therefore wish to determine whether accessing is still feasible; this is accomplished by sending the Stream the message atEnd.
|atEnd||Answer whether the receiver cannot access any more objects.|
|Stream instance protocol|
Noninterrupted reading of elements that are applied as arguments to a block can be done by sending the message do: aBlock, similar to the enumerating message supported by the collection classes.
|do: aBlock||Evaluate the argument, aBlock, for each of the remaining elements that can be accessed by the receiver.|
|Stream instance protocol|
The implementation of this enumeration message depends on sending the messages atEnd and next to the message receiver. We show this method as an example of the use of these messages.
do: aBlock [self atEnd] whileFalse: [aBlock value: self next]
Each kind of Stream must specify its instance creation messages. In general, a Stream can not be created simply by sending the message new because the Stream must be informed of which collection it accesses and what is its initial position reference.
As a simple example, let's assume that the collection accessed by a Stream is an Array and that the Stream is called accessor. The contents of the Array are the Symbols
Bob Dave Earl Frank Harold Jim Kim Mike Peter Rick Sam Tom
and the position reference is such that Bob is the next accessible element. Then
|accessor nextMatchFor: #Bob||false|
|accessor nextMatchFor: #Frank||true|
|accessor nextPut: #James||James|
|accessor contents||(Bob Dave Earl Frank Harold James Kim Mike Peter Rick Sam Tom)|
|accessor nextPutAll: #(Karl Larry Paul)||(Karl Larry Paul)|
|accessor contents||(Bob Dave Earl Frank Harold James Karl Larry Paul Rick Sam Tom)|
|accessor next: 2 put: #John||John|
|accessor contents||(Bob Dave Earl Frank Harold James Karl Larry Paul John John Tom)|
In the introduction to this chapter we indicated three possible approaches that a Stream might use in order to maintain a position reference. The first one we will present uses an integer index which is incremented each time the Stream is accessed. The Stream accesses only those kinds of collections whose elements have integers as external keys; these include all of the subclasses of SequenceableCollection.
Class PositionableStream is a subclass of class Stream. It provides additional protocol appropriate to Streams that can reposition their position references, but, it is an abstract class because it does not provide an implementation of the inherited messages next and nextPut: anObject. The implementation of these messages is left to the subclasses of PositionableStream--ReadStream, WriteStream, and ReadWriteStreamm.
A PositionableStream is created by sending the class one of two instance creation messages, on: aCollection or on: aCollection from: firstIndex to: lastIndex. The argument, aCollection, is the collection the Stream accesses; in the second case, a copy of a subcollection of aCollection is accessed, i.e., the one delimited by the two arguments firstIndex and lastIndex.
|on: aCollection||Answer an instance of a kind of PositionableStream that streams over the argument, aCollection.|
|on: aCollection from: firstIndex to: lastIndex||Answer an instance of a kind of PositionableStream that streams over a copy of a subcollection of the argument, aCollection, from firstIndex to lastIndex.|
|PositionableStream class protocol|
PositionableStream supports additional protocol for accessing and testing the contents of the collection.
|isEmpty||Answer true if the collection the receiver accesses has no elements; otherwise, answer false.|
|peek||Answer the next element in the collection (as in the message next), but do not change the position reference. Answer nil if the receiver is at the end.|
|peekFor: anObject||Determine the response to the message peek. If it is the same as the argument, anObject, then increment the position reference and answer true. Otherwise answer false and do not change the position reference.|
|upTo: anObject||Answer a collection of elements starting with the next element accessed by the receiver, and up to, not inclusive of, the next element that is equal to anObject. If anObject is not in the collection, answer the entire rest of the collection.|
|reverseContents||Answer a copy of the receiver's contents in reverse order.|
|PositionableStream instance protocol|
Since a PositionableStream is known to store an explicit position reference, protocol for accessing that reference is supported. In particular, the reference can be reset to access the beginning, the end, or any other position of the collection.
|position||Answer the receiver's current position reference for accessing the collection.|
|position: anInteger||Set the receiver's current position reference for accessing the collection to be the argument, anInteger. If the argument is not within the bounds of the receiver's collection, report an error.|
|reset||Set the receiver's position reference to the beginning of the collection.|
|setToEnd||Set the receiver's position reference to the end of the collection.|
|skip: anInteger||Set the receiver's position reference to be the current position plus the argument, anInteger, possibly adjusting the result so as to remain within the bounds of the collection.|
|skipTo: anObject||Set the receiver's position reference to be past the next occurrence of the argument, anObject, in the collection. Answer whether such an occurrence existed.|
|PositionableStream instance protocol|
Class ReadStream is a concrete subclass of PositionableStream that represents an accessor that can only read elements from its collection. We can create an example similar to the previous one to demonstrate the use of the additional protocol provided in class PositionableStream and inherited by all ReadStreams. None of the nextPut:, next:put:, nor nextPutAll: messages can be successfully sent to a ReadStream.
accessor ← ReadStream on: #(Bob Dave Earl Frank Harold Jim Kim Mike Peter Rick Sam Tom)
|accessor nextMatchFor: #Dave||true|
|accessor peekFor: #Frank||true|
|accessor upTo: #Rick||(Jim Kim Mike Peter)|
|accessor skip: 1||the accessor itself|
|accessor reset||the accessor itself|
|accessor skipTo: #Frank||true|
Class WriteStream is a subclass of PositionableStream representing accessors for writing elements into a collection. None of the next, next:, nor do: messages can be successfully sent to a WriteStream.
WriteStreams are used throughout the Smalltalk-80 system as a part of the methods for printing or storing a string description of any object. Each object in the system can respond to the messages printOn: aStream and storeOn: aStream. The methods associated with these messages consist of a sequence of messages to the argument, which is a kind of Stream that allows writing elements into the collection it accesses. These messages are nextPut:, where the argument is a Character, and nextPutAll:, where the argument is a String or a Symbol. An example will illustrate this idea.
Class Object printing protocol, as described in Chapter 6, includes the message printString. An implementation of this message is
printString | aStream | aStream ← WriteStream on: (String new: 16). self printOn: aStream. ↑aStream contents
If a collection is sent the message printString, then the answer is a String that is a description of the instance. The method creates a WriteStream that the collection can store into, sends the message printOn: to the collection, and then responds with the contents of the resulting WriteStream. The message storeString to any object is similarly implemented in class Object, the difference being that the second expression consists of the message storeOn: aStream rather than printOn: aStream.
The general way in which collections print a description of themselves is to print their class name followed by a left parenthesis, followed by a description of each element separated by spaces, and terminated by a right parenthesis. So if a Set has four elements--the Symbols one, two, three, and four--then it prints on a Stream as
Set (one two three four )
An OrderedCollection with the same elements prints on a Stream as
OrderedCollection (one two three four )
and so on.
Recall that the definition of printOn: and storeOn: given in Chapter 6 is that any suitable description can be provided for printOn:, but the description created by storeOn: must be a well-formed expression that, when evaluated, re-constructs the object it purports to describe.
Here is an implementation in class Collection for printOn:.
printOn: aStream aStream nextPutAll: self class name. aStream space. aStream nextPut: $(. self do: [ :element | element printOn: aStream. aStream space]. aStream nextPut: $)
Notice that the message space is sent to the WriteStream (aStream). It, and a number of other messages are provided in class WriteStream to support concise expressions for storing delimiters into such Streams. They are
|cr||Store the return character as the next element of the receiver.|
|crTab||Store the return character and a single tab character as the next two elements of the receiver.|
|crTab: anInteger||Store the return character as the next element of the receiver, followed by anInteger number of tab characters.|
|space||Store the space character as the next element of the receiver.|
|tab||Store the tab character as the next element of the receiver.|
|WriteStream instance protocol|
Thus to construct the String
from two corresponding Arrays,
names ← #(bob joe bill) cities ← #('New York' 'Chicago' 'Rochester')
evaluate the expressions
aStream ← WriteStream on: (String new: 16). aStream nextPutAll: 'name'. aStream tab. aStream nextPutAll: 'city'. aStream cr; cr. names with: cities do: [ :name :city | aStream nextPutAll: name. aStream tab. aStream nextPutAll: city. aStream cr]
then the desired result is obtained by evaluating aStream contents.
Suppose a collection already exists and we wish to append further information into it by using Stream protocol. Class WriteStream supports instance creation protocol that accepts a collection and sets the position reference for writing to the end.
|with: aCollection||Answer an instance of WriteStream accessing the argument, aCollection, but positioned to store the next element at the end of it.|
|with: aCollection from: firstIndex to: lastIndex||Answer an instance of WriteStream accessing the subcollection of the argument, aCollection, from location firstIndex to lastIndex, but positioned to store the next element at the end of the subcollection.|
|WriteStream class protocol|
Thus if a String referred to as header already existed, containing
then the previous example String would be constructed by
aStream ← WriteStream with: header. names with: cities do: [ :name :city | aStream nextPutAll: name. aStream tab. aStream nextPutAll: city. aStream cr]. aStream contents
Class ReadWriteStreamm is a subclass of WriteStream that represents an accessor that can both read and write elements into its collection. It supports all the protocol of both ReadStream and WriteStream, as given above.
Streams of Generated Elements
Of the three ways to create a position reference for streaming over a collection, the second way cited in the introduction to this chapter was to specify a seed by which the next elements of the collection can be generated. This kind of Stream only permits reading the elements, not writing. The reference, however, can be repositioned by resetting the seed.
Class Random, introduced in Chapter 8, is a subclass of Stream that determines its elements based on an algorithm employing a number as a seed. Random provides a concrete implementation for next and atEnd. Because the size of the collection is infinite, it never ends; moreover, Random can not respond to the message contents. It can respond to the message do:, but the method will never end without the programmer's purposeful intervention.
The following is an implementation of class Random; the reader is referred to Chapters 11 and 21 for examples making use of instances of the class. The implementations for do: and nextMatchFor: anObject are inherited from class Stream.
|instance variable names||seed|
instance creation new ↑self basicNew setSeed
testing atEnd ↑false accessing next | temp | "Lehmer's linear congruential method with modulus m = 2 raisedTo:16, a = 27181 odd, and 5 = a \\ 8, c = 13849 odd, and c/m approximately 0.21132" [seed ← 13849 + (27181 * seed) bitAnd: 8r177777. temp ← seed / 65536.0. temp = 0] whileTrue. ↑temp private setSeed "For pseudo-random seed, get a time from the system clock. It is a large positive integer; just use the lower 16 bits." seed ← Time millisecondClockValue bitAnd: 8r177777
Another possible example of a stream of generated elements are the probability distributions that are presented in Chapter 21. The superclass ProbabilityDistribution is implemented as a subclass of Stream. The message next: anInteger is inherited from Stream. Each kind of ProbabilityDistribution determines whether it is "read-only" and, if so, implements nextPut: as self shouldNotImplement. Class SampleSpace, another example in Chapter 21, maintains a collection of data items and implements nextPut: anObject as adding to the collection.
Streams for Collections Without External Keys
The third way to maintain a position reference for streaming over a collection cited in the introduction to this chapter was to maintain a nonnumerical reference. This would be useful in cases where the elements of the collection cannot be accessed by external keys or where such access is not the most efficient means to retrieve an element.
Streaming over instances of class LinkedList provides an example in which the elements can be accessed by indexes as the external keys, but each such access requires a search down the chain of linked elements. It is more efficient to maintain a reference to a particular element in the collection (a kind of Link)and then to access the next element by requesting the current elements nextLink. Both reading and writing into the LinkedList can be supported by such a Stream.
Suppose we create a subclass of Stream that we call LinkedListStreamm. Each instance maintains a reference to a LinkedList and a position reference to an element in the collection. Since both reading and writing are supported, the messages next, nextPut:, atEnd, and contents must be implemented. (Note that these~four messages are defined in class Stream as self subclassResponsibility.) A new instance of LinkedListStreamm is created by sending it the message on: aLinkedList.
|instance variable names||collection|
instance creation on: aLinkedList ↑self basicNew setOn: aLinkedList
testing atEnd ↑currentNode isNil accessing next | saveCurrentNode | saveCurrentNode ← currentNode. self atEnd ifFalse: [currentNode ← currentNode nextLink]. ↑saveCurrentNode nextPut: aLink | index previousLink | self atEnd ifTrue: [↑collection addLast: aLink]. index ← collection indexOf: currentNode index = 1 ifTrue: [collection addFirst: aLink] ifFalse: [previousLink ← collection at: index - 1. previousLink nextLink: aLink]. aLink nextLink: currentNode nextLink. currentNode ← aLink nextLink. ↑aLink private setOn: aLinkedList collection ← aLinkedList. currentNode ← aLinkedList first
Now suppose in order to demonstrate the use of this new kind of Stream we make up a LinkedList of nodes that are instances of class WordLink; class WordLink is a subclass of Link that stores a String or a Symbol.
|instance variable names||word|
instance creation for: aString ↑self new word: aString
accessing word ↑word word: aString word ← aString comparing = aWordLink ↑word = aWordLink world printing printOn: aStream aStream nextPutAll: 'a WordLink for'. aStream nextPutAll: word
From the above we can see that an instance of WordLink for the word #one is created by
WordLink for: #one
Its print string is
'a WordLink for one'
We can then create a LinkedList of WordLinks and then a LinkedListStreamm that accesses this LinkedList.
list ← LinkedList new. list add: (WordLink for: #one). list add: (WordLink for: #two). list add: (WordLink for: #three). list add: (WordLink for: #four). list add: (WordLink for: #five). accessor ← LinkedListStreamm on: list
Then an example sequence of messages to accessor is
|accessor next||a WordLink for one|
|accessor next||a WordLink for two|
|accessor nextMatchFor:(WordLink for: #three)||true|
|accessor nextPut:(WordLink for: #insert)||a WordLink for insert|
|accessor contents||LinkedList(a WordLink for one a WordLink for two a WordLink for three a WordLink for insert a WordLink for five)|
|accessor next||a WordLink for five|
Similarly, traversing the nodes of a tree structure, such as that of class Tree given in Chapter 11, can be done by a kind of Stream that maintains a reference to a current node and then accesses the next element by accessing the current node's left tree, root, or right tree. This kind of Stream is slightly more complicated to implement than that for a LinkedList because it is necessary to retain knowledge of whether the left or right tree has been traversed and back references to the father of the current node. The order of traversal of the tree structure can be implemented in the Stream, ignoring the method by which subtrees were added to the structure. Thus, although we used in-order traversal in the implementations of class Tree and class Node, we can stream over a Tree with postorder traversal by implementing the messages next and nextPut: appropriately.
External Streams and File Streams
The Streams we have examined so far make the assumption that the elements of the collection can be any objects, independent of representation. For communicating with input/output devices, such as a disk, however, this assumption is not valid. In these cases, the elements are stored as binary, byte-sized elements that may prefer to be accessed as numbers, strings, words (two bytes), or bytes. Thus we have a need to support a mixture of nonhomogeneous accessing messages for reading and writing these different-sized "chunks" of information.
Class ExternalStream is a subclass of class ReadWriteStreamm. Its purpose is to add the nonhomogeneous accessing protocol. This includes protocol for positioning as well as accessing.
|nextNumber: n||Answer the next n bytes of the collection accessed by the receiver as a positive Small-Integer or LargePositiveInteger.|
|nextNumber: n put: v||Store the argument, v, which is a positive SmallInteger or LargePositiveInteger, as the next n bytes of the collection accessed by the receiver. If necessary, pad with zeros.|
|nextString||Answer a String made up of the next elements of the collection accessed by the receiver.|
|nextStringPut: aString||Store the argument, aString, in the collection accessed by the receiver.|
|nextWord||Answer the next two bytes from the collection accessed by the receiver as an Integer.|
|nextWordPut: anInteger||Store the argument, anInteger, as the next two bytes of the collection accessed by the receiver.|
|padTo: bsize||Skip to the next boundary of bsize characters, and answer how many characters were skipped.|
|padTo: bsize put: aCharacter||Skip--writing the argument, aCharacter,into the collection accessed by the receiver in order to pad the collection--to the next boundary of bsize characters and answer how many characters were written (padded).|
|padToNextWord||Make the position reference even (on word boundary), answering the padding character, if any.|
|padToNextWordPut: aCharacter||Make the position reference even (on word boundary), writing the padding character, aCharacter, if necessary.|
|skipWords: nWords||Position after nWords number of words.|
|wordPosition||Answer the current position in words.|
|wordPosition: wp||Set the current position in words to be the argument, wp.|
|ExternalStream instance protocol|
Class FileStream is a subclass of ExternalStream. All accesses to external files are done using an instance of FileStream. A FileStream acts as though it were accessing a large sequence of bytes or characters; the elements of the sequence are assumed to be Integers or Characters. The protocol for a FileStream is essentially that of class ExternalStream and its superclasses. In addition, protocol is provided to set and to test the status of the sequence the FileStream is streaming over.
Classes ExternalStream and FileStream are provided in the Smalltalk-80 system as the framework in which a file system can be created. Additional protocol in class FileStream assumes that a file system is based on a framework consisting of a directory or dictionary of files, where a file is a sequence of file pages. The Smalltalk-80 system includes classes FileDirectory, File, and FilePage to represent these structural parts of a file system. A FilePage is a record of data that is uniquely identified within its File by a page number. A File is uniquely identified both by an alphanumeric name and a serial number; it maintains a reference to the FileDirectory which contains the File. And the FileDirectory is itself uniquely identified by the device or reserver" to which it refers. User programs typically do not access a File or its FilePages directly; rather they access it as a sequence of characters or bytes through a FileStream. Thus the programmer can create a FileStream as an accessor to a file using an expression of the form
Disk file: 'name.smalltalk'
where Disk is an instance of a FileDirectory. The FileStream can then be sent sequences of reading and writing messages as specified in the protocol of this chapter.