My first approach to sample generation was Kiwi [Walker], a system based on conventional object-oriented programming. The MUSIC N systems are awkward because C and FORTRAN functions cannot adequately represent unit generators with state information; C++ facilitated the encapsulation of unit generator state information and synthesis algorithms. The Encore Parallel Threads [Encore] library provided simple utilities for managing threads.

Figure 1 - Control flow in Kiwi
Figure 1 outlines the program structure. The Scheduler gives Tasks to the Processors based on information stored in the ScoreList. The Processor is a software model of the real processor. It executes a simple loop: request a Task from the Scheduler, perform the Task, and report the results to the Collector. A Task is of the form "compute samples for note x starting at time y and ending at time z." The Collector sums the samples from each Task and stores the finished music for later playback.
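The request/perform/report loop described above can be sketched as follows. This is a minimal illustration in Python, not Kiwi's C++ code: the class and method names (`Scheduler.request`, `Collector.report`, `perform`, `render`) are hypothetical, times are measured in sample frames, and the "synthesis" is a placeholder that emits a constant value per note.

```python
import threading
from collections import namedtuple

# Hypothetical Task record: "compute samples for `note` from `start` to `end`".
# Times are sample-frame indices (an assumption made for this sketch).
Task = namedtuple("Task", ["note", "start", "end"])


class Scheduler:
    """Hands out Tasks one at a time (stand-in for Kiwi's Scheduler)."""

    def __init__(self, tasks):
        self._tasks = list(tasks)
        self._lock = threading.Lock()

    def request(self):
        # Return the next Task, or None when no work remains.
        with self._lock:
            return self._tasks.pop() if self._tasks else None


class Collector:
    """Sums the samples reported for each Task into one shared buffer."""

    def __init__(self, length):
        self.buffer = [0.0] * length
        self._lock = threading.Lock()

    def report(self, task, samples):
        # Every Processor must take this lock, so summation is serialized.
        with self._lock:
            for i, sample in enumerate(samples):
                self.buffer[task.start + i] += sample


def perform(task):
    # Placeholder synthesis: a constant amplitude equal to the note number.
    return [float(task.note)] * (task.end - task.start)


def processor(scheduler, collector):
    # The simple loop: request a Task, perform it, report the results.
    while True:
        task = scheduler.request()
        if task is None:
            return
        collector.report(task, perform(task))


def render(tasks, length, n_processors):
    scheduler = Scheduler(tasks)
    collector = Collector(length)
    threads = [
        threading.Thread(target=processor, args=(scheduler, collector))
        for _ in range(n_processors)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collector.buffer
```

For example, `render([Task(1, 0, 4), Task(2, 2, 6)], length=6, n_processors=2)` sums the two overlapping notes into a single six-frame buffer. The lock inside `Collector.report` is one simple way to keep the summation correct; it also makes visible why the Collector can become a point of contention.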
Kiwi assumes all synthesis algorithms are serial. Hence, each note is assigned only one processor at a time. I refer to this as note-level parallelism. Suppose there are as many processors as notes; Kiwi could compute samples for all the notes simultaneously, maximizing note-level parallelism. There are three difficulties with this. First, no current multiprocessor has as many processors as the number of notes in a modest score. Second, all the processors would contend for access to the Collector. Third, the Collector would be prohibitively large, since it would represent the entire duration of the piece.
Kiwi solves these problems with occasional barrier synchronization. The piece is divided into grains of several seconds (my use of the term grains should not be confused with the granular synthesis technique). All the samples for a given grain are computed before work on the next grain begins. With this constraint, the sample buffer need only be as large as the grain size. The number of concurrent threads is constrained by the number of notes in a given grain. When a note spans two or more grains, it is divided into several Tasks. Each Task is processed in turn, during the appropriate grain. Kiwi achieves linear speedup using this scheme for as many as sixteen processors on the Encore Multimax. The sample summation bottleneck prevents increased performance for more than sixteen processors. Scores with few notes permit little parallelism, and hence little speedup.
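The grain scheme can be illustrated with a short sketch. It is written in Python rather than Kiwi's C++, and everything in it is an assumption made for illustration: the `Note` and `Task` records, the function names, the use of one thread per Task, and the use of thread joins as the barrier between grains.

```python
import threading
from collections import namedtuple

# Hypothetical records; times are in seconds.
Note = namedtuple("Note", ["name", "start", "end"])
Task = namedtuple("Task", ["name", "start", "end"])


def split_into_grains(notes, grain):
    """Return {grain_index: [Task, ...]}, cutting notes at grain boundaries."""
    grains = {}
    for note in notes:
        g = int(note.start // grain)
        t = note.start
        while t < note.end:
            # A Task never crosses the end of its grain.
            stop = min(note.end, (g + 1) * grain)
            grains.setdefault(g, []).append(Task(note.name, t, stop))
            t = stop
            g += 1
    return grains


def _work(task, perform, results, lock):
    value = perform(task)
    with lock:
        results.append(value)


def render_by_grain(notes, grain, perform):
    """Process grains in order; no grain starts until the previous one is done."""
    finished = []
    grains = split_into_grains(notes, grain)
    for g in sorted(grains):
        results = []
        lock = threading.Lock()
        threads = [
            threading.Thread(target=_work, args=(task, perform, results, lock))
            for task in grains[g]
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # barrier: wait for every Task in this grain
        finished.append((g, sorted(results)))
    return finished
```

With a grain of 2 seconds, a note lasting from 0 to 5 seconds becomes three Tasks (0 to 2, 2 to 4, 4 to 5), and a note from 3 to 4 seconds becomes a single Task in the second grain. At most two Tasks run concurrently in that example, which reflects how the number of notes in a grain bounds the available parallelism.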