We present a general approach to self-scheduling a non-uniform parallel loop on a distributed-memory machine. The approach has two phases: a static scheduling phase and a dynamic scheduling phase. In addition to reducing scheduling overhead, the static scheduling phase allows the data needed by the statically scheduled iterations to be prefetched. The dynamic scheduling phase balances the workload. Data distribution methods for self-scheduling are also a focus of this paper. We classify the data distribution methods into four categories and present partial duplication, a method that allows the problem size to grow linearly in the number of processors. Experiments conducted on a 64-node NCUBE show that as much as a 79% improvement is achieved over static scheduling on the generation of a false-color image.
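The two-phase idea can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the scheduler described in the paper: the function name `two_phase_schedule`, the `static_fraction` parameter, the block-wise static assignment, and the one-iteration-at-a-time dynamic phase are all hypothetical choices made for illustration, and communication cost and prefetching are ignored.

```python
import heapq


def two_phase_schedule(costs, num_procs, static_fraction):
    """Simulate a static phase followed by a dynamic phase.

    costs: per-iteration execution times (non-uniform loop).
    static_fraction: share of iterations assigned up front (assumed knob).
    Returns (assignment, finish), where assignment[p] lists the iterations
    run by processor p and finish[p] is its simulated completion time.
    """
    n = len(costs)
    # Static phase: give each processor an equal contiguous block.
    per_proc = int(n * static_fraction) // num_procs
    n_static = per_proc * num_procs
    assignment = [[] for _ in range(num_procs)]
    finish = [0.0] * num_procs
    for p in range(num_procs):
        for i in range(p * per_proc, (p + 1) * per_proc):
            assignment[p].append(i)
            finish[p] += costs[i]

    # Dynamic phase: whichever processor becomes idle first takes the
    # next remaining iteration (chunk size of one is an assumption).
    idle_queue = [(finish[p], p) for p in range(num_procs)]
    heapq.heapify(idle_queue)
    for i in range(n_static, n):
        t, p = heapq.heappop(idle_queue)
        assignment[p].append(i)
        finish[p] = t + costs[i]
        heapq.heappush(idle_queue, (finish[p], p))
    return assignment, finish


if __name__ == "__main__":
    costs = list(range(1, 9))  # iteration i costs i + 1 time units
    for fraction in (1.0, 0.5):
        _, finish = two_phase_schedule(costs, num_procs=2, static_fraction=fraction)
        print(fraction, max(finish))
```

With `static_fraction=1.0` the sketch degenerates to purely static block scheduling, so comparing the largest finish time across fractions gives a rough feel for how the dynamic phase evens out a non-uniform workload.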