Library Automation and Digital Archive
LONTAR
Fakultas Ilmu Komputer
Universitas Indonesia

Pencarian Sederhana

Find Similar Add to Favorite

Call Number SEM-212
Collection Type Indeks Artikel prosiding/Sem
Title List Ranking and List Scan on the Cray C-90
Author Margaret Reid-Miller;
Publisher 6th Annual ACM Symposium on Parallel Algorithms and Architecture
Subject
Location
Lokasi : Perpustakaan Fakultas Ilmu Komputer
Nomor Panggil ID Koleksi Status
SEM-212 TERSEDIA
Tidak ada review pada koleksi ini: 55426
ABSTRACT

List ranking and list scan are two primitive operations used in many parallel algorithms that use list, trees, and graph data structures. But vectorizing and parallelizing list rank- ing is a challenge because it is highly communication inten- sive and dynamic. In addition, the serial algorithm is very simple and has very small constants. In order to compete, a parallel algorithm must also be simple and have small con- stants. A parallel algorithm due to Wyllie is such an algo- rithm, but it is not work efficient-its performance degrades for longer and longer linked lists. In contrast, work effi- cient PRAM algorithms developed to date have very large constants. We introduce a new fully vectorized and paral- lelized algorithm that both is work efficient and has small constants. It does not achieve O(log n) running time, but we contend that work efficiency and small constants is more important, given that vector and multiprocessor machines are used for problems that are much larger than the num- ber of processors and, therefore, the O(log n) time is never achieved in practice. In particular, to the best of our knowl- edge, our implementation of list ranking and list scan on the CRAY C-90 is the fastest implementation to date. In addi- tion, it is the first implementation of which we are aware that outperforms fast workstations. The success of our algorithm is due to its relatively large grain size and simplicity of the inner loops, and the success of the implementation is due to pipelining reads and writes through vectorization to hide latency, minimizing load balancing by deriving equations for predicting and optimizing performance, and avoiding condi- tional tests except when load balancing.