A Hybrid Sorting Algorithm on Heterogeneous Architectures
Abstract: High-performance computing devices are more common than ever
before. Main memory capacities have grown very large, and CPUs have gained
more cores and more powerful computing units. A growing number of machines are
also equipped with accelerators such as GPUs. Taking full advantage of modern
machines with heterogeneous architectures to obtain higher-performance
solutions is a real challenge. There is a large body of literature on using
only CPUs or only GPUs; however, research on algorithms that utilize
heterogeneous architectures is comparatively scarce. In this paper, we propose
a novel hybrid sorting algorithm that lets the CPU cooperate with the GPU. To
fully utilize the computing capability of both the CPU and the GPU, we used
SIMD intrinsic instructions to implement the sorting kernels that run on the
CPU, and adopted radix sort kernels implemented in CUDA (Compute Unified
Device Architecture) to run on the GPU. Performance evaluation is promising:
our algorithm can sort one billion 32-bit floating-point values in no more
than 5 seconds.
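
The abstract does not spell out how work is divided between the two processors or how the partial results are combined, so the following is only a minimal CPU-only sketch of one plausible scheme: split the input, sort each part with a different kernel, then merge. All names here (`float_to_key`, `key_to_float`, `radix_sort_floats`, `hybrid_sort`, `gpu_share`) are hypothetical and not taken from the paper. `std::sort` stands in for the SIMD kernels on the CPU, and a sequential least-significant-digit radix sort stands in for the CUDA radix sort kernels. The two parts are sorted one after the other here, whereas a real hybrid implementation would presumably run them concurrently. NaN inputs are assumed to be absent.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Map a float's bit pattern to an unsigned key whose integer order matches
// the numeric order of the floats (a standard trick for radix-sorting floats).
static std::uint32_t float_to_key(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    // Negative values: flip all bits. Non-negative values: set the sign bit.
    return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}

// Inverse of float_to_key.
static float key_to_float(std::uint32_t k) {
    std::uint32_t u = (k & 0x80000000u) ? (k & 0x7FFFFFFFu) : ~k;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Sequential LSD radix sort, 8 bits per pass (4 passes for 32-bit keys).
// Stand-in for the GPU radix sort kernels; this is NOT the paper's code.
static void radix_sort_floats(float* data, std::size_t n) {
    std::vector<std::uint32_t> keys(n), tmp(n);
    for (std::size_t i = 0; i < n; ++i) keys[i] = float_to_key(data[i]);

    for (int shift = 0; shift < 32; shift += 8) {
        std::size_t count[257] = {0};
        // Histogram of the current digit, offset by one slot.
        for (std::size_t i = 0; i < n; ++i)
            ++count[((keys[i] >> shift) & 0xFFu) + 1];
        // Prefix sum: count[d] becomes the first output index for digit d.
        for (int d = 0; d < 256; ++d) count[d + 1] += count[d];
        // Stable scatter into the temporary buffer.
        for (std::size_t i = 0; i < n; ++i)
            tmp[count[(keys[i] >> shift) & 0xFFu]++] = keys[i];
        keys.swap(tmp);
    }

    for (std::size_t i = 0; i < n; ++i) data[i] = key_to_float(keys[i]);
}

// Hypothetical hybrid driver. gpu_share (assumed to lie in [0, 1]) is the
// fraction of the input handed to the "GPU" kernel; the rest goes to the "CPU".
std::vector<float> hybrid_sort(std::vector<float> v, double gpu_share = 0.5) {
    const std::size_t split =
        static_cast<std::size_t>(static_cast<double>(v.size()) * gpu_share);

    radix_sort_floats(v.data(), split);      // part that would go to the GPU
    std::sort(v.begin() + split, v.end());   // part that would stay on the CPU

    // Combine the two sorted runs into one sorted sequence.
    std::inplace_merge(v.begin(), v.begin() + split, v.end());
    return v;
}
```

In a sketch like this, `gpu_share` would be tuned to the relative throughput of the two devices so that both finish at about the same time; the value 0.5 is an arbitrary placeholder.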
Authors: Ming Xu, Xianbin Xu, Fang Zheng, Yuanhua Yang, Mengjia Yin
Journal Code: jptkomputergg150139