MapReduce Integrated Multi-algorithm for HPC Running State Analysis
Abstract: High-performance computer clusters are the major seismic processing
platforms in the oil industry, and they fail frequently. In this study, K-means
and the Naive Bayes algorithm were implemented as MapReduce programs and run on
Hadoop. The accumulated running status data of the high-performance computer
cluster were first clustered by K-means, and the clustering results were then
used to train Naive Bayes. Finally, the test data were classified for the
knowledge base and for equipment failure. Experiments indicate that K-means
returned good clustering results, the Naive Bayes algorithm had a high
discrimination rate, and the combination of algorithms in MapReduce provided an
intelligent prediction mechanism.
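
The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal single-process illustration in Python, not the authors' Hadoop implementation: the map and reduce steps are ordinary functions, the two features (CPU load and temperature) and all sample values are hypothetical, and the Gaussian likelihood in the Naive Bayes stage is an assumption, since the abstract does not say which variant was used.

```python
import math
from collections import defaultdict

def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    dists = [sum((x - c) ** 2 for x, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists)), point

def kmeans_reduce(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/shuffle/reduce rounds, then label the points."""
    for _ in range(iterations):
        groups = defaultdict(list)          # the "shuffle": group values by key
        for p in points:
            key, value = kmeans_map(p, centroids)
            groups[key].append(value)
        # An empty cluster keeps its previous centroid.
        centroids = [kmeans_reduce(groups[i]) if groups[i] else c
                     for i, c in enumerate(centroids)]
    labels = [kmeans_map(p, centroids)[0] for p in points]
    return centroids, labels

def train_naive_bayes(points, labels):
    """Fit a Gaussian Naive Bayes model (assumed variant) on the cluster labels."""
    groups = defaultdict(list)
    for p, y in zip(points, labels):
        groups[y].append(p)
    model = {}
    for y, rows in groups.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-6
                     for col, m in zip(cols, means)]
        model[y] = (n / len(points), means, variances)
    return model

def predict(model, point):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(y):
        prior, means, variances = model[y]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(point, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical running-state samples: (cpu_load, temperature_celsius).
samples = [(0.30, 45.0), (0.35, 47.0), (0.28, 44.0),
           (0.95, 85.0), (0.97, 88.0), (0.92, 83.0)]
centroids, labels = kmeans(samples, [samples[0], samples[3]])
model = train_naive_bayes(samples, labels)
```

A new reading is then classified with `predict(model, (0.96, 86.0))`, which returns the index of the cluster it most likely belongs to. On Hadoop, the map and reduce functions would run as distributed tasks over the stored status data rather than as in-memory loops.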
Author: ShuRen Liu, ChaoMin Feng, HongWu Luo, Ling Wen
Journal Code: jptkomputergg160159