COMPARATIVE STUDY OF PARALLEL ALGORITHMS FOR MACHINE LEARNING METHODS

Authors

  • Nurislam Kassymbek, U.A. Joldasbekov Institute of Mechanics and Engineering, Almaty, Kazakhstan
  • Altynshash Rakhimzhanova, Astana IT University, Astana, Kazakhstan

DOI:

https://doi.org/10.26577/jpcsit2023v1i4a2

Keywords:

Machine learning, Linear regression, Random forest, Parallel computing

Abstract

As the volume of data used in machine learning continues to grow, accelerating model training on large datasets becomes increasingly important. Parallel data processing is one way to address this problem. This paper examines parallel data processing for two machine learning methods, linear regression and random forest. For each method, a parallel algorithm based on the MPI interface was developed. The experiments showed that both parallel algorithms run faster than their sequential counterparts, and that the speedup for random forest was significantly higher than for linear regression. This is because random forest is a more computationally efficient method than linear regression. It can therefore be concluded that, of the two methods considered, random forest is the more effective approach for parallel data processing, as it is more computationally efficient and scales better. Overall, the experimental results show that parallel algorithms can significantly speed up model training on large datasets.
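
The abstract gives no implementation details, so the following is only a minimal sketch of one common way to parallelize linear regression over data shards, not the authors' algorithm. It assumes a one-variable least-squares fit: each worker computes local sums on its shard, and the sums are then added together, which is the role a sum-reduction would play in an MPI program. The workers are simulated in a single process, and all function names (`partial_sums`, `combine`, `fit_line`, `split`) are hypothetical.

```python
from functools import reduce


def partial_sums(chunk):
    """Local statistics one worker would compute on its shard of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Element-wise sum of two statistics tuples (stands in for a sum-reduction)."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(stats):
    """Closed-form least-squares slope and intercept from the global sums."""
    n, sx, sy, sxx, sxy = stats
    denom = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


def split(data, workers):
    """Round-robin split of the data into `workers` shards."""
    return [data[i::workers] for i in range(workers)]


# Synthetic data on the line y = 2x + 1 (an assumption for illustration only).
data = [(float(x), 2.0 * x + 1.0) for x in range(100)]

# Simulate 4 workers: each computes its partial sums, then the sums are reduced.
shards = split(data, 4)
global_stats = reduce(combine, (partial_sums(s) for s in shards))
slope, intercept = fit_line(global_stats)

# Sequential baseline on the full dataset, for comparison.
seq_slope, seq_intercept = fit_line(partial_sums(data))
```

Because only five numbers per worker need to be exchanged, the communication cost in this sketch does not grow with the shard size. Random forest is often parallelized differently, by training independent trees on separate workers, but the abstract does not say which scheme was used.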

How to Cite

Kassymbek, N., & Rakhimzhanova, A. (2023). COMPARATIVE STUDY OF PARALLEL ALGORITHMS FOR MACHINE LEARNING METHODS. Journal of Problems in Computer Science and Information Technologies, 1(4). https://doi.org/10.26577/jpcsit2023v1i4a2