MODEL BASED SOLUTION FOR COMPUTING CHECKPOINTING INTERVAL FOR FAULT-TOLERANT ROLLBACK-RECOVERY IN ENTERPRISE SERVERS
DOI:
https://doi.org/10.26577/jpcsit20253103
Keywords:
checkpointing, fault-tolerance, enterprise, heuristic solution, application, rollback-recovery
Abstract
With the growth of digitalization, reliable information processing has become an increasingly relevant topic. It is especially important for enterprises that process huge amounts of data every day. Such processing requires stability and reliability, since interruptions may lead to various security issues. Fault-tolerance algorithms are designed specifically to prevent faults or to recover from them. This paper focuses on developing a heuristic solution for finding the optimal checkpointing interval for a rollback-recovery fault-tolerance algorithm. The authors propose a heuristic that uses CPU capabilities to determine how often checkpoints should be taken for reliable information processing. The paper presents statistics and predictions from major research organizations that highlight the relevance of the topic, and reviews related work in this area, providing comparisons and an overall analysis. The results show that the proposed calculation method introduces minimal performance overhead, adding 0.04 seconds on average to the average service time, while maintaining the fault tolerance of the process. The authors indicate that this solution is suitable for proof-of-concept systems that need to determine an optimal checkpointing interval efficiently.