Abstract: To address the poor recovery and stability of traditional big data fault-tolerance techniques, this paper proposes a fault-tolerance technique for big data clusters in distributed flow processing systems. The target protocol is used to build a cache data fault-tolerant mechanism and a disk data fault-tolerant mechanism, which together form the system's internal data fault-tolerant mechanism. On top of the Spark application framework, a fault-tolerant model of the big data cluster is then built to realize fault tolerance for big data clusters in distributed flow processing systems. Experimental results show that, compared with traditional methods, the proposed technique achieves a higher recovery rate, with a peak recovery efficiency of 99.83%, and stronger stability, making it better suited to big data fault-tolerant processing in distributed flow processing systems.
Keywords: Distributed flow processing system, big data, cluster fault-tolerant technology, disk data fault-tolerant mechanism