Affiliations: [a] Department of Information Technology Engineering, Tarbiat Modares University, Tehran, Iran | [b] Department of Computer Engineering, Alzahra University, Tehran, Iran
Corresponding author: Majid Rahimi, Department of Information Technology Engineering, Tarbiat Modares University, Tehran, Iran. E-mail: [email protected].
Abstract: MapReduce is a widely used programming model for overcoming the processing limits of current hardware resources. In the MapReduce paradigm, a large amount of data is divided into multiple parts, each part is processed on a different processing unit, and the results of all units are then combined to obtain the final result. Scheduling is an important aspect that affects overall processing quality; hence, finding a suitable algorithm based on the properties of the jobs is essential for achieving maximum performance. The basic FIFO algorithm for job scheduling does not achieve maximum efficiency. Therefore, many new schedulers have been proposed to improve the quality metrics of MapReduce systems. Scheduling methods can be designed for job scheduling, for map or reduce task scheduling, or for both. In this paper, we select high-quality studies that address the MapReduce scheduling problem, classify them according to the main quality measure they target, and then review the selected studies. Finally, we discuss research trends and provide a roadmap of the scheduling problem for researchers.
Keywords: MapReduce, big data, scheduling, cloud, optimization