Title: Location of Processor Allocator and Job Scheduler and Its Impact on CMP Performance
Journal title: International Journal of Electronics and Telecommunications
Divisions of PAS: Nauki Techniczne (Technical Sciences)
Abstract: High Performance Computing (HPC) architectures are being developed continually with the aim of achieving exascale capability by 2020. Processors that are being developed and used as nodes in HPC systems are Chip Multiprocessors (CMPs) with a number of cores. In this paper, we continue our effort towards a better processor allocation process. The Processor Allocator (PA) and Job Scheduler (JS) proposed and implemented in our previous works are explored in the context of their best location on the chip. We propose a system in which all locations on a chip can be analyzed, considering the energy used by the Network-on-Chip (NoC), the PA and JS, and the processing elements. We present energy models for the researched CMP components, a mathematical model of the system, and an experimentation system. Based on experimental results, proper placement of the PA and JS on a chip can provide NoC energy savings of up to 45%.
Publisher: Polish Academy of Sciences Committee of Electronics and Telecommunications
Identifier: ISSN 2081-8491 (until 2012); eISSN 2300-1933 (since 2013)