Architectural Sensitive Application Characterization: The Approach of High-Performance Index-Set (HP-Set)
Keyword(s): shared memory multiprocessor architecture; performance evaluation
Abstract: Good simulation tools that provide architecturally relevant insights play a vital role in building complex systems such as shared-memory multiprocessors. In this report, we discuss HP-Set (High Performance index-Set), a simulation tool that takes the core scheduling component of CIAT and integrates it with a set of statistics-gathering probes that generate the corresponding indexes. In a nutshell, HP-Set is a portfolio whose major indexes are general statistics, coherent misses, data reuse and locality, granularity, and the IO index. The objective of HP-Set is to be architecturally sensitive without evolving into a fully functional simulator. We achieve this goal by leaving out fancy statistics and by actually implementing relevant protocols that aim to optimize certain aspects of an index. By comparing an index with and without the perturbation of the protocols, we learn not only how large an impact that index has on overall performance, but also how likely it is that we can improve it architecturally. Using HP-Set, we analyzed several commercial applications and obtained insights not available before. For example, our overall analysis points out that it is a common misconception that TPCC is more memory intensive than TPCD; the difference is rather due to their pressures on the memory system. Our communication index indicates that third-party dirty hits dominate, and thus faster directory lookup and cache-to-cache transfer optimizations should be encouraged. On the other hand, the significant number of false-sharing misses is, and will continue to be, a dominant performance factor. Our granularity analysis suggests that the spatial locality of coherent objects is rather limited, and blind sequential prefetching might do more harm than good. Our IO-Memory analysis finds that IO contributes a non-negligible share of total system traffic and is the major cause of cache misses.
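To make the false-sharing observation concrete, the following is a minimal sketch of how a trace-driven probe might separate true-sharing from false-sharing misses. It is not HP-Set's implementation: the trace format, the function name `classify_misses`, the 64-byte line and 4-byte word sizes, and the infinite-cache, write-invalidate model are all assumptions made for illustration, and the classification rule is a simplified one (a miss counts as true sharing only if the word being accessed was written remotely since the processor lost the line).

```python
from collections import defaultdict

LINE_SIZE = 64  # bytes per cache line (assumed for illustration)
WORD_SIZE = 4   # bytes per word (assumed for illustration)


def classify_misses(trace, line_size=LINE_SIZE, word_size=WORD_SIZE):
    """Classify each access in a (cpu, op, addr) trace.

    Hypothetical model: infinite per-CPU caches and a write-invalidate
    protocol, so every non-cold miss is a coherence miss.
    """
    valid = defaultdict(set)  # cpu -> set of line numbers currently cached
    # (cpu, line) -> words written by other CPUs since this CPU lost the line
    remote_writes = {}
    counts = {"cold": 0, "true_sharing": 0, "false_sharing": 0, "hit": 0}

    for cpu, op, addr in trace:
        line = addr // line_size
        word = addr // word_size

        if line in valid[cpu]:
            counts["hit"] += 1
        elif (cpu, line) in remote_writes:
            # Coherence miss: true sharing only if the accessed word
            # itself was modified remotely; otherwise false sharing.
            written = remote_writes.pop((cpu, line))
            kind = "true_sharing" if word in written else "false_sharing"
            counts[kind] += 1
            valid[cpu].add(line)
        else:
            counts["cold"] += 1
            valid[cpu].add(line)

        if op == "W":
            # Invalidate every other copy and remember which word changed.
            for other in list(valid):
                if other == cpu:
                    continue
                if line in valid[other]:
                    valid[other].discard(line)
                    remote_writes[(other, line)] = {word}
                elif (other, line) in remote_writes:
                    remote_writes[(other, line)].add(word)

    return counts


# Two CPUs touching different words, then the same word, of one line.
example_trace = [
    (0, "R", 0),  # CPU 0 reads word 0
    (1, "W", 4),  # CPU 1 writes word 1 of the same line
    (0, "R", 0),  # CPU 0 re-reads word 0, which CPU 1 never wrote
    (1, "W", 0),  # CPU 1 writes word 0
    (0, "R", 0),  # CPU 0 re-reads word 0, which CPU 1 did write
]
example_counts = classify_misses(example_trace)
print(example_counts)
```

In this sketch the third access is counted as a false-sharing miss and the fifth as a true-sharing miss. A real probe would also need to model finite caches and the full coherence protocol, which this example deliberately leaves out.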