How to Simulate 1000 Cores
Monchiero, Matteo; Ahn, Jung Ho; Falcon, Ayose; Ortega, Daniel; Faraboschi, Paolo
Keyword(s): simulation, multicore, manycore, chip multiprocessor, application scaling
Abstract: This paper proposes a novel methodology to efficiently simulate shared-memory multiprocessors composed of hundreds of cores. The basic idea is to use thread-level parallelism in the software system and translate it into core-level parallelism in the simulated world. To achieve this, we first augment an existing full-system simulator to identify and separate the instruction streams belonging to the different software threads. Then, the simulator dynamically maps each instruction flow to the corresponding core of the target multi-core architecture, taking into account the inherent thread synchronization of the running applications. Our simulator allows a user to execute any multithreaded application in a conventional full-system simulator and evaluate the performance of the application on many-core hardware. We carried out extensive simulations on the SPLASH-2 benchmark suite and demonstrated scalability up to 1024 cores with limited simulation speed degradation vs. the single-core case on a fixed workload. The results also show that the proposed technique captures the intrinsic behavior of the SPLASH-2 suite, even when we scale the number of shared-memory cores beyond the thousand-core limit.
Additional Publication Information: To be presented at dasCMP 2008, Como, Italy, November 9, 2008. Date Issued: No date available.
External Posting Date: November 6, 2008. Approved for External Publication.
Internal Posting Date: November 6, 2008
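
The idea summarized in the abstract (separate the per-thread instruction streams, map each stream to a simulated core, and respect thread synchronization) can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the simulator described in the paper: the trace format `(thread_id, kind, cycles)`, the function name `simulate`, the one-thread-per-core mapping, and the barrier-only synchronization model are all hypothetical choices made for illustration.

```python
def simulate(trace, num_threads):
    """Toy model: map an interleaved host trace onto per-core clocks.

    trace: iterable of (thread_id, kind, cycles) tuples, where kind is
           "op" (work costing `cycles`) or "barrier" (global barrier).
           This format is an assumption made for the sketch.
    Returns the local clock (in cycles) of each simulated core.
    """
    core_of = {}                 # thread id -> simulated core index
    clock = [0] * num_threads    # one local clock per simulated core
    at_barrier = set()           # cores waiting at the current barrier

    for tid, kind, cycles in trace:
        # Assign each newly seen thread to the next free core.
        if tid not in core_of:
            if len(core_of) == num_threads:
                raise ValueError("more threads than simulated cores")
            core_of[tid] = len(core_of)
        core = core_of[tid]

        if kind == "op":
            # Work advances only the clock of the core running this thread,
            # regardless of how the host interleaved the threads.
            clock[core] += cycles
        elif kind == "barrier":
            at_barrier.add(core)
            if len(at_barrier) == num_threads:
                # All cores arrived: everyone resumes at the latest arrival.
                release = max(clock)
                clock = [release] * num_threads
                at_barrier.clear()
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return clock


# Two host threads (ids 10 and 20) interleaved on a single host stream.
trace = [
    (10, "op", 5), (20, "op", 2), (10, "barrier", 0),
    (20, "op", 1), (20, "barrier", 0),
    (10, "op", 3), (20, "op", 4),
]
print(simulate(trace, 2))  # [8, 9]
```

In this sketch the simulated execution time is the largest per-core clock (9 cycles here), even though the host executed the 15 cycles of work sequentially. The barrier is the only point where the cores' clocks are forced to agree.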