
To explore the design space of multicomputer architectures, we developed the Mermaid simulation environment. It supports performance evaluation of a wide range of architectural design options through parameterization, from processor parameters such as cache specifics to switching and routing techniques in the message-passing communication network.
Mermaid differs from other simulation environments in how it handles the tradeoff between accuracy and simulation efficiency. Most multicomputer and multiprocessor simulators apply direct execution to obtain high simulation performance. In this technique, "uninteresting" instructions are not explicitly simulated but are executed directly by the simulating host computer. This requires, however, that the instruction set of the host computer be similar to that of the modelled architecture.
Besides direct execution, some simulators, like SimOS, also provide multiple levels of simulation. This enables the architect to position the simulation at an interesting state using a fast and abstract level of simulation. Thereafter, the interesting section is studied using an accurate, and thus less efficient, mode of simulation.
Mermaid does not perform direct execution. Until now, it has applied two other techniques to address the accuracy-efficiency tradeoff. First, we offer the ability to simulate at different abstraction levels. Unlike SimOS, however, the whole simulation takes place at a single abstraction level. If the research objective is fast prototyping, maximum accuracy is not required and simulation can be performed at a high level of abstraction. If accuracy is required, on the other hand, simulation is performed at a lower and more computationally intensive abstraction level.
Second, at the lowest level of abstraction, Mermaid simulates abstract
instructions rather than interpreting and simulating real machine
instructions. For this purpose, we use a form of trace-driven simulation.
Compared to traditional instruction-level simulation, this
approach typically results in a higher simulation performance at the cost of a
small loss of accuracy. As a consequence, we obtain a simulation efficiency
which is competitive with many direct execution simulators.
Furthermore, unlike direct execution, our simulation approach places no
demands on the simulating host architecture.
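As a rough illustration of the idea of simulating abstract instructions from a trace, the following minimal Python sketch replays a small trace of abstract operations against a toy direct-mapped cache model. Everything in it is an assumption made for illustration: the operation names (`load`, `store`, `arith`), the cycle costs, and the cache organization are hypothetical and are not taken from Mermaid itself.

```python
from dataclasses import dataclass, field

# Hypothetical cycle costs per abstract operation (illustrative values only).
COSTS = {"load_hit": 1, "load_miss": 20, "store": 1, "arith": 1}


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache; an assumed model, not Mermaid's."""
    num_lines: int = 4
    line_size: int = 16
    tags: dict = field(default_factory=dict)

    def access(self, addr: int) -> bool:
        """Return True on a hit; install the block on a miss."""
        block = addr // self.line_size
        index = block % self.num_lines
        hit = self.tags.get(index) == block
        self.tags[index] = block
        return hit


def simulate(trace, cache):
    """Replay a trace of abstract operations and return the cycle count."""
    cycles = 0
    for op, *args in trace:
        if op == "load":
            hit = cache.access(args[0])
            cycles += COSTS["load_hit"] if hit else COSTS["load_miss"]
        elif op == "store":
            cache.access(args[0])  # a store also brings its block into the cache
            cycles += COSTS["store"]
        elif op == "arith":
            cycles += COSTS["arith"]
        else:
            raise ValueError(f"unknown abstract operation: {op}")
    return cycles


# The trace carries only operation kinds and addresses, not machine instructions.
trace = [("load", 0), ("load", 4), ("arith",), ("store", 64), ("load", 0)]
total_cycles = simulate(trace, DirectMappedCache())
```

Because the simulator only consumes operation kinds and addresses, nothing in the sketch depends on the instruction set of the machine running it, which is the property the paragraph above contrasts with direct execution.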
Recently, we have developed a Mermaid version which also allows distributed simulation of multicomputer architectures on a pool of workstations. Because of our simulation methodology, this parallelization is quite straightforward: it does not require algorithms to guarantee causality within the simulated system. The resulting distributed simulator increases simulation performance without any loss of simulation accuracy. Furthermore, the parallel simulation environment is more scalable than its sequential counterpart, since the latter may easily run out of memory when simulating a large number of processors. Experimental results indicate that parallel Mermaid obtains significant performance improvements over sequential simulation. In several cases, we even measured super-linear speedups.