Tejas: A Java-based, cycle-accurate, heterogeneous architectural simulator.
Authors: Smruti R. Sarangi, Rajshekar Kalayappan, Prathmesh Kallurkar, Seep Goel, Eldhose Peter
Affiliation: Department of Computer Science, Indian Institute of Technology, New Delhi, India.
Key Features:
Trace-driven simulator, platform-independent.
Simulates binaries compiled for any ISA (Instruction Set Architecture) and any operating system.
Achieves speed through optimized data structures and cache locality.
Validated against real hardware (a Dell PowerEdge R620 server) and shown to be more accurate than popular simulators.
Purpose: Architectural simulators are crucial in education, design, and research in computer architecture. They help in:
Teaching basic architectural concepts.
Prototyping new designs.
Estimating metrics such as temperature and power consumption.
Development Trends: Many simulators have incorporated novel features and speed enhancements but often at the cost of cycle accuracy.
Platform Independence: Unlike many simulators, which are written in C/C++ and tied to specific platforms, Tejas is implemented in Java.
Decoupled Modules: Tejas separates instruction emulators from the timing simulation engine.
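One way to picture this decoupling is an interface boundary: the emulator side only produces trace records, and the timing side only consumes them. The sketch below is illustrative and is not taken from the Tejas code base; `TraceSource`, `TraceRecord`, `ListTraceSource`, and `TimingEngine` are hypothetical names, and the latencies (3 cycles for memory operations, 1 cycle otherwise) are placeholders.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical names throughout; this is not the Tejas API.
final class DecoupledSketch {
    enum OpKind { ALU, LOAD, STORE, BRANCH }

    // One dynamic instruction as seen by the timing side.
    static final class TraceRecord {
        final long pc;
        final OpKind kind;
        TraceRecord(long pc, OpKind kind) { this.pc = pc; this.kind = kind; }
    }

    // Anything that can emit instructions: an emulator, a file, a socket.
    interface TraceSource extends Iterator<TraceRecord> {}

    // Stand-in "emulator" that replays a fixed list of records.
    static final class ListTraceSource implements TraceSource {
        private final Iterator<TraceRecord> it;
        ListTraceSource(List<TraceRecord> records) { this.it = records.iterator(); }
        @Override public boolean hasNext() { return it.hasNext(); }
        @Override public TraceRecord next() { return it.next(); }
    }

    // Timing side: knows nothing about how the records were produced.
    static final class TimingEngine {
        private long cycles = 0;
        private long instructions = 0;

        void run(TraceSource source) {
            while (source.hasNext()) {
                TraceRecord r = source.next();
                instructions++;
                // Placeholder latencies, chosen only for illustration.
                switch (r.kind) {
                    case LOAD:
                    case STORE:
                        cycles += 3;
                        break;
                    default:
                        cycles += 1;
                }
            }
        }

        long cycles() { return cycles; }
        long instructions() { return instructions; }
    }

    public static void main(String[] args) {
        TimingEngine engine = new TimingEngine();
        engine.run(new ListTraceSource(Arrays.asList(
                new TraceRecord(0x400000L, OpKind.ALU),
                new TraceRecord(0x400004L, OpKind.LOAD))));
        System.out.println(engine.instructions() + " instructions, "
                + engine.cycles() + " cycles");
    }
}
```

Because the timing engine depends only on the `TraceSource` interface, a different emulator can be swapped in without touching the timing code, which is the property the decoupling is meant to provide.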
Testing Environment: Tejas has been tested on Linux, Windows, and OSX; results are reported primarily for the Linux platform.
High Performance: Sustains high simulation throughput across different workloads by optimizing data communication between the emulator and the timing simulator.
Architectural Features: Supports non-uniform caches, out-of-order execution pipelines, complex networks, and GPGPUs (General-Purpose Graphics Processing Units).
Transfer Mechanisms: Efficient transfer of execution traces from emulators to the timing simulator via various methods (shared memory, files, pipes, and network sockets).
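As a rough illustration of the pipe-based option, the sketch below streams fixed-size trace packets from a producer thread (standing in for the emulator) to a consumer (standing in for the timing simulator) over an in-process pipe. The packet layout (8-byte instruction pointer, 1-byte kind, 8-byte payload), the end-of-trace sentinel, and all class and method names are assumptions made for this example; they are not Tejas's actual wire format.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

// Hypothetical packet layout, not Tejas's wire format:
// 8-byte instruction pointer, 1-byte kind, 8-byte payload.
final class TracePipeSketch {
    // Sentinel marking the end of the trace; real IPs must not equal it.
    static final long END_OF_TRACE = -1L;

    // "Emulator" side: write fixed-size packets, then the sentinel.
    static void produce(DataOutputStream out, long[] ips) throws IOException {
        for (long ip : ips) {
            out.writeLong(ip);
            out.writeByte(0);        // kind 0 = memory read (made up)
            out.writeLong(ip * 2);   // payload: a fake memory address
        }
        out.writeLong(END_OF_TRACE);
        out.flush();
    }

    // "Timing simulator" side: read packets until the sentinel arrives.
    static int consume(DataInputStream in) throws IOException {
        int packets = 0;
        while (in.readLong() != END_OF_TRACE) {
            in.readByte();
            in.readLong();
            packets++;
        }
        return packets;
    }

    // Connects both sides through a pipe; returns the packets received.
    static int transfer(long[] ips) {
        try {
            PipedOutputStream sink = new PipedOutputStream();
            PipedInputStream source = new PipedInputStream(sink, 4096);
            Thread producer = new Thread(() -> {
                try (DataOutputStream out = new DataOutputStream(sink)) {
                    produce(out, ips);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();
            int received;
            try (DataInputStream in = new DataInputStream(source)) {
                received = consume(in);
            }
            producer.join();
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("packets received: "
                + transfer(new long[] {0x400000L, 0x400004L, 0x400008L}));
    }
}
```

The same `produce` and `consume` methods would work unchanged over a file or a socket stream, since they depend only on `DataOutputStream` and `DataInputStream`.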
Execution Speed: On average across benchmarks, Tejas is 25% faster than Multi2Sim and 220% faster than Gem5.
Attention to Efficiency: Uses static analysis and optimized packaging of trace data to make instruction transfer and processing efficient.
Validation Methodology: Tejas’s simulated execution time is compared against the execution time measured on native hardware.
Error Rates:
Serial Workloads: Average error of 11.45%.
Parallel Workloads: Average error of 18.77%, attributed to OS-induced scheduling jitter.
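The error figures above can be read as a mean of per-benchmark errors. The sketch below assumes the error is the absolute relative difference between simulated and native execution time, expressed in percent; the summary does not state the exact formula, so this definition, along with the class and method names, is an assumption.

```java
// Assumed definition: |simulated - native| / native * 100, averaged
// over benchmarks. The source does not spell out the formula.
final class ValidationErrorSketch {
    // Absolute relative error of one benchmark, in percent.
    static double percentError(double simulated, double nativeTime) {
        if (nativeTime <= 0) {
            throw new IllegalArgumentException("native time must be positive");
        }
        return Math.abs(simulated - nativeTime) / nativeTime * 100.0;
    }

    // Arithmetic mean of the per-benchmark percent errors.
    static double meanPercentError(double[] simulated, double[] nativeTimes) {
        if (simulated.length != nativeTimes.length || simulated.length == 0) {
            throw new IllegalArgumentException("need equal, non-empty arrays");
        }
        double sum = 0.0;
        for (int i = 0; i < simulated.length; i++) {
            sum += percentError(simulated[i], nativeTimes[i]);
        }
        return sum / simulated.length;
    }

    public static void main(String[] args) {
        // Made-up timings in seconds, for illustration only.
        double[] simulated = {110.0, 80.0};
        double[] nativeTimes = {100.0, 100.0};
        System.out.println("mean error (%): "
                + meanPercentError(simulated, nativeTimes));
    }
}
```

Taking the absolute value keeps overestimates and underestimates from cancelling each other in the average.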
Performance Assessment: Simulation speed is reported in KIPS (kilo-instructions per second); by this measure Tejas is significantly faster than both Gem5 and Multi2Sim.
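For reference, KIPS is the number of simulated instructions divided by wall-clock seconds, divided by 1000. The sketch below also converts two KIPS figures into an "X% faster" statement, assuming that phrase means the throughput ratio minus one; under that reading, "220% faster" would correspond to 3.2 times the throughput. The class name, method names, and input numbers are hypothetical.

```java
// Hypothetical helper; the "percent faster" reading is an assumption.
final class SimSpeedSketch {
    // Kilo-instructions simulated per wall-clock second.
    static double kips(long instructions, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return instructions / seconds / 1000.0;
    }

    // How much faster A is than B, in percent: (A / B - 1) * 100.
    static double percentFaster(double kipsA, double kipsB) {
        if (kipsB <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (kipsA / kipsB - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Made-up measurements, for illustration only.
        double a = kips(6_400_000L, 2.0);
        double b = kips(2_000_000L, 2.0);
        System.out.println("A: " + a + " KIPS, B: " + b + " KIPS");
        System.out.println("A is faster than B by (%): " + percentFaster(a, b));
    }
}
```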
Accuracy Comparison: Tejas predicts execution time more accurately than other established simulators such as MARSS, Sniper, and FastMP.
Innovation: Tejas advances architectural simulation by offering platform independence while retaining high accuracy and performance. It provides a rich set of architectural features in fewer lines of code than its competitors.
Future Work: Planned enhancements include DRAM modeling and transactional memory simulation.
VISA: A custom RISC (Reduced Instruction Set Computer) instruction set with several instruction types (ALU operations, memory loads/stores, control flow).
Process Overview:
Riscify: A static mapping of CISC instructions to VISA.
Fuse: Dynamic adjustment of instructions during benchmarking through packet exchanges.
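The two steps above can be sketched as a static lookup followed by a run-time fill-in. This is a loose illustration under stated assumptions: the translation table, the instruction-form names, and the reading of Fuse as "attach a run-time memory address from an emulator packet to the statically translated micro-ops" are guesses made for the example, not Tejas's actual VISA encoding or protocol.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: opcode names, table contents, and the packet
// shape are assumptions, not Tejas's actual VISA or wire format.
final class RiscifyFuseSketch {
    enum VisaOp { LOAD, ALU, STORE, BRANCH }

    static final class VisaInstr {
        final VisaOp op;
        long address = -1L;   // filled in by fuse() for LOAD/STORE
        VisaInstr(VisaOp op) { this.op = op; }
    }

    // Riscify: static map from a CISC instruction form to VISA micro-ops.
    static final Map<String, List<VisaOp>> TABLE = new HashMap<>();
    static {
        // "add [mem], reg" reads memory, adds, and writes the result back.
        TABLE.put("add_mem_reg",
                Arrays.asList(VisaOp.LOAD, VisaOp.ALU, VisaOp.STORE));
        TABLE.put("add_reg_reg", Arrays.asList(VisaOp.ALU));
        TABLE.put("jmp", Arrays.asList(VisaOp.BRANCH));
    }

    static List<VisaInstr> riscify(String ciscForm) {
        List<VisaOp> ops = TABLE.get(ciscForm);
        if (ops == null) {
            throw new IllegalArgumentException("unknown form: " + ciscForm);
        }
        List<VisaInstr> out = new ArrayList<>();
        for (VisaOp op : ops) {
            out.add(new VisaInstr(op));
        }
        return out;
    }

    // Fuse: attach a run-time memory address (as carried in an emulator
    // packet) to every memory micro-op of the translated instruction.
    static List<VisaInstr> fuse(List<VisaInstr> translated, long runtimeAddress) {
        for (VisaInstr instr : translated) {
            if (instr.op == VisaOp.LOAD || instr.op == VisaOp.STORE) {
                instr.address = runtimeAddress;
            }
        }
        return translated;
    }

    public static void main(String[] args) {
        for (VisaInstr instr : fuse(riscify("add_mem_reg"), 0x1000L)) {
            System.out.println(instr.op + " address=" + instr.address);
        }
    }
}
```

Splitting the work this way means the expensive translation runs once per static instruction, while only the small dynamic values need to cross from the emulator at run time.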
Detailed simulation parameters are given for each platform (Linux, Windows, OSX), along with the configurations used and the latest speed comparisons.