This project is a configurable two-level cache simulator that models typical computer-architecture cache behavior using the Least Recently Used (LRU) replacement policy. It supports:
- Separate L1 and optional L2 data caches
- Write-back, write-allocate memory behaviour
- A stream-buffer prefetcher (parameterised by number of buffers N and blocks per buffer M)
- Detailed statistics output for reads, writes, misses, write-backs and miss-rates
The simulator is implemented in modern C++ (C++11) and can be built with any recent g++
compiler.
Path | Purpose |
---|---|
sim.cpp |
main() – parses CLI, drives the simulation |
cache_class_adi.cpp |
Cache implementation (sets, blocks, LRU logic, write-back handling) |
stream_buffer_class_adi.cpp |
Simple sequential stream-buffer prefetcher |
gcc_trace.txt |
Example memory-access trace (⩾ 100 k lines) |
Makefile |
Convenience targets to build / clean the project |
The generated
sim
/simulate.exe
binaries are not source-controlled – you can always rebuild them with the Makefile.
The Makefile targets Linux, macOS and Windows (via MinGW or WSL). Simply run:
make # builds `sim` (or `sim.exe` on Windows)
Flags can be adjusted via the variables at the top of the Makefile (OPT
, WARN
, STD
, …).
To clean up object files and the simulator binary:
make clean # remove objects + binary
make clobber # remove objects only, keep binary
./sim <BLOCKSIZE> <L1_SIZE> <L1_ASSOC> <L2_SIZE> <L2_ASSOC> <PREF_N> <PREF_M> <TRACE_FILE>
Parameter definition
Pos | Argument | Description |
---|---|---|
1 | BLOCKSIZE |
Block size (bytes) – must be a power of two |
2 | L1_SIZE |
Total size of the L1 cache (bytes) |
3 | L1_ASSOC |
Set-associativity of the L1 cache |
4 | L2_SIZE |
Total size of the L2 cache (bytes). Use 0 to disable L2. |
5 | L2_ASSOC |
Associativity of the L2 cache (ignored when L2_SIZE = 0 ) |
6 | PREF_N |
Number of stream buffers (prefetch queues). Set 0 to disable prefetching. |
7 | PREF_M |
Number of blocks to prefetch per buffer |
8 | TRACE_FILE |
Path to a trace file – one operation per line (r <addr> / w <addr> ). |
Example – simulate a 32 KiB 8-way L1 and 256 KiB 8-way L2, 64-byte blocks, with 4 stream buffers each pre-fetching 8 blocks, using the bundled trace:
./sim 64 32768 8 262144 8 4 8 gcc_trace.txt
On completion the program prints a statistics block similar to:
===== Measurements =====
a. L1 reads: 123456
b. L1 read misses: 7890
c. L1 writes: 65432
d. L1 write misses: 321
e. L1 miss rate: 0.1234
f. L1 writebacks: 456
g. L2 reads (demand): 8211
h. L2 read misses (demand): 300
i. L2 writes: 777
j. L2 write misses: 50
k. L2 miss rate: 0.0365
l. L2 writebacks: 200
If L2_SIZE
is 0, only the first six L1 metrics are shown.
- LRU replacement is tracked with an incrementing
global_counter
, storing the most-recent use per block. - Write-back caches defer memory writes until a dirty block is evicted. Evictions from the last-level cache are counted as write-backs.
- The stream-buffer prefetcher allocates a new buffer for every miss, enqueues the next M sequential blocks and issues prefetches when the CPU accesses the head of the queue. This very lightweight design is located in
stream_buffer_class_adi.cpp
. - All bit-mask calculations for tags/indices rely on the block and set counts being powers of two.
Ideas for further exploration:
- Different replacement policies – Random, FIFO, Pseudo-LRU, etc.
- Non-inclusive / exclusive cache hierarchies.
- Write-through and/or write-no-allocate modes.
- Advanced prefetchers – stride, correlation, Delta-correlating, etc.
- Instruction/Data split or unified caches.
- Cache coherence models for multi-processor traces.
Feel free to fork and experiment!
This project is released under the MIT License – see LICENSE
for details.