ECE/CS 472/572 Final Exam Project

Submit your final project to Canvas by Thursday, June 10th, at 11:59pm.

Late submissions will be penalized 15 points per


Remember to check the errata section (at the very bottom of the page)

for updates.

Your submission should be comprised of two items: a .pdf file containing

your written report and a .tar file containing a directory structure with your C

or C++ source code. Your grade will be reduced if you do not follow the

submission instructions.

All written reports (for both 472 and 572 students) must be composed in MS

Word, LaTeX, or some other word processor and submitted as a PDF file.

Please take the time to read this entire document. If you have questions there

is a high likelihood that another section of the document provides answers.


In this final project you will implement a cache simulator. Your simulator will

be configurable and will be able to handle caches with varying capacities,

block sizes, levels of associativity, replacement policies, and write policies.

The simulator will operate on trace files that indicate memory access

properties. All input files to your simulator will follow a specific structure so

that you can parse the contents and use the information to set the properties

of your simulator.

After execution is finished, your simulator will generate an output file

containing information on the number of cache misses, hits, and miss

evictions (i.e. the number of block replacements). In addition, the file will also

record the total number of (simulated) clock cycles used during the simulation.

Lastly, the file will indicate how many read and write operations were

requested by the CPU.

It is important to note that your simulator is required to make several

significant assumptions for the sake of simplicity.

1. You do not have to simulate the actual data contents. We simply

pretend that we copied data from main memory and keep track of

the hypothetical time that would have elapsed.

2. Accessing a sub-portion of a cache block takes the exact same

time as it would require to access the entire block. Imagine that you

are working with a cache that uses a 32 byte block size and has an

access time of 15 clock cycles. Reading a 32 byte block from this

cache will require 15 clock cycles. However, the same amount of

time is required to read 1 byte from the cache.

3. In this project assume that main memory RAM is always accessed

in units of 8 bytes (i.e. 64 bits at a time).

When accessing main memory, it’s expensive to access the first

unit. However, DDR memory typically includes buffering which

means that the RAM can provide access to the successive memory

(in 8 byte chunks) with minimal overhead. In this project we assume

an overhead of 1 additional clock cycle per contiguous unit.

For example, suppose that it costs 255 clock cycles to access the

first unit from main memory. Based on our assumption, it would

only cost 257 clock cycles to access 24 bytes of memory.

4. Assume that all caches utilize a “fetch-on-write” scheme if a miss

occurs on a Store operation. This means that you must always

fetch a block (i.e. load it) before you can store to that location (if

that block is not already in the cache).
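To make assumption 3 concrete, the main-memory cost calculation can be sketched as below. The function name and the assumption that the access starts on an 8-byte unit boundary are mine, not part of the spec:

```cpp
#include <cstdint>

// Cycles to read N contiguous bytes from main memory, assuming the
// access begins on an 8-byte unit boundary. The first unit costs the
// full latency from the configuration file; each additional contiguous
// unit adds only 1 cycle (the DDR buffering assumption above).
uint64_t mainMemoryCycles(uint64_t bytes, uint64_t firstUnitCost) {
    const uint64_t kUnit = 8;                      // RAM unit size in bytes
    uint64_t units = (bytes + kUnit - 1) / kUnit;  // round up to whole units
    if (units == 0) units = 1;                     // at least one unit is read
    return firstUnitCost + (units - 1);            // +1 cycle per extra unit
}
```

With a 255-cycle first-unit cost, 24 bytes span 3 units, giving 255 + 2 = 257 cycles, matching the example above.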

Additional Resources

Sample trace files

Students are required to simulate the instructor-provided trace files (although

you are welcome to simulate your own files in addition).

Trace files are available on Flip in the following directory:


You should test your code with all three tracefiles in that directory (gcc,

netpath, and openssl).

Starter Code

In order to help you focus on the implementation of the cache simulator,

starter code is provided (written in C++) to parse the input files and handle

some of the file I/O involved in this assignment. You are not required to use

the supplied code (it’s up to you). Note that you will need to adapt this code to

work with your particular design.

The starter code is available for download: mplatev5.zip

Basic-Mode Usage (472 & 572 students)

L1 Cache Simulator

All students are expected to implement the L1 cache simulator. Students who

are enrolled in 472 can ignore the sections that are written in brown text.

Graduate students will be simulating a multiple-level cache (an L1 cache, an

L2 cache, and even an L3 cache).

Input Information

Your cache simulator will accept two arguments on the command line: the file

path of a configuration file and the file path of a trace file containing a

sequence of memory operations. The cache simulator will generate an output

file containing the simulation results. The output filename will have “.out”

appended to the input filename. Additional details are provided below.

# example invocation of cache simulator

cache_sim ./resources/testconfig ./resources/simpletracefile

Output file written to ./resources/simpletracefile.out
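The output-file naming rule is mechanical enough to capture in a one-line helper (the function name is illustrative):

```cpp
#include <string>

// The output filename is simply the trace filename with ".out" appended.
std::string outputPath(const std::string& tracePath) {
    return tracePath + ".out";
}
```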

The first command line argument will be the path to the configuration file. This

file contains information about the cache design. The file will contain only

numeric values, each of which is on a separate line.
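Since the file is just one number per line in a fixed order, parsing it can be sketched as follows. The struct and field names are my own; only the order of the values (shown in the example below) comes from the assignment:

```cpp
#include <cstdint>
#include <istream>
#include <sstream>

// Configuration values in the order they appear in the file.
struct CacheConfig {
    uint64_t numLevels;         // first line; always 1 for 472 students
    uint64_t memAccessCycles;   // cycles to read/write a unit of main memory
    uint64_t numSets;           // number of sets (power of 2)
    uint64_t blockSize;         // block size in bytes (power of 2)
    uint64_t associativity;     // blocks per set
    uint64_t replacement;       // 0 = random, 1 = LRU
    uint64_t writePolicy;       // 0 = write-through, 1 = write-back
    uint64_t cacheAccessCycles; // cycles to read/write a block in the cache
};

// Reads from any istream so the sketch is easy to test on a string.
bool parseConfig(std::istream& in, CacheConfig& cfg) {
    return static_cast<bool>(in >> cfg.numLevels >> cfg.memAccessCycles
                                >> cfg.numSets >> cfg.blockSize
                                >> cfg.associativity >> cfg.replacement
                                >> cfg.writePolicy >> cfg.cacheAccessCycles);
}
```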

Example contents of a configuration file:

1    <-- this line will always contain a "1" for 472 students
230  <-- number of cycles required to write or read a unit from main memory
8    <-- number of sets in cache (will be a non-negative power of 2)
16   <-- block size in bytes (will be a non-negative power of 2)
3    <-- level of associativity (number of blocks per set)
1    <-- replacement policy (will be 0 for random replacement, 1 for LRU)
1    <-- write policy (will be 0 for write-through, 1 for write-back)
13   <-- number of cycles required to read or write a block from the cache (consider this to be the access time per block)

Here is another example configuration file specifying a direct-mapped cache with 64 entries, a 32 byte block size, associativity level of 1 (direct-mapped), least recently used (LRU) replacement policy, write-through operation, 26 cycles to read or write data to the cache, and 1402 cycles to read or write data to the main memory. CS/ECE 472 projects can safely ignore the first line.

1
1402
64
32
1
1
0
26

The second command line argument indicates the path to a trace file. This trace file will follow the format used by Valgrind (a memory debugging tool). The file consists of comments and memory access information. Any line beginning with the '=' character should be treated as a comment and ignored.

==This is a comment and can safely be ignored.
==An example snippet of a Valgrind trace file
I 04010173,3
I 04010176,6
 S 04222cac,1
I 0401017c,7
 L 04222caf,8
I 04010186,6
I 040101fd,7
 L 1ffefffd78,8
 M 04222ca8,4
I 04010204,4

Memory access entries will use the following format in the trace file:

[space]operation address,size

• Lines beginning with an 'I' character represent an instruction load. For this assignment, you can ignore instruction read requests and assume that they are handled by a separate instruction cache.
• Lines with a space followed by an 'S' indicate a data store operation.
  This means that data needs to be written from the CPU into the cache or main memory (possibly both) depending on the write policy.
• Lines with a space followed by an 'L' indicate a data load operation. Data is loaded from the cache into the CPU.
• Lines with a space followed by an 'M' indicate a data modify operation (which implies a special case of a data load, followed immediately by a data store).

The address is a 64-bit hexadecimal number representing the address of the first byte that is being requested. Note that leading 0's are not necessarily shown in the file. The size of the memory operation is indicated in bytes (as a decimal number).

If you are curious about the trace file, you may generate your own trace file by running Valgrind on arbitrary executable files:

valgrind --log-fd=1 --log-file=./tracefile.txt --tool=lackey --trace-mem=yes name_of_executable_to_trace

Cache Simulator Output

Your simulator will write output to a text file. The output filename will be derived from the trace filename with ".out" appended to the original filename. E.g. if your program was called using the invocation "cache_sim ./dm_config ./memtrace" then the output file would be written to "./memtrace.out".

(S)tore, (L)oad, and (M)odify operations will each be printed to the output file (in the exact order that they were read from the Valgrind trace file). Lines beginning with "I" should not appear in the output since they do not affect the operation of your simulator.

Each line will have a copy of the original trace file instruction. There will then be a space, followed by the number of cycles used to complete the operation. Lastly, each line will have one or more statements indicating the impact on the cache. This could be one or more of the following: miss, hit, or eviction.

Note that an eviction is what happens when a cache block needs to be removed in order to make space in the cache for another block. It is simply a way of indicating that a block was replaced.
In our simulation, an eviction means that the next instruction cannot be executed until after the existing cache block is written to main memory. An eviction is an expensive cache operation.

It is possible that a single memory access has multiple impacts on the cache. For example, if a particular cache index is already full, a (M)odify operation might miss the cache, evict an existing block, and then hit the cache when the result is written to the cache.

The general format of each output line (for 472 students) is as follows (and will contain one or more cache impacts):

operation address,size L1 <...>
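Parsing one line of the trace format described above can be sketched as follows. The struct and function names are mine; the format (comments start with '=', 'I' lines are ignored, data lines are " S/L/M address,size") comes from the assignment:

```cpp
#include <cstdint>
#include <string>

// One parsed data operation from the trace file.
struct TraceOp {
    char op;          // 'S', 'L', or 'M'
    uint64_t address; // address of the first byte requested (hex in file)
    uint64_t size;    // size of the access in bytes (decimal in file)
};

// Returns true and fills `out` for data lines; returns false for
// comments ('='), instruction loads ('I'), and blank/unknown lines.
bool parseTraceLine(const std::string& line, TraceOp& out) {
    if (line.empty() || line[0] == '=' || line[0] == 'I') return false;
    std::size_t pos = line.find_first_not_of(' ');
    if (pos == std::string::npos) return false;
    out.op = line[pos];
    if (out.op != 'S' && out.op != 'L' && out.op != 'M') return false;
    std::size_t comma = line.find(',', pos);
    if (comma == std::string::npos) return false;
    out.address = std::stoull(line.substr(pos + 1), nullptr, 16); // hex address
    out.size = std::stoull(line.substr(comma + 1));               // decimal size
    return true;
}
```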

The final lines of the output file are special. They will indicate the total

number of hits, misses, and evictions. The last line will indicate the total

number of simulated cycles that were necessary to simulate the trace file, as

well as the total number of read and write operations that were directly

requested by the CPU.

These lines should exactly match the following format (with your simulator's values filled in):

L1 Cache: Hits:<# of hits> Misses:<# of misses> Evictions:<# of evictions>
Cycles:<# of simulated cycles> Reads:<# of CPU read requests> Writes:<# of CPU write requests>
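Producing those final lines can be sketched as a small formatting helper. This assumes the two-line layout above with no space after each value's colon (matching the worked example later in this document); the function name is mine:

```cpp
#include <cstdint>
#include <string>

// Builds the two summary lines that end the output file.
std::string summaryLines(uint64_t hits, uint64_t misses, uint64_t evictions,
                         uint64_t cycles, uint64_t reads, uint64_t writes) {
    return "L1 Cache: Hits:" + std::to_string(hits) +
           " Misses:" + std::to_string(misses) +
           " Evictions:" + std::to_string(evictions) + "\n" +
           "Cycles:" + std::to_string(cycles) +
           " Reads:" + std::to_string(reads) +
           " Writes:" + std::to_string(writes) + "\n";
}
```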

In order to illustrate the output file format let’s look at an example. Suppose

we are simulating a direct-mapped cache operating in write-through mode.

Note that the replacement policy does not have any effect on the operation of

a direct-mapped cache. Assume that the configuration file told us that it takes

13 cycles to access the cache and 230 cycles to access main memory. Keep

in mind that a hit during a load operation only accesses the cache while a

miss must access both the cache and the main memory. For this scenario,

assume that memory access is aligned to a single block and does not straddle

multiple cache blocks.

In this example the cache is operating in write-through mode so a standalone

(S)tore operation takes 243 cycles, even if it is a hit, because we always

write the block into both the cache and into main memory. If this particular

cache was operating in write-back mode, a (S)tore operation would take only

13 cycles if it was a hit (since the block would not be written into main memory

until it was evicted).
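Using the numbers from this scenario (13-cycle cache, 230-cycle main memory, access within a single memory unit), the cost of a store hit can be sketched as follows; the function name is mine:

```cpp
#include <cstdint>

// Cycle cost of a store that *hits* the cache. Write-through updates
// both the cache and main memory on every store; write-back touches
// only the cache (the memory write is deferred until eviction).
uint64_t storeHitCycles(bool writeBack, uint64_t cacheCycles, uint64_t memCycles) {
    return writeBack ? cacheCycles : cacheCycles + memCycles;
}
```

With the example configuration, a write-through store hit costs 13 + 230 = 243 cycles, while a write-back store hit costs only 13.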

The exact details of whether an access is a hit or a miss are entirely dependent

on the specific cache design (block size, level of associativity, number of sets,

etc). Your program will implement the code to see if each access is a hit,

miss, eviction, or some combination.
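One common way to make that determination is to split each address into a block offset, a set index, and a tag; an access hits when the tag matches a valid block stored at that index. A sketch (struct and function names are mine), relying on the configuration's guarantee that block size and set count are powers of two:

```cpp
#include <cstdint>

// Decomposition of an address for a cache with the given geometry.
struct AddressParts {
    uint64_t tag;    // identifies which block occupies a cache entry
    uint64_t index;  // which set the block maps to
    uint64_t offset; // byte position within the block
};

AddressParts splitAddress(uint64_t addr, uint64_t blockSize, uint64_t numSets) {
    AddressParts p;
    p.offset = addr % blockSize;             // low bits: byte within block
    p.index  = (addr / blockSize) % numSets; // middle bits: set index
    p.tag    = addr / blockSize / numSets;   // remaining high bits
    return p;
}
```

For example, with 16-byte blocks and 8 sets, addresses 04222cac and 04222caf share a block (same tag and index, different offsets).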

Since the (M)odify operation involves a Load operation (immediately followed

by a Store operation), it is recorded twice in the output file. The first instance

represents the load operation and the next line will represent the store

operation. See the example below…

==For this example we assume that addresses 04222cac, 04222caf, and 04222ca8 are all in the same block at index 2
==Assume that addresses 047ef249 and 047ef24d share a block that also falls at index 2.
==Since the cache is direct-mapped, only one of those blocks can be in the cache at a time.
==Fortunately, address 1ffefffd78 happens to fall in a different block index (in our hypothetical example).
==Side note: For this example a store takes 243 cycles (even if it was a hit) because of the write-through behavior.
==The output file for our hypothetical example:

S 04222cac,1 486 L1 miss <-- (243 cycles to fetch the block and write it to L1) + (243 cycles to update the cache & main memory)
L 04222caf,8 13 L1 hit
M 1ffefffd78,8 243 L1 miss <-- notice that this (M)odify has a miss for the load and a hit for the store
M 1ffefffd78,8 243 L1 hit <-- this line represents the Store of the modify operation
M 04222ca8,4 13 L1 hit <-- notice that this (M)odify has two hits (one for the load, one for the store)
M 04222ca8,4 243 L1 hit <-- this line represents the Store of the modify operation
S 047ef249,4 486 L1 miss eviction <-- 486 cycles for miss, no eviction penalty for write-through cache
L 04222caf,8 243 L1 miss eviction
M 047ef24d,2 243 L1 miss eviction <-- notice that this (M)odify initially misses, evicts the block, and then hits
M 047ef24d,2 243 L1 hit <-- this line represents the Store of the modify operation
L 1ffefffd78,8 13 L1 hit
M 047ef249,4 13 L1 hit
M 047ef249,4 243 L1 hit
L1 Cache: Hits:8 Misses:5 Evictions:3
Cycles:2725 Reads:7 Writes:6 <-- total sum of simulated cycles (from above), as well as the number of reads and writes that were requested by the CPU

NOTE: The example above is assuming that the cache has a block size of at least 8 bytes. Simulating a cache with a smaller blocksize would affect the timing and could also lead to multiple evictions in a single cache access. For example, if the blocksize was only 4 bytes it's possible that an 8 byte load might evict 3 different blocks. This happens because the data might straddle two or more blocks (depending on the starting memory address).

Sample Testing Information

Some students have asked for additional test files with "known" results that they can compare against. I've created my own implementation of the cache simulator and provided students with the following files (and results).

Note: These files are not an exhaustive representation of the testing that your cache will undergo. It is your job to independently test your code and verify proper behavior.
Examples that utilize an L1 cache:

• Sample 1
  o sample1_config (download)
  o sample1_trace (download)
  o sample1_trace.out (download)
• Sample 2
  o sample2_config (download)
  o sample2_trace (download)
  o sample2_trace.out (download)

(572) Examples that utilize at least 2 caches:

• Sample 1
  o multi_sample1_config (download)
  o multi_sample1_trace (download)
  o multi_sample1_trace.out (download)
• (planning to add more)

Facts and Questions (FAQ):

• Your "random" cache replacement algorithm needs to be properly seeded so that multiple runs of the same tracefile will generate different results.
• I will never test your simulator using a block size that is smaller than 8 bytes.
• During testing, the cache will not contain more than 512 indexes.
• For our purposes the level of associativity could be as small as N=1 (direct mapped) or as large as N=64.
• The last line of your output will indicate the total number of simulated cycles that were necessary to simulate the trace file, as well as the total number of read and write operations that were directly requested by the CPU. In other words, this is asking how many loads and stores the CPU directly requested (remember that a Modify operation counts as both a Load and a Store).
• 572 students: For our purposes an L2 cache will always have a block size that is greater than or equal to the L1 block size. The L3 block size will be greater than or equal to the L2 block size.

Implementation Details

You may use either the C or the C++ programming language. Graduate students will have an additional component to this project.

In our simplified simulator, increasing the level of associativity has no impact on the cache access time. Furthermore, you may assume that it does not take any additional clock cycles to access non-data bits such as Valid bits, Tags, Dirty bits, LRU counters, etc.

Your code must support the LRU replacement scheme and the random replacement scheme.
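The two replacement schemes can be sketched as follows. All names and the stamp-based LRU design are mine; any correct true-LRU implementation is acceptable. Note the seeding from std::random_device, so that separate runs of the same tracefile can differ (per the FAQ):

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Random replacement: seeded per-run so results vary between runs.
std::mt19937 rng{std::random_device{}()};

std::size_t randomVictim(std::size_t associativity) {
    std::uniform_int_distribution<std::size_t> pick(0, associativity - 1);
    return pick(rng);
}

// True LRU via a monotonically increasing access stamp: call touch()
// on every read or write of a way; the victim is the smallest stamp,
// i.e. the way that every other way has been accessed after.
struct LRUSet {
    std::vector<uint64_t> lastUse; // one stamp per way
    uint64_t clock = 0;
    explicit LRUSet(std::size_t ways) : lastUse(ways, 0) {}
    void touch(std::size_t way) { lastUse[way] = ++clock; }
    std::size_t victim() const {
        std::size_t v = 0;
        for (std::size_t i = 1; i < lastUse.size(); ++i)
            if (lastUse[i] < lastUse[v]) v = i;
        return v;
    }
};
```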
For the LRU behavior, a block is considered to be the Least Recently Used if every other block in the cache has been read or written after the block in question. In other words, your simulator must implement a true LRU scheme, not an approximation.

You must implement the write-through cache mode. You will receive extra credit if your code correctly supports the write-back cache mode (specified in the configuration file).

Acceptable Compiler Versions

The flip server provides GCC 4.8.5 for compiling your work. Unfortunately, this version is from 2015 and may not support newer C and C++ features. If you call the program using "gcc" (or "g++") this is the version you will be using by default.

If you wish to use a newer compiler version, I have compiled a copy of GCC 10.3 (released April 8, 2021). You may write your code using this compiler and you're allowed to use any of the compiler features that are available. The compiler binaries are available in the path:

/nfs/farm/classes/eecs/spring2021/cs472/public/gcc/bin

For example, in order to compile a C++ program with GCC 10.3, you could use the following command (on a single terminal line):

/nfs/farm/classes/eecs/spring2021/cs472/public/gcc/bin/g++ -ocache_sim -Wl,-rpath,/nfs/farm/classes/eecs/spring2021/cs472/public/gcc/lib64 my_source_code.cpp

If you use the Makefile that is provided in the starter code, it is already configured to use GCC 10.3.

L2/L3 Cache Implementation (required for CS/ECE 572 students)

Implement your cache simulator so that it can support up to 3 layers of cache. You can imagine that these caches are connected in a sequence. The CPU will first request information from the L1 cache. If the data is not available, the request will be forwarded to the L2 cache. If the L2 cache cannot fulfill the request, it will be passed to the L3 cache. If the L3 cache cannot fulfill the request, it will be fulfilled by main memory.
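The forwarding chain just described can be sketched as simple cycle accounting for a load. This is a simplification that ignores eviction and write-back traffic; the function and parameter names are mine:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Cycles for a load when the hit occurs at level `hitLevel`
// (0 = L1, 1 = L2, ..., levelCycles.size() = serviced by main memory).
// Every level probed on the way down charges its access time; a miss
// at all levels also charges the main-memory cost. For example, with
// a single 13-cycle L1 and 230-cycle memory, an L1 miss costs
// 13 + 230 = 243, matching the single-level example earlier.
uint64_t loadCycles(const std::vector<uint64_t>& levelCycles,
                    std::size_t hitLevel, uint64_t memCycles) {
    uint64_t total = 0;
    for (std::size_t i = 0; i < levelCycles.size() && i <= hitLevel; ++i)
        total += levelCycles[i];
    if (hitLevel >= levelCycles.size()) total += memCycles;
    return total;
}
```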
It is important that the properties of each cache are read from the provided configuration file. As an example, it is possible to have a direct-mapped L1 cache that operates in cohort with an associative L2 cache. All of these details will be read from the configuration file.

As with any programming project, you should be sure to test your code across a wide variety of scenarios to minimize the probability of an undiscovered bug.

Cache Operation

When multiple layers of cache are implemented, the L1 cache will no longer directly access main memory. Instead, the L1 cache will interact with the L2 cache. During the design process, you need to consider the various interactions that can occur. For example, if you are working with three write-through caches, then a single write request from the CPU will update the contents of L1, L2, L3, and main memory!
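The all-write-through case above can be sketched as cycle accounting for a store: every level plus main memory is updated, so every access time is charged. This assumes all levels are write-through; the names are mine:

```cpp
#include <cstdint>
#include <vector>

// Store cost when every cache level operates in write-through mode:
// the data is written into L1, L2, L3 (whichever exist), and main
// memory, so each level's access time plus the memory cost is charged.
uint64_t writeThroughStoreCycles(const std::vector<uint64_t>& levelCycles,
                                 uint64_t memCycles) {
    uint64_t total = memCycles;
    for (uint64_t c : levelCycles) total += c;
    return total;
}
```

With a single 13-cycle L1 and 230-cycle memory this reduces to the 243-cycle write-through store from the earlier example.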
