Course Project - Instruction Set Simulator
Arch Opt/Research Details
This semester you will develop an instruction set simulator (ISS) for your own custom assembly language. Your ISS must be both pipelined (basic 5-stage pipeline) and cycle accurate (including stall cycles due to data and control hazards and any potential pipeline forwarding), and you will implement it in C++. Additionally, you will need to implement an optimization of your choice and finish with a technical research paper.
The layout of this project is very similar to the methodology used by researchers when optimizing system architectures. To test potential optimizations, the researcher must first obtain an ISS for the architecture being tested. Often, if a researcher is using a standard architecture, a simulator already exists and the researcher may bypass developing the ISS. However, in the case of a custom instruction set (as is the case with this project), the researcher must develop an ISS for the instruction set. After the testing environment is obtained and/or verified, the researcher can then modify the ISS to simulate architectural optimizations and measure their effects on the system. Finally, the researcher will culminate their work with a final paper describing their work and contributions.
For this project, you must work with 1-2 other people - only groups of 2 or 3 people. Absolutely no singles or groups larger than 3!
The project will be broken into 2 main parts for grading purposes:
- Assembler and ISS - Due Sunday, Oct 26 @ 5pm. Demos will proceed over the following 4 weeks.
- Architectural Optimization and Research Paper - Due Sunday, Dec 7 @ 5pm
In addition, all academic integrity and collaboration policies discussed on the main page apply to this project.
First you will need to create your own instruction set architecture for a 32-bit processor. This will include, but is not limited to, the following:
- Choosing the instruction set
- How many registers?
- Will you have a stack?
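As an illustration of how these choices interact, here is a minimal sketch of one possible 32-bit instruction word. Everything in it is an assumption made for the example, not a required format: the field layout (6-bit opcode, two 5-bit register fields, 16-bit immediate) and the names Fields, encode, and decode are hypothetical. Two 5-bit register fields imply 32 registers; choosing a different register count changes the field widths.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit instruction word (an assumption for this sketch):
//   | opcode: 6 bits | rd: 5 bits | rs: 5 bits | imm: 16 bits |
struct Fields {
    std::uint8_t opcode;  // 0..63
    std::uint8_t rd;      // destination register, 0..31
    std::uint8_t rs;      // source register, 0..31
    std::int16_t imm;     // signed 16-bit immediate
};

// Pack the fields into one 32-bit word.
inline std::uint32_t encode(const Fields& f) {
    assert(f.opcode < 64 && f.rd < 32 && f.rs < 32);
    return (std::uint32_t{f.opcode} << 26) |
           (std::uint32_t{f.rd} << 21) |
           (std::uint32_t{f.rs} << 16) |
           static_cast<std::uint16_t>(f.imm);  // low 16 bits, two's complement
}

// Unpack a 32-bit word back into its fields.
inline Fields decode(std::uint32_t word) {
    Fields f;
    f.opcode = static_cast<std::uint8_t>(word >> 26);
    f.rd = static_cast<std::uint8_t>((word >> 21) & 0x1F);
    f.rs = static_cast<std::uint8_t>((word >> 16) & 0x1F);
    f.imm = static_cast<std::int16_t>(static_cast<std::uint16_t>(word & 0xFFFF));
    return f;
}
```

A fixed-width layout like this keeps the decode stage simple, at the cost of limiting the immediate range and the number of opcodes.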
Basic Simulator Design:
You will need to create an Instruction Set Simulator (ISS) for your instruction set. Your simulator will read in a program file in intermediate format and execute the program. At program termination, you will need to print out any values that were calculated and stored in registers or memory locations to verify the functionality of your ISS. In addition, you must calculate the CPI and output that value upon program termination.
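The CPI calculation itself is just total cycles divided by completed instructions. The sketch below shows one way to track it; the names SimStats, cpi, and pipelineCycles are hypothetical. The cycle model in pipelineCycles (instruction count, plus the cycles needed to fill the pipeline, plus stall cycles) is an assumption about how cycles are counted, so check it against the expected CPI values before relying on it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters a simulator might update once per clock tick.
struct SimStats {
    std::uint64_t cycles = 0;        // every clock cycle, including fill and stall cycles
    std::uint64_t instructions = 0;  // instructions that completed
};

// Cycles per instruction at program termination.
inline double cpi(const SimStats& s) {
    assert(s.instructions > 0);
    return static_cast<double>(s.cycles) / static_cast<double>(s.instructions);
}

// Assumed cycle model for a simple in-order pipeline: the first instruction
// needs `stages` cycles, each later one completes a cycle after its
// predecessor, and every stall adds one cycle.
inline std::uint64_t pipelineCycles(std::uint64_t n, std::uint64_t stalls,
                                    unsigned stages = 5) {
    assert(stages >= 1);
    return n + (stages - 1) + stalls;
}
```

Under this model, 5 instructions with no stalls take 9 cycles (CPI 1.8), and 2 instructions with 1 stall take 7 cycles (CPI 3.5). Short programs therefore show a CPI well above 1 even when nothing stalls.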
Your simulator must use a 5-stage pipeline and must be cycle accurate, including handling all stall cycles due to data and control hazards, any potential pipeline forwarding to minimize stall cycles, and resource sharing (register file). The implementation details are up to you; however, do not implement any optimizations at this point. Your pipeline should resemble the pipeline in Appendix A. I will give you some sample programs, and your CPI should match my expected CPI for that code. If it doesn't, you will need to explain why, showing details in your simulator.
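To make the forwarding and stall requirements concrete, here is a minimal sketch of the two classic checks for a 5-stage pipeline: choosing a forwarding source for an operand in EX, and detecting a load-use hazard that forces a stall. The names PipeReg, Forward, forwardFor, and loadUseStall are hypothetical, and two details are assumptions you may not share: 32 registers, and register 0 being hardwired to zero (as in MIPS), which is why a destination of 0 is never forwarded.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kNumRegs = 32;  // assumption: 32 architectural registers

enum class Forward { None, FromExMem, FromMemWb };

// Hypothetical slice of a pipeline latch: only the fields the checks need.
struct PipeReg {
    bool regWrite = false;  // instruction in this latch writes a register
    bool memRead = false;   // instruction in this latch is a load
    std::uint8_t rd = 0;    // its destination register
};

// Pick where an EX-stage source operand should come from.
// EX/MEM is checked first because it holds the more recent result.
inline Forward forwardFor(std::uint8_t srcReg, const PipeReg& exMem,
                          const PipeReg& memWb) {
    assert(srcReg < kNumRegs);
    if (exMem.regWrite && exMem.rd != 0 && exMem.rd == srcReg)
        return Forward::FromExMem;
    if (memWb.regWrite && memWb.rd != 0 && memWb.rd == srcReg)
        return Forward::FromMemWb;
    return Forward::None;
}

// A load followed immediately by a consumer cannot be fixed by forwarding
// alone: the loaded value does not exist until the end of MEM, so the
// consumer in ID must stall for one cycle.
inline bool loadUseStall(const PipeReg& idEx, std::uint8_t rs1,
                         std::uint8_t rs2) {
    return idEx.memRead && idEx.rd != 0 && (idEx.rd == rs1 || idEx.rd == rs2);
}
```

In a simulator loop, checks like these would typically run once per cycle before the latches are advanced, with the stall count feeding the CPI calculation.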
For simplicity, you may assume that all ALU operations take 1 cycle to execute.
Don't forget, you are implementing a real architectural simulator, so make all
of your design choices keeping that in mind. For example, you will need to
read the binary in to either instruction memory or a ROM and then run the
program from that location.
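A minimal sketch of that idea follows: the program is first loaded into an instruction memory, and the fetch stage then reads from that memory by program counter. The intermediate format is yours to define, so the one assumed here (one 32-bit word per line, written in hexadecimal) is only an example, as are the names loadProgram and fetch and the byte-addressed, word-aligned PC.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read an intermediate-format program into instruction memory.
// Assumed format: one 32-bit instruction per line, in hex, blank lines ignored.
inline std::vector<std::uint32_t> loadProgram(std::istream& in) {
    std::vector<std::uint32_t> imem;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        imem.push_back(static_cast<std::uint32_t>(std::stoul(line, nullptr, 16)));
    }
    return imem;
}

// Fetch the instruction at a byte-addressed, word-aligned program counter.
inline std::uint32_t fetch(const std::vector<std::uint32_t>& imem,
                           std::uint32_t pc) {
    assert(pc % 4 == 0 && pc / 4 < imem.size());
    return imem[pc / 4];
}
```

Taking a std::istream rather than a file name lets the same function read from a std::ifstream in the simulator and from a std::istringstream in a unit test.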
Input to your ISS will be a program represented in intermediate format. As you do not want to have to write a program using opcodes and field values, you will need to write a separate assembler. This program will take as input a program that you will write in your own custom instruction set and output the program in intermediate format.
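As a rough sketch of what the assembler's core step might look like, the function below turns one line of assembly text into a 32-bit word. The syntax ("mnemonic rD, rS, imm"), the opcode numbers, the field layout (6-bit opcode, two 5-bit registers, 16-bit immediate), and the name assembleLine are all hypothetical; a real assembler for your instruction set will also need labels, other instruction formats, and comments.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Assemble one line such as "addi r1, r2, 4" into a 32-bit word.
// Assumed layout: | opcode:6 | rd:5 | rs:5 | imm:16 |
inline std::uint32_t assembleLine(const std::string& text) {
    // Hypothetical opcode table; the numbers are arbitrary.
    static const std::map<std::string, std::uint32_t> opcodes = {
        {"addi", 0x03}, {"lw", 0x10}, {"sw", 0x11}};

    std::string cleaned = text;
    for (char& c : cleaned)
        if (c == ',') c = ' ';  // treat commas as whitespace

    std::istringstream in(cleaned);
    std::string mnemonic, rd, rs;
    long imm = 0;
    if (!(in >> mnemonic >> rd >> rs >> imm))
        throw std::invalid_argument("malformed line: " + text);

    auto op = opcodes.find(mnemonic);
    if (op == opcodes.end())
        throw std::invalid_argument("unknown mnemonic: " + mnemonic);
    assert(op->second < 64);  // opcode must fit in 6 bits

    // Parse a register name of the form r0..r31.
    auto reg = [](const std::string& r) -> std::uint32_t {
        if (r.size() < 2 || r[0] != 'r')
            throw std::invalid_argument("bad register: " + r);
        unsigned long n = std::stoul(r.substr(1));
        if (n > 31) throw std::invalid_argument("bad register: " + r);
        return static_cast<std::uint32_t>(n);
    };

    return (op->second << 26) | (reg(rd) << 21) | (reg(rs) << 16) |
           (static_cast<std::uint32_t>(imm) & 0xFFFFu);
}
```

With this layout, "addi r1, r2, 4" assembles to 0x0C220004. Writing each word out as one hex line would be one simple choice of intermediate format.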
You will need to write programs to test the functionality of your ISS. I suggest you write these applications first in C/C++ and then manually translate them to your assembly language before running them through your assembler. Here is an example of a simple program to move memory around and associated assembly code for a sample instruction set. NOTE: Your assembly language will not look exactly like mine. The details of those files are dependent on your instruction set, this is just an example:
Since everyone's instruction sets will be different, I can't provide you with hard testbenches. However, here are the provided test cases that you will have to pass. I am including both C and ASM code, but you may need to slightly modify the assembly code so that the program will work in your instruction set. In fact, I will expect that you will change the ASM code, just make sure that the functionality is the same:
For the following test cases, I am supplying you with sample C code and you will need to create the associated assembly code for testing:
In addition, you will need to create 2 more benchmarks to test your code.
I will verify the functionality of your ISS using both the CPI and the results that you output at program completion.
load_delay_1.asm CPI = 3.5
load_delay_2.asm CPI = 3.5
load_delay_3.asm CPI = 3.5
load_reg_1.asm CPI = 1.8
load_reg_2.asm CPI = 1.8
load_reg_3.asm CPI = 1.8
alu_reg_1.asm CPI = 1.8
alu_reg_2.asm CPI = 1.8
alu_reg_3.asm CPI = 1.8
ex-mem_to_ex_1.asm CPI = 2.33
ex-mem_to_ex_2.asm CPI = 2.33
ex-mem_to_ex_3.asm CPI = 2.33
mem-wb_to_ex_1.asm CPI = 2.33
mem-wb_to_ex_2.asm CPI = 2.33
mem-wb_to_ex_3.asm CPI = 2.33
mem-wb_to_ex_4.asm CPI = 2.33
mem-wb_to_ex_5.asm CPI = 2.33
mem-wb_to_ex_6.asm CPI = 2.33
st_mem-wb_mem.asm CPI = 2.33
st_addr_ex-mem_ex.asm CPI = 2.33
st_addr_mem-wb_ex.asm CPI = 2.33
ld_addr_ex-mem_ex.asm CPI = 2.33
ld_addr_mem-wb_ex.asm CPI = 2.33
ld_st_mem-wb_mem.asm CPI = 2.33
big_test.asm CPI = 1.8
branch_1.asm CPI = 3 (same if branch is evaluated in 2nd or 3rd cycle) - register file must show r1 = 0 and r5 = 6 ONLY
branch_2.asm CPI = 2.5 if branch evaluated in 3rd cycle, CPI = 2.25 if branch evaluated in the 2nd cycle - register file must show r1 = 0 and r5 = 6 ONLY
What to turn in:
You will need to email me a tarred and zipped file (or a Winzipped or RARed file) containing the following:
- All of your code for the simulator and the assembler
- Precompiled executables for the simulator and assembler
- All of the benchmarks in both assembly code and in your intermediate format. This means that for EVERY asm file provided above you must turn in a file that is the input to your simulator. Make sure all code is pre-assembled to make the testing go faster. Plus you must turn in the assembly code and intermediate format code for matrix, gcd, and your two custom benchmarks.
- Detailed log file for each partner showing their contribution. I expect this to be very DETAILED, with specific work hours logged and the tasks performed during those hours. You will need one log file for each person. Here is a sample.
After the due date, each group will need to make an appointment to come to my office to demo your ISS. You will have no more than 30 minutes to give me a demo of the above programs and your own custom benchmarks to verify your ISS.
NOTE: Group members will not necessarily all get the same grade. The purpose of the demo will be to make sure you have implemented the assignment correctly and for me to assess who did the work. You will need to prove to me that all partners shared equal responsibility and that each person knows the internal workings of the code. I will have a set of questions and I will alternate who answers each question. At the end of the interview, I will then assign a percentage weight between the partners for work done. Your goal is just to convince me that all participated equally. Your log files will help tremendously with this. Poor log files will result in poor grades.
Your grade will be based on the following:
- Passing the provided test cases (memmove, array, mult, gcd, and matrix) and surprise test vectors (same program, different inputs). Before demoing the tests where I provide assembly code, you will need to tell me if/how you changed my code so that it worked with your instruction set
- Providing 2 useful custom benchmarks and passing your test vectors
- Organization of demo
- Organization of code including good standard programming principles and sufficient comments/documentation
- Completeness of the logfile and even delegation of the work.
During the demo you will be demoing a subset of the provided benchmarks and your custom benchmarks. The subset will be chosen randomly during the demo. You are only given a fixed period of time to demo your code, and if we don't get through all of the tests, you will not get points for the ones that you don't get to. Thus, it is imperative that you follow the directions above on what to turn in (all assembly code must be preassembled so that we can avoid that step during the demo). I will not give you time to modify assembly code, nor will I give you time to assemble the code (although I will have you assemble some code to show the functionality of your assembler). I am not trying to be mean; I just have many groups to get through and it is very important that we finish in the allotted time.
After you have developed your simulator testing environment, you will need to modify your ISS to implement an architectural optimization. You will get to choose which optimization you would like to explore.
You will first choose the optimization and then do a survey of existing techniques. This survey will be the related work section of your research paper. During your survey, you will think about how an existing method can be modified to improve the technique or combined with another technique to produce improved results. This extension to an existing technique is not necessary; you can just implement the basic optimization if you choose.
You will also need to implement your optimization and if you extend an existing technique, you will need to implement the base technique for comparison purposes. NOTE: You will need to get instructor approval of your optimization technique. NOTE: you will need to add additional functionality to your simulator for this part.
You will need to submit a 1-page document detailing your proposed optimization
for instructor approval. The document should be formatted like a standard
research paper (see research paper details
below). Your proposal needs to contain the following sections, at the minimum:
- Summary of proposed optimization
- Related work
- Projected changes necessary to ISS
- Evaluation methodology
- Anticipated results
Quantifying Your Optimization
You will need to use an appropriate metric to quantify your optimization compared to the system without the optimization, and to previous work if you are improving/extending a technique. If your optimization improves CPI, you may simply compare the change in CPI. If your optimization reduces power/energy consumption, you may be able to quantify it using the reduction in switching activity, or you can calculate the power/energy consumption. After you read current research papers in the area of your optimization, the appropriate quantification should become obvious.
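As a sketch of what these two metrics can look like in code, the helpers below compute a percentage CPI improvement and a simple switching-activity count. Both names (cpiImprovementPercent, switchingActivity) are hypothetical, and counting bit toggles between consecutive values on a 32-bit bus is only one common proxy for dynamic power, assumed here for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Percentage by which the optimized CPI improves on the baseline CPI.
// Positive means the optimization helped.
inline double cpiImprovementPercent(double baseCpi, double optCpi) {
    assert(baseCpi > 0.0);
    return 100.0 * (baseCpi - optCpi) / baseCpi;
}

// Switching activity on a 32-bit bus: total number of bits that change
// between each pair of consecutive values (the Hamming distance, summed).
inline std::uint64_t switchingActivity(const std::vector<std::uint32_t>& busValues) {
    std::uint64_t toggles = 0;
    for (std::size_t i = 1; i < busValues.size(); ++i)
        toggles += std::bitset<32>(busValues[i - 1] ^ busValues[i]).count();
    return toggles;
}
```

For example, going from a CPI of 2.0 to 1.5 is a 25% improvement, and the bus sequence 0x0, 0xF, 0xF, 0x1 produces 7 bit toggles. Running the same benchmarks through the baseline and the optimized simulator and comparing either number gives a like-for-like measurement.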
Sample Optimization Topics
One of the most difficult parts of becoming a researcher is developing novel solutions to existing problems and/or discovering new areas for exploration. Below is a list of potential areas of optimization. Of course, I would encourage you to pick a topic not listed below:
- Branch Prediction
- Cache Tuning
- Instruction/data Encoding
- Trace cache
- Filter cache
- Tomasulo's algorithm
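To give a sense of the scale of such an optimization, here is a minimal sketch of the first topic in the list, branch prediction, using a table of 2-bit saturating counters. This is a well-known textbook scheme, not a required design; the class name TwoBitPredictor, the table indexing by word-aligned PC, and the initial "weakly not taken" state are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Table of 2-bit saturating counters indexed by branch address.
// Counter values: 0, 1 predict not taken; 2, 3 predict taken.
class TwoBitPredictor {
public:
    // Every counter starts at 1 (weakly not taken), an arbitrary choice.
    explicit TwoBitPredictor(std::size_t entries) : table_(entries, 1) {
        assert(entries > 0);
    }

    bool predictTaken(std::uint32_t pc) const { return table_[index(pc)] >= 2; }

    // Train the counter with the actual branch outcome.
    void update(std::uint32_t pc, bool taken) {
        std::uint8_t& counter = table_[index(pc)];
        if (taken && counter < 3) ++counter;
        if (!taken && counter > 0) --counter;
    }

private:
    // Drop the two low bits (assumes 4-byte instructions), then wrap.
    std::size_t index(std::uint32_t pc) const { return (pc >> 2) % table_.size(); }

    std::vector<std::uint8_t> table_;
};
```

In an ISS, the fetch stage would call predictTaken to choose the next PC, and the stage that resolves the branch would call update and flush the pipeline on a misprediction. The change in stall cycles then shows up directly in the CPI.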
There will be no demo for this portion of the project. You will need to turn
in your code with detailed documentation telling me how to run your
simulator and see your optimized results.
Again, each person will need to turn in a log file for this part of the project.
NOTE: Since I am allowing you to implement an existing technique, you must show improvement to your system. Your grade will be severely impacted if you do not show improvement. However, if you extend an existing technique with something novel, your grade will not depend on how successful your optimization is. Your grade will be based on implementing a novel optimization that improves a previous technique. There is no minimum placed on what the improvement has to be, but you must show at least some improvement.
Again, as with the first demo, both partners will not necessarily get the same grade.
Your final task is to write a research paper presenting your optimization technique. To prepare you for writing technical documents in the future, you will write your paper in the standard technical format used by most conference and journal publications. This is the two column, single spaced, 9-pt font, fully justified format (this will become clear as you read papers). A research paper basically contains the following sections. There are variations depending on what needs to be reported:
- Abstract: 100-200 words describing what you did and your results
- Introduction: Background information and a longer summary of what you did and the results you got. You should briefly mention how your technique differs from previous work in this section
- Related work: Provide summaries of previous work related to your optimization, including references to the papers (include a cross-reference to each paper listed in your references section). In this section you should clearly distinguish your work from previous work, giving details on how your work is different and how your work improves and/or extends previous methods. Be careful not to trash anyone else's work in this section. For one, it is tacky, and two, the authors you are trashing are likely reviewing your paper
- Methodology: Describe what you did, how you did it, how it works, etc. Give architectural diagrams, charts and figures where necessary.
- Experiment Results: Include a subsection describing your experimental setup followed by (a) subsection(s) presenting your results. Be concise and use graphs to show improvements. Tables may also be used, but a graph showing improvements conveys your results much more easily than a table that must be read and interpreted.
- Conclusions and Future Work: Summarize work and results. Give any future work you may be planning or future work that could be done
- Acknowledgments: Acknowledge any person that helped with your research that is not listed as an author. Also acknowledge any funding sources that funded this work (of course you will not have that for this paper, just future reference).
- References: List of papers cited throughout your paper.
Although I don't recommend it, research papers are typically written in MS Word (which is actually what I use). LaTeX is also a popular method. You may use any editor that you wish, just make sure your document is formatted correctly.
Make sure that you address the following points in the paper:
- Describe your optimization to me and how it differs from previous work
- Describe how you modified your ISS to simulate the optimization
- Describe how you will quantify your optimization
What to turn in
As with the first part of the project, you will email me your project as a
single tarred/zipped/RARed file. You will need to turn in the following:
- All code for your assembler and simulator
- Documentation stating how to run your simulator
- Research paper
- Detailed log file for each partner
For this second portion, extra credit is offered for early submission:
- 0-24 hours early - 0%
- 24-48 hours early - 2%
- 48-72 hours early - 4%
- 72+ hours early - 6%
Here I have listed some additional resources that may be helpful with your project. I do not know of any pipelined simulators that I can direct you to; the following are simply functional simulators. This means that you should not model your ISS after these simulators; you should just use them as examples of existing simulators.
- SimpleScalar: Although SimpleScalar is neither pipelined nor cycle accurate, it is a widely used functional simulator. I suggest you review the user's guide linked under documents. If you aren't familiar with SimpleScalar, it can be a bit intimidating.
- 8051 Simulator: Here is both a VHDL and C++ implementation of the 8051. Again, not a pipelined simulator, but it is less intimidating than SimpleScalar. If you look at the cc code, you can see that the code simply switches on each instruction and then does what the instruction is supposed to do. I can't stress enough that this is NOT what a pipelined simulator will look like; this is just an example of a simulator.
- Computer Organization and Design: This is the undergraduate version of the book that we are using. You should really think about purchasing this book if you need help with the project. It will help you with the MIPS assembly language and understanding pipelining. This is the third edition, which I am NOT familiar with, so I hope that it has the same information as the second edition. My suggestion to you is to go to Amazon and purchase the second edition of the book used for as low as a few dollars. It is definitely worth it!