CS 400: Project 0x11 - Y86 Functions

Objectives:

Complete a functional CPU design that supports basic functions.
Write a recursive assembly language program.

Materials to Get Started:

Click HERE to download JLS. (Depending on your OS and web browser, you may need to Right click and "Save As")
(By the way, JLS was created by Michigan Tech.'s Dave Poplawski. For more information, see the JLS webpage: http://www.cs.mtu.edu/~pop/jlsp/bin/JLS.html)
A solution from Hw#8. You can click HERE to download the posted solution or you can use your own.

The Assignment:

Most programming languages use a run-time stack to support functions. Although there are other alternatives, a run-time stack has some interesting features:

Allocation of storage for local variables only lasts as long as necessary. When an instance of the function is called the local variables are pushed onto the stack. When the instance completes they are popped off. The memory is released and can be re-used for other functions. So the run-time stack is an easy way to manage and reduce the amount of memory needed by programs.
The runtime stack allows each instance (call-to) a function to have its own unique local variables. This allows each instance of a recursive function to use unique instances of the variables declared in the function. So a run-time stack is one way to support recursion.
Unfortunately, the run-time stack also provides a security vulnerability if not used properly. If a program can allow the stack to become corrupted (by a "buffer overflow"), it is possible to inject code that effectively takes control of the execution of the program (and possibly computer) by changing the PC.

In order to support functions you still need the ability to support a run-time stack. In particular, you need instructions for calling functions, returning from functions, and pushing and poping data (local variables) to/from the stack.

The Y86 instruction set supports functions via four instructions:

call for the caller to transfer the Processor Control to the function.
ret for the function to return the Processor Control to the caller.
pushl to add information to the runtime stack. Again, this is primarily used to create a "local variable" for the function.
popl to release the memory that was allocated by a pushl.

Unfortunately some of these instructions modify multiple parts of the processor "state" (registers/memory) at the same time. For example, the call instruction will push the PC of the return address onto the stack, which will modify memory AND increment the esp register, and the call changes the PC (so memory is changed, the esp register is decremented, and the PC is changed). Unfortunately, the current CPU design doesn't allow you to easily modify memory and increment the esp in the same instruction.

There are a couple of significant options to support the new instructions: a) re-design the CPU to use multiple clock cycles per instruction, b) add a lot of new hardware, or c) treat the new instructions as "pseudo-instructions".

Using pseudo-instructions will be the easiest approach and it allows you to re-use underused resources (somewhat like what is done in pipelining) and to get a better understanding of how primitive operations can be used to support a more complex instruction (the pseudo-instruction).

A pseudo-instruction is an instruction that doesn't really exist, but which can be easily translated by the assembler into a few instructions. Since you are acting as the assembler, you will have to translate the instructions yourself.

You will need to add in a few additional "real" instructions to support the new pseudo-instructions:

pcrmovl Value(rB): Which will store the PC value in the given register. I.e., Reg[rB] = PC. pcrmovl will be used as part of the call instruction. It will aid putting the return address on the stack.
rpcmovl rB: Which will copy the value stored in the designated registerinto the PC. I.e. PC=Reg[rB]. This will effectively cause the execution to jump to the location that is loaded. This will be used as part of returing from functions.
iaddl Value,REG: Adding isn't a new instruction, but here we are adding an immediate value to a register rather than the contents of a register.
isubl Value,REG: Again, this will subtract an immediate value.
iandl Value,REG: Again, this will bitwise-and an immediate value.
ixorl Value,REG: Again, this will bitwise-xor an immediate value.

Here is a complete list of all the instructions the CPU will support, including the new instructions (end of the table):

Instruction	`icode/fcode`	format	Comment
`halt`	00	No-Oper	Stop Processor
`nop`	01	No-Oper	No-Operation
`rrmovl`	20	R-data	Reg[`rB`] = Reg[`rA`]
`cmovle`	21	R-data	If "less or equal", Reg[`rB`] = Reg[`rA`]
`cmovl`	22	R-data	If "less", Reg[`rB`] = Reg[`rA`]
`cmove`	23	R-data	If "equal", Reg[`rB`] = Reg[`rA`]
`cmovne`	24	R-data	If "not equal", Reg[`rB`] = Reg[`rA`]
`cmovge`	25	R-data	If "greater or equal", Reg[`rB`] = Reg[`rA`]
`cmovg`	26	R-data	If "greater", Reg[`rB`] = Reg[`rA`]
`irmovl`	30	I-data	Reg[`rB`]=`Value`
`rmmovl`	40	I-data	Mem[Reg[`rB`]+`Value`] = Reg[`rA`]
`mrmovl`	50	I-data	Reg[`rA`] = Mem[Reg[`rB`]+`Value`]
`addl`	60	R-data	Reg[`rB`] = Reg[`rB`]+Reg[`rA`]
`subl`	61	R-data	Reg[`rB`] = Reg[`rB`]-Reg[`rA`]
`andl`	62	R-data	Reg[`rB`] = Reg[`rB`] & Reg[`rA`]
`xorl`	63	R-data	Reg[`rB`] = Reg[`rB`] ^ Reg[`rA`]
`iaddl`	64	I-data	Reg[`rB`] = Reg[`rB`] + Value
`isubl`	65	I-data	Reg[`rB`] = Reg[`rB`] - Value
`iandl`	66	I-data	Reg[`rB`] = Reg[`rB`] & Value
`ixorl`	67	I-data	Reg[`rB`] = Reg[`rB`] ^ Value
`jmp`	70	J-data	PC = `Value`
`jle`	71	J-data	If "less or equal", PC = `Value`
`jl`	72	J-data	If "less", PC = `Value`
`je`	73	J-data	If "equal", PC = `Value`
`jne`	74	J-data	If "not equal", PC = `Value`
`jge`	75	J-data	If "greater or equal", PC = `Value`
`jg`	76	J-data	If "greater", PC = `Value`
`pcrmovl`	80	R-data	Reg[`rB`] = PC
`rpcmovl`	90	R-data	PC = Reg[`rB`]

The pseudo-instructions can be translated into real instructions as follows:

Pseudo-Instruction	Real Instructions	Machine Code
`call address`	`pcrmovl pc,%eax # Get PC (of this inst) iaddl 5,%eax # Add 5 to compute ret. addr. isubl $1,%esp # Push RetAddr rmmovl %eax, 0(%esp) # Push RetAddr jmp address # Start callee`	`800000000000 640000000005 650400000001 400400000000 70XXXXXXXX00`
`ret`	`mrmovl (%esp),%ecx iaddl $1,%esp rpcmovl %ecx,pc`	`501400000000 640400000001 900100000000`
`pushl reg`	`isubl $1,%esp rmmovl reg,%esp`	`650400000001 40X400000000`
`popl reg`	`mrmovl %esp,reg iaddl $1,%esp`	`50X400000000 640400000001`

These pseudo-instructions all assume that esp points to the last used space on the stack. An alternate approach would have esp refer to the first unused location.

Problems to Complete

Implement and test the iALU instructions, like iaddl $1, ....
Think carefully how information needs to proceed through the processor (i.e. the "datapath" --- the path that the data needs to follow) to use immediate values (the Value part of the instruction) with the contents of a register.
Implement and test the pcrmovl instruction.
This will require modifying the hardware so you can transfer the pc to a register.
Create a test case and put a "watch" on an appropriate register.
Implement and test the rpcmovl instruction.
This will require modifying the hardware that determines the source of the PC.

Create a test case. One potential test case:
1. Store a value of the address/label corresponding to the instruction in the the 4th step in this program in a specific register (like %esi).
2. Execute the rpcmovl so that it will retrieve the stored value (and the program will jump over the first halt)
3. Have a "halt" instruction
4. Set a specific value in eax
5. Have a "halt" instruction.
The program should execute the instruction that modifies eax. If you put a watch on the eax register you should observe the change (you may need to open the registers and put a watch on eax). You can also put a watch on the PC and should see it "skip" a count.
Create a program to test all four new pseudo-instructions. The following is a recursive program for multiplying numbers from (the second number must be positive):
```
int mult(int a, int b) {
  // Assumes that b is positive
  if(b==0) 
    return 0;
  return mult(a,b-1)+a;
}
```
Carefully translate the program into assembly language. Be sure to follow proper register conventions (only eax, ecx, and edx can be changed when the function is done) and follow conventions for passing arguments and returning results.
Include:
1. A "main" program (starting at address 0) that initializes the esp and ebp to 255, calls mult with 6 and 3 using the pseudo-instructions (call, etc.), and halts after mult function returns.
2. An assembly language version of the complete program (main and mult) with the pseudo-instructions replaced by the real instructions.
3. A machine language translation of the program.
4. A test case showing the value when a = 6. and b = 3. Include a timing diagram with watches in the PC, eax, and clock as well as the results of a watch on the Data Memory.
5. A description of each value that is written to memory: what does it mean and why was it written?
15% BONUS: Create and test a recursive Fibonacci number calculator:
```
int fib(int n) {
  if(n<=1) 
    return n;
  else 
    return fib(n-1)+fib(n-2);
}
```
Include:
1. Write a program with a "main" program (starting at address 0) that initializes the esp and ebp to 255, calls fib, and halts and the fib function (using the pseudo-instructions).
2. An assembly language version of the complete program (main and fib) with the pseudo-instructions replaced by the real instructions.
3. A machine language translation of the program.
4. A test case showing the value when n = 4. Include a timing diagram with watches in the PC, eax, and clock as well as the results of a watch on the Data Memory.
5. A description of each value that is written to memory: what does it mean and why was it written.

Hints & Suggestions:

Memory addresses need to be given in hex. Instructions should be in locations: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,.... Uninitialized memory has the value of 0, which corresponds to the halt instruction. If your program is mysteriously stopping it's probably because it's accessing the wrong location in memory.
Carefully check and recheck both the logic of your assembly language program and the translation to machine language.
If you have trouble, try debugging by watching a few select registers and executing just a few instructions (watch the PC to see which instruction is being executed). Verify that the CPU is working by checking your work by hand.
The main function should be about 4 instructions long (setup the argument to sum, call sum, and halt.)
The mult function should be about 15-25 instructions long (there are some shortcuts that could make it a little shorter or more complex ways of implementing it that could be longer).
You may want to add watches to several things to watch the program execute. For example:
- You can watch the pc to see which instruction is executing
- You can open the register file and add watches to individual registers to see that each instruction changes the registers in the expected way.
- You can watch the clock to get a sense of when each instruction completes.
- You can monitor the memory to see that pushes into memory are working and storing the correct value at the expected location.
JLS can be finicky about the format of data entered into memory. Be sure that each line of code begins with a number (counting in HEX) and is followed by 12 more hex digits. In the end the file should look something like this:
```
#main:
0  3004000000ff       irmovl $FF,%esp   Setup ESP
1  ...........
...


5  000000000000       halt              eax should be the product of 6*3
#
#mult:
6  658400000001       subi $1,%esp
7  ............
...






F  ...........
10 ............
#mult_else:
11 ............
...




#mult_end:
17 ............




1C 900100000000      rpcmovl  # Put esp back
```
Notice that comment lines (beginning with a #) are used to make the code more readable. Also keep in mind that the line numbers need to be in hex too. You can include comments at the end of a line (after the memory index and corresponding value).

Submission:

Turn in a printed diagram of the Circuit. Please highlight the modifications you have made to the original file. (Circle them, draw arrows to them, or use a highlight (but only one of those))
Provide answers to all the questions, source code for each test case, explanations of how each test case should work, and evidence that they each worked as expected.
You will need to schedule an in-person demo. For the demo you should bring a copy of your file on a USB drive (or e-mail a copy to me in advance).