Contact:
Magnus Jahre (952 22 309)

# TDT4255 COMPUTER DESIGN EXAM 

Thursday 20. December 2012
Time: 09:00 - 12:00
ENGLISH
Allowed Aids:
D.

No written or handwritten examination support materials are permitted.
A specified, simple calculator is permitted.
Use the provided space to answer the problems. If you need more space, an extra answer box is available on the last page of the test. The test accounts for $50 \%$ of the final grade, and the provided points show the maximum number of points that can be achieved on each assignment. Read the problem texts thoroughly. You can answer the questions in English or Norwegian.

## Candidate Number:

## Problem 1 Multiple Choice (20 points)

Answer by circling the answer alternative you believe is the correct answer. You are awarded 2 points for a correct answer and 0 points if you do not answer. If your answer is wrong or you circle more than one alternative, you will get -1 point.
a) (2 p) Which of the following statements is not a design principle for Instruction Set Architectures?

1. Simplicity favors regularity
2. Smaller is faster
3. Make the common case fast
4. Good design has no compromises
Answer:
1
2
3
4
b) (2 p) How are throughput and turn-around time affected when the processor in a single-core system is replaced with a faster version?

1. Throughput is increased and turn-around time is constant
2. Throughput is constant and turn-around time is decreased
3. Throughput is increased and turn-around time is decreased
4. Throughput is decreased and turn-around time is increased
Answer:
1
2
3
4
c) (2 p) What is the carry propagation latency of a 2-bit ripple carry adder constructed using the 1-bit carry circuit in Figure 1?

1. 2 gate delays
2. 3 gate delays
3. 4 gate delays
4. 5 gate delays
Answer:
1
2
3
4

Figure 1: 1-bit Carry Circuit
d) (2 p) What is the lowest carry propagation latency of a 2-bit one-level carry look-ahead adder?

1. 2 gate delays
2. 3 gate delays
3. 4 gate delays
4. 5 gate delays
Answer:
1
2
3
4

## Example 1.A

The signals clock, reset and $D$ are one-bit-wide inputs and the signal $Y$ is a one-bit-wide output. The definitions of these signals are not shown.

```
process (clock)
begin
    if rising_edge(clock) then
        if reset = '0' then
            Y <= '0';
        else
            Y <= D;
        end if;
    end if;
end process;
```


e) (2 p) Which circuit element does the VHDL code in Example 1.A describe?

1. D flip-flop with synchronous reset
2. D flip-flop with asynchronous reset
3. D latch with synchronous reset
4. D latch with asynchronous reset
Answer:
1
2
3
4

## Example 1.B:

The signals sel, in_1 and in_2 are one-bit-wide inputs and the signal output is a one-bit-wide output. The definitions of these signals are not shown.

```
process (clock)
begin
    if rising_edge(clock) then
        if sel = '0' then
            output <= in_1;
        else
            output <= in_2;
        end if;
    end if;
end process;
```

f) (2 p) Which statement is not correct regarding the VHDL code in Example 1.B?

1. The functionality defined by the code can be implemented correctly with two 2-input AND gates, one 2-input OR gate, one inverter and a D flip-flop
2. The functionality defined by the code can be implemented correctly with two 2-input AND gates, one 2-input OR gate and one inverter
3. The code describes a sequential circuit
4. The code describes a two-way multiplexer with a 1-bit output register
Answer:
1
2
3
4

The bit mapping of the 32 bit IEEE 754 single precision floating point format is:

| Bit 31 | Bits 30-23 | Bits 22-0 |
| :---: | :---: | :---: |
| Sign | Exponent (8 bits) | Fraction (23 bits) |

Bias: 127
g) (2 p) How is the decimal number 256.25 represented in the IEEE 754 single precision format?

1. 0x43802000
2. 0x04002000
3. 0x83802000
4. 0x84002000
Answer:
1
2
3
4
h) (2 p) What is the decimal representation of the IEEE 754 single precision representation 0x41520000?

1. -13.125
2. -6.5625
3. 6.5625
4. 13.125
Answer:
1
2
3
4

i) (2 p) Joe the computer designer is designing a memory system with two cache levels and a main memory. The access latency of the L1 cache is 3 clock cycles, the latency of the L2 cache is 20 clock cycles and the latency of the main memory is 150 clock cycles. What is the average memory latency for a benchmark that has a $95 \%$ L1 hit rate and a $70 \%$ L2 hit rate if Joe decides to access all levels of the memory hierarchy sequentially?

1. 5.6 clock cycles
2. 5.8 clock cycles
3. 6.1 clock cycles
4. 6.3 clock cycles
Answer:
1
2
3
4
j) (2 p) Which statement regarding static and dynamic scheduling is not correct?

1. Dynamic scheduling can handle dependencies that are unknown at compile time
2. Static scheduling works better when the compiler knows the microarchitecture
3. Static scheduling does not improve performance on out-of-order processors
4. Dynamic scheduling increases the complexity of the processor implementation
Answer:
1
2
3
4


Figure 2: A Single-Cycle Processor Architecture

## Problem 2 Single Cycle Processors (6 points)

Joe the computer designer has been given the task of implementing a single-cycle processor. Unfortunately, the only information he is given is the block diagram in Figure 2. Joe is able to figure out the ALUOp signal, but you need to help him find the values of the other control signals.
a) (3 p) What should the values of the following control signals be for an add instruction?

## Answer:

| RegDst | Branch | MemRead | MemToReg | MemWrite | ALUSrc | RegWrite |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |

b) (3 p) What should the values of the following control signals be for a beq instruction?

## Answer:

| RegDst | Branch | MemRead | MemToReg | MemWrite | ALUSrc | RegWrite |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |

## Problem 3 Pipelined Processors (12 points)

In this assignment, you are given two block diagrams of a pipelined processor. For simplicity, all data and control signals have been removed. Your tasks are to add functionality to these figures, which may require new signals and new blocks, and to write logic equations describing the behavior of the added blocks and signals.

## Example 3.A:

```
and $3, $2, $1
add $4, $3, $1
```

a) (6 p) Example 3.A exposes a hazard in the processor. How would you change the architecture to achieve correct operation? Add the necessary blocks and signals to the figure and write the logic equations necessary for correct operation in the answer box. State any necessary assumptions.


Answer:

## Example 3.B:

```
lw $2, 20($1)
and $4, $2, $5
```

b) (6 p) Example 3.B exposes another hazard in the processor. How would you change the architecture to achieve correct operation? Add the necessary blocks and signals to the figure and write the logic equations necessary for correct operation in the answer box. State any necessary assumptions.


Answer:

## Problem 4 Out-of-Order Processors (12 points)

## Example 4.A:

```
1 LD F1, 32(R1)
2 LD F2, 40(R1)
3 MULT.D F3, F2, F1
4 SUB.D F4, F2, F1
5 DIV.D F2, F1, F4
6 ADD.D F1, F3, F2
```

The assembly program in Example 4.A is executed on an out-of-order processor that supports speculation. The processor can fetch 4 instructions each cycle and has two load/store units, one floating point add/sub unit and one floating point multiply/divide unit. In addition, it is able to commit 2 instructions each clock cycle. The latency of all functional units is one clock cycle, and the ROB stores values.
a) (4 p) Rewrite the code in the table below using register renaming. Which hazards are removed by this operation? Can these hazards occur in an in-order architecture? Explain your reasoning.

Answer:

|  | Reg 1 | Reg 2 | Reg 3 |
| :--- | :--- | :--- | :--- |
| LD |  |  |  |
| LD |  |  |  |
| MULT.D |  |  |  |
| SUB.D |  |  |  |
| DIV.D |  |  |  |
| ADD.D |  |  |  |

b) (8 p) Write the state of the ROB in cycles 1 to 4 into the tables below. At cycle 1, R1 has the value 1024 and the values at the offsets 32 and 40 are 2.0 and 4.0, respectively. State any necessary assumptions.

Answer:
ROB at clock cycle 1

| Ins\# | Use | Exec | Operation | P1 | Source 1 | P2 | Source 2 | PD | Destination | Data |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |

ROB at clock cycle 2

| Ins\# | Use | Exec | Operation | P1 | Source 1 | P2 | Source 2 | PD | Destination | Data |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |

ROB at clock cycle 3

| Ins\# | Use | Exec | Operation | P1 | Source 1 | P2 | Source 2 | PD | Destination | Data |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |

ROB at clock cycle 4

| Ins\# | Use | Exec | Operation | P1 | Source 1 | P2 | Source 2 | PD | Destination | Data |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |

Assumptions and comments:

## Additional Answer Space

Answer:

## MIPS Reference




