
Lecture: Computer Architecture, Part IV

Shared by: Codon_06 | File type: PPT | Pages: 72


The lecture Computer Architecture: Part IV - Data Path and Control presents fundamentals such as instruction execution steps, control unit synthesis, and pipelined data paths. We hope this material is a useful source of information for your study and research.


Text content: Lecture: Computer Architecture, Part IV

  1. Part IV Data Path and Control Mar. 2006 Computer Architecture, Data Path and Control Slide 1
  2. About This Presentation
     This presentation is intended to support the use of the textbook Computer Architecture: From Microprocessors to Supercomputers, Oxford University Press, 2005, ISBN 0-19-515455-X. It is updated regularly by the author as part of his teaching of the upper-division course ECE 154, Introduction to Computer Architecture, at the University of California, Santa Barbara. Instructors can use these slides freely in classroom teaching and for other educational purposes. Any other use is strictly prohibited. © Behrooz Parhami
     Edition: First. Released: July 2003. Revised: July 2004, July 2005, Mar. 2006.
  3. A Few Words About Where We Are Headed
     Performance = 1 / Execution time, simplified to 1 / CPU execution time
     CPU execution time = Instructions × CPI / (Clock rate)
     Performance = Clock rate / (Instructions × CPI)
     • Define an instruction set; make it simple enough to require a small number of cycles and allow high clock rate, but not so simple that we need many instructions, even for very simple tasks (Chap 5-8)
     • Design ALU for arithmetic & logic ops (Chap 9-12)
     • Design hardware for CPI = 1; seek improvements with CPI > 1 (Chap 13-14)
     • Try to achieve CPI = 1 with clock that is as high as that for CPI > 1 designs; is CPI < 1 feasible? (Chap 15-16)
     • Design memory & I/O structures to support ultrahigh-speed CPUs (Chap 17-24)
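The performance relations on this slide can be checked with a few lines of Python. This is only a sketch: the function names and the workload numbers (one million instructions, CPI = 1, a 125 MHz clock) are illustrative assumptions, not values taken from the slide.

```python
def cpu_execution_time(instructions: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU execution time = Instructions x CPI / Clock rate (in seconds)."""
    return instructions * cpi / clock_rate_hz


def performance(instructions: int, cpi: float, clock_rate_hz: float) -> float:
    """Performance = Clock rate / (Instructions x CPI), the reciprocal of the time."""
    return clock_rate_hz / (instructions * cpi)


# Hypothetical workload, chosen only to exercise the formulas.
example_time = cpu_execution_time(1_000_000, 1.0, 125e6)
example_perf = performance(1_000_000, 1.0, 125e6)
print(f"time = {example_time} s, performance = {example_perf} per second")
```

Halving CPI or doubling the clock rate each halve the execution time, which is why the roadmap treats both as design targets.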
  4. IV Data Path and Control
     Design a simple computer (MicroMIPS) to learn about:
     • Data path – part of the CPU where data signals flow
     • Control unit – guides data signals through data path
     • Pipelining – a way of achieving greater performance
     Topics in This Part
     Chapter 13 Instruction Execution Steps
     Chapter 14 Control Unit Synthesis
     Chapter 15 Pipelined Data Paths
     Chapter 16 Pipeline Performance Limits
  5. 13 Instruction Execution Steps
     A simple computer executes instructions one at a time
     • Fetches an instruction from the location pointed to by PC
     • Interprets and executes the instruction, then repeats
     Topics in This Chapter
     13.1 A Small Set of Instructions
     13.2 The Instruction Execution Unit
     13.3 A Single-Cycle Data Path
     13.4 Branching and Jumping
     13.5 Deriving the Control Signals
     13.6 Performance of the Single-Cycle Design
  6. 13.1 A Small Set of Instructions
     Fig. 13.1 MicroMIPS instruction formats and naming of the various fields. (We will refer to this diagram later.)
     R format: op (bits 31-26, 6 bits, opcode) | rs (bits 25-21, 5 bits, source 1 or base) | rt (bits 20-16, 5 bits, source 2 or dest’n) | rd (bits 15-11, 5 bits, destination) | sh (bits 10-6, 5 bits, unused) | fn (bits 5-0, 6 bits, opcode ext)
     I format: op | rs | rt | imm (bits 15-0, operand / offset, 16 bits)
     J format: op | jta (bits 25-0, jump target address, 26 bits)
     inst: the whole instruction, 32 bits
     Seven R-format ALU instructions (add, sub, slt, and, or, xor, nor)
     Six I-format ALU instructions (lui, addi, slti, andi, ori, xori)
     Two I-format memory access instructions (lw, sw)
     Three I-format conditional branch instructions (bltz, beq, bne)
     Four unconditional jump instructions (j, jr, jal, syscall)
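As a rough illustration of the field layout in Fig. 13.1, the sketch below slices a 32-bit instruction word into its named fields with shifts and masks. The function name `decode_fields` and the sample instruction are assumptions made for this example; only the field names and widths come from the figure.

```python
def decode_fields(inst: int) -> dict:
    """Split a 32-bit MicroMIPS instruction word into the fields of Fig. 13.1."""
    return {
        "op": (inst >> 26) & 0x3F,   # bits 31-26, opcode
        "rs": (inst >> 21) & 0x1F,   # bits 25-21, source 1 or base
        "rt": (inst >> 16) & 0x1F,   # bits 20-16, source 2 or destination
        "rd": (inst >> 11) & 0x1F,   # bits 15-11, destination (R format)
        "sh": (inst >> 6) & 0x1F,    # bits 10-6, unused in MicroMIPS
        "fn": inst & 0x3F,           # bits 5-0, opcode extension (R format)
        "imm": inst & 0xFFFF,        # bits 15-0, operand / offset (I format)
        "jta": inst & 0x3FFFFFF,     # bits 25-0, jump target address (J format)
    }


# Sample word: add rd=3, rs=1, rt=2 (op = 0, fn = 32), assembled by hand.
add_inst = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | 32
fields = decode_fields(add_inst)
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["fn"])
```

All fields are extracted unconditionally; which of them are meaningful depends on the format, just as the hardware routes every field and lets the control signals decide which ones are used.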
  7. The MicroMIPS Instruction Set
     Table 13.1
     Class             Instruction               Usage            op   fn
     Copy              Load upper immediate      lui  rt,imm      15
     Arithmetic        Add                       add  rd,rs,rt     0   32
     Arithmetic        Subtract                  sub  rd,rs,rt     0   34
     Arithmetic        Set less than             slt  rd,rs,rt     0   42
     Arithmetic        Add immediate             addi rt,rs,imm    8
     Arithmetic        Set less than immediate   slti rt,rs,imm   10
     Logic             AND                       and  rd,rs,rt     0   36
     Logic             OR                        or   rd,rs,rt     0   37
     Logic             XOR                       xor  rd,rs,rt     0   38
     Logic             NOR                       nor  rd,rs,rt     0   39
     Logic             AND immediate             andi rt,rs,imm   12
     Logic             OR immediate              ori  rt,rs,imm   13
     Logic             XOR immediate             xori rt,rs,imm   14
     Memory access     Load word                 lw   rt,imm(rs)  35
     Memory access     Store word                sw   rt,imm(rs)  43
     Control transfer  Jump                      j    L            2
     Control transfer  Jump register             jr   rs           0    8
     Control transfer  Branch less than 0        bltz rs,L         1
     Control transfer  Branch equal              beq  rs,rt,L      4
     Control transfer  Branch not equal          bne  rs,rt,L      5
     Control transfer  Jump and link             jal  L            3
     Control transfer  System call               syscall           0   12
  8. 13.2 The Instruction Execution Unit
     [Figure] Fig. 13.2 Abstract view of the instruction execution unit for MicroMIPS. For naming of instruction fields, see Fig. 13.1.
     Blocks shown: PC, next-address logic (Next addr), instruction cache, register file, ALU, data cache, control (driven by op and fn).
     Of the 22 instructions, 12 A/L instructions, lui, lw, and sw use the ALU path; the next-address logic is annotated with beq, bne, syscall, bltz, jr, j, and jal.
  9. 13.3 A Single-Cycle Data Path
     [Figure] Fig. 13.3 Key elements of the single-cycle MicroMIPS data path.
     Stages, left to right: Instruction fetch, Reg access / decode, ALU operation, Data access, Register writeback.
     Blocks shown: PC, next-address logic (Incr PC, Next PC), instruction cache, register file, sign extension (SE, 16 to 32 bits), ALU (with Ovfl output), data cache.
     Control signals: RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, DataWrite, RegInSrc, Br&Jump.
  10. An ALU for MicroMIPS
      [Figure] Fig. 10.19 A multifunction ALU with 8 control signals (2 for function class, 1 arithmetic, 3 shift, 2 logic) specifying the operation.
      Function class: 00 Shift, 01 Set less, 10 Arithmetic, 11 Logic
      Shift function: 00 No shift, 01 Logical left, 10 Logical right, 11 Arith right
      Shift amount: Const’Var selects the constant or the variable amount (5 bits); the shifter path is annotated with lui
      Arithmetic: Add’Sub
      Logic function: 00 AND, 01 OR, 10 XOR, 11 NOR
      Outputs: s (32 bits), Zero, Ovfl
  11. 13.4 Branching and Jumping
      Update options for PC (the lowest 2 bits of PC are always 00):
      (PC)31:2 + 1          Default option
      (PC)31:2 + 1 + imm    When instruction is branch and condition is met
      (PC)31:28 | jta       When instruction is j or jal
      (rs)31:2              When the instruction is jr
      SysCallAddr           Start address of an operating system routine
      [Figure] Fig. 13.4 Next-address logic for MicroMIPS (see top part of Fig. 13.3).
      Signals shown: inputs (rs), (rt), (PC)31:2, imm (sign-extended from 16 to 30 bits), jta, SysCallAddr; control inputs BrType and PCSrc; internal signal BrTrue from the branch condition checker; outputs IncrPC and NextPC.
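The five PC update options can be sketched as a small Python function operating on 30-bit word addresses, that is, (PC)31:2. The PCSrc encoding used here (0 for the incremented or branch address, 1 for jta, 2 for (rs), 3 for SysCallAddr) is read off Table 13.3; the function and parameter names are assumptions made for this example, and the branch condition is passed in as an already computed `br_true` flag rather than derived from BrType.

```python
MASK30 = (1 << 30) - 1  # word addresses are 30 bits wide: (PC)31:2


def sign_extend16(imm: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement value."""
    return imm - 0x10000 if imm & 0x8000 else imm


def next_pc(pc_word: int, pc_src: int, br_true: bool = False, imm: int = 0,
            jta: int = 0, rs: int = 0, syscall_addr: int = 0) -> int:
    """Return the next (PC)31:2 given the current one and the select inputs."""
    incr_pc = (pc_word + 1) & MASK30
    if pc_src == 0:
        # Default option, or branch target when the condition is met.
        if br_true:
            return (incr_pc + sign_extend16(imm)) & MASK30
        return incr_pc
    if pc_src == 1:
        # (PC)31:28 | jta: keep the 4 MSBs, replace the low 26 bits (j, jal).
        return ((pc_word >> 26) << 26) | (jta & 0x3FFFFFF)
    if pc_src == 2:
        # (rs)31:2: drop the two low bits of the 32-bit register value (jr).
        return (rs >> 2) & MASK30
    # SysCallAddr, given here as a hypothetical 30-bit word address.
    return syscall_addr & MASK30


print(next_pc(100, 0), next_pc(100, 0, br_true=True, imm=0xFFFF))
```

Working in word addresses mirrors the figure, where the two low PC bits are never computed because they are always 00.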
  12. 13.5 Deriving the Control Signals
      Table 13.2 Control signals for the single-cycle MicroMIPS implementation.
      Block        Control signal          0             1          2            3
      Reg file     RegWrite                Don’t write   Write
      Reg file     RegDst1, RegDst0        rt            rd         $31
      Reg file     RegInSrc1, RegInSrc0    Data out      ALU out    IncrPC
      ALU          ALUSrc                  (rt)          imm
      ALU          Add’Sub                 Add           Subtract
      ALU          LogicFn1, LogicFn0      AND           OR         XOR          NOR
      ALU          FnClass1, FnClass0      lui           Set less   Arithmetic   Logic
      Data cache   DataRead                Don’t read    Read
      Data cache   DataWrite               Don’t write   Write
      Next addr    BrType1, BrType0        No branch     beq        bne          bltz
  13. Control Signal Settings
      Table 13.3 (a dash marks an entry left blank in the original table)
      Instruction               op      fn      RegWrite RegDst RegInSrc ALUSrc Add’Sub LogicFn FnClass DataRead DataWrite BrType PCSrc
      Load upper immediate      001111  -       1        00     01       1      -       -       00      0        0         00     00
      Add                       000000  100000  1        01     01       0      0       -       10      0        0         00     00
      Subtract                  000000  100010  1        01     01       0      1       -       10      0        0         00     00
      Set less than             000000  101010  1        01     01       0      1       -       01      0        0         00     00
      Add immediate             001000  -       1        00     01       1      0       -       10      0        0         00     00
      Set less than immediate   001010  -       1        00     01       1      1       -       01      0        0         00     00
      AND                       000000  100100  1        01     01       0      -       00      11      0        0         00     00
      OR                        000000  100101  1        01     01       0      -       01      11      0        0         00     00
      XOR                       000000  100110  1        01     01       0      -       10      11      0        0         00     00
      NOR                       000000  100111  1        01     01       0      -       11      11      0        0         00     00
      AND immediate             001100  -       1        00     01       1      -       00      11      0        0         00     00
      OR immediate              001101  -       1        00     01       1      -       01      11      0        0         00     00
      XOR immediate             001110  -       1        00     01       1      -       10      11      0        0         00     00
      Load word                 100011  -       1        00     00       1      0       -       10      1        0         00     00
      Store word                101011  -       0        -      -        1      0       -       10      0        1         00     00
      Jump                      000010  -       0        -      -        -      -       -       -       0        0         -      01
      Jump register             000000  001000  0        -      -        -      -       -       -       0        0         -      10
      Branch on less than 0     000001  -       0        -      -        -      -       -       -       0        0         11     00
      Branch on equal           000100  -       0        -      -        -      -       -       -       0        0         01     00
      Branch on not equal       000101  -       0        -      -        -      -       -       -       0        0         10     00
      Jump and link             000011  -       1        10     10       -      -       -       -       0        0         00     01
      System call               000000  001100  0        -      -        -      -       -       -       0        0         -      11
  14. Instruction Decoding
      [Figure] Fig. 13.5 Instruction decoder for MicroMIPS built of two 6-to-64 decoders.
      op decoder outputs (of 0 to 63): 0 RtypeInst, 1 bltzInst, 2 jInst, 3 jalInst, 4 beqInst, 5 bneInst, 8 addiInst, 10 sltiInst, 12 andiInst, 13 oriInst, 14 xoriInst, 15 luiInst, 35 lwInst, 43 swInst
      fn decoder outputs (of 0 to 63): 8 jrInst, 12 syscallInst, 32 addInst, 34 subInst, 36 andInst, 37 orInst, 38 xorInst, 39 norInst, 42 sltInst
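A software analogue of the two decoders is a pair of lookup tables keyed by op and fn. The sketch below is illustrative only: the names `OP_DECODE`, `FN_DECODE`, and `decode` are made up here, and it assumes that the fn decoder is consulted only when op decodes to RtypeInst, which is how the figure appears to wire the two decoders together.

```python
# Decoder outputs listed in Fig. 13.5; all other op/fn values are unused.
OP_DECODE = {
    0: "RtypeInst", 1: "bltzInst", 2: "jInst", 3: "jalInst", 4: "beqInst",
    5: "bneInst", 8: "addiInst", 10: "sltiInst", 12: "andiInst",
    13: "oriInst", 14: "xoriInst", 15: "luiInst", 35: "lwInst", 43: "swInst",
}
FN_DECODE = {
    8: "jrInst", 12: "syscallInst", 32: "addInst", 34: "subInst",
    36: "andInst", 37: "orInst", 38: "xorInst", 39: "norInst", 42: "sltInst",
}


def decode(op: int, fn: int) -> set:
    """Return the set of asserted instruction signals for the given op and fn."""
    asserted = set()
    op_signal = OP_DECODE.get(op)
    if op_signal is not None:
        asserted.add(op_signal)
    # Assumption: fn matters only for R-type instructions (op = 0).
    if op_signal == "RtypeInst" and fn in FN_DECODE:
        asserted.add(FN_DECODE[fn])
    return asserted


print(sorted(decode(0, 32)), sorted(decode(35, 0)))
```

Gating the fn lookup on RtypeInst matters because the low 6 bits of an I-format instruction are part of imm and could otherwise match an fn code by accident.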
  15. Control Signal Generation
      Auxiliary signals identifying instruction classes
      arithInst = addInst ∨ subInst ∨ sltInst ∨ addiInst ∨ sltiInst
      logicInst = andInst ∨ orInst ∨ xorInst ∨ norInst ∨ andiInst ∨ oriInst ∨ xoriInst
      immInst = luiInst ∨ addiInst ∨ sltiInst ∨ andiInst ∨ oriInst ∨ xoriInst
      Example logic expressions for control signals
      RegWrite = luiInst ∨ arithInst ∨ logicInst ∨ lwInst ∨ jalInst
      ALUSrc = immInst ∨ lwInst ∨ swInst
      Add’Sub = subInst ∨ sltInst ∨ sltiInst
      DataRead = lwInst
      PCSrc0 = jInst ∨ jalInst ∨ syscallInst
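The example expressions above translate directly into Boolean code. In this sketch the decoded instruction signals are represented as a Python set of asserted signal names, which is an assumption of the example; the expressions themselves follow the slide, with each logical OR written as a membership test over the listed signals.

```python
def control_signals(asserted: set) -> dict:
    """Evaluate the example control-signal expressions for a set of asserted
    instruction signals such as {"RtypeInst", "subInst"}."""
    def any_of(*names: str) -> bool:
        return any(name in asserted for name in names)

    # Auxiliary signals identifying instruction classes.
    arith_inst = any_of("addInst", "subInst", "sltInst", "addiInst", "sltiInst")
    logic_inst = any_of("andInst", "orInst", "xorInst", "norInst",
                        "andiInst", "oriInst", "xoriInst")
    imm_inst = any_of("luiInst", "addiInst", "sltiInst",
                      "andiInst", "oriInst", "xoriInst")

    return {
        "RegWrite": any_of("luiInst", "lwInst", "jalInst") or arith_inst or logic_inst,
        "ALUSrc": imm_inst or any_of("lwInst", "swInst"),
        "Add'Sub": any_of("subInst", "sltInst", "sltiInst"),
        "DataRead": any_of("lwInst"),
        "PCSrc0": any_of("jInst", "jalInst", "syscallInst"),
    }


lw_signals = control_signals({"lwInst"})
print(lw_signals)
```

For lw this yields RegWrite, ALUSrc, and DataRead asserted with Add’Sub and PCSrc0 clear, matching the Load word row of Table 13.3.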
  16. Putting It All Together
      [Figure] Composite of three earlier figures: the ALU of Fig. 10.19, the next-address logic of Fig. 13.4, and the data path of Fig. 13.3. The control block takes op and fn, derives the decoded instruction signals (addInst, subInst, jInst, ..., sltInst), and drives the control signals RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, DataWrite, RegInSrc, and Br&Jump.
  17. 13.6 Performance of the Single-Cycle Design
      Step latencies:
      Instruction access    2 ns
      Register read         1 ns
      ALU operation         2 ns
      Data cache access     2 ns
      Register write        1 ns
      Total                 8 ns
      Single-cycle clock = 125 MHz
      Instruction mix and latency by class:
      R-type    44%   6 ns
      Load      24%   8 ns
      Store     12%   7 ns
      Branch    18%   5 ns
      Jump       2%   3 ns
      Weighted mean   6.36 ns
      [Figure] Fig. 13.6 The MicroMIPS data path unfolded (by depicting the register write step as a separate block) so as to better visualize the critical-path latencies. Panels: ALU-type, Load, Store, Branch (and jr), Jump (except jr & jal); blocks that a class does not need are marked "Not used".
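The weighted mean and the clock rate on this slide follow from the instruction mix by simple arithmetic. The sketch below reproduces them; the variable names are illustrative, and the data are the percentages and latencies listed on the slide.

```python
# (fraction of executed instructions, latency in ns) per instruction class.
INSTRUCTION_MIX = {
    "R-type": (0.44, 6),
    "Load": (0.24, 8),
    "Store": (0.12, 7),
    "Branch": (0.18, 5),
    "Jump": (0.02, 3),
}

# Average latency if every instruction took only as long as it needs.
weighted_mean_ns = sum(frac * ns for frac, ns in INSTRUCTION_MIX.values())

# A single-cycle design must fit the slowest class (Load) in one clock cycle.
single_cycle_ns = max(ns for _, ns in INSTRUCTION_MIX.values())
clock_mhz = 1000 / single_cycle_ns

print(f"weighted mean = {weighted_mean_ns:.2f} ns, clock = {clock_mhz:.0f} MHz")
```

The gap between the 8 ns cycle and the 6.36 ns weighted mean is the time a fixed single-cycle clock wastes on instructions shorter than a load.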
  18. How Good is Our Single-Cycle Design?
      Instruction access 2 ns, register read 1 ns, ALU operation 2 ns, data cache access 2 ns, register write 1 ns: total 8 ns, so the single-cycle clock = 125 MHz.
      Clock rate of 125 MHz not impressive. How does this compare with current processors on the market?
      Not bad, where latency is concerned: a 2.5 GHz processor with 20 or so pipeline stages has a latency of about 0.4 ns/cycle × 20 cycles = 8 ns.
      Throughput, however, is much better for the pipelined processor:
      • Up to 20 times better with single issue
      • Perhaps up to 100 times better with multiple issue
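The latency and throughput comparison can be reproduced numerically. This sketch assumes an ideal pipeline that completes one instruction per clock cycle with single issue, so the throughput advantage reduces to the ratio of clock rates; that reading of the "up to 20 times" figure is an interpretation, not something the slide spells out.

```python
pipelined_clock_ghz = 2.5       # from the slide
pipeline_stages = 20            # "20 or so" on the slide
single_cycle_clock_ghz = 0.125  # 125 MHz single-cycle design

# Latency of one instruction through the pipeline.
cycle_ns = 1.0 / pipelined_clock_ghz
pipelined_latency_ns = cycle_ns * pipeline_stages

# Ideal single-issue throughput advantage: one instruction completes per
# cycle in both designs, so the ratio of clock rates is the upper bound.
throughput_ratio = pipelined_clock_ghz / single_cycle_clock_ghz

print(f"latency = {pipelined_latency_ns:.1f} ns, throughput ratio = {throughput_ratio:.0f}x")
```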
  19. 14 Control Unit Synthesis
      The control unit for the single-cycle design is memoryless
      • Problematic when instructions vary greatly in complexity
      • Multiple cycles needed when resources must be reused
      Topics in This Chapter
      14.1 A Multicycle Implementation
      14.2 Choosing the Clock Cycle
      14.3 The Control State Machine
      14.4 Performance of the Multicycle Design
      14.5 Microprogramming
      14.6 Exception Handling
  20. 14.1 A Multicycle Implementation
      [Figure] Fig. 14.1 Single-cycle versus multicycle instruction execution.
      Top (single-cycle): Instr 1 to Instr 4 are each allotted the same clock period, regardless of the time each one needs.
      Bottom (multicycle): Instr 1 takes 3 cycles, Instr 2 takes 5 cycles, Instr 3 takes 3 cycles, and Instr 4 takes 4 cycles of a shorter clock; the difference from the single-cycle timeline is marked as time saved.