
Computer Organization and Design, Fifth Edition: Solutions


Chapter 1

Solutions


1.1
Personal computer (includes workstation and laptop): Personal computers emphasize delivery of good performance to single users at low cost and usually execute third-party software.
Personal mobile device (PMD, includes tablets): PMDs are battery operated with wireless connectivity to the Internet and typically cost hundreds of dollars, and, like PCs, users can download software ("apps") to run on them. Unlike PCs, they no longer have a keyboard and mouse, and are more likely to rely on a touch-sensitive screen or even speech input.
Server: Computer used to run large problems and usually accessed via a network.
Warehouse scale computer: Thousands of processors forming a large cluster.
Supercomputer: Computer composed of hundreds to thousands of processors and terabytes of memory.
Embedded computer: Computer designed to run one application or one set of related applications and integrated into a single system.

1.2
a. Performance via Pipelining
b. Dependability via Redundancy
c. Performance via Prediction
d. Make the Common Case Fast
e. Hierarchy of Memories
f. Performance via Parallelism
g. Design for Moore's Law
h. Use Abstraction to Simplify Design

1.3
The program is compiled into an assembly language program, which is then assembled into a machine language program.

1.4
a. 1280 × 1024 pixels = 1,310,720 pixels => 1,310,720 × 3 = 3,932,160 bytes/frame.
b. 3,932,160 bytes × (8 bits/byte)/100E6 bits/second = 0.31 seconds

1.5
a. performance of P1 (instructions/sec) = 3 × 10^9/1.5 = 2 × 10^9
   performance of P2 (instructions/sec) = 2.5 × 10^9/1.0 = 2.5 × 10^9
   performance of P3 (instructions/sec) = 4 × 10^9/2.2 = 1.8 × 10^9


b. cycles(P1) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P2) = 10 × 2.5 × 10^9 = 25 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
   No. instructions(P1) = 30 × 10^9/1.5 = 20 × 10^9
   No. instructions(P2) = 25 × 10^9/1 = 25 × 10^9
   No. instructions(P3) = 40 × 10^9/2.2 = 18.18 × 10^9
c. CPInew = CPIold × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 2.6
   f = No. instr. × CPI/time, then
   f(P1) = 20 × 10^9 × 1.8/7 = 5.14 GHz
   f(P2) = 25 × 10^9 × 1.2/7 = 4.28 GHz
   f(P3) = 18.18 × 10^9 × 2.6/7 = 6.75 GHz

1.6
a. Class A: 10^5 instr. Class B: 2 × 10^5 instr. Class C: 5 × 10^5 instr. Class D: 2 × 10^5 instr.
   Time = No. instr. × CPI/clock rate
   Total time P1 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3)/(2.5 × 10^9) = 10.4 × 10^-4 s
   Total time P2 = (10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2)/(3 × 10^9) = 6.66 × 10^-4 s
   CPI(P1) = 10.4 × 10^-4 × 2.5 × 10^9/10^6 = 2.6
   CPI(P2) = 6.66 × 10^-4 × 3 × 10^9/10^6 = 2.0
b. clock cycles(P1) = 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3 = 26 × 10^5
   clock cycles(P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2 = 20 × 10^5

1.7
a. CPI = Texec × f/No. instr.
   Compiler A CPI = 1.1
   Compiler B CPI = 1.25
b. fB/fA = (No. instr.(B) × CPI(B))/(No. instr.(A) × CPI(A)) = 1.37
c. TA/Tnew = 1.67
   TB/Tnew = 2.27
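The three parts of 1.5 can be checked with a short script. This is only a sketch: the clock rates, CPIs, the 10 s run time, and the 7 s target are read off the figures in the worked answer, and the function names are illustrative.

```python
# Clock rate (Hz) and CPI per processor, taken from the 1.5 solution.
PROCESSORS = {
    "P1": (3.0e9, 1.5),
    "P2": (2.5e9, 1.0),
    "P3": (4.0e9, 2.2),
}


def instructions_per_second(clock_hz, cpi):
    """Part a: instruction throughput = clock rate / CPI."""
    return clock_hz / cpi


def cycles_and_instructions(clock_hz, cpi, seconds):
    """Part b: cycles = clock rate x time, instructions = cycles / CPI."""
    cycles = clock_hz * seconds
    return cycles, cycles / cpi


def required_clock_hz(instr_count, cpi, seconds):
    """Part c: f = No. instr. x CPI / time."""
    return instr_count * cpi / seconds


if __name__ == "__main__":
    for name, (clock_hz, cpi) in PROCESSORS.items():
        ips = instructions_per_second(clock_hz, cpi)
        cycles, instrs = cycles_and_instructions(clock_hz, cpi, 10.0)
        # Part c: CPI grows by 20% while the run time drops from 10 s to 7 s.
        f_new = required_clock_hz(instrs, cpi * 1.2, 7.0)
        print(f"{name}: {ips:.3g} instr/s, {cycles:.3g} cycles, "
              f"{instrs:.4g} instr, new clock {f_new / 1e9:.2f} GHz")
```

The script keeps CPI(P3) = 2.64 unrounded, so its part c clock rate for P3 comes out near 6.86 GHz, whereas the worked answer rounds the CPI to 2.6 and reports 6.75 GHz.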

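For 1.6, the average CPI is the cycle total divided by the instruction total, which the following sketch computes directly. The per-class counts, CPIs, and clock rates are the ones appearing in the worked answer; the variable and function names are illustrative.

```python
# Instruction counts for classes A, B, C, D (10^6 instructions in total).
COUNTS = [1e5, 2e5, 5e5, 2e5]
# CPI per class for each implementation, as used in the 1.6 solution.
CPI_P1 = [1, 2, 3, 3]
CPI_P2 = [2, 2, 2, 2]


def total_cycles(counts, cpis):
    """Sum of (instruction count x CPI) over all classes."""
    return sum(n * c for n, c in zip(counts, cpis))


def average_cpi(counts, cpis):
    """Global CPI = total cycles / total instructions."""
    return total_cycles(counts, cpis) / sum(counts)


def exec_time(counts, cpis, clock_hz):
    """Execution time in seconds = total cycles / clock rate."""
    return total_cycles(counts, cpis) / clock_hz


if __name__ == "__main__":
    for name, cpis, clock_hz in (("P1", CPI_P1, 2.5e9), ("P2", CPI_P2, 3.0e9)):
        print(name,
              f"cycles={total_cycles(COUNTS, cpis):.3g}",
              f"CPI={average_cpi(COUNTS, cpis):.2f}",
              f"time={exec_time(COUNTS, cpis, clock_hz):.3g} s")
```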

1.8
1.8.1
C = 2 × DP/(V^2 × F)
Pentium 4: C = 3.2E-8 F
Core i5 Ivy Bridge: C = 2.9E-8 F

1.8.2
Pentium 4: 10/100 = 10%
Core i5 Ivy Bridge: 30/70 = 42.9%

1.8.3
(Snew + Dnew)/(Sold + Dold) = 0.90
Dnew = C × Vnew^2 × F
Sold = Vold × I
Snew = Vnew × I
Therefore:
Vnew = [Dnew/(C × F)]^(1/2)
Dnew = 0.90 × (Sold + Dold) - Snew
Snew = Vnew × (Sold/Vold)
Pentium 4:
Snew = Vnew × (10/1.25) = Vnew × 8
Dnew = 0.90 × 100 - Vnew × 8 = 90 - Vnew × 8
Vnew = [(90 - Vnew × 8)/(3.2E-8 × 3.6E9)]^(1/2)
Vnew = 0.85 V
Core i5:
Snew = Vnew × (30/0.9) = Vnew × 33.3
Dnew = 0.90 × 70 - Vnew × 33.3 = 63 - Vnew × 33.3
Vnew = [(63 - Vnew × 33.3)/(2.9E-8 × 3.4E9)]^(1/2)
Vnew = 0.64 V

1.9
1.9.1
p   # arith inst.   # L/S inst.   # branch inst.   cycles    ex. time (s)   speedup
1   2.56E9          1.28E9        2.56E8           7.94E10   39.7           1
2   1.83E9          9.14E8        2.56E8           5.67E10   28.3           1.4
4   9.12E8          4.57E8        2.56E8           2.83E10   14.2           2.8
8   4.57E8          2.29E8        2.56E8           1.42E10   7.10           5.6
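For 1.8.1 and 1.8.2, the capacitive load follows from solving the dynamic power relation P = 1/2 × C × V^2 × F for C. The sketch below assumes the dynamic power is the total power minus the static power (90 W and 40 W), which is an inference from the 10/100 and 30/70 ratios in 1.8.2; the voltages and frequencies are those appearing in 1.8.3, and the function names are illustrative.

```python
def capacitive_load(dynamic_power_w, voltage_v, freq_hz):
    """Solve P_dynamic = 1/2 * C * V^2 * F for C (in farads)."""
    return 2.0 * dynamic_power_w / (voltage_v ** 2 * freq_hz)


def static_fraction(static_power_w, total_power_w):
    """Share of total dissipated power that is static."""
    return static_power_w / total_power_w


# (static W, total W, voltage V, frequency Hz); dynamic = total - static
# is an assumption inferred from the ratios quoted in 1.8.2.
CHIPS = {
    "Pentium 4": (10.0, 100.0, 1.25, 3.6e9),
    "Core i5 Ivy Bridge": (30.0, 70.0, 0.9, 3.4e9),
}

if __name__ == "__main__":
    for name, (static_w, total_w, volts, hz) in CHIPS.items():
        c = capacitive_load(total_w - static_w, volts, hz)
        share = static_fraction(static_w, total_w)
        print(f"{name}: C = {c:.2e} F, static share = {share:.1%}")
```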

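The last two columns of the 1.9.1 table can be rederived from the cycle counts. In this sketch the 2 GHz clock rate is an assumption inferred from the table itself (7.94E10 cycles in 39.7 s); the names are illustrative.

```python
PROCESSOR_COUNTS = [1, 2, 4, 8]
CYCLES = [7.94e10, 5.67e10, 2.83e10, 1.42e10]  # from the 1.9.1 table
CLOCK_HZ = 2e9  # assumption: inferred from 7.94E10 cycles / 39.7 s


def exec_times(cycles, clock_hz):
    """Execution time in seconds for each cycle count."""
    return [c / clock_hz for c in cycles]


def speedups(times):
    """Speedup of each configuration relative to the first entry."""
    return [times[0] / t for t in times]


if __name__ == "__main__":
    times = exec_times(CYCLES, CLOCK_HZ)
    for p, t, s in zip(PROCESSOR_COUNTS, times, speedups(times)):
        print(f"p={p}: time={t:.2f} s, speedup={s:.1f}")
```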

1.9.2
p   ex. time (s)
1   41.0
2   29.3
4   14.6
8   7.33

1.9.3
3

1.10
1.10.1
die area(15 cm) = wafer area/dies per wafer = π × 7.5^2/84 = 2.10 cm^2
yield(15 cm) = 1/(1 + (0.020 × 2.10/2))^2 = 0.9593
die area(20 cm) = wafer area/dies per wafer = π × 10^2/100 = 3.14 cm^2
yield(20 cm) = 1/(1 + (0.031 × 3.14/2))^2 = 0.9093

1.10.2
cost/die(15 cm) = 12/(84 × 0.9593) = 0.1489
cost/die(20 cm) = 15/(100 × 0.9093) = 0.1650

1.10.3
die area(15 cm) = wafer area/dies per wafer = π × 7.5^2/(84 × 1.1) = 1.91 cm^2
yield(15 cm) = 1/(1 + (0.020 × 1.15 × 1.91/2))^2 = 0.9575
die area(20 cm) = wafer area/dies per wafer = π × 10^2/(100 × 1.1) = 2.86 cm^2
yield(20 cm) = 1/(1 + (0.031 × 1.15 × 2.86/2))^2 = 0.9054

1.10.4
defects per area(0.92) = (1 - y^0.5)/(y^0.5 × die_area/2) = (1 - 0.92^0.5)/(0.92^0.5 × 2/2) = 0.043 defects/cm^2
defects per area(0.95) = (1 - y^0.5)/(y^0.5 × die_area/2) = (1 - 0.95^0.5)/(0.95^0.5 × 2/2) = 0.026 defects/cm^2

1.11
1.11.1
CPI = clock rate × CPU time/instr. count
clock rate = 1/cycle time = 3 GHz
CPI(bzip2) = 3 × 10^9 × 750/(2389 × 10^9) = 0.94

1.11.2
SPEC ratio = ref. time/execution time
SPEC ratio(bzip2) = 9650/750 = 12.86

1.11.3
CPU time = No. instr. × CPI/clock rate
If CPI and clock rate do not change, the CPU time increase is equal to the increase in the number of instructions, that is, 10%.
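The yield and cost formulas used in 1.10.1, 1.10.2, and 1.10.4 can be written as small functions. This is a sketch with illustrative names; the wafer diameters, die counts, defect densities, and wafer costs are the ones appearing in the worked answer. It uses the unrounded die area, so the fourth decimal of a yield may differ slightly from the hand calculation.

```python
import math


def die_area(wafer_diameter_cm, dies_per_wafer):
    """Die area in cm^2 = wafer area / dies per wafer."""
    return math.pi * (wafer_diameter_cm / 2.0) ** 2 / dies_per_wafer


def die_yield(defects_per_cm2, area_cm2):
    """Yield = 1 / (1 + defects per area * die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_cm2 * area_cm2 / 2.0) ** 2


def cost_per_good_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


def defects_for_yield(target_yield, area_cm2):
    """Invert the yield formula to get the allowed defects per cm^2."""
    root = math.sqrt(target_yield)
    return (1.0 - root) / (root * area_cm2 / 2.0)


if __name__ == "__main__":
    # (diameter cm, dies per wafer, defects per cm^2, wafer cost)
    for diameter, dies, defects, cost in ((15, 84, 0.020, 12), (20, 100, 0.031, 15)):
        area = die_area(diameter, dies)
        y = die_yield(defects, area)
        print(f"{diameter} cm wafer: area={area:.2f} cm^2, yield={y:.4f}, "
              f"cost/die={cost_per_good_die(cost, dies, y):.4f}")
    for target in (0.92, 0.95):
        print(f"yield {target}: {defects_for_yield(target, 2.0):.3f} defects/cm^2")
```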

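The relations behind 1.11.1 to 1.11.3 are simple enough to check numerically. The sketch below uses the bzip2 figures quoted in the worked answer (750 s, 2389 × 10^9 instructions, 3 GHz, reference time 9650 s); the function names are illustrative.

```python
def cpi(clock_hz, cpu_time_s, instr_count):
    """CPI = clock rate x CPU time / instruction count."""
    return clock_hz * cpu_time_s / instr_count


def spec_ratio(reference_time_s, cpu_time_s):
    """SPEC ratio = reference time / execution time."""
    return reference_time_s / cpu_time_s


def cpu_time(instr_count, cpi_value, clock_hz):
    """CPU time = No. instr. x CPI / clock rate."""
    return instr_count * cpi_value / clock_hz


if __name__ == "__main__":
    bzip2_cpi = cpi(3e9, 750.0, 2389e9)
    print(f"CPI(bzip2) = {bzip2_cpi:.2f}")
    print(f"SPEC ratio(bzip2) = {spec_ratio(9650.0, 750.0):.2f}")
    # 1.11.3: 10% more instructions at the same CPI and clock rate.
    growth = cpu_time(1.1 * 2389e9, bzip2_cpi, 3e9) / cpu_time(2389e9, bzip2_cpi, 3e9)
    print(f"CPU time grows by a factor of {growth:.2f}")
```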

1.11.4
CPU time(before) = No. instr. × CPI/clock rate
CPU time(after) = 1.1 × No. instr. × 1.05 × CPI/clock rate
CPU time(after)/CPU time(before) = 1.1 × 1.05 = 1.155. Thus, CPU time is increased by 15.5%.

1.11.5
SPECratio = reference time/CPU time
SPECratio(after)/SPECratio(before) = CPU time(before)/CPU time(after) = 1/1.155 = 0.86. The SPECratio is decreased by 14%.

1.11.6
CPI = (CPU time × clock rate)/No. instr.
CPI = 700 × 4 × 10^9/(0.85 × 2389 × 10^9) = 1.37

1.11.7
Clock rate ratio = 4 GHz/3 GHz = 1.33
CPI @ 4 GHz = 1.37, CPI @ 3 GHz = 0.94, ratio = 1.45
They are different because, although the number of instructions has been reduced by 15%, the CPU time has been reduced by a lower percentage.

1.11.8
700/750 = 0.933. CPU time reduction: 6.7%

1.11.9
No. instr. = CPU time × clock rate/CPI
No. instr. = 960 × 0.9 × 4 × 10^9/1.61 = 2146 × 10^9

1.11.10
Clock rate = No. instr. × CPI/CPU time.
Clock rate(new) = No. instr. × CPI/(0.9 × CPU time) = 1/0.9 × clock rate(old) = 3.33 GHz

1.11.11
Clock rate = No. instr. × CPI/CPU time.
Clock rate(new) = No. instr. × 0.85 × CPI/(0.80 × CPU time) = 0.85/0.80 × clock rate(old) = 3.18 GHz

1.12
1.12.1
T(P1) = 5 × 10^9 × 0.9/(4 × 10^9) = 1.125 s
T(P2) = 10^9 × 0.75/(3 × 10^9) = 0.25 s
clock rate(P1) > clock rate(P2), performance(P1) < performance(P2)

1.12.2
T(P1) = No. instr. × CPI/clock rate
T(P1) = 2.25 × 10^-1 s
T(P2) = N × 0.75/(3 × 10^9), then N = 9 × 10^8

1.12.3
MIPS = Clock rate × 10^-6/CPI
MIPS(P1) = 4 × 10^9 × 10^-6/0.9 = 4.44 × 10^3


MIPS(P2) = 3 × 10^9 × 10^-6/0.75 = 4.0 × 10^3
MIPS(P1) > MIPS(P2), performance(P1) < performance(P2) (from 1.12.1)

1.12.4
MFLOPS = No. FP operations × 10^-6/T
MFLOPS(P1) = .4 × 5E9 × 1E-6/1.125 = 1.78E3
MFLOPS(P2) = .4 × 1E9 × 1E-6/.25 = 1.60E3
MFLOPS(P1) > MFLOPS(P2), performance(P1) < performance(P2) (from 1.12.1)

1.13
1.13.1
Tfp = 70 × 0.8 = 56 s. Tnew = 56 + 85 + 55 + 40 = 236 s. Reduction: 5.6%

1.13.2
Tnew = 250 × 0.8 = 200 s, Tfp + Tl/s + Tbranch = 165 s, Tint = 35 s. Reduction time INT: 58.8%

1.13.3
Tnew = 250 × 0.8 = 200 s, Tfp + Tint + Tl/s = 210 s. NO

1.14
1.14.1
Clock cycles = CPIfp × No. FP instr. + CPIint × No. INT instr. + CPIl/s × No. L/S instr. + CPIbranch × No. branch instr.
TCPU = clock cycles/clock rate = clock cycles/(2 × 10^9)
clock cycles = 512 × 10^6; TCPU = 0.256 s
To halve the number of clock cycles by improving the CPI of FP instructions:
CPIimproved fp × No. FP instr. + CPIint × No. INT instr. + CPIl/s × No. L/S instr. + CPIbranch × No. branch instr. = clock cycles/2
CPIimproved fp = (clock cycles/2 - (CPIint × No. INT instr. + CPIl/s × No. L/S instr. + CPIbranch × No. branch instr.))/No. FP instr.
CPIimproved fp = (256 - 462)/50 < 0 => not possible

1.14.2
Using the clock cycle data from 1.14.1.
To halve the number of clock cycles by improving the CPI of L/S instructions:
CPIfp × No. FP instr. + CPIint × No. INT instr. + CPIimproved l/s × No. L/S instr. + CPIbranch × No. branch instr. = clock cycles/2
CPIimproved l/s = (clock cycles/2 - (CPIfp × No. FP instr. + CPIint × No. INT instr. + CPIbranch × No. branch instr.))/No. L/S instr.
CPIimproved l/s = (256 - 198)/80 = 0.725

1.14.3
Clock cycles = CPIfp × No. FP instr. + CPIint × No. INT instr. + CPIl/s × No. L/S instr. + CPIbranch × No. branch instr.


TCPU = clock cycles/clock rate = clock cycles/(2 × 10^9)
CPIint = 0.6 × 1 = 0.6; CPIfp = 0.6 × 1 = 0.6; CPIl/s = 0.7 × 4 = 2.8; CPIbranch = 0.7 × 2 = 1.4
TCPU (before improv.) = 0.256 s; TCPU (after improv.) = 0.171 s

1.15
processors   exec. time/processor   time w/overhead   speedup             actual speedup/ideal speedup
1            100
2            50                     54                100/54 = 1.85       1.85/2 = 0.93
4            25                     29                100/29 = 3.44       3.44/4 = 0.86
8            12.5                   16.5              100/16.5 = 6.06     6.06/8 = 0.75
16           6.25                   10.25             100/10.25 = 9.76    9.76/16 = 0.61
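The reasoning in 1.13 amounts to asking how much time one instruction class may take once the others are fixed. The sketch below assumes the per-class times implied by the figures in the worked answer (FP 70 s, INT 85 s, L/S 55 s, branch 40 s, 250 s in total); that assignment is inferred from the sums shown there, and the names are illustrative.

```python
# Seconds per instruction class, inferred from the sums in the 1.13 solution.
TIMES = {"fp": 70.0, "int": 85.0, "ls": 55.0, "branch": 40.0}


def total_time(times):
    """Total execution time across all instruction classes."""
    return sum(times.values())


def required_class_time(times, cls, target_total):
    """Time left for `cls` if every other class keeps its current time.

    A negative result means the target cannot be reached by speeding up
    this class alone.
    """
    others = total_time(times) - times[cls]
    return target_total - others


if __name__ == "__main__":
    # 1.13.1: FP time cut by 20%.
    faster_fp = dict(TIMES, fp=TIMES["fp"] * 0.8)
    saved = 1.0 - total_time(faster_fp) / total_time(TIMES)
    print(f"1.13.1: new total {total_time(faster_fp):.0f} s, reduction {saved:.1%}")

    # 1.13.2 and 1.13.3: total time cut by 20%.
    target = total_time(TIMES) * 0.8
    for cls in ("int", "branch"):
        needed = required_class_time(TIMES, cls, target)
        verdict = "possible" if needed >= 0 else "not possible"
        print(f"{cls}: needs {needed:.0f} s ({verdict})")
```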

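The 1.15 table can be regenerated from two numbers. In this sketch the 100 s single-processor time comes from the table, while the fixed 4 s overhead is an assumption inferred from the gap between the per-processor time and the time with overhead (54 - 50); the names are illustrative.

```python
SINGLE_PROCESSOR_TIME = 100.0  # seconds, from the 1.15 table
OVERHEAD = 4.0  # seconds; assumption inferred from 54 - 50 in the table


def time_with_overhead(processors):
    """Per-processor share of the work plus the fixed overhead."""
    return SINGLE_PROCESSOR_TIME / processors + OVERHEAD


def speedup(processors):
    """Speedup relative to the single-processor run without overhead."""
    return SINGLE_PROCESSOR_TIME / time_with_overhead(processors)


def efficiency(processors):
    """Actual speedup divided by the ideal speedup (the processor count)."""
    return speedup(processors) / processors


if __name__ == "__main__":
    for p in (2, 4, 8, 16):
        print(f"p={p}: time={time_with_overhead(p):.2f} s, "
              f"speedup={speedup(p):.2f}, efficiency={efficiency(p):.2f}")
```

The printed values are rounded to two decimals, so a few differ in the last digit from the table, which truncates.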
