Computer Organization and Design: The Hardware/Software Interface (English Edition, 4th Edition, ARM Edition)

Price: ￥95.00

Authors:	Patterson et al. (USA)
Publisher:	China Machine Press
Series:
Tags:	Computer architecture

ISBN:	9787111302889	Publication date:	2010-04-01	Binding:	Paperback
Format:	16mo	Pages:	689	Word count:

Summary

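As in previous editions, Computer Organization and Design: The Hardware/Software Interface (English Edition, 4th Edition, ARM Edition) uses a MIPS processor to present the fundamentals of computer hardware technology, pipelining, memory hierarchies, and I/O. It also includes some coverage of the x86 architecture. This best-selling computer organization text has been thoroughly updated to focus on the revolutionary change now taking place in computer architecture: the move from uniprocessors to multicore microprocessors. The ARM edition is published to emphasize the importance of embedded systems to the computing industry across Asia, and it uses an ARM processor to discuss the instruction set and arithmetic of a real computer, because ARM is the most popular instruction set architecture for embedded devices, of which about 4 billion are sold worldwide each year.

The book uses ARMv6 (ARM 11 family) as the core architecture to present the fundamentals of instruction sets and computer arithmetic. It covers the revolutionary change from sequential to parallel computing, with a new chapter on parallelism and sections in every chapter that highlight parallel hardware and software topics. A new appendix, written by the Chief Scientist and the Director of Architecture of NVIDIA, covers the emergence and importance of the modern GPU and describes in detail, for the first time, this highly parallel, multithreaded, multicore processor optimized for visual computing. The book describes a unique approach to measuring multicore performance, the "Roofline model", with benchmarks and analysis of the AMD Opteron X4, Intel Xeon 5000, Sun UltraSPARC T2, and IBM Cell. It includes new content on flash memory and virtual machines, and it provides a large number of thought-provoking exercises, more than 200 pages in all. The AMD Opteron X4 and Intel Nehalem serve as running examples throughout the book, and all processor performance examples have been updated with the SPEC CPU2006 suite.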

About the Authors

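David A. Patterson is a professor of computer science at the University of California, Berkeley, a member of the National Academy of Engineering, and a Fellow of the IEEE and the ACM. He received the IEEE James H. Mulligan, Jr. Education Medal for his successful, inspiring approach to teaching. His contributions to RISC technology earned him the 1995 IEEE Technical Achievement Award, and his work on RAID earned him the 1999 IEEE Reynold Johnson Information Storage Award. In 2000 he shared the John von Neumann Medal with John L. Hennessy.

John L. Hennessy is the president of Stanford University, a Fellow of the IEEE and the ACM, and a member of the National Academy of Engineering and the American Academy of Arts and Sciences. Professor Hennessy received the 2001 Eckert-Mauchly Award for his outstanding contributions to RISC technology, as well as the 2001 Seymour Cray Computer Engineering Award, and he shared the 2000 John von Neumann Medal with David A. Patterson.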

Table of Contents

Contents
Preface xv
CHAPTERS
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Below Your Program 10
1.3 Under the Covers 13
1.4 Performance 26
1.5 The Power Wall 39
1.6 The Sea Change: The Switch from Uniprocessors to Multiprocessors 41
1.7 Real Stuff: Manufacturing and Benchmarking the AMD Opteron X4 44
1.8 Fallacies and Pitfalls 51
1.9 Concluding Remarks 54
1.10 Historical Perspective and Further Reading 55
1.11 Exercises 56
2 Instructions: Language of the Computer 74
2.1 Introduction 76
2.2 Operations of the Computer Hardware 77
2.3 Operands of the Computer Hardware 80
2.4 Signed and Unsigned Numbers 86
2.5 Representing Instructions in the Computer 93
2.6 Logical Operations 100
2.7 Instructions for Making Decisions 104
2.8 Supporting Procedures in Computer Hardware 113
2.9 Communicating with People 122
2.10 ARM Addressing for 32-Bit Immediates and More Complex Addressing Modes 127
2.11 Parallelism and Instructions: Synchronization 133
2.12 Translating and Starting a Program 135
2.13 A C Sort Example to Put It All Together 143
2.14 Arrays versus Pointers 152
2.15 Advanced Material: Compiling C and Interpreting Java 156
2.16 Real Stuff: MIPS Instructions 156
2.17 Real Stuff: x86 Instructions 161
2.18 Fallacies and Pitfalls 170
2.19 Concluding Remarks 171
2.20 Historical Perspective and Further Reading 174
2.21 Exercises 174
3 Arithmetic for Computers 214
3.1 Introduction 216
3.2 Addition and Subtraction 216
3.3 Multiplication 220
3.4 Division 226
3.5 Floating Point 232
3.6 Parallelism and Computer Arithmetic: Associativity 258
3.7 Real Stuff: Floating Point in the x86 259
3.8 Fallacies and Pitfalls 262
3.9 Concluding Remarks 265
3.10 Historical Perspective and Further Reading 268
3.11 Exercises 269
4 The Processor 284
4.1 Introduction 286
4.2 Logic Design Conventions 289
4.3 Building a Datapath 293
4.4 A Simple Implementation Scheme 302
4.5 An Overview of Pipelining 316
4.6 Pipelined Datapath and Control 330
4.7 Data Hazards: Forwarding versus Stalling 349
4.8 Control Hazards 361
4.9 Exceptions 370
4.10 Parallelism and Advanced Instruction-Level Parallelism 377
4.11 Real Stuff: the AMD Opteron X4 (Barcelona) Pipeline 390
4.12 Advanced Topic: an Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 392
4.13 Fallacies and Pitfalls 393
4.14 Concluding Remarks 394
4.15 Historical Perspective and Further Reading 395
4.16 Exercises 395
5 Large and Fast: Exploiting Memory Hierarchy 436
5.1 Introduction 438
5.2 The Basics of Caches 443
5.3 Measuring and Improving Cache Performance 461
5.4 Virtual Memory 478
5.5 A Common Framework for Memory Hierarchies 504
5.6 Virtual Machines 511
5.7 Using a Finite-State Machine to Control a Simple Cache 515
5.8 Parallelism and Memory Hierarchies: Cache Coherence 520
5.9 Advanced Material: Implementing Cache Controllers 524
5.10 Real Stuff: the AMD Opteron X4 (Barcelona) and Intel Nehalem Memory Hierarchies 525
5.11 Fallacies and Pitfalls 529
5.12 Concluding Remarks 533
5.13 Historical Perspective and Further Reading 534
5.14 Exercises 534
6 Storage and Other I/O Topics 554
6.1 Introduction 556
6.2 Dependability, Reliability, and Availability 559
6.3 Disk Storage 561
6.4 Flash Storage 566
6.5 Connecting Processors, Memory, and I/O Devices 568
6.6 Interfacing I/O Devices to the Processor, Memory, and Operating System 572
6.7 I/O Performance Measures: Examples from Disk and File Systems 582
6.8 Designing an I/O System 584
6.9 Parallelism and I/O: Redundant Arrays of Inexpensive Disks 585
6.10 Real Stuff: Sun Fire x4150 Server 592
6.11 Advanced Topics: Networks 598
6.12 Fallacies and Pitfalls 599
6.13 Concluding Remarks 603
6.14 Historical Perspective and Further Reading 604
6.15 Exercises 605
7 Multicores, Multiprocessors, and Clusters 616
7.1 Introduction 618
7.2 The Difficulty of Creating Parallel Processing Programs 620
7.3 Shared Memory Multiprocessors 624
7.4 Clusters and Other Message-Passing Multiprocessors 627
7.5 Hardware Multithreading 631
7.6 SISD, MIMD, SIMD, SPMD, and Vector 634
7.7 Introduction to Graphics Processing Units 640
7.8 Introduction to Multiprocessor Network Topologies 646
7.9 Multiprocessor Benchmarks 650
7.10 Roofline: A Simple Performance Model 653
7.11 Real Stuff: Benchmarking Four Multicores Using the Roofline Model 661
7.12 Fallacies and Pitfalls 670
7.13 Concluding Remarks 672
7.14 Historical Perspective and Further Reading 674
7.15 Exercises 674
Index I-1
CD-ROM CONTENT
A Graphics and Computing GPUs A-2
A.1 Introduction A-3
A.2 GPU System Architectures A-7
A.3 Scalable Parallelism – Programming GPUs A-12
A.4 Multithreaded Multiprocessor Architecture A-25
A.5 Parallel Memory System A-36
A.6 Floating Point Arithmetic A-41
A.7 Real Stuff: The NVIDIA GeForce 8800 A-46
A.8 Real Stuff: Mapping Applications to GPUs A-55
A.9 Fallacies and Pitfalls A-72
A.10 Concluding Remarks A-76
A.11 Historical Perspective and Further Reading A-77
B1 ARM and Thumb Assembler Instructions B1-2
B1.1 Using This Appendix B1-3
B1.2 Syntax B1-4
B1.3 Alphabetical List of ARM and Thumb Instructions B1-8
B1.4 ARM Assembler Quick Reference B1-49
B1.5 GNU Assembler Quick Reference B1-60
B2 ARM and Thumb Instruction Encodings B2-2
B2.1 ARM Instruction Set Encodings B2-3
B2.2 Thumb Instruction Set Encodings B2-9
B2.3 Program Status Registers B2-11

B3 Instruction Cycle Timings B3-2
B3.1 Using the Instruction Set Cycle Timing Tables B3-3
B3.2 ARM7TDMI Instruction Cycle Timings B3-5
B3.3 ARM9TDMI Instruction Cycle Timings B3-6
B3.4 StrongARM1 Instruction Cycle Timings B3-8
B3.5 ARM9E Instruction Cycle Timings B3-9
B3.6 ARM10E Instruction Cycle Timings B3-11
B3.7 Intel XScale Instruction Cycle Timings B3-12
B3.8 ARM11 Cycle Timings B3-14
C The Basics of Logic Design C-2
C.1 Introduction C-3
C.2 Gates, Truth Tables, and Logic Equations C-4
C.3 Combinational Logic C-9
C.4 Using a Hardware Description Language C-20
C.5 Constructing a Basic Arithmetic Logic Unit C-26
C.6 Faster Addition: Carry Lookahead C-38
C.7 Clocks C-48
C.8 Memory Elements: Flip-Flops, Latches, and Registers C-50
C.9 Memory Elements: SRAMs and DRAMs C-58
C.10 Finite-State Machines C-67
C.11 Timing Methodologies C-72
C.12 Field Programmable Devices C-78
C.13 Concluding Remarks C-79
C.14 Exercises C-80
D Mapping Control to Hardware D-2
D.1 Introduction D-3
D.2 Implementing Combinational Control Units D-4
D.3 Implementing Finite-State Machine Control D-8
D.4 Implementing the Next-State Function with a Sequencer D-22
D.5 Translating a Microprogram to Hardware D-28
D.6 Concluding Remarks D-32
D.7 Exercises D-33
ADVANCED CONTENT
Section 2.15 Compiling C and Interpreting Java
Section 4.12 An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
Section 5.9 Implementing Cache Controllers
Section 6.11 Networks
HISTORICAL PERSPECTIVES & FURTHER READING
Chapter 1 Computer Abstractions and Technology: Section 1.10
Chapter 2 Instructions: Language of the Computer: Section 2.20
Chapter 3 Arithmetic for Computers: Section 3.10
Chapter 4 The Processor: Section 4.15
Chapter 5 Large and Fast: Exploiting Memory Hierarchy: Section 5.13
Chapter 6 Storage and Other I/O Topics: Section 6.14
Chapter 7 Multicores, Multiprocessors, and Clusters: Section 7.14
Appendix A Graphics and Computing GPUs: Section A.11
TUTORIALS
VHDL
Verilog

SOFTWARE
Xilinx FPGA Design, Simulation and Synthesis Software
QEMU http://www.nongnu.org/qemu/about.html
Glossary G-1
Index I-1
Further Reading FR-1