About the Authors

David A. Patterson (University of California at Berkeley) has taught computer architecture since joining the faculty in 1977 and is holder of the E. H. and M. E. Pardee Chair of Computer Science. His teaching has been honored by the ACM with the Outstanding Educator Award and by the University of California with the Distinguished Teaching Award. He also received the inaugural Outstanding Alumnus Award of the UCLA Computer Science Department. He is a member of the National Academy of Engineering and is a Fellow of both the IEEE and the Association for Computing Machinery (ACM). Past chair of the CS Division in the EECS Department at Berkeley and the ACM Special Interest Group in Computer Architecture, Patterson is currently chair of the Computing Research Association. He has consulted for many companies, including Digital, HP, Intel, and Sun, and is also co-author of five books. At Berkeley, he led the design and implementation of RISC I, likely the first VLSI Reduced Instruction Set Computer. This research became the foundation of the SPARC architecture, currently used by Fujitsu, ICL, Sun, TI, and Xerox. He was also a leader of the Redundant Arrays of Inexpensive Disks (RAID) project, which led to high-performance storage systems from many companies. These projects led to three distinguished dissertation awards from the ACM. His current research interests are in large-scale computing using networks of workstations (NOW).

John L. Hennessy (Stanford University) has been a member of the Stanford faculty since 1977, where he teaches computer architecture and supervises a group of energetic Ph.D. students. He is currently Chairman of the Computer Science Department and holds the Willard R. and Inez Kerr Bell Professorship in the School of Engineering. Hennessy is a Fellow of the IEEE, a member of the National Academy of Engineering, and a Fellow of the American Academy of Arts and Sciences. He received the 1994 IEEE Piore Award for his contributions to the development of RISC technology.

Hennessy's original research area was optimizing compilers. His research group at Stanford developed many of the techniques now in commercial use. In 1981, he started the MIPS project at Stanford with a handful of graduate students. After completing the project in 1984, he took a one-year leave from the university to co-found MIPS Computer Systems, which has since merged with Silicon Graphics. Hennessy's recent research at Stanford focuses on the area of designing and exploiting multiprocessors. Most recently, he has been involved in the development of the DASH multiprocessor architecture, one of the first distributed shared-memory multiprocessors.
Table of Contents
Contents
Foreword
Preface
Acknowledgments
Fundamentals of Computer Design
1.1 Introduction
1.2 The Task of a Computer Designer
1.3 Technology and Computer Usage Trends
1.4 Cost and Trends in Cost
1.5 Measuring and Reporting Performance
1.6 Quantitative Principles of Computer Design
1.7 Putting It All Together: The Concept of Memory Hierarchy
1.8 Fallacies and Pitfalls
1.9 Concluding Remarks
1.10 Historical Perspective and References
Exercises
Instruction Set Principles and Examples
2.1 Introduction
2.2 Classifying Instruction Set Architectures
2.3 Memory Addressing
2.4 Operations in the Instruction Set
2.5 Type and Size of Operands
2.6 Encoding an Instruction Set
2.7 Crosscutting Issues: The Role of Compilers
2.8 Putting It All Together: The DLX Architecture
2.9 Fallacies and Pitfalls
2.10 Concluding Remarks
2.11 Historical Perspective and References
Exercises
Pipelining
3.1 What Is Pipelining?
3.2 The Basic Pipeline for DLX
3.3 The Major Hurdle of Pipelining: Pipeline Hazards
3.4 Data Hazards
3.5 Control Hazards
3.6 What Makes Pipelining Hard to Implement?
3.7 Extending the DLX Pipeline to Handle Multicycle Operations
3.8 Crosscutting Issues: Instruction Set Design and Pipelining
3.9 Putting It All Together: The MIPS R4000 Pipeline
3.10 Fallacies and Pitfalls
3.11 Concluding Remarks
3.12 Historical Perspective and References
Exercises
Advanced Pipelining and Instruction-Level Parallelism
4.1 Instruction-Level Parallelism: Concepts and Challenges
4.2 Overcoming Data Hazards with Dynamic Scheduling
4.3 Reducing Branch Penalties with Dynamic Hardware Prediction
4.4 Taking Advantage of More ILP with Multiple Issue
4.5 Compiler Support for Exploiting ILP
4.6 Hardware Support for Extracting More Parallelism
4.7 Studies of ILP
4.8 Putting It All Together: The PowerPC 620
4.9 Fallacies and Pitfalls
4.10 Concluding Remarks
4.11 Historical Perspective and References
Exercises
Memory-Hierarchy Design
5.1 Introduction
5.2 The ABCs of Caches
5.3 Reducing Cache Misses
5.4 Reducing Cache Miss Penalty
5.5 Reducing Hit Time
5.6 Main Memory
5.7 Virtual Memory
5.8 Protection and Examples of Virtual Memory
5.9 Crosscutting Issues in the Design of Memory Hierarchies
5.10 Putting It All Together: The Alpha AXP 21064 Memory Hierarchy
5.11 Fallacies and Pitfalls
5.12 Concluding Remarks
5.13 Historical Perspective and References
Storage Systems
6.1 Introduction
6.2 Types of Storage Devices
6.3 Buses: Connecting I/O Devices to CPU/Memory
6.4 I/O Performance Measures
6.5 Reliability, Availability, and RAID
6.6 Crosscutting Issues: Interfacing to an Operating System
6.7 Designing an I/O System
6.8 Putting It All Together: UNIX File System Performance
6.9 Fallacies and Pitfalls
6.10 Concluding Remarks
6.11 Historical Perspective and References
Exercises
Interconnection Networks
7.1 Introduction
7.2 A Simple Network
7.3 Connecting the Interconnection Network to the Computer
7.4 Interconnection Network Media
7.5 Connecting More Than Two Computers
7.6 Practical Issues for Commercial Interconnection Networks
7.7 Examples of Interconnection Networks
7.8 Crosscutting Issues for Interconnection Networks
7.9 Internetworking
7.10 Putting It All Together: An ATM Network of Workstations
7.11 Fallacies and Pitfalls
7.12 Concluding Remarks
7.13 Historical Perspective and References
Exercises
8 Multiprocessors
8.1 Introduction
8.2 Characteristics of Application Domains
8.3 Centralized Shared-Memory Architectures
8.4 Distributed Shared-Memory Architectures
8.5 Synchronization
8.6 Models of Memory Consistency
8.7 Crosscutting Issues
8.8 Putting It All Together: The SGI Challenge Multiprocessor
8.9 Fallacies and Pitfalls
8.10 Concluding Remarks
8.11 Historical Perspective and References
Exercises
Appendix A: Computer Arithmetic
by DAVID GOLDBERG
Xerox Palo Alto Research Center
A.1 Introduction
A.2 Basic Techniques of Integer Arithmetic
A.3 Floating Point
A.4 Floating-Point Multiplication
A.5 Floating-Point Addition
A.6 Division and Remainder
A.7 More on Floating-Point Arithmetic
A.8 Speeding Up Integer Addition
A.9 Speeding Up Integer Multiplication and Division
A.10 Putting It All Together
A.11 Fallacies and Pitfalls
A.12 Historical Perspective and References
Exercises
Appendix B: Vector Processors
B.1 Why Vector Processors?
B.2 Basic Vector Architecture
B.3 Two Real-World Issues: Vector Length and Stride
B.4 Effectiveness of Compiler Vectorization
B.5 Enhancing Vector Performance
B.6 Putting It All Together: Performance of Vector Processors
B.7 Fallacies and Pitfalls
B.8 Concluding Remarks
B.9 Historical Perspective and References
Exercises
Appendix C: Survey of RISC Architectures
C.1 Introduction
C.2 Addressing Modes and Instruction Formats
C.3 Instructions: The DLX Subset
C.4 Instructions: Common Extensions to DLX
C.5 Instructions Unique to MIPS
C.6 Instructions Unique to SPARC
C.7 Instructions Unique to PowerPC
C.8 Instructions Unique to PA-RISC
C.9 Concluding Remarks
C.10 References
Appendix D: An Alternative to RISC: The Intel 80x86
D.1 Introduction
D.2 80x86 Registers and Data Addressing Modes
D.3 80x86 Integer Operations
D.4 80x86 Floating-Point Operations
D.5 80x86 Instruction Encoding
D.6 Putting It All Together: Measurements of Instruction Set Usage
D.7 Concluding Remarks
D.8 Historical Perspective and References
Appendix E: Implementing Coherence Protocols
E.1 Implementation Issues for the Snooping Coherence Protocol
E.2 Implementation Issues in the Distributed Directory Protocol
Exercises
References
Index