Contents Preface prefcetothe Second Edition 1 Introduction 1 Mathematical Formulation 2 Example:A Transportation Problem 4 Continuous versus Discrete Optimization 5 Constrained and Unconstrained Optimization 6 Global and Local Optimization 6 Stocbastic and Deterministic Optimization 7 Convexity 7 Optimization Algorithms 8 Notes and References 9 2 Fundamentals of Unconstrained Optimization 10 2.1 What ls a Solution? 12 Recognizing a Local Minimum 14 Nonsmooth Problems 17 2.2 Overview of A1gorithms 18 Two Strategies:Line Search and Trust Region 19 Search Directions for Line Search Methods 20 Models for Trust-Region Methods 25 Scaling 26 Exercises 27 3 Line Search Methods 30 3.1 Step Length 31 The Wolfe Conditions 33 The Goldstein Conditions 36 Sufficient Decrease and Backtracking 37 3.2 Convergence of Line Search Methods 37 3.3 Rate of Convergence 41 Convergence Rate of Steepest Descent 42 Newton's Method 44 Quasi-Newton Methods 46 3.4 Newton's Method with Hessian Modification 48 Eigenvalue Modification 49 Adding a Multiple of the ldentity 51 Modified Cholesky Factorization 52 Modified Symmetric Indefinite Factorization 54 3.5 Step-Length Selection Algorithms 56 lnterpolation 57 lnitial Step Length 59 A Line Search A1gorithm for the Wolfe Conditions 60 Notes and References 62 Exercises 63 4 Trust-Region Methods 66 Outline of the Trust-Region Approach 68 4.1 A1gorithms Based on the Cauchy Point 71 The Cauchy Point 71 lmpro時ng on the Cauchy Point 73 The Dogleg Method 73 Two-Dinlensional Subspace Mininlization 76 4.2 Global Convergence 77 Reduction Obtained by the Cauchy Point 77 Convergence to Stationary Points 79 4.3 lterative Solution of the Subproblem 83 The Hard Case 87 Proof of Theorem 4.1 89 Convergence of Algorithms Based on Nearly Exact Solutions 91 4.4 Local Convergence ofTrust-Region Newton Methods 92 4.5 0ther Enhancements 95 Scaling 95 Trust Regions in 0ther Norms 97 Notes and References 98 Exercises 98 5 Conjugate Gradient Methods 101 5.1 The linear Conjugate Gradient Method 102 Conjugate Direction Methods 102 Basic Properties of thee Conjugate Gradient Method 107 A Practical Form of the Conjugate Gradient Method 111 Rate of Convergence 112 Preconditioning 118 Practical Preconditioners 120 5.2 Nonlinear Conjugate Gradient Methods 121 The Fletcher-Reeves Method 121 The Polak-Ribière Method and Variants 122 Quadratic Termination and Restarts 124 Behavior of the Fletcher-Reeves Method 125 Global Convergence 127 Numerical Performance 131 Notes and Reference 132 Exercises 133 6 Quasi-Newton Methods 135 6.1 The BFGS Method 136 Properties ofthe BFGS Method 141 Implementation 142 6.2 The SR1 Method 144 Properties of SR1 Updating 147 6.3 The Broyden Class 149 6.4 Convergence Analysis 153 Global Convergence of the BFGS Method 153 Superlinear Convergence of the BFGS Method 156 Convergence Analysis of the SR1 Method 160 Notes and References 161 Exercises 162 7 Large-Scale Unconstrained optimization 164 7.1 lnexact Newton Methods 165 Local Convergence of Inexact Newton Methods 166 Line Search Newton-CG Method 168 Trust-Region Newton-CG Method 170 Preconditioning the Trust-Region Newton-CG Method 174 Trust-Region Newton-Lanczos Method 175 7.2 Limited-Memory Quasi-Newton Methods 176 Limited-Memory BFGS 177 Relationship with Conjugate Gradient Methods 180 General Lirnited:d-Memory Updatiug 181 Compact Representation of BFGS Updating 181 Unrolling the Update 184 7.3 Sparse Quasi-Newton Updates 185 7.4 Algorithms for Partially Separable Fnnctions 186 7.5 Perspectives and Sotrware 189 Notes and References 190 Exercises 191 8 Calculating Derivatives 193 8.1 Finite-Difference Derivative Approximations 194 Approximating the Gradient 195 Approximating a Sparse Jacobian 197 Approximatiug the Hessian 201 Approximatiug a Sparse Hessian 202 8.2 Automatic Differentiation 204 Au Example 205 The Forward Mode 206 The Reverse Mode 207 Vector Fnnctions and Partial Separablity 210 Calculating Jacobians ofVector Funlctions 212 Calculating Hessians:Forward Mode 213 Calculating Hessians:Reverse Mode 215 Current Lirnitations 216 Notess and References 217 Exercises 217 9 Derivatve-Free Optiimization 220 9.1 Finite Differences and Noise 221 9.2 Model-Based Methods 223 Interpolation aod Polyoomial Bases 226 Updating the Interpolation Set 227 A Method Based on Minimum-Change Updating 228 9.3 Coordinate and Pattern-Search Methods 229 Coordinate Search Method 230 Pattern-Search Methods 231 9.4 A Conjugate-Direction Method 234 9.5 Nelder-Mead Method 238 9.6 Implicit Filtering 240 Notes and References 242 Exercises 242 10