The quadratic approximation method for finding a minimum of a function of one variable generated a sequence of second-degree Lagrange polynomials and used them to approximate where the minimum is located. It was implicitly assumed that, near the minimum, the shape of the quadratics approximated the shape of the objective function $y = f(x)$. The resulting sequence of minimums of the quadratics converged to the minimum of the objective function $y = f(x)$. Newton's search method extends this process to functions of $n$ independent variables, $z = f(x_1, x_2, \ldots, x_n)$. Starting at an initial point $\mathbf{p}_0$, a sequence of second-degree polynomials in $n$ variables is constructed recursively. If the objective function is well behaved and the initial point is near the actual minimum, then the sequence of minimums of the quadratics converges to the minimum of the objective function. The process uses both the first- and second-order partial derivatives of the objective function. Recall that the gradient method used only the first partial derivatives. It is to be expected that Newton's method will be more efficient than the gradient method.
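As a rough illustration of the idea (not part of the original module), the following is a minimal Python sketch of the standard Newton update $\mathbf{p}_{k+1} = \mathbf{p}_k - [Hf(\mathbf{p}_k)]^{-1}\,\nabla f(\mathbf{p}_k)$. The helper names `grad` and `hess`, the stopping tolerance, and the sample quadratic objective are illustrative assumptions.

```python
import numpy as np

def newton_minimize(grad, hess, p0, tol=1e-8, max_iter=50):
    """Sketch of Newton's search: repeatedly solve H(p) s = grad f(p) and step to p - s."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(p), grad(p))   # avoids forming the inverse Hessian
        p = p - step
        if np.linalg.norm(step) < tol:             # stop when the correction is negligible
            break
    return p

# Sample objective f(x, y) = (x - 1)^2 + 2(y + 2)^2, whose minimum is at (1, -2).
grad = lambda p: np.array([2.0 * (p[0] - 1.0), 4.0 * (p[1] + 2.0)])
hess = lambda p: np.array([[2.0, 0.0], [0.0, 4.0]])
print(newton_minimize(grad, hess, p0=[5.0, 5.0]))  # approximately [ 1. -2.]
```

Because the sample objective is itself quadratic, this sketch converges in a single step; for a general smooth objective the convergence is only local, as noted above.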
Background
Now we turn to the minimization of a function of $n$ variables $z = f(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and the partial derivatives of $f$ are accessible. The Newton search method will turn out to have a familiar form. For illustration purposes we emphasize the two-dimensional case, when $\mathbf{x} = (x, y)$. The extension to $n$ dimensions is discussed in the hyperlink.
Assume that $f(x,y)$ is a function of two variables, $x$ and $y$, and has partial derivatives $f_x(x,y)$ and $f_y(x,y)$. The gradient of $f(x,y)$, denoted by $\nabla f(x,y)$, is the vector function

$$\nabla f(x,y) = \big( f_x(x,y),\; f_y(x,y) \big).$$
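For a concrete example (an illustrative addition, not from the original text), the gradient of a sample function can be formed symbolically with SymPy; the choice $f(x,y) = x^2 y + \sin(xy)$ is arbitrary.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(x * y)                        # arbitrary sample objective
grad_f = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])  # (f_x, f_y)
print(grad_f.T)   # Matrix([[2*x*y + y*cos(x*y), x**2 + x*cos(x*y)]])
```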
Assume that $f_1(x,y)$ and $f_2(x,y)$ are functions of two variables, $x$ and $y$; their Jacobian matrix $J(f_1,f_2)(x,y)$ is

$$J(f_1,f_2)(x,y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(x,y) & \dfrac{\partial f_1}{\partial y}(x,y) \\[2mm] \dfrac{\partial f_2}{\partial x}(x,y) & \dfrac{\partial f_2}{\partial y}(x,y) \end{pmatrix}.$$
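A small SymPy sketch (with $f_1$ and $f_2$ chosen arbitrarily for illustration) shows the Jacobian matrix of a pair of functions:

```python
import sympy as sp

x, y = sp.symbols('x y')
f1 = x**2 + y**2                           # arbitrary sample functions
f2 = x * y
J = sp.Matrix([f1, f2]).jacobian([x, y])   # rows hold the partials of f1 and f2
print(J)   # Matrix([[2*x, 2*y], [y, x]])
```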
Assume that $f(x,y)$ is a function of two variables, $x$ and $y$, and has partial derivatives up to order two. The Hessian matrix $Hf(x,y)$ is defined as follows:

$$Hf(x,y) = \begin{pmatrix} f_{xx}(x,y) & f_{xy}(x,y) \\ f_{yx}(x,y) & f_{yy}(x,y) \end{pmatrix}.$$
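SymPy also provides a `hessian` helper, so the definition can be checked on a sample function (the choice of $f$ below is illustrative):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 + x * y**2          # arbitrary sample function
H = sp.hessian(f, (x, y))    # matrix of second-order partials
print(H)   # Matrix([[6*x, 2*y], [2*y, 2*x]])
```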
Lemma 1. For $f(x,y)$, the Hessian matrix $Hf(x,y)$ is the Jacobian matrix of the two functions $f_x(x,y)$ and $f_y(x,y)$, i.e.

$$Hf(x,y) = J(f_x, f_y)(x,y).$$
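Lemma 1 can be spot-checked symbolically for a sample function (again an arbitrary choice): the Jacobian of the pair $(f_x, f_y)$ should coincide with the Hessian of $f$.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 * y + sp.exp(x * y)               # arbitrary sample function
fx, fy = sp.diff(f, x), sp.diff(f, y)
J = sp.Matrix([fx, fy]).jacobian([x, y])   # Jacobian of (f_x, f_y)
H = sp.hessian(f, (x, y))                  # Hessian of f
print(sp.simplify(J - H))                  # zero matrix, as Lemma 1 asserts
```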
Lemma 2. If the second-order partial derivatives of $f(x,y)$ are continuous, then the Hessian matrix $Hf(x,y)$ is symmetric, i.e. $f_{xy}(x,y) = f_{yx}(x,y)$.
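Lemma 2 is the familiar equality of mixed partial derivatives; a quick symbolic check for one smooth sample function (an illustrative choice) is:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x * y) + x**2 * sp.cos(y)  # arbitrary smooth sample function
fxy = sp.diff(f, x, y)                # differentiate in x, then y
fyx = sp.diff(f, y, x)                # differentiate in y, then x
print(sp.simplify(fxy - fyx))         # 0, so the Hessian is symmetric here
```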