
Newton's Method

The quadratic approximation method for finding a minimum of a function of one variable generated a sequence of second-degree Lagrange polynomials and used them to approximate where the minimum is located. It was implicitly assumed that, near the minimum, the shape of the quadratics approximated the shape of the objective function $f(x)$. The resulting sequence of minimums of the quadratics produced a sequence converging to the minimum of the objective function.

Newton's search method extends this process to functions of $n$ independent variables, $f(x_1, x_2, \ldots, x_n)$. Starting at an initial point, a sequence of second-degree polynomials in $n$ variables can be constructed recursively. If the objective function is well-behaved and the initial point is near the actual minimum, then the sequence of minimums of the quadratics will converge to the minimum of the objective function. The process uses both the first- and second-order partial derivatives of the objective function, whereas the gradient method used only the first partial derivatives. Because it also uses this second-order (curvature) information, Newton's method is expected to be more efficient than the gradient method when started near the minimum.

Background

Now we turn to the minimization of a function $f(\mathbf{X})$ of $n$ variables, where $\mathbf{X} = (x_1, x_2, \ldots, x_n)$ and the partial derivatives of $f$ are accessible. Although the Newton search method will turn out to have a familiar form, for illustration purposes we emphasize the two-dimensional case, $n = 2$. The extension to $n$ dimensions is discussed in the hyperlink.

Assume that $f(x,y)$ is a function of two variables, $x$ and $y$, and has partial derivatives $f_x(x,y)$ and $f_y(x,y)$. The gradient of $f$, denoted by $\nabla f(x,y)$, is the vector function

$$\nabla f(x,y) = \left( f_x(x,y),\; f_y(x,y) \right).$$
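As a rough illustration of the gradient definition, the sketch below compares a hand-computed gradient with a central-difference estimate. The test function $f(x,y) = x^2 - 4x + y^2 - y - xy$, the helper names, and the step size `h` are all assumptions chosen for the example; none of them come from the text above.

```python
# Hypothetical example function (an assumption, not from the text):
# f(x, y) = x^2 - 4x + y^2 - y - x*y
def f(x, y):
    return x**2 - 4*x + y**2 - y - x*y

# Gradient computed by hand: (f_x, f_y)
def grad_exact(x, y):
    return (2*x - 4 - y, 2*y - 1 - x)

# Central-difference estimate of (f_x, f_y); h is an illustrative step size
def grad_numeric(func, x, y, h=1e-6):
    fx = (func(x + h, y) - func(x - h, y)) / (2*h)
    fy = (func(x, y + h) - func(x, y - h)) / (2*h)
    return (fx, fy)

exact_at_origin = grad_exact(0.0, 0.0)
numeric_at_origin = grad_numeric(f, 0.0, 0.0)
exact_at_min = grad_exact(3.0, 2.0)   # the gradient vanishes at (3, 2)
print(exact_at_origin, numeric_at_origin, exact_at_min)
```

The gradient vanishes at $(3, 2)$, which is the kind of point a minimization method looks for.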

Assume that $f_1(x,y)$ and $f_2(x,y)$ are functions of two variables, $x$ and $y$. Their Jacobian matrix $J(x,y)$ is

$$J(x,y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[2mm] \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \end{pmatrix}.$$

Assume that $f(x,y)$ is a function of two variables, $x$ and $y$, and has partial derivatives up to order two. The Hessian matrix $Hf(x,y)$ is defined as follows:

$$Hf(x,y) = \begin{pmatrix} f_{xx}(x,y) & f_{xy}(x,y) \\ f_{yx}(x,y) & f_{yy}(x,y) \end{pmatrix}.$$
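A short worked example may help fix the definition. The function $f(x,y) = x^3 y + y^2$ is a hypothetical choice made for illustration and does not appear in the original text.

```latex
\begin{aligned}
f(x,y) &= x^3 y + y^2, \\
f_x(x,y) &= 3x^2 y, \qquad f_y(x,y) = x^3 + 2y, \\
f_{xx} &= 6xy, \quad f_{xy} = 3x^2, \quad f_{yx} = 3x^2, \quad f_{yy} = 2, \\
Hf(x,y) &= \begin{pmatrix} 6xy & 3x^2 \\ 3x^2 & 2 \end{pmatrix},
\qquad
Hf(1,2) = \begin{pmatrix} 12 & 3 \\ 3 & 2 \end{pmatrix}.
\end{aligned}
```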

Lemma 1. For $f(x,y)$ the Hessian matrix $Hf(x,y)$ is the Jacobian matrix for the two functions $f_x(x,y)$ and $f_y(x,y)$, i.e.

$$Hf(x,y) = \begin{pmatrix} \dfrac{\partial f_x}{\partial x} & \dfrac{\partial f_x}{\partial y} \\[2mm] \dfrac{\partial f_y}{\partial x} & \dfrac{\partial f_y}{\partial y} \end{pmatrix}.$$
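Lemma 1 can be checked numerically. The sketch below, under the assumption of a hypothetical function $f(x,y) = x^3 y + y^2$, differentiates the two first partials $f_x$ and $f_y$ by central differences to form their Jacobian, then compares it with the Hessian worked out by hand. The function, helper names, and step size are illustrative choices, not part of the original text.

```python
# Hypothetical example (an assumption): f(x, y) = x^3 * y + y^2
def f_x(x, y):
    return 3*x**2*y        # first partial with respect to x

def f_y(x, y):
    return x**3 + 2*y      # first partial with respect to y

# Hessian computed by hand: [[f_xx, f_xy], [f_yx, f_yy]]
def hessian_exact(x, y):
    return [[6*x*y, 3*x**2],
            [3*x**2, 2.0]]

# Jacobian of the pair (f1, f2) estimated by central differences
def jacobian_numeric(f1, f2, x, y, h=1e-5):
    return [
        [(f1(x + h, y) - f1(x - h, y)) / (2*h),
         (f1(x, y + h) - f1(x, y - h)) / (2*h)],
        [(f2(x + h, y) - f2(x - h, y)) / (2*h),
         (f2(x, y + h) - f2(x, y - h)) / (2*h)],
    ]

H = hessian_exact(1.0, 2.0)
J = jacobian_numeric(f_x, f_y, 1.0, 2.0)

# Largest entrywise difference between the Hessian and the Jacobian of (f_x, f_y)
max_diff = max(abs(H[i][j] - J[i][j]) for i in range(2) for j in range(2))
print(H, J, max_diff)
```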

Lemma 2. If the second-order partial derivatives of $f(x,y)$ are continuous, then the mixed partials agree, $f_{xy}(x,y) = f_{yx}(x,y)$, and the Hessian matrix $Hf(x,y)$ is symmetric.
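With the gradient and Hessian in hand, the Newton search in two variables can be sketched as follows. The sketch assumes the standard Newton step, which the text above does not spell out: at the current point $P_k$, solve $Hf(P_k)\,\mathbf{d} = \nabla f(P_k)$ and set $P_{k+1} = P_k - \mathbf{d}$. The objective $f(x,y) = x^4 + y^4 - 4xy$, the starting point, the tolerance, and all function names are hypothetical choices for illustration.

```python
import math

# Hypothetical objective (an assumption): f(x, y) = x^4 + y^4 - 4xy,
# which has a local minimum at (1, 1)
def grad_f(x, y):
    return (4*x**3 - 4*y, 4*y**3 - 4*x)

def hess_f(x, y):
    return ((12*x**2, -4.0), (-4.0, 12*y**2))

def newton_search(grad, hess, x, y, tol=1e-10, max_iter=50):
    """Two-variable Newton search; returns (x, y, iterations used)."""
    for k in range(max_iter):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < tol:      # gradient is numerically zero
            return x, y, k
        (hxx, hxy), (hyx, hyy) = hess(x, y)
        det = hxx*hyy - hxy*hyx
        if det == 0.0:
            raise ZeroDivisionError("singular Hessian")
        # Solve H d = g by Cramer's rule, then step to P - d
        dx = (gx*hyy - hxy*gy) / det
        dy = (hxx*gy - hyx*gx) / det
        x, y = x - dx, y - dy
    return x, y, max_iter

x_min, y_min, iterations = newton_search(grad_f, hess_f, 1.5, 1.2)
print(x_min, y_min, iterations)
```

This sketch only finds a point where the gradient vanishes. It performs no line search and does not check that the Hessian is positive definite, so a starting point far from the minimum may lead to a saddle point or to divergence.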
