
Nelder-Mead Method

The Nelder-Mead method is a simplex method for finding a local minimum of a function of several variables. Its discovery is attributed to J. A. Nelder and R. Mead. For two variables, a simplex is a triangle, and the method is a pattern search that compares function values at the three vertices of a triangle. The worst vertex, where $f(x, y)$ is largest, is rejected and replaced with a new vertex. A new triangle is formed and the search is continued. The process generates a sequence of triangles (which might have different shapes), for which the function values at the vertices get smaller and smaller. The size of the triangles is reduced and the coordinates of the minimum point are found.
The algorithm is stated using the term simplex (a generalized triangle in n dimensions) and will find the minimum of a function of n variables. It is effective and computationally compact.

Initial Triangle
Let $f(x, y)$ be the function that is to be minimized. To start, we are given three vertices of a triangle: $V_k = (x_k, y_k)$, for $k = 1, 2, 3$. The function $f(x, y)$ is then evaluated at each of the three points: $z_k = f(x_k, y_k)$, for $k = 1, 2, 3$. The subscripts are then reordered so that $z_1 \le z_2 \le z_3$. We use the notation

(1) $B = (x_1, y_1)$, $G = (x_2, y_2)$, and $W = (x_3, y_3)$

to help remember that $B$ is the best vertex, $G$ is good (next to best), and $W$ is the worst vertex.
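The ordering step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the test function `f`, the starting triangle, and the helper name `order_vertices` are all assumptions chosen for the example, and B, G, W stand for the best, good, and worst vertices.

```python
def f(x, y):
    # Illustrative test function (an assumption, not taken from the text).
    return x**2 - 4*x + y**2 - y - x*y

def order_vertices(f, vertices):
    """Sort three (x, y) vertices by increasing function value: best, good, worst."""
    B, G, W = sorted(vertices, key=lambda v: f(*v))
    return B, G, W

# Hypothetical starting triangle.
B, G, W = order_vertices(f, [(0.0, 0.0), (1.2, 0.0), (0.0, 0.8)])
print("best:", B, "good:", G, "worst:", W)
```

Sorting by function value rather than by coordinates is the whole point of the relabeling: every later construction refers only to the roles best, good, and worst.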

Midpoint of the Good Side

The construction process uses the midpoint $M$ of the line segment joining $B$ and $G$. It is found by averaging the coordinates:

(2) $M = \frac{B + G}{2} = \left( \frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2} \right)$.

Reflection Using the Point $R$

The function decreases as we move along the side of the triangle from $W$ to $B$, and it decreases as we move along the side from $W$ to $G$. Hence it is feasible that $f(x, y)$ takes on smaller values at points that lie away from $W$ on the opposite side of the line between $B$ and $G$. We choose a test point $R$ that is obtained by “reflecting” the triangle through the side $\overline{BG}$. To determine $R$, we first find the midpoint $M$ of the side $\overline{BG}$. Then draw the line segment from $W$ to $M$ and call its length d. This last segment is extended a distance d through $M$ to locate the point $R$. The vector formula for $R$ is

(3) $R = M + (M - W) = 2M - W$.
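As a rough sketch of the reflection step, the code below computes the midpoint M of the best and good vertices and then the reflected point R = 2M - W, where W is the worst vertex. The helper names `midpoint` and `reflect` and the sample triangle are assumptions made for illustration.

```python
def midpoint(P, Q):
    """Average the coordinates of two points in the plane."""
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def reflect(B, G, W):
    """Reflect the worst vertex W through the midpoint M of the side joining B and G."""
    M = midpoint(B, G)
    R = (2 * M[0] - W[0], 2 * M[1] - W[1])
    return M, R

# Hypothetical triangle: best, good, and worst vertices.
M, R = reflect((1.2, 0.0), (0.0, 0.8), (0.0, 0.0))
print("M =", M, "R =", R)
```

Because R lies at the same distance from M as W but on the opposite side, the new triangle has the same area as the old one.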

Expansion Using the Point $E$

If the function value at $R$ is smaller than the function value at $W$, then we have moved in the correct direction toward the minimum. Perhaps the minimum is just a bit farther than the point $R$. So we extend the line segment through $M$ and $R$ to the point $E$. This forms an expanded triangle $BGE$. The point $E$ is found by moving an additional distance d along the line joining $M$ and $R$. If the function value at $E$ is less than the function value at $R$, then we have found a better vertex than $R$. The vector formula for $E$ is

(4) $E = R + (R - M) = 2R - M$.
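The expansion step can be sketched the same way: starting from the midpoint M and the reflected point R, the expanded point is E = 2R - M. The test function and the sample points below are illustrative assumptions, as is the helper name `expand`.

```python
def f(x, y):
    # Illustrative test function (an assumption, not taken from the text).
    return x**2 - 4*x + y**2 - y - x*y

def expand(M, R):
    """Move one more segment length beyond R along the line from M to R."""
    return (2 * R[0] - M[0], 2 * R[1] - M[1])

# Hypothetical midpoint and reflected point.
M, R = (0.6, 0.4), (1.2, 0.8)
E = expand(M, R)

# Keep E only if it improves on R; otherwise fall back to R.
new_vertex = E if f(*E) < f(*R) else R
print("E =", E, "chosen:", new_vertex)
```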

Contraction Using the Point $C$

If the function values at $R$ and $W$ are the same, another point must be tested. Perhaps the function is smaller at $M$, but we cannot replace $W$ with $M$ because we must have a triangle. Consider the two midpoints $C_1$ and $C_2$ of the line segments $\overline{WM}$ and $\overline{MR}$, respectively. The point with the smaller function value is called $C$, and the new triangle is $BGC$.
Note: The choice between $C_1$ and $C_2$ might seem inappropriate for the two-dimensional case, but it is important in higher dimensions.
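A minimal sketch of the contraction step follows, assuming W is the worst vertex, M the midpoint of the good side, and R the reflected point. It forms the two candidate midpoints (of W with M, and of M with R) and keeps whichever has the smaller function value. The function `f`, the helper name `contract`, and the sample points are assumptions for illustration.

```python
def f(x, y):
    # Illustrative test function (an assumption, not taken from the text).
    return x**2 - 4*x + y**2 - y - x*y

def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def contract(f, W, M, R):
    """Return the better of the two contraction candidates."""
    C1 = midpoint(W, M)  # candidate on the same side as W
    C2 = midpoint(M, R)  # candidate on the reflected side
    return min(C1, C2, key=lambda v: f(*v))

# Hypothetical worst vertex, midpoint, and reflected point.
W, M, R = (0.0, 0.0), (0.6, 0.4), (1.2, 0.8)
C = contract(f, W, M, R)
print("C =", C)
```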

Shrink Toward $B$

If the function value at $C$ is not less than the value at $W$, the points $G$ and $W$ must be shrunk toward $B$. The point $G$ is replaced with $M$, and $W$ is replaced with $S$, which is the midpoint of the line segment joining $B$ with $W$.
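The shrink step amounts to halving the two sides that meet at the best vertex. In the sketch below, B, G, and W are the best, good, and worst vertices; the helper name `shrink` and the sample triangle are assumptions for illustration.

```python
def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def shrink(B, G, W):
    """Pull G and W halfway toward B; B itself stays fixed."""
    M = midpoint(B, G)  # replaces G
    S = midpoint(B, W)  # replaces W
    return B, M, S

# Hypothetical triangle.
B, M, S = shrink((1.2, 0.0), (0.0, 0.8), (0.0, 0.0))
print("new triangle:", B, M, S)
```

The resulting triangle is similar to the original with half the side lengths, so its area drops to one quarter.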

Logical Decisions for Each Step

A computationally efficient algorithm should perform function evaluations only if needed. In each step, a new vertex is found, which replaces $W$. As soon as it is found, further investigation is not needed, and the iteration step is completed. The logical details for two-dimensional cases are given in the proof.
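The pieces can be combined into one iteration loop. The text does not spell out the decision logic, so the ordering of tests below is one common arrangement and should be read as an assumption, as should the test function, the starting triangle, the stopping rule (spread of function values below `tol`), and the name `nelder_mead_2d`. For clarity the sketch recomputes function values each pass; a tuned version would cache them.

```python
def f(x, y):
    # Illustrative test function (an assumption); its minimum is f(3, 2) = -7.
    return x**2 - 4*x + y**2 - y - x*y

def mid(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def nelder_mead_2d(f, vertices, tol=1e-10, max_iter=500):
    """Sketch of a two-variable Nelder-Mead search; returns the best vertex found."""
    key = lambda v: f(*v)
    B, G, W = sorted(vertices, key=key)
    for _ in range(max_iter):
        fB, fG, fW = f(*B), f(*G), f(*W)
        if fW - fB < tol:          # assumed stopping rule
            break
        M = mid(B, G)
        R = (2 * M[0] - W[0], 2 * M[1] - W[1])
        fR = f(*R)
        if fR < fG:
            # Reflect or expand.
            if fB < fR:
                W = R
            else:
                E = (2 * R[0] - M[0], 2 * R[1] - M[1])
                W = E if f(*E) < fB else R
        else:
            # Contract or shrink.
            if fR < fW:
                W, fW = R, fR
            C = min(mid(W, M), mid(M, R), key=key)
            if f(*C) < fW:
                W = C
            else:
                G, W = M, mid(B, W)
        B, G, W = sorted((B, G, W), key=key)
    return B

best = nelder_mead_2d(f, [(0.0, 0.0), (1.2, 0.0), (0.0, 0.8)])
print("approximate minimizer:", best, "f =", f(*best))
```

Each pass replaces the worst vertex as soon as an acceptable candidate is found, so the expansion, contraction, and shrink points are only evaluated when the cheaper tests fail.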
