Rate of convergence

In mathematical analysis, particularly numerical analysis, the rate of convergence and order of convergence of a sequence that converges to a limit are any of several characterizations of how quickly that sequence approaches its limit. These are broadly divided into asymptotic rates and orders of convergence, describing how quickly a sequence further approaches its limit once it is already close to it, and non-asymptotic rates and orders of convergence, which describe how quickly sequences approach their limits from starting points that are not necessarily close to their limits.

Asymptotic rates and orders of convergence have particular importance both in practical numerics and in formal proof, and they are the primary focus of this article. In practical numerics, asymptotic rates and orders of convergence follow two common conventions for two types of sequences: the first for sequences of iterations of an iterative numerical method and the second for sequences of successively more accurate numerical discretizations of a target. In formal mathematics, rates of convergence and orders of convergence are often described using asymptotic notation commonly called "big O notation," which can be used to encompass both of the prior conventions.

For iterative methods, a sequence $(x_{n})$ that converges to $L$ is said to have order of convergence $q\geq 1$ and rate of convergence $\mu$ if

$\lim _{n\rightarrow \infty }{\frac {\left|x_{n+1}-L\right|}{\left|x_{n}-L\right|^{q}}}=\mu .$^[1]^[2]

Where greater methodological precision is required, these rates and orders of convergence are known specifically as the rates and orders of Q-convergence, short for quotient-convergence, since the limit in question is a quotient of error terms.^[2] The rate of convergence $\mu$ may also be called the asymptotic error constant. Furthermore, some authors use "rate" where this article uses "order".^[3]

Similar concepts are used for sequences of discretizations. For instance, ideally the solution of a differential equation discretized via a regular grid will converge to the solution of the continuous differential equation as the grid spacing goes to zero and grid points go to infinity, and if so the rate and order of that convergence are an important characterization of the efficiency of the grid discretization method. A sequence of approximate grid solutions $(y_{n})$ of some problem that converges to a true solution $S$ with a corresponding sequence of regular grid spacings $(h_{n})$ that converge to 0 is said to have order of convergence $q$ and rate of convergence $\mu$ if

$\lim _{n\rightarrow \infty }{\frac {\left|y_{n}-S\right|}{h_{n}^{q}}}=\mu ,$

where the absolute value symbols stand for a function metric for the space of solutions such as the uniform norm. Similar considerations also apply for non-grid discretization schemes such as the polygon meshes of a finite element method or the basis sets in computational chemistry: in general, the appropriate definition of $\mu$ will involve the asymptotic limit of the ratio of some approximation error term above to a $q$ -power of a discretization scale parameter below.

In practice, the rate and order of convergence of a sequence of iterates or approximations provide useful insights when using iterative methods and discretization methods for calculating numerical approximations. Strictly speaking, however, the asymptotic behavior of a sequence does not give conclusive information about any finite part of the sequence.

Series acceleration is a collection of techniques for improving the rate of convergence of the sequence of partial sums of a series, and possibly also its order of convergence. These accelerations are commonly accomplished with sequence transformations.

Rates of convergence for iterative methods

Convergence rate definitions

Suppose that the sequence $(x_{k})$ converges to the number $L$ . The sequence is said to converge with order $q$ to $L$ , and with a rate of convergence $\mu$ , if

$\lim _{k\to \infty }{\frac {|x_{k+1}-L|}{|x_{k}-L|^{q}}}=\mu$

for some positive constant $\mu \in (0,1)$ if $q=1$ and $\mu \in (0,\infty )$ if $q>1$ .^[2]^[4]^[5] It is not necessary that $q$ be an integer. For example, the secant method, when converging to a regular, simple root, has an order of φ ≈ 1.618.^{[citation needed]} This is technically called Q-convergence, short for quotient-convergence, and the rates and orders are called rates and orders of Q-convergence in certain technical settings where alternative rate definitions are more appropriate; see § R-convergence below.

Convergence with order

$q=1$ is called linear convergence and the sequence is said to converge linearly to $L$ .
$q=2$ is called quadratic convergence.
$q=3$ is called cubic convergence.

In addition, if a sequence converges with order $q>1$ , or more generally if it satisfies

$\lim _{k\to \infty }{\frac {|x_{k+1}-L|}{|x_{k}-L|}}=0,$

that sequence is said to converge superlinearly to $L$ (i.e., faster than linearly).^[2]^[6] A sequence is said to converge sublinearly to $L$ (i.e., slower than linearly) if it converges and $\lim _{k\to \infty }{\frac {|x_{k+1}-L|}{|x_{k}-L|}}=1.$

A sequence $(x_{k})$ converges logarithmically to $L$ if the sequence converges sublinearly and ^[7]

$\lim _{k\to \infty }{\frac {|x_{k+1}-x_{k}|}{|x_{k}-x_{k-1}|}}=1.$

R-convergence

The definitions of Q-convergence rates have a shortcoming in that they do not naturally capture the convergence behavior of sequences that do not converge with an asymptotically constant rate with every step, such as the staggered geometric progression below that gets closer to its limit only every other step.

In such cases, a closely related but more technical definition of rate of convergence called R-convergence is more appropriate. The "R-" prefix stands for "root."^[2]^[8]^: 620 A sequence $(x_{k})$ that converges to $L$ is said to converge at least R-linearly if there exists an error-bounding sequence $(\varepsilon _{k})$ such that ${\textstyle |x_{k}-L|\leq \varepsilon _{k}\quad {\text{for all }}k}$ and $(\varepsilon _{k})$ converges Q-linearly to zero; analogous definitions hold for R-superlinear convergence, R-sublinear convergence, R-quadratic convergence, and so on.^[2]^[9]

In order to define the rates and orders of R-convergence, one uses the rate and order of Q-convergence of an error-bounding sequence $(\varepsilon _{k})$ chosen such that no other error-bounding sequence $(\varepsilon '_{k})$ could have been chosen that would converge with a faster rate and order; any $(\varepsilon '_{k})$ provides a lower bound on the rate and order of R-convergence and the greatest lower bound gives the exact rate and order of R-convergence.

Order estimation

A practical method to calculate the order of convergence for a sequence generated by a fixed-point iteration is to calculate the following sequence, which converges to the order $q$ :^[10]

$q\approx {\frac {\log \left|\displaystyle {\frac {x_{k+1}-x_{k}}{x_{k}-x_{k-1}}}\right|}{\log \left|\displaystyle {\frac {x_{k}-x_{k-1}}{x_{k-1}-x_{k-2}}}\right|}}.$

For the numerical approximation of an exact value by a numerical method of order $q$ , see Senning.^[11]
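The difference-quotient estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `estimate_order` and the choice of Newton's method for $x^{2}=2$ as the test iteration are assumptions made here for the example.

```python
import math

def estimate_order(xs):
    """Return order estimates q_k computed from four consecutive iterates."""
    estimates = []
    for k in range(2, len(xs) - 1):
        d_next = abs(xs[k + 1] - xs[k])
        d_curr = abs(xs[k] - xs[k - 1])
        d_prev = abs(xs[k - 1] - xs[k - 2])
        if min(d_next, d_curr, d_prev) == 0.0 or d_curr == d_prev:
            break  # differences hit rounding level; estimates no longer meaningful
        estimates.append(math.log(d_next / d_curr) / math.log(d_curr / d_prev))
    return estimates

# Illustrative iteration (an assumption, not from the text): Newton's method for x^2 = 2.
xs = [1.0]
for _ in range(4):
    x = xs[-1]
    xs.append(x - (x * x - 2.0) / (2.0 * x))

order_estimates = estimate_order(xs)
print(order_estimates)  # the later estimates should approach 2 for this iteration
```

Only a few iterates are used on purpose: once the differences between successive iterates reach rounding level, the logarithms of their ratios stop carrying information about the order.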

Examples

The geometric progression ${\textstyle (a_{k})=1,{\frac {1}{2}},{\frac {1}{4}},{\frac {1}{8}},{\frac {1}{16}},{\frac {1}{32}},\ldots ,1/{2^{k}},\dots }$ converges to $L=0$ . Plugging the sequence into the definition of Q-linear convergence (i.e., order of convergence 1) shows that

$\lim _{k\to \infty }{\frac {\left|1/2^{k+1}-0\right|}{\left|1/2^{k}-0\right|}}=\lim _{k\to \infty }{\frac {2^{k}}{2^{k+1}}}={\frac {1}{2}}.$

Thus $(a_{k})$ converges Q-linearly with a convergence rate of $\mu =1/2$ ; see the first plot of the figure below.

More generally, for any $a\in \mathbb {R} ,r\in (-1,1)$ , a geometric progression $(ar^{k})$ converges linearly with rate $|r|$ and the sequence of partial sums of a geometric series ${\textstyle (\sum _{n=0}^{k}ar^{n})}$ also converges linearly with rate $|r|$ . The same holds also for geometric progressions and geometric series parameterized by any complex numbers $a\in \mathbb {C} ,r\in \mathbb {C} ,|r|<1.$

The staggered geometric progression ${\textstyle (b_{k})=1,1,{\frac {1}{4}},{\frac {1}{4}},{\frac {1}{16}},{\frac {1}{16}},\ldots ,1/4^{\left\lfloor {\frac {k}{2}}\right\rfloor },\ldots ,}$ using the floor function ${\textstyle \lfloor x\rfloor }$ that gives the largest integer that is less than or equal to $x,$ converges R-linearly to 0 with rate 1/2, but it does not converge Q-linearly; see the second plot of the figure below. The defining Q-linear convergence limits do not exist for this sequence because one subsequence of error quotients (the sequence of quotients taken from odd steps) has a different limit than another subsequence (the sequence of quotients taken from even steps). Generally, for any staggered geometric progression $(ar^{\lfloor k/m\rfloor })$ , the sequence will not converge Q-linearly but will converge R-linearly with rate ${\textstyle {\sqrt[{m}]{|r|}};}$ this example highlights why the "R" in R-linear convergence is short for "root."
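A short numerical check can make the distinction concrete. The sketch below builds the first terms of $(b_{k})$, shows that the Q-linear quotients alternate, and verifies one possible error-bounding sequence; the particular choice $\varepsilon _{k}=2\cdot (1/2)^{k}$ is an assumption made here for illustration.

```python
# Staggered geometric progression b_k = 1 / 4**floor(k/2) from the text.
b = [0.25 ** (k // 2) for k in range(12)]

# Q-linear quotients |b_{k+1}| / |b_k| alternate, so their limit does not exist.
quotients = [b[k + 1] / b[k] for k in range(len(b) - 1)]

# Candidate error-bounding sequence (an assumption chosen for illustration):
# eps_k = 2 * (1/2)**k, which converges Q-linearly to 0 with rate 1/2.
eps = [2.0 * 0.5 ** k for k in range(len(b))]
is_bounded = all(bk <= ek for bk, ek in zip(b, eps))

print(quotients[:4])   # alternates between 1.0 and 0.25
print(is_bounded)      # True for the terms checked here
```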

The sequence $(c_{k})={\frac {1}{2}},{\frac {1}{4}},{\frac {1}{16}},{\frac {1}{256}},{\frac {1}{65,\!536}},\ldots ,{\frac {1}{2^{2^{k}}}},\ldots$ converges to zero Q-superlinearly. In fact, it is quadratically convergent with a quadratic convergence rate of 1. It is shown in the third plot of the figure below.

Finally, the sequence $(d_{k})=1,{\frac {1}{2}},{\frac {1}{3}},{\frac {1}{4}},{\frac {1}{5}},{\frac {1}{6}},\ldots ,{\frac {1}{k+1}},\ldots$ converges to zero Q-sublinearly and logarithmically and its convergence is shown as the fourth plot of the figure below.

Figure: Log-linear plots of the example sequences $a_{k}$ , $b_{k}$ , $c_{k}$ , and $d_{k}$ , which exemplify linear, linear, superlinear (quadratic), and sublinear rates of convergence, respectively.
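The defining quotients for the sequences $(a_{k})$ , $(c_{k})$ , and $(d_{k})$ can also be evaluated directly. The following is a minimal sketch; the helper name `q_quotients` and the numbers of terms used are arbitrary choices for this illustration.

```python
def q_quotients(seq, limit, q):
    """Quotients |x_{k+1} - L| / |x_k - L|**q for consecutive terms of seq."""
    errs = [abs(x - limit) for x in seq]
    return [errs[k + 1] / errs[k] ** q for k in range(len(errs) - 1)]

a = [0.5 ** k for k in range(10)]            # a_k = 1/2^k
c = [0.5 ** (2 ** k) for k in range(7)]      # c_k = 1/2^(2^k)
d = [1.0 / (k + 1) for k in range(2000)]     # d_k = 1/(k+1)

a_rate = q_quotients(a, 0.0, 1)[-1]   # linear quotient, equals 1/2
c_rate = q_quotients(c, 0.0, 2)[-1]   # quadratic quotient, equals 1
d_rate = q_quotients(d, 0.0, 1)[-1]   # tends to 1, i.e. sublinear
print(a_rate, c_rate, d_rate)
```

The sequence $(c_{k})$ is truncated early because its terms fall below the range of double-precision floating point after only a few more steps.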

Rates of convergence for discretization methods

A similar situation exists for discretization methods designed to approximate a function $y=f(x)$ , which might be an integral being approximated by numerical quadrature, or the solution of an ordinary differential equation (see example below). The discretization method generates a sequence ${y_{0},y_{1},y_{2},y_{3},...}$ , where each successive $y_{j}$ is a function of $y_{j-1},y_{j-2},...$ along with the grid spacing $h$ between successive values of the independent variable $x$ . The important parameter here for the rate of convergence to $y=f(x)$ is the grid spacing $h$ , inversely proportional to the number of grid points, i.e. the number of points in the sequence required to reach a given value of $x$ .

In this case, the sequence $(y_{n})$ is said to converge to the sequence $f(x_{n})$ with order q if there exists a constant C such that

$|y_{n}-f(x_{n})|<Ch^{q}{\text{ for all }}n.$

This is written as $|y_{n}-f(x_{n})|={\mathcal {O}}(h^{q})$ using big O notation.

This is the relevant definition when discussing methods for numerical quadrature or the solution of ordinary differential equations (ODEs).^{[example needed]}

A practical method to estimate the order of convergence for a discretization method is to pick step sizes $h_{\text{new}}$ and $h_{\text{old}}$ and calculate the resulting errors $e_{\text{new}}$ and $e_{\text{old}}$ . The order of convergence is then approximated by the following formula:

$q\approx {\frac {\log(e_{\text{new}}/e_{\text{old}})}{\log(h_{\text{new}}/h_{\text{old}})}},$

^{[citation needed]}

which comes from writing the truncation error, at the old and new grid spacings, as

$e=|y_{n}-f(x_{n})|={\mathcal {O}}(h^{q}).$

The error $e$ is, more specifically, a global truncation error (GTE), in that it represents a sum of errors accumulated over all $n$ iterations, as opposed to a local truncation error (LTE) over just one iteration.
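The two-step-size estimate above can be tried on a simple quadrature problem. The sketch below is illustrative only: the composite trapezoidal rule, the integrand $\exp(x)$ on $[0,1]$ , and the step sizes $1/10$ and $1/20$ are assumptions chosen for the example, not taken from the text.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

exact = math.e - 1.0                     # integral of exp(x) over [0, 1]
h_old, h_new = 1.0 / 10, 1.0 / 20
e_old = abs(trapezoid(math.exp, 0.0, 1.0, 10) - exact)
e_new = abs(trapezoid(math.exp, 0.0, 1.0, 20) - exact)

# q ~ log(e_new / e_old) / log(h_new / h_old)
q_est = math.log(e_new / e_old) / math.log(h_new / h_old)
print(q_est)  # close to 2 for the trapezoidal rule on a smooth integrand
```

In practice the estimate is only reliable when both step sizes are small enough for the leading error term to dominate, yet large enough that rounding error does not.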

Example of discretization methods

Consider the ordinary differential equation

${\frac {dy}{dx}}=-\kappa y$

with initial condition $y(0)=y_{0}$ . We can approximate a solution to this equation using the forward Euler method for numerical discretization:

${\frac {y_{n+1}-y_{n}}{h}}=-\kappa y_{n},$

which implies the first-order linear recurrence with constant coefficients

$y_{n+1}=y_{n}(1-h\kappa ).$

Given $y(0)=y_{0}$ , the sequence satisfying that recurrence is the geometric progression

$y_{n}=y_{0}(1-h\kappa )^{n}=y_{0}\left(1-nh\kappa +{\frac {n(n-1)}{2}}h^{2}\kappa ^{2}+....\right).$

The exact analytical solution to the differential equation is $y=f(x)=y_{0}\exp(-\kappa x)$ , corresponding to the following Taylor expansion in $h\kappa$ for $h\kappa \ll 1$ : $f(x_{n})=f(nh)=y_{0}\exp(-\kappa nh)=y_{0}\left[\exp(-\kappa h)\right]^{n}=y_{0}\left(1-h\kappa +{\frac {h^{2}\kappa ^{2}}{2}}+....\right)^{n}=y_{0}\left(1-nh\kappa +{\frac {n^{2}}{2}}h^{2}\kappa ^{2}+...\right).$

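In this case, the truncation error is

$e=|y_{n}-f(x_{n})|=|y_{0}|{\frac {nh^{2}\kappa ^{2}}{2}}+\ldots$

At a fixed point $x_{n}=nh$ the number of steps is $n=x_{n}/h$ , so the leading term equals $|y_{0}|{\frac {x_{n}h\kappa ^{2}}{2}}={\mathcal {O}}(h)$ : each step contributes a local error of ${\mathcal {O}}(h^{2})$ , and the errors accumulated over $n$ steps give a global truncation error of ${\mathcal {O}}(h)$ . Thus $(y_{n})$ converges to $f(x_{n})$ with order $q=1$ .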
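The accumulated error of the forward Euler recurrence at a fixed point $x=nh$ can be measured numerically. The following is a minimal sketch; the parameter values $\kappa =1$ , $y_{0}=1$ , $x=1$ and the helper name `euler_error` are assumptions made for illustration.

```python
import math

def euler_error(kappa, y0, x_end, n_steps):
    """Global error of forward Euler for dy/dx = -kappa*y at x = x_end."""
    h = x_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = y * (1.0 - h * kappa)      # y_{n+1} = y_n (1 - h*kappa)
    return abs(y - y0 * math.exp(-kappa * x_end))

# Illustrative parameter values (assumptions): kappa = 1, y0 = 1, x_end = 1.
e_old = euler_error(1.0, 1.0, 1.0, 100)   # h = 0.01
e_new = euler_error(1.0, 1.0, 1.0, 200)   # h = 0.005
observed_order = math.log(e_new / e_old) / math.log(0.005 / 0.01)
print(e_old, e_new, observed_order)
```

With these values, halving $h$ roughly halves the error at $x=1$ , so the observed order of the accumulated error comes out close to 1; the number of steps needed to reach a fixed $x$ doubles when $h$ is halved.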

Examples (continued)

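The sequence $(d_{k})$ with $d_{k}=1/(k+1)$ was introduced above. This sequence converges with order 1 according to the convention for discretization methods: if $k+1$ is identified with the number of grid points and $h=1/(k+1)$ with the grid spacing, then the error is $|d_{k}-0|=h^{1}$ .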

The sequence $(a_{k})$ with $a_{k}=2^{-k}$ , which was also introduced above, converges with order q for every number q, because with $h=1/k$ the error $2^{-k}$ is eventually smaller than $Ch^{q}=Ck^{-q}$ for any fixed q. It is said to converge exponentially using the convention for discretization methods. However, it only converges linearly (that is, with order 1) using the convention for iterative methods, since the quotient of successive errors is the constant $1/2$ .

Recurrent sequences and fixed points

Recurrent sequences $x_{n+1}:=f(x_{n})$ define discrete dynamical systems and have important general applications in mathematics through various fixed-point theorems about their convergence behavior. When f is continuously differentiable, given a fixed point p, $f(p)=p,$ such that $|f'(p)|<1$ , the fixed point is an attractive fixed point and the recurrent sequence will converge at least linearly to p for any starting value $x_{0}$ sufficiently close to p. If $|f'(p)|=0$ and $|f''(p)|<1$ , then the recurrent sequence will converge at least quadratically, and so on. If $|f'(p)|>1$ , then the fixed point is a repulsive fixed point and sequences cannot converge to p from its immediate neighborhoods, though they may still jump to p directly from outside of its local neighborhoods.
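The link between $|f'(p)|$ and the linear rate can be observed on a concrete map. The sketch below uses $f(x)=\cos x$ as an illustrative choice (an assumption, not from the text); its fixed point satisfies $|f'(p)|=\sin p<1$ , and the error quotients should approach that value.

```python
import math

def iterate(f, x0, n):
    """Return [x_0, x_1, ..., x_n] with x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

# Illustrative map (an assumption): f(x) = cos(x), whose fixed point p has
# |f'(p)| = sin(p) < 1, so the iteration should converge linearly.
p = iterate(math.cos, 1.0, 200)[-1]        # long run used as reference value for p
xs = iterate(math.cos, 1.0, 25)
ratios = [abs(xs[k + 1] - p) / abs(xs[k] - p) for k in range(20, 24)]
print(ratios, math.sin(p))
```

The reference value of $p$ is itself obtained by iteration, which is adequate here because the iterates are compared long before the error reaches rounding level.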

Acceleration of convergence rates

Many methods exist to increase the rate of convergence of a given sequence, i.e., to transform one given sequence into a second one that converges more quickly to the same limit. Such techniques are in general known as "series acceleration" methods. These may reduce the computational costs of approximating the limits of the transformed sequences. One example of series acceleration is Aitken's delta-squared process. These methods in general (and in particular Aitken's method) do not increase the order of convergence, and are useful only if initially the convergence is not faster than linear: if $(x_{n})$ converges linearly, one gets a sequence $(a_{n})$ that still converges linearly (except for pathologically designed special cases), but faster in the sense that $\lim(a_{n}-L)/(x_{n}-L)=0$ . On the other hand, if the convergence is already of order ≥ 2, Aitken's method will bring no improvement.
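Aitken's delta-squared process replaces $x_{n}$ by $a_{n}=x_{n}-(x_{n+1}-x_{n})^{2}/(x_{n+2}-2x_{n+1}+x_{n})$ . The following is a minimal sketch of that transform applied to a linearly convergent sequence; the test sequence $x_{n+1}=\cos x_{n}$ and the helper name `aitken` are assumptions made for illustration.

```python
import math

def aitken(xs):
    """Aitken's delta-squared transform of a sequence given as a list."""
    out = []
    for n in range(len(xs) - 2):
        d1 = xs[n + 1] - xs[n]
        d2 = xs[n + 2] - 2.0 * xs[n + 1] + xs[n]
        out.append(xs[n] - d1 * d1 / d2)   # assumes d2 != 0 for these terms
    return out

# Illustrative linearly convergent sequence (an assumption): x_{n+1} = cos(x_n).
xs = [1.0]
for _ in range(200):
    xs.append(math.cos(xs[-1]))
limit = xs[-1]                 # long run used as reference value for the limit

accelerated = aitken(xs[:12])  # a_0 .. a_9 from x_0 .. x_11
err_plain = abs(xs[9] - limit)
err_aitken = abs(accelerated[9] - limit)
print(err_plain, err_aitken)
```

Only the first few terms are transformed: once the sequence has converged to rounding level, the second difference in the denominator becomes dominated by rounding error and the transform is no longer reliable.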

References

[1] Ruye, Wang (2015-02-12). "Order and rate of convergence". hmc.edu. Retrieved 2020-07-31.
[2] Nocedal, Jorge; Wright, Stephen J. (1999). Numerical Optimization (1st ed.). New York, NY: Springer. pp. 28–29. ISBN 978-0-387-98793-4.
[3] Senning, Jonathan R. "Computing and Estimating the Rate of Convergence" (PDF). gordon.edu. Retrieved 2020-08-07.
[4] Hundley, Douglas. "Rate of Convergence" (PDF). Whitman College. Retrieved 2020-12-13.
[5] Porta, F. A. (1989). "On Q-Order and R-Order of Convergence" (PDF). Journal of Optimization Theory and Applications. 63 (3): 415–431. doi:10.1007/BF00939805. S2CID 116192710. Retrieved 2020-07-31.
[6] Arnold, Mark. "Order of Convergence" (PDF). University of Arkansas. Retrieved 2022-12-13.
[7] Van Tuyl, Andrew H. (1994). "Acceleration of convergence of a family of logarithmically convergent sequences" (PDF). Mathematics of Computation. 63 (207): 229–246. doi:10.2307/2153571. JSTOR 2153571. Retrieved 2020-08-02.
[8] Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-30303-1.
[9] Bockelman, Brian (2005). "Rates of Convergence". math.unl.edu. Retrieved 2020-07-31.
[10] Senning, Jonathan R. "Computing and Estimating the Rate of Convergence" (PDF). gordon.edu. Retrieved 2020-08-07.
[11] Senning, Jonathan R. "Verifying Numerical Convergence Rates" (PDF). Retrieved 2024-02-09.

Literature

The simple definition is used in

Michelle Schatzman (2002), Numerical analysis: a mathematical introduction, Clarendon Press, Oxford. ISBN 0-19-850279-6.

The extended definition is used in

Walter Gautschi (1997), Numerical analysis: an introduction, Birkhäuser, Boston. ISBN 0-8176-3895-4.
Endre Süli and David Mayers (2003), An introduction to numerical analysis, Cambridge University Press. ISBN 0-521-00794-1.

The Big O definition is used in

Richard L. Burden and J. Douglas Faires (2001), Numerical Analysis (7th ed.), Brooks/Cole. ISBN 0-534-38216-9

The terms Q-linear and R-linear are used in

Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. pp. 619–620. ISBN 978-0-387-30303-1.
