Lecture 15: Introduction to nonlinear equations#

The problem#

Given a continuous function \(f(x)\), the problem is to find a point \(x^*\) such that \(f(x^*) = 0\). That is, \(x^*\) is a solution of the equation \(f(x) = 0\) and is called a root of \(f(x)\).

Examples

  1. The linear equation \(f(x) = a x + b = 0\) has a single solution at \(x^* = -\frac{b}{a}\).

  2. The quadratic equation \(f(x) = a x^2 + b x + c = 0\) is nonlinear, but simple enough to have a known formula for the solutions

    \[ x^* = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
  3. A general nonlinear equation \(f(x) = 0\) rarely has a formula like those above which can be used to calculate its roots.

It is also sometimes better to use a numerical method to solve an equation even if an exact formula exists:

  • the exact formula may have difficulties due to rounding errors;

  • the exact formula may be more expensive to compute.
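
As an illustration of the first point, here is a small sketch (not from the lecture; the coefficient values are chosen for illustration only) showing how the exact quadratic formula can lose accuracy to cancellation, and how a simple rearrangement avoids it:

```python
import math

# Roots of x^2 - 1e8 x + 1 = 0: one root is huge, the other tiny.
a, b, c = 1.0, -1e8, 1.0
sq = math.sqrt(b * b - 4 * a * c)

# Naive formula: -b and sq are nearly equal, so the small root is
# computed by subtracting two almost-identical numbers.
x_small_naive = (-b - sq) / (2 * a)

# Rearranged: compute the large root first (no cancellation), then
# use the product of the roots, x_large * x_small = c / a.
x_large = (-b + sq) / (2 * a)
x_small = (c / a) / x_large

print(x_small_naive)   # suffers from cancellation
print(x_small)         # close to the true value 1e-8
```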

Example problems#

The following three example problems will be used throughout this section to illustrate the properties of common methods for solving nonlinear equations.

  1. Calculate the value of \(x = \sqrt{R}\), where \(R\) is some positive real number, without the direct use of any sqrt function.

    • Let \(f(x) = x^2 - R\).

    • The equation \(f(x) = 0\) implies that \(x^2 = R\), i.e. \(x = \pm \sqrt{R}\).

    • There are two solutions and the method should be able to distinguish between them.

  2. The following formula allows the monthly repayments (\(M\)) on a compound interest mortgage (for a borrowing of \(P\)) to be calculated based upon an annual interest rate of \(r\)% and \(n\) monthly payments.

    \[ M = P \frac{\frac{r}{1200} \left(1 + \frac{r}{1200}\right)^n}{\left(1 + \frac{r}{1200}\right)^n - 1}. \]
    • Suppose that we wish to work out how many monthly repayments of £1,000 would be required to repay a mortgage of £150,000 at an annual rate of 5%.

    • This would require us to solve \(f(n) = 0\) where \(f(n) = 1000 - 150000 \frac{\frac{5}{1200}(1+\frac{5}{1200})^n}{(1+\frac{5}{1200})^n - 1}\).

  3. Consider the NACA0012 prototype wing section, which is often used for testing computational methods for simulating flows in aerodynamics:

Video source: https://youtu.be/wcahAqSFZ8k

The profile is given by

\[ y^{\pm}(x) = \pm(0.2969 \sqrt{x} - 0.126 x - 0.3516 x^2 + 0.2843 x^3 - 0.1015 x^4), \]

in which \(+\) gives the upper surface and \(-\) gives the lower surface.

Find the point \(x\) at which the thickness \(t\) of the aerofoil is \(0.1\), i.e. solve \(f(x) = 0\) where \(f(x) = y^+(x) - y^-(x) - 0.1\).

  • There will be two solutions for \(x\) for this value of \(t\).

[Figure: the NACA0012 aerofoil profile \(y^{\pm}(x)\).]
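
For later use, the three example functions can be written down directly from the formulas above; this is a minimal sketch (the function names and default parameter values are our own choices, matching the examples):

```python
import math

def f_sqrt(x, R=2.0):
    """Example 1: the roots of f(x) = x^2 - R are x = +/- sqrt(R)."""
    return x * x - R

def f_mortgage(n, M=1000.0, P=150000.0, r=5.0):
    """Example 2: f(n) = M - P * (r/1200)(1 + r/1200)^n / ((1 + r/1200)^n - 1)."""
    q = (1.0 + r / 1200.0) ** n
    return M - P * (r / 1200.0) * q / (q - 1.0)

def f_naca(x, t=0.1):
    """Example 3: thickness minus t, f(x) = y+(x) - y-(x) - t = 2 y(x) - t."""
    y = 0.2969 * math.sqrt(x) - 0.126 * x - 0.3516 * x**2 \
        + 0.2843 * x**3 - 0.1015 * x**4
    return 2.0 * y - t
```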

Iterative methods#

We again use the concept of iteration (or iterative improvement).

  • Given an initial estimate of the root \(x^{(0)}\), try to generate a better approximation \(x^{(1)}\) to the root.

  • Once \(x^{(1)}\) has been computed, it can be used to compute another estimate \(x^{(2)}\) in the same way (and so on…).

  • Generally, \(x^{(i)}\) is used to compute \(x^{(i+1)}\) (though other previous estimates may be used as well).

  • There are many methods for doing this.

Issues#

  • Does the iteration converge?

  • That is, will \(x^{(i+1)}\) be a better estimate than \(x^{(i)}\)?

  • How will we know if this is true or not?

  • If the iteration does converge then how quickly does this happen?

  • How should we decide when to stop the iterative procedure?

Bisection method#

The simplest method for solving \(f(x) = 0\), finding \(x^*\) such that \(f(x^*) = 0\), is known as the bisection method.

  • Assume for now that two points \(x_L\) and \(x_R\) are known for which

    \[ f(x_L) f(x_R) < 0 \]

    i.e., the function evaluations at these points are of opposite sign.

  • If \(f\) is continuous then this implies that there must be at least one root \(x^*\) in the interval \((x_L, x_R)\). There may be more than one root.

[Figure: a continuous function with \(f(x_L)\) and \(f(x_R)\) of opposite sign, bracketing a root in \((x_L, x_R)\).]

The algorithm#

  • Consider the point \(x_C = (x_L + x_R) / 2\) and find \(f(x_C)\).

    • If \(f(x_C) = 0\), then \(x^* = x_C\) and we can stop.

    • If \(f(x_L) f(x_C) < 0\), then \(x^* \in (x_L, x_C)\).

    • If \(f(x_C) f(x_R) < 0\), then \(x^* \in (x_C, x_R)\).

  • Replace the interval (also termed bracket) \((x_L, x_R)\) with the new interval containing \(x^*\).

    • Note that in one step the length of the interval containing \(x^*\) has been halved.

  • Repeat this process until \(x_R - x_L < TOL\), where \(TOL\) is a user-supplied value, i.e. repeat until the bracket is sufficiently small.
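
The steps above translate directly into code; the following is a minimal sketch of the method (the function name and interface are our own):

```python
def bisection(f, x_L, x_R, tol=1e-4):
    """Bisection method: find a root of f inside the bracket (x_L, x_R).

    Assumes f(x_L) * f(x_R) < 0, i.e. that a valid initial bracket is given.
    """
    if f(x_L) * f(x_R) >= 0.0:
        raise ValueError("f(x_L) and f(x_R) must have opposite signs")

    while x_R - x_L >= tol:
        x_C = (x_L + x_R) / 2.0       # midpoint of the current bracket
        if f(x_C) == 0.0:             # exact root found (rare in floating point)
            return x_C
        if f(x_L) * f(x_C) < 0.0:     # root lies in the left half-interval
            x_R = x_C
        else:                         # root lies in the right half-interval
            x_L = x_C

    return (x_L + x_R) / 2.0          # midpoint of the final (small) bracket
```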

Convergence analysis#

  • Assume that \(k\) steps are taken from an initial bracket \((a, b)\).

  • At each step the bracketing interval is halved.

  • So, after \(k\) steps the length of the interval will be \(\dfrac{b-a}{2^k}\).

  • The iteration stops once \(\dfrac{b-a}{2^k} < TOL\), i.e. once \(k > \log_2 \dfrac{b-a}{TOL}\), so for large \(b-a\) (a large initial bracket) or small \(TOL\) (high accuracy required) \(k\) will be large.

  • Note that once \(x_R - x_L < TOL\), we can estimate \(x^* = (x_L + x_R)/2\), in which case the absolute error will be at most \(\frac{1}{2} TOL\).
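
For instance, with the initial bracket \((a, b) = (0, 2)\) and \(TOL = 10^{-4}\) used in Example 1 below, this bound gives

\[ k > \log_2 \frac{b-a}{TOL} = \log_2 \frac{2}{10^{-4}} \approx 14.3, \]

so 15 halvings of the bracket are needed, which matches the table of results in Example 1 (iterations 0 to 14).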

Example 1#

Use the bisection method to calculate \(\sqrt{2}\) with an error of less than \(10^{-4}\).

  • Use \(R = 2\), so \(f(x) = x^2 - 2\), which gives \(x^* = \sqrt{2}\).

  • Set the initial bracket to be \([a, b] = [0, 2]\) and the error tolerance to be \(TOL = 10^{-4}\).

Results of bisection method on example 1

| it | a | b | f(a) | f(b) | update |
|---:|---------:|---------:|----------:|---------:|:------------|
| 0  | 0.000000 | 2.000000 | -2.000000 | 2.000000 | a := 1.0000 |
| 1  | 1.000000 | 2.000000 | -1.000000 | 2.000000 | b := 1.5000 |
| 2  | 1.000000 | 1.500000 | -1.000000 | 0.250000 | a := 1.2500 |
| 3  | 1.250000 | 1.500000 | -0.437500 | 0.250000 | a := 1.3750 |
| 4  | 1.375000 | 1.500000 | -0.109375 | 0.250000 | b := 1.4375 |
| 5  | 1.375000 | 1.437500 | -0.109375 | 0.066406 | a := 1.4062 |
| 6  | 1.406250 | 1.437500 | -0.022461 | 0.066406 | b := 1.4219 |
| 7  | 1.406250 | 1.421875 | -0.022461 | 0.021729 | a := 1.4141 |
| 8  | 1.414062 | 1.421875 | -0.000427 | 0.021729 | b := 1.4180 |
| 9  | 1.414062 | 1.417969 | -0.000427 | 0.010635 | b := 1.4160 |
| 10 | 1.414062 | 1.416016 | -0.000427 | 0.005100 | b := 1.4150 |
| 11 | 1.414062 | 1.415039 | -0.000427 | 0.002336 | b := 1.4146 |
| 12 | 1.414062 | 1.414551 | -0.000427 | 0.000954 | b := 1.4143 |
| 13 | 1.414062 | 1.414307 | -0.000427 | 0.000263 | a := 1.4142 |
| 14 | 1.414185 | 1.414307 | -0.000082 | 0.000263 | b := 1.4142 |
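
This result can be reproduced with the earlier sketches (a usage example, assuming the `bisection` and `f_sqrt` functions defined above):

```python
x_star = bisection(f_sqrt, 0.0, 2.0, tol=1e-4)
print(x_star)    # approximately 1.4142, i.e. close to sqrt(2)
```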

Note that

  • choosing \([a, b] = [-2, 0]\) results in \(x^* = -1.4142\);

  • if \([a, b]\) brackets both roots, \(\pm \sqrt{R}\), then \(f(x_L) f(x_R) > 0\) and the bisection method cannot start.

Which roots, if any, will the following initial intervals converge to?

\[ [0,1] \qquad [2, 10] \qquad [-10, 10] \qquad [-10, -1]. \]

Example 2#

Use the bisection method to calculate the number of monthly repayments of £1,000 that are required to repay a mortgage of £150,000 at an annual rate of 5%.

It is clear that 1 monthly repayment (\(n=1\)) will not be sufficient, whilst a very large value of \(n\) should be tried at the other end in order to bracket the correct solution.

Results of bisection method on example 2

| it | a | b | f(a) | f(b) | update |
|---:|-----------:|------------:|---------------:|-----------:|:--------------|
| 0  | 1.000000   | 1000.000000 | -149625.000000 | 365.070564 | b := 500.5000 |
| 1  | 1.000000   | 500.500000  | -149625.000000 | 285.881715 | b := 250.7500 |
| 2  | 1.000000   | 250.750000  | -149625.000000 | 34.705094  | a := 125.8750 |
| 3  | 125.875000 | 250.750000  | -533.775512    | 34.705094  | a := 188.3125 |
| 4  | 188.312500 | 250.750000  | -151.077416    | 34.705094  | a := 219.5312 |
| 5  | 219.531250 | 250.750000  | -44.091479     | 34.705094  | a := 235.1406 |
| 6  | 235.140625 | 250.750000  | -1.873702      | 34.705094  | b := 242.9453 |
| 7  | 235.140625 | 242.945312  | -1.873702      | 17.052326  | b := 239.0430 |
| 8  | 235.140625 | 239.042969  | -1.873702      | 7.756305   | b := 237.0918 |
| 9  | 235.140625 | 237.091797  | -1.873702      | 2.984085   | b := 236.1162 |
| 10 | 235.140625 | 236.116211  | -1.873702      | 0.566021   | a := 235.6284 |
| 11 | 235.628418 | 236.116211  | -0.651116      | 0.566021   | a := 235.8723 |
| 12 | 235.872314 | 236.116211  | -0.041869      | 0.566021   | b := 235.9943 |
| 13 | 235.872314 | 235.994263  | -0.041869      | 0.262246   | b := 235.9333 |

The method converges to a solution of \(x^* \approx 235.9\) after 13 iterations.

Note that if we do not try a sufficiently large value of \(n\) for the upper end of the bracketing interval, the method will fail.
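
One simple way to obtain such a bracket, as a sketch rather than part of the lecture's algorithm, is to keep doubling the upper end until the sign of \(f\) changes (this assumes the `f_mortgage` and `bisection` functions sketched earlier):

```python
# Expand the upper end of the bracket until a sign change is found.
x_L, x_R = 1.0, 2.0
while f_mortgage(x_L) * f_mortgage(x_R) > 0.0:
    x_R *= 2.0                     # double the upper end and test again

n_star = bisection(f_mortgage, x_L, x_R, tol=0.1)
print(x_R, n_star)                 # x_R stops at 256; the root is near 236
```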

Example 3#

Use the bisection method to find the points at which the thickness of the NACA0012 aerofoil is 0.1 with an error of less than \(10^{-4}\).

It can be seen that \(0 \le x^* \le 1\) but there are two solutions in this interval, so try \([x_L, x_R] = [0.5, 1]\) as the initial bracket.

Results of bisection method on example 3

| it | a | b | f(a) | f(b) | update |
|---:|---------:|---------:|---------:|----------:|:------------|
| 0  | 0.500000 | 1.000000 | 0.038234 | -0.047900 | a := 0.7500 |
| 1  | 0.750000 | 1.000000 | 0.002672 | -0.047900 | b := 0.8750 |
| 2  | 0.750000 | 0.875000 | 0.002672 | -0.020758 | b := 0.8125 |
| 3  | 0.750000 | 0.812500 | 0.002672 | -0.008606 | b := 0.7812 |
| 4  | 0.750000 | 0.781250 | 0.002672 | -0.002859 | b := 0.7656 |
| 5  | 0.750000 | 0.765625 | 0.002672 | -0.000067 | a := 0.7578 |
| 6  | 0.757812 | 0.765625 | 0.001309 | -0.000067 | a := 0.7617 |
| 7  | 0.761719 | 0.765625 | 0.000623 | -0.000067 | a := 0.7637 |
| 8  | 0.763672 | 0.765625 | 0.000279 | -0.000067 | a := 0.7646 |
| 9  | 0.764648 | 0.765625 | 0.000106 | -0.000067 | a := 0.7651 |
| 10 | 0.765137 | 0.765625 | 0.000020 | -0.000067 | b := 0.7654 |
| 11 | 0.765137 | 0.765381 | 0.000020 | -0.000023 | b := 0.7653 |
| 12 | 0.765137 | 0.765259 | 0.000020 | -0.000002 | a := 0.7652 |

This gives the root as \(x^* \approx 0.7652\) after 12 iterations.

Note that:

  • taking \([x_L, x_R] = [0, 0.5]\) gives the other root \(x^* \approx 0.0339\);

  • taking \([x_L, x_R] = [0, 1]\) or \([x_L, x_R] = [0.1, 0.6]\) would fail to give an initial bracket.
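
These observations can be checked directly with the sign-change test (a quick sketch, again assuming the `f_naca` and `bisection` functions from earlier):

```python
print(f_naca(0.0) * f_naca(1.0) < 0.0)   # False: [0, 1] contains both roots, no sign change
print(f_naca(0.5) * f_naca(1.0) < 0.0)   # True: [0.5, 1] is a valid bracket
print(bisection(f_naca, 0.5, 1.0))       # approximately 0.7652
print(bisection(f_naca, 0.0, 0.5))       # approximately 0.0339
```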

Weaknesses of the bisection algorithm#

  • Bisection is a reliable method for finding solutions of \(f(x) = 0\) provided an initial bracket can be found.

  • It is far from perfect however:

    • finding an initial bracket is not always easy;

    • it can take a very large number of iterations to obtain an accurate answer;

    • it can never find solutions of \(f(x) = 0\) for which \(f\) does not change sign.

For example:

[Figure: a function with a root at which \(f\) does not change sign.]

Newton’s method#

  • Newton’s method (or the Newton-Raphson iteration) is an alternative algorithm for solving \(f(x) = 0\).

  • On the positive side:

    • it does not require an initial bracket - just an initial iterate;

    • it converges much more quickly than bisection;

    • it can solve problems where the solution is also a turning point (see previous example).

  • On the negative side:

    • it is not guaranteed to converge (unlike bisection with a good initial bracket).

Graphical derivation of Newton’s method#

\(x^{(i+1)}\) is computed by following the tangent to \(f\) at \(x^{(i)}\), which has slope \(f'(x^{(i)})\), down to the \(x\)-axis, giving

\[\begin{split} \begin{aligned} f'(x^{(i)}) & = \frac{f(x^{(i)}) - 0}{x^{(i)} - x^{(i+1)}} \\ \Rightarrow x^{(i+1)} & = x^{(i)} - \frac{f(x^{(i)})}{f'(x^{(i)})}. \end{aligned} \end{split}\]
[Figure: one step of Newton's method, following the tangent at \(x^{(i)}\) down to the \(x\)-axis to obtain \(x^{(i+1)}\).]

Notes#

  • The formula \(x^{(i+1)} = x^{(i)} - \frac{f(x^{(i)})}{f'(x^{(i)})}\) requires an initial guess at the solution: \(x^{(0)}\).

  • In order to apply Newton’s method we need to be able to compute an expression for the derivative of \(f(x)\):

    • this may not always be possible or easy;

    • in the examples that follow we will make use of the formula:

      \[ f(x) = x^n \Rightarrow f'(x) = n x^{n-1}, \]

      which is true for any \(n \neq 0\).

  • There are variants of Newton’s method that allow the derivative to be approximated: we will return to these later.
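
A minimal sketch of the iteration in code (the interface is our own choice; the derivative is supplied as a separate function and a fixed number of iterations is taken rather than a convergence test):

```python
def newton(f, df, x0, num_iterations=10):
    """Newton's method: repeat x <- x - f(x) / f'(x) starting from x0."""
    x = x0
    for _ in range(num_iterations):
        x = x - f(x) / df(x)       # one Newton step using the supplied derivative
    return x
```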

Examples#

Write out an expression for the Newton iteration for each of the following functions \(f(x)\) and then carry out 2 iterations using the resulting iteration and the given initial value \(x^{(0)}\).

  1. \(f(x) = x^2 - 5\) with \(x^{(0)} = 2\).

    This gives \(f'(x) = 2x\) so the iterative formula is

    \[ x^{(i+1)} = x^{(i)} - \frac{(x^{(i)})^2 - 5}{2 x^{(i)}} = \frac{(x^{(i)})^2 + 5}{2 x^{(i)}}. \]

    This gives

    \[\begin{split} \begin{aligned} x^{(1)} & = \frac{2^2 + 5}{2 \times 2} = \frac{9}{4} = 2.25 \\ x^{(2)} & = \frac{(\frac{9}{4})^2 + 5}{2 \times \frac{9}{4}} = \frac{161}{72} \approx 2.23611 \end{aligned} \end{split}\]

    (Note \(\sqrt{5} \approx 2.23607\); this calculation is checked numerically in the sketch after this list.)

  2. \(f(x) = x^3 - 2x - 5\) with \(x^{(0)} = 2\). (homework)

  3. \(f(x) = x^3 - 2\) with \(x^{(0)} = 1\). (homework)
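
Checking the hand calculation in item 1 (a usage sketch assuming the `newton` function from the previous sketch):

```python
f  = lambda x: x**2 - 5            # item 1: f(x) = x^2 - 5
df = lambda x: 2.0 * x             # its derivative, f'(x) = 2x

print(newton(f, df, 2.0, num_iterations=1))   # 2.25
print(newton(f, df, 2.0, num_iterations=2))   # 2.2361111..., i.e. 161/72
```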

Summary#

  • The bisection approach is a simple and reliable algorithm for solving a nonlinear equation.

  • An appropriate initial bracket must be found - some searching may be required to locate one.

  • The analysis shows that the method is not very fast.

  • Newton’s method is an alternative approach:

    • it does not require an initial bracket - just an initial iterate;

    • it converges much more quickly than bisection;

    • it can solve problems where the solution is also a turning point;

  • We will discuss some drawbacks of Newton’s method next time…

Further reading#

The slides used in the lecture are also available.