WS03: Triangular systems and Gaussian elimination - partial solutions

WS03: Triangular systems and Gaussian elimination - partial solutions#

These exercises are indented to give you practice at using the material on numerical approximation and are intended to reinforce the material that was covered in lectures.

Please attempt the worksheet before your tutorial. Support is available in your tutorial or in the Class Team.

These are partial solutions. Please create Issues and Pull requests with your solutions.

Part a (pen and paper warm up)#

1. Triangular systems#

Solve the upper triangular linear system given by

\[\begin{split} \begin{aligned} 2 x_1 &+& x_2 &+& 4 x_3 &=& 12 \\ && 1.5 x_2 && &=& 3 \\ && && 2 x_3 &=& 4 \end{aligned}. \end{split}\]

2. Elementary row operations#

Consider the system

\[\begin{split} \begin{aligned} x_1 + 2 x_2 & = 1 && (3) \\ 4 x_1 + x_2 & = -3 && (4). \end{aligned} \end{split}\]

Find:

\(2 \times (3)\) \(\rightarrow\)
\(0.25 \times (4)\) \(\rightarrow\)
\((4) + (-1) \times (3)\) \(\rightarrow\)
\((4) + (-4) \times (3)\) \(\rightarrow\)

3. Gaussian elimination#

Use Gaussian elimination followed by backward substitution to solve the linear system of equations given by

\[\begin{split} \begin{pmatrix} 4 & -1 & -1 \\ 2 & 4 & 2 \\ 1 & 2 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 9 \\ -6 \\ 3 \end{pmatrix}. \end{split}\]

\[\begin{split} \begin{pmatrix} 4 & 3 & 2 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 1 & 3 & 0 \\ 2 & 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 10 \\ 7 \\ 5 \\ 8 \end{pmatrix}. \end{split}\]

The solution is \(\vec{x} = (1, 1, 1, 1)^T\).

Part b (code implementations and testing)#

import numpy as np

4. Implementations#

Implement the following functions with doc-strings given.

def lower_triangular_solve(A, b):
    """
    Solve the system  A x = b  where A is assumed to be lower triangular,
    i.e. A(i,j) = 0 for j > i, and the diagonal is assumed to be nonzero,
    i.e. A(i,i) != 0.

    The code checks that A is lower triangular and converts A and b to
    double precision before computing.

    ARGUMENTS:  A   lower triangular n x n array
                b   right hand side column n-vector

    RETURNS:    x   column n-vector solution
    """

    # we should take care to ensure that arrays are stored with the correct type - float!
    A = A.astype(np.float64)
    b = b.astype(np.float64)

    # check sizes of A and b match appropriately
    nb = len(b)
    n, m = A.shape
    if n != m:
        raise ValueError(f"A is not a square matrix! {A.shape=}")
    if n != nb:
        raise ValueError(f"shapes of A and b do not match! {A.shape=} {b.shape=}")

    # checks whether A is lower triangular
    for i in range(n):
        for j in range(i + 1, n):
            if not np.isclose(A[i, j], 0.0):
                raise ValueError(f"A is not lower triangular! {A[i, j]=}")

    # checks whether A has zero diagonal element
    for i in range(n):
        if np.isclose(A[i, i], 0.0):
            raise ValueError(f"A[{i}, {i}] is zero")

    # create a new array to store the results
    x = np.empty_like(b)

    # perform forward substitution
    x[0] = b[0] / A[0, 0]
    for i in range(1, n):
        x[i] = b[i] / A[i, i]
        for j in range(i):
            x[i] = x[i] - A[i, j] * x[j] / A[i, i]

    return x

def upper_triangular_solve(A, b):
    """
    Solve the system  A x = b  where A is assumed to be lower triangular,
    i.e. A(i,j) = 0 for j > i, and the diagonal is assumed to be nonzero,
    i.e. A(i,i) != 0.

    The code checks that A is lower triangular and converts A and b to
    double precision before computing.

    ARGUMENTS:  A   lower triangular n x n array
                b   right hand side column n-vector

    RETURNS:    x   column n-vector solution
    """

    # we should take care to ensure that arrays are stored with the correct type - float!
    A = A.astype(np.float64)
    b = b.astype(np.float64)

    # check sizes of A and b match appropriately
    nb = len(b)
    n, m = A.shape
    if n != m:
        raise ValueError(f"A is not a square matrix! {A.shape=}")
    if n != nb:
        raise ValueError(f"shapes of A and b do not match! {A.shape=} {b.shape=}")

    # check A is upper triangular
    for i in range(n):
        for j in range(0, i):
            if not np.isclose(A[i, j], 0.0):
                raise ValueError(f"A is not upper triangular! {A[i, j]=}")

    # checks whether A has zero diagonal element
    for i in range(n):
        if np.isclose(A[i, i], 0.0):
            raise ValueError(f"A[{i}, {i}] is zero")

    # create a new array to store the results
    x = np.empty_like(b)

    # perform backwards substitution
    x[n - 1] = b[n - 1] / A[n - 1, n - 1]
    for i in range(2, n + 1):
        x[n - i] = b[n - i] / A[n - i, n - i]
        for j in range(n - i + 1, n):
            x[n - i] = x[n - i] - A[n - i, j] * x[j] / A[n - i, n - i]

    return x

def gaussian_elimination(A, b, verbose=False):
    # To ensure that arrays are stored in double precision.
    A = A.astype(np.float64)
    b = b.astype(np.float64)

    # size of solution vector / the square matrix A
    n = len(b)  # or   n, n = A.shape

    # check sizes of A and b match appropriately
    nb = len(b)
    n, m = A.shape
    if n != m:
        raise ValueError(f"A is not a square matrix! {A.shape=}")
    if n != nb:
        raise ValueError(f"shapes of A and b do not match! {A.shape=} {b.shape=}")

    if verbose:
        print("starting system\n", A, b)

    # perform forward elimination
    for i in range(n):
        # eliminate column i
        if verbose:
            print(f"eliminating column {i}")

        # check diagonal
        if np.isclose(A[i, i], 0.0):
            raise ValueError(f"A has zero on diagonal! A[{i}, {i}] = 0")

        # row j <- row j - (a_{ji} / a_{ii}) row i
        for j in range(i + 1, n):
            if verbose:
                print(f"row {j} <- row {j} - {A[j, i] / A[i, i]} row {i}")
            factor = A[j, i] / A[i, i]
            for k in range(0, n):
                A[j, k] = A[j, k] - factor * A[i, k]
            b[j] = b[j] - factor * b[i]

        if verbose:
            print("new system\n", A, b)

    return upper_triangular_solve(A, b)

5. Testing#

Test your solutions using your answers to part a. The first test has been implemented for you.

U = np.array([[2, 1, 4], [0, 1.5, 0], [0, 0, 2]], dtype=np.double)
b = np.array([[12], [3], [4]], dtype=np.double)

# numpy linear solvers
x0 = np.linalg.solve(U, b)
print(x0)

x = upper_triangular_solve(U, b)
print(x)

np.testing.assert_almost_equal(x, x0)

A = np.array([[4, 3, 2, 1], [1, 2, 2, 2], [1, 1, 3, 0], [2, 1, 2, 3]], dtype=np.double)
b = np.array([10, 7, 5, 8], dtype=np.double)

# numpy linear solvers
x0 = np.linalg.solve(A, b)
print("x0=", x0)

x = gaussian_elimination(A, b, verbose=True)
print("x=", x)

print(np.matmul(A, x) - b)

# test solution is close to exact value
np.testing.assert_almost_equal(x, x0)
# test residual is small
np.testing.assert_almost_equal(np.matmul(A, x) - b, np.zeros_like(x))

Part c: Extension#

Next we want to know how good the method is. We do this by testing robustness and efficiency in practical settings

6. Robustness#

Consider the system of linear equations given by:

\[\begin{split} \begin{aligned} x_1 + x_2 & = 3 \\ \varepsilon x_1 + x_2 & = 2 + \varepsilon. \end{aligned} \end{split}\]

First verify that the true solution is \((x_1, x_2)^T = (1, 2)^T\).
Write the problem in matrix form. What does your function give for small values \(\varepsilon = 10^{-4}, 10^{-8}, 10^{-12}\)?
Rewrite the matrix form by considering the equations the other way round. What does your function given now?

TODO contribute your answer via github.

7. Efficiency#

Consider the family of systems of linear equations given by

\[ A \vec{x} = \vec{b}, \]

where \(A\) is given in the code below for varying sizes \(n\).

import time

test_set = np.linspace(100, 300, num=10, endpoint=True, dtype="int")

exc_time = np.zeros(len(test_set))

np.random.seed(1)

i = 0
for n in test_set:
    B = np.random.rand(n, n)
    eps = 0.1

    A = eps * np.eye(n) + B * B.T
    x = np.ones(n)
    b = np.matmul(A, x)

    start = time.time()
    x = gaussian_elimination(A, b)
    end = time.time()

    exc_time[i] = end - start
    i = i + 1

print(exc_time)

Test the run time of your Gaussian elimination implementation for a range of values n based on the data from generate_test_set.

import matplotlib.pyplot as plt

fig = plt.figure()
ax = plt.axes()
# we expect log(exc_time) / log10(test_set) = 3 because the run time of the Gaussian elimination is O(n^3)
plt.plot(np.log10(test_set), np.log10(exc_time))
plt.xlabel("log(matrix size)")
plt.ylabel("log(runtime)")
plt.grid()
ax.set_aspect("equal", "box")
plt.show()