This module comprises two tutorials with associated exercises and a Java applet to do the necessary computations and display the results graphically. We explore three different iterative methods, that is, methods that are intended to generate successive approximations to the exact solution of a linear system of equations.
David Strong is Assistant Professor of Mathematics at Pepperdine University.
Iterative methods are important in solving many of today's real-world problems, so it is important that your first exposure to these methods be as positive and simple, as well as informative and educational, as possible. With this in mind, the main purpose of the tutorials and applet is to allow easy visualization and experimentation.
In more detail, here are the primary features of the applet:
You will see that the applet deals only with 2 × 2 systems of equations.
That is, we want to solve for x in the equation Ax = b, where A is a known n × n matrix and b is a known column vector with n elements.
A is commonly referred to as the coefficient matrix. In order to save space, we usually write column vectors in coordinate form: x = (x1, x2, …, xn).
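Written out in full (standard notation, nothing beyond the definitions just given), the system looks like this:

```latex
% The system Ax = b written out in full.
\begin{pmatrix}
  a_{11} & a_{12} & \cdots & a_{1n} \\
  a_{21} & a_{22} & \cdots & a_{2n} \\
  \vdots &        & \ddots & \vdots \\
  a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix},
\qquad
x = (x_1, x_2, \ldots, x_n) \text{ in coordinate form.}
```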
Suppose we knew the true values of x; then we could say exactly how far from them any approximation is. In practice we do not know them, of course, and this is where iteration comes in.
The fundamental idea of an iterative method is to use a current approximation of the solution to produce a new, and hopefully better, approximation.
Iterate means repeat, so an iterative method repeats this process over and over, each time using the current approximation to produce a better approximation for the true solution, until the current approximation is sufficiently close to the true solution, or until you realize that the sequence of approximations resulting from these iterations is not converging to the true solution.
So, given an initial guess or approximation x(0), we use it to produce a new approximation x(1), then use x(1) to produce x(2), and so on, generating a sequence of approximations x(0), x(1), x(2), ….
Since we don’t actually have the true solution x (if we did, why would we be trying to find it?), we can’t check to see how close our current approximation x(k) is to x.
Another way to check the accuracy of our current approximation is by looking at the differences in successive approximations, x(k) − x(k−1): if successive approximations are changing very little, we take this as evidence that they are settling down near the true solution.
This idea of iteration is not unique to linear algebra. An example that may be familiar to you is Newton's Method, which finds approximations to a root x of f(x) = 0 by iterating x(k+1) = x(k) − f(x(k)) / f′(x(k)).
Under the right conditions, x(k) → x [that is, f(x(k)) → 0] as k → ∞.
The applet is designed for solving a 2 × 2 system of equations using the Jacobi, Gauss-Seidel, and/or SOR Methods.
How to use the applet:
Notice the color coordination of the applet. For example, everything related to the True solution is in green, everything related to the Gauss-Seidel Method is in purple, and so on.
Perhaps the simplest iterative method for solving Ax = b is Jacobi’s Method. Note that the simplicity of this method is both good and bad: good, because it is relatively easy to understand and thus is a good first taste of iterative methods; bad, because it is not typically used in practice (although its potential usefulness has been reconsidered with the advent of parallel computing). Still, it is a good starting point for learning about more useful, but more complicated, iterative methods.
Given a current approximation

x(k) = (x1(k), x2(k), x3(k), …, xn(k))

for x, the strategy of Jacobi's Method is to use the first equation and the current values of x2(k), x3(k), …, xn(k) to find a new value x1(k+1), and similarly to find a new value xi(k+1) using the i-th equation and the current values of the other variables. That is, given current values x(k) = (x1(k), x2(k), …, xn(k)), we find new values

x(k+1) = (x1(k+1), x2(k+1), …, xn(k+1))

by solving for xi(k+1) in the i-th equation:

ai1 x1(k) + ⋯ + ai,i−1 xi−1(k) + aii xi(k+1) + ai,i+1 xi+1(k) + ⋯ + ain xn(k) = bi,  i = 1, …, n.
This system can also be written in explicitly solved form as

xi(k+1) = ( bi − Σj≠i aij xj(k) ) / aii,  i = 1, …, n.
To be clear, the subscript i means that xi(k) is the i-th element of the vector

x(k) = (x1(k), x2(k), …, xi(k), …, xn(k)),

and the superscript k corresponds to the particular iteration (not the k-th power of xi).
If we write D, L, and U for the diagonal, strictly lower triangular, and strictly upper triangular parts of A, respectively, so that A = L + D + U,
then Jacobi's Method can be written in matrix-vector notation as

D x(k+1) = −(L + U) x(k) + b, that is, x(k+1) = D⁻¹( b − (L + U) x(k) ).
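To make the algorithm concrete, here is a minimal Java sketch of Jacobi's Method (our own illustration, not the applet's source code); it runs the method on the 3 × 3 system of Example 1, which follows, and reproduces the approximations in the table below:

```java
// A minimal sketch of Jacobi's Method (not the applet's actual source).
// Each new component is computed from the OLD components only.
public class JacobiSketch {

    // One Jacobi step: returns x(k+1) given x(k).
    static double[] jacobiStep(double[][] A, double[] b, double[] x) {
        int n = b.length;
        double[] xNew = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                if (j != i) sum += A[i][j] * x[j];   // old values xj(k)
            }
            xNew[i] = (b[i] - sum) / A[i][i];        // requires aii != 0
        }
        return xNew;
    }

    // The l2 norm, used to measure the error e(k) = x - x(k).
    static double norm(double[] v) {
        double s = 0.0;
        for (double vi : v) s += vi * vi;
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        // The 3 x 3 system of Example 1; its true solution is (1, 2, -1).
        double[][] A = {{4, -1, -1}, {-2, 6, 1}, {-1, 1, 7}};
        double[] b = {3, 9, -6};
        double[] xTrue = {1, 2, -1};
        double[] x = {0, 0, 0};                      // initial guess x(0) = 0
        for (int k = 1; k <= 8; k++) {
            x = jacobiStep(A, b, x);
            double[] e = new double[x.length];
            for (int i = 0; i < e.length; i++) e[i] = xTrue[i] - x[i];
            System.out.printf("k=%d  x=(%.3f, %.3f, %.3f)  ||e||=%.3f%n",
                              k, x[0], x[1], x[2], norm(e));
        }
    }
}
```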
Let's apply Jacobi's Method to the system

4x1 − x2 − x3 = 3
−2x1 + 6x2 + x3 = 9
−x1 + x2 + 7x3 = −6.
At each step, given the current values x1(k), x2(k), x3(k), we solve for x1(k+1), x2(k+1), and x3(k+1) in

4x1(k+1) = 3 + x2(k) + x3(k)
6x2(k+1) = 9 + 2x1(k) − x3(k)
7x3(k+1) = −6 + x1(k) − x2(k).
So, if our initial guess x(0) = (x1(0), x2(0), x3(0)) is the zero vector 0 = (0, 0, 0) (a common initial guess unless we have some additional information that leads us to choose something else), then we find x(1) = (x1(1), x2(1), x3(1)) by solving

4x1(1) = 3 + 0 + 0 = 3
6x2(1) = 9 + 2(0) − 0 = 9
7x3(1) = −6 + 0 − 0 = −6.
So x(1) = (x1(1), x2(1), x3(1)) = (3/4, 9/6, −6/7) ≈ (0.750, 1.500, −0.857). We iterate this process to find a sequence of increasingly better approximations x(0), x(1), x(2), … . We show the results in the table below, with all values rounded to 3 decimal places.
We are interested in the error e at each iteration between the true solution x and the approximation x(k): e(k) = x − x(k) . Obviously, we don't usually know the true solution x. However, to better understand the behavior of an iterative method, it is enlightening to use the method to solve a system Ax = b for which we do know the true solution and analyze how quickly the approximations are converging to the true solution. For this example, the true solution is x = (1, 2, −1).
The norm of a vector ||x|| tells us how big the vector is as a whole (as opposed to how large each element of the vector is). The vector norm most commonly used in linear algebra is the l2 norm:

||x||2 = √( x1² + x2² + ⋯ + xn² ).
For example, if x = (1, 2, −1), then

||x|| = √( 1² + 2² + (−1)² ) = √6 ≈ 2.449.
In this module, we will always use the l2 norm (including for matrix norms in subsequent tutorials), so that || · || always signifies || · ||2.
For our purposes, we observe that ||x|| will be small exactly when each of the elements x1, x2, …, xn in x = (x1, x2, …, xn ) is small. In the following table, the norm of the error becomes progressively smaller as the error in each of the three elements x1, x2, x3 becomes smaller, or in other words, as the approximations become progressively better.
| k | x1(k) | x2(k) | x3(k) | x1(k) − x1(k−1) | x2(k) − x2(k−1) | x3(k) − x3(k−1) | e1(k) | e2(k) | e3(k) | ||e(k)|| |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.000 | 0.000 | 0.000 | − | − | − | 1.000 | 2.000 | −1.000 | 2.449 |
| 1 | 0.750 | 1.500 | −0.857 | 0.750 | 1.500 | −0.857 | 0.250 | 0.500 | −0.143 | 0.577 |
| 2 | 0.911 | 1.893 | −0.964 | 0.161 | 0.393 | −0.107 | 0.089 | 0.107 | −0.036 | 0.144 |
| 3 | 0.982 | 1.964 | −0.997 | 0.071 | 0.071 | −0.033 | 0.018 | 0.036 | −0.003 | 0.040 |
| 4 | 0.992 | 1.994 | −0.997 | 0.010 | 0.029 | 0.000 | 0.008 | 0.006 | −0.003 | 0.011 |
| 5 | 0.999 | 1.997 | −1.000 | 0.007 | 0.003 | −0.003 | 0.001 | 0.003 | 0.000 | 0.003 |
| 6 | 0.999 | 2.000 | −1.000 | 0.000 | 0.003 | 0.001 | 0.000 | 0.001 | 0.000 | 0.001 |
| 7 | 1.000 | 2.000 | −1.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 8 | 1.000 | 2.000 | −1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |

Here ei(k) denotes the i-th element of the error e(k) = x − x(k).
For this example, we stop iterating after all three ways of measuring the current error,
x(k) − x(k-1), e(k), and ||e(k)||,
equal 0 to three decimal places. In practice, you would normally choose a single measurement of error to determine when to stop.
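In code, such a stopping test is just a loop condition. As a sketch of one possible design choice (not something prescribed by this module), the following method could be added to the JacobiSketch class above:

```java
// A possible tolerance-based stopping test (illustration only).
// Iterates until successive approximations differ by less than tol
// in the l2 norm, or until an iteration cap is reached.
static double[] solveByJacobi(double[][] A, double[] b, double[] x0,
                              double tol, int maxIterations) {
    double[] x = x0.clone();
    for (int k = 0; k < maxIterations; k++) {
        double[] xNew = jacobiStep(A, b, x);     // one Jacobi iteration
        double sumSq = 0.0;                      // ||x(k+1) - x(k)||^2
        for (int i = 0; i < x.length; i++) {
            double d = xNew[i] - x[i];
            sumSq += d * d;
        }
        x = xNew;
        if (Math.sqrt(sumSq) < tol) break;       // iterates barely changed
    }
    return x;
}
```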
The Gauss-Seidel Method differs from Jacobi's Method in one respect: each new value is used as soon as it is computed, rather than waiting until the next iteration. That is, in finding xi(k+1) we use the new values x1(k+1), …, xi−1(k+1) along with the old values xi+1(k), …, xn(k):

xi(k+1) = ( bi − Σj<i aij xj(k+1) − Σj>i aij xj(k) ) / aii,  i = 1, …, n.

This can also be written in matrix-vector notation as

(D + L) x(k+1) = −U x(k) + b.

That is,

x(k+1) = (D + L)⁻¹( b − U x(k) ).
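In code, the change from Jacobi's Method is small but important: the update happens in place, so new components are used as soon as they exist. A minimal Java sketch (again ours, not the applet's source):

```java
// A minimal sketch of the Gauss-Seidel Method (not the applet's source).
public class GaussSeidelSketch {

    // One Gauss-Seidel step, performed IN PLACE: when x[i] is updated,
    // the components x[0..i-1] already hold their new values.
    static void gaussSeidelStep(double[][] A, double[] b, double[] x) {
        int n = b.length;
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                if (j != i) sum += A[i][j] * x[j]; // new for j < i, old for j > i
            }
            x[i] = (b[i] - sum) / A[i][i];         // requires aii != 0
        }
    }

    public static void main(String[] args) {
        // One step on the system of Example 1, starting from x(0) = 0,
        // reproduces x(1) = (0.750, 1.750, -1.000) found below.
        double[][] A = {{4, -1, -1}, {-2, 6, 1}, {-1, 1, 7}};
        double[] b = {3, 9, -6};
        double[] x = {0, 0, 0};
        gaussSeidelStep(A, b, x);
        System.out.printf("x(1) = (%.3f, %.3f, %.3f)%n", x[0], x[1], x[2]);
    }
}
```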
Let's apply the Gauss-Seidel Method to the system from Example 1:

4x1 − x2 − x3 = 3
−2x1 + 6x2 + x3 = 9
−x1 + x2 + 7x3 = −6.
At each step, given the current values x1(k), x2(k), x3(k), we compute

x1(k+1) = ( 3 + x2(k) + x3(k) ) / 4
x2(k+1) = ( 9 + 2x1(k+1) − x3(k) ) / 6
x3(k+1) = ( −6 + x1(k+1) − x2(k+1) ) / 7.
To compare our results from the two methods, we again choose x(0) = (0, 0, 0). We then find x(1) as follows.
Let us be clear about how we solve this system. We first solve for x1(1), using the old values x2(0) = x3(0) = 0:

x1(1) = ( 3 + 0 + 0 ) / 4 = 0.750.

We then solve for x2(1), using the new value x1(1) = 0.750:

x2(1) = ( 9 + 2(0.750) − 0 ) / 6 = 1.750.

Finally, we solve for x3(1), using both new values:

x3(1) = ( −6 + 0.750 − 1.750 ) / 7 = −1.000.
The result of this first iteration of the Gauss-Seidel Method is

x(1) = (x1(1), x2(1), x3(1)) = (0.750, 1.750, −1.000).
We iterate this process to generate a sequence of increasingly better approximations x(0), x(1), x(2), …, shown in the table below with all values rounded to 3 decimal places.
| k | x1(k) | x2(k) | x3(k) | x1(k) − x1(k−1) | x2(k) − x2(k−1) | x3(k) − x3(k−1) | e1(k) | e2(k) | e3(k) | ||e(k)|| |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.000 | 0.000 | 0.000 | − | − | − | 1.000 | 2.000 | −1.000 | 2.449 |
| 1 | 0.750 | 1.750 | −1.000 | 0.750 | 1.750 | −1.000 | 0.250 | 0.250 | 0.000 | 0.354 |
| 2 | 0.938 | 1.979 | −1.006 | 0.188 | 0.229 | −0.006 | 0.063 | 0.021 | 0.006 | 0.066 |
| 3 | 0.993 | 1.999 | −1.001 | 0.056 | 0.020 | 0.005 | 0.007 | 0.001 | 0.001 | 0.007 |
| 4 | 0.999 | 2.000 | −1.000 | 0.006 | 0.001 | 0.001 | 0.001 | 0.000 | 0.000 | 0.001 |
| 5 | 1.000 | 2.000 | −1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
As in Example 1, we stop iterating after all three measures of the error, x(k) − x(k−1), e(k), and ||e(k)||, equal 0 to three decimal places.
Using x(0) = (0, 0), complete a table like the one below, doing five iterations. Compute the first two iterations by hand, and then use the applet to check your values and to compute the remaining iterations.
| k | x1(k) | x2(k) | x1(true) − x1(k) | x2(true) − x2(k) | ||error(k)|| |
|---|---|---|---|---|---|
| 1 | −4.000000 | − | − | − | 1.250000 |
| 2 | | | | | |
| 3 | | | | | |
| 4 | | | | | |
| 5 | | | | | |
Repeat Exercise 1, using the Gauss-Seidel Method.
Repeat Exercise 2, using the Gauss-Seidel Method.
(A) (system of equations not shown)
(B) (system of equations not shown)
The table records the number of iterations needed until ||error(k)|| = 0.000000.

| System | Initial guess | Jacobi | Gauss-Seidel | Jacobi / GS |
|---|---|---|---|---|
| A | (0, 0) | 13 | 7 | 13 / 7 ≈ 1.857 |
| A | (100, −50) | | | |
| A | (−500, 1000) | | | |
| B | (0, 0) | | | |
| B | (100, −50) | | | |
| B | (−500, 1000) | 156 | 79 | 156 / 79 ≈ 1.975 |
Both of the methods we have seen can be written in the form

M x(k+1) = N x(k) + b,

where A = M − N (for Jacobi's Method, M = D and N = −(L + U); for the Gauss-Seidel Method, M = D + L and N = −U), so that

x(k+1) = M⁻¹N x(k) + M⁻¹b,

which we can rewrite as

x(k+1) = B x(k) + c,

where

B = M⁻¹N and c = M⁻¹b.

B is often referred to as the iteration matrix.
If the current approximation x(k) is, in fact, the exact solution x, then the iterative method should certainly produce a next iteration x(k+1) that is also the exact solution. That is, it should be true that

x = B x + c.

Of course, since the problem we are trying to solve is Ax = b, we need this fixed-point equation to hold precisely for the solution of Ax = b, and it does: x = Bx + c is equivalent to Mx = Nx + b, that is, to (M − N)x = Ax = b.
On the other hand, choosing B and c so that the true solution is a fixed point of the iteration is not by itself enough: the sequence of approximations must also converge to that fixed point, and this is where the properties of B come in.
To understand the convergence properties of an iterative method

x(k+1) = B x(k) + c,

we subtract it from the equation

x = B x + c,

which gives us

x − x(k+1) = B ( x − x(k) ).

That is, since the current error is e(k) = x − x(k), the error at the next iteration is e(k+1) = B e(k), and so after k iterations

e(k) = B^k e(0).
To be clear, the superscript of B is the power of B, while the superscript of vector e (inside parentheses) is the iteration number to which this particular error corresponds.
Full appreciation of the significance of the relationship e(k) = B^k e(0) requires an understanding of the eigenvalues and eigenvectors of B, which we take up next.
Consequently, since e(k) = B^k e(0), the error converges to the zero vector, for any choice of initial guess, exactly when B^k → 0 as k → ∞, and this happens exactly when every eigenvalue of B is less than 1 in magnitude.
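To see why the eigenvalues control convergence, here is the standard argument, a sketch assuming the eigenvectors of B form a basis (as they do in the 2 × 2 examples below):

```latex
% Expand the initial error in a basis of eigenvectors of B,
% B v_i = \lambda_i v_i (shown here for the 2 x 2 case):
e^{(0)} = c_1 v_1 + c_2 v_2
\quad\Longrightarrow\quad
e^{(k)} = B^k e^{(0)} = c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2 .
% Each term dies out exactly when |\lambda_i| < 1, so e^{(k)} \to 0
% for every e^{(0)} exactly when |\lambda_1| < 1 and |\lambda_2| < 1.
```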
We end this section by noting one condition sometimes encountered in practice that will guarantee that an iterative method converges: if the coefficient matrix A is strictly diagonally dominant, meaning that in each row the magnitude of the diagonal element is greater than the sum of the magnitudes of the other elements in that row, then both the Jacobi and Gauss-Seidel Methods are guaranteed to converge.
Recall that the iteration matrices of the Jacobi and Gauss-Seidel Methods are B = −D⁻¹(L + U) and B = −(D + L)⁻¹U, respectively, so for a general 2 × 2 coefficient matrix

A = [ a11, a12 ; a21, a22 ]

(rows separated by the semicolon), their iteration matrices are
| Method | B |
|---|---|
| Jacobi | −D⁻¹(L + U) = [ 0, −a12/a11 ; −a21/a22, 0 ] |
| Gauss-Seidel | −(D + L)⁻¹U = [ 0, −a12/a11 ; 0, a12a21/(a11a22) ] |
Notice that for both methods the diagonal elements of A must be non-zero: a11 ≠ 0 and a22 ≠ 0.
It turns out that, if an iteration matrix B has only eigenvalues of magnitude less than 1, then B^k → 0 as k → ∞, so the corresponding method converges no matter what initial guess we choose.
We have now answered the first question posed on the preceding page, at least for 2 × 2 systems:
When will each of these methods work? That is, under what conditions will they produce a sequence of approximations x(0), x(1), x(2), … that converges to the true solution x?

Answer: When the eigenvalues of the corresponding iteration matrix are both less than 1 in magnitude.
Because convergence depends only on the eigenvalues of B, we can also answer the second question posed earlier:
When the methods do work, how quickly will the approximations approach the true solution? That is, what will the rate of convergence be?
Answer: The rate will be the same as the rate at which ||B^k|| converges to 0.
For example, if ||B|| = 0.5, then the norm of the error is cut at least in half with each iteration, since ||e(k+1)|| = ||B e(k)|| ≤ ||B|| ||e(k)||.
The size of ||B|| is directly related to the magnitudes of the eigenvalues of B. Consequently, a major goal in designing an iterative method is that the corresponding iteration matrix B have eigenvalues that are as small (as close to 0) as possible.
The eigenvalues and corresponding eigenvectors for the Jacobi and Gauss-Seidel Methods are shown in the following table.
| Method | Eigenvalues | Eigenvectors |
|---|---|---|
| Jacobi | λ = ±√( a12a21 / (a11a22) ) | ( a12, −λa11 ) for each eigenvalue λ |
| Gauss-Seidel | λ1 = 0 and λ2 = a12a21 / (a11a22) | ( 1, 0 ) for λ1 and ( a22, −a21 ) for λ2 |
Notice that, for both methods, the eigenvalues depend only on the ratio r = a12a21 / (a11a22): the Jacobi eigenvalues are ±√r, while the nonzero Gauss-Seidel eigenvalue is r itself. In particular, the magnitude of the Gauss-Seidel eigenvalue is the square of the magnitude of the Jacobi eigenvalues, which is why the Gauss-Seidel Method tends to converge about twice as fast (compare the iteration counts in the exercise table above).
For example, if the Jacobi eigenvalues are ±1/2, then the nonzero Gauss-Seidel eigenvalue is 1/4, so each Gauss-Seidel iteration shrinks the error by about as much as two Jacobi iterations.
Everything on this page relates to the case of a 2 × 2 coefficient matrix A, but the same ideas carry over to n × n systems, where the iteration matrix B has n eigenvalues.
A third iterative method, called the Successive Overrelaxation (SOR) Method, is a generalization of and improvement on the Gauss-Seidel Method. Here is the idea:
For any iterative method, in finding x(k+1) from x(k) we move a certain amount in a particular direction: the correction x(k+1) − x(k). If that direction takes us closer to the true solution, then moving somewhat further in the same direction may take us closer still. The SOR Method does exactly this, scaling the Gauss-Seidel correction by a factor ω.
Here is how we derive the SOR Method from the Gauss-Seidel Method. First, notice that we can write the Gauss-Seidel equation as

D x(k+1) = b − L x(k+1) − U x(k),

so that

x(k+1) = D⁻¹( b − L x(k+1) − U x(k) ).
We can subtract x(k) from both sides to get

x(k+1) − x(k) = D⁻¹( b − L x(k+1) − U x(k) ) − x(k).
Now think of this as the Gauss-Seidel correction

Δx(k) = x(k+1) − x(k).

The SOR Method moves ω times as far in the same direction:

x(k+1) = x(k) + ω Δx(k),

where, as we just found,

Δx(k) = D⁻¹( b − L x(k+1) − U x(k) ) − x(k),

and where generally 1 < ω < 2 (the choice ω = 1 gives back the Gauss-Seidel Method).
Written out in detail, the SOR Method is

x(k+1) = (1 − ω) x(k) + ω D⁻¹( b − L x(k+1) − U x(k) ).
We can multiply both sides by matrix D and divide both sides by ω to rewrite this as

(D/ω) x(k+1) = ((1 − ω)/ω) D x(k) + b − L x(k+1) − U x(k),

then collect the x(k+1) terms on the left side and multiply through by ω:

(D + ωL) x(k+1) = ( (1 − ω) D − ω U ) x(k) + ω b.

When we solve for x(k+1), we get

x(k+1) = (D + ωL)⁻¹( (1 − ω) D − ω U ) x(k) + ω (D + ωL)⁻¹ b.
Notice that the SOR Method is also of the form

x(k+1) = B x(k) + c,

with Bω = (D + ωL)⁻¹( (1 − ω) D − ω U ), so optimal convergence is achieved by choosing the value of ω that minimizes the largest magnitude of the eigenvalues of Bω.
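For reference, in this setting there is a classical closed-form answer, due to Young, which we state without proof; it assumes the Jacobi iteration matrix has real eigenvalues ±μ with |μ| < 1:

```latex
% Optimal SOR parameter (Young's result), assuming the Jacobi
% iteration matrix has real eigenvalues \pm\mu with |\mu| < 1:
\omega_{\mathrm{opt}} = \frac{2}{1 + \sqrt{1 - \mu^2}} ,
% which always lies in the interval 1 \le \omega_{\mathrm{opt}} < 2.
```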
As we did earlier for the Jacobi and Gauss-Seidel Methods, we can find the eigenvalues and eigenvectors of the SOR iteration matrix Bω; they are now functions of ω as well as of the elements of A.
In practice, we typically use a computer to perform the iterations, so we need implementable algorithms in order to use these methods for general n × n systems.
| Method | Algorithm for performing iteration k + 1 (for i = 1 to n) |
|---|---|
| Jacobi | xi(k+1) = ( bi − Σj≠i aij xj(k) ) / aii |
| Gauss-Seidel | xi(k+1) = ( bi − Σj<i aij xj(k+1) − Σj>i aij xj(k) ) / aii |
| SOR | xi(k+1) = (1 − ω) xi(k) + ω ( bi − Σj<i aij xj(k+1) − Σj>i aij xj(k) ) / aii |
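As a minimal Java sketch of the SOR row in this table (our illustration, not the applet's source; the test system is the one from Example 1, and the value ω = 1.1 is ours, not a claimed optimum):

```java
// A minimal sketch of the SOR Method (not the applet's source code).
// With omega = 1.0 this reduces exactly to the Gauss-Seidel Method.
public class SorSketch {

    // One SOR step, in place: each component becomes the weighted average
    // (1 - omega) * (old value) + omega * (Gauss-Seidel value).
    static void sorStep(double[][] A, double[] b, double[] x, double omega) {
        int n = b.length;
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                if (j != i) sum += A[i][j] * x[j]; // new for j < i, old for j > i
            }
            double gs = (b[i] - sum) / A[i][i];    // the Gauss-Seidel value
            x[i] = (1 - omega) * x[i] + omega * gs;
        }
    }

    public static void main(String[] args) {
        // Ten SOR iterations on the 3 x 3 system of Example 1, with an
        // illustrative omega = 1.1.
        double[][] A = {{4, -1, -1}, {-2, 6, 1}, {-1, 1, 7}};
        double[] b = {3, 9, -6};
        double[] x = {0, 0, 0};
        for (int k = 0; k < 10; k++) sorStep(A, b, x, 1.1);
        System.out.printf("x(10) = (%.6f, %.6f, %.6f)%n", x[0], x[1], x[2]);
    }
}
```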
Use the applet to compute five iterations with both the Jacobi and Gauss-Seidel Methods, using initial value x(0) = (−0.5, 0.5), and record your results. For each method, complete a table like the one below, which shows the first set of entries for Jacobi's Method. To better see the results, click the Default Axes button and then Zoom out a few times.
| k | x1(k) | x2(k) | ||error(k)|| |
|---|---|---|---|
| 1 | 0.000000 | 3.000000 | 3.162278 |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| 5 | | | |
and use the applet to compute five iterations, again making a table for each method. This time the approximations produced by the iterations should converge. Find the eigenvalues of the new iteration matrix B for each method, and use this information to explain why the approximations converge in this case. To better see the results, click the Default Axes button and then Zoom out a few times.
It can be shown that if a matrix A is strictly diagonally dominant, then the eigenvalues of the iteration matrices B corresponding to the Jacobi and Gauss-Seidel Methods are of magnitude < 1 (which guarantees convergence, as you found in Exercise 7).
and verify that the eigenvalues for this matrix are
Compute the errors
Consider solving each using Jacobi’s Method. Since the coefficient matrices A are different, the iteration matrices B in each case are different, and thus the convergence properties of Jacobi's Method in each case are different.
We will consider the convergence rate of Jacobi’s Method for solving
Recall that
| k | ||e(k)|| | ||e(k)|| / ||e(k−1)|| |
|---|---|---|
| 0 | 1.414214 | − |
| 1 | 0.942809 | 0.666666 |
| 2 | | |
| 3 | | |
| 4 | | |
In general, what would you expect the ratio ||e(k)|| / ||e(k−1)|| to be?
What conditions on a and b would guarantee convergence (that is, what would guarantee that ||e(k)|| → 0 as k → ∞)?
Do Exercise 14 with the Gauss-Seidel Method instead of Jacobi’s Method.
| ω | ||error(10)|| |
|---|---|
| 1.1 | 0.000664 |
| 1.2 | |
| 1.3 | |
| 1.4 | |
| 1.5 | |
| 1.6 | |
| 1.7 | |
| 1.8 | |
| 1.9 | |
One could use the quadratic formula to find the two eigenvalues (as functions of ω) that satisfy the characteristic equation det( Bω − λI ) = 0.
We can simplify this last equation to
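For a 2 × 2 system, the simplified equation takes a classical form relating the SOR eigenvalues λ to the Jacobi eigenvalues μ; we record it here as a sketch (Young's identity, stated without proof):

```latex
% Relation between the SOR eigenvalues \lambda and the Jacobi
% eigenvalues \mu for a 2 x 2 system (Young's identity):
(\lambda + \omega - 1)^2 = \lambda\,\omega^2 \mu^2 ,
% which expands to the quadratic in \lambda:
\lambda^2 + \bigl( 2(\omega - 1) - \omega^2\mu^2 \bigr)\lambda + (\omega - 1)^2 = 0 .
```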