Iterative Methods for Solving Ax = b

Author(s): 
David M. Strong

Information for Instructors

This module comprises two tutorials with associated exercises and a Java applet to do the necessary computations and display the results graphically.  We explore three different iterative methods, that is, methods that are intended to generate successive approximations to the exact solution of a linear system of equations.

David Strong is Assistant Professor of Mathematics at Pepperdine University.

Iterative methods are important in solving many of today’s real world problems, so it is important that your first exposure to these methods be as positive and simple, as well as informative and educational, as possible. With this in mind, the main purpose of the tutorials and applet is to allow easy visualization and experimentation.

In more detail, here are the primary features of the applet:

  • The applet allows for clear and simple visualization of what each iterative method is doing, in particular, how properties of the matrix affect the convergence (or non-convergence) of each method. Repeatedly zooming in on the converging approximations will help you see that iterative methods don't normally find a solution exactly -- rather, each iteration gives you a better approximation, and you have to decide how good is good enough.

  • The applet allows you to do as many iterations as you want, easily and quickly, without the practically impossible burden of doing more than a few iterations by hand with a calculator. Many of the exercises take advantage of this, allowing you to explore a variety of basic ideas and considerations arising in the iterative solution of linear systems that would otherwise be impractical to investigate.

  • The applet allows for a simple comparison of three iterative methods, including a comparison of their rates of convergence.

  • The applet is designed for easy use and is free. In particular, there is virtually no learning curve and there is no software, calculator, etc. for you to purchase. The only requirement is a computer with an Internet connection and the Java Plug-in, which is free and easy to install.

You will see that the applet deals only with 2 x 2 systems, i.e., two linear equations in two unknowns. In the real world, we would never use an iterative method to solve such a small system, or even to solve substantially larger (e.g., 64 x 64, 512 x 512, etc.) systems. In my own image processing research I typically deal with (sparse) systems of 65536 (256²) or 262144 (512²) equations in the same number of variables. We generally don't use any of the three methods (Jacobi, Gauss-Seidel, SOR) discussed in the module, although some of their basic ideas are found in any iterative method. Still, these methods are very straightforward, which makes them relatively easy to understand, and that is why they are your first taste of iterative methods for solving linear systems.

Furthermore, the 2 x 2 case (as opposed to 3 x 3 or larger systems) makes it easier to visualize the results of these methods, which I think you will find valuable. With this said, the tutorial and its experimentation and visualization are good introductions to iterative methods for solving Ax = b. However, you should consider further treatment of these topics in a numerical analysis or numerical linear algebra course. In particular, you will learn in such a course that some of the simple and elegant results seen in the 2 x 2 case are no longer true when dealing with larger systems.

Published July, 2005
© David M. Strong 2003 - 2005
All Rights Reserved

 


Iterative Methods for Solving Ax = b - Introduction to the Iterative Methods

One of the most important problems in mathematics is to find the values of the n unknowns x1, x2, ..., xn that satisfy the system of n equations

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}
\]

That is, we want to solve for x in Ax = b, where

\[
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},
\qquad
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},
\qquad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}.
\]

A is commonly referred to as the coefficient matrix. In order to save space, we usually write column vectors in coordinate form, x = (x1, x2, …, xn), and we will follow that practice in these tutorials and exercises.

Suppose we knew the true values of x2, x3, … , xn . Then we could use these values, along with the first equation (or any other equation where the x1 coefficient is nonzero), to find the true value of x1. We usually won’t have the true values of x2, x3, … , xn , but suppose we had pretty good approximations (or guesses) for their values. Then we could hope to use these values to find a pretty good approximation for x1. Similarly, we probably could find a pretty good approximation for any xi if we had good approximations for the values of x1, … , xi-1, xi+1, … , xn. The idea emerging here is that, if we currently have approximations for the values of x = (x1, x2, …, xn), then we can use these current values to find new (and hopefully better) approximations for x = (x1, x2, …, xn).

The fundamental idea of an iterative method is to use x_current, a current approximation (or guess) for the true solution x of Ax = b, to find a new approximation x_new that is closer to x than x_current is. We then use this new approximation as the current approximation to find yet another, better approximation.

Iterate means repeat, so an iterative method repeats this process over and over, each time using the current approximation to produce a better approximation for the true solution, until the current approximation is sufficiently close to the true solution -- or until you realize that the sequence of approximations resulting from these iterations is not converging to the true solution.

So given an initial guess or approximation x(0) for the true solution x, we use x(0) to find (in a manner we'll discuss shortly) a new approximation x(1), then we use x(1) to find another, better approximation x(2), and so on. In general, we use x(k) to find a new approximation x(k+1). We expect that x(k) → x as k → ∞; that is, our approximations should become closer to the true solution as we take more iterations of this process. Of course, we hope that our current approximation x(k) is close enough to x for our needs after a relatively small number of iterations.

Since we don't actually have the true solution x (if we did, why would we be trying to find it?), we can't check directly how close our current approximation x(k) is to x. One common way to check instead is to see how close Ax(k) is to Ax -- that is, how close Ax(k) is to b.

Another way to check the accuracy of our current approximation is by looking at the differences in successive approximations, x(k) − x(k-1). We generally expect x(k) to be close to x if x(k) − x(k-1) is small.
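To make these two stopping tests concrete, here is a minimal sketch in Python with NumPy (the module itself uses the Java applet; this code and its names are illustrative assumptions, not part of the module):

    import numpy as np

    def iterate(update, A, b, x0, tol=1e-6, max_iters=500):
        """Apply a generic update x(k) -> x(k+1) until either stopping
        test is met: small residual ||Ax - b|| or small step ||x_new - x||."""
        x = x0.astype(float)
        for k in range(1, max_iters + 1):
            x_new = update(A, b, x)
            residual = np.linalg.norm(A @ x_new - b)  # how close A x(k) is to b
            step = np.linalg.norm(x_new - x)          # ||x(k) - x(k-1)||
            x = x_new
            if residual < tol or step < tol:
                return x, k
        return x, max_iters

Here `update` stands for whatever method (Jacobi, Gauss-Seidel, SOR) produces the next approximation from the current one.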

This idea of iteration is not unique to linear algebra. An example that may be familiar to you is Newton's Method, which finds approximations to a root x of f(x) = 0. Given a current approximation x(k) to x, Newton's Method finds x(k+1) by

\[ x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}. \]

Under the right conditions, x(k) → x [that is, f(x(k)) → 0] as k → ∞. Interestingly, the underlying theory of Newton's Method is actually found in certain iterative methods that solve our problem Ax = b. Those methods are discussed in numerical linear algebra courses.
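As a quick illustration of the analogy, here is a minimal Newton's Method sketch in Python (the function names are ours, not the module's):

    def newton(f, fprime, x0, tol=1e-12, max_iters=50):
        """Newton's Method: x(k+1) = x(k) - f(x(k)) / f'(x(k))."""
        x = x0
        for _ in range(max_iters):
            x_new = x - f(x) / fprime(x)
            if abs(x_new - x) < tol:  # successive approximations agree
                return x_new
            x = x_new
        return x

    # Example: the positive root of f(x) = x**2 - 2 is sqrt(2) = 1.41421356...
    print(newton(lambda x: x**2 - 2, lambda x: 2*x, x0=1.0))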

Iterative Methods for Solving Ax = b - Information on the Java Applet

How to access and run the applet:
  • You can access the applet here. There are also links to it in the problems for which it is used.

  • The Java Plug-in (version 1.4 or higher) is required to use the applet. If you don’t have this plug-in, visit Sun Microsystems' Java Download page, and you should get it automatically -- just follow the prompts and steps.

The applet is designed for solving a 2 x 2 system of equations using the Jacobi, Gauss-Seidel and/or SOR Methods.

How to use the applet:

  • Enter the values for the matrix A, the vector b, and the initial guess x(0).

  • Under Methods to use, click the Jacobi, Gauss-Seidel, and/or SOR boxes to use the desired iterative method(s). If you select the SOR Method, you should also select a value for ω.

  • To iterate the selected method(s), under Do iteration using, click the Initial guess button to do the first iteration using x(0), and then click the Current approximation button to do subsequent iterations using the most recent approximation from each selected method. You can select or unselect any method at any time. The applet stores the results of the iterations for each method even if they are not currently being displayed.

  • The applet’s table of data will give you
    • the current approximation x(k) = (x1(k), x2(k)),
    • the current error error(k) = (x1true − x1(k), x2true − x2(k)), and
    • the norm of the current error ||error(k)||.
  • Notice the color coordination of the applet. For example, everything related to the True solution is in green, everything related to the Gauss-Seidel Method is in purple, and so on.

  • When changing the minimum and maximum x1 and x2 values (in the bottom left part of the applet) for the graphing window, you must press Enter to make those changes take effect. You can also use the zoom buttons or set your own viewing window by pressing and dragging the left mouse button on the graph (experiment to see this feature).

  • The applet is designed to be very robust, including checking for bad input. However, like any software or other technology, it is not completely fail-proof. For example, working with extremely small or large numbers could result in inaccurate results.

Iterative Methods for Solving Ax = b - Jacobi's Method


Perhaps the simplest iterative method for solving Ax = b is Jacobi’s Method. Note that the simplicity of this method is both good and bad: good, because it is relatively easy to understand and thus is a good first taste of iterative methods; bad, because it is not typically used in practice (although its potential usefulness has been reconsidered with the advent of parallel computing). Still, it is a good starting point for learning about more useful, but more complicated, iterative methods.

Given a current approximation

x(k) = (x1(k), x2(k), x3(k), …, xn(k))

for x, the strategy of Jacobi's Method is to use the first equation and the current values of x2(k), x3(k), …, xn(k) to find a new value x1(k+1), and similarly to find a new value xi(k+1) using the ith equation and the old values of the other variables. That is, given current values x(k) = (x1(k), x2(k), …, xn(k)), find new values by solving for

x(k+1) = (x1(k+1), x2(k+1), …, xn(k+1))

in

\[
\begin{aligned}
a_{11}x_1^{(k+1)} &= b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)} - \cdots - a_{1n}x_n^{(k)} \\
a_{22}x_2^{(k+1)} &= b_2 - a_{21}x_1^{(k)} - a_{23}x_3^{(k)} - \cdots - a_{2n}x_n^{(k)} \\
&\;\;\vdots \\
a_{nn}x_n^{(k+1)} &= b_n - a_{n1}x_1^{(k)} - a_{n2}x_2^{(k)} - \cdots - a_{n,n-1}x_{n-1}^{(k)}.
\end{aligned}
\]

This system can also be written as

\[
x_i^{(k+1)} = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j \ne i} a_{ij}\,x_j^{(k)}\Bigr), \qquad i = 1, 2, \ldots, n.
\]

To be clear, the subscript i means that xi(k) is the ith element of vector

x(k) = (x1(k), x2(k), …, xi(k), …, xn(k) ),

and superscript k corresponds to the particular iteration (not the kth power of xi).

If we write D, L, and U for the diagonal, strict lower triangular, and strict upper triangular parts of A, respectively, so that A = L + D + U,

then Jacobi's Method can be written in matrix-vector notation as

\[ Dx^{(k+1)} = b - (L + U)\,x^{(k)}, \]

so that

\[ x^{(k+1)} = D^{-1}\bigl(b - (L + U)\,x^{(k)}\bigr). \]

Example 1

Let's apply Jacobi's Method to the system

\[
\begin{aligned}
4x_1 - x_2 - x_3 &= 3 \\
-2x_1 + 6x_2 + x_3 &= 9 \\
-x_1 + x_2 + 7x_3 &= -6.
\end{aligned}
\]

At each step, given the current values x1(k), x2(k), x3(k), we solve for x1(k+1), x2(k+1), and x3(k+1) in

\[
\begin{aligned}
4x_1^{(k+1)} &= 3 + x_2^{(k)} + x_3^{(k)} \\
6x_2^{(k+1)} &= 9 + 2x_1^{(k)} - x_3^{(k)} \\
7x_3^{(k+1)} &= -6 + x_1^{(k)} - x_2^{(k)}.
\end{aligned}
\]

So, if our initial guess x(0) = (x1(0), x2(0), x3(0)) is the zero vector 0 = (0, 0, 0) -- a common initial guess unless we have some additional information that leads us to choose some other -- then we find x(1) = (x1(1), x2(1), x3(1)) by solving

\[
\begin{aligned}
4x_1^{(1)} &= 3 + 0 + 0 = 3 \\
6x_2^{(1)} &= 9 + 2(0) - 0 = 9 \\
7x_3^{(1)} &= -6 + 0 - 0 = -6.
\end{aligned}
\]

So x(1) = (x1(1), x2(1), x3(1)) = (3/4, 9/6, −6/7) ≈ (0.750, 1.500, −0.857). We iterate this process to find a sequence of increasingly better approximations x(0), x(1), x(2), … . We show the results in the table below, with all values rounded to 3 decimal places.

We are interested in the error e at each iteration between the true solution x and the approximation x(k): e(k) = x − x(k). Obviously, we don't usually know the true solution x. However, to better understand the behavior of an iterative method, it is enlightening to use the method to solve a system Ax = b for which we do know the true solution and analyze how quickly the approximations are converging to the true solution. For this example, the true solution is x = (1, 2, −1).

The norm of a vector ||x|| tells us how big the vector is as a whole (as opposed to how large each element of the vector is). The vector norm most commonly used in linear algebra is the l2 norm:

\[ \|x\|_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}. \]

For example, if x = (1, 2, −1), then

\[ \|x\|_2 = \sqrt{1^2 + 2^2 + (-1)^2} = \sqrt{6} \approx 2.449. \]

In this module, we will always use the l2 norm (including for matrix norms in subsequent tutorials), so that || · || always signifies || · ||2.

For our purposes, we observe that ||x|| will be small exactly when each of the elements x1, x2, …, xn in x = (x1, x2, …, xn ) is small. In the following table, the norm of the error becomes progressively smaller as the error in each of the three elements x1, x2, x3 becomes smaller, or in other words, as the approximations become progressively better.

k | x(k) | x(k) − x(k-1) | e(k) = x − x(k) | ||e(k)||
0 | (0.000, 0.000, 0.000) |  | (1.000, 2.000, -1.000) | 2.449
1 | (0.750, 1.500, -0.857) | (0.750, 1.500, -0.857) | (0.250, 0.500, -0.143) | 0.577
2 | (0.911, 1.893, -0.964) | (0.161, 0.393, -0.107) | (0.089, 0.107, -0.036) | 0.144
3 | (0.982, 1.964, -0.997) | (0.071, 0.071, -0.033) | (0.018, 0.036, -0.003) | 0.040
4 | (0.992, 1.994, -0.997) | (0.010, 0.029, 0.000) | (0.008, 0.006, -0.003) | 0.011
5 | (0.999, 1.997, -1.000) | (0.007, 0.003, -0.003) | (0.001, 0.003, 0.000) | 0.003
6 | (0.999, 2.000, -1.000) | (0.000, 0.003, 0.001) | (0.000, 0.001, 0.000) | 0.001
7 | (1.000, 2.000, -1.000) | (0.001, 0.000, 0.000) | (0.000, 0.000, 0.000) | 0.000
8 | (1.000, 2.000, -1.000) | (0.000, 0.000, 0.000) | (0.000, 0.000, 0.000) | 0.000

 

For this example, we stop iterating after all three ways of measuring the current error,

x(k) − x(k-1), e(k), and ||e(k)||,

equal 0 to three decimal places. In practice, you would normally choose a single measurement of error to determine when to stop.
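For readers who would like to reproduce the table, here is a minimal Jacobi sketch in Python with NumPy, applied to the system of Example 1 (the code is ours, not the applet's):

    import numpy as np

    def jacobi(A, b, x0, num_iters):
        """Jacobi's Method in matrix form: x(k+1) = D^{-1} (b - (L + U) x(k))."""
        D = np.diag(np.diag(A))   # diagonal part of A
        LU = A - D                # strict lower plus strict upper parts, L + U
        x = x0.astype(float)
        for _ in range(num_iters):
            x = np.linalg.solve(D, b - LU @ x)
        return x

    A = np.array([[ 4.0, -1.0, -1.0],
                  [-2.0,  6.0,  1.0],
                  [-1.0,  1.0,  7.0]])
    b = np.array([3.0, 9.0, -6.0])
    print(jacobi(A, b, np.zeros(3), 8))   # approaches the true solution (1, 2, -1)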

Iterative Methods for Solving Ax = b - Gauss-Seidel Method

Let us take Jacobi's Method one step further. Where the true solution is x = (x1, x2, …, xn), if x1(k+1) is a better approximation to the true value of x1 than x1(k) is, then it makes sense to use the new value x1(k+1) (rather than the old value x1(k)) in finding x2(k+1), …, xn(k+1). So x1(k+1) is found as in Jacobi's Method, but in finding x2(k+1) we use the new value x1(k+1) along with the old values x3(k), …, xn(k), and similarly for finding x3(k+1), …, xn(k+1). This technique is called the Gauss-Seidel Method -- even though, as noted by Gil Strang in his Introduction to Applied Mathematics, Gauss didn't know about it and Seidel didn't recommend it. It is described by

\[
\begin{aligned}
a_{11}x_1^{(k+1)} &= b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)} - \cdots - a_{1n}x_n^{(k)} \\
a_{22}x_2^{(k+1)} &= b_2 - a_{21}x_1^{(k+1)} - a_{23}x_3^{(k)} - \cdots - a_{2n}x_n^{(k)} \\
&\;\;\vdots \\
a_{nn}x_n^{(k+1)} &= b_n - a_{n1}x_1^{(k+1)} - a_{n2}x_2^{(k+1)} - \cdots - a_{n,n-1}x_{n-1}^{(k+1)}.
\end{aligned}
\]

This can also be written

\[
x_i^{(k+1)} = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j < i} a_{ij}\,x_j^{(k+1)} - \sum_{j > i} a_{ij}\,x_j^{(k)}\Bigr), \qquad i = 1, 2, \ldots, n.
\]

That is,

\[ (D + L)\,x^{(k+1)} = b - Ux^{(k)}, \]

so that

\[ x^{(k+1)} = (D + L)^{-1}\bigl(b - Ux^{(k)}\bigr). \]

Example 2

Let's apply the Gauss-Seidel Method to the system from Example 1:

\[
\begin{aligned}
4x_1 - x_2 - x_3 &= 3 \\
-2x_1 + 6x_2 + x_3 &= 9 \\
-x_1 + x_2 + 7x_3 &= -6.
\end{aligned}
\]

At each step, given the current values x1(k), x2(k), x3(k), we solve for x1(k+1), x2(k+1), x3(k+1) in

\[
\begin{aligned}
4x_1^{(k+1)} &= 3 + x_2^{(k)} + x_3^{(k)} \\
6x_2^{(k+1)} &= 9 + 2x_1^{(k+1)} - x_3^{(k)} \\
7x_3^{(k+1)} &= -6 + x_1^{(k+1)} - x_2^{(k+1)}.
\end{aligned}
\]

To compare our results from the two methods, we again choose x(0) = (0, 0, 0). We then find x(1) = (x1(1), x2(1), x3(1)) by solving

\[
\begin{aligned}
4x_1^{(1)} &= 3 + 0 + 0 \\
6x_2^{(1)} &= 9 + 2x_1^{(1)} - 0 \\
7x_3^{(1)} &= -6 + x_1^{(1)} - x_2^{(1)}.
\end{aligned}
\]

Let us be clear about how we solve this system. We first solve for x1(1) in the first equation and find that

x1(1) = 3/4 = 0.750.

We then solve for x2(1) in the second equation, using the new value of x1(1) = 0.750, and find that

x2(1) = [9 + 2(0.750)] / 6 = 1.750.

Finally, we solve for x3(1) in the third equation, using the new values of x1(1) = 0.750 and x2(1) = 1.750, and find that

x3(1) = [-6 + 0.750 − 1.750] / 7 = − 1.000.

The result of this first iteration of the Gauss-Seidel Method is

x(1) = (x1(1), x2(1), x3(1)) = (0.750, 1.750, − 1.000).

We iterate this process to generate a sequence of increasingly better approximations x(0), x(1), x(2), … and find results similar to those that we found for Example 1.

k | x(k) | x(k) − x(k-1) | e(k) = x − x(k) | ||e(k)||
0 | (0.000, 0.000, 0.000) |  | (1.000, 2.000, -1.000) | 2.449
1 | (0.750, 1.750, -1.000) | (0.750, 1.750, -1.000) | (0.250, 0.250, 0.000) | 0.354
2 | (0.938, 1.979, -1.006) | (0.188, 0.229, -0.006) | (0.063, 0.021, 0.006) | 0.066
3 | (0.993, 1.999, -1.001) | (0.056, 0.020, 0.005) | (0.007, 0.001, 0.001) | 0.007
4 | (0.999, 2.000, -1.000) | (0.006, 0.001, 0.001) | (0.001, 0.000, 0.000) | 0.001
5 | (1.000, 2.000, -1.000) | (0.000, 0.000, 0.000) | (0.000, 0.000, 0.000) | 0.000

As in Example 1, we stop iterating after x(k) − x(k-1), e(k), and ||e(k)|| are all 0 to three decimal places. Notice that this sequence of iterations converges to the true solution (1, 2, −1) much more quickly than we found in Example 1 using the Jacobi Method. This is generally expected, since the Gauss-Seidel Method uses each new value as soon as it is found, rather than waiting until the subsequent iteration, as is done with the Jacobi Method.
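A matching Gauss-Seidel sketch differs from the Jacobi code in only one essential way: each component is overwritten as soon as it is computed, so later components in the same sweep already use the new values (again a sketch in Python, not the applet's code):

    import numpy as np

    def gauss_seidel(A, b, x0, num_iters):
        """Gauss-Seidel: update x in place so new values are used immediately."""
        n = len(b)
        x = x0.astype(float)
        for _ in range(num_iters):
            for i in range(n):
                s = sum(A[i, j] * x[j] for j in range(n) if j != i)
                x[i] = (b[i] - s) / A[i, i]   # x[j] is already new for j < i
        return x

    A = np.array([[ 4.0, -1.0, -1.0],
                  [-2.0,  6.0,  1.0],
                  [-1.0,  1.0,  7.0]])
    b = np.array([3.0, 9.0, -6.0])
    print(gauss_seidel(A, b, np.zeros(3), 5))   # already near (1, 2, -1)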

Iterative Methods for Solving Ax = b - Exercises, Part 1: Jacobi and Gauss-Seidel Methods

  1. Solve the system

    using Jacobi’s Method with the following details:
    a. Using x(0) = (0, 0), complete a table like the one below, doing five iterations. Compute the first two iterations x(1) and x(2) by hand (show your work!), and use the applet to perform the next three iterations.

      k | x(k) | x(true) − x(k) | ||error(k)||
      1 | (4.000000, 1.250000) | (1.000000, 0.750000) | 1.250000
      2 |  |  | 
      3 |  |  | 
      4 |  |  | 
      5 |  |  | 

    b. Do more iterations (don't write down the details of the results -- that is, don't make a table for these next iterations) until the applet shows ||error(k)|| = 0.000000. How many iterations are required?


  2. Repeat Exercise 1 with the system

     .

  3. Repeat Exercise 1, using the Gauss-Seidel Method.

  4. Repeat Exercise 2, using the Gauss-Seidel Method.

  5.  (A)  
       
     (B) 
    Systems (A) and (B) above have the same solution. Use both Jacobi's Method and the Gauss-Seidel Method, with three different initial guesses, to solve both systems. Use the applet to do all of the work. For this problem we're interested only in how many iterations are required to get ||error(k)|| = 0. (The error is not exactly 0, but it rounds to 0 to six digits after the decimal.) Complete a table like the one shown here, in which the first and last results are given. Then answer the questions in a. and b. following the table.

    System | Initial guess | Jacobi | GS | Jacobi / GS
    A | (0, 0) | 13 | 7 | 13 / 7 ≈ 1.857
    A | (100, −50) |  |  | 
    A | (−500, 1000) |  |  | 
    B | (0, 0) |  |  | 
    B | (100, −50) |  |  | 
    B | (−500, 1000) | 156 | 79 | 156 / 79 ≈ 1.975

    (The Jacobi and GS columns record the number of iterations until ||error(k)|| = 0.000000.)

    a. Based on the results in the table, comment on which feature seems to have more effect on how quickly we get a good approximation (i.e., how quickly the error goes to 0): the initial guess or the system of equations itself (the matrix A for that system).
    b. Based on the results in the table, for 2 x 2 systems, what is the approximate relationship between the number of iterations required for the Jacobi Method and the number required for the Gauss-Seidel Method to obtain approximately the same approximation (that is, the same degree of accuracy)? Suppose you were told that for a certain 2 x 2 system of equations and a certain initial guess, the Jacobi Method required 40 iterations to get to ||error(k)|| = 0. About how many iterations would the Gauss-Seidel Method require to get approximately the same results?

  6. We expect that an iterative method, such as Jacobi or Gauss-Seidel, will produce a sequence of approximations that get closer and closer to the true solution. In this problem we consider the question of whether we ever reach the true solution exactly. Use Jacobi’s Method to solve the system

     .

    Since the true solution is x = (1, 1), let us center the viewing window around that point by changing the minimum and maximum boundaries for both x1 and x2 to −4 and 6 (bottom left part of the applet -- be sure to press Enter after entering the new values). For this problem, use an initial guess of x(0) = (4, −3). Also, for this problem do not write down the results of your iterations.

    a. Do 10 iterations. On the graph in the applet, does it appear that the approximations have already reached the true solution? Now zoom in about 10 times by clicking on the Zoom in button, and answer the same question.
    b. Do 10 more iterations, for a total of 20, and answer the same question as in (a). As in (a), zoom in about 10 more times and answer the same question again.
    c. Does it appear that we will ever reach the solution exactly? Although it would be nice to have the true solution exactly, is an approximation actually good enough? (Note: if you attempt to continue to iterate and zoom in, you will eventually, perhaps quickly, exhaust the precision of your computer, and it may produce strange results -- your computer can zoom in only so far.)

Iterative Methods for Solving Ax = b - Convergence Analysis of Iterative Methods

On this page and the next, we attempt to answer two questions regarding the Jacobi and Gauss-Seidel Methods:
  1. When will each of these methods work? That is, under what conditions will they produce a sequence of approximations x(0), x(1), x(2), … that converges to the true solution x?

  2. When they do work, how quickly will the approximations approach the true solution? That is, what will the rate of convergence be?
In general, an iterative method that finds the solution to Ax = b takes the form

\[ Mx^{(k+1)} = Nx^{(k)} + b, \]

so that

\[ x^{(k+1)} = M^{-1}Nx^{(k)} + M^{-1}b, \]

which we can rewrite as

\[ x^{(k+1)} = Bx^{(k)} + c, \]

where

\[ B = M^{-1}N \quad \text{and} \quad c = M^{-1}b. \]

B is often referred to as the iteration matrix.

If the current approximation x(k) is, in fact, the exact solution x, then the iterative method should certainly produce a next iteration x(k+1) that is also the exact solution. That is, it should be true that x = Bx + c, so

\[ x = M^{-1}Nx + M^{-1}b \quad \Longrightarrow \quad Mx = Nx + b \quad \Longrightarrow \quad (M - N)\,x = b. \]

Of course, since the problem we are trying to solve is Ax = b, M and N must be chosen so that A = M − N.

On the other hand, choosing A = M − N does not necessarily guarantee that the iterative method will produce a sequence of vectors x(0), x(1), x(2), … that converges to the true solution x. Whether a particular method works depends on the iteration matrix B = M⁻¹N. In fact, in general, B completely determines the convergence (or not) of an iterative method. In particular, the initial guess generally has no effect on whether a particular method is convergent or on the rate of convergence. However, if the initial guess is far from the true solution, more iterations will be required to get an acceptable approximation than if the initial guess were closer to the true solution.

To understand the convergence properties of an iterative method

\[ x^{(k+1)} = Bx^{(k)} + c, \]

we subtract it from the equation

\[ x = Bx + c, \]

which gives us

\[ x - x^{(k+1)} = B\bigl(x - x^{(k)}\bigr). \]

That is, since the current error is e(k) = x − x(k),

\[ e^{(k)} = Be^{(k-1)} = B^2e^{(k-2)} = \cdots = B^ke^{(0)}. \]

To be clear, the superscript of B is the power of B, while the superscript of vector e (inside parentheses) is the iteration number to which this particular error corresponds.

Full appreciation of the significance of the relationship e(k) = B^k e(0) requires some familiarity with matrix norms. Just like the norm of a vector, the norm of a matrix ||B|| tells us the "size" of the matrix (not its dimensions). More precisely, for any given vector v, the norm of the matrix tells us how much bigger or smaller (in norm) Bv will be compared to v. In particular, it is always true that ||Bv|| ≤ ||B|| ||v||.

Consequently, since ||e(k)|| = ||B^k e(0)|| ≤ ||B||^k ||e(0)||, we see that ||e(k)|| → 0 (which is the same as e(k) → 0) if ||B|| < 1. As we show on the next page, there is a particular iteration matrix B for each of the Jacobi and Gauss-Seidel Methods. For each method, the smaller ||B|| is, the faster the error converges to 0 -- that is, the faster the approximation approaches the true solution. On the other hand, if ||B|| ≥ 1, the error will simply increase, and our approximations will move away from the true solution rather than toward it.

We end this section by noting that one condition sometimes encountered in practice that will guarantee that ||B|| < 1 is that matrix A is strictly diagonally dominant. This means that, for each row of A, the absolute value of the diagonal element is larger than the sum of the absolute values of the off-diagonal elements.
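The relationship e(k) = B^k e(0) is easy to check numerically. The following Python sketch (our code, not the applet's) builds B = M⁻¹N for the Jacobi splitting M = D, N = D − A of the Example 1 matrix, which is strictly diagonally dominant, and propagates the error of the zero initial guess:

    import numpy as np

    A = np.array([[ 4.0, -1.0, -1.0],
                  [-2.0,  6.0,  1.0],
                  [-1.0,  1.0,  7.0]])
    D = np.diag(np.diag(A))

    # Jacobi splitting A = M - N with M = D and N = D - A = -(L + U)
    M, N = D, D - A
    B = np.linalg.solve(M, N)                # iteration matrix B = M^{-1} N

    # Largest |eigenvalue| below 1 predicts convergence for this splitting
    print(max(abs(np.linalg.eigvals(B))))

    # e(k) = B^k e(0): apply B three times to the initial error (1, 2, -1)
    e0 = np.array([1.0, 2.0, -1.0])
    print(np.linalg.matrix_power(B, 3) @ e0)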

Iterative Methods for Solving Ax = b - Analysis of Jacobi and Gauss-Seidel Methods

We continue our analysis with only the 2 x 2 case, since the Java applet to be used for the exercises deals only with this case. As we noted on the preceding page, the Jacobi and Gauss-Seidel Methods are both of the form

\[ x^{(k+1)} = Bx^{(k)} + c, \]

so for a general 2 x 2 matrix

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \]

their iteration matrices are

Method | B
Jacobi | \( B = \begin{pmatrix} 0 & -a_{12}/a_{11} \\ -a_{21}/a_{22} & 0 \end{pmatrix} \)
GS | \( B = \begin{pmatrix} 0 & -a_{12}/a_{11} \\ 0 & a_{12}a_{21}/(a_{11}a_{22}) \end{pmatrix} \)
Notice that for both methods the diagonal elements of A must be non-zero: a11 ≠ 0 and a22 ≠ 0.

It turns out that, if an n x n iteration matrix B has a full set of n linearly independent eigenvectors, then ||B|| = |λmax|, where λmax is the eigenvalue of B of greatest magnitude. The 2 x 2 Jacobi and Gauss-Seidel iteration matrices always have two linearly independent eigenvectors, so each method is guaranteed to converge if all of the eigenvalues of B corresponding to that method are of magnitude < 1. This includes cases in which B has complex eigenvalues. For n x n systems, things are more complicated. More general cases for larger systems are discussed in more detail in any good numerical analysis or numerical linear algebra text.

We have now answered the first question posed on the preceding page, at least for 2 x 2 systems:

When will each of these methods work? That is, under what conditions will they produce a sequence of approximations x(0), x(1), x(2), … that converges to the true solution x?

Answer: When the eigenvalues of the corresponding iteration matrix are both less than 1 in magnitude.

Because ||e(k)|| ≤ ||B||^k ||e(0)||, the second question is also answered.

When the methods do work, how quickly will the approximations approach the true solution? That is, what will the rate of convergence be?

Answer: The rate will be the same as the rate at which ||B||^k converges to 0.

For example, if ||B|| = 0.5, then the size of the error e(k) = x − x(k) would be cut approximately in half by each additional iteration. That is, the rate of convergence would be 0.5.

||B|| is directly related to the magnitudes of the eigenvalues of B. Consequently, a major goal in designing an iterative method is that the corresponding iteration matrix B have eigenvalues that are as small (as close to 0) as possible.

The eigenvalues and corresponding eigenvectors of the iteration matrices for the Jacobi and Gauss-Seidel Methods are shown in the following table.

Method | Eigenvalues | Eigenvectors
Jacobi | \( \lambda_{1,2} = \pm\sqrt{\dfrac{a_{12}a_{21}}{a_{11}a_{22}}} \) | \( v_{1,2} = \left(1,\ \mp\sqrt{\dfrac{a_{11}a_{21}}{a_{12}a_{22}}}\right) \)
GS | \( \lambda_1 = 0, \quad \lambda_2 = \dfrac{a_{12}a_{21}}{a_{11}a_{22}} \) | \( v_1 = (1,\ 0), \quad v_2 = \left(-\dfrac{a_{22}}{a_{21}},\ 1\right) \)

Notice that, for both methods, ||B|| = |λmax| < 1 if |a12a21 / (a11a22)| < 1. Also notice that the magnitude of the non-zero eigenvalue for the Gauss-Seidel Method is the square of the magnitude of either of the two eigenvalues for the Jacobi Method. As a result, if B_Jacobi and B_GS are the iteration matrices of the 2 x 2 Jacobi and Gauss-Seidel Methods, respectively, then ||B_GS|| = ||B_Jacobi||^2.

For example, if ||B_Jacobi|| = 0.5, then ||B_GS|| = (0.5)^2 = 0.25. That is, if each iteration of the Jacobi Method cuts the error in half, then each iteration of the Gauss-Seidel Method cuts it to a quarter. Another way to look at this is that approximately twice as many Jacobi iterations are needed to achieve the same level of accuracy (in approximating the exact solution x) as Gauss-Seidel iterations.

Everything on this page relates to the case of 2 x 2 systems. As mentioned, for general n x n systems, things are generally different and certainly more complicated than for the 2 x 2 case. In fact, Jacobi's Method might converge while the Gauss-Seidel Method does not, or vice versa, and it's possible that neither method converges. This is especially true if the original matrix A is not symmetric or positive definite. Fortunately, many matrices that arise in real life applications are both symmetric and positive definite.
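The 2 x 2 relationships above are easy to verify numerically. A small Python sketch (our code, with an arbitrary sample matrix) builds both iteration matrices and prints their eigenvalues; note that the non-zero Gauss-Seidel eigenvalue is the square of the Jacobi ones:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])          # any 2 x 2 matrix with nonzero diagonal
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)                # strict lower triangular part
    U = np.triu(A, k=1)                 # strict upper triangular part

    B_jacobi = np.linalg.solve(D, -(L + U))
    B_gs = np.linalg.solve(D + L, -U)

    print(np.linalg.eigvals(B_jacobi))  # +0.5 and -0.5
    print(np.linalg.eigvals(B_gs))      # 0 and 0.25 = (0.5)^2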

Iterative Methods for Solving Ax = b - The SOR Method


A third iterative method, called the Successive Overrelaxation (SOR) Method, is a generalization of and improvement on the Gauss-Seidel Method. Here is the idea:

For any iterative method, in finding x(k+1) from x(k), we move a certain amount in a particular direction from x(k) to x(k+1). This direction is the vector x(k+1) − x(k), since x(k+1) = x(k) + (x(k+1) − x(k)). If we assume that the direction from x(k) to x(k+1) is taking us closer, but not all the way, to the true solution x, then it would make sense to move in the same direction x(k+1) − x(k), but farther along that direction.

Here is how we derive the SOR Method from the Gauss-Seidel Method. First, notice that we can write the Gauss-Seidel equation as

\[ Dx^{(k+1)} = b - Lx^{(k+1)} - Ux^{(k)}, \]

so that

\[ x^{(k+1)} = D^{-1}\bigl(b - Lx^{(k+1)} - Ux^{(k)}\bigr). \]

We can subtract x(k) from both sides to get

\[ x^{(k+1)} - x^{(k)} = D^{-1}\bigl(b - Lx^{(k+1)} - Ux^{(k)}\bigr) - x^{(k)}. \]

Now think of this as the Gauss-Seidel correction (x(k+1) − x(k))_GS. As suggested above, it turns out that convergence x(k) → x of the sequence of approximate solutions to the true solution is often faster if we go beyond the standard Gauss-Seidel correction. The idea of the SOR Method is to iterate

\[ x^{(k+1)} = x^{(k)} + \omega\,\bigl(x^{(k+1)} - x^{(k)}\bigr)_{GS}, \]

where, as we just found,

\[ \bigl(x^{(k+1)} - x^{(k)}\bigr)_{GS} = D^{-1}\bigl(b - Lx^{(k+1)} - Ux^{(k)}\bigr) - x^{(k)}, \]

and where generally 1 < ω < 2. Notice that if ω = 1, then this is the Gauss-Seidel Method.

Written out in detail, the SOR Method is

\[ x^{(k+1)} = x^{(k)} + \omega\Bigl(D^{-1}\bigl(b - Lx^{(k+1)} - Ux^{(k)}\bigr) - x^{(k)}\Bigr). \]

We can multiply both sides by matrix D and divide both sides by ω to rewrite this as

\[ \frac{1}{\omega}\,Dx^{(k+1)} = \frac{1}{\omega}\,Dx^{(k)} + b - Lx^{(k+1)} - Ux^{(k)} - Dx^{(k)}, \]

then collect the x(k+1) terms on the left-hand side to get

\[ \Bigl(\frac{1}{\omega}\,D + L\Bigr)x^{(k+1)} = \Bigl(\Bigl(\frac{1}{\omega} - 1\Bigr)D - U\Bigr)x^{(k)} + b. \]

When we solve for x(k+1), we get

\[ x^{(k+1)} = \Bigl(\frac{1}{\omega}\,D + L\Bigr)^{-1}\Bigl(\Bigl(\frac{1}{\omega} - 1\Bigr)D - U\Bigr)x^{(k)} + \Bigl(\frac{1}{\omega}\,D + L\Bigr)^{-1}b. \]

Notice that the SOR Method is also of the form x(k+1) = Bx(k) + c, so the general convergence analysis on page 6 also applies to the SOR Method, as does the more specific analysis on page 7 for the Jacobi and Gauss-Seidel Methods. The iteration matrix B that determines the convergence of the SOR Method is

\[ B_{SOR} = \Bigl(\frac{1}{\omega}\,D + L\Bigr)^{-1}\Bigl(\Bigl(\frac{1}{\omega} - 1\Bigr)D - U\Bigr), \]

so optimal convergence is achieved by choosing the value of ω that minimizes ||B_SOR||.
As we did earlier for the Jacobi and Gauss-Seidel Methods, we can find the eigenvalues and eigenvectors for the 2 x 2 SOR Method B matrix. However, because this is quite a bit more complicated, we do not derive these expressions here. Rather, we leave it as Exercise 18 (next page) for the ambitious student or the challenging instructor.

In practice, we typically use a computer to perform the iterations, so we need implementable algorithms in order to use these methods for n x n systems. We conclude by giving one possible set of algorithms for finding element xi(k+1) given x1(k), x2(k), … , xn(k). Although these particular algorithms are not quite optimally efficient, writing the algorithms this way makes more obvious the slight but important differences among the three methods.

Method | Algorithm for performing iteration k + 1 (for i = 1 to n)
Jacobi | \( x_i^{(k+1)} = \dfrac{1}{a_{ii}}\Bigl(b_i - \sum_{j \ne i} a_{ij}\,x_j^{(k)}\Bigr) \)
Gauss-Seidel | \( x_i^{(k+1)} = \dfrac{1}{a_{ii}}\Bigl(b_i - \sum_{j < i} a_{ij}\,x_j^{(k+1)} - \sum_{j > i} a_{ij}\,x_j^{(k)}\Bigr) \)
SOR | \( x_i^{(k+1)} = (1 - \omega)\,x_i^{(k)} + \dfrac{\omega}{a_{ii}}\Bigl(b_i - \sum_{j < i} a_{ij}\,x_j^{(k+1)} - \sum_{j > i} a_{ij}\,x_j^{(k)}\Bigr) \)
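Rendered as code, the three sweeps in the table differ only in a few characters. Here is one possible Python sketch (ours, not the applet's), written like the table for clarity rather than maximum efficiency; ω = 1 reduces the SOR sweep to a Gauss-Seidel sweep:

    import numpy as np

    def sor_sweep(A, b, x, omega=1.0):
        """One SOR sweep over i = 1..n; omega = 1 gives a Gauss-Seidel sweep."""
        n = len(b)
        for i in range(n):
            s = sum(A[i, j] * x[j] for j in range(n) if j != i)
            x_gs = (b[i] - s) / A[i, i]           # the Gauss-Seidel value
            x[i] = x[i] + omega * (x_gs - x[i])   # move omega times as far
        return x

    def jacobi_sweep(A, b, x):
        """One Jacobi sweep: every update uses only the old values of x."""
        x_old = x.copy()
        n = len(b)
        for i in range(n):
            s = sum(A[i, j] * x_old[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i, i]
        return x

    # Example: five SOR sweeps on the system of Example 1
    A = np.array([[ 4.0, -1.0, -1.0],
                  [-2.0,  6.0,  1.0],
                  [-1.0,  1.0,  7.0]])
    b = np.array([3.0, 9.0, -6.0])
    x = np.zeros(3)
    for _ in range(5):
        x = sor_sweep(A, b, x, omega=1.1)
    print(x)                                      # near (1, 2, -1)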

Iterative Methods for Solving Ax = b - Exercises, Part 2: All Methods

  7. Suppose that v1 and v2 are linearly independent eigenvectors of iteration matrix B, with corresponding eigenvalues λ1 and λ2, and suppose that the initial error can be written as a linear combination of these two vectors: e(0) = c1v1 + c2v2. Recall that the error after iteration k is e(k) = Be(k-1), so that

\[ e^{(1)} = Be^{(0)} = B(c_1v_1 + c_2v_2) = c_1\lambda_1 v_1 + c_2\lambda_2 v_2. \]

    a. Prove (by induction) that

\[ e^{(k)} = c_1\lambda_1^k\,v_1 + c_2\lambda_2^k\,v_2. \]

    b. Show that if |λ1| < 1 and |λ2| < 1, then ||e(k)|| → 0 as k → ∞. Use the fact that

\[ \|e^{(k)}\| = \|c_1\lambda_1^k v_1 + c_2\lambda_2^k v_2\| \le |c_1|\,|\lambda_1|^k\,\|v_1\| + |c_2|\,|\lambda_2|^k\,\|v_2\|. \]

      This shows that the initial error, and thus the initial guess, don't matter that much. What matters most are the eigenvalues of B, which tell you both whether the error will go to 0 and, if so, how quickly. More generally, what matters is the size of ||B||, which is directly related to the eigenvalues.

  8. In this problem, you will see that the order of the equations in a system can affect the performance of an iterative solution method.

    a. For the system of equations

      use the applet to compute five iterations with both the Jacobi and Gauss-Seidel Methods, using initial value x(0) = (-0.5, 0.5), and record your results. For each method, complete a table like the one below, which shows the first set of entries for Jacobi's Method. To better see the results, click the Default Axes button and then Zoom out a few times.

      k | x(k) | ||error(k)||
      1 | (0.000000, 3.000000) | 3.162278
      2 |  | 
      3 |  | 
      4 |  | 
      5 |  | 

      Also, find the eigenvalues of the iteration matrix B for each method, and use this information to explain why the approximations do not converge for either method.

    b. Now rearrange (swap) the two equations, so that the system has the form
       ,

      and use the applet to compute five iterations, again making a table for each method. This time the approximations produced by the iterations should converge. Find the eigenvalues of the new iteration matrix B for each method, and use this information to explain why the approximations converge in this case. To better see the results, click the Default Axes button and then Zoom in a few times.

  9. In this problem you will investigate how strict diagonal dominance of a matrix affects whether the Jacobi and/or Gauss-Seidel Methods work.

    a. Show that if a 2 x 2 coefficient matrix

      is strictly diagonally dominant, then the eigenvalues of the iteration matrices B corresponding to the Jacobi and Gauss-Seidel Methods are of magnitude < 1 (which guarantees convergence, as you found in Exercise 7).

    b. In Exercise 8 you have two coefficient matrices A, one for each ordering of the equations. Comment on how the diagonal dominance (or not) of each A relates to the convergence (or not) results you found in Exercise 8.

    c. Give a 2 x 2 system of equations in which the coefficient matrix A is strictly diagonally dominant, and compute the first five iterations to help verify that both methods produce approximations that converge to the exact solution. Make a table similar to the one in Exercise 8. It doesn't matter what you choose for b or for the initial guess x(0).

  10. Consider Jacobi's Method for solving the system Ax = b, where

\[ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \]

    a. Verify that the iteration matrix B corresponding to the Jacobi Method for this system is

\[ B = \begin{pmatrix} 0 & -1/2 \\ -1/2 & 0 \end{pmatrix}, \]

      and verify that the eigenvalues for this matrix are λ1 = 0.5 and λ2 = −0.5, with corresponding eigenvectors v1 = (1, −1) and v2 = (1, 1). (All you need to show is that Bv1 = λ1v1 and Bv2 = λ2v2.)

    b. Suppose that b = (0, 0), so that the true solution is x = (0, 0). If we choose x(0) = (5, −1), then e(0) = x − x(0) = (−5, 1). Verify that e(0) = c1v1 + c2v2, where c1 = −3 and c2 = −2, and where v1 = (1, −1) and v2 = (1, 1) are the eigenvectors of B.

    c. Use the fact that e(k) = c1λ1^k v1 + c2λ2^k v2 to show that

\[ e^{(k)} = -3\left(\tfrac{1}{2}\right)^k \begin{pmatrix} 1 \\ -1 \end{pmatrix} - 2\left(-\tfrac{1}{2}\right)^k \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]

      Compute the errors e(1), e(2), e(3), e(4), and e(5) in five iterations of the Jacobi Method starting from x(0) = (5, −1), as in (b). Check your work by using the applet to find the first five iterations when using Jacobi's Method for this system and initial guess, and sketch or print the applet's graph of the five approximations.

    d. Now suppose that b = (3, 0), so that the true solution is x = (2, −1). Suppose also that x(0) = (7, −2), so that we happen to have the same initial error, e(0) = x − x(0) = (−5, 1), as in (b) and (c). Use the applet to find the first five iterations and corresponding errors e(1), e(2), e(3), e(4), and e(5) for this new system. (The coefficient matrix A is the same, and thus the iteration matrix B is also the same -- only the right-hand side b is changed.) Again sketch or print the graph that shows the approximations. Compare your results here to your results in (c), both the actual errors e(1), e(2), e(3), e(4), and e(5) and the graph of the results.

    e. For this same coefficient matrix A, suppose that any b and x(0) are chosen so that the initial error is e(0) = x − x(0) = (−5, 1), as in (b)-(d). What would you predict that the errors e(1), e(2), e(3), e(4), and e(5) would be?

    f. Support your answer in (e) by finding any b and x(0) such that e(0) = (−5, 1) -- you choose b and x(0) from the infinite number of possibilities -- and then use the applet to compute the errors e(1), e(2), e(3), e(4), and e(5). Hint: First choose x -- which will give you b, since Ax = b -- and then choose x(0) so that e(0) = x − x(0), that is, x(0) = x − (−5, 1). Sketch or print the applet's graph of the five approximations, and compare this to your graphs from (c) and (d).

  11. Do Exercise 10 with the Gauss-Seidel Method instead of the Jacobi Method. You will need to find B and its eigenvalues and eigenvectors, as well as c1 and c2.

  12. Each of the following systems has the same solution.

    Consider solving each using Jacobi’s Method. Since the coefficient matrices A are different, the iteration matrices B in each case are different, and thus the convergence properties of Jacobi's Method in each case are different.

    a. Using x(0) = (4, 1) for each system, use the applet to do five iterations. (You'll probably need to Zoom out to see the results for the third system.) For all three systems, record the ||error|| after each iteration, and describe what the graph in the applet is showing. (You may sketch or print the applet's graph and comment on what you see.)

    b. For each of the three systems, find the iteration matrix B and its eigenvalues, and use the eigenvalues to explain the results in (a). Note: the "circular" behavior of the Jacobi Method for the second system occurs because the eigenvalues of B for this system are complex.

  13. Do Exercise 12 with the Gauss-Seidel Method instead of Jacobi's Method.

  14. The rate of convergence describes how quickly the error is reduced from one iteration to the next. For this problem we will think of the rate of convergence as how much the error shrinks with each iteration -- in other words, the ratio of the new error to the old error: r = ||e(k)|| / ||e(k-1)||. So, for example, if r = ||e(k)|| / ||e(k-1)|| = 0.7, then ||e(k)|| = 0.7 ||e(k-1)||. Obviously we always want r < 1 in order for the error to go to 0.

    We will consider the convergence rate of Jacobi’s Method for solving

    Recall that ||e(k)|| = ||Be(k-1)|| ≈ |λmax| ||e(k-1)||, where λmax is the largest (in magnitude) eigenvalue of the iteration matrix B. In general (for larger systems), this approximate relationship may not hold for the first few iterations, but it often becomes more accurate as the iterations proceed. It turns out that for 2 x 2 systems, if both eigenvalues are equal in magnitude, |λ1| = |λ2|, then ||e(k)|| = |λmax| ||e(k-1)|| exactly and immediately, which also means that the rate of convergence at every step is ||e(k)|| / ||e(k-1)|| = |λmax|. For

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \]

    if a11 = a22 and a12 = a21, it turns out that the two eigenvalues of the iteration matrix B for Jacobi's Method will be of the same size, as can be seen on page 7, Analysis of Jacobi and GS Methods.

    a. Find the two eigenvalues of the iteration matrix B corresponding to Jacobi's Method for the 2 x 2 system described above.

    b. Verify the relationship ||e(k)|| / ||e(k-1)|| = |λmax| by completing a table like the one below (use the applet to do the work), and then compare these results to the eigenvalues found in (a). Use x(0) = (0, 0).
    k | ||e(k)|| | ||e(k)|| / ||e(k-1)||
    0 | 1.414214 | 
    1 | 0.942809 | 0.666666
    2 |  | 
    3 |  | 
    4 |  | 

    In general, what would ||e(k)|| / ||e(k-1)|| be for

\[ A = \begin{pmatrix} a & b \\ b & a \end{pmatrix}? \]

    What conditions on a and b would guarantee convergence (that is, what would guarantee that ||e(k)|| / ||e(k-1)|| < 1)? How does this relate to Exercise 8(a)?

  15. Do Exercise 14 with the Gauss-Seidel Method instead of Jacobi's Method.

  16. In this problem you will use the SOR Method to solve the 2 x 2 system

    a. Generally we choose a value of ω such that 1 < ω < 2. For each value of ω = 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9, solve the system with x(0) = (0, 0) as the initial guess, and record the ||error|| after k = 10 iterations. Use the applet to perform the iterations. (Be sure to click the Initial guess button after entering each new value of ω.) You do not need to write down the results at each step; this time we are interested only in the final result for each value of ω. Complete a table like the one below. Based on your results, which value of ω seems to result in the fastest convergence? That is, which one gives the smallest error after 10 iterations? Note: this is not necessarily the optimal value of ω; it is simply the best choice among the nine values that we considered.

    ω | ||error(10)||
    1.1 | 0.000664
    1.2 | 
    1.3 | 
    1.4 | 
    1.5 | 
    1.6 | 
    1.7 | 
    1.8 | 
    1.9 | 


    b. Repeat (a), except use x(0) = (−20, 10) as your initial guess.

    c. Explain why the best value of ω is the same in both (a) and (b).

    d. For each initial guess, x(0) = (0, 0) and x(0) = (−20, 10), solve the system using the Gauss-Seidel Method (which is equivalent to the SOR Method with ω = 1), and for both cases record the ||error|| after 10 iterations. For which values of ω in part (a) does the SOR Method produce faster convergence (that is, a smaller ||error|| after 10 iterations) than the Gauss-Seidel Method, and for which values is the convergence worse?

    e. For this system, it turns out that for ω ≥ 2 the SOR Method will actually diverge. Again using the applet, do 50 iterations of the SOR Method using ω = 2 and ω = 2.05, and describe (or sketch/print and comment on) the results.

  17. In this problem you will continue exploring the SOR Method for solving the 2 x 2 system


    a. Find the B matrix for the SOR Method for this system (as a function of ω).

    b. The characteristic equation for this B matrix is

      One could use the quadratic formula to find the two eigenvalues (as functions of ω) that satisfy det(B − λI) = 0. It turns out that if the two eigenvalues are the same, then ||B|| = |λmax| is minimized, which results in the fastest possible convergence for the SOR Method. The two eigenvalues will be the same when the discriminant b² − 4ac of the characteristic equation is 0. Ignoring the factor of 1/16 in front, we want to examine the discriminant of aλ² + bλ + c, where a = 1, b = −(9ω² − 32ω + 32), and c = 256(ω − 1)². In particular, we want to find where b² = 4ac, that is,

\[ \bigl(9\omega^2 - 32\omega + 32\bigr)^2 = 4 \cdot 256\,(\omega - 1)^2. \]

      We can simplify this last equation to 9ω² − 64ω + 64 = 0. Find the value of ω that results in the two eigenvalues being the same by solving 9ω² − 64ω + 64 = 0 for ω, keeping in mind that we want the value that satisfies 1 < ω < 2.

    c. If Exercise 16 was assigned, compare the value you just found in 17(b) to the result you found in Exercise 16 for an optimal value of ω [of the nine ω values in 16(a)].

  18. Warning: this problem is for the determined, patient, and strong-hearted student. Consider the SOR Method for solving the general 2 x 2 system



    a. Find the B matrix and its eigenvalues (as functions of a11, a12, a21, and a22).

    b. Find the value of ω that results in the two eigenvalues being the same. (This is the value of ω that minimizes ||B|| = |λmax|.) This ω is a function of a11, a12, a21, and a22. What is the single eigenvalue (with algebraic multiplicity 2) in this case?

    c. For the ω found in (b), what conditions on a11, a12, a21, and a22 guarantee that ||B|| = |λmax| < 1?