There are many numerical methods for approximating the irrational number \(\sqrt{2}.\) In this paper we will explore an iterative method for approximating \(\sqrt{2}\) by rational numbers which was developed in about 100 CE by Theon of Smyrna. We will verify its accuracy and explore possible motivations for its development. We are mainly interested in viewing this historic method through more modern lenses, and so we will explore the method in the contexts of modern courses in Geometry and Linear Algebra. To conclude we'll show how to extend the method to approximate other square roots.

As early as 430 BCE, it was known that the sides and diagonal of a square are incommensurable. This means that the ratio of these lengths cannot be expressed as a rational number. If we think of a square of side length 1, the diagonal has length \(\sqrt{2},\) and this ancient result says that \(\sqrt{2}\) is irrational. This discovery came after the life of Pythagoras, but during the time of his followers, known as the Pythagoreans. We don't have a lot of information about this group. Some say that this incommensurability result caused conflict among Pythagoreans; others say it was fertile ground for new mathematics. [2]

Theon of Smyrna was an astronomer and writer who lived from about 70 to 135 CE in what is now Izmir, Turkey. Theon lived 400 years after Euclid and 600 years after Pythagoras, and is sometimes classified as a Neo-Pythagorean. [4] Neo-Pythagoreans approached mathematics through the Pythagorean tradition, including the idea of incommensurability. [3]

In his work *On Mathematics Useful for the Understanding of Plato* [4], Theon proposed a method which can be used to provide ever closer rational approximations of \(\sqrt{2}.\) This method is sometimes referred to as Theon's ladder. The method starts with \(x_0=1,\) \(y_0=1.\) Our estimate of \(\sqrt{2},\) which will be the ratio \(\displaystyle{\frac {y} {x}},\) thus starts as 1. The numbers \(x\) and \(y\) are sometimes referred to as *side* and *diagonal* numbers – more on that to come. [5] To get a better estimate of \(\sqrt{2},\) we let \(x_1=x_0+y_0\) and \(y_1 = 2x_0+y_0.\) Thus we get that \(x_1=2\) and \(y_1=3,\) so our new estimate is \(\displaystyle{\frac {3} {2}} = 1.5.\) The recursive formula is \[ x_{n+1} = x_n+y_n \ \ \ \ \ \ \ \ y_{n+1} = 2x_n+y_n .\] Continuing in this way for a few more steps gives us the results in Table 1.

\[\begin{array}{r|r|r|r} n & x_n & y_n & y_n/x_n \\\hline 0 & 1 & 1 & 1 \\\hline 1 & 2 & 3 & 1.5 \\\hline 2 & 5 & 7 & 1.4 \\\hline 3 & 12 & 17 & 1.4167 \\\hline 4 & 29 & 41 & 1.4138\\\hline 5 & 70 & 99 & 1.4143\\\hline 6 & 169 & 239 & 1.4142\end{array}\]

**Table 1.** Side and diagonal numbers

So it appears that Theon's method approximates \(\sqrt{2} = 1.4142....\) Note that the denominators (or sides), \(x_n,\) are the Pell numbers, whose properties are well studied by number theorists.
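Theon's recursion is simple enough to carry out by machine. The short Python sketch below (the function name `theons_ladder` is our own, purely for illustration) reproduces Table 1, keeping exact fractions via the standard library's `fractions` module.

```python
from fractions import Fraction

def theons_ladder(steps):
    """Iterate Theon's step x' = x + y, y' = 2x + y starting from x = y = 1."""
    x, y = 1, 1
    rungs = []
    for _ in range(steps + 1):
        rungs.append((x, y, Fraction(y, x)))
        x, y = x + y, 2 * x + y   # simultaneous update: uses the old x on the right
    return rungs

# Print the rungs of the ladder; the ratios match Table 1.
for n, (x, y, r) in enumerate(theons_ladder(6)):
    print(f"{n}: {y}/{x} = {float(r):.4f}")
```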

We cannot be sure how (or why) Theon came up with this procedure. One possible origin is numerical. [1] If \(\sqrt{2}\) were rational, then \(\displaystyle\sqrt{2} = {\frac{y}{x}}\) for some positive integers \(x\) and \(y.\) Squaring and simplifying gives us \(2x^2 =y^2.\) So, does that equation have integer solutions? Let's look at the list of possibilities for \(y^2\) and \(2x^2\) in Table 2.

\[\begin{array}{r|cccccccc} n & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\\hline n^2 & 1 & 4 & \fbox{9} & 16 & 25 & 36 & \fbox{49} & 64 \\\hline 2n^2 & 2 & \fbox{8} & 18 & 32 & \fbox{50} & 72 & 98 & 128\end{array}\]

**Table 2.** Values of \(y^2\) and \(2x^2\)

From this tiny table we don't see any places where a square is equal to two times a square, but we do see some places where they are close. For example, if \(x=5\) and \(y=7,\) then \(2x^2 = 50 = y^2+1.\) This indicates that \(\displaystyle{\frac {7}{5}}\) is a decent approximation of \(\sqrt{2}.\) If we continue this table and list the possibilities as fractions, we get this list: \[{\frac{1}{1}}, \,{\frac{3}{2}},\, {\frac{7}{5}},\, {\frac{17}{12}},\, \dots.\] These are all values of \(x\) and \(y\) where \(2x^2=y^2 \pm 1.\) If you look for a pattern here, you see that the new denominator is the sum of the previous numerator and denominator, or \(x_{n+1} = x_n+y_n.\) The new numerator is this new denominator plus the old denominator, or \(y_{n+1} = x_{n+1}+x_n = x_n+y_n+x_n = 2x_n+y_n.\) Testing this out gives the next term as \(\displaystyle{\frac{41}{29}},\) which satisfies \(2\cdot29^2 = 1682=41^2+1.\) So Theon could have found this pattern by inspection.
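This inspection is easy to automate. The brute-force Python sketch below (the helper name `near_solutions` is our own) lists the pairs \((x, y)\) with \(2x^2 = y^2 \pm 1:\)

```python
def near_solutions(limit):
    """List pairs (x, y) with 2*x**2 and y**2 differing by exactly 1."""
    hits = []
    for x in range(1, limit + 1):
        for y in range(1, 2 * limit):
            if abs(2 * x * x - y * y) == 1:
                hits.append((x, y))
    return hits

print(near_solutions(12))  # → [(1, 1), (2, 3), (5, 7), (12, 17)]
```

The pairs that appear are exactly the side and diagonal numbers of Table 1.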

There are also possible geometric motivations for this process, which explain why \(x\) and \(y\) are referred to as "side" and "diagonal" numbers. Suppose we start with an isosceles right triangle \(ABC\) whose leg (side) is of length \(x\) and hypotenuse (diagonal) of length \(y,\) as in Figure 1. Note that this figure has the added point \(H\) to create the square \(ABHC\) to emphasize the names "side and diagonal numbers." Extend the two sides by \(y\) to create an isosceles right triangle with sides of length \(x+y.\)

**Figure 1.** An isosceles right triangle with sides of length \(x+y\) and diagonal \(\overline{DG}\) of length \(2x+y\)

We can see that the larger triangle has a hypotenuse of length \(2x+y\) by noticing that \[\triangle BCA \cong \triangle DBE \cong \triangle GCF\] and that \(BCFE\) is a rectangle. Thus \(DE=FG=x\) and \(EF=y.\)

This triangle motivates the idea that if the side is extended from \(x\) to \(x+y,\) then the diagonal should be extended from \(y\) to \(2x+y,\) which fits Theon's method. We should note that no *integers* \(x\) and \(y\) exist that are sides and diagonals of the isosceles right triangle. This approach is simply a motivation and not a proof.

Let's prove that Theon's method gives a method for finding better and better rational approximations of \(\sqrt{2}.\) Suppose we start with \(x\) and \(y\) satisfying \(2x^2-y^2 = \pm 1\) (note that \(x_0=y_0=1\) satisfies this) and let \(x^*=x+y, y^*=2x+y.\) Then

\begin{eqnarray*} 2(x^*)^2-(y^*)^2 &=& 2(x+y)^2-(2x+y)^2 \\ &=& (2x^2+4xy+2y^2)-(4x^2+4xy+y^2) \\ &=& -2x^2+y^2 = \mp 1.\end{eqnarray*}

This shows that as we iterate, \(2x^2-y^2\) remains \(\pm 1.\) Now if \(2x_n^2-y_n^2 = \pm 1,\) then \(\displaystyle 2 = {\frac {y_n^2} {x_n^2}} \pm {\frac {1} {x_n^2}}.\) If we start with positive values for \(x_0\) and \(y_0,\) then as we iterate, \(x_n\) gets bigger by at least one at each step, so \(x_n\) goes to \(\infty.\) Therefore we can say that \(\displaystyle {\frac {1} {x_n^2}}\) will go to \(0^{+}.\) This means that \(\displaystyle\left({\frac{y_n} {x_n}} \right)^2\) will tend to 2. Assuming that \(\displaystyle \lim_{n\rightarrow \infty} {\frac {y_n} {x_n}}\) exists, then \(\displaystyle {\frac {y_n} {x_n}}\) goes to \(\sqrt 2,\) as desired.
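A quick computational check of this alternating invariant (a sketch of our own, not part of the original argument) confirms that \(2x_n^2-y_n^2\) flips sign at every step:

```python
def check_invariant(steps):
    """Track 2*x**2 - y**2 along Theon's ladder; it should alternate +1, -1."""
    x, y = 1, 1
    values = []
    for _ in range(steps):
        values.append(2 * x * x - y * y)
        x, y = x + y, 2 * x + y
    return values

print(check_invariant(6))  # → [1, -1, 1, -1, 1, -1]
```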

So, in about 100 CE, Theon of Smyrna developed (for an unknown reason) an iterative method that can be used to approximate \(\sqrt 2\) by rational numbers. With our modern mathematical tools, we will look at Theon's method through the lenses of Geometry and Linear Algebra.

If we take a geometric view of Theon's *side* and *diagonal* numbers, we can literally picture them as the sides \(x\) and diagonal \(y\) of an isosceles (but non-right) triangle. The resulting triangles will alternate between acute and obtuse as indicated by \(\pm\) in the equation \(y^2=2x^2 \pm1.\) See Figures 2 and 3.

The area of the square on the base, \(y^2,\) and the sum of the areas of the squares on the legs, \(2x^2,\) always differ by exactly 1. Since both areas grow without bound while their difference stays fixed at 1, the ratio \(\displaystyle\frac{y^2}{2x^2}\) approaches 1. Visually, the alternating acute/obtuse angle \(\theta_n\) is approaching a right angle. We will assume for the remainder of this section that \(\displaystyle \lim_{n\rightarrow \infty}\theta_n\) exists (so we can use the limit rules) and that the limit is equal to \(\displaystyle \frac{\pi}{2}.\) The sequence of triangles is approaching a right isosceles triangle, and hence the ratio of the diagonal number to the side number is approaching \(\sqrt{2}.\)

**Figure 2.** The case \(x=1, y=1\) with areas of squares

**Figure 3.** The case \(x=2, y=3\) with areas of squares

So, as the iterative process continues, \(\cos{\theta_n}\) approaches zero. The following animation (of a GeoGebra applet) illustrates convergence as \(n\) increases.

The applet itself is available here.

Let's consider what happens if we look at this visualization in terms of trigonometric functions. Trigonometric functions were not in use in Theon's time, but we can use them to provide an alternate proof of Theon's method. In Figure 4, we note that since the triangle \(ABC\) is isosceles, the angle bisector is a perpendicular bisector of the base \(BC.\) From right triangle trigonometry, \(DC=x\sin(\theta/2).\) Thus the length of \(y=BC\) is \(2x\sin(\theta/2).\) We can check that as \(\theta\) approaches a right angle, the ratio \(\displaystyle\frac{y}{x}\) approaches our expected \(\sqrt{2}.\) That is, \[\frac{y}{x} = \frac{2x\sin(\theta/2)}{x} \rightarrow 2\sin\left(\frac{\pi}{4}\right) = \sqrt{2},\,\,\,{\rm{as}}\,\,\, \theta \rightarrow\frac{\pi}{2}.\]

**Figure 4.** The side opposite angle \(\theta\) has length \(y=2x\sin(\theta/2).\)

Further, if we consider the Law of Cosines applied to the isosceles triangle with legs of length \(x,\) base \(y,\) and vertex angle \(\theta,\) we have \[y^2=x^2+x^2-2x\cdot x\cos\theta,\] which simplifies to \[y^2=2x^2(1-\cos\theta).\] From this we see the ratio \[\frac{y^2}{x^2}=2(1-\cos\theta).\] And, as assumed, as \(\theta\) approaches a right angle, \(\displaystyle\frac{y^2}{x^2}\) approaches \(2.\) That is, \(\displaystyle\frac{y}{x}\) approaches \(\sqrt{2}.\)

As a little bonus, substituting the result from Figure 4 into \(y^2=2x^2(1-\cos\theta),\) we obtain \[4x^2\sin^2(\theta/2)=2x^2(1-\cos\theta).\] Assuming \(x\neq0,\) we verify the half-angle formula, \[2\sin^2(\theta/2)=1-\cos\theta.\]

Theon's method naturally lends itself to an iterative matrix multiplication model, an example of a difference equation. This model allows us to see how eigenvalues and eigenvectors are useful in an application taken from the history of mathematics. If we let \({\vec v_n} = \left[ \begin{array}{r} x_n \\ y_n\end{array} \right] \) and \(A = \left[ \begin{array}{rr} 1 & 1 \\ 2 & 1\end{array} \right] ,\) then \({\vec v_{n+1}} = A{\vec v_n}.\) Since \({\vec v_1} = A{\vec v_0},\) then \({\vec v_2} = A{\vec v_1} = A A {\vec v_0} = A^2\,{\vec v_0}.\) In general, it turns out that \({\vec v_n} = A^n\,{\vec v_0}.\) Try this out starting with \(x_0=y_0=1\) to verify that you get the same values as in Table 1, repeated here for convenience.

\[\begin{array}{r|r|r|r} n & x_n & y_n & y_n/x_n \\\hline 0 & 1 & 1 & 1 \\\hline 1 & 2 & 3 & 1.5 \\\hline 2 & 5 & 7 & 1.4 \\\hline 3 & 12 & 17 & 1.4167 \\\hline 4 & 29 & 41 & 1.4138\\\hline 5 & 70 & 99 & 1.4143\\\hline 6 & 169 & 239 & 1.4142\end{array}\]

**Table 1** (repeated)**.** Side and diagonal numbers
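The matrix iteration is easy to verify numerically. The NumPy sketch below (our own, assuming NumPy is available) computes \({\vec v_{n+1}} = A{\vec v_n}\) and reproduces the table:

```python
import numpy as np

A = np.array([[1, 1], [2, 1]])
v = np.array([1, 1])          # v_0 = (x_0, y_0)

for n in range(7):
    x, y = v
    print(n, x, y, y / x)     # matches Table 1
    v = A @ v                 # v_{n+1} = A v_n
```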

This linear algebra viewpoint gives us another way to argue that this method does work to approximate \(\sqrt{2}.\) To determine the long-range behavior of a system like this, we look at the "eigenstuff": the eigenvalues and eigenvectors of the matrix \(A.\) Solving for the eigenvalues, we get that \(\lambda_1 = 1+\sqrt{2} \approx 2.414,\) and \(\lambda_2 = 1-\sqrt{2} \approx -0.414.\) Notice that \(|\lambda_1| >1\) and \(|\lambda_2|<1.\) As eigenvectors we can choose \({\vec w_1} = \left[ \begin{array}{r} 1 \\\sqrt{2}\end{array} \right] \) and \({\vec w_2} = \left[ \begin{array}{r} 1 \\ -\sqrt{2}\end{array} \right] .\) These eigenvectors are, of course, linearly independent, and so any initial vector \({\vec v_0}\) can be written as a linear combination of the eigenvectors: \({\vec v_0} = c_1{\vec w_1}+c_2{\vec w_2}.\) Then when we iterate we get that \[{\vec v_n} = A^n\,{\vec v_0} = c_1A^n\,{\vec w_1}+c_2A^n\,{\vec w_2} = c_1{\lambda_1}^n\,{\vec w_1}+c_2{\lambda_2}^n\,{\vec w_2}.\] Since \({{\lambda_2}^{n}}\) goes to \(0\) as \(n\) grows, any initial vector will tend to a multiple of \({\vec w_1} = \left[ \begin{array}{r} 1 \\\sqrt{2}\end{array} \right].\) Thus the second component, \(y,\) will, in the limit, be \(\sqrt{2}\) times the first component or \(x\) value. This shows us that \(\displaystyle{y_n \over x_n}\) tends to \(\sqrt 2\) as \(n\) goes to infinity.
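The eigenvalues and eigenvectors above can be double-checked with NumPy. (Note that `np.linalg.eig` returns the eigenvalues in no guaranteed order and scales each eigenvector column to unit length, so we rescale each column to have first entry 1 before comparing with \({\vec w_1}\) and \({\vec w_2}.\))

```python
import numpy as np

A = np.array([[1.0, 1.0], [2.0, 1.0]])
eigvals, eigvecs = np.linalg.eig(A)

# Sorted eigenvalues should be 1 - sqrt(2) and 1 + sqrt(2).
print(sorted(eigvals))

# Rescaled eigenvector columns should be (1, sqrt(2)) and (1, -sqrt(2)).
for i in range(2):
    print(eigvecs[:, i] / eigvecs[0, i])
```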

Figure 5 shows the resulting vectors (in blue) when starting with \({\vec v_0}=\left[ \begin{array}{r} 1 \\1\end{array} \right] .\) The black dotted line is in the direction of vector \({\vec w_1} = \left[\begin{array}{r} 1 \\\sqrt{2}\end{array} \right].\) Clearly the vectors are converging to the direction of that first eigenvector. Note also that the vectors alternate being above and below that direction, much like the alternating \(\pm 1\) when considering the results algebraically.

**Figure 5.** Iterates approach an eigenvector.

You can try out this visualization in the GeoGebra applet below. Use the slider in the applet to see (again) how quickly the vectors \(\vec v_n\) converge to the vector \({\vec w_1}.\) Be sure to change the initial vector \(\vec v_0\) to see if you can slow the convergence.

The classroom activity Approximating Square Roots with Linear Algebra is available here. This applet helps students explore the convergence. A stand-alone version of the applet is available here.

Next, we generalize this vector method to approximate the square root of any positive integer.

The linear algebraic view of Theon's method gives us a quick way to develop methods for approximating other square roots. Since the matrix \(A = \left[ \begin{array}{rr} 1 & 1 \\2 & 1\end{array} \right] \) works to find approximations of \(\sqrt 2,\) let's naively guess that \(B = \left[ \begin{array}{rr} 1 & 1 \\ 3 & 1\end{array} \right] \) will allow us to approximate \(\sqrt 3.\) Is that true? Well, the eigenvalues are \(\lambda_1 = 1+\sqrt 3\) and \(\lambda_2=1-\sqrt 3\) with corresponding eigenvectors \({\vec w_1} = \left[ \begin{array}{r} 1 \\\sqrt{3}\end{array} \right] \) and \({\vec w_2} = \left[ \begin{array}{r} 1 \\ -\sqrt{3}\end{array} \right] .\) Since \(|\lambda_1| >1\) and \(|\lambda_2|<1,\) any starting vector will tend to a multiple of \(\left[ \begin{array}{r} 1 \\\sqrt{3}\end{array} \right],\) and thus \(\displaystyle {\frac {y} {x}}\) will tend to \(\sqrt 3.\) This convergence is demonstrated in general later in this section.

Note that we need to be careful with the phrase "any starting vector." If the starting vector is exactly a multiple of \(\vec w_2,\) then the iterates will not tend to a multiple of \(\vec w_1.\) But note that

- eigenvector \(\vec w_2\) has a negative component, so any starting vector with both components positive will work, and
- eigenvector \(\vec w_2\) has one rational and one irrational component, so any starting vector with both components integers (not both zero) will work as well (even if one or both components are negative).

The tables below show examples starting with \(x=1,y=3\) and with \(x=-2, y=5.\)

\[\begin{array}{r|r|r|r} n & x_n & y_n & y_n/x_n \\\hline 0 & 1 & 3 & 3 \\\hline 1 & 4 & 6 & 1.5 \\\hline 2 & 10 & 18 & 1.8 \\\hline 3 & 28 & 48 & 1.7143 \\\hline 4 & 76 & 132 & 1.7368 \\\hline 5 & 208 & 360 & 1.7308\end{array}\qquad \qquad\begin{array}{r|r|r|r} n & x_n & y_n & y_n/x_n \\\hline 0 & -2 & 5 & -2.5 \\\hline 1 & 3 & -1 & -0.3333 \\\hline 2 & 2 & 8 & 4 \\\hline 3 & 10 & 14 & 1.4 \\\hline 4 & 24 & 44 & 1.8333 \\\hline 5 & 68 & 116 & 1.7059\end{array}\]

**Table 3.** Approximating \(\sqrt{3}\)

So far so good, but there is an issue that arises. If we use \(B = \left[ \begin{array}{rr} 1 & 1 \\5 & 1\end{array} \right] \) to get an approximation of \(\sqrt 5,\) the eigenvalues and eigenvectors are just what we expect. But with eigenvalues of \(\lambda_1 = 1+\sqrt 5 \approx 3.236\) and \(\lambda_2=1-\sqrt 5 \approx -1.236,\) both eigenvalues are larger than \(1\) in absolute value. So does our argument break down? It turns out that the answer is no. The eigenvalue that has the largest absolute value "dominates" in the long run, and vectors still tend to a multiple of the associated eigenvector, in this case \({\vec w_1} = \left[ \begin{array}{r} 1 \\\sqrt{5}\end{array} \right] .\) See Table 4 for an example of approximating \(\sqrt{5}\):

\[\begin{array}{r|r|r|r} n & x_n & y_n & y_n/x_n \\\hline 0 & 1 & 2 & 2 \\\hline 1 & 3 & 7 & 2.3333 \\\hline 2 & 10 & 22 & 2.2 \\\hline 3 & 32 & 72 & 2.25 \\\hline 4 & 104 & 232 & 2.2308\end{array}\]

**Table 4.** Approximating \(\sqrt{5}\)

Here's an argument as to why the eigenvector associated with the eigenvalue of largest absolute value dominates. If we use the matrix \(B = \left[ \begin{array}{rr} 1 & 1 \\ k & 1\end{array} \right]\) for \(k \in {{\bf Z}^{+}},\) we get eigenvalues \(\lambda_1 = 1+\sqrt k\) and \(\lambda_2=1-\sqrt k.\) Note that \(|{\lambda_2}|<|{\lambda_1}|.\) The associated eigenvectors are \({\vec w_1} = \left[ \begin{array}{r} 1 \\\sqrt{k}\end{array} \right] \) and \({\vec w_2} = \left[ \begin{array}{r} 1 \\ -\sqrt{k}\end{array} \right] .\) We showed above that if \[ {\vec v_0} = c_1{\vec w_1}+c_2{\vec w_2},\] then

\[ {\vec v_n} = c_1{\lambda_1}^n\,{\vec w_1}+c_2{\lambda_2}^n\,{\vec w_2} . \] If we factor out \({\lambda_1}^n\) on the right side, we get \[ {\vec v_n} = {\lambda_1}^n\left( c_1{\vec w_1}+c_2{\frac {{\lambda_2}^n}{{\lambda_1}^n}}\,{\vec w_2} \right)= {\lambda_1}^n\left( c_1{\vec w_1}+c_2\left({\frac {{\lambda_2}}{{\lambda_1}}}\right)^n{\vec w_2} \right). \] But since \(|{\lambda_2}|<|{\lambda_1}|,\) we have \(\displaystyle \left|{\frac {\lambda_2}{\lambda_1}}\right| < 1.\) So the second term tends to \(\vec 0,\) and \(\vec v_n\) tends to a multiple of \(\vec w_1\) as desired.

Thus, in general, using the matrix \(B = \left[ \begin{array}{rr} 1 & 1 \\ k & 1\end{array} \right]\) for \(k \in {{\bf Z}^{+}},\) the ratio of \(y\) over \(x\) for almost any starting vector tends to \(\sqrt{k}.\)
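Putting the general method into code, the short Python sketch below (the function `theon_sqrt` and its default starting values are our own choices) iterates \((x, y) \mapsto (x + y,\ kx + y):\)

```python
def theon_sqrt(k, steps, x=1, y=1):
    """Approximate sqrt(k) with the generalized Theon step (x, y) -> (x+y, k*x+y)."""
    for _ in range(steps):
        x, y = x + y, k * x + y
    return y / x

print(theon_sqrt(2, 20))   # close to 1.41421356...
print(theon_sqrt(3, 20))   # close to 1.73205080...
print(theon_sqrt(5, 20))   # close to 2.23606797...
```

For instance, `theon_sqrt(3, 5, x=-2, y=5)` returns \(116/68 \approx 1.7059,\) matching the last row of Table 3.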

It's always interesting to analyze historical ideas using modern lenses, and this can lead to unexpected connections and insights. There are more areas to read about and explore regarding Theon's method. In 1951, Vedova [5] explored the method algebraically and made a connection to continued fractions. In their 1967 article, Waugh and Maxfield [6] provided a stimulating historical perspective of side-and-diagonal numbers including alternate ways to view the matrices and speculative evidence that Archimedes used side-and-diagonal numbers in his approximations of radicals. We plan to investigate the efficiency of Theon's method, and how the \(2x^2=y^2 \pm 1\) equation varies when approximating other roots. Theon's method provides fertile ground to connect many seemingly unrelated areas of mathematics at an undergraduate level.

We have created and tested a class activity for Linear Algebra which explores Theon's method and convergence. (Download our Approximating Square Roots with Linear Algebra classroom activity here. The applet students are asked to use to assist them with step 8 of this activity is here on the Linear Algebra page of this article, just under Figure 5.)

[1] David Flannery. *The Square Root of 2: A Dialogue Concerning a Number and a Sequence.* Copernicus, 2006.

[2] Kurt Herzinger and Robert Wisner. Connecting Greek Ladders and Continued Fractions. *MAA Convergence,* 2014. DOI:10.4169/convergence20140102.

[3] Carl Huffman. Pythagoreanism. In Edward N. Zalta, editor, *The Stanford Encyclopedia of Philosophy.* Metaphysics Research Lab, Stanford University, Winter 2016 edition. https://plato.stanford.edu/archives/win2016/entries/pythagoreanism/.

[4] Theon of Smyrna. *Mathematics Useful for Understanding Plato* (Robert and Deborah Lawlor, translators). Wizards Bookshelf, 1979.

[5] G. C. Vedova. Notes on Theon of Smyrna. *The American Mathematical Monthly* 58, no. 10 (1951): 675–683.

[6] Frederick V. Waugh and Margaret W. Maxfield. Side-and-Diagonal Numbers. *Mathematics Magazine* 40, no. 2 (1967): 74–83.

Matthew Haines is Professor of Mathematics at Augsburg University in Minneapolis, Minnesota. He was a member of the 1998-99 Institute for the History of Mathematics and Its Use in Teaching led by Victor Katz and Fred Rickey. If readers find themselves unexpectedly or otherwise in Minneapolis or neighboring areas, they are invited to stop by Matt's office for HoM conversation, chocolate, and an espresso.

Jody Sorensen is Associate Professor of Mathematics at Augsburg University in Minneapolis, Minnesota. She's interested in dynamical systems, history of mathematics, and expository writing. This article brings Jody one step closer to her goal of completing projects with each of her departmental colleagues!