Teaching the Fundamental Theorem of Calculus: A Historical Reflection

Author(s):

Omar A. Hernandez Rodriguez (University of Puerto Rico) and Jorge M. Lopez Fernandez (University of Puerto Rico)

As mentioned above, there are ways to teach the elementary integral that capitalize on its properties as an area function and that do not require Cauchy's limit-sum definition of the integral to become the center of the exposition. In fact, as pointed out by Gillman [20, p. 17], strictly speaking, Riemann sums need not be mentioned at all. It is hard to imagine, however, that one would actually want to go as far as hiding Riemann sums completely. Cauchy's limit-sum definition of the integral represented a major landmark in the development of analysis, and thus has secured a position in the curriculum of calculus. Our qualm with Riemann sums stems from the fact that there is mounting evidence in mathematics education research (see our Concluding Remarks, page 8) that the integral as a limit of Riemann sums represents knowledge of a higher order of mathematical abstraction, and thus presents special difficulties to students who are learning calculus for the first time. In our Introduction, we described the evolution of an approach to teaching elementary integration whose origin can be traced all the way back to the seventeenth century. We now present the main features of this approach, showing how elementary integration with all of its applications might be developed and how it would lead naturally to more advanced aspects of the theory such as Cauchy's limit-sum definition of the integral and Darboux's approach to Riemann integration. Our presentation relies mainly on the work of Lang [28], Gillman and MacDowell [21], and Gillman [20] (see note 4.1).

In the following discussion \(f:[a,b]\rightarrow\mathbb{R}\) is a continuous function (see note 4.2) on \([a,b],\) where \(-\infty<a<b<\infty\). We begin by incorporating into the definition of an “integral” what we take to be our present day renditions of the principle of additivity and the principle stating that the area increment contains the area of the inscribed rectangle and is contained in the area of the circumscribed rectangle (see Lang [28, p. 213] and Gillman [20, p. 17]):

Definition 1. An integral for \(f\) on \([a,b]\) is a function \([u,v]\mapsto I_{u}^{v}(f)\) that maps subintervals \([u,v]\) of \([a,b]\) \((a\leq u<v\leq b)\) into the reals in such a way that the following conditions are satisfied:

(A) Additivity. If \(a\leq u<v<w\leq b\), then \[I_{u}^{w}(f)=I_{u}^{v}(f)+I_{v}^{w}(f),\quad\text{and}\tag{4}\]

(B) Boundedness. If \(a\leq u<v\leq b\), then \[\min_{x\in [u,v]}f(x)\cdot(v-u)\leq I_{u}^{v}(f)\leq\max_{x\in[u,v]}f(x)\cdot(v-u).\tag{5}\]

Sometimes, when \([u,v]\mapsto I_{u}^{v}(f)\) is an integral for \(f\) on \([a,b]\) as in Definition 1, we will simply say that \(I(f)\) is an integral for \(f\) on \([a,b]\). Clearly, equation (4) is a modern rendition of the additive property discussed above and used by Newton, Barrow, Gregory and Leibniz in their proofs of the FTC. Equation (5), on the other hand, is a very natural condition that supplies the estimates needed to subsume the infinitesimal arguments of those proofs: it says that the area under a curve is bounded below by the area of the inscribed rectangle and above by the area of the circumscribed rectangle. Bounding rectangles were a part of the mathematical psyche of seventeenth century calculus; they are clearly implicit in Newton's proof of the FTC and appear more explicitly in Barrow's proof of the FTC [9, Lecture X, Section 11, p. 117], where he used the estimates contained in (B) for curves that he assumed to be increasing. The other condition that came into play in the development of the integral calculus during the second half of the seventeenth century was that of the interchangeability of infinitely close ordinates mentioned above, which, as we pointed out, corresponds to our notion of continuity.

Conditions (A) and (B) are very natural, and both are expected to hold for any integral of a continuous function. Incidentally, the maximum and minimum values in equation (5) are attained, by virtue of the extreme value theorem for continuous functions [22, p. 164]. When there is no possibility of confusion, we shall often write \(I_{u}^{v}\) for \(I_{u}^{v}(f)\).
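As a numerical sanity check of Definition 1, the following Python sketch tests conditions (A) and (B) on randomly chosen subintervals for one concrete candidate. The integrand \(f(x)=x^{2}\) on \([0,2]\), the candidate \(I_{u}^{v}=(v^{3}-u^{3})/3\), and all function names are illustrative assumptions, not taken from the sources cited above. Because this \(f\) is increasing on \([0,2]\), its minimum and maximum over \([u,v]\) are simply \(f(u)\) and \(f(v)\).

```python
import random

def f(x):
    # Illustrative integrand (an assumption): continuous and increasing on [0, 2].
    return x * x

def I(u, v):
    # Candidate integral for f (an assumption to be tested, not derived here).
    return (v**3 - u**3) / 3.0

def check_additivity(u, v, w, tol=1e-12):
    # Condition (A): I_u^w = I_u^v + I_v^w, up to floating-point tolerance.
    return abs(I(u, w) - (I(u, v) + I(v, w))) <= tol

def check_boundedness(u, v, tol=1e-12):
    # Condition (B). Since f is increasing on [0, 2], min = f(u) and max = f(v).
    lower = f(u) * (v - u)
    upper = f(v) * (v - u)
    return lower - tol <= I(u, v) <= upper + tol

def run_checks(trials=1000, seed=0):
    # Sample random triples u < v < w in [0, 2] and test both conditions.
    rng = random.Random(seed)
    for _ in range(trials):
        u, v, w = sorted(rng.uniform(0.0, 2.0) for _ in range(3))
        if not (u < v < w):
            continue  # skip degenerate samples
        if not (check_additivity(u, v, w) and check_boundedness(u, v)):
            return False
    return True

print(run_checks())
```

A passing check on sampled subintervals is evidence rather than proof; for this particular candidate both conditions can also be verified by elementary algebra.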

Suppose that we define \(I_{x}^{x}=0\) for each \(x\in[a,b]\), and, if \(a\leq u<v\leq b\), we define \(I_{v}^{u}=-I_{u}^{v}\), as is conventional in calculus. With these conventions, equation (4) holds for any \(u,v,w\in[a,b]\), regardless of their relative order. Also, if \([u,v]\mapsto I_{u}^{v}(f)\) is an integral as in Definition 1 and \(a\leq u<v\leq b\), then (B) gives \[\min_{x\in [u,v]}f(x)\leq\frac{1}{v-u}I_{u}^{v}(f)\leq\max_{x\in [u,v]}f(x).\] By the Intermediate Value Theorem [22, p. 164], for some number \(\theta_{u,v}\in[u,v]\), it is true that \[I_{u}^{v}=f(\theta_{u,v})\cdot(v-u).\] This says that our abstract integral satisfies the Mean Value Theorem for Integrals. Gillman [20, p. 18] remarked that properties (A) and (B) “constitute the two steps of the proof of the FTC,” and pointed out some of the advantages of using this definition of the integral rather than the Darboux integral itself.
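To make the Mean Value Theorem for Integrals concrete, the sketch below locates a point \(\theta_{u,v}\) by bisection in one illustrative case. The integrand \(f(x)=x^{2}\), the formula \(I_{u}^{v}=(v^{3}-u^{3})/3\), and the helper name `mean_value_point` are assumptions made for this example only. The bisection relies on \(f\) being increasing on \([u,v]\), so that (B) places the mean value \(I_{u}^{v}/(v-u)\) between \(f(u)\) and \(f(v)\).

```python
import math

def f(x):
    # Illustrative integrand (an assumption): continuous and increasing on [1, 2].
    return x * x

def I(u, v):
    # Assumed integral of f on [u, v].
    return (v**3 - u**3) / 3.0

def mean_value_point(u, v, iterations=60):
    # Find theta in [u, v] with f(theta) * (v - u) = I(u, v) by bisection.
    # Valid here because f is increasing, so f(u) <= mean <= f(v) by (B).
    mean = I(u, v) / (v - u)
    lo, hi = u, v
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

theta = mean_value_point(1.0, 2.0)
# For this example the exact value is sqrt(7/3), since theta^2 = 7/3.
print(theta, math.sqrt(7.0 / 3.0))
```

For a non-monotone integrand the same idea works, but the bisection would have to start from points where \(f\) attains its minimum and maximum on \([u,v]\).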

The argument for the proof of the FTC is direct. If \(I(f)\) is an integral for \(y=f(x)\) on \([a,b]\), \(x\in (a,b)\) (the argument for \(x=a\) or \(x=b\) is similar), and \(\Delta x\not =0\) is chosen in \(\mathbb{R}\) so that \(x,x+\Delta x\in(a,b)\), then, by property (A) of Definition 1 and the Mean Value Theorem for Integrals, we have \[\frac{I_{a}^{x+\Delta x}(f)-I_{a}^{x}(f)}{\Delta x}=\frac{1}{\Delta x}\cdot I_{x}^{x+\Delta x}(f)=f(\theta_{x}),\] for some number \(\theta_{x}\) between \(x\) and \(x+\Delta x\). Since \(\theta_{x}\) is between \(x\) and \(x+\Delta x\), it is clear that \(\lim_{\Delta x\to 0}\theta_{x}=x\), so that, by the continuity of \(f\) and the last relation, we have \[\lim_{\Delta x\to 0}\frac{I_{a}^{x+\Delta x}(f)-I_{a}^{x}(f)}{\Delta x}=\lim_{\Delta x\to 0}f(\theta_{x})=f(x).\] Hence, \[\frac{d}{dx}I_{a}^{x}(f)=f(x).\]
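The limit in this argument can be observed numerically. The following sketch evaluates the difference quotient for shrinking increments, again under the illustrative assumptions \(f(x)=x^{2}\) and \(I_{u}^{v}=(v^{3}-u^{3})/3\); the function name `difference_quotient` is hypothetical. In this case the quotient equals \(x^{2}+x\,\Delta x+(\Delta x)^{2}/3\) exactly, so its distance from \(f(x)\) shrinks roughly in proportion to \(|\Delta x|\).

```python
def f(x):
    # Illustrative integrand (an assumption).
    return x * x

def I(u, v):
    # Assumed integral of f; the formula also respects I_v^u = -I_u^v.
    return (v**3 - u**3) / 3.0

def difference_quotient(a, x, dx):
    # (I_a^{x+dx} - I_a^x) / dx, which equals I_x^{x+dx} / dx by additivity (A).
    return (I(a, x + dx) - I(a, x)) / dx

a, x = 0.0, 1.0
for dx in (0.1, 0.01, 0.001, -0.001):
    q = difference_quotient(a, x, dx)
    # Print the increment, the quotient, and its distance from f(x).
    print(dx, q, abs(q - f(x)))
```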

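Note that this is in fact a modern version of Newton's argument, carried out for an abstract integral. Also, once we have the FTC, it is easy to see from the Mean Value Theorem for derivatives [22, p. 167] that the integral is unique: if \(I(f)\) and \(J(f)\) are two integrals for \(f\) on \([a,b]\), then the functions \(x\mapsto I_{a}^{x}(f)\) and \(x\mapsto J_{a}^{x}(f)\) have the same derivative \(f\) and both vanish at \(x=a\), so they coincide on \([a,b]\); by (A), \(I_{u}^{v}(f)=I_{a}^{v}(f)-I_{a}^{u}(f)=J_{a}^{v}(f)-J_{a}^{u}(f)=J_{u}^{v}(f)\) for every subinterval \([u,v]\) of \([a,b]\).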

Interestingly, the linearity of the integral as a function of its integrand follows from our two postulated properties, without any need for Darboux or Riemann sums. In other words, if \(f:[a,b]\rightarrow\mathbb{R}\) and \(g:[a,b]\rightarrow\mathbb{R}\) are continuous functions and \(\alpha,\beta\in\mathbb{R}\), then \[I_{a}^{b}(\alpha f+\beta g)=\alpha I_{a}^{b}(f)+\beta I_{a}^{b}(g).\] In fact, since \(I(\alpha f+\beta g)\), \(I(f)\), and \(I(g)\) are all integrals, the derivative version of the FTC gives \[\frac{d}{dx}I_{a}^{x}(\alpha f+\beta g)=[\alpha f+\beta g](x)=\alpha f(x)+\beta g(x)=\frac{d}{dx}\bigl(\alpha I_{a}^{x}(f)+\beta I_{a}^{x}(g)\bigr),\] so it follows from the usual Mean Value Theorem [22, p. 167] that the functions \(x\mapsto I_{a}^{x}(\alpha f+\beta g)\) and \(x\mapsto\alpha I_{a}^{x}(f)+\beta I_{a}^{x}(g)\), where \(x\in[a,b]\), differ by a constant. Substituting \(x=a\), we see that the constant is zero. Hence, the functions are equal. Substitution of \(x=b\) gives the desired result.

The antiderivative version of the FTC also follows immediately. Remember that a function \(F\) is an antiderivative (or primitive function) of a second function \(f:[a,b]\rightarrow\mathbb{R}\) if and only if \(F\) is continuous on \([a,b]\) and \(F^{\,\prime}(x)=f(x)\) for all \(x\in (a,b)\).

The usual calculus argument gives the antiderivative version of the FTC: Let \(F\) be an antiderivative for \(f:[a,b]\rightarrow\mathbb{R}\) on \([a,b]\). Then \[I_{a}^{b}(f)=F(b)-F(a).\] In fact, by the derivative version of the FTC for our integral, \[\frac{d}{dx}I_{a}^{x}(f)=f(x)\] for all \(x\in [a,b]\). By hypothesis, \(F^{\,\prime}(x)=f(x)\) for all \(x\in (a,b)\) and \(F\) is continuous at \(a\) and \(b\). By the Mean Value Theorem [22, p. 167], the functions \(x\mapsto I_{a}^{x}(f)\) and \(F\) differ by some constant \(C\): \[I_{a}^{x}=F(x)+C\] for all \(x\in [a,b]\). Substituting \(x=a\), we find that \(C=-F(a)\). Hence, \(I_{a}^{x}=F(x)-F(a)\) for all \(x\in [a,b]\). Substituting \(x=b\) we obtain the desired result.
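The value \(F(b)-F(a)\) can also be checked against conditions (A) and (B) directly. Applying (A) to a partition of \([a,b]\) and (B) to each piece shows that any integral must lie between the lower and upper sums of that partition. The sketch below confirms this numerically for the illustrative choices \(f(x)=x^{2}\) and \(F(x)=x^{3}/3\) on \([0,1]\) with uniform partitions; these choices and the helper name `lower_upper_sums` are assumptions made for the example. Since \(f\) is increasing here, the minimum and maximum on each piece occur at its left and right endpoints.

```python
def f(x):
    # Illustrative integrand (an assumption): increasing on [0, 1].
    return x * x

def F(x):
    # An antiderivative of f.
    return x**3 / 3.0

def lower_upper_sums(a, b, n):
    # Lower and upper sums on a uniform partition with n pieces.
    # Assumes f is increasing on [a, b]: min at left end, max at right end.
    h = (b - a) / n
    lower = sum(f(a + i * h) * h for i in range(n))
    upper = sum(f(a + (i + 1) * h) * h for i in range(n))
    return lower, upper

a, b = 0.0, 1.0
for n in (10, 100, 1000):
    lower, upper = lower_upper_sums(a, b, n)
    # F(b) - F(a) should sit between the two sums for every n.
    print(n, lower, F(b) - F(a), upper)
```

For an increasing integrand the gap between the two sums is \((f(b)-f(a))(b-a)/n\), so the bracket around \(F(b)-F(a)\) tightens as \(n\) grows.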

Thus, the general integral from Definition 1 satisfies both versions of the FTC. In view of all this, if \([u,v]\mapsto I_{u}^{v}\) is the unique integral of \(f\) on \([a,b]\) we write \[I_{a}^{b}(f)=\int_{a}^{b}f(u)\,du.\]

Notes for page 4:

4.1. It is noteworthy that the difference between the editions Gillman and MacDowell [21] and Gillman and MacDowell [22] is several hundred pages of exposition; in Gillman and MacDowell [21], the integral is defined axiomatically while in Gillman and MacDowell [22], the Darboux approach plays a central role in the exposition.

4.2. The fact that only continuous functions are to be considered is a compromise accommodating L'Hospital's axiom of interchangeability of infinitely close ordinates. That axiom assumes at the outset that all curves are continuous.

Omar A. Hernandez Rodriguez (University of Puerto Rico) and Jorge M. Lopez Fernandez (University of Puerto Rico), "Teaching the Fundamental Theorem of Calculus: A Historical Reflection - Teaching the Elementary Integral," Convergence (January 2012), DOI:10.4169/loci003803
