This study explains the creation of a calculus reform program, its objectives and philosophy, and provides an in-depth comparison of reform-trained students with traditional students.
Background and Purpose
The "Calculus, Concepts, Computers and Cooperative Learning," or C4L, Calculus Reform Program is part of the National Calculus Reform Movement. The initial design of the C4L program began in 1987 under the leadership of Ed Dubinsky and Keith Schwingendorf on the West Lafayette campus of Purdue University, a large Midwestern land-grant university. The C4L program has received NSF funding for its design and development during eight of the past nine years. The most recent award, from Fall 1994 through Summer 1997 (grant #DUE-9450750), supports continuing and expanding the program's assessment and evaluation efforts on two fronts: (1) qualitative research into how students learn calculus (mathematics) concepts, and (2) quantitative research and the development of a (national) analytical model for assessing and evaluating the effectiveness of innovative educational reform efforts as compared to other pedagogical treatments. The C4L program is co-directed by Dubinsky (Georgia State University), Schwingendorf (Purdue University North Central), and David Mathews (Southwestern Michigan College, Dowagiac, MI).
This paper describes the twofold assessment and evaluation process* developed as part of the C4L Calculus Reform Program. The major focus is the analytical model designed to address differences within the study population of students and to deal effectively with the limitation of self-selection, a key problem encountered by virtually all control versus alternative treatment studies in education and by similar studies in health, industry, and other fields. The analytical method allows researchers to make more meaningful comparisons and draw more meaningful conclusions regarding the effectiveness of innovative curriculum reform treatments and other pedagogical treatments. (A detailed description of the C4L Study, together with references on qualitative research studies regarding the C4L Program, can be found in [3].) The qualitative research phase and its impact on curriculum development are addressed briefly. (A detailed description of the research framework used in the qualitative phase of assessment of the C4L Calculus Reform Program can be found in [1].)
The C4L Program is based on a constructivist theoretical perspective of how mathematics is learned (see [1]). According to the emerging learning theory on which the C4L Program courses are based, students need to construct their own understanding of each mathematical concept. The framework and step-by-step procedure of the qualitative research phase are described in detail in [1].
C4L Calculus courses differ radically, in many fundamental ways, from traditionally taught courses and from courses in most other calculus reform programs. Traditional courses, delivered primarily via the lecture/recitation system, generally attempt to "transfer" knowledge and emphasize rote skill, drill, and memorization of paper-and-pencil skills. In contrast, the C4L program minimizes lecturing, explaining, and other attempts to "transfer" mathematical knowledge, and instead creates situations that help students make the mental constructions necessary to learn mathematical concepts. The program aims to help students gain a deeper understanding of concepts than can be obtained via traditional means, together with the necessary basic skills, through the Activity-Class-Exercise, or ACE, learning cycle developed for the C4L Program [1].
Each unit of the ACE learning cycle, which generally lasts about a week, begins with computer investigations in a laboratory setting, intended to help students construct their own meaning of mathematical concepts and reflect on their experiences with their peers in a cooperative learning environment. Lab periods are followed by class meetings in which a modified Socratic approach is used in conjunction with cooperative problem solving in small groups to help students build upon their mathematical experiences from the computer laboratory. Finally, relatively traditional exercises are assigned to reinforce the knowledge students are expected to have constructed during the first two phases of the learning cycle.
(1) Qualitative Research Phase of the C4L Program. A critical component of the qualitative phase of the C4L program is a "genetic decomposition" of each basic mathematical concept into developmental steps following a Piagetian theory of knowledge. Genetic decompositions, initially hypothesized by the researchers based on the underlying learning theory, are modified based on in-depth student interviews together with observations of students as they attempt to learn mathematical concepts. The program's research team has designed and constructed qualitative interviews, resulting to date in over 125 completed interviews on the concepts of limit, derivative, integral, and sequences and series. The C4L Program's qualitative research procedure is described in full in [1].
(2) Quantitative Research Phase of the C4L Program. A longitudinal study (referred to below as the "C4L Study") of the C4L Calculus Reform Program was designed to make meaningful comparisons of C4L (reform) calculus students with students taught traditionally (TRAD) via lecture and recitation classes.
The students in the C4L Study were those enrolled on the West Lafayette campus of Purdue University from Fall 1988 to Spring 1991 in either the C4L (reform) three-semester calculus sequence or the traditionally taught lecture/recitation three-semester calculus sequence.
The student population consisted primarily of engineering, mathematics and science students. The average Math SAT score for first semester calculus students in both three-semester calculus sequences was about 600. The 205 students who completed the C4L calculus sequence were compared with a control group of 4431 students from the traditionally taught courses. The longitudinal study included data only on the 4636 students enrolled for the first time in: (i) first semester calculus during the Fall semesters of 1988, 1989, and 1990; (ii) second semester calculus during the Spring semesters of 1989, 1990, and 1991; and (iii) third semester calculus during the Fall semesters of 1989, 1990, and 1991. This restriction was made so that the C4L and TRAD students being compared had backgrounds and experiences that were as similar as possible.
Self-selection is a limitation of the C4L Study design. However, there are no practical alternatives to the design that would provide enough data on which to draw meaningful conclusions. In our situation, random assignment of students to the C4L and TRAD courses would have been preferable to self-selection, since it would have eliminated any confounding factors that may have influenced the outcomes of the study. For example, the possibility that academically better prepared students would enroll in the C4L courses would be offset by random assignment, thus preventing bias in favor of the C4L program. However, a study involving random assignment of students is not practical, since students dropping out of either program would adversely affect the study results. Moreover, the C4L and TRAD programs are so different that students would quickly become aware of the differences, resulting in possible resentment, which would again detract from the study results.
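The concern about self-selection can be illustrated with a small simulation. The following is only a sketch under invented assumptions: the cohort size, the standardized "preparation" score, the grade formula, and the enrollment rule are all hypothetical and are not taken from the C4L Study data. The course is assumed to have no effect at all, so any gap in mean grades comes entirely from who enrolls.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000            # hypothetical cohort size, not the study's
TRUE_EFFECT = 0.0     # assumption: the course itself changes nothing

# Hypothetical standardized "preparation" score for each student.
prep = rng.normal(0.0, 1.0, n)

def simulated_grades(in_reform):
    """Grades depend on preparation and (by assumption) not on the course."""
    noise = rng.normal(0.0, 0.5, n)
    return 2.5 + 0.5 * prep + TRUE_EFFECT * in_reform + noise

def mean_difference(grades, in_reform):
    """Naive comparison: mean reform grade minus mean traditional grade."""
    return grades[in_reform == 1].mean() - grades[in_reform == 0].mean()

# Self-selection: better-prepared students enroll in the reform course more often.
p_enroll = 1.0 / (1.0 + np.exp(-prep))
self_selected = (rng.random(n) < p_enroll).astype(int)

# Random assignment: a fair coin flip, independent of preparation.
randomized = rng.integers(0, 2, n)

gap_self = mean_difference(simulated_grades(self_selected), self_selected)
gap_rand = mean_difference(simulated_grades(randomized), randomized)

print(f"gap under self-selection:    {gap_self:+.3f}")
print(f"gap under random assignment: {gap_rand:+.3f}")
```

Under these assumptions the self-selected comparison shows a clear gap in favor of the reform group even though the course has no effect, while the randomized comparison shows a gap near zero.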
The C4L Study, under the supervision of Professor George P. McCabe (Professor of Statistics and Head of Statistical Consulting at Purdue University), used response variables, which measure outcomes, and explanatory variables, which attempt to explain those outcomes. For example, in a comparison of two hospitals, A and B, we might find a better success rate for all surgeries in Hospital A, whereas, on closer analysis, Hospital B has a better rate for persons entering surgery in poor condition. Without explanatory variables, any statistical study becomes dubious. Our basic idea was to compare C4L and TRAD students using explanatory variables as controls, a method similar to that found in epidemiological studies.
Confounding factors in observational studies of this kind are often accounted for by a "matching" procedure. However, a traditional matching procedure would have been problematic for the C4L Study, since the number of students enrolled in the C4L program was so small compared to the number of TRAD students. In other words, a traditional matching procedure would have used only a very small portion of the available study data. So, in order to use all the available data, the C4L Study used linear models designed to determine whether or not each of the explanatory variables had a statistically significant effect on each of the response variables. In effect, the linear models themselves performed a matching of C4L and TRAD students with characteristics as nearly identical as possible. The explanatory variables used in the linear models represented various confounding and interaction variables in addition to the one explanatory variable of interest, namely, whether the C4L or TRAD teaching method was used. A complete discussion of the analytical method can be found in [3].
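The idea of using an explanatory variable as a control inside a linear model can be sketched as follows. This is a minimal illustration, not the model used in the C4L Study: the data are simulated, the single covariate `prep`, the assumed treatment effect of 0.3 grade points, and the enrollment rule are all hypothetical, and ordinary least squares via NumPy stands in for whatever statistical software the study used.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000             # hypothetical sample size
TRUE_EFFECT = 0.3     # assumed effect of the reform course, in grade points

# Hypothetical confounder: standardized preparation score.
prep = rng.normal(0.0, 1.0, n)

# Self-selection: better-prepared students choose the reform course more often.
p_enroll = 1.0 / (1.0 + np.exp(-prep))
reform = (rng.random(n) < p_enroll).astype(float)

# Simulated response variable (course grade).
grade = 2.5 + TRUE_EFFECT * reform + 0.5 * prep + rng.normal(0.0, 0.5, n)

# Naive comparison that ignores the explanatory variable.
naive_gap = grade[reform == 1].mean() - grade[reform == 0].mean()

# Linear model: grade = b0 + b1 * reform + b2 * prep + error.
X = np.column_stack([np.ones(n), reform, prep])
coef, *_ = np.linalg.lstsq(X, grade, rcond=None)

# Standard errors and t statistics for each coefficient.
residuals = grade - X @ coef
dof = n - X.shape[1]
sigma2 = residuals @ residuals / dof
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = coef / std_err

print(f"naive gap:          {naive_gap:.3f}")
print(f"adjusted effect b1: {coef[1]:.3f} (t = {t_stats[1]:.1f})")
```

In this sketch the naive gap overstates the assumed effect because preparation is mixed into it, while the coefficient on `reform` compares students at the same level of `prep`. That is the sense in which a linear model can play the role of a matching procedure while still using every observation.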
The set of response variables used in the C4L Study linear models were as follows:
The set of explanatory variables used in the C4L Study linear models were as follows:
Each response variable was modeled as a linear function of the explanatory variables. Statistical tests were performed for the general linear model to determine whether or not the explanatory variables had statistically significant effects on each response variable. The focus of the C4L Study was to draw meaningful conclusions and comparisons by answering questions like the following:
The following summarizes the comparisons and conclusions made in the C4L Study:
These results indicate that not only did C4L students do just as well as their TRAD counterparts in mathematics courses beyond calculus, but a larger number of C4L students went on to do just as well as traditional students in higher mathematics courses.
A recently published study [2] suggests that C4L calculus students appear to be spending more time studying calculus than their traditionally taught counterparts. But, as the results of the C4L longitudinal study suggest, C4L students may indeed reap the rewards of studying more than traditional students, in that they often receive higher grades in calculus. Moreover, C4L students appear not to be adversely affected in their other courses by the increased study time spent on calculus, since the C4L Study found no statistically significant differences between the grade point averages of C4L students and those of traditionally taught students.
Use of Findings
No significant changes in the C4L Program three-semester calculus sequence were made based on the findings of the C4L Study. However, the findings do provide potential implementers of the C4L program and the program directors with useful information regarding the effects of such a radical approach to calculus reform. Concerns regarding the increased study time required of students and the possible detrimental effects on their overall performance in other classes seem to have been effectively addressed. Most faculty might agree that students across the nation do not spend enough time studying calculus, which may be one of the critical factors contributing to attrition and poor performance in calculus.
The results of the qualitative research phase of the C4L assessment/evaluation process have been used to make revisions in curriculum design, the text, and other C4L Program course materials. A critical aspect of the C4L program is a decomposition of each mathematical concept into developmental steps (often referred to as a "genetic decomposition" of a concept) following a Piagetian theory of knowledge, based on observations of students and in-depth interviews with them as they attempt to learn a concept [1]. The results of the qualitative research phase are used to modify and adjust the C4L pedagogical treatment. In particular, the developmental steps proposed by researchers may or may not be modified. For example, the C4L text treatment and genetic decomposition of the concept of the limit of a function at a point was modified, and an alternative treatment was proposed, based on analyses of 25 qualitative student interviews on the concept of limit. When possible, an attempt is made to make meaningful comparisons of the analyses of interviews of both C4L and TRAD students. This was possible with a set of 40 interviews on derivatives and an in-depth analysis of two interview questions on the students' understanding of the derivative as slope. However, whether or not such comparisons can be made, the research results sometimes confirm that the C4L treatment of a particular concept appears to be producing the expected student understanding. In this case no change in the C4L treatment is made, as was the case with the results of analyses of the interview questions on derivative as slope.
To plan and design a longitudinal study like the C4L Study and then carry out the analysis of the results requires numerous hours of planning, brainstorming and consultation with statistical consultants. Such a major study cannot be taken lightly and does require expert advice and counsel.
Regarding qualitative interviews, we will only say here that designing and constructing each interview, including the questions and the guidelines for what interviewers should probe for during an interview, requires detailed and careful planning. Pilot interviews must be carried out and analyzed prior to the completion of the entire set of interviews. A particular theoretical perspective on how students learn mathematical concepts, such as that used in the C4L program, provides a solid foundation on which the interview process can be based. Students are paid $10-$15 for each qualitative interview, each of which lasts between one and two hours. We note that payment is not always necessary, as we have conducted interviews in courses other than calculus for which students volunteered. Transcribing each interview usually requires at least six hours. In the C4L Program, in addition to the researchers doing transcriptions, undergraduate and graduate students are often paid to do transcriptions. The analysis phase of a set of interviews requires many hours from dedicated researchers, not to mention the writing of research papers. The process becomes more time consuming if videotaping is involved. However, the whole interview process, from design to the writing of a research paper, can be a very rewarding experience which contributes to pedagogical design (not to mention the contribution to professional development, and possible tenure and promotion). Once again, we caution that a qualitative research program should not be taken lightly. Such a program requires that the researchers become knowledgeable about qualitative research procedures through the necessary training in order to do a competent job.
[1] Asiala, M., Brown, N., DeVries, D., Dubinsky, E., Mathews, D. and Thomas, K. "A Framework for Research and Development in Undergraduate Mathematics Education," Research in Collegiate Mathematics Education II, CBMS Issues in Mathematics Education, 6, 1996, pp. 1-32.
[2] Mathews, D.M. "Time to Study: The C4L Experience," UME Trends, 7 (4), 1995.
[3] Schwingendorf, K.E., McCabe, G.P., and Kuhn, J. "A Longitudinal Study of the Purdue C4L Calculus Reform Program: Comparisons of C4L and Traditional Students," Research in Collegiate Mathematics Education, CBMS Issues in Mathematics Education, to appear.
*The C4L Study and subsequent research paper could not have been completed without the expert advice and generous time provided by our statistical consultants George and Linda McCabe, and Jonathan Kuhn (Purdue University North Central). The author wishes to thank Professor Kuhn for his insightful comments and suggestions for the completion of this paper.