

“Get Real!”  Assessing for Quantitative Literacy[*]

 

by Grant Wiggins[†]

 

 


“OK, people, settle down. It’s time to take out some paper and pencil, we’re going to have a pop quiz today in Quant. Lit. 101. Stop the groaning please!  You have 40 minutes only.  As always, you can consult any resource, including other people in the room, but budget your time wisely and note all texts and people consulted for each answer. . . . Here are the questions.”

 

1.     What is the meaning of the phrase “statistical tie” in the sentence “The result of the 2000 election in Florida was a statistical tie, even though George Bush was declared the winner”? Extra credit:  Sketch out a mathematically sound and politically palatable solution to the problem of close elections.

 

2.     Respond to the following claim, made by a student to his geometry teacher:  “Well, you may have proven the theorem today, but we may discover something tomorrow that proves the theorem wrong.”

 

3.     Guesstimate quickly, please:  If you want the most money for your retirement, should you (a) invest $500 per year in an index-based mutual fund from the time you are 16 years old to the time you are 30, or (b) invest  $1,000 per year in a bank savings account from the time you are 25 until you are 65?

 

4.     Is mathematics more like geography (a science of what is really “out there”) or more like chess (whose rules and logical implications we just made up)? Did we “discover” the truth that 1 + 1 = 2, or did we “invent” it?  Based on our work this semester, give two plausible reasons for each perspective.  Then give your own view, with reasons.

 

5.     Study the data on the last 10 years of AIDS cases in the United States from the newspaper clipping in front of you. What are two trends for charting future policy?

 

6.     “At current rates of revenue and payout the Social Security fund will be bankrupt by the time you retire.” Explain how this statement could be both true and false, mathematically speaking, depending on the definitions and assumptions used.

7.     Comment on this proof, please:1

Solve 6x – 10 = 21x – 35 for x.

Solution: 2(3x – 5) = 7(3x – 5)

Therefore 2 = 7

8.     “Hoops” McGinty wants to donate millions of dollars from his salary and sports-drink earnings toward a special exhibit in the new Rose Planetarium area of the American Museum of Natural History in New York.  Hoops wants the exhibit to include a three-dimensional scale model of the solar system in which the size of the planets and the distance of each planet from the sun would be exactly to scale.  There is a catch, however:  the sun is to be represented by a regulation NBA basketball.  The nervous folks in the gifts department of the museum call on you because of your expertise in astronomy and matters of scale. What can you advise them—quickly—about the feasibility of McGinty’s plan? What approach will work best to ensure a basketball-related design in the display?

 

9.     Discuss the following statement, picking a key axiom as an example to support your observations:  “The axioms in any mathematical system may logically precede the theorems, but it does not follow (and indeed is not true historically) that they were all formulated prior in time to the theorems. Axioms are not self-evident truths. They may even sometimes be less obvious than theorems, and formulated late in the game. They are necessary ‘givens’, shaped by what we wish to be able to prove.”

 

10.   Write a memo to the House Education Committee on the accuracy and implications of the following analysis:

 

New York Times, August 13, 2001

Rigid Rules Will Damage School

By Thomas J. Kane and Douglas O. Staiger

As school was about to let out this summer, both houses of Congress voted for a dramatic expansion of the federal role in the education of our children. A committee is at work now to bring the two bills together, but whatever the specific result, the center of the Elementary and Secondary Education Act will be identifying schools that are not raising test scores fast enough to satisfy the federal government and then penalizing or reorganizing them. Once a school has failed to clear the new federal hurdle, the local school district will be required to intervene.

The trouble with this law . . . is that both versions of this bill place far too much emphasis on year-to-year changes in test scores. . . . Because the average elementary school has only 68 children in each grade, a few bright kids one year or a group of rowdy friends the next can cause fluctuations in test performance even if a school is on the right track.

Chance fluctuations are a typical problem in tracking trends, as the federal government itself recognizes in gathering other kinds of statistics. The best way to keep them from causing misinterpretations of the overall picture is to use a large sample. The Department of Labor, for example, tracks the performance of the labor market with a phone survey of 60,000 households each month. Yet now Congress is proposing to track the performance of the typical American elementary school with a sample of students in each grade that is only a thousandth of that size.

With our colleague Jeffrey Geppert of Stanford, we studied the test scores in two states that have done well, investigating how their schools would have fared under the proposed legislation. Between 1994 and 1999, North Carolina and Texas were the envy of the educational world, achieving increases of 2 to 5 percentage points every year in the proportion of their students who were proficient in reading and math. However, the steady progress at the state level masked an uneven, zigzag pattern of improvement at the typical school. Indeed, we estimate that more than 98 percent of the schools in North Carolina and Texas would have failed to live up to the proposed federal expectation in at least one year between 1994 and 1999. At the typical school, two steps forward were often followed by one step back.

More than three-quarters of the schools in North Carolina and Texas would have been required to offer public school options to their students if either version of the new education bill had been in effect. Under the Senate bill a quarter of the schools in both states would have been required to restructure themselves sometime in those five years—by laying off most of their staffs, becoming public charter schools or turning themselves over to private operators. Under the more stringent House bill, roughly three-quarters of the schools would have been required to restructure themselves.

Both bills would be particularly harsh on racially diverse schools. Each school would be expected to achieve not only an increase in test scores for the school as a whole, but increases for each and every racial or ethnic group as well. Because each group’s scores fluctuate depending upon the particular students being tested each year, it is rare to see every group’s performance moving upward in the same year. Black and Latino students are more likely than white students to be enrolled in highly diverse schools, so their schools would be more likely than others to be arbitrarily disrupted by a poorly designed formula. . . .

In their current bills, the House and Senate have set a very high bar—so high that it is likely that virtually all school systems would be found to be inadequate, with many schools failing. And if that happens, the worst schools would be lost in the crowd. The resources and energy required to reform them would probably be dissipated. For these schools, a poorly designed federal rule can be worse than no rule at all.2

 

11.   “It is fair to say that no more cataclysmic event has ever taken place in the history of thought.” Even though we have not read the text from which this quote comes, mathematician Morris Kline was referring to a mid-nineteenth-century development in mathematics. To what was he most likely making such dramatic reference? Why was it so important in the history of thought?
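One way to begin checking the sampling argument in the op-ed quoted in question 10 is a back-of-the-envelope calculation. The sketch below is only an illustration under stated assumptions: it treats each grade as a simple random sample, uses the textbook standard error of a proportion, and assumes a proficiency rate of 50 percent (a hypothetical value chosen because it makes the noise largest). The function name and constants are illustrative; only the figures 68 and 60,000 come from the op-ed.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion, assuming simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


P_PROFICIENT = 0.5   # assumed proficiency rate (hypothetical, not from the op-ed)
N_GRADE = 68         # students per grade in the average elementary school (op-ed)
N_SURVEY = 60_000    # households in the monthly labor survey (op-ed)

# How small is one grade compared with the labor survey?
size_ratio = N_GRADE / N_SURVEY

# Chance variation in a single year's proficiency rate.
se_grade = standard_error(P_PROFICIENT, N_GRADE)
se_survey = standard_error(P_PROFICIENT, N_SURVEY)

# Chance variation in the change between two independent yearly cohorts.
se_change = math.sqrt(2.0) * se_grade

print(f"size ratio:            {size_ratio:.5f}")
print(f"SE, one grade:         {se_grade:.3f}")
print(f"SE, labor survey:      {se_survey:.4f}")
print(f"SE, year-to-year gain: {se_change:.3f}")
```

Under these assumptions a single grade is about one-thousandth the size of the labor survey, its proficiency rate varies by roughly 6 percentage points from chance alone, and the year-to-year change varies by roughly 8 to 9 points. That is larger than the 2 to 5 point annual gains the op-ed reports for the states as a whole. A fuller answer to question 10 would still need to question each of these assumptions.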

 

*      *      *

 

In an essay designed to stimulate thought and discussion on assessing quantitative literacy (QL), why not start with a little concrete provocation:  an attempt to suggest the content of questions such an assessment should contain?  (Later I will suggest why the typical form of mathematics assessment—a “secure” quiz/test/examination—can produce invalid inferences about students’ QL ability, an argument that undercuts the overall value of my quiz, too.)

 

Note that the questions on my quiz relate to the various proposed definitions of QL offered in Mathematics and Democracy:  The Case for Quantitative Literacy (hereafter “case statement”).3  As part of a working definition, the case statement identified 10 overlapping elements of quantitative literacy:

 

A.            Confidence with Mathematics

B.            Cultural Appreciation

C.            Interpreting Data

D.            Logical Thinking

E.             Making Decisions

F.             Mathematics in Context

G.            Number Sense

H.            Practical Skills

I.              Prerequisite Knowledge

J.             Symbol Sense

 

to which I would peg my quiz questions categorically as follows:

 

1.             Statistical Tie                        C, E, F, H

2.             Fragile Proof                         A, D, I

3.             Investment Estimate            E, F, G, H

4.             Discover or Invent               A, B, D, I

5.             AIDS Data                             C, F, G, I

6.             Social Security      A, B, D, E, G, H

7.             Silly Proof                              D, I

8.             Solar System                         C, E, F, G, H

9.             Axioms and Truth                D, I, J

10.           Testing Memo                      C, D, E, F, H

11.           Cataclysmic                          B

 

If we wish for the sake of mental ease to reduce the 10 overlapping elements of quantitative literacy to a few phrases, I would propose two: realistic mathematics in context and mathematics in perspective. Both of these can be summed up by a familiar phrase: quantitative literacy is about mathematical understanding, not merely technical proficiency. Certainly, the call for a more realistic approach to mathematics via the study of numbers in context is at the heart of the case for QL. The importance of context is underscored repeatedly in Mathematics and Democracy,4 and not only in the case statement:

 

In contrast to mathematics, statistics, and most other school subjects, quantitative literacy is inseparable from its context. In this respect it is more like writing than like algebra, more like speaking than like history. Numeracy has no special content of its own, but inherits its content from its context.5

 

. . . mathematics focuses on climbing the ladder of abstraction, while quantitative literacy clings to context. Mathematics asks students to rise above context, while quantitative literacy asks students to stay in context. Mathematics is about general principles that can be applied in a range of contexts; quantitative literacy is about seeing every context through a quantitative lens.6

 

But what exactly is implied here for assessment, despite the surface appeal of the contrast? To assess QL, we need to make the idea of “context” (and “realistic”) concrete and functional. What exactly is a context? In what sense does mathematics “rise above context” while QL asks students to “stay in context”? Does context refer to the content area in which we do QL (as suggested by one of the essays in Mathematics and Democracy7) or does context refer to the conditions under which we are expected to use mathematical abilities in any content area? If QL is “more like writing,” should we conclude that current writing assessments serve as good models for contextualized assessment? Or might not the opposite be the case: the contextual nature of writing is regularly undercut by the canned, bland, and secure one-shot writing prompts used in all large-scale tests of writing? If context is by definition unique, can we ever have standardized tests “in context”? In other words, is “assessing performance in context” a contradiction in terms?

 

What about assessing for mathematics in perspective, our other capsule summary of QL?  As quiz questions 2, 4, 9, and 11 suggest, such an assessment represents a decidedly unorthodox approach to teaching and assessment for grades 10 to 14. Some readers of this essay no doubt reacted to those questions by thinking, “Gee, aren’t those only appropriate for graduate students?”  But such a reaction may only reveal how far we are from understanding how to teach and assess for understanding.  We certainly do not flinch from asking high school students to read and derive important meaning from Shakespeare’s Macbeth, even though our adult hunch might be that students lack the psychological and literary wisdom to “truly” understand what they read. Reflection and meaning making are central to the learning process, even if it takes years to produce significant results. Why should mathematics assessment be any different?

 

In fact, I have often explored questions 4 and 9 on the nature of “givens” and proof with high school mathematics classes, with great results, through such questions as: Which came first: a game or its rules? Can you change the rules and still have it be the same game? Which geometry best describes the space you experience in school and the space on the surface of the earth? Then why is Euclid’s the one we study?  In one tenth-grade class, a student with the worst grades (as I later found out from the surprised teacher) eagerly volunteered to do research on the history of rule changes in his favorite sports, to serve as fodder for the next class discussion on “core” versus changeable rules. (That discussion, coincidentally, led to inquiry into the phrase “spirit versus letter of the law”—a vital idea in United States history—based on the use of that phrase in a ruling made by the president of baseball’s American League in the famous George Brett pine-tar bat incident 20 years ago.)

 

I confess that making mathematics more deliberately meaningful, and then assessing students’ meaning making (as we do in any humanities class), is important to me.  Although some readers sympathetic to the case statement may disagree, they only need sit in mathematics classrooms for a while (as I have done over the past 20 years) to see that too many teachers of mathematics fail to offer students a clear view of what mathematics is and why it matters intellectually.  Is it any accident that student performance on tests is so poor and that so few people take upper-level mathematics courses?

 

Without anchoring mathematics on a foundation of fascinating issues and “big ideas,” there is no intellectual rationale or clear goal for the student.  This problem is embodied in the role of the textbook.  Instead of being a resource in the service of broader and defensible priorities, in mathematics classes the textbook is the course.  I encourage readers to try this simple assessment of the diagnosis: ask any mathematics student midyear, “So, what are the few really big ideas in this course? What are the key questions? Given the mathematics you are currently learning, what does it enable you to do or do better that you could not do without it?”  The answers will not yield mathematics teachers much joy.  By teaching that mathematics is mere unending symbol manipulation, all we do is induce innumeracy.

 

Quiz question 11 interests me the most in this regard because, whether or not I agree with Kline, I would be willing to bet that not more than one in 100 highly educated people know anything about the development in question—even if I were to give the hint of “Bolyai and Lobachevski.” More important, most would be completely taken aback by Kline’s language: how can any development in mathematics be intellectually cataclysmic?  (I can say without exaggeration that I was utterly roused to a life of serious intellectual work by becoming immersed in the controversies and discoveries Kline refers to. I had no idea that mathematics could be so controversial, so thought provoking, so important.)

 

Regardless of my idiosyncratic St. John’s College experience, should not all students consider the meaning of the skills they learn?  That is what a liberal education is all about:  So what? What of it? Why does it matter? What is its value? What is assumed? What are the limits of this “truth”?  These are questions that a student must regularly ask.  In this respect, quantitative literacy is no different from reading literacy: assessment must seek more than just decoding ability.  We need evidence of fluent, thoughtful meaning making, as Peter T. Ewell noted in his interview in Mathematics and Democracy.8

 

Talking about quantitative literacy as part of liberal education may make the problem seem quaint or “academic” in the pejorative sense. The QL case statement is in fact radical, in the colloquial and mathematical sense of that term.  As these opening musings suggest, we need to question the time-honored testing (and teaching) practices currently used in all mathematics classes.  We are forced to return to our very roots—about teaching, about testing, about what mathematics is and why we teach it to nonspecialists—if the manifesto on quantitative literacy is to be realized, not merely praised.

 

The result of students’ endless exposure to typical tests is a profound lack of understanding about what mathematics is: “Perhaps the greatest difficulty in the whole area of mathematics concerns students’ misapprehension of what is actually at stake when they are posed a problem. . . . [S]tudents are nearly always searching for [how] to follow the algorithm. . . . Seeing mathematics as a way of understanding the world . . . is a rare occurrence.”9 Surely this has more to do with enculturation via the demands of school, than with some innate limitation.10

 

Putting it this way at the outset properly alerts readers to a grim truth: this reform is not going to be easy. QL is a Trojan horse, promising great gifts to educators but in fact threatening mainstream testing and grading practices in every discipline, mathematics most of all.  The implications of contextualized and meaningful assessment in QL challenge the very conception of “test” as we understand and employ that term.  Test “items” posed under standardized conditions are decontextualized by design.

 

These issues create a big caveat for those cheery reformers who may be thinking that the solution to quantitative illiteracy is simply to add more performance-based assessments to our repertoire of test items. The need is not for performance tests (also out of context)—most teacher, state, and commercial tests have added some—but for an altogether different approach to assessment.  Specifically, assessment must be designed to cause questioning (not just “plug and chug” responses to arid prompts); to teach (and not just test) which ideas and performances really matter; and to demonstrate what it means to do mathematics.  The case statement challenges us to finally solve the problem highlighted by John Dewey and the progressives (as Cuban notes11), namely, to make school no longer isolated from the world.  Rather, as the case statement makes clear, we want to regularly assess student work with numbers and numerical ideas in the field (or in virtual realities with great verisimilitude).

 

What does such a goal imply? On the surface, the answer is obvious: we need to see evidence of learners’ abilities to use mathematics in a distinctive and complicated situation.  In other words, the challenge is to assess students’ abilities to bring to bear a repertoire of ideas and skills to a specific situation, applied with good judgment and high standards.  In QL, we are after something akin to the “test” faced by youthful soccer players in fluid games after they have learned some discrete moves via drills, or the “test” of the architect trying to make a design idea fit the constraints of property, location, budget, client style, and zoning laws.

 

Few of us can imagine such a system fully blown, never mind construct one.  Our habits and our isolation—from one another, from peer review, from review by the wider world—keep mathematics assessment stuck in its ways.  As with any habit, the results of design mimic the tests we experienced as students. The solution, then, depends on a team design approach, working against clear and obligatory design standards.  In other words, to avoid reinventing only what we know, assessment design needs to become more public and subject to disinterested review—in a word, more professional.

 

This is in fact the chief recommendation for improving mathematics teaching in The Teaching Gap, based on a process used widely in Japanese middle schools.12 I can report that although such an aim may at first seem threatening to academic prerogative, for the past 10 years we have trained many dozens of high school and college faculties to engage in this kind of group design and peer review against design standards, without rancor or remorse.  (Academic freedom does not provide cover for assessment malpractice: a test and the grading of it are not valid simply because a teacher says that they are.)

 

Thus the sweeping reform needed to make QL a reality in school curriculum and assessment is as much about the reinvention of the job description of “teacher” and the norms of the educational workplace as it is about developing new tests.  To honor the case statement is to end the policies and practices that make schooling more like a secretive and austere medieval guild than a profession.13 The change would be welcome; I sketch some possibilities below.

 

 

What We Assess Depends on Why We Assess

 

Any discussion of assessment must begin with the question of purpose and audience: for what purposes, and whose, are we assessing? What are the standards and end results sought, and by whom? What exactly do we seek evidence of, and what should that evidence enable us and the learners to do?

 

These are not simple or inconsequential questions. As I have argued elsewhere, in education we have often sacrificed the primary client (the learner) in the name of accountability.14 Students’ needs too often have been sacrificed to teachers’ need for ease of grading; teachers’ needs as coach too often have been sacrificed to the cost and logistical constraints imposed by audits testing for accountability or admissions.  Rather than being viewed as a key element in ongoing feedback cycles of learning to perform, testing is viewed as something that takes place after each bit of teaching is over to see who got it and who did not, done in the most efficient manner possible, before we move on in the linear syllabus, regardless of results.

 

If there is an axiom at the heart of this argument it is this: assessment should be first and foremost for the learner’s sake, designed and implemented to provide useful feedback to the learner (and teacher-coach) on worthy tasks to make improved performance and ultimate mastery more likely.15 This clearly implies that the assessment must be built on a foundation of realistic tasks, not proxies, and built to be a robust, timely, open, and user-friendly system of feedback and its use. Assessments for other purposes (e.g., to provide efficiently gained scores for ranking decisions, using secure proxies for real performance) would thus have to be perpetually scrutinized to be sure that a secondary purpose does not override the learner’s right to more educative assessment.

 

We understand this in the wider world.  Mathematicians working for the U.S. Census Bureau are paid to work on situated problems on which their performance appraisals depend.  We do not keep testing their mathematical virtuosity, using secure items, to determine whether they get a raise based merely on what they know.  Athletes play many games, under many different conditions, both to test their learning and as an integral part of learning.  I perform in concert once a month with my “retro” rock band the Hazbins to keep learning how to perform (and to feel the joy from doing so); a score from a judge on the fruits of my guitar lessons, in isolated exercises, would have little value for me.  The formal challenge is not an onerous extra exercise but the raison d’être of the enterprise, providing educational focus and incentive.

 

Yet most tests fail to meet this basic criterion, designed as they are for the convenience of scorekeepers, not players. Consider:

       The test is typically unknown until the day of the assessment.

       We do not know how we are doing as we perform.

       Feedback after the performance is neither timely nor user friendly. We wait days, sometimes weeks, to find out how we did; and the results are often presented in terms that do not make sense to the performer or sometimes even to the teacher-coach.

       The test is usually a proxy for genuine performance, justifiable and sensible only to psychometricians.

       The test is designed to be scored quickly, with reliability, whether or not the task has intellectual value or meaning for the performer.

 

In mathematics, the facts are arguably far worse than this dreary general picture suggests. Few tests given today in mathematics classrooms (be they teacher, state, or test-company designed) provide students with performance goals that might provide the incentive to learn or meaning for the discrete facts and skills learned. Typical tests finesse the whole issue of purpose by relying on items that ask for discrete facts or technical skill out of context.  What QL requires (and any truly defensible mathematics program should require), however, is assessment of complex, realistic, meaningful, and creative performance.

 

Whether or not my particular opening quiz questions appeal to you, I hope the point of them is clear:  Evidence of “realistic use,” crucial to QL, requires that students confront challenges like those faced in assessment of reading literacy:  Hmm, what does this mean?  What kind of problem is this? What kind of response is wanted (and how might my answer be problematic)? What is assumed here, and is it a wise assumption? What feedback do I need to seek if I am to know whether I am on the right track?16 Assessment of QL requires tasks that challenge the learner’s judgment, not just exercises that cue the learner.

 

The same holds true for assessing students’ understanding of mathematics in perspective.  Students may be able to prove that there are 180 degrees in any triangle, but it does not follow that they understand what they have done. Can they explain why the proof works? Can they explain why it matters? Can they argue the crucial role played by the parallel postulate in making the theorem possible, the 2000-year controversy about that postulate (and the attempts by many mathematicians to prove or alter it), and the eventual realization growing from that controversy that there could be other geometries, as valid as Euclid’s, in which the 180-degree theorem does not hold true?
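To make the last point concrete, here is a brief worked illustration using the geometry of a sphere. This is offered as a convenient model, not as the specific hyperbolic geometry of Bolyai and Lobachevski; the symbols (angles α, β, γ, triangle area A, sphere radius R) are introduced here for illustration and do not appear in the sources discussed. On a sphere the angle sum of a triangle exceeds 180 degrees by an amount proportional to the triangle’s area (Girard’s theorem):

```latex
% Angle sum of a triangle drawn on a sphere of radius R (Girard's theorem)
\[
  \alpha + \beta + \gamma \;=\; \pi + \frac{A}{R^{2}}
\]
% Example: one vertex at the north pole, two on the equator a quarter turn apart.
% All three angles are right angles, and the triangle covers one-eighth of the sphere,
% so A = 4\pi R^{2}/8 = \pi R^{2}/2.
\[
  \frac{\pi}{2} + \frac{\pi}{2} + \frac{\pi}{2}
  \;=\; \pi + \frac{\pi R^{2}/2}{R^{2}}
  \;=\; \frac{3\pi}{2} \quad (270^{\circ})
\]
```

As the area A shrinks toward zero the sum approaches π, which is one way to see why the 180-degree theorem describes small figures well even on a curved surface.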

 

As it stands now, almost all students graduate from college never knowing of this history, of the existence of other valid geometries, and of the intellectual implications.  In other words, they lack perspective on the Euclidean geometry that they have learned.  When they do not really grasp what an axiom is and why we have it, and how other systems might and do exist, can they really be said to understand geometry at all?

 

What is at stake here is a challenge to a long-standing habit conveyed by a system that is not based on well-thought-out purposes. This custom was perhaps best summarized by Lauren Resnick and David Resnick over 15 years ago: “American students are the most tested but the least examined students in the world.”17 As the case statement and the Resnicks’ remark suggest, what we need is to probe more than quiz, to ask for creative solutions, not merely correct answers.18

 

 

What Is Realistic Assessment and Why Is It Needed?

 

Regardless of the nettlesome questions raised by the call for improved quantitative literacy, one implication for assessment is clear enough:  QL demands evidence of students’ abilities to grapple with realistic or “situated” problems.  But what is unrealistic about most mathematics tests if they have content validity and tap into skills and facts actually needed in mathematics?  The short answer is that typical tests are mere proxies for real performance.  They amount to sideline drills as opposed to playing the game on the field.

 

The aims in the case statement are not new ones.  Consider this enthusiastic report about a modest attempt to change college admissions testing at Harvard a few years back.  Students were asked to perform a set of key physics experiments by themselves and have their high school physics teacher certify the results, while also doing some laboratory work in front of the college’s professors:

 

The change in the physics requirement has been more radical than that in any other subject. . . . For years the college required only such a memory knowledge of physical laws and phenomena as could be got from a . . . textbook. . . . [U]nder the best of circumstances the pupil’s thinking was largely done for him.  By this method of teaching . . . his memory was loaded with facts of which he might or might not have any real understanding, while he did very little real thinking. . . . This was a system of teaching hardly calculated to train his mind, or to awaken an interest in [physics].

 

How different is the present attitude of the college! It now publishes a descriptive list of forty experiments, covering the elementary principles of mechanics, sound, light, heat, and electricity. These, so far as possible, are quantitative experiments; that is, they require careful measurements from which the laws and principles of physics can be reasoned out. Where, for any reason, such measurements are impossible, the experiments are merely illustrative; but even from these the student must reason carefully to arrive at the principles which they illustrate. The student must perform these experiments himself in a laboratory, under the supervision of a teacher. He must keep a record of all his observations and measurements, together with the conclusions which he draws from them. The laboratory book in which this record is kept, bearing the certificate of his instructor, must be presented for critical examination when he comes to [the admissions office]. In addition to this, he is tested by a written paper and by a laboratory examination.19

 

This account was written about Harvard in the Atlantic Monthly—in 1892! We know what happened later, of course. The College Board was invented to make admissions testing more streamlined and standardized (and thereby, it must be said, more equitable for students around the country, as well as less of a hassle for colleges), but at another cost, as it turns out.

 

Although the century-old physics test may not have been situated in a real-world challenge, it was a noble attempt to see if students could actually do science. This is surely where assessment for QL must begin: Can the student do mathematics? Can the student confront inherently messy and situated problems well? That is a different question from “does the student know various mathematical ‘moves’ and facts?”

 

Some folks have regretted or resented my long-time use of the word “authentic” in describing the assessments we need.20 But the phrase remains apt, I think, if readers recall that one meaning of authentic is “realistic.”  Conventional mathematics test questions are not authentic because they do not represent the challenges mathematicians face routinely in their work. As noted above, a mathematics test is more like a series of sideline drills than the challenge of playing the game. In fact, mathematics tests are notoriously unrealistic, the source of unending jokes by laypersons about trains heading toward each other on the same track, and the source of the wider world’s alienation from mathematics.  (Research is needed, I think, to determine whether simplistic test items are so abstracted from the world as to be needlessly hard for all but the symbolically inclined.)

 

How should we define “realistic”?21 An assessment task, problem, or project is realistic if it is faithful to how mathematics is actually practiced when real people are challenged by problems involving numeracy. The task(s) must reflect the ways in which a person’s knowledge and abilities are tested in real-world situations. Such challenges

      ask us to “do” the subject.  Students have to use knowledge and skills wisely and effectively to solve unstructured problems, not simply grind out an algorithm, formula, or number.

      require judgment and innovation.  Instead of merely reciting, restating, or replicating through demonstration the lessons taught and skills learned, students have to explore projects in mathematics, using their repertoire of knowledge and skills.

      reflect the contexts in which adults are tested in the workplace, in civic life, and in personal life.   Contexts involve specific situations that have particular constraints, purposes, and audiences.

      allow appropriate opportunities to rehearse, practice, consult resources, solicit feedback, refine performances, and revise products.   Secrecy, enforced quiet, solitary work, and other artificial constraints imposed by large-scale testing are minimized.

 

Nothing new here.  Benjamin Bloom and his colleagues made the same point almost 50 years ago, in their account of application and synthesis:

 

[S]ituations new to the student or situations containing new elements as compared to the situation in which the abstraction was learned . . . . Ideally we are seeking a problem which will test the extent to which an individual has learned to apply the abstraction in a practical way.22  . . . [A] type of divergent thinking [where] it is unlikely that the right solution to a problem can be set in advance.23

 

In later materials, Bloom and his colleagues characterized synthesis tasks in language that makes clearer what we must do to make the assessment more realistic:

 

The problem, task, or situation involving synthesis should be new or in some way different from those used in instruction. The students . . . may have considerable freedom in redefining it. . . . The student may attack the problem with a variety of references or other available materials as they are needed.  Thus synthesis problems may be open-book examinations, in which the student may use notes, the library, and other resources as appropriate. Ideally synthesis problems should be as close as possible to the situation in which a scholar (or artist, engineer, and so forth) attacks a problem he or she is interested in. The time allowed, conditions of work, and other stipulations, should be as far from the typical, controlled examination situation as possible.24

 

Researcher Fred Newmann and his colleagues at the University of Wisconsin have developed a similar set of standards for judging the authenticity of tasks in assessments and instructional work and have used those standards to study instructional and assessment practices around the country.25 In their view, authentic tasks require: