Testing...Testing...One, Two, Three

Quick: Name a standardized testing company. Unless you are a teacher, principal, or state education official, chances are the one name you'll come up with is the Educational Testing Service (ETS), purveyor of the SAT, the GRE, and an alphabet soup of other post-secondary admissions tests.

But if you're a public-school student in the United States--spending an increasing number of hours hunched over a "bubble sheet" with #2 pencil in hand--your test will likely come from one of a handful of companies in the nearly invisible K-12 standardized testing market. Over 40 states now mandate standardized testing of public-school students, usually in multiple grades, and the Bush administration's education plan calls for testing all public-school students nationwide in grades three to eight. Increasingly, test scores are being used for high-stakes decisions--such as whether students get promoted to the next grade or graduate from high school, or whether teachers and principals get bonuses or keep their jobs.

The current "test-heavy" model of education reform represents the growth of corporate influence on the schools. In many states, business leaders have formed coalitions for the express purpose of reshaping public schools. As John H. Stevens, executive director of the Texas Business and Education Coalition, said at a July 2000 education reform conference, "[E]ducators do not dominate the dialogue on education in Texas. For more than a decade, the business community and a group of key legislative leaders ... have been the major players in shaping state education policy."

These business-led coalitions uniformly advocate more--and higher-stakes--standardized tests. In many states, the business coalitions are pressing for increased public education spending; in their view, more testing and stricter accountability systems are merely tradeoffs that educators must make in return for new monies. Of course, the business agenda for education reform does not end there. In many instances, business leaders also hope to weaken teachers' unions and spur privatization.

But more generally, business leaders and management gurus have been very vocal about the need to apply business-based management techniques to the schools. And in the business model, the need for a lot of testing is obvious. After all, what business could function without "quick and constant measurement of output," as a 1999 Forbes article on "quality control" in the schools put it?

As a result of these initiatives, America's schoolchildren are now being subjected to more and more standardized testing. As the stakes grow higher, we need to turn a spotlight on the K-12 testing industry--and how industry trends are affecting the creation and marketing of the tests themselves.

That's Edutainment

In recent years, the creation and scoring of K-12 tests has become big business. Between 1955 and 1970, sales of standardized tests grew gradually, from $5 million to $25 million (both in 1988 dollars). Since then, sales have increased dramatically, reaching $130 million in 1990 and jumping to $234 million in 2000. Even these figures understate the industry's size. A 1993 Boston College study estimates that the K-12 standardized testing industry may be four to six times larger--perhaps as much as $1.5 billion a year--when scoring and score reporting are included, along with customized tests produced under contract with individual states.

One of the largest corporate players in the testing market is a firm that, for much of its history, had only a tangential relationship to education. Capitalizing on new scanning technology developed at the University of Iowa, National Computer Systems (NCS) was founded as a data processing firm in 1962. Over the next 20 years, NCS became the nation's largest scorer of standardized tests, although it maintained a substantial data-processing business outside of education. NCS began selling some guidance and counseling tests in the mid-1980s, then rapidly became a leader in the overall school testing market, in part by winning large contracts in several states that were adopting customized testing programs. NCS's revenues have grown dramatically, from $35 million in 1980 to nearly $630 million in 1999.

Now, in line with industry trends, NCS has been bought out by a multinational. In 2000, Texas renewed NCS's contract to operate its statewide testing program, a contract worth $233 million over five years. A few months later, Pearson plc, a British media conglomerate, purchased NCS for $2.5 billion. Pearson, like NCS, did not start out in the education business. It owns The Financial Times, The Economist, Penguin Books, Simon & Schuster, and some television production companies; until 2000, it also owned about half of Lazard Bros. U.K. Bank. But with the purchase of NCS, Pearson Education--with about $3 billion in sales in 2000--now accounts for over half of Pearson's overall revenues.

Along with NCS, three other firms dominate the K-12 test market: Harcourt Educational Measurement, CTB/McGraw-Hill, and Riverside Publishing. (Non-profit ETS is also trying to carve out a share of this market with a new for-profit subsidiary, K-12 Works.) Unlike NCS, these firms at least have long histories in educational publishing. Like NCS, however, they have been subject to the vicissitudes of multinational merger-and-acquisition mania. Harcourt General, the latest incarnation of the eminent publishing house Harcourt, Brace, Jovanovich since its 1991 acquisition by General Cinemas, was purchased by British-Dutch scientific publisher Reed Elsevier last July. Around the same time, Riverside's parent company, Houghton Mifflin, was acquired by Vivendi Universal SA, a French media company whose website hails the purchase as an important addition to its "edutainment" business.

Risky Business

As test publishers strive to accommodate the needs of their new owners, they will no doubt be paying closer attention to the bottom line. One impact of the changes in industry structure--both the recent mega-mergers and an earlier round of mergers and acquisitions--is already clear: how, and how well, tests are authored. Most of the classic K-12 tests, such as the Stanford Achievement Test and the Iowa Tests of Basic Skills, were created under university auspices and written by scholars of psychometrics (the study of educational and psychological assessment) and education. Today, test authors are more typically anonymous publishing company employees. According to Professor Walter Haney, who co-authored the 1993 Boston College study, this change is probably responsible for "many more errors, in the tests themselves and in their scoring."

Indeed, test and scoring errors have become practically routine. A New York Times analysis in the spring of 2001 found that, in 16 states, testing contractors had made significant errors in scoring or results analysis. In 1999, scoring errors by CTB/McGraw-Hill affected schools across the country: In New York City, 9,000 students were mistakenly ordered to go to summer school, and principals and district superintendents across the city--along with Schools Chancellor Rudy Crew--lost their jobs; in Nevada, elementary schools were mistakenly labeled "inadequate." In the spring of 2000, thanks to scoring errors by NCS, a number of Minnesota high-school seniors had their diplomas withheld. And last year in Massachusetts, where Harcourt has run the statewide testing program since 2000, students themselves found errors in several questions on the high-stakes Massachusetts Comprehensive Assessment System (MCAS) tests.

The industry's changing structure, along with new marketing practices, has also compromised test publishers' commitment to complying with ethical standards in the field. The Code of Fair Testing Practices, created by major professional organizations in education and psychology, spells out the responsibilities of both those who publish standardized tests and those who use them. For instance, the code prohibits test users from misusing test results, and also requires test publishers to warn their customers against such practices. While the code does not identify specific examples, educators and psychometricians generally agree that using a single test score to make a high-stakes determination--such as high-school graduation--represents misuse. But as a business-based model of accountability comes to dominate education reform, a number of states are beginning to do just that.

In this new climate, test publishers are leaving their ethical responsibilities behind. In the past, when achievement tests were purchased by thousands of individual school districts, test publishers could reasonably claim that they could not police how each district was using test results. Today, testing companies are signing large, multiyear contracts with states to create customized statewide tests, and sometimes the companies are well aware from the start that a state is planning to misuse test results. But instead of refusing to participate in the process, most publishers are willing to put fair-practice standards aside. (An interesting exception occurred in 1987, when ETS--concerned that Texas planned to fire teachers who did not pass a new recertification test--declined to bid for the contract to develop it. Of course, other firms stepped in and bid, and Texas does fire teachers on the basis of the resulting test.)

For test publishers, the fair-testing protocol can be a sensitive public relations problem. Last May, for example, Eugene Paslov, president of Harcourt Educational Measurement, told a group of Massachusetts high-school students, "When these tests are used exclusively for graduation, I think that's wrong." A day later he backtracked, claiming that the high-stakes MCAS provides "multiple measures" of student achievement since students are allowed to retake it--a statement many education experts found laughable. Likewise, Michael H. Kean, vice president for public and government relations at CTB, told Education Week in March 2000 that "high-stakes decisions should not be made on a single measure." In interviews, both Kean and a spokesperson for NCS claimed that, in several instances, their companies have decided not to compete for particular testing contracts because of concerns about standards. However, neither was willing to give specifics. Time will tell whether upper-level managers from Pearson or Vivendi will allow psychometricians in their employ to stick to their scruples--especially if it means missing out on lucrative contracts.

Since the testing industry's actions can have a profound impact on millions of students and educators, it seems reasonable to expect government regulation. But for now, federal regulation is nonexistent. As the 1993 Boston College study points out, "While our society requires product warning labels on...personal deodorants and food coloring, no warning labels are federally required on test instruments that may determine whether someone gains employment or is classified as mentally retarded."

At the state level, efforts to protect the rights of those who purchase and take standardized tests have been largely unsuccessful. Back in the 1970s, a number of states considered "truth-in-testing" legislation, which would have required companies to release test questions and provide information about test development and scoring--but only New York passed a law. Even then, it applied only to post-secondary admissions tests, not K-12 achievement tests, and test publishers have slowed its implementation with lengthy lawsuits. Now that many states have made standardized testing the centerpiece of their education reform programs, it's probably even less likely that any new state regulations will be passed. Ironically, the current move toward high-stakes testing has forced the industry itself (and its client states and school districts) into the courts. In several states, students who have been held back or denied high-school diplomas are suing, and CTB's Kean emphasized that the company puts great stress on "legal defensibility" in its conversations with state education departments and other clients.

Although the "Big Four" still dominate the K-12 testing market, ballooning revenues have attracted plenty of small start-ups. Some have been created by former schoolteachers--dissatisfied with their salaries, perhaps?--while others are the work of MBA marketing types with no apparent ties to education. Consider Advantage Learning Systems (ALS), created in 1986 to market computerized reading comprehension tests matched to popular children's books. (These tests aren't marketed as achievement tests, but in today's test-crazy climate, who knows?) So far, the plan seems to be working beautifully. Forbes gushes that ALS's "handsome [profit] margin--and [its] 61% annual growth rate--makes it a favorite on Wall Street, with a stunning $1.3 billion market capitalization."

But despite their sales successes, some of these start-ups--like their larger competitors--have had serious problems with quality control. Measurement Inc., a Florida-based test scoring company, guaranteed a 99% accuracy rate for scoring the essays that are now common on state achievement tests. Suspicious of that claim, Boston College's Haney analyzed the scoring protocol and says the accuracy rate is closer to 70%, a fact that the company ultimately acknowledged.

Naturally, the current test mania has spawned a huge demand for test-prep materials. Although estimates of its size are hard to come by, this market appears to be growing too--probably as rapidly as the stakes attached to the tests. Bookstore shelves that for years have been weighed down with volumes claiming to raise scores on the SAT, ACT, GRE, GMAT, or LSAT are now also laden with test-prep books for the younger set, down to and including five-year-olds. Other test-prep companies market their workbooks--with titles ranging from No Stress to Buckle Down--directly to schools. Test-prep consultants also sell their services to schools, where they plan test-week pep rallies and teach strategies such as never marking the same letter on three multiple-choice questions in a row.

As with test publishers, the scramble to boost revenues sometimes leads test-prep companies to violate ethical standards. For example, Buckle Down, Inc. sells customized test-prep workbooks--for every test at every grade--in states that have developed their own statewide standardized tests. That sounds great, except that the validity, such as it is, of any standardized test is compromised when students use preparation materials that are virtually the same as the test itself. For example, California Department of Education guidelines prohibit the use of test-prep materials written for a specific test. Buckle Down's California marketing blurb acknowledges this policy, assures customers that it's in compliance--and then, in the same paragraph, touts how closely its materials are pegged to the state's tests!

Who's Picking Up the Tab?

Where is the money for all of these new tests and test-related products coming from? Parents and schools are paying for the test-prep materials, sometimes at the cost of other books and supplies. One teacher in Texas complained that practically all of her school's materials budget is now going to test-prep books.

As for the tests themselves, state governments are increasingly footing the bill. State spending on testing has grown dramatically in recent years: According to two studies cited in a 2001 report by the Education Commission of the States, the total for 2001 was around $400 million. Consider Texas, one of the first states to mandate statewide testing (thanks to Ross Perot) in 1980. In 1990, Texas shifted to a more extensive testing program called the Texas Assessment of Academic Skills (TAAS); now, all students in grades three through eight, plus grade ten, take some TAAS tests. Texas is also implementing standardized end-of-course tests for some high-school classes, including biology, algebra, and U.S. history. (Imagine how that test will limit a U.S. history teacher's curriculum choices!) From third grade on, all second language speakers must take a reading proficiency test every year until they pass. There are alternative versions of the TAAS for special-education students. And on and on. Texas state spending on testing has risen from $19.5 million in fiscal year 1995 to $68.6 million in fiscal year 2001. (Surprisingly, Texas Education Agency officials were unable to provide figures prior to 1995.)

In Massachusetts, a state whose testing program has been widely hailed as a national model, spending on tests has also risen rapidly. The state launched a limited statewide testing program in 1990, and adopted the MCAS test in 1997. Passing the MCAS will become a high-school graduation requirement for the class of 2003, and MCAS scores are already being used to designate underperforming schools. The test was originally intended only for fourth, eighth, and tenth graders. But now, as in Texas, every student in grades three through eight, plus grade ten, takes some MCAS tests. Creating, distributing, and scoring all of these tests has a high price tag. In the early 1990s, the state was spending between $500,000 and $750,000 a year on testing. This rose to $8.3 million in fiscal year 1998, $14.8 million in fiscal year 2000, and $23.2 million (projected) in fiscal year 2002.

Even now, states are spending only a small proportion of their overall education budgets on standardized tests--in the range of 0.5%. Just a few years ago, however, it was closer to 0.1% or 0.2%. Greater spending on assessment requires tradeoffs elsewhere in the education budget. In Texas, for example, between fiscal years 1995 and 2001, spending for adult education was cut by more than half (from $87.3 million to $40.4 million) and professional development dropped by nearly two-thirds (from $28 million to $9.8 million). In that same period, spending on tests more than tripled.

A number of other factors are pushing up testing costs, too. In response to longtime complaints about traditional standardized tests, which were made up entirely of multiple-choice questions, some states are moving toward "performance-based" tests. These tests include open-response questions that require students to write out an answer, whether to a math problem or an essay prompt. Performance-based tests are not necessarily more expensive to develop, but they're far more expensive to score.

The use of test scores for high-stakes decisions is also costly, because it requires greater security measures and duplicate scoring. Even the cost of defending against test-taker lawsuits can be attributed to high stakes--since no one would be suing over test or scoring errors if diplomas and jobs weren't at risk.

Finally, the direct costs of developing, printing, and scoring a test are not its only costs. A number of researchers believe that, to weigh the costs and benefits of standardized tests fairly, total costs ought to include the time teachers and principals spend administering the tests, the time teachers spend conducting test-prep activities, and even the time students spend prepping for and taking the tests. Estimates of these indirect costs of standardized tests vary widely--anywhere from 2 times to 60 times their direct costs.

The Truth About Testing

In any case, the real costs of more standardized testing, especially the high-stakes variety, may be those that are harder to quantify: effects on classroom instruction, student and teacher morale, drop-out rates, and so on. A number of studies, many of them commissioned by the Civil Rights Project at Harvard University, have documented negative impacts in all of these areas. Several studies have focused on Texas, where high-stakes testing has been in place the longest. There, drop-out rates among African-American and Latino students have risen since high-stakes testing began. There is even some evidence that students who pass the TAAS test and graduate actually demonstrate poorer writing skills when they arrive at college than did their peers a few years earlier, before high-stakes testing.

This last finding reflects the way high-stakes tests can compromise classroom instruction. Under pressure to produce rapid increases in scores, some teachers are ditching their normal curricula in favor of week after week of isolated test-prep exercises. Students certainly don't benefit when their teachers set aside quality literature--or even carefully prepared textbooks--in favor of short, out-of-context reading passages followed by strings of multiple-choice questions. But the pressure to collapse instruction into mere test preparation can be intense, and it is no doubt greatest in schools with the lowest test scores, typically those serving low-income communities and communities of color. Testing advocates claim that better, performance-based tests will put a stop to the "dumbing-down" phenomenon. But these tests are expensive; in fact, a few states that tried performance-based tests have already abandoned them because of the cost. As of 1999, only fourteen states were using any performance-based testing beyond a single writing sample.

These are the effects of high-stakes standardized testing that are invisible in the hard-data, number-crunching world of the business roundtables--including the national Business Roundtable--that are now setting the education-reform agenda.

Of course, this is not the first time that a business model has been applied to America's public schools. In the early twentieth century, a great faith in the logic of the assembly line led many educators and politicians to reshape the schools in its image. Schools had to become much more efficient, they reasoned, in order to assimilate millions of new immigrants, provide more years of schooling to each child, and prepare students for the world of work. To this factory model we owe today's large schools with classes segregated by subject and students segregated by "ability." Not coincidentally, that era also witnessed the introduction of standardized testing on a large scale.

Now it's clear that this model has failed. Research shows that small schools are more effective and that interdisciplinary approaches to subject matter and heterogeneous grouping of students can enhance learning. However, today's schools are stuck with a 100-year-old model of which hulking, oversized buildings are only the most visible sign. A century from now, will educators look back just as ruefully on today's business-based reforms?

Although the momentum behind the Bush education plan appeared to slow after September 11, the proposal was back on the House floor as of this writing (mid-December). If it passes, it will require a significant expansion of standardized testing. Even in states that already have mandatory testing, most don't test every child in third through eighth grade, as the Bush plan mandates. This means the firms that supply and score most of these tests will have more work to do, and more money to make; it also means that their activities will have an even greater impact on the educational careers and life chances of millions of people. At a minimum, the industry should be under much closer scrutiny than it is today. Better yet, the United States needs to revisit the business model that places high-stakes standardized testing at the center of education reform.

Resources: Walter M. Haney, George F. Madaus, and Robert Lyons, The Fractured Marketplace for Standardized Testing (Boston: Kluwer, 1993); Education Commission of the States, A Closer Look: State Policy Trends in Three Key Areas of the Bush Education Plan--Testing, Accountability, and School Choice (Denver, 2001); FairTest: The National Center for Fair & Open Testing, 342 Broadway, Cambridge MA 02139, tel (617) 864-4810.

Amy Gluckman
