Overtesting or Overcounting?

Commenting on the Center for American Progress’s (CAP’s) report, Testing Overload in America’s Schools,


…and the Education Writers’ Association coverage of it,


… Some testing opponents have always said there is overtesting, no matter how much testing there has actually been (just as they have always said there is a “growing backlash” against testing). Given limited time, I will examine only one of the claims made in the CAP report:

“… in the Jefferson County school district in Kentucky, which includes Louisville, students in grades 6-8 were tested approximately 20 times throughout the year. Sixteen of these tests were district level assessments.” (p.19)

A check of the Jefferson County school district web site reveals the following: there are no district-developed standardized tests. NONE. All systemwide tests are either state developed or national exams.

Moreover, regular students in grades 6 and 7 take only one test per year – ONE – the K-Prep, though it is a full-battery test (i.e., five core subjects) with only one subject tested per day. (No, each subtest does not take up a whole day; more likely each subtest takes 1-1.5 hours, but slower students are given all morning to finish while the other students study something else in a different room and the afternoon is used for instruction.) So, even if you (I would say, misleadingly) count each subtest as a whole test, the students in grades 6 and 7 take only 5 tests during the year, none of them district tests.

So, is the Center for American Progress lying to us? Depends on how you define it. There is other standardized testing in grades 6 and 7. There is, for example, the “Alternate K-Prep” for those with disabilities, but students without disabilities don’t take it and students with disabilities don’t take the regular K-Prep.

Also there is the “Make-up K-Prep” which is administered to the regular students who were sick during the regular K-Prep administration times. But, students who took the K-Prep during the regular administration do not take the Make-up K-Prep.

There are also the ACCESS for ELLs and Alternate ACCESS for ELLs tests administered in late January and February, but only to English Language Learners. ACCESS is used to help guide the language training and course placement of ELL (or, ESL) students. Only a Scrooge would begrudge the district these tests or count them as “overtesting.”

And, that’s it. To get to 20 tests a year, the CAP had to assume that each and every student took each and every subtest. They even had to assume that the students sick during the regular K-Prep administration were not sick, and that all students who took the regular K-Prep also took the Make-up K-Prep.
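The difference between the two ways of counting can be made concrete with a small sketch. The test names come from the district schedule described above; the group labels, the `Test` class, and the subtest counts for the make-up, alternate, and ACCESS tests are placeholders assumed here for illustration, so the totals are not meant to reproduce the CAP’s figure of 20.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Test:
    name: str
    subtests: int        # assumed counts, for illustration only
    takers: frozenset    # hypothetical student groups that sit this test


# Hypothetical catalog of systemwide tests in grades 6 and 7.
CATALOG = [
    Test("K-Prep", 5, frozenset({"regular"})),
    Test("Make-up K-Prep", 5, frozenset({"absent"})),
    Test("Alternate K-Prep", 5, frozenset({"disability"})),
    Test("ACCESS for ELLs", 1, frozenset({"ell"})),
    Test("Alternate ACCESS for ELLs", 1, frozenset({"ell_disability"})),
]


def sittings_for(groups, count_subtests=False):
    """Count only the tests one student with these group labels actually sits."""
    taken = [t for t in CATALOG if t.takers & set(groups)]
    return sum(t.subtests if count_subtests else 1 for t in taken)


def catalog_total(count_subtests=False):
    """Count every test on the schedule, as if every student sat all of them."""
    return sum(t.subtests if count_subtests else 1 for t in CATALOG)


if __name__ == "__main__":
    print(sittings_for({"regular"}))                       # 1 test
    print(sittings_for({"regular"}, count_subtests=True))  # 5 subtests
    print(catalog_total(count_subtests=True))              # 17 under these assumptions
```

Counting per student gives one test (or five subtests) for a regular student; summing the whole schedule inflates the figure more than threefold, even though no single student sits every test.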

Counting tests in US education has been this way for at least a quarter-century. Those prone to do so goose the numbers any way they plausibly can. A test is given in grade 5 on Tuesday? Count all students in the school as being tested. A DIBELS test takes all of one minute to administer? Count a full class period as lost. A 3-hour ACT has five sub-sections? That counts as five tests. Only a small percentage of schools in the district are sampled to take the National Assessment of Educational Progress in one or two grades? Count all students in all grades in the district as being tested, and count all the subjects tested individually.

Critics have gotten away with this fibbing for so long it has become routine–the standard way to count the amount of testing. And, reporters tend to pass it along as fact.

Richard P. Phelps

Overtesting or Overcounting? was originally published on Nonpartisan Education Blog


Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps

Perhaps it is because I avoid most tabloid journalism that I found journalist Anya Kamenetz’s loose cannon Introduction to The Test: Why our schools are obsessed with standardized testing—but you don’t have to be so jarring. In the space of seven pages, she employs the pejoratives “test obsession”, “test score obsession”, “testing obsession”, “insidious … test creep”, “testing mania”, “endless measurement”, “testing arms race”, “high-stakes madness”, “obsession with metrics”, and “test-obsessed culture”.

Those un-measured words fit tightly alongside assertions that education, or standardized, or high-stakes testing is responsible for numerous harms ranging from stomachaches, stunted spirits, family stress, “undermined” schools, demoralized teachers, and paralyzed public debate, to the Great Recession (pp. 1, 6, 7), which was initially sparked by problems with mortgage-backed financial securities (and parents choose home locations in part based on school average test scores). Oh, and tests are “gutting our country’s future competitiveness,” too (p. 1).

Kamenetz made almost no effort to search for counter evidence[1]: “there’s lots of evidence that these tests are doing harm, and very little in their favor” (p. 13). Among her several sources for information on the relevant research literature are arguably the country’s most prolific proponents of the notion that little to no research exists showing educational benefits to testing.[2] Ergo, why bother to look for it?

Had a journalist covered the legendary feud between the Hatfield and McCoy families, and talked only to the Hatfields, one might expect a surplus of reportage favoring the Hatfields and disfavoring the McCoys, and a deficit of reportage favoring the McCoys and disfavoring the Hatfields.

Looking at tests from any angle, Kamenetz sees only evil. Tests are bad because tests were used to enforce Jim Crow discrimination (p. 63). Tests are bad because some of the first scientists to use intelligence tests were racists (pp. 40-43).

Tests are bad because they employ the statistical tools of latent trait theory and factor analysis—as tens of thousands of social scientists worldwide currently do—but the “eminent paleontologist” Stephen J. Gould doesn’t like them (pp. 46-48). (He argued that if you cannot measure something directly, it doesn’t really exist.) And, by the way, did you know that some of the early 20th-century scientists of intelligence testing were racists? (pp. 48-57)

Tests are bad because of Campbell’s Law: “when a measure becomes a target, it ceases to be a good measure” (p. 5). Such a criticism, if valid, could be used to condemn any measure used evaluatively in any of society’s realms. Forget health and medical studies, sports statistics, Department of Agriculture food monitoring protocols, and ratings by Consumer Reports, Angie’s List, or the Food and Drug Administration. None are “good measures” because they are all targets.

Tests are bad because they are “controlled by a handful of companies” (pp. 5, 81), “The testing company determines the quality of teachers’ performance.” (p. 20), and “tests shift control and authority into the hands of the unregulated testing industry” (p. 75). Such criticisms, if valid, could be used to justify nationalizing all businesses in industries with high scale economies (e.g., there are only four big national wireless telephone companies, so perhaps the federal government should take over), and to outlaw all government contracting. Most of our country’s roads and bridges, for example, are built by private construction firms under contract to local, state, and national government agencies and to those agencies’ specifications, just like most standardized tests; but who believes that those firms control our roads?

Kamenetz swallows education anti-testing dogma whole. She claims that multiple-choice items can only test recall and basic skills (p. 35), that students learn nothing while they are taking tests (p. 15), and that US students are tested more than any others (pp. 15-17, 75)—and they are if you count the way her information sources do—counting at minimum an entire class period for each test administration, even a one-minute DIBELS test; counting all students in all grades of a school as taking a test whenever any students in any grade are taking a test; counting all subtests independently in the US (e.g., each ACT counts as five because it has five subtests) but only the whole tests for other countries; etc.

Standardized testing absorbs way too much money and time, according to Kamenetz. Later in the book, however, she recommends an alternative education universe of fuzzy assessments that, if enacted, would absorb far more time and money.

What are her solutions to the insidious obsessive mania of testing? There is some Rousseau-an fantasizing: all schools should be like her daughter’s happy pre-school where each student learned at his or her own pace (pp. 3-4) and the school’s job was “customizing learning to each student” (p. 8).

Some of the book’s latter half is devoted to “innovative” (of course) solutions that are not quite as innovative as she seems to believe. She is National Public Radio’s “lead digital education reporter” so some interesting new and recent technologies suffuse the recommendations. But, even jazzing up the context, format, and delivery mechanisms with the latest whiz-bang gizmos will not eliminate the problems inherent in her old-new solutions: performance testing, simulations, demonstrations, portfolios, and the like. Like so many Common Core Standards boosters advocating the same “innovations”, she seems unaware that they have been tried in the past, with disastrous results.[3]

As I do not know Ms. Kamenetz personally, I must assume that she is sincere in her beliefs and made her own decisions about what to write. But, if she had naively allowed herself to be wholly misled by those with a vested interest in education establishment doctrine, the end result would have been no different.

The book is a lazily slapped-together rant, unworthy of a journalist. Ironically, however, I agree with Kamenetz on many issues. Like her, I do not much like the assessment components of the old No Child Left Behind Act or the new Common Core Standards. But, my solution would be to repeal both programs, not eliminate standardized testing. Like her, I oppose the US practice of relying on a single proficiency standard for all students (pp. 5, 36). But, my solution would be to employ multiple targets, as most other countries do. She would dump the tests.

Like Kamenetz, I believe it unproductive to devote more than a smidgen of time (at most half a day) to test preparation with test forms and item formats that are separate from subject matter learning. And, like her (p. 194), I am convinced that it does more harm than good. But, she blames the tests and the testing companies for the abomination; in fact, the testing companies prominently and frequently discourage the practice. It is the same testing opponents she has chosen to trust who claim that it works. It serves their argument to claim that non-subject-matter-related test preparation works because, if it were true, it would demonstrate that tests can be gamed with tricks and are invalid measurement instruments.

Like her, I oppose firing teachers based on student test scores, as current value-added measurement (VAM) systems do while there are no consequences for the students. I believe it wrong because too few data points are used and because student effort in such conditions is not reliable, varying by age, gender, socio-economic level, and more. But, I would eliminate the VAM program, or drastically revise it; she would eliminate the tests.

Like Kamenetz, I believe that educators’ cheating on tests is unacceptable, far more common than is publicly known, and should be stopped. I say, stop the cheating. She says, dump the tests.

It defies common sense to have teachers administering high-stakes tests in their own classrooms. Rotating test administration assignments so that teachers do not proctor their own students is easy. Rotating assignments further so that every testing room is proctored by at least two adults is easy, too. So, why aren’t these and other astonishingly easy fixes to test security problems implemented? Note that the education professionals responsible for managing test administrations are often the same people who complain that testing is impossibly unfair.

The sensible solution is to take test administration management out of the hands of those who may welcome test administration fiascos, and hire independent professionals with no conflict of interest. But, like many education insiders, Kamenetz would ban the testing; thereby rewarding those who have mismanaged test administrations, sometimes deliberately, with a vacation from reliable external evaluation.

If she were correct on all these issues—that the testing is the problem in each case—shouldn’t we also eliminate examinations for doctors, lawyers, nurses, and pharmacists (all of which rely overwhelmingly on the multiple-choice format, by the way)?

Our country has a problem. More than in most other countries, our public education system is independent, self-contained, and self-renewing. Education professionals staffing school districts make the hiring, purchasing, and school catchment-area boundary-line decisions. School district boundaries often differ from those of other governmental jurisdictions, confusing the electorate. In many jurisdictions, school officials set the dates for votes on bond issues or school board elections, and can do so to their advantage. Those school officials are trained, and socialized, in graduate schools of education.

A half century ago, most faculty in graduate schools of education may have received their own professional training in core disciplines, such as Psychology, Sociology, or Business Management. Today, most education school faculty are themselves education school graduates, socialized in the prevailing culture. The dominant expertise in schools of education can maintain its dominance by hiring faculty who agree with it and denying tenure to those who stray. The dominant expertise in education journals can control education knowledge by accepting article submissions with agreeable results and rejecting those without.

Even most testing and measurement PhD training programs now reside in education schools, inside the same cultural cocoon.

Standardized testing is one of the few remaining independent tools US society has for holding education professionals accountable to serve the public’s, and not their own, interests. Without valid, reliable, objective external measurement, education professionals can do what they please inside our schools, with our children and our money. When educators are the only arbiters of the quality of their own work, they tend to rate it consistently well.

A substantial portion of The Test’s girth is filled with complaints that tests do not measure most of what students are supposed to or should learn: “It’s math and reading skills, history and science facts that kids are tested and graded on. Emotional, social, moral, spiritual, creative, and physical development all become marginal…” (p. 4). She quotes Daniel Koretz: “These tests can measure only a subset of the goals of education” (p. 14). Several other testing critics are cited making similar claims.

Yet, standards-based tests are developed in a process that takes years, and involves scores of legislators, parents, teachers, and administrators on a variety of decision-making committees. The citizens of a jurisdiction and their representatives choose the content of standards-based tests. They could choose content that Kamenetz and the several other critics she cites prefer, but they don’t.

If the critics are unhappy with test content, they should take their case to the proper authorities, voice their complaints at tedious standards commission hearings, and contribute their time to the rather monotonous work of test framework review committees. I sense that none of that patient effort interests them; instead, they would prefer that all decision-making power be granted to them, ex cathedra, to do as they think best for us.

Moreover, I find some of their assertions about what should be studied and tested rather scary. Our public schools should teach our children emotions, morals, and spirituality?

Likely that prospect would scare most parents, too. But, many parents’ first reaction to a proposal that our schools be allowed to teach their children everything might instead be something like: first show us that you can teach our children to read, write, and compute, then we can discuss further responsibilities.

So long as education insiders insist that we must hand over our money and children and leave them alone to determine—and evaluate—what they do with both, calls for “imploding” the public education system will only grow louder, as they should.

It is bad enough that so many education professors write propaganda, call it research, and deliberately mislead journalists by declaring an absence of countervailing research and researchers. Researchers confident in their arguments and evidence should be unafraid to face opponents and opposing ideas. The researchers Kamenetz trusts do all they can to deny dissenters a hearing.

Another potential independent tool for holding education professionals accountable, in addition to testing, could be an active, skeptical, and inquiring press knowledgeable about education issues and conflicts of interest. Other countries have it. Why are so many US education reporters gullible sycophants?



[1] She did speak with Samuel Casey Carter, the author of No Excuses: Lessons from 21 High-Performing High-Poverty Schools (2000) (pp. 81-84), but chides him for recommending frequent testing without “framing” “the racist origins of standardized testing.” Kamenetz suggests that test scores are almost completely determined by household wealth and dismisses Carter’s explanations as a “mishmash of anecdotal evidence and conservative faith.”

[2] Those sources are Daniel Koretz, Brian Jacob, and the “FairTest” crew. In fact, an enormous research literature revealing large benefits from standardized, high-stakes, and frequent education testing spans a century (Brown, Roediger, & McDaniel, 2014; Larsen & Butler, 2013; Phelps, 2012).

[3] The 1990s witnessed the chaos of the New Standards Project, MSPAP (Maryland), CLAS (California) and KIRIS (Kentucky), dysfunctional programs that, when implemented, were overwhelmingly rejected by citizens, politicians and measurement professionals alike. (Incidentally, some of the same masterminds behind those projects have resurfaced as lead writers for the Common Core Standards.)



Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). Make it stick: The science of successful learning. Cambridge, MA: Belknap Press.

Larsen, D. P., & Butler, A. C. (2013). Test-enhanced learning. In K. Walsh (Ed.), Oxford textbook of medical education (pp. 443–452). Oxford: Oxford University Press. http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2923.2008.03124.x/full

Phelps, R. P. (2012). The effect of testing on student achievement, 1910–2010. International Journal of Testing, 12(1), 21–43. http://www.tandfonline.com/doi/abs/10.1080/15305058.2011.602920


Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps was originally published on Nonpartisan Education Blog
