Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps

Perhaps it is because I avoid most tabloid journalism that I found journalist Anya Kamenetz’s loose cannon Introduction to The Test: Why our schools are obsessed with standardized testing—but you don’t have to be so jarring. In the space of seven pages, she employs the pejoratives “test obsession”, “test score obsession”, “testing obsession”, “insidious … test creep”, “testing mania”, “endless measurement”, “testing arms race”, “high-stakes madness”, “obsession with metrics”, and “test-obsessed culture”.

Those un-measured words fit tightly alongside assertions that education, or standardized, or high-stakes testing is responsible for numerous harms ranging from stomachaches, stunted spirits, family stress, “undermined” schools, demoralized teachers, and paralyzed public debate, to the Great Recession (pp. 1, 6, 7), which was initially sparked by problems with mortgage-backed financial securities (and parents choose home locations in part based on school average test scores). Oh, and tests are “gutting our country’s future competitiveness,” too (p. 1).

Kamenetz made almost no effort to search for counter evidence[1]: “there’s lots of evidence that these tests are doing harm, and very little in their favor” (p. 13). Among her several sources of information on the relevant research literature are arguably the country’s most prolific proponents of the notion that little to no research exists showing educational benefits to testing.[2] Ergo, why bother to look for it?

Had a journalist covered the legendary feud between the Hatfield and McCoy families, and talked only to the Hatfields, one might expect a surplus of reportage favoring the Hatfields and disfavoring the McCoys, and a deficit of reportage favoring the McCoys and disfavoring the Hatfields.

Looking at tests from any angle, Kamenetz sees only evil. Tests are bad because tests were used to enforce Jim Crow discrimination (p. 63). Tests are bad because some of the first scientists to use intelligence tests were racists (pp. 40-43).

Tests are bad because they employ the statistical tools of latent trait theory and factor analysis—as tens of thousands of social scientists worldwide currently do—but the “eminent paleontologist” Stephen J. Gould doesn’t like them (pp. 46-48). (He argued that if you cannot measure something directly, it doesn’t really exist.) And, by the way, did you know that some of the early 20th-century scientists of intelligence testing were racists? (pp. 48-57)

Tests are bad because of Campbell’s Law: “when a measure becomes a target, it ceases to be a good measure” (p. 5). Such a criticism, if valid, could be used to condemn any measure used evaluatively in any of society’s realms. Forget health and medical studies, sports statistics, Department of Agriculture food monitoring protocols, and ratings by Consumer Reports, Angie’s List, and the Food and Drug Administration. None are “good measures” because they are all targets.

Tests are bad because they are “controlled by a handful of companies” (pp. 5, 81), “The testing company determines the quality of teachers’ performance.” (p. 20), and “tests shift control and authority into the hands of the unregulated testing industry” (p. 75). Such criticisms, if valid, could be used to justify nationalizing all businesses in industries with high scale economies (e.g., there are only four big national wireless telephone companies, so perhaps the federal government should take over), and outlawing all government contracting. Most of our country’s roads and bridges, for example, are built by private construction firms under contract to local, state, and national government agencies and to those agencies’ specifications, just like most standardized tests; but who believes that those firms control our roads?

Kamenetz swallows education anti-testing dogma whole. She claims that multiple-choice items can only test recall and basic skills (p. 35), that students learn nothing while they are taking tests (p. 15), and that US students are tested more than any others (pp. 15-17, 75). And they are, if you count the way her information sources do: counting at minimum an entire class period for each test administration, even a one-minute DIBELS test; counting all students in all grades of a school as taking a test whenever any students in any grade are taking a test; counting all subtests independently in the US (e.g., each ACT counts as five because it has five subtests) but only the whole tests for other countries; etc.

Standardized testing absorbs way too much money and time, according to Kamenetz. Later in the book, however, she recommends an alternative education universe of fuzzy assessments that, if enacted, would absorb far more time and money.

What are her solutions to the insidious obsessive mania of testing? There is some Rousseau-an fantasizing: all schools should be like her daughter’s happy pre-school, where each student learned at his or her own pace (pp. 3-4) and the school’s job was “customizing learning to each student” (p. 8).

Some of the book’s latter half is devoted to “innovative” (of course) solutions that are not quite as innovative as she seems to believe. She is National Public Radio’s “lead digital education reporter” so some interesting new and recent technologies suffuse the recommendations. But, even jazzing up the context, format, and delivery mechanisms with the latest whiz-bang gizmos will not eliminate the problems inherent in her old-new solutions: performance testing, simulations, demonstrations, portfolios, and the like. Like so many Common Core Standards boosters advocating the same “innovations”, she seems unaware that they have been tried in the past, with disastrous results.[3]

As I do not know Ms. Kamenetz personally, I must assume that she is sincere in her beliefs and made her own decisions about what to write. But, if she had naively allowed herself to be wholly misled by those with a vested interest in education establishment doctrine, the end result would have been no different.

The book is a lazily slapped-together rant, unworthy of a journalist. Ironically, however, I agree with Kamenetz on many issues. Like her, I do not much like the assessment components of the old No Child Left Behind Act or the new Common Core Standards. But, my solution would be to repeal both programs, not eliminate standardized testing. Like her, I oppose the US practice of relying on a single proficiency standard for all students (pp. 5, 36). But, my solution would be to employ multiple targets, as most other countries do. She would dump the tests.

Like Kamenetz, I believe it unproductive to devote more than a smidgen of time (at most half a day) to test preparation with test forms and item formats that are separate from subject matter learning. And, like her (p. 194), I am convinced that it does more harm than good. But, she blames the tests and the testing companies for the abomination; in fact, the testing companies prominently and frequently discourage the practice. It is the same testing opponents she has chosen to trust who claim that it works. It serves their argument to claim that non-subject-matter-related test preparation works because, if it were true, it would demonstrate that tests can be gamed with tricks and are invalid measurement instruments.

Like her, I oppose firing teachers based on student test scores, as current value-added measurement (VAM) systems do, while there are no consequences for the students. I believe it wrong because too few data points are used and because student effort in such conditions is not reliable, varying by age, gender, socio-economic level, and more. But, I would eliminate the VAM program, or drastically revise it; she would eliminate the tests.

Like Kamenetz, I believe that educators’ cheating on tests is unacceptable, far more common than is publicly known, and should be stopped. I say, stop the cheating. She says, dump the tests.

It defies common sense to have teachers administering high-stakes tests in their own classrooms. Rotating test administration assignments so that teachers do not proctor their own students is easy. Rotating assignments further so that every testing room is proctored by at least two adults is easy, too. So, why aren’t these and other astonishingly easy fixes to test security problems implemented? Note that the education professionals responsible for managing test administrations are often the same people who complain that testing is impossibly unfair.

The sensible solution is to take test administration management out of the hands of those who may welcome test administration fiascos, and hire independent professionals with no conflict of interest. But, like many education insiders, Kamenetz would ban the testing; thereby rewarding those who have mismanaged test administrations, sometimes deliberately, with a vacation from reliable external evaluation.

If she were correct on all these issues—that the testing is the problem in each case—shouldn’t we also eliminate examinations for doctors, lawyers, nurses, and pharmacists (all of which rely overwhelmingly on the multiple-choice format, by the way)?

Our country has a problem. More than in most other countries, our public education system is independent, self-contained, and self-renewing. Education professionals staffing school districts make the hiring, purchasing, and school catchment-area boundary-line decisions. School district boundaries often differ from those of other governmental jurisdictions, confusing the electorate. In many jurisdictions, school officials set the dates for votes on bond issues or school board elections, and can do so to their advantage. Those school officials are trained, and socialized, in graduate schools of education.

A half century ago, most faculty in graduate schools of education may have received their own professional training in core disciplines, such as Psychology, Sociology, or Business Management. Today, most education school faculty are themselves education school graduates, socialized in the prevailing culture. The dominant expertise in schools of education can maintain its dominance by hiring faculty who agree with it and denying tenure to those who stray. The dominant expertise in education journals can control education knowledge by accepting article submissions with agreeable results and rejecting those without.

Even most testing and measurement PhD training programs now reside in education schools, inside the same cultural cocoon.

Standardized testing is one of the few remaining independent tools US society has for holding education professionals accountable to serve the public, and not their own, interests. Without valid, reliable, objective external measurement, education professionals can do what they please inside our schools, with our children and our money. When educators are the only arbiters of the quality of their own work, they tend to rate it consistently well.

A substantial portion of The Test’s girth is filled with complaints that tests do not measure most of what students are supposed to or should learn: “It’s math and reading skills, history and science facts that kids are tested and graded on. Emotional, social, moral, spiritual, creative, and physical development all become marginal…” (p. 4). She quotes Daniel Koretz: “These tests can measure only a subset of the goals of education” (p. 14). Several other testing critics are cited making similar claims.

Yet, standards-based tests are developed in a process that takes years, and involves scores of legislators, parents, teachers, and administrators on a variety of decision-making committees. The citizens of a jurisdiction and their representatives choose the content of standards-based tests. They could choose content that Kamenetz and the several other critics she cites prefer, but they don’t.

If the critics are unhappy with test content, they should take their case to the proper authorities, voice their complaints at tedious standards commission hearings, and contribute their time to the rather monotonous work of test framework review committees. I sense that none of that patient effort interests them; instead, they would prefer that all decision-making power be granted to them, ex cathedra, to do as they think best for us.

Moreover, I find some of their assertions about what should be studied and tested rather scary. Our public schools should teach our children emotions, morals, and spirituality?

Likely that prospect would scare most parents, too. But, many parents’ first reaction to a proposal that our schools be allowed to teach their children everything might instead be something like: first show us that you can teach our children to read, write, and compute, then we can discuss further responsibilities.

So long as education insiders insist that we must hand over our money and children and leave them alone to determine—and evaluate—what they do with both, calls for “imploding” the public education system will only grow louder, as they should.

It is bad enough that so many education professors write propaganda, call it research, and deliberately mislead journalists by declaring an absence of countervailing research and researchers. Researchers confident in their arguments and evidence should be unafraid to face opponents and opposing ideas. The researchers Kamenetz trusts do all they can to deny dissenters a hearing.

Another potential independent tool for holding education professionals accountable, in addition to testing, could be an active, skeptical, and inquiring press knowledgeable of education issues and conflicts of interest. Other countries have it. Why are so many US education reporters gullible sycophants?

 

Endnotes:

[1] She did speak with Samuel Casey Carter, the author of No Excuses: Lessons from 21 High-Performing High-Poverty Schools (2000) (pp. 81-84), but chides him for recommending frequent testing without “framing” “the racist origins of standardized testing.” Kamenetz suggests that test scores are almost completely determined by household wealth and dismisses Carter’s explanations as a “mishmash of anecdotal evidence and conservative faith.”

[2] Those sources are Daniel Koretz, Brian Jacob, and the “FairTest” crew. In fact, an enormous research literature revealing large benefits from standardized, high-stakes, and frequent education testing spans a century (Brown, Roediger, & McDaniel, 2014; Larsen & Butler, 2013; Phelps, 2012).

[3] The 1990s witnessed the chaos of the New Standards Project, MSPAP (Maryland), CLAS (California) and KIRIS (Kentucky), dysfunctional programs that, when implemented, were overwhelmingly rejected by citizens, politicians and measurement professionals alike. (Incidentally, some of the same masterminds behind those projects have resurfaced as lead writers for the Common Core Standards.)

 

References:

Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). Make it stick: The science of successful learning. Cambridge, MA: Belknap Press.

Larsen, D. P., & Butler, A. C. (2013). Test-enhanced learning. In K. Walsh (Ed.), Oxford textbook of medical education (pp. 443–452). Oxford: Oxford University Press. http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2923.2008.03124.x/full

Phelps, R. P. (2012). The effect of testing on student achievement, 1910–2010. International Journal of Testing, 12(1), 21–43. http://www.tandfonline.com/doi/abs/10.1080/15305058.2011.602920

 

Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps was originally published on Nonpartisan Education Blog

Large-scale educational testing in Chile: Some thoughts

Recently in the auditorium of Universidad Finis Terrae, I argued that Chile’s Prueba de Selección Universitaria (PSU) cannot be “fixed” and should be scrapped. I do not, however, advocate the elimination of university entrance examinations but, rather, the creation of a fairer and more informative and transparent examination.

Chile’s pre-2002 system (PAA plus PCEs) may not have been well maintained. But, the basic structure of a general aptitude test strongly correlated with university-level work, along with highly focused content-based tests designed by each faculty, is as close to an ideal university entrance system as one could hope for.

I have perused the decade-long history of the PSU, its funding, and the involvement of international organizations (World Bank, OECD) in shaping its character. Most striking is the pervasive involvement of economists in creating, implementing, and managing the test, and the corresponding lack of involvement of professionals trained in testing and measurement.

In the PSU, World Bank, and OECD documents, the economists advocate one year that the PSU be a high school exit examination (which should be correlated with the high school curriculum), then the next year that it be a university entrance examination (which should be correlated with university work), or that it is meant to monitor the implementation of the new curriculum, or that it is designed to increase opportunities for students from low socioeconomic backgrounds (in fact, it has been decreasing those opportunities). No test can possibly do all that the PSU advocates have promised it will do. The PSU has been sold as a test that can do anything you might like a test to do, and now does nothing well. It is time to bring in a team that genuinely understands how to build a test, and is willing to be open and transparent in all its dealings with the public.

The greatest danger posed by the dysfunctional PSU, I fear, is the bad reputation it gives all tests. Some in Chile have advocated eliminating the SIMCE, which, to my observation, is as well managed as the PSU is poorly managed. The SIMCE gathers information to be used in improving instruction. In theory, a school could be closed due to poor SIMCE scores, but not one ever has been. There are no consequences for students or teachers. Much information about the SIMCE is freely available and more becomes available every month; it is not the “black box” that the PSU is.

It would be a mistake to eliminate all testing because one is badly managed. We need assessments. It is easy to know what you are teaching; but, you can only know what students are learning if you assess.

Richard P. Phelps, US Fulbright Specialist at the Agencia de Calidad de la Educacion and Universidad Finis Terrae in Santiago, editor and co-author of Correcting Fallacies about Educational and Psychological Testing (American Psychological Association, 2008/2009)

Large-scale educational testing in Chile: Some thoughts was originally published on Nonpartisan Education Blog

GAO Could Do More

U.S. GAO Could Do More in Examining Educator Cheating on Tests

The U.S. Government Accountability Office (GAO), a research agency of the U.S. Congress, continues its foray into the field of standardized testing. It started at least as far back as 1993 with a report I wrote on the extent and cost of all systemwide testing in the public schools. Many studies related to school assessment have been completed since, for example, in 1998, 2006, 2006, and 2009.

In the wake of educator cheating scandals in Atlanta, Washington, DC, and elsewhere, the GAO has recently turned its attention to test security (a.k.a. test integrity). For a year or so, they have hosted a web site with the fetching title “Potential State Testing Improprieties”

“…for the purpose of gathering information on fraudulent behavior in state-administered standardized tests. The information submitted here will be used as part of an ongoing GAO investigation into cheating by school officials nationwide, and will be referred to the appropriate State Educational Agency, the U.S. Department of Education, or other agencies, as appropriate.

“Any information provided through this website will be encrypted through our secure server and handled only by authorized staff. GAO will not release any individually identifiable information provided through this website unless compelled by law or required to do so by Congress. Anonymous reports are also welcome. However, providing GAO with as much information as possible allows us to ensure that our investigation is as thorough and efficient as possible. Providing contact information is particularly important to enable clarification or requests for additional information regarding submitted reports.”

I encourage anyone with relevant information to participate, though I would be more encouraging than is the GAO about submitting the information anonymously. In some states, the “State Educational Agency” to which your personal information will be submitted is, indeed, independent, law-abiding, and interested in rooting out corruption; in others, it either does not care much about the issue or itself is an integral part of the corruption.

It would be far better if the information were not submitted to any “educational agencies” but, rather, to state and federal auditors and attorneys general. The problem with education agencies is that they have gotten too comfortable with their own, separate world, with its own elections, funding sources, governance structures, rules and regulations, and ethical code that places a higher priority on the perceived needs of educators than on those of others.

This past week, the GAO released another report with a typically understated title, “K-12 Education: States’ Test Security Policies and Procedures Varied”. Among its findings:

“All states reported including at least 50 percent of the leading practices in test security into their policies and procedures. However, states varied in the extent to which they incorporated certain categories of leading practices. For example, 22 states reported having all of the leading practices for security training, but four states had none of the practices in this category. Despite having reported that they have various leading practices in place to mitigate testing irregularities and prevent cheating, many states reported feeling vulnerable to cheating at some point during the testing process.”

Does one feel better or worse about test security after reading this passage? Is knowing that states are getting their test security policies and procedures at least half right reassuring? Would you trust your life’s savings to a bank that assured you of including at least half of the leading practices in bank security in their policies and procedures?

Though the low percentage may disappoint, I find another aspect of the GAO study more worrisome: it’s entirely about plans and policies and not any actual behavior. In this, the GAO takes its cue from the two associations whose test security checklist it employed in its study: the Council of Chief State School Officers (CCSSO) and the Association of Test Publishers (ATP). Their checklist comprised the “leading practices” to which the GAO refers.

Peruse the list of leading practices, included in the GAO report’s enclosures (starting on p.40) and one may be surprised at how ethereal they are. Schools should have “Procedures to Keep Testing Facilities Secure”, “Rules for Storage of Testing Materials”, “Procedures to Prevent Potential Security Breaches”, and even “Ethical Practices”. There’s no mention of what such rules, procedures, and practices might look like; local schools and districts are free to interpret them as they wish, presuming they even know how. Moreover, there’s no mention of any actual implementation of any of the rules, procedures, and practices; in Ed-speak, test security is about having a test security plan, not actually securing tests.

In a footnote (p.2) the GAO admits “Our survey did not examine state or local implementation of these test security policies.” Even if the GAO had tried to examine state or local implementation, though, what could it have found?

A “leading practice” in the terms of the CCSSO and ATP is not a “practice” at all; it is a plan for practice. That is, it is not about behavior or action, it is about a plan for behavior or action. And even the character of the plan is left to the discretion of the local school or district. Any local school or district with a test security plan in its files can claim that it is following leading practices. As model test security plans are routinely provided by test developers as part of their contract, every local school or district can be a leading test security practitioner by default.

They need not do anything to secure their tests to be a leading test security practitioner.

Borrowing a phrase so often used by the GAO in its review of other government agencies’ work, the GAO “could do more” to study the issue of test security.

GAO Could Do More was originally published on Nonpartisan Education Blog
