Surprise! SBAC and CRESST stonewall public records request for their financial records

Say what you will about Achieve, PARCC, Fordham, CCSSO, and NGA (some of the organizations responsible for pushing the Common Core Initiative on us all). But, their financial records are publicly available.

Not so for some other organizations responsible for the same Common Core promotion. The Smarter Balanced Assessment Consortium (SBAC) and the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) have absorbed many millions of taxpayer and foundation dollars over the years. But, their financial records have been hidden inside the vast, nebulous cocoon of the University of California, Los Angeles (UCLA). UCLA’s financial records, of course, are publicly available, but the amounts there are aggregated at a level that subsumes thousands of separate, individual entities.

UCLA is a tax-supported state institution, however, and California has an open records law on the books. After some digging, I located the UCLA office responsible for records requests and wrote to them. Following is a summary of our correspondence to date:

 

July 5, 2017

Greetings:

I hope that you can help me. I have spent a considerable amount of time clicking around in search of financial reports for the Smarter Balanced Assessment Consortium (SBAC) and the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), both “housed” at UCLA (or, until just recently in SBAC’s case). Even after many hours of web searching, I still have no clue as to where these data might be found.

Both organizations are largely publicly funded through federal grants. I would like to obtain revenue and expenditure detail on the order of what a citizen would expect to see in a nonprofit organization’s Form 990. I would be happy to search through a larger data base that contains relevant financial details for all of UCLA, so long as the details for SBAC and CRESST are contained within and separately labeled.

I would like annual records spanning the lifetimes of each organization: SBAC only goes back several years, but CRESST goes back to the 1980s (in its early years, it was called the Center for the Study of Evaluation).

Please tell me what I need to do next.

Thank you for your time and attention.

Best Wishes, Richard Phelps

 

July 6, 2017

RE: Acknowledgement of Public Records Request – PRR # 17-4854

Dear Mr. Phelps:

This letter is to acknowledge your request under the California Public Records Act (CPRA) dated July 5, 2017, herein enclosed. Information Practices (IP) is notifying the appropriate UCLA offices of your request and will identify, review, and release all responsive documents in accordance with relevant law and University policy.

Under the CPRA, Cal. Gov’t Code Section 6253(b), UCLA may charge for reproduction costs and/or programming services. If the cost is anticipated to be greater than $50.00 or the amount you authorized in your original request, we will contact you to confirm your continued interest in receiving the records and your agreement to pay the charges. Payment is due prior to the release of the records.

As required under Cal. Gov’t Code Section 6253, UCLA will respond to your request no later than the close of business on July 14, 2017. Please note, though, that Section 6253 only requires a public agency to make a determination within 10 days as to whether or not a request is seeking records that are publicly disclosable and, if so, to provide the estimated date that the records will be made available. There is no requirement for a public agency to actually supply the records within 10 days of receiving a request, unless the requested records are readily available. Still, UCLA prides itself on always providing all publicly disclosable records in as timely a manner as possible.

Should you have any questions, please contact me at (310) 794-8741 or via email at pahill@finance.ucla.edu and reference the PRR number found above in the subject line.

Sincerely,

Paula Hill

Assistant Manager, Information Practices

 

July 14, 2017

RE: Public Records Request – PRR # 17-4854

Dear Mr. Phelps:

The purpose of this letter is to confirm that UCLA Information Practices (IP) continues to work on your public records request dated July 5, 2017. As allowed pursuant to Cal. Gov’t Code Section 6253(c), we require additional time to respond to your request, due to the following circumstance(s):

The need to search for and collect the requested records from field facilities or other establishments that are separate from the office processing the request.

IP will respond to your request no later than the close of business on July 28, 2017 with an estimated date that responsive documents will be made available.

Should you have any questions, please contact me at (310) 794-8741 or via email at pahill@finance.ucla.edu and reference the PRR number found above in the subject line.

Sincerely,

Paula Hill

Assistant Manager, Information Practices

 

July 28, 2017

Dear Mr. Phelps,

Please know UCLA Information Practices continues to work on your public records request, attached for your reference. I will provide a further response regarding your request no later than August 18, 2017.

Should you have any questions, please contact me at (310) 794-8741 or via email and reference the PRR number found above in the subject line.

Kind regards,

Paula Hill

Assistant Manager

UCLA Information Practices

 

July 29, 2017

Thank you. RP

 

August 18, 2017

Re: Public Records Request – PRR # 17-4854

Dear Mr. Richard Phelps:

UCLA Information Practices (IP) continues to work on your public records request dated July 5, 2017. As required under Cal. Gov’t Code Section 6253, and as noted in our email communication with you on July 28, 2017, we are now able to provide you with the estimated date that responsive documents will be made available to you, which is September 29, 2017.

As the records are still being compiled and/or reviewed, we are not able at this time to provide you with any potential costs, so that information will be furnished in a subsequent communication as soon as it is known.

Should you have any questions, please contact me at (310) 794-8741 or via email at pahill@finance.ucla.edu and reference the PRR number found above in the subject line.

Sincerely,

Paula Hill

Assistant Manager, Information Practices

 

September 29, 2017

Dear Mr. Richard Phelps,

Unfortunately, we must revise the estimated availability date regarding your attached request as the requisite review has not yet been completed. We expect to provide a complete response by November 30, 2017. We apologize for the delay.

Should you have any questions, please contact our office at (310) 794-8741 or via email, and reference the PRR number found above in the subject line.

Best regards,

UCLA Information Practices

 

September 29, 2017

I believe that if you are leaving it up to CRESST and SBAC to voluntarily provide the information, they will not be ready Nov. 30 either. RP

Surprise! SBAC and CRESST stonewall public records request for their financial records was originally published on Nonpartisan Education Blog


Fordham Institute’s pretend research

The Thomas B. Fordham Institute has released a report, Evaluating the Content and Quality of Next Generation Assessments,[i] ostensibly an evaluative comparison of four testing programs: the Common Core-derived SBAC and PARCC, ACT’s Aspire, and the Commonwealth of Massachusetts’ MCAS.[ii] Of course, anyone familiar with Fordham’s past work knew beforehand which tests would win.

This latest Fordham Institute Common Core apologia is not so much research as a caricature of it.

  1. Instead of referencing a wide range of relevant research, Fordham references only friends from inside their echo chamber and others paid by the Common Core’s wealthy benefactors. But, they imply that they have covered a relevant and adequately wide range of sources.
  2. Instead of evaluating tests according to the industry-standard Standards for Educational and Psychological Testing, or any of dozens of other freely available and well-vetted test evaluation standards, guidelines, or protocols used around the world by testing experts, they employ “a brand new methodology” specifically developed for Common Core, for the owners of the Common Core, and paid for by Common Core’s funders.
  3. Instead of suggesting as fact only that which has been rigorously evaluated and accepted as fact by skeptics, the authors continue the practice of Common Core salespeople of attributing benefits to their tests for which no evidence exists.
  4. Instead of addressing any of the many sincere, profound critiques of their work, as confident and responsible researchers would do, the Fordham authors tell their critics to go away—“If you don’t care for the standards…you should probably ignore this study” (p. 4).
  5. Instead of writing in neutral language as real researchers do, the authors adopt the practice of coloring their language as so many Common Core salespeople do, attaching nice-sounding adjectives and adverbs to what serves their interest, and bad-sounding words to what does not.

1.  Common Core’s primary private financier, the Bill & Melinda Gates Foundation, pays the Fordham Institute handsomely to promote the Core and its associated testing programs.[iii] A cursory search through the Gates Foundation web site reveals $3,562,116 granted to Fordham since 2009 expressly for Common Core promotion or “general operating support.”[iv] Gates awarded an additional $653,534 between 2006 and 2009 for forming advocacy networks, which have since been used to push Common Core. All of the remaining Gates-to-Fordham grants listed supported work promoting charter schools in Ohio ($2,596,812), reputedly the nation’s worst.[v]

The other research entities involved in the latest Fordham study either directly or indirectly derive sustenance at the Gates Foundation dinner table:

  • the Human Resources Research Organization (HumRRO),[vi]
  • the Council of Chief State School Officers (CCSSO), co-holder of the Common Core copyright and author of the test evaluation “Criteria,”[vii]
  • the Stanford Center for Opportunity Policy in Education (SCOPE), headed by Linda Darling-Hammond, the chief organizer of one of the federally subsidized Common Core-aligned testing programs, the Smarter Balanced Assessment Consortium (SBAC),[viii] and
  • Student Achievement Partners, the organization that claims to have inspired the Common Core standards.[ix]

The Common Core’s grandees have always hired only their own well-subsidized grantees for evaluations of their products. The Buros Center for Testing at the University of Nebraska has conducted test reviews for decades, publishing many of them in its annual Mental Measurements Yearbook for the entire world to see and critique. Indeed, Buros exists to conduct test reviews, and retains hundreds of the world’s brightest and most independent psychometricians on its reviewer roster. Why did Common Core’s funders not hire genuine professionals from Buros to evaluate PARCC and SBAC? The non-psychometricians at the Fordham Institute would seem a vastly inferior substitute, that is, if the purpose had genuinely been an objective evaluation.

2.  A second reason Fordham’s intentions are suspect rests with their choice of evaluation criteria. The “bible” of North American testing experts is the Standards for Educational and Psychological Testing, jointly produced by the American Psychological Association, National Council on Measurement in Education, and the American Educational Research Association. Fordham did not use it.[x]

Had Fordham compared the tests using the Standards for Educational and Psychological Testing (or any of a number of other widely respected test evaluation standards, guidelines, or protocols[xi]), SBAC and PARCC would have flunked. They have yet to accumulate some of the most basic empirical evidence of reliability, validity, or fairness, and past experience with similar types of assessments suggests they will fail on all three counts.[xii]

Instead, Fordham chose to reference an alternate set of evaluation criteria concocted by the organization that co-owns the Common Core standards and co-sponsored their development (the Council of Chief State School Officers, or CCSSO), drawing on the work of Linda Darling-Hammond’s SCOPE, the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), and a handful of others.[xiii],[xiv] Thus, Fordham compares SBAC and PARCC to other tests according to specifications that were designed for SBAC and PARCC.[xv]

The authors write, “The quality and credibility of an evaluation of this type rests largely on the expertise and judgment of the individuals serving on the review panels” (p. 12). A scan of the names of everyone in decision-making roles, however, reveals that Fordham relied on those they have hired before and whose decisions they could safely predict. Regardless, given the evaluation criteria employed, the outcome was foreordained no matter whom they hired to review, not unlike a rigged election in a dictatorship where voters’ choices are restricted to already-chosen candidates.

Still, PARCC and SBAC might have flunked even if Fordham had compared tests using all 24+ of CCSSO’s “Criteria.” But Fordham chose to compare on only 14 of the criteria.[xvi] And those just happened to be criteria mostly favoring PARCC and SBAC.

Without exception, the Fordham study avoided all the evaluation criteria in these categories:

  • “Meet overall assessment goals and ensure technical quality”,
  • “Yield valuable reports on student progress and performance”,
  • “Adhere to best practices in test administration”, and
  • “State specific criteria”[xvii]

What types of test characteristics can be found in these neglected categories? Test security, providing timely data to inform instruction, validity, reliability, score comparability across years, transparency of test design, requiring involvement of each state’s K-12 educators and institutions of higher education, and more. Other characteristics often claimed for PARCC and SBAC, without evidence, cannot even be found in the CCSSO criteria (e.g., internationally benchmarked, backward mapping from higher education standards, fairness).

Despite what its title suggests, the report does not evaluate the “quality” of the tests; at best, it is an alignment study. And, naturally, one would expect the Common Core consortium tests to be more aligned to the Common Core than other tests. The only evaluative criteria used from the CCSSO Criteria are in the two categories “Align to Standards—English Language Arts” and “Align to Standards—Mathematics” and, even then, only for grades 5 and 8.

Nonetheless, the authors claim, “The methodology used in this study is highly comprehensive” (p. 74).

The authors of the Pioneer Institute’s report How PARCC’s false rigor stunts the academic growth of all students[xviii] recommended strongly against the official adoption of PARCC after an analysis of its test items in reading and writing. They also did not recommend continuing with the current MCAS, which is also based on Common Core’s mediocre standards, chiefly because the quality of the grade 10 MCAS tests in math and ELA has deteriorated over the past seven or so years for reasons that are not yet clear. Rather, they recommended that Massachusetts return to its effective pre-Common Core standards and tests and assign the development and monitoring of the state’s mandated tests to a more responsible agency.

Perhaps the primary conceit of Common Core proponents is that the familiar multiple-choice/short answer/essay standardized tests ignore some, and arguably the better, parts of learning (the deeper, higher, more rigorous, whatever)[xix]. Ironically, it is they—opponents of traditional testing content and formats—who propose that standardized tests measure everything. By contrast, most traditional standardized test advocates do not suggest that standardized tests can or should measure any and all aspects of learning.

Consider this standard from the Linda Darling-Hammond, et al. source document for the CCSSO criteria:

“Research: Conduct sustained research projects to answer a question (including a self-generated question) or solve a problem, narrow or broaden the inquiry when appropriate, and demonstrate understanding of the subject under investigation. Gather relevant information from multiple authoritative print and digital sources, use advanced searches effectively, and assess the strengths and limitations of each source in terms of the specific task, purpose, and audience.”[xx]

Who would oppose this as a learning objective? But, does it make sense as a standardized test component? How does one objectively and fairly measure “sustained research” in the one- or two-minute span of a standardized test question? In PARCC tests, this is simulated by offering students snippets of documentary source material and grading them as having analyzed the problem well if they cite two of those already-made-available sources.

But, that is not how research works. It is hardly the type of deliberation that comes to most people’s mind when they think about “sustained research”. Advocates for traditional standardized testing would argue that standardized tests should be used for what standardized tests do well; “sustained research” should be measured more authentically.

The authors of the aforementioned Pioneer Institute report recommend, as their 7th policy recommendation for Massachusetts:

“Establish a junior/senior-year interdisciplinary research paper requirement as part of the state’s graduation requirements—to be assessed at the local level following state guidelines—to prepare all students for authentic college writing.”[xxi]

PARCC, SBAC, and the Fordham Institute propose that they can validly, reliably, and fairly measure, in a minute or two, the outcome of what is normally a weeks- or months-long project. It is the attempt to measure what cannot be measured well on standardized tests that supposedly makes PARCC and SBAC tests “deeper” than others. In practice, the allegedly deeper parts are the most convoluted and superficial.

Appendix A of the source document for the CCSSO criteria provides three international examples of “high-quality assessments” in Singapore, Australia, and England.[xxiii] None are standardized test components. Rather, all are projects developed over extended periods of time—weeks or months—as part of regular course requirements.

Common Core proponents scoured the globe to locate “international benchmark” examples of the type of convoluted (i.e., “higher”, “deeper”) test questions included in PARCC and SBAC tests. They found none.

3.  The authors continue the Common Core sales tendency of attributing benefits to their tests for which no evidence exists. For example, the Fordham report claims that SBAC and PARCC will:

“make traditional ‘test prep’ ineffective” (p. 8)

“allow students of all abilities, including both at-risk and high-achieving youngsters, to demonstrate what they know and can do” (p. 8)

produce “test scores that more accurately predict students’ readiness for entry-level coursework or training” (p. 11)

“reliably measure the essential skills and knowledge needed … to achieve college and career readiness by the end of high school” (p. 11)

“…accurately measure student progress toward college and career readiness; and provide valid data to inform teaching and learning.” (p. 3)

eliminate the problem of “students … forced to waste time and money on remedial coursework.” (p. 73)

help “educators [who] need and deserve good tests that honor their hard work and give useful feedback, which enables them to improve their craft and boost their students’ success.” (p. 73)

The Fordham Institute has not a shred of evidence to support any of these grandiose claims. They have more in common with carnival fortune-telling than with empirical research. Granted, most of the statements refer to future outcomes, which cannot be known with certainty. But, that just affirms how irresponsible it is to make such claims absent any evidence.

Furthermore, in most cases, past experience suggests just the opposite of what Fordham asserts. Test prep is more, not less, likely to be effective with SBAC and PARCC tests because the test item formats are complex (or, rather, convoluted), introducing more “construct-irrelevant variance”—that is, students will receive lower scores for failing to figure out the formats or to manage the computer operations, even if they know the subject matter of the test. Disadvantaged and at-risk students tend to be the most disadvantaged by complex formatting and new technology.

As for Common Core, SBAC, and PARCC eliminating the “problem of” college remedial courses, that will be accomplished simply by cancelling remedial courses, whether or not they are needed, and by lowering college entry-level course standards to the level of current remedial courses.

4.  When not dismissing or denigrating SBAC and PARCC critiques, the Fordham report evades them, even suggesting that critics should not read it: “If you don’t care for the standards…you should probably ignore this study” (p. 4).

Yet, cynically, in the very first paragraph the authors invoke the name of Sandy Stotsky, one of their most prominent adversaries and a scholar of curriculum and instruction so widely respected that she could easily have become wealthy had she chosen to succumb, as so many others have, to the financial temptations of the Common Core’s profligacy. Stotsky, apparently, authored the Fordham Institute’s “very first study” in 1997. Presumably, the authors of this report drop her name to suggest that they are broad-minded. (It might also suggest that they are now willing to publish anything for a price.)

Tellingly, one will find Stotsky’s name nowhere after the first paragraph. None of her (or anyone else’s) many devastating critiques of the Common Core tests is either mentioned or referenced. Genuine research does not hide or dismiss its critiques; it addresses them.

Ironically, the authors write, “A discussion of [test] qualities, and the types of trade-offs involved in obtaining them, are precisely the kinds of conversations that merit honest debate.” Indeed.

5.  Instead of writing in neutral language as real researchers do, the authors adopt the habit of coloring their language as Common Core salespeople do. They attach nice-sounding adjectives and adverbs to what they like, and bad-sounding words to what they don’t.

For PARCC and SBAC one reads:

“strong content, quality, and rigor”

“stronger tests, which encourage better, broader, richer instruction”

“tests that focus on the essential skills and give clear signals”

“major improvements over the previous generation of state tests”

“complex skills they are assessing.”

“high-quality assessment”

“high-quality assessments”

“high-quality tests”

“high-quality test items”

“high quality and provide meaningful information”

“carefully-crafted tests”

“these tests are tougher”

“more rigorous tests that challenge students more than they have been challenged in the past”

For other tests one reads:

“low-quality assessments poorly aligned with the standards”

“will undermine the content messages of the standards”

“a best-in-class state assessment, the 2014 MCAS, does not measure many of the important competencies that are part of today’s college and career readiness standards”

“have generally focused on low-level skills”

“have given students and parents false signals about the readiness of their children for postsecondary education and the workforce”

Appraising its own work, Fordham writes:

“groundbreaking evaluation”

“meticulously assembled panels”

“highly qualified yet impartial reviewers”

Considering those who have adopted SBAC or PARCC, Fordham writes:

“thankfully, states have taken courageous steps”

“states’ adoption of college and career readiness standards has been a bold step in the right direction.”

“adopting and sticking with high-quality assessments requires courage.”

 

A few other points bear mentioning. The Fordham Institute was granted access to operational SBAC and PARCC test items. Over the course of a few months in 2015, the Pioneer Institute, a strong critic of Common Core, PARCC, and SBAC, appealed for similar access to PARCC items. The convoluted run-around responses from PARCC officials excelled at bureaucratic stonewalling. Despite numerous requests, Pioneer never received access.

The Fordham report claims that PARCC and SBAC are governed by “member states”, whereas ACT Aspire is owned by a private organization. Actually, the Common Core Standards are owned by two private, unelected organizations, the Council of Chief State School Officers and the National Governors’ Association, and only each state’s chief school officer sits on PARCC and SBAC panels. Individual states actually have far more say-so if they adopt ACT Aspire (or their own test) than if they adopt PARCC or SBAC. A state adopts ACT Aspire under the terms of a negotiated, time-limited contract. By contrast, a state or, rather, its chief state school officer, has but one vote among many around the tables at PARCC and SBAC. With ACT Aspire, a state controls the terms of the relationship. With SBAC and PARCC, it does not.[xxiv]

Just so you know, on page 71, Fordham recommends that states eliminate any tests that are not aligned to the Common Core Standards, in the interest of efficiency, supposedly.

In closing, it is only fair to mention the good news in the Fordham report. It promises on page 8, “We at Fordham don’t plan to stay in the test-evaluation business”.

 

[i] Nancy Doorey & Morgan Polikoff. (2016, February). Evaluating the content and quality of next generation assessments. With a Foreword by Amber M. Northern & Michael J. Petrilli. Washington, DC: Thomas B. Fordham Institute. http://edexcellence.net/publications/evaluating-the-content-and-quality-of-next-generation-assessments

[ii] PARCC is the Partnership for Assessment of Readiness for College and Careers; SBAC is the Smarter-Balanced Assessment Consortium; MCAS is the Massachusetts Comprehensive Assessment System; ACT Aspire is not an acronym (though, originally ACT stood for American College Test).

[iii] The reason for inventing a Fordham Institute when a Fordham Foundation already existed may have had something to do with taxes, but it also allows Chester Finn, Jr. and Michael Petrilli to each pay themselves two six-figure salaries instead of just one.

[iv] http://www.gatesfoundation.org/search#q/k=Fordham

[v] See, for example, http://www.ohio.com/news/local/charter-schools-misspend-millions-of-ohio-tax-dollars-as-efforts-to-police-them-are-privatized-1.596318 ; http://www.cleveland.com/metro/index.ssf/2015/03/ohios_charter_schools_ridicule.html ; http://www.dispatch.com/content/stories/local/2014/12/18/kasich-to-revamp-ohio-laws-on-charter-schools.html ; https://www.washingtonpost.com/news/answer-sheet/wp/2015/06/12/troubled-ohio-charter-schools-have-become-a-joke-literally/

[vi] HumRRO has produced many favorable reports for Common Core-related entities, including alignment studies in Kentucky, New York State, California, and Connecticut.

[vii] CCSSO has received 23 grants from the Bill & Melinda Gates Foundation from “2009 and earlier” to 2016 collectively exceeding $100 million. http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=CCSSO

[viii] http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=%22Stanford%20Center%20for%20Opportunity%20Policy%20in%20Education%22

[ix] Student Achievement Partners has received four grants from the Bill & Melinda Gates Foundation from 2012 to 2015 exceeding $13 million. http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=%22Student%20Achievement%20Partners%22

[x] The authors write that the standards they use are “based on” the real Standards. But, that is like saying that Cheez Whiz is based on cheese. Some real cheese might be mixed in there, but it’s not the product’s most distinguishing ingredient.

[xi] (e.g., the International Test Commission’s (ITC) Guidelines for Test Use; the ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores; the ITC Guidelines on the Security of Tests, Examinations, and Other Assessments; the ITC’s International Guidelines on Computer-Based and Internet-Delivered Testing; the European Federation of Psychologists’ Association (EFPA) Test Review Model; the Standards of the Joint Committee on Testing Practices)

[xii] Despite all the adjectives and adverbs implying newness to PARCC and SBAC as “Next Generation Assessment”, it has all been tried before and failed miserably. Indeed, many of the same persons involved in past fiascos are pushing the current one. The allegedly “higher-order”, more “authentic”, performance-based tests administered in Maryland (MSPAP), California (CLAS), and Kentucky (KIRIS) in the 1990s failed because of unreliable scores; volatile test score trends; secrecy of items and forms; an absence of individual scores in some cases; individuals being judged on group work in some cases; large expenditures of time; inconsistent (and some improper) test preparation procedures from school to school; inconsistent grading on open-ended response test items; long delays between administration and release of scores; little feedback for students; and no substantial evidence after several years that education had improved. As one should expect, instruction had changed as test proponents desired, but without empirical gains or perceived improvement in student achievement. Parents, politicians, and measurement professionals alike overwhelmingly rejected these dysfunctional tests.

See, for example, For California: Michael W. Kirst & Christopher Mazzeo, (1997, December). The Rise, Fall, and Rise of State Assessment in California: 1993-96, Phi Delta Kappan, 78(4) Committee on Education and the Workforce, U.S. House of Representatives, One Hundred Fifth Congress, Second Session, (1998, January 21). National Testing: Hearing, Granada Hills, CA. Serial No. 105-74; Representative Steven Baldwin, (1997, October). Comparing assessments and tests. Education Reporter, 141. See also Klein, David. (2003). “A Brief History Of American K-12 Mathematics Education In the 20th Century”, In James M. Royer, (Ed.), Mathematical Cognition, (pp. 175–226). Charlotte, NC: Information Age Publishing. For Kentucky: ACT. (1993). “A study of core course-taking patterns. ACT-tested graduates of 1991-1993 and an investigation of the relationship between Kentucky’s performance-based assessment results and ACT-tested Kentucky graduates of 1992”. Iowa City, IA: Author; Richard Innes. (2003). Education research from a parent’s point of view. Louisville, KY: Author. http://www.eddatafrominnes.com/index.html ; KERA Update. (1999, January). Misinformed, misled, flawed: The legacy of KIRIS, Kentucky’s first experiment. For Maryland: P. H. Hamp, & C. B. Summers. (2002, Fall). “Education.” In P. H. Hamp & C. B. Summers (Eds.), A guide to the issues 2002–2003. Maryland Public Policy Institute, Rockville, MD. http://www.mdpolicy.org/docLib/20051030Education.pdf ; Montgomery County Public Schools. (2002, Feb. 11). “Joint Teachers/Principals Letter Questions MSPAP”, Public Announcement, Rockville, MD. http://www.montgomeryschoolsmd.org/press/index.aspx?pagetype=showrelease&id=644 ; HumRRO. (1998). Linking teacher practice with statewide assessment of education. Alexandria, VA: Author. http://www.humrro.org/corpsite/page/linking-teacher-practice-statewide-assessment-education

[xiii] http://www.ccsso.org/Documents/2014/CCSSO%20Criteria%20for%20High%20Quality%20Assessments%2003242014.pdf

[xiv] A rationale is offered for why they had to develop a brand new set of test evaluation criteria (p. 13). Fordham claims that new criteria were needed so that some criteria could be weighted more heavily than others. But, weights could easily be applied to any criteria, including the tried-and-true, preexisting ones.

[xv] For an extended critique of the CCSSO Criteria employed in the Fordham report, see “Appendix A. Critique of Criteria for Evaluating Common Core-Aligned Assessments” in Mark McQuillan, Richard P. Phelps, & Sandra Stotsky. (2015, October). How PARCC’s false rigor stunts the academic growth of all students. Boston: Pioneer Institute, pp. 62-68. http://pioneerinstitute.org/news/testing-the-tests-why-mcas-is-better-than-parcc/

[xvi] Doorey & Polikoff, p. 14.

[xvii] MCAS bests PARCC and SBAC according to several criteria specific to the Commonwealth, such as the requirements under the current Massachusetts Education Reform Act (MERA) that it serve as a grade 10 high school exit exam, test students in several subject fields (not just ELA and math), and provide specific and timely instructional feedback.

[xviii] McQuillan, M., Phelps, R.P., & Stotsky, S. (2015, October). How PARCC’s false rigor stunts the academic growth of all students. Boston: Pioneer Institute. http://pioneerinstitute.org/news/testing-the-tests-why-mcas-is-better-than-parcc/

[xix] It is perhaps the most enlightening paradox that, among Common Core proponents’ profuse expulsion of superlative adjectives and adverbs advertising their “innovative”, “next generation” research results, the words “deeper” and “higher” mean the same thing.

[xx] The document asserts, “The Common Core State Standards identify a number of areas of knowledge and skills that are clearly so critical for college and career readiness that they should be targeted for inclusion in new assessment systems.” Linda Darling-Hammond, Joan Herman, James Pellegrino, Jamal Abedi, J. Lawrence Aber, Eva Baker, Randy Bennett, Edmund Gordon, Edward Haertel, Kenji Hakuta, Andrew Ho, Robert Lee Linn, P. David Pearson, James Popham, Lauren Resnick, Alan H. Schoenfeld, Richard Shavelson, Lorrie A. Shepard, Lee Shulman, and Claude M. Steele. (2013). Criteria for high-quality assessment. Stanford, CA: Stanford Center for Opportunity Policy in Education; Center for Research on Student Standards and Testing, University of California at Los Angeles; and Learning Sciences Research Institute, University of Illinois at Chicago, p. 7. https://edpolicy.stanford.edu/publications/pubs/847

[xxi] McQuillan, Phelps, & Stotsky, p. 46.

[xxiii] Linda Darling-Hammond, et al., pp. 16-18. https://edpolicy.stanford.edu/publications/pubs/847

[xxiv] For an in-depth discussion of these governance issues, see Peter Wood’s excellent Introduction to Drilling Through the Core, http://www.amazon.com/gp/product/0985208694

Fordham Institute’s pretend research was originally published on Nonpartisan Education Blog


Fordham report predictable, conflicted

On November 17, the Massachusetts Board of Elementary and Secondary Education (BESE) will decide the fate of the Massachusetts Comprehensive Assessment System (MCAS) and the Partnership for Assessment of Readiness for College and Careers (PARCC) in the Bay State. MCAS is homegrown; PARCC is not. Barring unexpected compromises or subterfuges, only one program will survive.

Over the past year, PARCC promoters have released a stream of reports comparing the two testing programs. The latest arrives from the Thomas B. Fordham Institute in the form of a partial evaluation of the content and quality of the 2014 MCAS and PARCC “relative to” the “Criteria for High Quality Assessments”[i] produced by one of the organizations that developed Common Core’s standards; the rest of the report is to be delivered in January, it says.[ii]

PARCC continues to insult our intelligence. The language of the “special report” sent to Mitchell Chester, Commissioner of Elementary and Secondary Education, reads like a legitimate study.[iii] The research it purports to have done even incorporated some processes typically employed in studies with genuine intentions of objectivity.

No such intentions could validly be ascribed to the Fordham report.

First, Common Core’s primary private financier, the Bill & Melinda Gates Foundation, pays the Fordham Institute handsomely to promote the standards and its associated testing programs. A cursory search through the Gates Foundation web site reveals $3,562,116 granted to Fordham since 2009 expressly for Common Core promotion or “general operating support.”[iv] Gates awarded an additional $653,534 between 2006 and 2009 for forming advocacy networks, which have since been used to push Common Core. All of the remaining Gates-to-Fordham grants listed supported work promoting charter schools in Ohio ($2,596,812), reputedly the nation’s worst.[v]

The other research entities involved in the latest Fordham study either directly or indirectly derive sustenance at the Gates Foundation dinner table:

– the Human Resources Research Organization (HumRRO), which will deliver another pro-PARCC report sometime soon,[vi]
– the Council of Chief State School Officers (CCSSO), co-holder of the Common Core copyright and author of the “Criteria,”[vii]
– the Stanford Center for Opportunity Policy in Education (SCOPE), headed by Linda Darling-Hammond, the chief organizer of the other federally subsidized Common Core-aligned testing program, the Smarter Balanced Assessment Consortium (SBAC),[viii] and
– Student Achievement Partners, the organization that claims to have inspired the Common Core standards.[ix]

Fordham acknowledges the pervasive conflicts of interest it claims it faced in locating people to evaluate MCAS versus PARCC. “…it is impossible to find individuals with zero conflicts who are also experts”.[x] But, the statement is false; hundreds, perhaps even thousands, of individuals experienced in “alignment or assessment development studies” were available.[xi] That they were not called reveals Fordham’s preferences.

A second reason Fordham’s intentions are suspect rests with their choice of evaluation criteria. The “bible” of test developers is the Standards for Educational and Psychological Testing, jointly produced by the American Psychological Association, National Council on Measurement in Education, and the American Educational Research Association. Fordham did not use it.

Instead, Fordham chose to reference an alternate set of evaluation criteria concocted by the organization that co-sponsored the development of Common Core’s standards (the Council of Chief State School Officers, or CCSSO), drawing on the work of Linda Darling-Hammond’s SCOPE, the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), and a handful of others. Thus, Fordham compares PARCC to MCAS according to specifications that were designed for PARCC.[xii]

Had Fordham compared MCAS and PARCC using the Standards for Educational and Psychological Testing, MCAS would have passed and PARCC would have flunked. PARCC has not yet accumulated the most basic empirical evidence of reliability, validity, or fairness, and past experience with similar types of assessments suggests it will fail on all three counts.[xiii]

Third, PARCC would have flunked had Fordham compared MCAS and PARCC using all 24+ of CCSSO’s “Criteria.” But Fordham chose to compare on only 15 of the criteria.[xiv] And those just happened to be the criteria favoring PARCC.

Fordham agreed to compare the two tests with respect to their alignment to Common Core-based criteria. With just one exception, the Fordham study avoided all the criteria in the groups “Meet overall assessment goals and ensure technical quality”, “Yield valuable reports on student progress and performance”, “Adhere to best practices in test administration”, and “State specific criteria”.[xv]

Not surprisingly, Fordham’s “memo” favors the Bay State’s adoption of PARCC. However, the authors of How PARCC’s false rigor stunts the academic growth of all students[xvi], released one week before Fordham’s “memo,” recommend strongly against the official adoption of PARCC after an analysis of its test items in reading and writing. They also do not recommend continuing with the current MCAS, which is also based on Common Core’s mediocre standards, chiefly because the quality of the grade 10 MCAS tests in math and ELA has deteriorated in the past seven or so years for reasons that are not yet clear. Rather, they recommend that Massachusetts return to its effective pre-Common Core standards and tests and assign the development and monitoring of the state’s mandated tests to a more responsible agency.

Perhaps the primary conceit of Common Core proponents is that ordinary multiple-choice-predominant standardized tests ignore some, and arguably the better, parts of learning (the deeper, higher, more rigorous, whatever)[xvii]. Ironically, it is they—opponents of traditional testing regimes—who propose that standardized tests measure everything. By contrast, most traditional standardized test advocates do not suggest that standardized tests can or should measure any and all aspects of learning.

Consider this standard from the Linda Darling-Hammond, et al. source document for the CCSSO criteria:

“Research: Conduct sustained research projects to answer a question (including a self-generated question) or solve a problem, narrow or broaden the inquiry when appropriate, and demonstrate understanding of the subject under investigation. Gather relevant information from multiple authoritative print and digital sources, use advanced searches effectively, and assess the strengths and limitations of each source in terms of the specific task, purpose, and audience.”[xviii]

Who would oppose this as a learning objective? But, does it make sense as a standardized test component? How does one objectively and fairly measure “sustained research” in the one- or two-minute span of a standardized test question? In PARCC tests, this is done by offering students snippets of documentary source material and grading them as having analyzed the problem well if they cite two of those already-made-available sources.

But, that is not how research works. It is hardly the type of deliberation that comes to most people’s mind when they think about “sustained research”. Advocates for traditional standardized testing would argue that standardized tests should be used for what standardized tests do well; “sustained research” should be measured more authentically.

The authors of the aforementioned Pioneer Institute report recommend, as their 7th policy recommendation for Massachusetts:

“Establish a junior/senior-year interdisciplinary research paper requirement as part of the state’s graduation requirements—to be assessed at the local level following state guidelines—to prepare all students for authentic college writing.”[xix]

PARCC and the Fordham Institute propose that they can validly, reliably, and fairly measure, in a minute or two, the outcome of what is normally a weeks- or months-long project.[xx] It is the attempt to measure what cannot be measured well on standardized tests that supposedly makes PARCC tests “deeper” than others. In practice, the allegedly deeper parts of PARCC are the most convoluted and superficial.

Appendix A of the source document for the CCSSO criteria provides three international examples of “high-quality assessments” in Singapore, Australia, and England.[xxi] None are standardized test components. Rather, all are projects developed over extended periods of time—weeks or months—as part of regular course requirements.

Common Core proponents scoured the globe to locate “international benchmark” examples of the type of convoluted (i.e., “higher”, “deeper”) test questions included in PARCC and SBAC tests. They found none.

Dr. Richard P. Phelps is editor or author of four books: Correcting Fallacies about Educational and Psychological Testing (APA, 2008/2009); Standardized Testing Primer (Peter Lang, 2007); Defending Standardized Testing (Psychology Press, 2005); and Kill the Messenger (Transaction, 2003, 2005), and founder of the Nonpartisan Education Review (http://nonpartisaneducation.org).

[i] http://www.ccsso.org/Documents/2014/CCSSO%20Criteria%20for%20High%20Quality%20Assessments%2003242014.pdf

[ii] Michael J. Petrilli & Amber M. Northern. (2015, October 30). Memo to Dr. Mitchell Chester, Commissioner of Elementary and Secondary Education, Massachusetts Department of Elementary and Secondary Education. Washington, DC: Thomas B. Fordham Institute. http://edexcellence.net/articles/evaluation-of-the-content-and-quality-of-the-2014-mcas-and-parcc-relative-to-the-ccsso

[iii] Nancy Doorey & Morgan Polikoff. (2015, October). Special report: Evaluation of the Massachusetts Comprehensive Assessment System (MCAS) and the Partnership for the Assessment of Readiness for College and Careers (PARCC). Washington, DC: Thomas B. Fordham Institute. http://edexcellence.net/articles/evaluation-of-the-content-and-quality-of-the-2014-mcas-and-parcc-relative-to-the-ccsso

[iv] http://www.gatesfoundation.org/search#q/k=Fordham

[v] See, for example, http://www.ohio.com/news/local/charter-schools-misspend-millions-of-ohio-tax-dollars-as-efforts-to-police-them-are-privatized-1.596318 ; http://www.cleveland.com/metro/index.ssf/2015/03/ohios_charter_schools_ridicule.html ; http://www.dispatch.com/content/stories/local/2014/12/18/kasich-to-revamp-ohio-laws-on-charter-schools.html ; https://www.washingtonpost.com/news/answer-sheet/wp/2015/06/12/troubled-ohio-charter-schools-have-become-a-joke-literally/

[vi] HumRRO has produced many favorable reports for Common Core-related entities, including alignment studies in Kentucky, New York State, California, and Connecticut.

[vii] CCSSO has received 22 grants from the Bill & Melinda Gates Foundation from “2009 and earlier” to 2015 exceeding $90 million. http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=CCSSO

[viii] http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=%22Stanford%20Center%20for%20Opportunity%20Policy%20in%20Education%22

[ix] Student Achievement Partners has received four grants from the Bill & Melinda Gates Foundation from 2012 to 2015 exceeding $13 million. http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=%22Student%20Achievement%20Partners%22

[x] Doorey & Polikoff, p. 4.

[xi] To cite just one example, the world-renowned Center for Educational Measurement at the University of Massachusetts-Amherst has accumulated abundant experience conducting alignment studies.

[xii] For an extended critique of the CCSSO criteria employed in the Fordham report, see “Appendix A. Critique of Criteria for Evaluating Common Core-Aligned Assessments” in Mark McQuillan, Richard P. Phelps, & Sandra Stotsky. (2015, October). How PARCC’s false rigor stunts the academic growth of all students. Boston: Pioneer Institute, pp. 62-68. http://pioneerinstitute.org/news/testing-the-tests-why-mcas-is-better-than-parcc/

[xiii] Despite all the adjectives and adverbs implying newness to PARCC and SBAC as “Next Generation Assessment”, it has all been tried before and failed miserably. Indeed, many of the same persons involved in past fiascos are pushing the current one. The allegedly “higher-order”, more “authentic”, performance-based tests administered in Maryland (MSPAP), California (CLAS), and Kentucky (KIRIS) in the 1990s failed because of unreliable scores; volatile test score trends; secrecy of items and forms; an absence of individual scores in some cases; individuals being judged on group work in some cases; large expenditures of time; inconsistent (and some improper) test preparation procedures from school to school; inconsistent grading on open-ended response test items; long delays between administration and release of scores; little feedback for students; and no substantial evidence after several years that education had improved. As one should expect, instruction had changed as test proponents desired, but without empirical gains or perceived improvement in student achievement. Parents, politicians, and measurement professionals alike overwhelmingly rejected these dysfunctional tests.

See, for example, For California: Michael W. Kirst & Christopher Mazzeo, (1997, December). The Rise, Fall, and Rise of State Assessment in California: 1993-96, Phi Delta Kappan, 78(4) Committee on Education and the Workforce, U.S. House of Representatives, One Hundred Fifth Congress, Second Session, (1998, January 21). National Testing: Hearing, Granada Hills, CA. Serial No. 105-74; Representative Steven Baldwin, (1997, October). Comparing assessments and tests. Education Reporter, 141. See also Klein, David. (2003). “A Brief History Of American K-12 Mathematics Education In the 20th Century”, In James M. Royer, (Ed.), Mathematical Cognition, (pp. 175–226). Charlotte, NC: Information Age Publishing. For Kentucky: ACT. (1993). “A study of core course-taking patterns. ACT-tested graduates of 1991-1993 and an investigation of the relationship between Kentucky’s performance-based assessment results and ACT-tested Kentucky graduates of 1992”. Iowa City, IA: Author; Richard Innes. (2003). Education research from a parent’s point of view. Louisville, KY: Author. http://www.eddatafrominnes.com/index.html ; KERA Update. (1999, January). Misinformed, misled, flawed: The legacy of KIRIS, Kentucky’s first experiment. For Maryland: P. H. Hamp, & C. B. Summers. (2002, Fall). “Education.” In P. H. Hamp & C. B. Summers (Eds.), A guide to the issues 2002–2003. Maryland Public Policy Institute, Rockville, MD. http://www.mdpolicy.org/docLib/20051030Education.pdf ; Montgomery County Public Schools. (2002, Feb. 11). “Joint Teachers/Principals Letter Questions MSPAP”, Public Announcement, Rockville, MD. http://www.montgomeryschoolsmd.org/press/index.aspx?pagetype=showrelease&id=644 ; HumRRO. (1998). Linking teacher practice with statewide assessment of education. Alexandria, VA: Author. http://www.humrro.org/corpsite/page/linking-teacher-practice-statewide-assessment-education

[xiv] Doorey & Polikoff, p. 23.

[xv] MCAS bests PARCC according to several criteria specific to the Commonwealth, such as the requirements under the current Massachusetts Education Reform Act (MERA) that it serve as a grade 10 high school exit exam, test students in several subject fields (not just ELA and math), and provide specific and timely instructional feedback.

[xvi] McQuillan, M., Phelps, R.P., & Stotsky, S. (2015, October). How PARCC’s false rigor stunts the academic growth of all students. Boston: Pioneer Institute. http://pioneerinstitute.org/news/testing-the-tests-why-mcas-is-better-than-parcc/

[xvii] It is perhaps the most enlightening paradox that, among Common Core proponents’ profuse expulsion of superlative adjectives and adverbs advertising their “innovative”, “next generation” research results, the words “deeper” and “higher” mean the same thing.

[xviii] The document asserts, “The Common Core State Standards identify a number of areas of knowledge and skills that are clearly so critical for college and career readiness that they should be targeted for inclusion in new assessment systems.” Linda Darling-Hammond, Joan Herman, James Pellegrino, Jamal Abedi, J. Lawrence Aber, Eva Baker, Randy Bennett, Edmund Gordon, Edward Haertel, Kenji Hakuta, Andrew Ho, Robert Lee Linn, P. David Pearson, James Popham, Lauren Resnick, Alan H. Schoenfeld, Richard Shavelson, Lorrie A. Shepard, Lee Shulman, and Claude M. Steele. (2013). Criteria for high-quality assessment. Stanford, CA: Stanford Center for Opportunity Policy in Education; Center for Research on Student Standards and Testing, University of California at Los Angeles; and Learning Sciences Research Institute, University of Illinois at Chicago, p. 7. https://edpolicy.stanford.edu/publications/pubs/847

[xix] McQuillan, Phelps, & Stotsky, p. 46.

[xxi] Linda Darling-Hammond, et al., pp. 16-18. https://edpolicy.stanford.edu/publications/pubs/847

Fordham report predictable, conflicted was originally published on Nonpartisan Education Blog


Beware of Test Scores Masquerading as Data

A semi-taboo area of insufficient discussion is the reliability of the test score data from statewide, nationwide, and international standardized tests; for example, our National Assessment of Educational Progress (NAEP), though by no means only the NAEP test scores. You can learn about all of the reliability issues from experts like Richard Phelps and Richard Innes.

I have frequently raised concerns about test score data generated by exams that don’t impact the students who take them; that is, where a poor effort by a student does not adversely affect that student. The norm for most national, international, and some statewide standardized testing is that the students taking the tests have no incentive to give their top effort. NAEP, the so-called nation’s report card, is among the no-stakes-for-the-students tests. Expressing concern about that data-reliability issue in an e-mail or a conversation nearly always yields no response, or a vague, dismissive one; something approaching ‘emperor has no clothes’ proportions.

The discovery that prompted this blog was Richard Phelps’ pronouncement that:

“Indeed, one drawback to the standardized student tests with no stakes for the students is that student effort does not just vary, it varies differently by demographic sub-group. The economists who like to use such scores for measuring school and teacher value-added just assume away these and other critical flaws.”

So, while such test scores might be broadly accurate (more substantive persuasion, please), they may be just numbers masquerading as data for some of the uses to which they have been put. And it’s another reason to question the current system’s extensive reliance on top-down-only accountability to formal authority, which must be based on objective apples-to-apples comparisons. We need robust, universal parental school choice to exploit subjective, bottom-up accountability to clients; that is, to employ a mix of top-down and bottom-up accountability to manage a system of diverse children and educators.

I’m willing to rely on NAEP and PISA test score data (etc.), with some reservations and reticence, because the data are consistent with the high-stakes data and other indicators of school system effectiveness, and with established economic theory. But the no-stakes-for-the-students test score issue needs a lot more study and discussion.

Richard Phelps – http://richardphelps.net/Phelps_primer.pdf

Richard Innes – http://educationblog.ncpa.org/wp-content/uploads/2013/06/JSC-NAEP-Data-Issues-from-Innes-for-V-6-2.pdf

emperor has no clothes – http://en.wikipedia.org/wiki/The_Emperor%27s_New_Clothes

pronouncement – http://nonpartisaneducation.org/blog1/2015/02/24/no-child-left-behind-renewal-blinders-on-the-education-policy-horse/

PISA – http://nces.ed.gov/surveys/pisa/

Beware of Test Scores Masquerading as Data was originally published on Nonpartisan Education Blog


Using middle schoolers for anti-testing advocacy?

Superintendent Mark D. LaRoach
Vestal School District, New York

Dear Superintendent LaRoach:

I conduct research on the effects of standardized testing on student achievement. I have read over three thousand studies dating back a century and spanning more than thirty countries. The results have been rather astounding: on average, a very strong positive effect. These results have been corroborated by hundreds of recent studies by cognitive psychologists.

Given the rabid hatred of standardized testing among many inside US public education, however, I have gotten used to routine demonizing of me and my work by education professors and advocates … but from middle schoolers?

Would you please verify for me that the messages below indeed came from Vestal middle schoolers? I would also be interested in your perspective on this use of both public infrastructure (the email messages were sent from your server) and middle schoolers themselves for political advocacy.

Best Wishes, Richard P. Phelps

http://richardphelps.net
http://nonpartisaneducation.org

_______________________________________________________________

Student EMMA MACDONALD
Jan 23 at 9:03 AM

Dear Mr. Phelps:

Imagine this: you’re sitting in your homeroom, anxiously waiting to get the test over with. How do you feel? Most students feel sick and tired, which just makes it more nerve-wracking than it already is. You don’t want students to feel so nervous that they vomit and have to take the test in it, do you? Would you? Most students don’t even want to go to school everyday, so why make them dread it more? Even the kids who are sick those days have to make it up, unfair. Teachers don’t like them either; they just sit staring at the room, watching kids suffer through these terrible standardized tests. A lot of people would agree with me that you should stop standardized testing.

On the Program for International Student Achievement, the United States slipped from 18th to 31st place in 2009. We want the US to be educated even without the tests. Did you know that 50-80% of year-over-year test score improvements were temporary and caused by fluctuations that had nothing to do with long term changes in learning. They should be permanent! Also, 44% of school districts had reduced the time spent on science, social studies and the arts by an average of 145 minutes per week in order to focus on reading and math. Other subjects are important too. Do you know how these tests make students feel? Standardized testing causes severe stress in younger students. That is very unhealthy for them. Some excessive testing may teach children to be good at taking tests, but they don’t prepare them for productive adult lives. We should prepare them to be productive adults.

The schools that are feeling the pressure of NCLB (No Child Left Behind)’s proficiency requirement are “gaming the system” to raise test scores, also known as cheating. It is unfair to students for the schools to cheat because not all of them do. People say things that people believe in to get on their side. Gerald W. Bracey says that, “qualities that standardized tests cannot measure are creativity, critical thinking, honesty, and so on”. Some students want that to be measured. Gregory J. Cizek says, “Anecdotes abound illustrating how testing…produces gripping anxiety in even the brightest students, and makes young children vomit or cry, or both”. That is pure torture to students. The low-performing students are encouraged to stay home. This isn’t fair to those high-performing students to take the test.

They say that most students believe that standardized tests are fair. Honestly, not one student or teacher I know thinks that the standardized tests are fair. This is because you have to sit still for over an hour taking a test that is really boring for most students. Therefore, I believe that standardized tests are not fair.

Now, you’ve read the whole email, what do you think about standardized tests? You should be thinking that you should really eliminate them. Some students vomit and have to take a vomit-covered test, gross. Please make these silly tests come to an end.

Thank you for your time,
Emma MacDonald

Emma MacDonald
Vestal Middle School
600 South Benita Blvd.
Vestal, NY 13850

________________________________________________________________

Student EMILIA CAPPELLETT
Jan 26 at 9:03 AM

Dear Superintendent Phelps,

“U.S. students slipped from 18th in the world in 2000 to 31st place in 2009, with a similar decline in science and no change in reading.” Schools have spent way to much time not focusing all of the important subjects in schools and focusing on just one or two subjects, from standardized tests instead of focusing on studies and curriculum. Standardized tests hmmm, standardized tests what do I think of them… Some people think they are scary or nervous they get so extreme with nerve racking tests and “they’re life is depending on it” tests it gets out of hand. Sacrameto Bee reported that ‘test related jitters especially young students, are so common that the Stanford 9 exam comes with instructions on what to do with a test booklet in case a student vomits on it.

Standardized tests have no point to them it is a joke for having kids take them it is only help for teachers to judge the student and make them look. Even teachers are being pressured though because if the children d bad on the tests it is crucial for them. Schools feeling the pressure of the NCLB’s 100% proficiency requirement are “gaming the system” to raise test scores.

They say that standardized tests have a “positive effect” on student achievement. But actually you are telling lies. Students believe they are not productive and improvements from these tests are rare. Because based on a “study published by the Brookings institution found that 50-80% of year-over-year test score improvements were temporary. Therefore standardized tests should not be published and children should not be able to take them.

School testing to kids are things that they despise and are always worrying about failing them and not passing, or getting put in workshop classes and not living up to their parents expectations don’t put pressure on kids stop standardized tests. Just stop for a minute and think what you are doing to kids all across the world.

Thank you for your time,

Emilia Cappellett

Vestal Middle School
600 South Benita Blvd.
Vestal, NY 13850

Using middle schoolers for anti-testing advocacy? was originally published on Nonpartisan Education Blog

Overtesting or Overcounting?

Commenting on the Center for American Progress’s (CAP’s) report, Testing Overload in America’s Schools,

https://www.americanprogress.org/issues/education/report/2014/10/16/99073/testing-overload-in-americas-schools/

…and the Education Writers’ Association coverage of it,

http://www.ewa.org/blog-ed-beat/how-much-time-do-students-spend-taking-tests

… Some testing opponents have always said there is overtesting, no matter how much testing there actually is (just as they have always said there is a “growing backlash” against testing). Given limited time, I will examine only one of the claims made in the CAP report:

“… in the Jefferson County school district in Kentucky, which includes Louisville, students in grades 6-8 were tested approximately 20 times throughout the year. Sixteen of these tests were district level assessments.” (p.19)

A check of the Jefferson County school district web site –

http://www.jefferson.k12.ky.us/Departments/testingunit/FORMS/1415JCPSSYSWIDEASCAL.pdf

reveals the following: there are no district-developed standardized tests – NONE. All systemwide tests are either state-developed or national exams.

Moreover, regular students in grades 6 and 7 take only one test per year – ONE – the K-Prep, though it is a full-battery test (i.e., five core subjects) with only one subject tested per day. (No, each subtest does not take up a whole day; more likely each subtest takes 1-1.5 hours, but slower students are given all morning to finish while the other students study something else in a different room and the afternoon is used for instruction.) So, even if you (I would say, misleadingly) count each subtest as a whole test, the students in grades 6 and 7 take only 5 tests during the year, none of them district tests.

So, is the Center for American Progress lying to us? Depends on how you define it. There is other standardized testing in grades 6 and 7. There is, for example, the “Alternate K-Prep” for those with disabilities, but students without disabilities don’t take it and students with disabilities don’t take the regular K-Prep.

Also there is the “Make-up K-Prep” which is administered to the regular students who were sick during the regular K-Prep administration times. But, students who took the K-Prep during the regular administration do not take the Make-up K-Prep.

There are also the ACCESS for ELLs and Alternate ACCESS for ELLs tests, administered in late January and February, but only to English Language Learners. ACCESS is used to help guide the language training and course placement of ELL (or ESL) students. Only a Scrooge would begrudge the district these tests by counting them as “overtesting.”

And, that’s it. To get to 20 tests a year, the CAP had to assume that each and every student took each and every subtest. They even had to assume that the students sick during the regular K-Prep administration were not sick, and that all students who took the regular K-Prep also took the Make-up K-Prep.
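
To make the counting arithmetic concrete, here is a minimal sketch in Python. The figures are illustrative assumptions on my part (the subtest counts for the alternate and ELL assessments are placeholders, not CAP’s actual worksheet); the point is only how different counting rules turn one test administration into a much larger number.

```python
# A minimal sketch with illustrative, assumed figures (not CAP's actual tally):
# how the same Jefferson County schedule can be counted three different ways.

# What a regular 6th- or 7th-grader actually sits for: one K-Prep
# administration covering five core subjects.
KPREP_SUBTESTS = 5

# Other systemwide administrations the district schedules, none of which a
# regular student takes in addition to the regular K-Prep. Subtest counts
# here are placeholders for illustration only.
other_administrations = {
    "Alternate K-Prep (students with disabilities only)": 5,
    "Make-up K-Prep (only students who missed the regular sitting)": 5,
    "ACCESS for ELLs (English Language Learners only)": 4,
    "Alternate ACCESS for ELLs (ELLs with disabilities only)": 4,
}

# Rule 1: whole tests one regular student actually takes.
rule1 = 1

# Rule 2: count each subtest of that one test as a separate test.
rule2 = KPREP_SUBTESTS

# Rule 3: attribute every scheduled administration to every student,
# subtest by subtest -- the kind of rule needed to approach "20 tests."
rule3 = KPREP_SUBTESTS + sum(other_administrations.values())

print(f"Whole tests one regular student takes:           {rule1}")
print(f"Counting each subtest as a test:                 {rule2}")
print(f"Counting every administration against everyone:  {rule3}")
```

With these assumed figures, only the third and most aggressive rule gets anywhere near a count of twenty; the first two do not come close.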

Counting tests in US education has been this way for at least a quarter-century. Those prone to do so goose the numbers any way they plausibly can. A test is given in grade 5 on Tuesday? Count all students in the school as being tested. A DIBELS test takes all of one minute to administer? Count a full class period as lost. A 3-hour ACT has five sub-sections? That counts as five tests. Only a small percentage of schools in the district are sampled to take the National Assessment of Educational Progress in one or two grades? Count all students in all grades in the district as being tested, and count all the subjects tested individually.

Critics have gotten away with this fibbing for so long that it has become routine: the standard way to count the amount of testing. And reporters tend to pass it along as fact.

Richard P. Phelps

Overtesting or Overcounting? was originally published on Nonpartisan Education Blog

Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps

Perhaps it is because I avoid most tabloid journalism that I found journalist Anya Kamenetz’s loose cannon Introduction to The Test: Why our schools are obsessed with standardized testing—but you don’t have to be so jarring. In the space of seven pages, she employs the pejoratives “test obsession”, “test score obsession”, “testing obsession”, “insidious … test creep”, “testing mania”, “endless measurement”, “testing arms race”, “high-stakes madness”, “obsession with metrics”, and “test-obsessed culture”.

Those un-measured words fit tightly alongside assertions that education, or standardized, or high-stakes testing is responsible for numerous harms ranging from stomachaches, stunted spirits, family stress, “undermined” schools, demoralized teachers, and paralyzed public debate, to the Great Recession (pp. 1, 6, 7), which was in fact sparked by problems with mortgage-backed financial securities (the supposed connection being that parents choose home locations in part based on schools’ average test scores). Oh, and tests are “gutting our country’s future competitiveness,” too (p. 1).

Kamenetz made almost no effort to search for counterevidence[1]: “there’s lots of evidence that these tests are doing harm, and very little in their favor” (p. 13). Among her several sources for information on the relevant research literature are arguably the country’s most prolific proponents of the notion that little to no research exists showing educational benefits to testing.[2] Ergo, why bother to look for it?

Had a journalist covered the legendary feud between the Hatfield and McCoy families, and talked only to the Hatfields, one might expect a surplus of reportage favoring the Hatfields and disfavoring the McCoys, and a deficit of reportage favoring the McCoys and disfavoring the Hatfields.

Looking at tests from any angle, Kamenetz sees only evil. Tests are bad because tests were used to enforce Jim Crow discrimination (p. 63). Tests are bad because some of the first scientists to use intelligence tests were racists (pp. 40-43).

Tests are bad because they employ the statistical tools of latent trait theory and factor analysis—as tens of thousands of social scientists worldwide currently do—but the “eminent paleontologist” Stephen Jay Gould didn’t like them (pp. 46-48). (He argued that if you cannot measure something directly, it doesn’t really exist.) And, by the way, did you know that some of the early 20th-century scientists of intelligence testing were racists? (pp. 48-57)

Tests are bad because of Campbell’s Law: “when a measure becomes a target, it ceases to be a good measure” (p. 5). Such a criticism, if valid, could be used to condemn any measure used evaluatively in any of society’s realms. Forget health and medical studies, sports statistics, Department of Agriculture food-monitoring protocols, and ratings by Consumer Reports, Angie’s List, or the Food and Drug Administration. None are “good measures” because they are all targets.

Tests are bad because they are “controlled by a handful of companies” (pp. 5, 81), because “The testing company determines the quality of teachers’ performance” (p. 20), and because “tests shift control and authority into the hands of the unregulated testing industry” (p. 75). Such criticisms, if valid, could be used to justify nationalizing all businesses in industries with high scale economies (e.g., there are only four big national wireless telephone companies, so perhaps the federal government should take over) and to outlaw all government contracting. Most of our country’s roads and bridges, for example, are built by private construction firms under contract to local, state, and national government agencies, to those agencies’ specifications, just like most standardized tests; but who believes that those firms control our roads?

Kamenetz swallows education anti-testing dogma whole. She claims that multiple-choice items can only test recall and basic skills (p. 35), that students learn nothing while they are taking tests (p. 15), and that US students are tested more than any others (pp. 15-17, 75). And they are, if you count the way her information sources do: counting at minimum an entire class period for each test administration, even a one-minute DIBELS test; counting all students in all grades of a school as taking a test whenever any students in any grade are taking a test; counting all subtests independently in the US (e.g., each ACT counts as five tests because it has five subtests) but only whole tests for other countries; and so on.
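
The same accounting applies to testing time. Here is a second minimal sketch, again with assumed figures (the 45-minute class period and 25-student class are my assumptions, not numbers from the book), contrasting the minutes a one-minute DIBELS probe actually consumes with the minutes charged to it when each administration is counted as a full class period lost.

```python
# A minimal sketch with assumed figures (not from the book): the gap between
# time actually spent on a one-minute DIBELS probe and the time charged to it
# when each administration is counted as an entire class period lost.

CLASS_PERIOD_MIN = 45        # assumed typical class period length
DIBELS_MIN_PER_STUDENT = 1   # the one-minute probe mentioned in the text
CLASS_SIZE = 25              # assumed class size

actual_minutes = DIBELS_MIN_PER_STUDENT * CLASS_SIZE   # minutes of actual testing
charged_minutes = CLASS_PERIOD_MIN * CLASS_SIZE        # minutes counted as "lost"

print(f"Minutes of actual testing:           {actual_minutes}")
print(f"Minutes charged as lost to testing:  {charged_minutes}")
print(f"Inflation factor:                    {charged_minutes // actual_minutes}x")
```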

Standardized testing absorbs way too much money and time, according to Kamenetz. Later in the book, however, she recommends an alternative education universe of fuzzy assessments that, if enacted, would absorb far more time and money.

What are her solutions to the insidious, obsessive mania of testing? There is some Rousseauian fantasizing—all schools should be like her daughter’s happy pre-school, where each student learned at his or her own pace (pp. 3-4) and the school’s job was “customizing learning to each student” (p. 8).

Some of the book’s latter half is devoted to “innovative” (of course) solutions that are not quite as innovative as she seems to believe. She is National Public Radio’s “lead digital education reporter,” so some interesting new and recent technologies suffuse the recommendations. But, even jazzing up the context, format, and delivery mechanisms with the latest whiz-bang gizmos will not eliminate the problems inherent in her old-new solutions: performance testing, simulations, demonstrations, portfolios, and the like. Like so many Common Core Standards boosters advocating the same “innovations”, she seems unaware that they have been tried in the past, with disastrous results.[3]

As I do not know Ms. Kamenetz personally, I must assume that she is sincere in her beliefs and made her own decisions about what to write. But, if she had naively allowed herself to be wholly misled by those with a vested interest in education establishment doctrine, the end result would have been no different.

The book is a lazily slapped-together rant, unworthy of a journalist. Ironically, however, I agree with Kamenetz on many issues. Like her, I do not much like the assessment components of the old No Child Left Behind Act or the new Common Core Standards. But, my solution would be to repeal both programs, not eliminate standardized testing. Like her, I oppose the US practice of relying on a single proficiency standard for all students (pp. 5, 36). But, my solution would be to employ multiple targets, as most other countries do. She would dump the tests.

Like Kamenetz, I believe it unproductive to devote more than a smidgen of time (at most half a day) to test preparation with test forms and item formats that are separate from subject matter learning. And, like her (p. 194), I am convinced that it does more harm than good. But, she blames the tests and the testing companies for the abomination; in fact, the testing companies prominently and frequently discourage the practice. It is the same testing opponents she has chosen to trust who claim that it works. It serves their argument to claim that non-subject-matter-related test preparation works because, if it were true, it would demonstrate that tests can be gamed with tricks and are invalid measurement instruments.

Like her, I oppose firing teachers based on student test scores, as current value-added measurement (VAM) systems do while there are no consequences for the students. I believe it wrong because too few data points are used and because student effort in such conditions is not reliable, varying by age, gender, socio-economic level, and more. But, I would eliminate the VAM program, or drastically revise it; she would eliminate the tests.

Like Kamenetz, I believe that educators’ cheating on tests is unacceptable, far more common than is publicly known, and should be stopped. I say, stop the cheating. She says, dump the tests.

It defies common sense to have teachers administering high-stakes tests in their own classrooms. Rotating test administration assignments so that teachers do not proctor their own students is easy. Rotating assignments further so that every testing room is proctored by at least two adults is easy, too. So, why aren’t these and other astonishingly easy fixes to test security problems implemented? Note that the education professionals responsible for managing test administrations are often the same who complain that testing is impossibly unfair.

The sensible solution is to take test administration management out of the hands of those who may welcome test administration fiascos, and hire independent professionals with no conflict of interest. But, like many education insiders, Kamenetz would ban the testing, thereby rewarding those who have mismanaged test administrations, sometimes deliberately, with a vacation from reliable external evaluation.

If she were correct on all these issues—that the testing is the problem in each case—shouldn’t we also eliminate examinations for doctors, lawyers, nurses, and pharmacists (all of which rely overwhelmingly on the multiple-choice format, by the way)?

Our country has a problem. More than in most other countries, our public education system is independent, self-contained, and self-renewing. Education professionals staffing school districts make the hiring, purchasing, and school catchment-area boundary-line decisions. School district boundaries often differ from those of other governmental jurisdictions, confusing the electorate. In many jurisdictions, school officials set the dates for votes on bond issues or school board elections, and can do so to their advantage. Those school officials are trained, and socialized, in graduate schools of education.

A half century ago, most faculty in graduate schools of education may have received their own professional training in core disciplines, such as Psychology, Sociology, or Business Management. Today, most education school faculty are themselves education school graduates, socialized in the prevailing culture. The dominant expertise in schools of education can maintain its dominance by hiring faculty who agree with it and denying tenure to those who stray. The dominant expertise in education journals can control education knowledge by accepting article submissions with agreeable results and rejecting those without.

Even most testing and measurement PhD training programs now reside in education schools, inside the same cultural cocoon.

Standardized testing is one of the few remaining independent tools US society has for holding education professionals accountable to serve the public, and not their own, interests. Without valid, reliable, objective external measurement, education professionals can do what they please inside our schools, with our children and our money. When educators are the only arbiters of the quality of their own work, they tend to rate it consistently well.

A substantial portion of The Test’s girth is filled with complaints that tests do not measure most of what students are supposed to or should learn: “It’s math and reading skills, history and science facts that kids are tested and graded on. Emotional, social, moral, spiritual, creative, and physical development all become marginal…” (p. 4). She quotes Daniel Koretz: “These tests can measure only a subset of the goals of education” (p. 14). Several other testing critics are cited making similar claims.

Yet, standards-based tests are developed in a process that takes years, and involves scores of legislators, parents, teachers, and administrators on a variety of decision-making committees. The citizens of a jurisdiction and their representatives choose the content of standards-based tests. They could choose content that Kamenetz and the several other critics she cites prefer, but they don’t.

If the critics are unhappy with test content, they should take their case to the proper authorities, voice their complaints at tedious standards commission hearings, and contribute their time to the rather monotonous work of test framework review committees. I sense that none of that patient effort interests them; instead, they would prefer that all decision-making power be granted to them, ex cathedra, to do as they think best for us.

Moreover, I find some of their assertions about what should be studied and tested rather scary. Our public schools should teach our children emotions, morals, and spirituality?

Likely that prospect would scare most parents, too. But, many parents’ first reaction to a proposal that our schools be allowed to teach their children everything might instead be something like: first show us that you can teach our children to read, write, and compute, then we can discuss further responsibilities.

So long as education insiders insist that we must hand over our money and children and leave them alone to determine—and evaluate—what they do with both, calls for “imploding” the public education system will only grow louder, as they should.

It is bad enough that so many education professors write propaganda, call it research, and deliberately mislead journalists by declaring an absence of countervailing research and researchers. Researchers confident in their arguments and evidence should be unafraid to face opponents and opposing ideas. The researchers Kamenetz trusts do all they can to deny dissenters a hearing.

Another potential independent tool for holding education professionals accountable, in addition to testing, could be an active, skeptical, and inquiring press knowledgeable of education issues and conflicts of interests. Other countries have it. Why are so many US education reporters gullible sycophants?

 

Endnotes:

[1] She did speak with Samuel Casey Carter, the author of No Excuses: Lessons from 21 High-Performing High-Poverty Schools (2000) (pp. 81-84), but chides him for recommending frequent testing without “framing” “the racist origins of standardized testing.” Kamenetz suggests that test scores are almost completely determined by household wealth and dismisses Carter’s explanations as a “mishmash of anecdotal evidence and conservative faith.”

[2] Those sources are Daniel Koretz, Brian Jacob, and the “FairTest” crew. In fact, an enormous research literature revealing large benefits from standardized, high-stakes, and frequent education testing spans a century (Brown, Roediger, & McDaniel, 2014; Larsen & Butler, 2013; Phelps, 2012).

[3] The 1990s witnessed the chaos of the New Standards Project, MSPAP (Maryland), CLAS (California) and KIRIS (Kentucky), dysfunctional programs that, when implemented, were overwhelmingly rejected by citizens, politicians and measurement professionals alike. (Incidentally, some of the same masterminds behind those projects have resurfaced as lead writers for the Common Core Standards.)

 

References:

Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). Make it stick: The science of successful learning. Cambridge, MA: Belknap Press.

Larsen, D. P., & Butler, A. C. (2013). Test-enhanced learning. In K. Walsh (Ed.), Oxford textbook of medical education (pp. 443–452). Oxford: Oxford University Press. http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2923.2008.03124.x/full

Phelps, R. P. (2012). The effect of testing on student achievement, 1910–2010. International Journal of Testing, 12(1), 21–43. http://www.tandfonline.com/doi/abs/10.1080/15305058.2011.602920

 

Kamenetz, A. (2015). The Test: Why our schools are obsessed with standardized testing—but you don’t have to be. New York: Public Affairs. Book Review, by Richard P. Phelps was originally published on Nonpartisan Education Blog

Large-scale educational testing in Chile: Some thoughts

Recently in the auditorium of Universidad Finis Terrae, I argued that Chile’s Prueba de Selección Universitaria (PSU) cannot be “fixed” and should be scrapped. I do not, however, advocate the elimination of university entrance examinations but, rather, the creation of a fairer and more informative and transparent examination.

Chile’s pre-2002 system (the PAA plus the PCEs) may not have been well maintained. But the basic structure, a general aptitude test strongly correlated with university-level work along with highly focused content-based tests designed by each faculty, is as close to an ideal university entrance system as one could hope for.

I have perused the decade-long history of the PSU, its funding, and the involvement of international organizations (World Bank, OECD) in shaping its character. Most striking is the pervasive involvement of economists in creating, implementing, and managing the test, and the corresponding lack of involvement of professionals trained in testing and measurement.

In the PSU, World Bank, and OECD documents, the economists advocate one year that the PSU be a high school exit examination (which should be correlated with the high school curriculum), then the next year that it be a university entrance examination (which should be correlated with university work), or that it be meant to monitor the implementation of the new curriculum, or that it be designed to increase opportunities for students from low socioeconomic backgrounds (in fact, it has been decreasing those opportunities). No test can possibly do all that the PSU’s advocates have promised it will do. The PSU has been sold as a test that can do anything you might like a test to do, and it now does nothing well. It is time to bring in a team that genuinely understands how to build a test and is willing to be open and transparent in all its dealings with the public.

The greatest danger posed by the dysfunctional PSU, I fear, is the bad reputation it gives all tests. Some in Chile have advocated eliminating the SIMCE, which, to my observation, is as well managed as the PSU is poorly managed. The SIMCE gathers information to be used in improving instruction. In theory, a school could be closed due to poor SIMCE scores, but not one ever has been. There are no consequences for students or teachers. Much information about the SIMCE is freely available and more becomes available every month; it is not the “black box” that the PSU is.

It would be a mistake to eliminate all testing because one is badly managed. We need assessments. It is easy to know what you are teaching; but, you can only know what students are learning if you assess.

Richard P. Phelps, US Fulbright Specialist at the Agencia de Calidad de la Educacion and Universidad Finis Terrae in Santiago, editor and co-author of Correcting Fallacies about Educational and Psychological Testing (American Psychological Association, 2008/2009)

Large-scale educational testing in Chile: Some thoughts was originally published on Nonpartisan Education Blog

GAO Could Do More

U.S. GAO Could Do More in Examining Educator Cheating on Tests

The U.S. Government Accountability Office (GAO), a research agency of the U.S. Congress, continues its foray into the field of standardized testing. It started at least as far back as 1993 with a report I wrote on the extent and cost of all systemwide testing in the public schools. Many studies related to school assessment have been completed since, for example, in 1998, 2006, 2006, and 2009.

In the wake of educator cheating scandals in Atlanta, Washington, DC, and elsewhere, the GAO has recently turned its attention to test security (a.k.a. test integrity). For a year or so, it has hosted a web site with the fetching title “Potential State Testing Improprieties”:

“…for the purpose of gathering information on fraudulent behavior in state-administered standardized tests. The information submitted here will be used as part of an ongoing GAO investigation into cheating by school officials nationwide, and will be referred to the appropriate State Educational Agency, the U.S. Department of Education, or other agencies, as appropriate.

“Any information provided through this website will be encrypted through our secure server and handled only by authorized staff. GAO will not release any individually identifiable information provided through this website unless compelled by law or required to do so by Congress. Anonymous reports are also welcome. However, providing GAO with as much information as possible allows us to ensure that our investigation is as thorough and efficient as possible. Providing contact information is particularly important to enable clarification or requests for additional information regarding submitted reports.”

I encourage anyone with relevant information to participate, though I would be more encouraging than is the GAO about submitting the information anonymously. In some states, the “State Educational Agency” to which your personal information will be submitted is, indeed, independent, law-abiding, and interested in rooting out corruption; in others, it either does not care much about the issue or itself is an integral part of the corruption.

It would be far better if the information were not submitted to any “educational agencies” but, rather, to state and federal auditors and attorneys general. The problem with education agencies is that they have gotten too comfortable with their own, separate world, with its own elections, funding sources, governance structures, rules and regulations, and an ethical code that places a higher priority on the perceived needs of educators than on those of others.

This past week, the GAO released another report with a typically understated title, “K-12 Education: States’ Test Security Policies and Procedures Varied”. Among its findings:

“All states reported including at least 50 percent of the leading practices in test security into their policies and procedures. However, states varied in the extent to which they incorporated certain categories of leading practices. For example, 22 states reported having all of the leading practices for security training, but four states had none of the practices in this category. Despite having reported that they have various leading practices in place to mitigate testing irregularities and prevent cheating, many states reported feeling vulnerable to cheating at some point during the testing process.”

Does one feel better or worse about test security after reading this passage? Is knowing that states are getting their test security policies and procedures at least half right reassuring? Would you trust your life’s savings to a bank that assured you of including at least half of the leading practices in bank security in their policies and procedures?

Though the low percentage may disappoint, I find another aspect of the GAO study more worrisome: it’s entirely about plans and policies and not any actual behavior. In this, the GAO takes its cue from the two associations whose test security checklist it employed in its study: the Council of Chief State School Officers (CCSSO) and the Association of Test Publishers (ATP). Their checklist comprised the “leading practices” to which the GAO refers.

Peruse the list of leading practices, included in the GAO report’s enclosures (starting on p. 40), and you may be surprised at how ethereal they are. Schools should have “Procedures to Keep Testing Facilities Secure”, “Rules for Storage of Testing Materials”, “Procedures to Prevent Potential Security Breaches”, and even “Ethical Practices”. There’s no mention of what such rules, procedures, and practices might look like; local schools and districts are free to interpret them as they wish, presuming they even know how. Moreover, there’s no mention of any actual implementation of any of the rules, procedures, and practices; in Ed-speak, test security is about having a test security plan, not actually securing tests.

In a footnote (p.2) the GAO admits “Our survey did not examine state or local implementation of these test security policies.” Even if the GAO had tried to examine state or local implementation, though, what could it have found?

A “leading practice” in the terms of the CCSSO and ATP is not a “practice” at all; it is a plan for practice. That is, it is not about behavior or action, it is about a plan for behavior or action. And even the character of the plan is left to the discretion of the local school or district. Any local school or district with a test security plan in its files can claim that it is following leading practices. As model test security plans are routinely provided by test developers as part of their contract, every local school or district can be a leading test security practitioner by default.

They need not actually do anything to secure their tests in order to qualify as leading test security practitioners.

Borrowing a phrase so often used by the GAO in its review of other government agencies’ work, the GAO “could do more” to study the issue of test security.

GAO Could Do More was originally published on Nonpartisan Education Blog