This article originally appeared in The Bar Examiner print edition, Winter 2018–2019 (Vol. 87, No. 4), pp. 26–30.
By Mark A. Albanese, Ph.D.; Mengyao Zhang, Ph.D.; and C. Beth Hill
For this issue’s Testing Column, we bring you a report on NCBE’s attendance at the 2018 Conference on Test Security (COTS) meeting, which was held this past October in Park City, Utah. Staff members from NCBE’s Testing and Research Department regularly attend such educational meetings to stay abreast of the latest research and information pertaining to testing and test security.
The annual COTS meeting allows representatives from across industries to join in focusing on test security capabilities and enhancements and to learn about the latest developments protecting brand integrity and the validity of test results. NCBE has been actively engaged in the COTS meetings over the seven years of their existence. Mark Albanese, NCBE’s Director of Testing and Research, has been on the COTS Executive Committee since its inception, and staff members from Testing and Research and other departments have attended COTS meetings over the years.
The two-and-a-half-day meeting in October 2018 was attended by more than 150 individuals, most of whom either were actively engaged in preventing test security breaches or were academic researchers in search of solutions to growing test security problems. Besides a healthy representation of those involved in licensure testing like the bar examination, there were many involved in admissions testing, such as the LSAT, SAT, and GRE, as well as a large contingent from what is called K–12 testing, that is, tests administered to students in public education from kindergarten through high school.
Types of Test Security Concerns
Before describing the 2018 COTS topics and some of the recommendations that were made, it will be useful to review some of the types of test security incidents that have been discussed in recent COTS meetings. Most people think of test security issues—or, simply put, cheating—as being limited to examinees copying answers either from others seated nearby or from materials prohibited during the exam. However, test security is a much larger issue. A recent definition of cheating in the context of testing interprets it as any attempt to “gain an unfair advantage or produce inaccurate results,” without restriction as to when, where, or how it occurs.1
The importance of maintaining test security cannot be overemphasized, because cheating, regardless of which form it takes, erodes the validity of the interpretations of test scores and undermines the legitimacy of decisions based on those scores. Correspondingly, test security includes protecting exam materials from being lost, stolen, or compromised long before exam day as well as during and after the exam. Test security also includes ensuring that examinees are who they say they are, that they do not bring any impermissible materials or technology devices into the exam to inflate their performance or record the exam questions, and that they do not reproduce or share any exam content at any point, even for the benefit of others. While it is not as great a concern in the licensing world, K–12 testing has found that test administrators (often classroom teachers being evaluated on how well their students do on an exam) sometimes provide pre-knowledge to examinees or change answers after the exam before the test materials are returned for processing.2 In short, myriad test security issues can arise, and the increasing sophistication and miniaturization of technology have increased the risk of test security breaches exponentially.
Left unchecked, the impacts of cheating fall not merely on those who cheat and score high but on everyone. Honest examinees can appear less competent than they truly are. Testing professionals may be misled by distorted test performance caused by security breaches, leaving them with an inaccurate picture of how well the exam functions. Moreover, cheating on licensure exams may allow unqualified—and unethical—candidates to be admitted to the practice of a profession, posing a threat of harm to the public and ultimately damaging the credibility of the profession.
To minimize these impacts, considerable efforts have been made over the past decade to prevent and detect cheating. The COTS meetings provide a platform for those involved in this field to exchange experiences and insights, thereby enabling wider awareness of the challenges and opportunities in test security.
A Glimpse of the COTS Topics
Given the nature of test security as just defined, the topics covered by the most recent COTS meeting were, not surprisingly, highly diverse, ranging from policy issues, such as whether to pursue legal action against examinees involved in various test security incidents, to very technical discussions of different statistical methods for detecting and quantifying irregular examinee behaviors. There were case studies of how test security incidents unfolded and how they were resolved. In addition, a lively, informative, and well-organized debate session invited experienced professionals and researchers from a variety of testing fields to express their differing opinions on controversial test security topics, such as whether to stop an examinee from continuing the exam once misconduct is detected, or whether artificial intelligence will render test security obsolete in the near future.
The following are some of the topics from the 2018 COTS meeting, with our brief synopsis of what was concluded for each. Behind these topics lies a common theme: action should be taken to safeguard the integrity of testing.
Copyright Registration: You Probably Need It and Here Is Why
“Hope for the best and prepare for the worst” is as true in the context of test security as it is in many other fields. Copyright registration of exam materials, as a good example, prepares test developers and sponsors for future legal battles, if necessary, with individuals or organizations that steal or misuse exam questions protected by existing copyright laws.3 NCBE copyrights materials associated with the four licensing exams it develops—the Multistate Bar Examination (MBE), the Multistate Essay Examination (MEE), the Multistate Performance Test (MPT), and the Multistate Professional Responsibility Examination (MPRE). We have used this legal tool successfully in several battles with those who were offering, discussing, or reproducing, without our authorization, questions substantially similar to our copyrighted materials—and we are not the only organization in the testing world that has benefited from copyright registration.4 We recommend that jurisdictions with an added state exam also take steps to protect their intellectual property against potential infringement by copyrighting those materials.
The Candidate Agreement: A Contract Between Examinee and Examiner
Many of us have taken or sponsored exams, whether standardized or informal. For those who take high-stakes standardized exams like the bar exam, the candidate agreement—with its laundry list of to-dos and, perhaps more often, not-to-dos—is a familiar routine. It is worth noting, however, that this agreement constitutes a contract between examinee and examiner.5 Once the examinee chooses to accept the candidate agreement, the carefully defined rights and obligations, as well as the consequences of breaking the contract, bind both parties.
The candidate agreement may be delivered on paper or online. As long as it is adequate and clear, this contract can be an effective way to promote a secure testing environment and deter misconduct during and after the exam. Further, it grants test developers and sponsors contractual rights to investigate and resolve test security problems, such as by canceling test scores, based on the good faith principle.6
Statistical Evidence: Perhaps More Powerful Than You Think
A security issue typically manifests itself in some type of abnormal examinee behavior. Some of these behaviors, such as repeatedly glancing at a neighbor’s answer sheet, are readily discovered by proctors and other examinees. Many abnormalities, however, are difficult to detect through simple observation. For example, an examinee who memorizes possible questions and answers shared on a “brain dump” website prior to the exam may show no obvious irregularities when he or she sits for the exam. The rise of digital technology has also led to innovative forms of cheating that are less noticeable.
Statistical and psychometric analysis provides another means to detect abnormal examinee behaviors and uncover potential security breaches. In recent years, an increasing number of statistical methods have been applied to various test security incidents. For example, similarity analysis helps reveal whether two or more examinees have unusually similar responses to the same exam questions, which may be a result of answer copying or collusion. Compared with traditional answer copying, collusion is a more general and better-concealed cheating scenario in which examinees (or sometimes, examinees along with those who do not participate in the exam, such as teachers or trainers) work together for an illegitimate score gain. Another example of statistical analysis is item drift analysis, which examines changes in item statistics over time, yielding valuable information about the possible presence of item compromise.
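As a toy illustration of the idea behind similarity analysis, the sketch below simply counts matching answers, and matching incorrect answers, between two examinees on a hypothetical 10-item answer key. This is not NCBE's actual method: operational similarity indexes model the probability of agreement given each examinee's ability, rather than raw counts.

```python
def similarity_counts(responses_a, responses_b, key):
    """Count identical answers and identical *incorrect* answers
    between two examinees' multiple-choice response strings.
    Matching wrong answers are the stronger signal, since two able
    examinees often agree on correct answers by legitimate means."""
    same = same_wrong = 0
    for a, b, correct in zip(responses_a, responses_b, key):
        if a == b:
            same += 1
            if a != correct:
                same_wrong += 1
    return same, same_wrong

# Hypothetical answer key and two identical response strings that
# share two wrong answers (positions 8 and 10).
key = "ABCDABCDAB"
pair = ("ABCDABCCAC", "ABCDABCCAC")
print(similarity_counts(*pair, key))  # -> (10, 2)
```

In practice such raw counts would be compared against the agreement expected by chance for examinees of similar ability before any pair is flagged for review.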
In addition to traditional analytic approaches, novel methodological ideas for improving test security have been proposed. For instance, the recent COTS meeting included some fascinating presentations on how to utilize network and graph theory to identify suspicious examinee collaboration. In a network graph, individual examinees, groups of examinees, or even exam questions may be conceptualized as nodes, and their relationships regarding the exam are visualized as different types of edges connecting those nodes. Network and graph theory offers a rich toolbox for defining and studying relationships and for grouping similar examinees or exam questions based on those relationships, which could assist in detecting test collusion.
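One minimal way to realize the network idea, sketched here with made-up examinee IDs and assuming a similarity screening has already flagged certain pairs, is to treat each flagged pair as an edge and extract the connected components of the resulting graph as candidate collusion clusters for further review:

```python
from collections import defaultdict, deque

def collusion_groups(pairs):
    """Group examinees into connected components of the similarity
    graph; each component is a candidate collusion cluster."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        # Breadth-first search to collect everyone reachable from node.
        queue, component = deque([node]), set()
        while queue:
            cur = queue.popleft()
            if cur in component:
                continue
            component.add(cur)
            queue.extend(graph[cur] - component)
        seen |= component
        groups.append(sorted(component))
    return groups

# Hypothetical flagged pairs: E1-E2 and E2-E3 chain into one cluster.
flagged_pairs = [("E1", "E2"), ("E2", "E3"), ("E5", "E6")]
print(collusion_groups(flagged_pairs))  # -> [['E1', 'E2', 'E3'], ['E5', 'E6']]
```

The payoff of the graph view is the E1–E2–E3 cluster: E1 and E3 were never flagged as a pair, yet the structure links them through E2.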
Statistical evidence plays an important role in test security. The findings of appropriately conducted statistical and psychometric analyses can be used to corroborate existing evidence, such as irregularities witnessed and reported by proctors and examinees, or to trigger further security investigation.7 A compelling analysis is part of a good-faith investigation that will be valuable if the case goes to court.8 Moreover, even for exams without immediate security concerns and with content considered secure, the results of statistical and psychometric analysis remain useful: they help establish a baseline of normal performance for examinees and exam questions.
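To illustrate how such a baseline supports later screening, the toy sketch below uses made-up item statistics and an arbitrary threshold (neither drawn from NCBE practice) to flag items whose proportion correct rises sharply relative to a baseline administration, one simple signal of possible item compromise:

```python
def flag_drift(baseline, current, threshold=0.10):
    """Flag items whose proportion correct (p-value) increased by more
    than `threshold` relative to the baseline administration."""
    return [item for item, p in current.items()
            if p - baseline[item] > threshold]

# Hypothetical p-values for three items across two administrations.
baseline = {"Q1": 0.55, "Q2": 0.62, "Q3": 0.48}
current  = {"Q1": 0.57, "Q2": 0.81, "Q3": 0.49}
print(flag_drift(baseline, current))  # -> ['Q2']
```

A real drift analysis would also condition on the ability of each examinee group, since a stronger cohort raises p-values legitimately; a flag like Q2's would prompt review, not an automatic conclusion of compromise.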
Looking into the Future
Maintaining test security in this digital age is no easy task, as just discussed. Despite the challenges, those taking and those administering the exam share responsibility for protecting the integrity of the testing environment. In the case of licensure exams, that responsibility is ultimately to uphold the honor and dignity of the profession. We will continue participating in future COTS meetings in order to share what we learn from our security efforts and to explore best security practices for our exams.
NCBE Presentations at COTS
Staff members and graduate assistants from NCBE’s Testing and Research Department have delivered several presentations at COTS meetings over the years. These presentations disseminate the findings of NCBE’s continuous research on the screening and detection of both long-standing and new types of test security incidents in the licensure testing context.
- Mark A. Albanese, Ph.D., and Cory Tracy, “A Comparison of Similarity Indexes in Detecting Copying in Cases with Significant Observational Evidence on the Multistate Bar Examination,” COTS, October 2014.
- Mark A. Albanese, Ph.D., and Cory Tracy, “Disrupted Opportunity Analysis (DOA): A System for Detecting Unusual Similarity Between a Suspected Copier and a Source,” COTS, October 2014.
- Kellie Early (NCBE’s Chief Strategy Officer, co-presenting with Aimée Hobby Rhodes, CFA Institute; Lorin Mueller, Federation of State Boards of Physical Therapy; A. Benjamin Mannes, American Board of Internal Medicine; and Joy Matthews-Lopez, National Association of Boards of Pharmacy), “So You Flagged a Cheater, Now What?” COTS, October 2014.
- Mengyao Zhang, Ph.D.; Sora Lee; and Mark A. Albanese, Ph.D., “Two-Stage Statistical Screening for Unusual Response Similarity in Large-Scale Assessments,” COTS, October 2016.
- Mengyao Zhang, Ph.D., and Joanne Kane, Ph.D., “Detection of Potential Test Collusion across Multiple Examinees: A Real-World Example,” COTS, September 2017.
- Mengyao Zhang, Ph.D., and Mark A. Albanese, Ph.D., “Graphical Imaging Methods for Detecting Potential Collusion for Test Centers with Unusual Score Gains,” COTS, September 2017.
- Mengyao Zhang, Ph.D., and Mark A. Albanese, Ph.D., “Detecting Potential Time Zone Cheating Using Item Response Theory Approaches,” COTS, October 2018.
Notes

1. Gregory J. Cizek and James A. Wollack (Eds.), Handbook of Quantitative Methods for Detecting Cheating on Tests, New York, NY: Routledge (2017).
2. James A. Wollack and John J. Fremer (Eds.), Handbook of Test Security, New York, NY: Routledge (2013).
3. For detailed information concerning copyright registration of secure tests and test items, see United States Copyright Office, Circular 64, “Copyright Registration of Secure Tests and Test Items,” available at https://www.copyright.gov/circs/circ64.pdf.
4. See National Conference of Bar Examiners v. Multistate Legal Studies, Inc., 458 F. Supp. 2d 252 (E.D. Pa. 2006); see also National Conference of Bar Examiners v. Saccuzzo, No. 03-CV-0737, 2003 WL 21467772 (S.D. Cal. 2003); Educational Testing Service v. Simon, 95 F. Supp. 2d 1081 (C.D. Cal. 1999); Graduate Management Admission Council v. Raju, 267 F. Supp. 2d 505 (E.D. Va. 2003).
5. For a thorough discussion of the candidate agreement, see Jennifer A. Semko and Robert Hunt, “Legal Matters in Test Security,” in J. A. Wollack and J. J. Fremer (Eds.), Handbook of Test Security, New York, NY: Routledge (2013), 238–240.
6. See, e.g., Murray v. Educational Testing Service, 170 F.3d 514 (5th Cir. 1999).
7. See, e.g., how a statistical analysis of a suspected examinee’s answer pattern is conducted by NCBE and potentially used by jurisdictions, in James A. Wollack, Ph.D., and Mark A. Albanese, Ph.D., “The Testing Column: How to Keep Cheaters from Passing the Bar Exam,” 85(4) The Bar Examiner (December 2016) 16–28.
8. See, e.g., Langston v. ACT, 890 F.2d 380 (11th Cir. 1989).
Mark A. Albanese, Ph.D., is the Director of Testing and Research for the National Conference of Bar Examiners.
Mengyao Zhang, Ph.D., is a Research Psychometrician for the National Conference of Bar Examiners.
C. Beth Hill is the Director of Test and Information Security for the National Conference of Bar Examiners. She previously served as Program Director for the Multistate Bar Examination at NCBE from 2012 to 2017.
Contact us to request a PDF file of the original article as it appeared in the print edition.