Test Security

Detection of Aberrant Answer Changes via Kullback–Leibler Divergence (RR 14-04)

In standardized testing, test takers may change their answer choices for various reasons. The statistical analysis of answer changes (ACs) has uncovered multiple testing irregularities on large-scale assessments and is now routinely performed at some testing organizations. Research on answer-changing behavior has recently branched off in several directions, including modeling of ACs and addressing scanning errors. Data on the answer choices made before the final choice are unreliable: in paper-and-pencil testing they are subject to scanning errors, and in computer-based testing a test taker may select a lengthy sequence of answer choices before the final one. These non-final answer choices are also affected by test-taker warm-up, fatigue, or certain answering strategies. Some statistics used in practice (such as the number of wrong-to-right ACs) capitalize on the inconsistencies inherent in non-final answer choices, especially at the individual test-taker level, and this may result in a high false-positive detection rate when seeking to identify aberrant test-taker behavior.

This paper presents a conservative approach to analyzing ACs at the individual test-taker level. The information about non-final answer choices is used only to partition the responses (from the final answer choices) into two disjoint subsets: responses where an AC did not occur and responses where an AC did occur. A new statistic is presented that is based on the difference in performance between these subsets. Answer-changing behavior was simulated so as to produce realistic distributions of wrong-to-right, wrong-to-wrong, and right-to-wrong ACs. Results of these preliminary analyses were encouraging, with the new statistic outperforming two popular statistics.
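The partitioning idea can be illustrated with a minimal Python sketch. It is not the report's statistic, whose exact definition is not given in this abstract. The sketch assumes a Rasch model with known item difficulties, a uniform prior over a grid of ability values, and a particular direction for the Kullback–Leibler divergence; the function names (`ability_posterior`, `ac_divergence`) are hypothetical. Final responses are split by whether an AC occurred, an ability posterior is computed on each subset, and the divergence between the two posteriors measures how differently the test taker performed on changed versus unchanged items.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def ability_posterior(responses, difficulties, grid):
    """Normalized posterior over ability on a grid, using a uniform prior."""
    log_post = []
    for theta in grid:
        ll = 0.0
        for x, b in zip(responses, difficulties):
            p = rasch_p(theta, b)
            ll += math.log(p) if x == 1 else math.log(1.0 - p)
        log_post.append(ll)
    peak = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def ac_divergence(final_correct, changed, difficulties, grid=None):
    """Divergence between ability posteriors on changed and unchanged items.

    final_correct: 0/1 scores of the FINAL answer choices.
    changed: True where an answer change occurred on that item.
    Non-final choices are used only through `changed`, i.e. for partitioning.
    """
    if grid is None:
        grid = [-4.0 + 0.1 * i for i in range(81)]
    ac = [(x, b) for x, c, b in zip(final_correct, changed, difficulties) if c]
    no_ac = [(x, b) for x, c, b in zip(final_correct, changed, difficulties) if not c]
    if not ac or not no_ac:
        return 0.0  # one subset is empty, so there is nothing to compare
    post_ac = ability_posterior([x for x, _ in ac], [b for _, b in ac], grid)
    post_no = ability_posterior([x for x, _ in no_ac], [b for _, b in no_ac], grid)
    # The direction of the divergence is an assumption of this sketch.
    return kl_divergence(post_ac, post_no)


difficulties = [0.0] * 10
# Similar performance on changed and unchanged items.
consistent = ac_divergence([1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
                           [True, True] + [False] * 8, difficulties)
# Every changed item is correct, every unchanged item is wrong.
aberrant = ac_divergence([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                         [True] * 4 + [False] * 6, difficulties)
```

In this toy example `aberrant` comes out larger than `consistent`, which is the qualitative behavior one would want from a statistic of this kind. Any flagging threshold would have to be calibrated, for instance by simulation.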

Additional reports in this collection

Detecting Groups of Test Takers Involved in Test...

Test collusion (TC) is the sharing of test materials or answers to test questions (items) before or during a test. Because of the potentially large advantages for the test takers involved, TC poses a serious threat to the validity of score interpretations. The proposed approach applies graph theory methodology to response similarity analyses to identify groups involved in TC while minimizing the false-positive detection rate. The new approach is illustrated and compared with a recently published method using real and simulated data.
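As a rough illustration of how graph methods can be combined with response similarity, the Python sketch below links two test takers whenever their answer strings agree on at least a chosen fraction of items, then reports connected components of the resulting graph as candidate groups. The similarity measure, the threshold, and the use of connected components are all assumptions made for illustration, and the function names are hypothetical; the report's actual similarity statistic and graph procedure are not described in this abstract.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of items on which two test takers gave the same answer."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def flag_groups(responses, threshold=0.9, min_size=2):
    """Return groups of test takers connected by high pairwise similarity.

    responses: dict mapping a test-taker id to a string of answer choices.
    threshold: assumed cutoff; in practice it would be set to control
               the false-positive rate.
    """
    ids = list(responses)
    adjacency = {i: set() for i in ids}
    for u, v in combinations(ids, 2):
        if similarity(responses[u], responses[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, groups = set(), []
    for start in ids:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) >= min_size:
            groups.append(sorted(component))
    return groups


answers = {
    "a": "ABCDABCDAB",
    "b": "ABCDABCDAB",
    "c": "ABCDABCDAC",
    "d": "DCBADCBADC",
}
groups = flag_groups(answers, threshold=0.9)
```

Here test takers a, b, and c end up in one group while d stays unflagged. Raw agreement is a crude measure, since able test takers agree on correct answers for legitimate reasons, so a practical analysis would use a similarity statistic that accounts for ability and item difficulty.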

A New Approach to Detecting Cluster Aberrancy (RR 16-05)

This report addresses a general type of cluster aberrancy in which a subgroup of test takers has an unfair advantage on some subset of administered items. Examples of cluster aberrancy include item preknowledge and test collusion. In general, cluster aberrancy is hard to detect due to the multiple unknowns involved: Unknown subgroups of test takers have an unfair advantage on unknown subsets of items. The issue of multiple unknowns makes the detection of cluster aberrancy a challenging problem from the standpoint of applied mathematics.