Since the last Questionmark Users Conference, I have heard several clients discuss new measures at their companies requiring them to provide evidence of the legal defensibility of their assessments. Legal defensibility and validity are closely intertwined, but they are not synonymous. An assessment can be legally defensible, yet still have flaws that impact its validity. The distinction between the two is often the difference between how you developed the instrument and how well you developed it.
Regardless of whether you are concerned with legal defensibility or validity, careful attention should be paid to the evaluative component of your assessment program. What if someone asks, “What does this score mean?” How do you answer? How do you justify your response? The answers to these questions impact how your stakeholders will interpret and use the results, and this may have consequences for your participants. Many factors go into supporting the legal defensibility and validity of assessment results, but one could argue that the keystone is the standard-setting process.
Standard setting is the process of dividing score scales so that scores can be interpreted and acted on (AERA, APA, & NCME, 2014). The dividing points between sections of the scales are called “cut scores,” and in criterion-referenced assessment, they typically correspond to performance levels that are defined a priori. These cut scores and their corresponding performance levels help test users make the cognitive leap from a participant’s response pattern to what can be a complex inference about the participant’s knowledge, skills, and abilities.
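To make the idea concrete, here is a minimal sketch of how a set of cut scores divides a score scale into performance levels. The cut scores (50, 70, 85), the level names, and the function name are all hypothetical, chosen only for illustration. In practice they would come out of a standard-setting study, and the rule for a score that lands exactly on a cut score (here, it goes to the higher level) is itself a policy decision.

```python
from bisect import bisect_right

# Hypothetical cut scores and level labels, for illustration only.
# Real values would be the documented outcome of a standard-setting study.
CUT_SCORES = [50, 70, 85]
LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]


def performance_level(score, cut_scores=CUT_SCORES, levels=LEVELS):
    """Return the performance level whose score range contains `score`.

    Assumes `cut_scores` is sorted in ascending order. A score equal to a
    cut score is placed in the higher level (an assumed convention).
    """
    if len(levels) != len(cut_scores) + 1:
        raise ValueError("need exactly one more level than cut scores")
    # bisect_right counts how many cut scores are <= score, which is
    # also the index of the matching level.
    return levels[bisect_right(cut_scores, score)]


if __name__ == "__main__":
    for s in (42, 50, 69.5, 85, 100):
        print(s, "->", performance_level(s))
```

With three cut scores, the scale falls into four levels, so a question like “What does this score mean?” can be answered by pointing to the a priori definition of the level the score falls in.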
In their chapter in Educational Measurement (4th Ed.), Hambleton and Pitoniak explain that standard-setting studies need to consider many factors, and that they can also have major implications for participants and test users. For this reason, standard-setting studies are often rigorous, well-documented projects.
At this year’s Questionmark Users Conference, I will be delivering a session that introduces the basics of standard setting. We will discuss standard-setting methods for criterion-referenced and norm-referenced assessments, and we will touch on methods used both in large-scale assessments and in classroom settings. This will be a useful session for anyone who is working on documenting the legal defensibility of their assessment program or who is planning their first standard-setting study and wants to learn about the different methods available. Participants are encouraged to bring their own questions and stories to share with the group.