Effectively Communicating the Measurement of Constructs to Stakeholders
I co-wrote this article with Kerry Eades, Assessment Specialist at the Oklahoma Department of Career and Technology Education, a Questionmark user who shares my interest in test security and many other topics related to online assessment.
There are many mentions on websites, blogs, YouTube, and elsewhere of people (employees, students, educators, school administrators, etc.) cheating on tests. Cheating has always been an issue, but the last decade of increased certifications and high-stakes testing seems to have brought about a significant increase in it. As a result, some pundits now believe we should redefine cheating, and that texting for help, accessing the Web, or using any Web 2.0 resources should be allowed during testing. The basic idea is that a student should no longer be required to learn “facts” that can be easily located on the internet, and that instruction should shift to teaching and testing only conceptual content.
There are many reasons for testing (educational, professional certification and licensure, legislative, psychological, etc.), and the pressure stakeholders feel to succeed at all costs, whether by “teaching to the test” or by condoning some form of cheating, is obviously immense. Those of us in the testing industry should, to the best of our ability, educate stakeholders on the purpose of tests and on the development and measurement of constructs. Better informed stakeholders would have less “need” for cheating, and fewer “excuses” for it, improving the testing environment for all concerned. A key element of this is promoting an understanding of how to match the testing environment to the nature of an assessment: “open book” assessments are appropriate in some cases but certainly not all. We must keep in mind that education, in general, builds upon itself over time, and for that reason constructs must be assessed in a valid, reliable, and appropriate manner.
Tests are usually developed to make a point-in-time decision about the knowledge, ability, or skills of an individual based upon a set of predetermined standards/objectives/measures. The “value” of any test lies not only in this “point-in-time” reference, but in what it implies for the future. Although examinees may have passed an assessment, they may still have areas of relative weakness that should be remediated in order for them to reach their full potential as students or employees. Instructors should also observe how all their students are performing on tests in order to identify their own instructional weaknesses. For example, does the curriculum match up with the specified standards and the level of thinking those standards demand? This information can also be aggregated and analyzed at the local, district, or state level to determine program strengths or weaknesses. In order to use scores in a valid way to make decisions about students or programs, we must begin by clearly defining and measuring the psychological/educational constructs or traits that a test purports to measure.
Measuring a construct is certainly complex, but it boils down to ensuring that the construct is being measured in a valid way and then reporting/communicating that process to stakeholders. For example, if the construct we are trying to measure in an assessment is “Surgery Procedure” and the candidate passes the test, we expect that person to be able to recall the information from memory where and when it is needed. It would not be valid to let the participant look up where the liver is located on the Internet during the assessment, because they would not be able to use the Internet halfway through a surgical procedure.
Another example would be “Crane Operation” knowledge and skills. If this is the construct being measured, and candidates who pass the test are expected to be able to operate a crane properly when and where they need to, then allowing them to tweet or text during their crane certification exam would invalidate the test scores, because they would not be able to do this in real life.
However, if the assessment is a low-stakes quiz measuring the construct “Tourist Hot Spots of Arkansas,” and the purpose of the quiz is to help people remember some good tourist places in Arkansas, then an “open book” or “open source” format in which the examinee can search the internet or use Web 2.0 resources is fine.
Effectively communicating the purpose of an assessment and the constructs it measures is essential for reducing instances of cheating. This communication can also help prevent cheating from being “redefined” to the detriment of test security.
For more information on assessment security issues and best practices, check out the Questionmark White Paper: “Delivering Assessments Safely and Securely.”