High-stakes assessment: It’s not just about test takers

In my last post I spent some time defining how I think about high-stakes assessment. I also talked about how these assessments affect the people who take them, including how much the results matter to their ability to get or do a job.

Now I want to talk a little bit about how these assessments affect the rest of us.

The rest of us

Guess what? The rest of us are affected by the outcomes of these assessments. Did you see that coming?

But seriously, the credentials or scores that result from these assessments affect large swathes of the public. Ultimately that’s the point of high-stakes assessment. The resulting certifications and licenses exist to protect the public. These assessments are acting as barriers preventing incompetent people from practicing professions where competency really matters.

It really matters

What are some examples of “really matters”? Well, when hiring, it really matters to employers that the network techs they hire know how to configure a network securely, not that the techs just say they do. It matters to the people crossing a bridge that the engineers who designed it knew their physics. It really matters to every one of us that our doctor, dentist, nurse, or surgeon knows what they are doing when they treat us. It really matters to society at large that we measure, and measure well, the children and adults who take large-scale assessments like college entrance exams.

At the end of the day, high-stakes exams are high stakes because, in a very real way, almost all of us have a stake in their outcome.

Separating the wheat from the chaff

There are a couple of ways that high-stakes assessments do what they do. Some assessments are simply designed to measure “minimal competence,” with test takers ending up either above the line (often known as “passing”) or below it: the dreaded “fail.”

Other assessments are designed to place test takers on a continuum of ability. This type of assessment assigns scores to test takers, and the range of scores often appears odd to laypeople. For example, the SAT uses a 200–800 scale.

Want to learn more? Hang on till next time!

Posted by Greg Pope

I had the good fortune of presenting a few sessions at the Questionmark 2010 Users Conference in sunny Miami a couple of weeks ago. It was a great opportunity to catch up with customers and learn about the priorities organizations are focusing on.

In one of my best-practice sessions there was a great deal of interest in the topic of beta testing, so I thought I would put together a blog article on it in case others are interested.

Beta testing can be defined as gathering psychometric information about newly created questions in order to inform the creation of actual exams. Newly developed questions that have gone through the necessary editing and review processes are administered to representative samples of participants, either in advance of or during an actual high-stakes assessment. Psychometric information about the new questions is collected and used to build the actual assessments. Questions that have been beta tested are screened to ensure that they meet certain quality benchmarks (e.g., all questions fall into a certain range of difficulty, all questions have acceptable discrimination). These beta-tested questions are then used to create assessments built to specific structure criteria (e.g., there is an appropriate spread of question difficulty, a targeted mean test score is achieved, more questions are included near the pass score if the assessment is criterion-referenced, etc.).
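For readers who like to see the numbers, here is a minimal sketch in Python of the screening step described above. It assumes responses are scored 0 or 1, uses the proportion correct as the difficulty statistic, and uses the correlation between a question and the rest of the test as the discrimination statistic. The function names, the tiny data set, and the thresholds (difficulty between 0.3 and 0.9, discrimination of at least 0.2) are illustrative assumptions only, not recommended benchmarks; each organization sets its own.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of participants who answered the question correctly."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Correlation between the question score and the rest-of-test score."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        # No variation means the correlation is undefined; treat as 0.
        return 0.0
    m_item, m_rest = mean(item_scores), mean(rest_scores)
    cov = mean((i - m_item) * (r - m_rest)
               for i, r in zip(item_scores, rest_scores))
    return cov / (sd_item * sd_rest)


def screen_items(responses, p_range=(0.3, 0.9), min_disc=0.2):
    """Return indexes of questions that meet both (assumed) benchmarks."""
    kept = []
    for item in range(len(responses[0])):
        p = item_difficulty(responses, item)
        d = item_discrimination(responses, item)
        if p_range[0] <= p <= p_range[1] and d >= min_disc:
            kept.append(item)
    return kept


# Made-up beta test data: one row per participant, one column per question.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
]

kept = screen_items(responses)
print(kept)  # [1, 2, 3]
```

In this made-up data, question 0 is dropped because everyone answered it correctly, question 4 because it is too hard, and question 5 because stronger participants did no better on it than weaker ones. A real analysis would use far larger samples and usually a dedicated item analysis report rather than hand-rolled code.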

A summary graphic describing the general beta testing process is included below:

There are a number of common models for beta testing questions. Two of the most common are:

[Image: models for beta testing questions]

