An Introduction to Building Assessments with Evidence-Centered Design

Austin Fossey

Questionmark users have a flexible set of assessment tools, including a wide variety of item types, conditional item blocks, and weighted scoring. But when should we use these tools, and how do they fit in with our overall measurement goals?

Many of us follow a set of guidelines for developing our assessments, be it The Standards for Educational and Psychological Testing or simply a set of development practices defined by our organization. These guidelines are often special cases of a framework known as evidence-centered design (ECD).

ECD is a formal yet flexible structure for designing and delivering assessments. You may recall my post about argument-based validity, where we discussed using evidence to support a claim, thus creating a validity argument. ECD is a common method for designing an assessment that provides the evidence needed for a validity argument about a participant’s knowledge or abilities.

But ECD is not just another checklist for how to build an assessment: it guides the decision-making process. As test developers, we are accountable for every design and content decision, and ECD helps us to map those choices to the assessment inferences.

For example, the task model is part of the Conceptual Assessment Framework (CAF) in ECD, and it is used to specify which tasks should be used to elicit the types of behavior we want to observe to support our inference about the participant. We use the task model to document why we chose a specific selected-response item format (e.g., the use of drag-and-drop items as opposed to multiple-choice items).
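To make the idea of documenting a design choice concrete, here is a minimal sketch of how a task model entry might be recorded. The class name, field names, and example values are all hypothetical and are not part of ECD or of any Questionmark tool; they simply show one way to tie an item format to the behavior it elicits and the inference it supports.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskModelEntry:
    """One documented design decision in a hypothetical task model."""
    item_format: str          # e.g., "drag and drop"
    target_behavior: str      # the behavior we want to observe
    supported_inference: str  # the claim about the participant this supports
    rationale: str            # why this format was chosen
    alternatives_considered: List[str] = field(default_factory=list)


def undocumented(entries):
    """Return the entries that have no recorded rationale."""
    return [e for e in entries if not e.rationale.strip()]


# Illustrative entries only; the content is invented for this example.
entries = [
    TaskModelEntry(
        item_format="drag and drop",
        target_behavior="orders the steps of a procedure",
        supported_inference="participant can sequence the procedure correctly",
        rationale="ordering is observed directly rather than recognized from a list",
        alternatives_considered=["multiple choice"],
    ),
    TaskModelEntry(
        item_format="multiple choice",
        target_behavior="identifies the correct definition",
        supported_inference="participant knows key terminology",
        rationale="",  # no justification recorded yet
    ),
]

for entry in undocumented(entries):
    print(f"Missing rationale: {entry.item_format} -> {entry.supported_inference}")
```

A simple check like `undocumented` is one way to flag item formats that have been chosen without a recorded link back to the inference they are meant to support.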

I see more and more research where assessments are described using the vocabulary of ECD, including fixed-form assessments, adaptive assessments, education games, and simulations (e.g., Journal of Educational Data Mining, 4(1)). ECD can be used to describe any assessment, not just the standardized formats that we are used to.

As technology allows us to explore new ways of assessing participants, ECD provides a common thread to help define our design choices, make comparisons between designs, and support our inferences.

ECD has five parts:

  • Domain Analysis
  • Domain Modeling
  • Conceptual Assessment Framework (CAF)
  • Assessment Implementation
  • Assessment Delivery

If you are a Questionmark client, Questionmark may play a role in all five of these areas, and we are almost certainly a part of the final three. In my next three posts, I will focus on the three parts of the CAF.

By exploring the CAF, we will learn about how to break up our assessment design into its functional components so that we can determine how different Questionmark tools can be leveraged to improve the validity of our inferences.

If you are interested in learning more, there are many great articles about ECD, and the links in this post are a good starting point. Have a good example of ECD being used to implement an assessment? Please share it by leaving a comment!

