Assembling the Test Form — Test Design and Delivery Part 7

Posted By Doug Peterson

In the previous post in this series, we looked at putting together assessment instructions for both the participant and the instructor/administrator. Now it’s time to start selecting the actual questions.

Back in Part 2 we discussed determining how many items needed to be written for each content area covered by the assessment. We looked at writing 3 times as many items as were actually needed, knowing that some would not make it through the review process. Doing this also enables you to create multiple forms of the test, where each form covers the same concepts with equivalent – but different – questions. We also discussed the amount of time a participant needs to answer each question type, as shown in this table:

As you’re putting your assessment together, you have to account for the time required to take it. Multiply the number of questions of each type in the assessment by the corresponding time in the table above, then add up the results to get the total answering time.

You also need to allow time for:

  • Reading the instructions
  • Reviewing sample items
  • Completing practice items
  • Completing demographic info
  • Taking breaks
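The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the per-question times, the question counts, and the overhead figures below are made-up placeholders (not the values from the table), and the function name `estimate_minutes` is hypothetical.

```python
# Hypothetical minutes per question type; substitute the values from your own table.
MINUTES_PER_ITEM = {
    "multiple_choice": 1.0,
    "fill_in_the_blank": 1.0,
    "matching": 2.0,
    "short_answer": 3.0,
}


def estimate_minutes(item_counts, overhead_minutes):
    """Sum (count x minutes) per question type, then add the fixed overhead."""
    answering = sum(
        MINUTES_PER_ITEM[question_type] * count
        for question_type, count in item_counts.items()
    )
    return answering + sum(overhead_minutes.values())


# Made-up example: 30 multiple choice and 5 short-answer questions.
counts = {"multiple_choice": 30, "short_answer": 5}

# Made-up overhead, one entry per bullet in the list above.
overhead = {
    "instructions": 3,
    "sample_items": 2,
    "practice_items": 5,
    "demographics": 2,
    "breaks": 0,
}

total = estimate_minutes(counts, overhead)
print(total)  # 57.0
```

With these placeholder numbers, 45 minutes go to answering questions and 12 to everything else, which is why the overhead items are worth counting.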

If you already know the time allowed for your assessment, you may have to work backwards or make some compromises. For example, if you know that you only have one hour for the assessment and you have a large amount of content to cover, you may want to focus on multiple choice and fill-in-the-blank questions and stay away from matching and short-answer questions, to maximize the number of questions you can include in the time allowed.
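Working backwards from a fixed time limit might look something like the sketch below. The one-hour limit comes from the example above; the 10 minutes of overhead and the per-question times are assumptions chosen for illustration, and `max_items` is a hypothetical helper name.

```python
import math


def max_items(time_limit, overhead, minutes_per_item):
    """How many questions of one type fit once overhead is subtracted."""
    available = time_limit - overhead
    if available <= 0:
        return 0
    return math.floor(available / minutes_per_item)


TIME_LIMIT = 60  # one hour, as in the example above
OVERHEAD = 10    # assumed: instructions, sample items, demographics

# Assumed per-question times; use the values from your own table.
multiple_choice_only = max_items(TIME_LIMIT, OVERHEAD, 1.0)
short_answer_only = max_items(TIME_LIMIT, OVERHEAD, 3.0)

print(multiple_choice_only)  # 50
print(short_answer_only)     # 16
```

Under these assumptions the same hour holds roughly three times as many multiple choice questions as short-answer questions, which is the trade-off to weigh against the content you need to cover.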

To select the actual items for the assessment, you may want to consider using a Test Assembly Form, which might look something like this:

The content area is in the first column. The second column shows how many questions are needed for that content area (as calculated back in Part 2). Each item should have a short identifier associated with it, and this is provided in the “Item #” column. The “Keyword” column is just that – one or two words to remind you what the question addresses. The last column lists the item number of an alternate item in case a problem is found with the first selection during assessment review.
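If you keep the form in a script or spreadsheet export rather than on paper, the same columns could be represented roughly as follows. This is a sketch only: the content areas, item numbers, and keywords are invented, and `FormRow` and `shortfalls` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FormRow:
    """One selected item on the Test Assembly Form."""
    content_area: str
    item_id: str       # the "Item #" column
    keyword: str       # one or two words describing the question
    alternate_id: str  # fallback item if this one fails assessment review


# Questions needed per content area (the second column of the form).
needed = {"Safety": 2, "Maintenance": 1}

# Invented rows for illustration.
rows = [
    FormRow("Safety", "S-01", "goggles", "S-07"),
    FormRow("Safety", "S-04", "lockout", "S-09"),
    FormRow("Maintenance", "M-02", "lubrication", "M-05"),
]


def shortfalls(needed, rows):
    """Return content areas that still have fewer items than required."""
    selected = Counter(row.content_area for row in rows)
    return {
        area: count - selected[area]
        for area, count in needed.items()
        if selected[area] < count
    }


print(shortfalls(needed, rows))  # {}
```

An empty result means every content area has as many items selected as Part 2 called for.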

As you select items, watch out for two things:

1. Enemy items. This is when one item gives away the answer to another item. Make sure that the stimulus or answer to one item does not answer or give a clue to the answer of another item.

2. Overlap. This is when two questions basically test the same thing. You want to cover all of the content in a given content area, so each question for that content area should cover something unique. If you find that you have several questions assessing the same thing, you may need to write some new questions or you may need to re-calculate how many questions you actually need.
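Both checks ultimately need a human reviewer, but a small script can flag candidates for a closer look. The sketch below assumes two things that are not part of the form described above: that reviewers have recorded known enemy pairs somewhere, and that two items sharing a content area and keyword are worth checking for overlap. All item data and function names are invented.

```python
from collections import defaultdict
from itertools import combinations

# Invented selection: (item number, content area, keyword).
selected = [
    ("S-01", "Safety", "goggles"),
    ("S-04", "Safety", "lockout"),
    ("S-06", "Safety", "goggles"),
]

# Assumed: pairs that reviewers have already marked as enemy items.
enemy_pairs = {frozenset({"S-01", "S-04"})}


def overlap_candidates(items):
    """Group items that share a content area and keyword."""
    by_key = defaultdict(list)
    for item_id, area, keyword in items:
        by_key[(area, keyword.lower())].append(item_id)
    return [ids for ids in by_key.values() if len(ids) > 1]


def enemy_conflicts(items, enemies):
    """Return selected pairs that appear in the recorded enemy list."""
    ids = [item_id for item_id, _, _ in items]
    return [pair for pair in combinations(ids, 2) if frozenset(pair) in enemies]


print(overlap_candidates(selected))            # [['S-01', 'S-06']]
print(enemy_conflicts(selected, enemy_pairs))  # [('S-01', 'S-04')]
```

A shared keyword is only a hint of overlap, and an empty enemy list does not prove there are no enemy items, so treat the output as a starting point for review rather than a verdict.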

Once you have your assessment put together, you need to calculate the cutscore. This topic could easily be another (very lengthy) blog series, and there are many books available on calculating cutscores. I recently read the book, Cutscores: A Manual for Setting Standards of Performance on Educational and Occupational Tests, by Zieky, Perie and Livingston. I found it to be a very good book, considering that the subject matter isn’t exactly “thrill a minute”. The authors discuss 18 different methods for setting cutscores, including which methods to use in various situations and how to carry out a cutscore study. They look at setting cutscores for criterion-referenced assessments (where performance is judged against a set standard) as well as norm-referenced assessments (where the performance of one participant is judged against the performance of the other participants). They also look at pass/fail situations as well as more complex judgments such as dividing participants into basic, proficient and advanced categories.

2 Responses to “Assembling the Test Form — Test Design and Delivery Part 7”

  1. The Angoff Technique is now the most common and researched technique for setting a cut score. Its reliance on subject matter experts makes it the easiest one to execute, and it works just fine.
