How to create reliable tests using JTA

Posted by Jim Farrell

The gold standard of testing is valid test results. You must always ask yourself: Does this test really measure what it is supposed to measure? Will the topics covered tell me whether the participant has the knowledge or skills to perform the tasks required for the job? The only way to be sure is to truly know what the tasks are, how important they are, and how often they are performed, so that you ask relevant questions. All of this information is captured in a Job Task Analysis (JTA). (A JTA question type is available in Questionmark Live.)

A JTA is an exercise that helps you define the tasks a person in a particular position needs to perform or supervise and then measure the:

1. difficulty of the task

2. importance of the task

3. frequency of the task

Together, these dimensions are often called the DIF. There may be other dimensions you want to measure, but the DIF alone can help you build a competency model for the job. A competency model is a visual representation of the skills and knowledge a person needs to be highly successful. It is created by interviewing subject matter experts (SMEs), who define the DIF for each task. This sounds like a piece of cake, right? Well, it can be, but many people disregard creating a JTA because of the time and expense. The thought of going out to interview SMEs and then coming back to correlate a ton of data sounds daunting. That is where Questionmark can help.
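To make the idea concrete, here is a minimal sketch of turning SME ratings into a task ranking. The 1-to-5 scale, equal weights, task names, and scoring formula are all illustrative assumptions for this sketch, not Questionmark's method.

```python
# Sketch: combining SME ratings of Difficulty, Importance and Frequency (DIF)
# into a single priority score per task. Scale, weights and task names are
# invented for illustration.
from statistics import mean

# Each SME rates each task on three dimensions, 1 (low) to 5 (high).
sme_ratings = {
    "Prepare monthly report": [
        {"difficulty": 2, "importance": 5, "frequency": 5},
        {"difficulty": 3, "importance": 4, "frequency": 5},
    ],
    "Configure backup server": [
        {"difficulty": 5, "importance": 4, "frequency": 1},
        {"difficulty": 4, "importance": 5, "frequency": 2},
    ],
}

def dif_score(ratings, weights=(1.0, 1.0, 1.0)):
    """Average each dimension across SMEs, then take a weighted sum."""
    wd, wi, wf = weights
    d = mean(r["difficulty"] for r in ratings)
    i = mean(r["importance"] for r in ratings)
    f = mean(r["frequency"] for r in ratings)
    return wd * d + wi * i + wf * f

scores = {task: dif_score(r) for task, r in sme_ratings.items()}
# Rank tasks so the highest-priority ones surface first in the competency model.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A real JTA tool aggregates far more responses and demographics, but the core step, averaging each dimension across SMEs and ranking tasks, looks like this.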

With our JTA question type, you can create a list of tasks and the dimensions on which to measure them. You can then send the survey out to all of your SMEs and use the dedicated job task analysis reports to vet the results and create your competency model. Now that makes it a piece of cake!

Let’s take a quick look at the process a little more closely. In authoring, you can define your tasks and dimensions by entering them directly or importing them from an outside source.



Once you add your question to a survey, you can deliver it to your SMEs.

The final step of the process is running reports broken down by different demographic properties. This will give you the opportunity to sit down and analyze your results, vet them with your SMEs, and develop your competency model.

Let’s get to why we are here: designing a test that will yield valid, meaningful results. Now that you know what needs to be tested, you can create a test blueprint or specification. This documentation will drive your item development process and ensure you have the right questions, because you can map each one back to the tasks in your competency model.
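A blueprint can be as simple as a mapping from each task in the competency model to a target item count, which you then check authored items against. The task names, counts, and helper below are made up for illustration, not taken from any Questionmark feature.

```python
# Sketch of a test blueprint: each task from the competency model gets a
# target number of items, and authored items are tagged with the task
# they measure. All names and numbers are invented.
blueprint = {
    "Prepare monthly report": 6,
    "Configure backup server": 4,
    "Respond to support tickets": 10,
}

# Items written so far, each mapped back to a blueprint task.
items = [
    {"id": "Q1", "task": "Prepare monthly report"},
    {"id": "Q2", "task": "Configure backup server"},
    {"id": "Q3", "task": "Prepare monthly report"},
]

def coverage_gaps(blueprint, items):
    """Return how many more items each task still needs."""
    written = {}
    for item in items:
        written[item["task"]] = written.get(item["task"], 0) + 1
    return {task: target - written.get(task, 0)
            for task, target in blueprint.items()}

gaps = coverage_gaps(blueprint, items)
```

Running this after each authoring session shows at a glance which tasks are under-covered before the test goes out.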

Final Planning Considerations – Test Design and Delivery Part 3

Posted By Doug Peterson

In Part 2 of this series, we looked at how to determine how many items you needed to write for each content area covered by your assessment. In this installment, we’ll take a look at a few more things that must be considered before you start writing items.

You must balance the number of items you’ve determined that you need with any time constraints imposed on the actual taking of the test. Take a look at the following table showing the average amount of time a participant spends on different question types:

It’s easy to see that it will take much longer for a participant to complete an assessment containing 100 short-answer questions than one with 100 True/False questions. Therefore, a time limit on the assessment might constrain which question types can be used; conversely, the time limit may be influenced by how many questions you calculate you need and the question types you want to use. There’s no hard-and-fast rule here; it’s just something that needs to be considered before you go off and write a bunch of items.
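The balancing act above is simple arithmetic once you have per-type timing estimates. The seconds-per-question figures below are illustrative assumptions only, not measured averages; substitute your own data.

```python
# Back-of-the-envelope check that a question mix fits a time limit.
# The per-type timings are invented placeholders, not real averages.
avg_seconds = {"true_false": 30, "multiple_choice": 75, "short_answer": 180}

def estimated_minutes(question_counts):
    """Total expected completion time in minutes for a mix of question types."""
    total = sum(avg_seconds[qtype] * n for qtype, n in question_counts.items())
    return total / 60

# Example mix: 20 T/F + 30 multiple choice + 5 short answer.
mix = {"true_false": 20, "multiple_choice": 30, "short_answer": 5}
```

If the estimate exceeds the time limit, you either cut items or shift toward faster question types, exactly the trade-off described above.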

The time a participant requires to complete an item is not the only thing that should be considered when determining the item format for your assessment. True/False questions might have some problems because there is a 50% chance of guessing correctly, making it hard to know if the participant knew the material or was just lucky. (I look at this in more depth in this blog article.)

Multiple Choice questions are good for recall and are easy, quick, and objective to score, but many assessment developers feel they really only test the participant’s ability to memorize. Some learning professionals are also critical of presenting incorrect information, which could lead to a wrong answer being recalled at some point in the future. I discuss writing Multiple Choice questions here and here.

Essay questions are good for testing higher cognitive skills such as formulating a correct answer from scratch and applying concepts to a new situation. However, they take longer for the participant to answer, and scoring takes longer and is more subjective than for other question types.

Finally, you need to decide on the presentation format, which basically boils down to two choices: paper and pencil, or computer-based (including over the Internet). Each has its pros and cons.

Paper and pencil is not susceptible to technical problems. It is comfortable for people unfamiliar with computers. However, it’s labor-intensive when it comes to distribution of the materials, grading, tracking, reporting, etc.

Computer-based assessments are faster and easier to update. Results can be provided immediately. Computer-based assessments allow for question types such as software simulations that paper and pencil can’t provide. However, not everyone is comfortable with computers or can type at a reasonable rate – you don’t want someone who knows the material to fail a test because they couldn’t answer questions fast enough using a keyboard.

Are True/False Questions Useless?

Posted By Doug Peterson

As a test designer, I need every question to tell me something about the learner that I didn’t know before the question was answered: Does the learner have the knowledge for which the question is testing? Developing questions costs money, and every question takes up some of the learner’s time, so every question needs to be effective.

Is a True/False question effective? Does it tell me whether or not the learner actually learned anything? One would think that if the learner answered correctly, it would mean the learning was successful. The problem is that with only two choices, the learner has a 50% chance of simply guessing the correct answer. So does a correct answer really mean the learner possessed the knowledge, or does it simply mean the learner guessed correctly? You can’t tell, so the True/False question cannot be counted on as an indicator of successful training.
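The 50% guessing problem can be quantified. As a sketch (the pass marks and question counts below are arbitrary examples), here is the probability that a learner with no knowledge at all reaches a passing score on an all-True/False test purely by guessing:

```python
# Probability of passing an all-True/False test by blind guessing,
# modelling each answer as a fair coin flip (binomial distribution).
from math import comb

def p_pass_by_guessing(n_questions, pass_mark):
    """P(at least pass_mark correct) when every answer is a 50/50 guess."""
    favourable = sum(comb(n_questions, k)
                     for k in range(pass_mark, n_questions + 1))
    return favourable / 2 ** n_questions
```

On a single question the chance is 50%, and even on a 10-question test with a pass mark of 7 a pure guesser passes about 17% of the time, which is why a correct answer alone tells you so little.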

So is that it for our good friend, the True/False question? No more True/False questions on our quizzes, tests and exams? Is the True/False question history?

No, not at all.

While a True/False question may not truly be able to tell you what a learner does know, it is very good at telling you what a learner doesn’t know! When the learner gets a True/False question wrong, you can be confident it is because they don’t possess the desired knowledge.

This means that True/False questions work very well on pre-tests given to the learner before the training. They can help identify what the learner doesn’t know so that they know the topics on which to focus in the training.
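As a sketch of that pre-test use, wrong answers can be rolled up by topic to produce a personal study list. The topics and responses here are invented for illustration.

```python
# Sketch: flagging topics for review from True/False pre-test results.
# A wrong answer reliably signals a knowledge gap, so any topic with at
# least one wrong answer goes on the learner's study list.
pretest = [
    {"topic": "Fire safety", "correct": True},
    {"topic": "Fire safety", "correct": False},
    {"topic": "First aid", "correct": True},
    {"topic": "First aid", "correct": True},
]

def topics_to_review(responses):
    """Topics with at least one wrong answer, sorted for stable output."""
    return sorted({r["topic"] for r in responses if not r["correct"]})
```

Note the asymmetry: a topic absent from the list is not proven known (the learner may have guessed correctly), which is exactly the limitation discussed above.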

So don’t give up on the trusty True/False question! Just make sure that you understand what it really shows you about the learner, and that you use it in the right place.

Creating scenario-based assessments

Scenario-based assessments (like this one, for example) are a great way to test learners’ understanding of a specific subject or gauge how someone would react in certain circumstances.

You can create scenario-based assessments in Questionmark Perception version 5 by grouping a series of questions with a single stimulus such as a reading passage, case study, video, image or audio track.

To do this, you select the appropriate template within the Perception Assessment Wizard to group related questions into a single block. You use text and images to create a static introduction or stimulus that remains visible on one half of the window while the related questions appear one at a time on the other half.

You can create questions as you would any normal set of questions. Group them in a single sub-topic or place the questions in other relevant topics.

Once your assessment is complete, you can schedule it like any other assessment.