Writing JTA Task Statements

Posted by Austin Fossey

One of the first steps in an evidence-centered design (ECD) approach to assessment development is a domain analysis. If you work in credentialing, licensure, or workplace assessment, you might accomplish this step with a job task analysis (JTA) study.

A JTA study gathers examples of tasks that potentially relate to a specific job. These tasks are typically harvested from existing literature or observations, reviewed by subject matter experts (SMEs), and rated by practitioners or other stakeholder groups across relevant dimensions (e.g., applicability to the job, frequency of the task). The JTA results are often used later to determine the content areas, cognitive processes, and weights that will be on the test blueprint.
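
To make the downstream use of those ratings concrete, here is a minimal sketch in Python of how task-level ratings might be rolled up into draft blueprint weights. The rating dimensions, the five-point scale, and the multiplicative "criticality" roll-up are illustrative assumptions, not a prescribed method:

```python
# Hypothetical illustration: roll task-level JTA ratings up into draft topic weights.
# The rating dimensions, scale, and weighting rule are assumptions, not a standard.
from collections import defaultdict
from statistics import mean

# Each record: a task statement, the content area it was assigned to, and
# practitioner ratings (e.g., frequency and importance on a 1-5 scale).
ratings = [
    {"task": "Measure skid marks", "area": "Collision investigation",
     "frequency": [3, 4, 3], "importance": [4, 5, 4]},
    {"task": "Interview witnesses", "area": "Patrol procedures",
     "frequency": [5, 4, 5], "importance": [5, 5, 4]},
]

area_scores = defaultdict(float)
for r in ratings:
    # One common convention combines dimensions multiplicatively ("criticality").
    area_scores[r["area"]] += mean(r["frequency"]) * mean(r["importance"])

total = sum(area_scores.values())
blueprint_weights = {area: score / total for area, score in area_scores.items()}
print(blueprint_weights)  # approximately 0.40 for collision investigation, 0.60 for patrol
```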

 Questionmark has tools for authoring and delivering JTA items, as well as some limited analysis tools for basic response frequency distributions. But if we are conducting a JTA study, we need to start at the beginning: how do we write task statements?

One of my favorite sources on the subject is Mark Raymond and Sandra Neustel’s chapter, “Determining the Content of Credentialing Examinations,” in The Handbook of Test Development. The chapter provides information on how to organize a JTA study, how to write tasks, how to analyze the results, and how to use the results to build a test blueprint. The chapter is well-written, and easy to understand. It provides enough detail to make it useful without being too dense. If you are conducting a JTA study, I highly recommend checking out this chapter.

Raymond and Neustel explain that a task statement can refer to a physical or cognitive activity related to the job/practice. The format of a task statement should always follow a subject/verb/object format, though it might be expanded to include qualifiers for how the task should be executed, the resources needed to do the task, or the context of its application. They also underscore that most task statements should have only one action and one object. There are some exceptions to this rule, but if there are multiple actions and objects, they typically should be split into different tasks. As a hint, they suggest critiquing any task statement that has the words “and” or “or” in it.
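
As a trivial illustration of that last hint, here is a short Python sketch (my own example, not part of any JTA tool) that flags draft task statements containing "and" or "or" so they can be checked for compound actions or objects:

```python
import re

def flag_compound_statements(task_statements):
    """Return task statements that contain 'and' or 'or' and may need splitting."""
    pattern = re.compile(r"\b(and|or)\b", re.IGNORECASE)
    return [s for s in task_statements if pattern.search(s)]

drafts = [
    "Measure skid marks for calculation of approximate vehicle speed",
    "Inspect and repair hydraulic pumps or compressors",
]
for statement in flag_compound_statements(drafts):
    print("Review for multiple actions/objects:", statement)
```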

Here is an example of a task statement from the Michigan Commission on Law Enforcement Standards’ Statewide Job Analysis of the Patrol Officer Position: Task 320: “[The patrol officer can] measure skid marks for calculation of approximate vehicle speed.”

I like this example because it is specific, certainly better than just saying "determine vehicle's speed." It also provides a qualifier for how good the measurement needs to be ("approximate"). The statement might be improved by adding more context (e.g., "using a tape measure"), but that detail may already be understood by the participant population.

Raymond and Neustel also caution researchers to avoid words that might have multiple meanings or vague meanings. For example, the verb “instruct” could mean many different things—the practitioner might be giving some on-the-fly guidance to an individual or teaching a multi-week lecture. Raymond and Neustel underscore the difficult balance of writing task statements at a level of granularity and specificity that is appropriate for accomplishing defined goals in the workplace, but at a high enough level that we do not overwhelm the JTA participants with minutiae. The authors also advise that we avoid writing task statements that describe best practice or that might otherwise yield a biased positive response.

Early in my career, I observed a JTA SME meeting for an entry-level credential in the construction industry. In an attempt to condense the task list, the psychometrician on the project combined a handful of seemingly related tasks into a single statement, something along the lines of "practitioners have an understanding of the causes of global warming." This is not a task statement; it is a knowledge statement, and it would be better suited for a blueprint. It is also not very specific. Most importantly, it yielded a biased response from the JTA survey sample. Because the statement mentioned "global warming," which many would agree is a serious issue, respondents rated it as very important. As a result, this task statement heavily influenced the topic weighting of the blueprint, but when it came time to develop the content, there was not much that could be written. Item writers were stuck having to write dozens of items for a vague yet somehow very important topic. They ended up churning out loads of questions about one of the few topics that were relevant to the practice: refrigerants. The end result was a general knowledge assessment with tons of questions about refrigerants. This experience taught me how a lack of specificity and the phrasing of task statements can undermine the entire content validity argument for an assessment's results.

If you are new to JTA studies, it is worth mentioning that a JTA can sometimes turn into a significant undertaking. I attended one of Mark Raymond's seminars earlier this year, and he observed anecdotally that he has had JTA studies take anywhere from three months to over a year. There are many psychometricians who specialize in JTA studies, and it may be helpful to work with them for some aspects of the project, especially when conducting a JTA for the first time. However, even if we use a psychometric consultant to conduct or analyze the JTA, learning about the process can make us better-informed consumers and allow us to handle some of the work internally, potentially saving time and money.

Example of task input screen for a JTA item in Questionmark Authoring.

For more information on the JTA and other reporting tools that are available with Questionmark, check out this Reporting & Analytics page.

The key to reliability and validity is authoring

Posted by John Kleeman

In my earlier post I explained how reliability and validity are the keys to trustable assessment results. A reliable assessment measures consistently, and a valid assessment measures what you need it to measure.

The key to validity and reliability starts with the authoring process. If you do not have a repeatable, defensible process for authoring questions and assessments, then however good the other parts of your process are, you will not have valid and reliable assessments.

The critical value that Questionmark brings is its structured authoring processes, which enable effective planning, authoring, and reviewing of questions and assessments and make them more likely to be valid.

Questionmark’s white paper “Assessment Results You Can Trust” suggests 18 key authoring measures for making trustable assessments – here are three of the most important.

Organize items in an item bank with topic structure

There are huge benefits to using an assessment management system with an item bank that structures items by hierarchical topics as this facilitates:

  • An easy management view of all items and assessments under development
  • Mapping of topics to relevant organizational areas of importance
  • Clear references from items to topics
  • Use of the same item in multiple assessments
  • Simple addition of new items within a topic
  • Easy retiring of items when they are no longer needed
  • Version history maintained for legal defensibility
  • Search capabilities to identify questions that need updating when laws change or a product is retired
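
By way of illustration, here is a minimal sketch (in Python, with a made-up data model rather than Questionmark's actual schema) of what an item bank structured by hierarchical topics might look like, with assessments referencing items rather than embedding copies of them:

```python
# Hypothetical item-bank model: topics form a tree, items live in topics,
# and assessments reference items by ID rather than embedding copies.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    item_id: str
    stem: str
    status: str = "active"   # e.g., active / retired
    version: int = 1

@dataclass
class Topic:
    name: str
    items: List[Item] = field(default_factory=list)
    subtopics: List["Topic"] = field(default_factory=list)

    def all_items(self):
        """Yield items in this topic and all subtopics (supports search and updates)."""
        yield from self.items
        for sub in self.subtopics:
            yield from sub.all_items()

bank = Topic("Safety", subtopics=[
    Topic("Electrical", items=[Item("Q101", "What is the correct lockout step ...?")]),
    Topic("Hydraulics", items=[Item("Q205", "Which valve should be closed first ...?")]),
])

# An assessment stores references, so the same item can appear in many assessments.
exam_form_a = ["Q101", "Q205"]
print([item.item_id for item in bank.all_items()])
```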

Some stand-alone e-Learning creation tools and some LMSs do not provide an item bank and require you to insert questions individually within each assessment. Such systems can work if you only have a handful of assessments or rarely need to update them, but anyone managing more than a few assessments needs an item bank to author effectively.

An authoring tool that subject matter experts can use directly

One of the critical factors in making successful items is to get effective input from subject matter experts (SMEs), as they are usually more knowledgeable and better able to construct and review questions than learning technology specialists or general trainers.

If you can use a system like Questionmark Live to harvest or “crowdsource” items from SMEs and have learning or assessment specialists review them, your items will be of better quality.

Easy collaboration for item reviewers to help make items more valid

Items will be more valid if they have been properly reviewed. They will also be more defensible if past changes are auditable. A track-changes capability, like the one shown in the example screenshot below, is invaluable in the review process: it allows authors to see what changes are being proposed and to check that they make sense.

Screenshot of track changes functionality in Questionmark Live

These three capabilities – an item bank, an authoring tool that SMEs can use directly, and easy collaboration with track changes – are critical for obtaining reliable and valid, and therefore trustable, assessments.

For more information on how to make trustable assessments, see our white paper "Assessment Results You Can Trust."

Item Development – Organizing a content review committee (Part 2)

Posted by Austin Fossey

In my last post, I explained the function of a content review committee and the importance of having a systematic review process. Today I’ll provide some suggestions for how you can use the content review process to simultaneously collect content validity evidence without having to do a lot of extra work.

If you want to get some extra mileage out of your content review committee, why not tack on a content validity study? Instead of asking them if an item has been assigned to the correct area of the specifications, ask them to each write down how they would have classified the item’s content. You can then see if topics picked by your content review committee correspond with the topics that your item writers assigned to the items.

There are several ways to conduct content validity studies, and a content validity study on its own might not be sufficient evidence to support the overall validity of the assessment results. A full review of validity concepts is outside the scope of this article, but one way to check whether items match their intended topics is to have your committee members rate how well they think an item matches each topic on the specifications. A score of 1 means they think the item matches, a score of -1 means they think it does not match, and a score of 0 means that they are not sure.

If each committee member provides their own ratings, you can calculate the index of congruence, which was proposed by Richard Rovinelli and Ron Hambleton. You can then create a table of these indices to see whether the committee's classifications correspond to the content classifications given by your item writers.
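
For readers who want to compute it themselves, here is a small Python sketch of that index for a single item: N times the difference between the topic's mean rating and the item's mean rating across all N topics, divided by 2(N - 1), using the +1/0/-1 ratings described above. The example data and function names are my own:

```python
def index_of_congruence(ratings_by_topic, topic):
    """Rovinelli-Hambleton index of item-objective congruence for one item and one topic.

    ratings_by_topic maps each topic to the list of committee ratings
    (+1 = matches, 0 = unsure, -1 = does not match) for a single item.
    """
    n_topics = len(ratings_by_topic)
    mean_k = sum(ratings_by_topic[topic]) / len(ratings_by_topic[topic])
    grand_mean = sum(sum(r) / len(r) for r in ratings_by_topic.values()) / n_topics
    return n_topics * (mean_k - grand_mean) / (2 * (n_topics - 1))

# Example: five committee members rate one item against three topics.
item_ratings = {
    "Topic 1": [1, 1, 1, 1, 0],
    "Topic 2": [-1, -1, 0, -1, -1],
    "Topic 3": [-1, -1, -1, -1, -1],
}
for t in item_ratings:
    print(t, round(index_of_congruence(item_ratings, t), 2))
```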

The chart below compares item writers' topic assignments for two items and the index of congruence determined by a content committee's ratings of the two items on an assessment with ten topics. We see that both groups agreed that Item 1 belonged to Topic 5 and Item 2 belonged to Topic 1. We also see that the content review committee was uncertain about whether Item 1 measured Topic 2, and that some of the committee members felt that Item 2 measured Topic 7.

Comparison of content review committee's index of congruence and item writers' classifications of two items on an assessment with ten topics.

Item Development – Organizing a content review committee (Part 1)

Posted by Austin Fossey

Once your items have passed through an initial round of edits, it is time for a content review committee to examine them. Remember that you should document the qualifications of your committee members, and if possible, recruit different people than those used to write the items or conduct other reviews.

In their chapter in Educational Measurement (4th ed.), Cynthia Schmeiser and Catherine Welch explain that the primary function of the content review committee is to verify the accuracy of the items with regard to the defined domain, including the content and cognitive classification of items. The committee might answer questions like:

  • Given the information in the stem, is the item key the correct answer in all situations?
  • Is enough information provided in the item for candidates to choose an answer?
  • Given the information in the stem, are the distractors incorrect in all situations?
  • Would a participant with specialized knowledge interpret the item and the options differently from the general population of participants?
  • Is the item tagged to the correct area of the specifications (e.g., topic, subdomain)?
  • Does the item function at the intended cognitive level?

Other content review goals may be added depending on your specific testing purpose. For example, in their chapter in Educational Measurement (4th ed.), Brian Clauser, Melissa Margolis, and Susan Case observe that for certification and licensure exams, a content review committee might determine whether items are relevant to new practitioners—the intended audience for such assessments.

Schmeiser and Welch also recommend that the review process be systematic, implying that the committee should apply a consistent level of scrutiny and consistent decision criteria to each item they review. But how can you, as the test developer, keep things systematic?

One way is to use a checklist of the acceptance criteria for each item. By using a checklist, you can ensure that the committee reviews and signs off on each aspect of the item’s content. The checklist can also provide a standardized format for documenting problems that need to be addressed by the item writers. These checklists can be used to report the results of the content review, and they can be kept as supporting documentation for the Test Development and Revision requirements specified by the Standards for Educational and Psychological Testing.
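
As a rough illustration (hypothetical structure and field names, not a Questionmark feature), such a checklist could be captured as a simple record per item, with the criteria mirroring the questions listed above:

```python
# Hypothetical content-review checklist record; criteria mirror the questions above.
CRITERIA = [
    "Key is correct in all situations",
    "Enough information to choose an answer",
    "Distractors are incorrect in all situations",
    "No alternative interpretation for specialized participants",
    "Tagged to the correct area of the specifications",
    "Functions at the intended cognitive level",
]

def review_record(item_id, decisions, notes=""):
    """decisions maps each criterion to True (accepted) or False (needs revision)."""
    unresolved = [c for c in CRITERIA if not decisions.get(c, False)]
    return {"item_id": item_id, "accepted": not unresolved,
            "unresolved": unresolved, "notes": notes}

record = review_record(
    "Q101",
    {c: True for c in CRITERIA[:-1]},  # last criterion not yet signed off
    notes="Cognitive level disputed; send back to item writer.",
)
print(record["accepted"], record["unresolved"])
```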

In my next post, I’ll suggest some ways for you, as a test developer, to leverage your content review committee to gather content validity evidence for your assessment.

For best practice guidance and practical advice for the five key stages of test and exam development, check out our white paper: 5 Steps to Better Tests.

New tools for building questions and assessments

Posted by Jim Farrell

If you are a Questionmark customer and aren’t using Questionmark Live, what are you waiting for?

More than 2000 of our customers have started using Questionmark Live this year, so I think now is a good time to call out some of the features that are making it a vital part of their assessment development processes.

Let’s start with building questions. One new tool our customers are using is the ability to add notes to a question. This allows reviewers to open questions and leave comments for content developers without changing the version of the question.

Now over to the assessment-building side of things. Our new assessment interface allows users to add questions in many different ways, including single questions, entire topics, and random pulls from a topic. You can even prevent participants from seeing repeated questions during retakes when questions are pulled at random. Jump blocks allow you to shorten test time or route participants who obtain a certain score to extra questions. You can also easily tag questions as demographic questions so they can be used as filters in our reporting and analytics tools.

We have also added more robust outcome capabilities to give your test administrators new tools for controlling how assessments are completed and reported. You can have multiple outcomes for different score bands, and you can also require participants to reach certain scores on particular topics before they can pass a test. For example, suppose you are giving a test on Microsoft Office and you set the pass score at 80%. You probably want to make sure that your participants understand all of the products and don't bomb one of them. You can set a prerequisite of 80% for each topic to make sure participants have knowledge of all areas before passing. If someone gets 100% on the Word questions and 60% on the Excel questions, they would not pass. Powerful outcome controls help ensure you are truly measuring the goals of your learning organization.
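
To make the logic of that example explicit, here is a minimal sketch of the scoring rule described above (my own illustration, assuming equally weighted topics, not Questionmark's implementation):

```python
def passes(topic_scores, overall_pass=0.80, topic_prerequisite=0.80):
    """Pass only if the overall percentage and every topic percentage meet their thresholds."""
    overall = sum(topic_scores.values()) / len(topic_scores)  # assumes equally weighted topics
    return overall >= overall_pass and all(s >= topic_prerequisite for s in topic_scores.values())

# Word at 100% and Excel at 60% averages 80%, but the Excel prerequisite is not met.
print(passes({"Word": 1.00, "Excel": 0.60}))  # False
print(passes({"Word": 0.85, "Excel": 0.82}))  # True
```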

If you aren’t using Questionmark Live you are missing out, as we are releasing new functionality every month. Get access and start getting your subject matter experts to contribute to your item banks.

Top 5 Questionmark Videos in 2013

Posted by Julie Delazyn

As we near the end of the year, we’d like to highlight some of the most popular videos we’ve featured here on the blog in 2013.

We have been posting and sharing many videos from the Questionmark Learning Café. There you can find more than three dozen videos, demos and other resources on everything from quick tutorials to complete webinars about best practices in the use of online surveys, quizzes, tests and exams.

The five most popular videos in 2013… Drumroll, please…

5. Actionable Data And The A-Model
4. Introduction to Questionmark
3. Copy & paste & enhanced question selection in Questionmark Live
2. Assessments Through the Learning Process
1. How to author a Hot Spot Question using Questionmark Live

Thank you for watching, and look for more videos in 2014!