Six tips to increase content validity in competence tests and exams

Posted by John Kleeman

Content validity is one of the most important criteria on which to judge a test, exam or quiz. This blog post explains what content validity is, why it matters and how to increase it when using competence tests and exams within regulatory compliance and other work settings.

What is content validity?

An assessment has content validity if the content of the assessment matches what is being measured, i.e. it reflects the knowledge and skills required to do a job or shows that the participant has sufficiently grasped the course content.

Content validity is often measured by having a group of subject matter experts (SMEs) verify that the test measures what it is supposed to measure.

Why does content validity matter?

If an assessment doesn’t have content validity, then the test isn’t actually testing what it sets out to measure, or it misses important aspects of the job skills.

Would you want to fly in a plane where the pilot knows how to take off but not how to land? Obviously not! Assessments for airline pilots take into account all job functions, including landing in emergency scenarios.

Similarly, if you are testing your employees to ensure competence for regulatory compliance purposes, or before you let them sell your products, you need to ensure the tests have content validity – that is to say they cover the job skills required.

In addition to these common-sense reasons, if you use an assessment without content validity to make decisions about people, you could face a lawsuit. See this blog post, which describes a US lawsuit where a court ruled that because a policing test didn’t match the job skills, it couldn’t fairly be used for promotion purposes.

How can you increase content validity?

Here are some tips to get you started. For a deeper dive, Questionmark has several white papers that will help, and I also recommend Shrock & Coscarelli’s excellent book “Criterion-Referenced Test Development”.

  1. Conduct a job task analysis (JTA). A JTA is a survey which asks experts in the job role what tasks are important and how often they are done. A JTA gives you the information to define assessment topics in terms of what the job needs. Questionmark has a JTA question type which makes it easy to deliver and report on JTAs.
  2. Define the topics in the test before authoring. Use an item bank to store questions, and define the topics carefully before you start writing the questions. See Know what your questions are about before you deliver the test for some more reasoning on this.
  3. Poll subject matter experts to check the content validity of an existing test. If you have an existing assessment and need to check its content validity, get a panel of SMEs to rate each question as to whether it is “essential,” “useful, but not essential,” or “not necessary” to the performance of what is being measured. The more SMEs who agree that items are essential, the higher the content validity. See Understanding Assessment Validity- Content Validity for a way to do this within Questionmark software; a rough sketch of how such ratings can be summarized appears after this list.
  4. Use item analysis reporting. Item analysis reports flag questions which don’t correlate well with the rest of the assessment. Questionmark has an easy-to-understand item analysis report which will flag potential questions for review. One of the reasons a question might get flagged is that participants who do well on other questions don’t do well on this question – this could indicate the question lacks content validity. A rough sketch of this kind of item–total correlation also appears after this list.
  5. Involve Subject Matter Experts (SMEs). It might sound obvious, but the more you involve SMEs in your assessment development, the more content validity you are likely to get. Use an assessment management system which is easy for busy SMEs to use, and involve SMEs in writing and reviewing questions.
  6. Review and update tests frequently. Skills required for jobs change quickly with changing technology and changing regulations. Many workplace tests that were valid two years ago are not valid today. Use an item bank with a search facility to manage your questions, and review and update or retire questions that are no longer relevant.
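To make tip 3 concrete, here is a minimal sketch in Python (an illustration only, not a feature of Questionmark software) of one widely used way to summarize SME ratings: Lawshe’s content validity ratio (CVR), which compares the number of “essential” ratings a question receives against the size of the panel.

```python
# Minimal sketch: summarizing SME ratings with Lawshe's content validity ratio.
# CVR ranges from -1 to +1; higher values mean more of the panel agrees the
# question is essential to what is being measured.

def content_validity_ratio(ratings):
    """ratings: one rating per SME for a single question, each one of
    'essential', 'useful, but not essential', or 'not necessary'."""
    n = len(ratings)
    n_essential = sum(1 for r in ratings if r == "essential")
    return (n_essential - n / 2) / (n / 2)

# Hypothetical panel of 8 SMEs rating one question:
ratings = ["essential"] * 6 + ["useful, but not essential"] + ["not necessary"]
print(round(content_validity_ratio(ratings), 2))  # 0.5 - most of the panel calls the item essential
```

Questions with low or negative CVRs are the ones to review, revise or retire.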
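Tip 4’s flagging logic can also be illustrated with a small calculation (again only a sketch; Questionmark’s item analysis report computes more than this): the corrected item–total correlation, which measures how well scores on one question track scores on the rest of the assessment.

```python
# Minimal sketch: corrected item-total correlation for one question.
# A low or negative value suggests the question behaves differently from the
# rest of the assessment and deserves review.

import statistics

def corrected_item_total_correlation(item_scores, total_scores):
    """item_scores: 0/1 per participant for one question;
    total_scores: each participant's score on the whole assessment."""
    rest = [t - i for i, t in zip(item_scores, total_scores)]  # exclude the item itself
    mi, mr = statistics.mean(item_scores), statistics.mean(rest)
    cov = sum((i - mi) * (r - mr) for i, r in zip(item_scores, rest))
    var_i = sum((i - mi) ** 2 for i in item_scores)
    var_r = sum((r - mr) ** 2 for r in rest)
    return cov / (var_i ** 0.5 * var_r ** 0.5)

# Hypothetical results for 8 participants:
item = [1, 0, 1, 1, 0, 0, 1, 0]
total = [9, 4, 8, 7, 6, 3, 9, 5]
print(round(corrected_item_total_correlation(item, total), 2))  # ~0.81 - this question would not be flagged
```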

I hope this blog post reminds you why content validity matters and gives helpful tips to improve the content validity of your tests. If you are using a Learning Management System to create and deliver assessments, you may struggle to obtain and demonstrate content validity. If you want to see how Questionmark software can help manage your assessments, request a personalized demo today.

 

Making your Assessment Valid: 5 Tips from Miami

Posted by John Kleeman

A key reason people use Questionmark’s assessment management system is that it helps you make more valid assessments. To remind you, a valid assessment is one that genuinely measures what it is supposed to measure. Having an effective process to ensure your assessments are valid, reliable and trustable was an important topic at Questionmark Conference 2016 in Miami last week. Here is some advice I heard:

Reporting back from 3 days of learning and networking at Questionmark Conference 2016 in Miami

Tip 1: Everything starts from the purpose of your assessment. Define this clearly and document it well. A purpose that is not well defined or that does not align with the needs of your organization will result in a poor test. It is useful to have a formal process to kick off a new assessment to ensure the purpose is defined clearly and is aligned with business needs.

Tip 2: A Job Task Analysis survey is a great way of defining the topics/objectives for new-hire training assessments. One presenter at the conference sent a survey to the top-performing 50 percent of employees in a job role and asked questions about a series of potential job tasks. For each job task, he asked how difficult it is (complexity), how important it is (priority) and how often it is done (frequency). He then used the survey results to define the structure of knowledge assessments for new hires to ensure they aligned with the needed job skills.
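As a rough sketch of how such ratings might be combined (the tasks and weights below are invented for illustration and are not the presenter’s actual method), you can roll each task’s mean complexity, priority and frequency ratings into a single score and rank tasks to decide where the assessment blueprint should put its emphasis.

```python
# Rough sketch (invented tasks and weights, not the presenter's actual method):
# combine mean JTA survey ratings per task into one score to prioritize topics.

# Mean 1-5 survey ratings per task: (complexity, priority, frequency)
tasks = {
    "Process a customer order": (2.1, 4.6, 4.8),
    "Handle an escalated complaint": (4.2, 4.4, 2.9),
    "File the monthly activity report": (1.8, 2.0, 1.5),
}

WEIGHTS = (0.2, 0.5, 0.3)  # assumed emphasis: priority first, then frequency

def task_score(ratings, weights=WEIGHTS):
    return sum(r * w for r, w in zip(ratings, weights))

# Print tasks from highest to lowest score; the top tasks drive the blueprint.
for name, ratings in sorted(tasks.items(), key=lambda kv: task_score(kv[1]), reverse=True):
    print(f"{task_score(ratings):.2f}  {name}")
```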

Tip 3: The best way to ensure that a workplace assessment starts and remains valid is continual involvement with Subject Matter Experts (SMEs). They help you ensure that the content of the assessment matches the content needed for the job and ensure this stays the case as the job changes. It’s worth investing in training your SMEs in item writing and item review. Foster a collaborative environment and build their confidence.

Tip 4: Allow your participants (test-takers) to feed back into the process. This will give you useful input to improve the questions and the validity of the assessment. It’s also an important part of being transparent and open in your assessment programme, which is useful because people are less likely to cheat if they feel the process is well-intentioned. They are also less likely to complain about the results being unfair. For example, it’s useful to write an internal blog explaining why and how you create the assessments and to encourage feedback.

Lunch with a view at Questionmark Conference 2016 in Miami

Tip 5: As the item bank grows and your assessment programme becomes more successful, make sure to manage the item bank and review items. Retire items that are no longer relevant or that have been overexposed. This keeps the item bank useful, accurate and valid.

There was lots more at the conference – excitement that Questionmark NextGen authoring is finally here, a live demo of our new, easy-to-use Printing and Scanning solution … and lunch on the hotel terrace in the beautiful Miami spring sunshine – with Questionmark-branded sunglasses to keep cool.

There was a lot of buzz at the conference about documenting your assessment decisions and making sure your assessments validly measure job competence. There is increasing understanding that assessment is a process not a project, and also that to be used to measure competence or to select for a job role, an assessment must cover all important job tasks.

I hope these tips on making assessments valid are helpful. Click here for more information on Questionmark’s assessment management system.

Item Development – Summary and Conclusions

Posted by Austin Fossey

This post concludes my series on item development in large-scale assessment. I’ve discussed some key processes in developing items, including drafting items, reviewing items, editing items, and conducting an item analysis. The goal of this process is to fine-tune a set of items so that test developers have an item pool from which they can build forms for scored assessment while being confident about the quality, reliability, and validity of the items. While the series covered a variety of topics, there are a couple of key themes that were relevant to almost every step.

First, documentation is critical, and even though it seems like extra work, it does pay off. Documenting your item development process helps keep things organized and helps you reproduce processes should you need to conduct development again. Documentation is also important for organization and accountability. As noted in the posts about content review and bias review, checklists can help ensure that committee members consider a minimal set of criteria for every item, but they also provide you with documentation of each committee member’s ratings should the item ever be challenged. All of this documentation can be thought of as validity evidence—it helps support your claims about the results and refute rebuttals about possible flaws in the assessment’s content.

The other key theme is the importance of recruiting qualified and representative subject matter experts (SMEs). SMEs should be qualified to participate in their assigned task, but diversity is also an important consideration. You may want to select item writers with a variety of experience levels, or content experts who have different backgrounds. Your bias review committee should be made up of experts who can help identify both content and response bias across the demographic areas that are pertinent to your population. Where possible, it is best to keep your SME groups independent so that you do not have the same people responsible for different parts of the development cycle. As always, be sure to document the relevant demographics and qualifications of your SMEs, even if you need to keep their identities anonymous.

This series is an introduction to organizing an item development cycle, but I encourage readers to refer to the resources mentioned in the articles for more information. This series also served as the basis for a session at the 2015 Questionmark Users Conference, which Questionmark customers can watch in the Premium section of the Learning Café.

You can revisit all of the posts in this series via the links below, and if you have any questions, please comment below!

Item Development – Managing the Process for Large-Scale Assessments

Item Development – Training Item Writers

Item Development – Five Tips for Organizing Your Drafting Process

Item Development – Benefits of editing items before the review process

Item Development – Organizing a content review committee (Part 1)

Item Development – Organizing a content review committee (Part 2)

Item Development – Organizing a bias review committee (Part 1)

Item Development – Organizing a bias review committee (Part 2)

Item Development – Conducting the final editorial review

Item Development – Planning your field test study

Item Development – Psychometric review

New white paper: Assessment Results You Can Trust

Posted by John Kleeman

Questionmark published an important white paper about why trustable assessment results matter and about how an assessment management system like Questionmark’s can help you make your assessments valid and reliable — and therefore trustable.

The white paper, which I wrote together with Questionmark CEO Eric Shepherd, explains that trustable assessment results must be both valid (measuring what you are looking for them to measure) and reliable (consistently measuring what you want to be measured).

The paper draws upon the metaphor of a doctor using results from a blood test to diagnose an illness and then prescribe a remedy. Delays will occur if the doctor orders the wrong test, and serious consequences could result if the test’s results are untrustworthy. Using this metaphor, it is easy to understand the personnel and organizational risks that can stem from making decisions based on untrustworthy results. If you assess someone’s knowledge, skill or competence for health and safety or regulatory compliance purposes, you need to ensure that your assessment instrument is designed correctly and runs consistently.

Engaging subject matter experts to generate questions that measure the knowledge, skills and abilities required to perform the essential tasks of the job is key to creating the initial pool of questions. However, subject matter experts are not necessarily experts in writing good questions, so an effective authoring system requires a quality control process which allows assessment experts (e.g. instructional designers or psychometricians) to easily review and amend assessment items.

For assessments to be valid and reliable, it’s necessary to follow structured processes at each step from planning through authoring to delivery and reporting.

The white paper covers these six stages of the assessment process:

  • Planning assessment
  • Authoring items
  • Assembling assessment
  • Pilot and review
  • Delivery
  • Analyze results

Following the advice in the white paper and using the capabilities it describes will help you produce assessments that are more valid and reliable — and hence more trustable.

Modern organizations need their people to be competent.

Would you be comfortable in a high-rise building designed by an unqualified architect? Would you fly in a plane whose pilot hadn’t passed a flying test? Would you let someone operate a machine in your factory if they didn’t know what to do if something went wrong? Would you send a salesperson out on a call if they didn’t know what your products do? Can you demonstrate to a regulatory authority that your staff are competent and fit for their jobs if you do not have trustable assessments?

In all these cases and many more, it’s essential to have a reliable and valid test of competence. If you do not ensure that your workforce is qualified and competent, then you should not be surprised if your employees have accidents, cause your organization to be fined for regulatory infractions, give poor customer service or can’t repair systems effectively.

To download the white paper, click here.

John will be talking more about trustable assessments at our 2015 Users Conference in Napa next month. Register today for the full conference, but if you cannot make it, make sure to catch the live webcast.

An easier approach to job task analysis: Q&A

Posted by Julie Delazyn

Part of the assessment development process is understanding what needs to be tested. When you are testing what someone needs to know in order for them to do their job well, subject matter experts can help you harvest evidence for your test items by observing people at work. That traditionally manual process can take a lot of time and money.

Questionmark’s new job task analysis (JTA) capabilities enable SMEs to harvest information straight from the person doing the job. These tools also offer an easier way to see the frequency, importance, difficulty and applicability of a task in order to know if it’s something that needs to be included in an assessment.

Now that JTA question authoring, assessment creation and reporting are available to users of Questionmark OnDemand and Questionmark Perception 5.7, I wanted to understand what makes this special and important. Questionmark Product Manager Jim Farrell, who has been working on the JTA question since its conception, was kind enough to speak to me about its value, why it was created, and how it can now benefit our customers.

Here is a snippet of our conversation:

So … first things first … what exactly IS job task analysis and how would our customers benefit from using it?

Job task analysis, JTA, is a survey that you send out containing a list of tasks, each of which is rated along several dimensions. Those dimensions are typically difficulty, importance, frequency, and applicability. You want to find out things like this from someone who fills out the survey: Do they find the task difficult? Do they deem it important? And how frequently do they do it? When you correlate all this data, you’ll quickly see the tasks that are more important to test on and collect information about.

We have a JTA question type in Questionmark Live where you can either build your task list and your dimensions or import your tasks through a simple import process—so if you have a spreadsheet with all of your tasks you can easily import it. You would then add those to a survey and send it out to collect information. We also have two JTA reports that allow you to break down results by a single dimension—just look at the difficulty for all the tasks—or to look at a summary view of all of your tasks and all the dimensions at one time, as a snapshot.

That sounds very interesting and easy to use! I’m interested in how this question type actually came to be.

We initially developed the job task analysis survey for the US Navy. Prior to this, trainers would have to travel with paper and clipboards to submarines, battleships and aircraft carriers and watch sailors and others in the Navy do their jobs. We developed the JTA survey to help them collect this data more easily and a lot more quickly than they could before.

What do you think is most valuable and exciting about JTA?

To me, the value comes in the ease of creating the questions and sending them out. And I am probably most excited for our customers. Most customers probably harvest information by walking around with paper and a clipboard and watching people do their jobs. That’s a very expensive and time-consuming task, so by being able to send this survey out directly to subject matter experts you’re getting more authentic data, because you are getting it right from the SMEs rather than from someone observing the behavior.

 

It was fascinating for me to understand how JTA was created and how it works … Do you find this kind of question type interesting? How do you see yourself using it? Please share your thoughts below!

Psychometrics and Measurement Design: A conversation

Posted by Joan Phaup

Many delegates to the Questionmark 2014 Users Conference in San Antonio, March 4 – 7, want to learn about assessment-related best practices.

Austin Fossey, our Reporting and Analytics Manager, will talk about Principles of Psychometrics and Measurement Design during one of the many breakout sessions on the agenda.

Austin had just joined Questionmark when he attended the 2013 conference. This time around, he’ll be more actively involved in the program, so I wanted to learn more about him and his presentation plans.

What made you decide to study psychometrics?

I was working in customer service at a certification testing company. They always brought in psychometricians to build their assessments. I’d never heard of psychometrics before, but I had studied applied math as an undergraduate and thought the math behind psychometrics was interesting. I liked the idea of doing analytical work and heard that psychometricians are always in demand, so I got started right away studying educational measurement at the University of Maryland.

How do you make principles of psychometrics understandable to, well, mere mortals?

I don’t think psychometricians are different from anybody else. Most of it is applying a probabilistic model to a set of data to make an inference about an unobserved trait. Those models are based on concepts or theories, so you don’t have to explain the math as long as you can explain the theory. People understand that.

I really like evidence-centered design, because it provides principles and a vocabulary that can be used by everyone involved in assessments. Using this framework, psychometricians can communicate about measurement design with subject matter experts, item writers, curriculum specialists, programmers, policy makers — all the stakeholders, from start to finish.
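Austin’s description of applying a probabilistic model to infer an unobserved trait can be illustrated with a minimal sketch (an illustration only, not necessarily a model covered in his session): the Rasch model, where the probability of a correct answer depends on the gap between a person’s ability and an item’s difficulty on the same scale.

```python
# Illustration only: the Rasch model, a simple probabilistic measurement model.
# The chance of a correct response grows as ability exceeds item difficulty.

import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

print(round(rasch_probability(ability=1.0, difficulty=0.0), 2))  # 0.73: able person, average item
print(round(rasch_probability(ability=0.0, difficulty=1.0), 2))  # 0.27: average person, hard item
```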

Who do you think would benefit from attending your presentation about psychometrics and measurement design?

People who feel they are applying the same test development formula day in and day out and who wonder if there might be a better way to do it. Even with certifications, which usually follow excellent standards based on best practices, we should always be critical about our assessments and we should always be aggressive about ensuring validity. It would be great to see people there who want to be mindful of every decision they make in assessment design.

How could people prepare for this session?

I hope they bring examples of their own test development process and validity studies. We can discuss people’s own experiences and the hurdles they have faced with their measurement design. Other than that I would say just bring an open mind.

What would you like your audience to take away from your presentation?

People who may be new to measurement design and psychometric concepts like validity can take away some tools to use in their assessment programs. I hope that if more experienced people come, they can learn from each others’ experiences and go away with new ideas about their own approach to assessment design.

What do you hope to take away from the Users Conference?

I want to gather a lot of feedback from our clients during conversations and focus groups, so that we can recalibrate ourselves for the work we are doing and prioritize our tasks.

The session on psychometrics is just one of several for Austin in 2014. Check out the conference program and register by December 12 to save $200.