New white paper: Assessment Results You Can Trust

Posted by John Kleeman

Questionmark published an important white paper about why trustable assessment results matter and about how an assessment management system like Questionmark’s can help you make your assessments valid and reliable — and therefore trustable.

The white paper, which I wrote together with Questionmark CEO Eric Shepherd, explains that trustable assessment results must be both valid (measuring what they are intended to measure) and reliable (measuring it consistently).

The paper draws upon the metaphor of a doctor using results from a blood test to diagnose an illness and then prescribe a remedy. Delays will occur if the doctor orders the wrong test, and serious consequences could result if the test’s results are untrustworthy. Using this metaphor, it is easy to understand the personnel and organizational risks that can stem from making decisions based on untrustworthy results. If you assess someone’s knowledge, skill or competence for health and safety or regulatory compliance purposes, you need to ensure that your assessment instrument is designed correctly and runs consistently.

Engaging subject matter experts to write questions that measure the knowledge, skills and abilities required to perform essential tasks of the job is the starting point for creating the initial pool of questions. However, subject matter experts are not necessarily experts in writing good questions, so an effective authoring system requires a quality control process that allows assessment experts (e.g. instructional designers or psychometricians) to easily review and amend assessment items.

For assessments to be valid and reliable, it’s necessary to follow structured processes at each step from planning through authoring to delivery and reporting.

The white paper covers these six stages of the assessment process:

  • Planning assessment
  • Authoring items
  • Assembling assessment
  • Pilot and review
  • Delivery
  • Analyze results

Following the advice in the white paper and using the capabilities it describes will help you produce assessments that are more valid and reliable — and hence more trustable.
Modern organizations need their people to be competent.

Would you be comfortable in a high-rise building designed by an unqualified architect? Would you fly in a plane whose pilot hadn’t passed a flying test? Would you let someone operate a machine in your factory if they didn’t know what to do if something went wrong? Would you send a salesperson out on a call if they didn’t know what your products do? Can you demonstrate to a regulatory authority that your staff are competent and fit for their jobs if you do not have trustable assessments?

In all these cases and many more, it’s essential to have a reliable and valid test of competence. If you do not ensure that your workforce is qualified and competent, then you should not be surprised if your employees have accidents, cause your organization to be fined for regulatory infractions, give poor customer service or can’t repair systems effectively.

To download the white paper, click here.

John will be talking more about trustable assessments at our 2015 Users Conference in Napa next month. Register today for the full conference, but if you cannot make it, make sure to catch the live webcast.

What is the most important thing about a compliance assessment?

Posted by John Kleeman

What is the most important thing about a compliance assessment? Almost certainly that it is reliable and valid.

An assessment is reliable when it works consistently: if we assess the same person under the same conditions, we should get the same result each time. An assessment that isn’t reliable isn’t of much use.

An assessment is valid if it measures what it is supposed to measure. In compliance, that usually means measuring the specific knowledge and skills needed to confirm competence. More broadly, if a survey is administered to happy people, the results should show that they are all happy; similarly, if a group of knowledgeable people are tested, the results should show that they are all knowledgeable. Good assessments are both reliable and valid.
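To make “the same result repeatedly” concrete: one common way to estimate reliability is test-retest correlation. Here is a minimal sketch (my own illustration, not from Questionmark; the scores are invented) that computes it with Python’s standard library:

```python
from statistics import correlation  # requires Python 3.10+

# Hypothetical scores for five participants who took the same
# assessment twice, a week apart.
first_attempt = [72, 85, 90, 64, 78]
second_attempt = [70, 88, 91, 66, 75]

# Test-retest reliability: how strongly the two administrations agree.
# A coefficient near 1.0 suggests the assessment measures consistently.
r = correlation(first_attempt, second_attempt)
print(f"Test-retest reliability estimate: r = {r:.2f}")
```

Note that this only gauges reliability; a test can repeat the same wrong measurement perfectly, which is why validity has to be checked separately.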

A good way of thinking about whether assessments are reliable and valid is to think about throwing darts at a dart board.

Reliable but not valid

In the diagram below, all the darts are stuck in the same area, illustrating that the thrower—the analogue of an assessment—is reliable and consistent, but unfortunately his throws are not valid. If his throws were valid, all the darts would be in the center, the bulls-eye.

[Dartboard diagram: reliable but not valid]

Not reliable

In the diagram below, the darts have landed all over the board. This assessment is not reliable because it’s not consistent.  An assessment can be reliable without being valid, but it’s not possible for it to be valid if it’s not reliable.

[Dartboard diagram: not reliable]

Valid and reliable

Finally, the last example is of an assessment that is both reliable and valid: all of the darts are clustered together and on target.

[Dartboard diagram: valid and reliable]
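If you prefer numbers to dartboards, the metaphor can be simulated. In this illustrative sketch (my own, with invented parameters), validity shows up as low bias (the average landing point is near the bulls-eye) and reliability as low spread (the darts cluster tightly):

```python
import math
import random

def throw_darts(n, offset, scatter):
    """Simulate n throws aimed at `offset` from the bulls-eye (0, 0),
    each with random scatter of the given size."""
    return [(random.gauss(offset[0], scatter), random.gauss(offset[1], scatter))
            for _ in range(n)]

def describe(name, darts):
    # Bias: distance of the average landing point from the bulls-eye
    # (low bias = valid). Spread: average distance of each dart from
    # that centre point (low spread = reliable).
    cx = sum(x for x, _ in darts) / len(darts)
    cy = sum(y for _, y in darts) / len(darts)
    bias = math.hypot(cx, cy)
    spread = sum(math.hypot(x - cx, y - cy) for x, y in darts) / len(darts)
    print(f"{name:<22} bias={bias:5.2f}  spread={spread:5.2f}")

random.seed(1)
describe("Reliable, not valid", throw_darts(50, offset=(5, 5), scatter=0.5))
describe("Not reliable", throw_darts(50, offset=(0, 0), scatter=5.0))
describe("Valid and reliable", throw_darts(50, offset=(0, 0), scatter=0.5))
```

The “reliable but not valid” thrower shows high bias with low spread; the “not reliable” thrower shows high spread, and as the dartboards illustrate, with a large spread any individual dart landing near the centre is largely a matter of luck.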

I hope this explanation of reliability and validity is useful as an introduction or reminder.

You can click here if you would like information about using Questionmark assessments for regulatory compliance, including links to case studies and white papers. And if you are attending the Questionmark European Users Conference in Barcelona next week, you are welcome to attend the presentation I am giving there about 7 good reasons to use online assessments for regulatory compliance.

What’s the key to reliable assessment management?

Posted by John Kleeman

If you are choosing a platform for online assessments, what is your most important criterion? Probably that the system is reliable and trustable. An online assessment is only useful if participants, instructors and other stakeholders trust the results.

But if you are building an assessment system, how do you make sure it is reliable? How do you ensure that scores are added up correctly, results don’t get lost, reports are accurate, and the system doesn’t break when a new browser comes out?

One of the things we’ve learned at Questionmark is that using a three-tier architecture helps hugely with QA and reliability. In much of the Questionmark software, there are three tiers, which we regard as key ingredients for trustable assessment management.

  • The Presentation tier deals with the user interface; for example, it formats assessments for display.
  • The Business tier contains the business logic; for example, the scoring of an assessment.
  • The Data tier records results and other data in a robust database.

(Questionmark also has a Services Layer between the Presentation and Business tiers, which allows the business logic to be called from other applications and for testing purposes.)

[Diagram: Questionmark’s three-tier architecture, showing the Presentation, Business and Data tiers plus the Services layer]
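To make the separation concrete, here is a deliberately simplified sketch (in Python; these function names are invented for illustration and are not Questionmark’s actual code). The point is that the scoring logic knows nothing about HTML or browsers, and the presentation tier only formats what the business tier returns:

```python
# Data tier: records results in a robust store. A dict stands in
# for the real database in this sketch.
_results_db = {}

def save_result(participant_id, score):
    _results_db[participant_id] = score

# Business tier: the scoring logic. No HTML and no HTTP, so it can
# be called from a UI, from another application via a services
# layer, or directly from an automated test.
def score_assessment(answers, answer_key):
    correct = sum(1 for q, a in answers.items() if answer_key.get(q) == a)
    return round(100 * correct / len(answer_key))

# Presentation tier: formats results for display and delegates all
# scoring decisions to the business tier.
def render_result(participant_id, answers, answer_key):
    score = score_assessment(answers, answer_key)
    save_result(participant_id, score)
    return f"<p>Participant {participant_id} scored {score}%</p>"
```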

So why does this matter? Why should you care if your assessment application has three tiers? Here are four reasons:

1. It gives you a way to test the logic in the Business tier independently. You can set up a range of automated tests with a variety of inputs and expected outputs at the business tier and exercise the logic thoroughly, independently of the user interface. If you don’t automate testing, mistakes will creep in; if you automate at the UI level, you have to change the test scripts whenever browsers change. (A sketch of such a business-tier test appears after this list.)

2. You need to be able to update the Presentation tier frequently to take account of changes in browsers, new mobile devices and so on, without having to also change and risk errors in the Business tier.

3. Three tiers make an important difference to security. The Presentation tier protects the other tiers from inappropriate calls, and you can put firewalls between each tier to protect the data and the integrity of the system.

4. A tiered system is much easier to load-balance and scale. You can assign the right number of servers to each tier, and when you get a bottleneck, increase the number of servers in that tier.
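Reason 1 is easy to demonstrate with the sketch above. Because the business-tier scoring function takes plain data in and returns plain data out, it can be exercised by automated tests that never open a browser (again a hypothetical example, using Python’s built-in unittest; the `scoring` module name is invented):

```python
import unittest

from scoring import score_assessment  # the business-tier function sketched above

class ScoringTests(unittest.TestCase):
    """Automated tests at the business tier: no browser and no UI,
    so they keep passing when the presentation tier changes."""

    answer_key = {"q1": "b", "q2": "d", "q3": "a"}

    def test_full_marks(self):
        self.assertEqual(score_assessment(self.answer_key, self.answer_key), 100)

    def test_partial_credit(self):
        answers = {"q1": "b", "q2": "a", "q3": "a"}
        self.assertEqual(score_assessment(answers, self.answer_key), 67)

    def test_no_answers(self):
        self.assertEqual(score_assessment({}, self.answer_key), 0)

if __name__ == "__main__":
    unittest.main()
```

A new browser release might force changes in the Presentation tier, but none of these tests would need to change.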

So if you’re looking for a platform to run online assessments on, it’s worth asking your supplier:

  • Do you have a three-tier architecture?
  • Do you do automated service testing under the user interface layer?

If the answers to these questions are “no”, you might want to ask how they can be sure their system will stay reliable as they update it.