Caveon Q&A: Enhanced security of high-stakes tests

Posted by Julie Delazyn

Questionmark and Caveon Test Security, an industry leader in protecting high-stakes test programs, have recently joined forces to provide clients of both organizations with additional resources for their test administration toolboxes.

Questionmark’s comprehensive platform offers many features that help ensure security and validity throughout the assessment process. This emphasis on security, along with Caveon’s services, which include analyzing data to identify validity risks as well as monitoring the internet for any leak that could affect intellectual property, adds a strong layer of protection for customers using Questionmark for high-stakes assessment management and delivery.

I sat down with Steve Addicott, Vice President of Caveon, to ask him a few questions about the new partnership, what Caveon does and what security means to him. Here is an excerpt from our conversation:

Who is Caveon? Tell me about your company.

At Caveon Test Security, we fundamentally believe in quality testing and trustworthy test results. That’s why Caveon offers test security and test item development services dedicated to helping prevent test fraud and better protecting our clients’ items, tests, and reputations.

What does security mean to you, and why is it important?

High-stakes test programs make important education and career decisions about test takers based on test results. We also spend a tremendous amount of time creating, administering, scoring, and reporting results. With increased security pressures from pirates and cheats, we are here to make sure that those results are trustworthy, reflecting the true knowledge and skills of test takers.

Why a partnership with Questionmark and why now?

With a growing number of Questionmark clients engaging in high-stakes testing, Caveon’s experience in protecting the validity of test results is a natural extension of Questionmark’s security features. At Caveon, we welcome the chance to engage with a vendor like Questionmark to help protect exam results.

And how does this synergy help Questionmark customers who deliver high-stakes tests and exams?

As the stakes in testing continue to rise, so do the challenges involved in protecting your program. Both organizations are dedicated to providing clients with the most secure methods for protecting exam administrations, test development investments, exam result validity and, ultimately, their programs’ reputations.

For more information on Questionmark’s dedication to security, check out this video and download the white paper: Delivering Assessments Safely and Securely.

US Justice Department demands accessible educational technology

Posted by John Kleeman

The US Justice Department made an important intervention last week that could tip the balance in making educational technology more accessible for learners with disabilities.

They are intervening on the side of the learner in a court case between a blind learner and Miami University. The case is about learners with disabilities not getting the same access to digital content as other learners. For example, according to the complaint, the university required all learners to use applications with inaccessible Flash content as well as an LMS that was not usable with screen readers.

To quote the US Justice Department’s motion to intervene:

“Miami University’s failure to make its digital- and web-based technologies accessible to individuals with disabilities, or to otherwise take appropriate steps to ensure effective communication with such individuals, places them at a great disadvantage, depriving them of equal access to Miami University’s educational content and services.”

Questionmark has long taken accessibility seriously. When we re-architected our assessment delivery engine for our version 5 release, we made accessibility a priority – see Assessment Accessibility in Questionmark Perception Version 5. Our platform includes several standard templates with “text sizing” and “contrast controls” that administrators can make available to participants – these can be helpful for certain visual impairments.

Here are some other aspects of the delivery platform that we have optimized for accessibility:

  • The administrator can override an established assessment time limit for certain participants
  • Participants can use a pointing device other than a mouse or navigate the assessment using keystrokes such as the “tab” key
  • Screen readers can be used to clearly dictate assessment questions, choices and other content

Please note that preparing assessments for participants with disabilities takes more than an optimized delivery platform: assessment authors and administrators need to plan for accessibility as well. For example, items that rely heavily on graphics or images must use suitable description tags, videos should be appropriately captioned, and so on. Vendors and testing organizations alike must make a constant effort to ensure that material stays accessible as technology changes.

Providing you are following best practice for developing accessible content, the Questionmark delivery platform can complete the loop and help you give all of your participants–including those with disabilities–a reliable and fair test-taking experience.

Accessible software is good for everyone, not just for those who temporarily or permanently need accommodations for their disabilities. Many of the technologies required to make software accessible also enhance delivery on mobile devices and improve blended delivery in general.

With the US Department of Justice now engaging in lawsuits against institutions that do not take accessibility seriously, accessibility support will become more important to everyone.


Does online learning and assessment help sustainability?

Posted by John Kleeman

Encouraged by public interest and increasing statutory controls, most large organizations care about and report on environmental sustainability and greenhouse gas emissions. I’ve been wondering how much online assessments and the wider use of e-learning help sustainability. Does taking assessments and learning online contribute to the planet’s well-being?

Does using computers instead of paper save trees?

It’s easy to see that by taking exams on computer, we save a lot of paper. Trees vary in size, but it seems the average tree might make about 50,000 pages of paper. If a typical paper test uses 10 pages of paper, then an organization that delivers 100,000 tests per year is using 20 trees a year. Or suppose a piece of learning material that is 100 pages long is distributed to 10,000 learners. The 20 trees cut down for that learning would be saved if the learning were delivered online.
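The arithmetic behind these two estimates can be written out as a short sketch. The 50,000 pages-per-tree figure is the rough average quoted above, and the function name is purely illustrative.

```python
PAGES_PER_TREE = 50_000  # rough average quoted in the post; real yields vary by tree


def trees_used(pages_per_copy: int, copies: int,
               pages_per_tree: int = PAGES_PER_TREE) -> float:
    """Approximate number of trees needed to print `copies` documents."""
    return pages_per_copy * copies / pages_per_tree


# 10-page test delivered 100,000 times per year
tests = trees_used(pages_per_copy=10, copies=100_000)

# 100-page learning material distributed to 10,000 learners
materials = trees_used(pages_per_copy=100, copies=10_000)

print(tests, materials)  # 20.0 20.0
```

Both cases come to one million pages, which is why they land on the same 20 trees.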

These are useful benefits, but they need to be set against the environmental costs of the computers and electricity used. The environmental benefit is probably modest.

What about the benefits of reduced business travel?

A much stronger environmental case might be made around reduced travel. Taking a test on paper and/or in a test center likely means travelling. So we’re not surprised to be seeing increased use of online proctoring. For example, SAP are starting to use it for their certification exams. Online proctoring means that a candidate doesn’t have to travel to a test center but can take an exam from their home or office. This saves time and money. It also eliminates the environmental costs of travel. Learning online rather than going to a classroom does the same.

Training and assessment are only a small reason for business travel, but the overall environmental impact of business travel is huge. One large services company has reported that 67 percent of its carbon footprint in 2014 was related to it. Another reports a figure of over 30 percent. Many large companies have internal targets to reduce business travel greenhouse gas emissions.

In the academic world, the Open University in the UK performed a study a few years back on the carbon benefits of their model of distance learning compared with more conventional university education. The study suggested that carbon emissions were 85 percent lower with distance education compared with a more conventional university approach. However, the benefit of electronic delivery rather than paper delivery in distance learning was more modest at 12 percent, partly because students often print the e-learning materials. This suggests that there is a very substantial benefit in distance learning and a smaller benefit in it being electronic rather than paper-based.

The strongest benefit of online assessment is that it gives accurate information about people’s knowledge, skills and abilities to help organizations make good decisions. But it does seem that there may well be a useful environmental benefit too.

7 actionable steps for making your assessments more trustable

Posted by John Kleeman

Questionmark has recently published a white paper on trustable assessment, and we blog about this topic frequently. See Reliability and validity are the keys to trust and The key to reliability and validity is authoring for some recent blog posts about the white paper.

But what can you do today if you want to make your assessments more trustable? Obviously you can read the white paper! But here are seven actionable steps that, if you’re not doing them already, you could take today, or at least reasonably quickly, to improve your assessments.

1. Organize questions in an item bank with topic structure

If you are already using Questionmark software, you are likely doing this already. But putting questions in an item bank structured by hierarchical topics facilitates an easy management view of all questions and assessments under development. It allows you to use the same question in multiple assessments, easily add questions and retire them, and easily search questions, for example to find the ones that need updating when laws change or a product is retired.

2. Use questions that apply knowledge in the job context

It is better to ask questions that check how people can apply knowledge in the job context than just to find out whether they have specific knowledge. See my earlier post Test above knowledge: Use scenario questions for some tips on this. If you currently just test on knowledge and not on how to apply that knowledge, make today the day that you start to change!

3. Have your subject matter experts directly involved in authoring

Especially in an area where there is rapid change, you need subject matter experts directly involved in authoring and reviewing questions. Whether you use Questionmark Live or another system, start involving them.

4. Set a pass score fairly

Setting a pass score fairly is critical to being able to trust an assessment’s results. See Is a compliance test better with a higher pass score? and Standard Setting: A Keystone to Legal Defensibility for some starting points on setting good pass scores. And if you don’t think you’re following good practice, start to change.

5. Use topic scoring and feedback

As Austin Fossey explained in his ground-breaking post Is There Value in Reporting Subscores?, you do need to check whether it is sensible to report topic scores. But in most cases, topic scores and topic feedback can be very useful and actionable – they direct people to where there are problems or where improvement is needed.

6. Define a participant code of conduct

If people cheat, it makes assessment results much less trustable. As I explained in my post What is the best way to reduce cheating?, setting up a participant code of conduct (or honesty code) is an easy and effective way of reducing cheating. What can you do today to encourage your test takers to believe your program is fair and be on your side in reducing cheating?

7. Run item analysis and weed out poor items

This is something that all Questionmark users could do today. Run an item analysis report – it takes just a minute or two from our interfaces – and look at the questions that are flagged as needing review (usually amber or red). Review them to check appropriateness and potentially retire them from your pool or else improve them.

Questionmark item analysis report
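
For readers who want to see what sits behind this kind of report, here is a minimal sketch of two classical item statistics: difficulty (the proportion of participants answering correctly) and discrimination (the correlation between an item and the rest of the test). The function names, the sample data and the flagging thresholds are all assumptions made for illustration; they are not Questionmark’s actual report logic or flag rules.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def item_analysis(responses, min_p=0.2, max_p=0.95, min_disc=0.2):
    """responses: one list of 0/1 item scores per participant.

    The thresholds are hypothetical defaults chosen for this example.
    """
    report = []
    for i in range(len(responses[0])):
        item = [r[i] for r in responses]
        rest = [sum(r) - r[i] for r in responses]  # score on the other items
        difficulty = mean(item)
        discrimination = pearson(item, rest)
        review = (difficulty < min_p or difficulty > max_p
                  or discrimination < min_disc)
        report.append({"item": i, "difficulty": difficulty,
                       "discrimination": discrimination, "review": review})
    return report


# Made-up scores for four participants on four items
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
report = item_analysis(responses)
print([row["item"] for row in report if row["review"]])  # [0, 2, 3]
```

In this toy data, item 0 is flagged because everyone answers it correctly, item 2 because it does not discriminate at all, and item 3 because it correlates negatively with the rest of the test. A real analysis needs far more participants than this.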


Many of you will probably be doing all the above and more, but I hope that for some of you this post could be a spur to action to make your assessments more trustable. Why not start today?

Is There Value in Reporting Subscores?

Posted by Austin Fossey

The decision to report subscores (reported as Topic Scores in Questionmark’s software) can be a difficult one, and test developers often need to respond to demands from stakeholders who want to bleed as much information out of an instrument as they can. High-stakes test development is lengthy and costly, and the instruments themselves consume and collect a lot of data that can be valuable for instruction or business decisions. It makes sense that stakeholders want to get as much mileage as they can out of the instrument.

It can be anticlimactic when all of the development work results in just one score or a simple pass/fail decision. But that is after all what many instruments are designed to do. Many assessment models assume unidimensionality, so a single score or classification representing the participant’s ability is absolutely appropriate. Nevertheless, organizations often find themselves in the position of trying to wring out more information. What are my participants’ strengths and weaknesses? How effective were my instructors? There are many ways in which people will try to repurpose an assessment.

The question of whether or not to report subscores certainly falls under this category. Test blueprints often organize the instrument around content areas (e.g., Topics), and these lend themselves well to calculating subscores for each of the content areas. From a test user perspective, these scores are easy to interpret, and they are considered valuable because they show content areas where participants perform well or poorly, and because it is believed that this information can help inform instruction.

But how useful are these subscores? In their article, A Simple Equation to Predict a Subscore’s Value, Richard Feinberg and Howard Wainer explain that there are two criteria that must be met to justify reporting a subscore:

  • The subscore must be reliable.
  • The subscore must contain information that is sufficiently different from the information that is contained by the assessment’s total score.

If a subscore (or any score) is not reliable, there is no value in reporting it. The subscore will lack precision, and any decisions made on an unreliable score might not be valid. There is also little value if the subscore does not provide any new information. If the subscores are effectively redundant to the total score, then there is no need to report them. The flip side of the problem is that if subscores do not correlate with the total score, then the assessment may not be unidimensional, and then it may not make sense to report the total score. These are the problems that test developers wrestle with when they lie awake at night.

Excerpt from Questionmark’s Test Analysis Report showing low reliability of three topic scores.

As you might have guessed from the title of their article, Feinberg and Wainer have proposed a simple, empirically-based equation for determining whether or not a subscore should be reported. The equation yields a value that Sandip Sinharay and Shelby Haberman called the Value Added Ratio (VAR). If a subscore on an assessment has a VAR value greater than one, then they suggest that this justifies reporting it. Subscores with VAR values less than one should not be reported. I encourage interested readers to check out Feinberg and Wainer’s article (which is less than two pages, so you can handle it) for the formula and step-by-step instructions for its application.
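
The decision rule itself is simple enough to sketch in a few lines. The example below assumes the VAR values have already been computed using the formula in the article; it does not reproduce that formula, and the topic names and numbers are invented for illustration.

```python
def reportable_subscores(var_by_topic):
    """Return the topics whose Value Added Ratio exceeds one."""
    return [topic for topic, var in var_by_topic.items() if var > 1.0]


# Hypothetical VAR values, not taken from any real assessment
var_by_topic = {
    "Safety procedures": 1.12,
    "Regulations": 0.84,
    "Equipment": 0.97,
}

print(reportable_subscores(var_by_topic))  # ['Safety procedures']
```

Under this rule only the first topic score would be reported; the other two add too little beyond the total score.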


Intro to high-stakes assessment

Posted by Lance

Hello, and welcome to my first blog post for Questionmark. I joined Questionmark in May of 2014 but have just recently become Product Owner for Authoring. That means I oversee the tools we build to help people like you write their questions and assessments.

My professional background in assessment is mostly in the realm of high-stakes testing. That means I’ve worked with organizations that license, certify, or otherwise credential individuals. These include medical/nursing boards, driving standards organizations, software/hardware companies, financial services, and all sorts of sectors where determining competency is important.

With that in mind, I thought I’d kick off my blogging career at Questionmark with a series of posts on the topic of high-stakes assessment.

Now that I’ve riveted your attention with that awesome and not-at-all tedious opening, you’re naturally chomping at the bit to learn more, right?


Read on!

High-stakes assessment defined

I think of a high-stakes assessment as having the following traits:

It strongly influences or determines an individual’s ability to practice a profession

Organizations that administer high-stakes assessments operate along a continuum of influence. For example, certifications from SAP or other IT organizations are typically viewed as desirable by employers and may be used as a differentiator when hiring or setting compensation, but are not necessarily required for employment. At the other end of the continuum we have organizations that actually determine a person’s ability to practice a profession. An example is that you must be licensed by a state bar association to practice law in a US state. In between these extremes lie many shades of influence. The key concept here is that the influence is real…from affecting hiring/promotion decisions to flat-out determining if a person can be hired or continue to work in their chosen profession.

It awards credentials that belong to the individual

This is all about scope and ownership. These credentialing organizations almost always award a license/certification to the individual. If you get certified by SAP, that certification is yours even if your employer paid for it.

Speaking of scope, that certificate represents skills that are employer-neutral, and in the case of most IT certifications, the skills are generally unbounded by region as well. A certification acquired in the United States means the same thing in Canada, in Russia, in Mongolia, in Indonesia, in… you get the point.


Okay, so these organizations influence who can work in professions and who can’t. Big whoop, right? Right! It really is a big whoop.

As Stan Lee has told us repeatedly, “Excelsior!”

Hmmm. That’s not the quote I wanted.

I meant, as Stan Lee has told us repeatedly, “With great power comes great responsibility!”*

And these orgs do have great power. They also, in many cases, have powerful members. For example, medical boards in the United States certify elite medical professionals. In all cases, these orgs are simultaneously making determinations about the public good and people’s livelihoods. As a result, they tend to take the process very seriously.

Ok… But what does it all mean?

Glad you asked. Stay tuned for my next post to find out.

Till then, Excelsior!

* So, it turns out that some guy named “Voltaire” said this first. But really, who’s had a bigger impact on the world? Voltaire – if that’s even his real name – or Stan Lee? :)
