High-stakes assessment: It’s not just about test takers

Posted by Lance

In my last post I spent some time defining how I think about the idea of high-stakes assessment. I also talked about how these assessments affect the people who take them, including how much the results can matter to their ability to get or do a job.

Now I want to talk a little bit about how these assessments affect the rest of us.

The rest of us

Guess what? The rest of us are affected by the outcomes of these assessments. Did you see that coming?

But seriously, the credentials or scores that result from these assessments affect large swathes of the public. Ultimately, that’s the point of high-stakes assessment. The resulting certifications and licenses exist to protect the public: these assessments act as barriers that prevent incompetent people from practicing professions where competency really matters.

It really matters

What are some examples of “really matters”? Well, when hiring, it really matters to employers that the network techs they hire know how to configure a network securely, not just that the techs say they do. It matters to the people crossing a bridge that the engineers who designed it knew their physics. It really matters to every one of us that our doctor, dentist, nurse, or surgeon knows what they are doing when they treat us. And it really matters to society at large that we measure well the children and adults who take large-scale assessments like college entrance exams.

At the end of the day, high-stakes exams are high-stakes because in a very real way, almost all of us have a stake in their outcome.

Separating the wheat from the chaff

There are a couple of ways that high-stakes assessments do what they do. Some assessments are simply designed to measure “minimal competence,” with test takers either ending above the line—often known as “passing”—or below the line. The dreaded “fail.”

Other assessments are designed to place test takers on a continuum of ability. This type of assessment assigns scores to test takers, and the score ranges often look odd to laypeople. For example, the SAT uses a 200–800 scale.
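
To make the contrast concrete, here is a toy Python sketch of the two reporting models. It is purely illustrative: the cut score, the raw-to-scale mapping and the function names are invented, and real programs such as the SAT set scores through statistical equating rather than a simple linear map.

    # Toy illustration only: two ways of reporting the same raw score.
    def pass_fail(raw_score: int, cut_score: int) -> str:
        """Minimal-competence model: at or above the line passes, below it fails."""
        return "pass" if raw_score >= cut_score else "fail"

    def scaled_score(raw_score: int, max_raw: int, low: int = 200, high: int = 800) -> int:
        """Continuum model: map a raw score onto a reporting scale (simplified linear map)."""
        return round(low + (raw_score / max_raw) * (high - low))

    print(pass_fail(38, cut_score=42))    # -> fail
    print(scaled_score(38, max_raw=60))   # -> 580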

Want to learn more? Hang on till next time!

Online or test center proctoring: Which is more secure?

Posted by John Kleeman

As explained in my previous post Online or test center proctoring: Which is best?, a new way of proctoring certification exams is rapidly gaining traction. With online proctoring, candidates take exams at their office or home, with a proctor observing via video camera over the Internet.

The huge advantage of online proctoring is that the candidate doesn’t need to travel to a test center. This is fairer and saves a lot of time and cost. But how secure is online proctoring? You might at first sight think that test center proctoring is more secure – as it sounds easier to spot cheating in a controlled environment and face-to-face than online. But it’s not as simple as that.

The stakes for a candidate to pass an exam are often high, and there are many examples of proctors at test centers coaching candidates or otherwise breaching the integrity of the exam process. A proctor in a test center can witness the same test being taken over and over again, and can start to memorize, and potentially sell, the content that they see. For example, according to a 2011 article in The Economist, one major test center company at that time was shutting down five test centers a week due to security concerns.

Test center vulnerabilities are not always obvious, but they are myriad. A recent photo showed parents in India climbing the walls of a building to help their children pass exams, with proctors bribed to help. According to Standard Digital:

“Supervisors stationed at notorious test centres vie for the postings, enticed by the prospect of bribes from parents eager to have their wards scrape through.”

Proxy test taking – where one person takes a test impersonating another – is also a big concern in the industry. A 2014 Computerworld article quotes an expert as saying:

“In some cases, proxies have been able to skirt security protocols by visiting corrupt testing facilities overseas that operate both a legitimate ‘front room’ test area and a fraudulent ‘back room’ operation.”

This doesn’t just happen in a few parts of the world: there are examples worldwide. For instance, there was a prominent case in the UK in 2014 where proctors were dishonest in a test used to check English knowledge for candidates seeking visas. According to a BBC report, in some tests the proctor read out the correct answers to candidates. And in another test, a candidate came to the test center and had their picture taken, but then a false sitter went on to take the test. An undercover investigator posing as a candidate was told:

“Someone else will sit the exam for you. But you will have to have your photo taken there to prove you were present.”

This wasn’t a small-scale affair – the UK government announced that at least 29,000 exam results were invalid due to this fraud.

Corrupt test centers have also been found in the US. In May 2015, a New York man was sentenced to jail for his role in a fraud in which five New York test centers allowed applicants for a commercial driver’s license to pay their way to a passing result. According to a newspaper report:

“The guards are accused of taking bribes to arrange for customers to leave the testing room with their exams, which they gave to a surrogate test-taker outside who looked up the answers on a laptop computer. The guards would allow the test-takers to enter and leave the testing rooms.”

There are many other examples of this kind of cheating at test centers – a good source of information is Caveon’s blog about cheating in the news. Caveon and Questionmark recently announced a partnership to enhance the security of high-stakes testing programs. The partnership will also provide Questionmark’s customers with easy access to consulting services to help them strengthen the security of their exams.

Of course, most test center proctors are honest and most test center exams are fair, but there are enough problems to raise concerns. Online proctoring has some security disadvantages, too:

  • Although improvements are being developed, it is harder for the proctor to check whether an ID is genuine when looking at it through a camera.
  • A remote camera in the candidate’s own environment is less capable of spotting some forms of cheating than a controlled environment in a test center.

But there are also genuine security advantages:

  • It is much harder for an online proctor to get to know a candidate well enough to coach him or her, or to accept a payment to help in other ways.
  • Because proctors can be assigned randomly and without any geographic connection, it’s much less likely that a proctor and candidate can pre-arrange any bad behavior.
  • All communication between proctor and candidate is electronic and can be logged, so the candidate cannot easily make an inappropriate approach during the exam.
  • While test center proctors have easy access to exam content which can lead to various types of security breaches, online proctors can be restricted from viewing the exam content through the use of such technologies as secure browsers.
  • Because there is less difficulty and cost involved in online proctoring than in traveling to a physical test center, it’s practical to test more frequently – and this is a security benefit. If testing is frequent, it may be simpler for a candidate to learn the material and pass the test honestly than to put a lot of effort into cheating. If you have several exams, you can also compare the candidate’s photos from each exam to reduce the chance of impersonation.

In summary, the main reason for online proctoring is that it saves time and money compared with going to a bricks-and-mortar test center. The security advantages and disadvantages of test center versus online proctoring are open to debate, and dealing with security vulnerabilities requires constant vigilance either way. With new online proctoring technologies enhancing exam security, many certification programs are now transitioning away from test centers. A test center has traditionally been regarded as a secure place to administer exams, but in practice there have been enough incidents of proctor dishonesty over the years that online proctoring can likely be justified on security grounds as well.

Caveon Q&A: Enhanced security of high-stakes tests

Posted by Julie Delazyn

Questionmark and Caveon Test Security, an industry leader in protecting high-stakes test programs, have recently joined forces to provide clients of both organizations with additional resources for their test administration toolboxes.

Questionmark’s comprehensive platform offers many features that help ensure security and validity throughout the assessment process. This emphasis on security, along with Caveon’s services, which include analyzing data to identify validity risks as well as monitoring the internet for any leak that could affect intellectual property, adds a strong layer of protection for customers using Questionmark for high-stakes assessment management and delivery.

I sat down with Steve Addicott, Vice President of Caveon, to ask him a few questions about the new partnership, what Caveon does and what security means to him. Here is an excerpt from our conversation:

Who is Caveon? Tell me about your company.

At Caveon Test Security, we fundamentally believe in quality testing and trustworthy test results. That’s why Caveon offers test security and test item development services dedicated to helping prevent test fraud and better protect our clients’ items, tests, and reputations.

What does security mean to you, and why is it important?

High-stakes test programs make important education and career decisions about test takers based on test results. We also spend a tremendous amount of time creating, administering, scoring, and reporting results. With increased security pressures from pirates and cheats, we are here to make sure that those results are trustworthy, reflecting the true knowledge and skills of test takers.

Why a partnership with Questionmark and why now?

With a growing number of Questionmark clients engaging in high-stakes testing, Caveon’s experience in protecting the validity of test results is a natural extension of Questionmark’s security features. For Caveon, we welcome the chance to engage with a vendor like Questionmark to help protect exam results.

And how does this synergy help Questionmark customers who deliver high-stakes tests and exams?

As the stakes in testing continue to rise, so do the challenges involved in protecting your program. Both organizations are dedicated to providing clients with the most secure methods for protecting exam administrations, test development investments, exam result validity and, ultimately, their programs’ reputations.

For more information on Questionmark’s dedication to security, check out this video and download the white paper: Delivering Assessments Safely and Securely.

Intro to high-stakes assessment

Posted by Lance

Hello, and welcome to my first blog post for Questionmark. I joined Questionmark in May of 2014 but have just recently become Product Owner for Authoring. That means I oversee the tools we build to help people like you write their questions and assessments.

My professional background in assessment is mostly in the realm of high-stakes testing. That means I’ve worked with organizations that license, certify, or otherwise credential individuals. These include medical/nursing boards, driving standards organizations, software/hardware companies, financial services, and all sorts of sectors where determining competency is important.

With that in mind, I thought I’d kick off my blogging career at Questionmark with a series of posts on the topic of high-stakes assessment.

Now that I’ve riveted your attention with that awesome and not-at-all tedious opening, you’re naturally chomping at the bit to learn more, right?

Right?

Read on!

High-stakes assessment defined

I think of a high-stakes assessment as having the following traits:

It strongly influences or determines an individual’s ability to practice a profession

Organizations that administer high-stakes assessments operate along a continuum of influence. For example, certifications from SAP or other IT organizations are typically viewed as desirable by employers and may be used as a differentiator when hiring or setting compensation, but they are not necessarily required for employment. At the other end of the continuum are organizations that actually determine a person’s ability to practice a profession: for example, you must be licensed by a state bar association to practice law in a US state. In between these extremes lie many shades of influence. The key concept is that the influence is real, ranging from affecting hiring and promotion decisions to flat-out determining whether a person can be hired or continue to work in their chosen profession.

It awards credentials that belong to the individual

This is all about scope and ownership. These credentialing organizations almost always award a license/certification to the individual. If you get certified by SAP, that certification is yours even if your employer paid for it.

Speaking of scope, that certificate represents skills that are employer-neutral, and in the case of most IT certifications, the skills are generally unbounded by region as well. A certification acquired in the United States means the same thing in Canada, in Russia, in Mongolia, in Indonesia…you get the point.

Stan Lee

Okay, so these organizations influence who can work in professions and who can’t. Big whoop, right? Right! It really is a big whoop.

As Stan Lee has told us repeatedly, “Excelsior!”

Hmmm. That’s not the quote I wanted.

I meant, as Stan Lee has told us repeatedly, “With great power comes great responsibility!”*

And these orgs do have great power. They also, in many cases, have powerful members. For example, medical boards in the United States certify elite medical professionals. In all cases, these orgs are simultaneously making determinations about the public good and people’s livelihoods. As a result, they tend to take the process very seriously.

Ok… But what does it all mean?

Glad you asked. Stay tuned for my next post to find out.

Till then, Excelsior!

* So, it turns out that some guy named “Voltaire” said this first. But really, who’s had a bigger impact on the world? Voltaire – if that’s even his real name – or Stan Lee? 🙂

Q&A: High-stakes online tests for nurses

Posted by Julie Delazyn

I spoke recently with Leanne Furby, Director of Testing Services at the National League for Nursing (NLN), about her case study presentation at the Questionmark 2015 Users Conference in Napa Valley March 10-13.

Leanne’s presentation, Transitioning 70 Years of High-Stakes Testing to Questionmark, explains NLN’s switch from a proprietary computer- and paper-based test delivery engine to Questionmark OnDemand for securely delivering standardized exams worldwide. I’m happy to share a snippet of our conversation:

Tell me about the NLN

The NLN is a national organization for faculty nurses and leaders in nurse education. We offer faculty development, networking opportunities, testing services, nursing research grants and public policy initiatives to more than 26,000 members.

Why did you switch to Questionmark?

Our main concern was delivering our tests and exams to a variety of different devices. We wanted our students to be able to take a test on a tablet or take a quiz on their own mobile devices, and this wasn’t something we could do with our proprietary test delivery engine.

Our second major reason for going with Questionmark was the Customized Assessment Reports and the analytics tools. Before making the switch, we had to create reports and analyze results manually, which took time and resources. Now this is all integrated in Questionmark.

How do you use Questionmark assessments?

We have 90 different exam lines and deliver approximately 75,000 to 100,000 secure exams a year, both nationally and internationally, in multiple languages. The NLN partnered with Questionmark in 2014 to transition the delivery of these exams through a custom-built portal. Questionmark is now NLN’s turnkey solution—from item banking and test development with SMEs all over the world to inventory control, test delivery and analytics.

This transition has had positive outcomes for both our organization and our customers. We have developed a new project management policy, procedures for system transition and documentation for training at all levels. This has transformed the way we develop, deliver and analyze exams and the way we collect data for business and education purposes.

What are you looking forward to at the conference?

I am most looking forward to the opportunity to speak with other users and product developers to learn tips, tricks and little secrets surrounding the product. It’s so important to speak to people who have experience and can share ways of using the software that you hadn’t thought of.

Thank you, Leanne, for taking time out of your busy schedule to discuss your session with us!

***

You have the opportunity to save $100 on your own conference registration: Just sign up by January 29 to receive this special early-bird discount.

How can a randomized test be fair to all?

Posted by Joan Phaup

James Parry, who is test development manager at the U.S. Coast Guard Training Center in Yorktown, Virginia, will answer this question during a case study presentation at the Questionmark Users Conference in San Antonio March 4–7. He’ll be co-presenting with LT Carlos Schwarzbauer, IT Lead at the USCG Force Readiness Command’s Advanced Distributed Learning Branch.

James and I spoke the other day about why tests created from randomly drawn items can be useful in some cases—but also about their potential pitfalls and some techniques for avoiding them.

When are randomly designed tests an appropriate choice?

There are several reasons to use randomized tests. Randomization is appropriate when you think there’s a possibility of participants sharing the contents of their test with others who have not yet taken it. Another reason would be a computer-lab-style testing environment, where you are testing many people on the same subject at the same time with no dividers between the computers. Even if participants look at the screens next to them, chances are they won’t see the same items.

How are you using randomly designed tests?

We use randomly generated tests at all three levels of testing: low-, medium- and high-stakes. The low- and medium-stakes tests are used primarily at the schoolhouse level for knowledge- and performance-based quizzes and tests. We are also generating randomized tests for on-site testing using tablet computers or locally installed workstations.

Our most critical use is for our high-stakes enlisted advancement tests, which are administered both on paper and by computer. Participants are permitted to retake this test every 21 days if they do not achieve a passing score. Before we were able to randomize the test, there were only three parallel paper versions. Candidates knew this, so some would “test sample” without studying to get an idea of every possible question: they would take the first version, then the second, and so forth until they passed. With randomization, the word has gotten out that this is no longer possible.

What are the pitfalls of drawing items randomly from an item bank?

The biggest pitfall is the potential for producing tests that have different levels of difficulty or that don’t present a balanced mix of questions across all the subjects you want to cover. A completely random test can be unfair. Suppose you produce a 50-item randomized test from an entire item bank of 500 items. Participant “A” might get an easy test, “B” might get a difficult test, and “C” might get a test with 40 items on one topic and only 10 spread across everything else.
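
As a rough illustration (an editorial sketch, not part of James’s actual process), the following minimal Python simulation draws 1,000 purely random 50-item forms from a hypothetical 500-item bank with known item difficulties and reports how far apart the easiest and hardest forms end up.

    # Illustrative only: how much can average difficulty vary across purely random forms?
    import random
    import statistics

    random.seed(42)

    # Hypothetical bank: 500 items, each with an expected proportion-correct
    # value between 0.2 (hard) and 0.9 (easy).
    bank = [round(random.uniform(0.2, 0.9), 2) for _ in range(500)]

    form_means = []
    for _ in range(1000):                  # build 1,000 purely random forms
        form = random.sample(bank, 50)     # 50 items, no stratification
        form_means.append(statistics.mean(form))

    print(f"easiest form (mean p-value): {max(form_means):.3f}")
    print(f"hardest form (mean p-value): {min(form_means):.3f}")
    print(f"spread across forms:         {max(form_means) - min(form_means):.3f}")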

How do you equalize the difficulty levels of your questions?

This is a multi-step process. The item authors have to make sure they develop enough items in each topic to provide at least 3 to 5 items for each enabling objective. They have to think outside the box to produce items at several cognitive levels so that there will be a variety of possible difficulty levels. This is the hardest part for them, because most are not trained test writers.

Once the items are developed, edited, and approved in workflow, we set up an Angoff rating session to assign a cut score for the entire bank of test items. Based on the Angoff score, each item is assigned a difficulty level of easy, moderate or hard and given a matching metatag within Questionmark. We use a spreadsheet to calculate the number and percentage of available items at each level of difficulty in each topic. Based on those results, the spreadsheet tells us how many items to select from the database at each difficulty level and from each topic. The test is then designed to match these numbers, so that each time it is administered it will be parallel, with the same level of difficulty and the same cut score.
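
For readers who think in code rather than spreadsheets, the following simplified Python sketch shows that kind of blueprint-driven selection. The item fields, topic names and per-cell counts are invented for illustration; the real workflow relies on an Angoff-derived spreadsheet and Questionmark metatags rather than a script.

    # Illustrative sketch: draw a parallel form from a stratified item bank.
    import random
    from collections import defaultdict

    random.seed(7)

    # Hypothetical bank: each item carries a topic and an Angoff-derived difficulty tag.
    bank = [{"id": i,
             "topic": f"topic{i % 5}",
             "difficulty": random.choice(["easy", "moderate", "hard"])}
            for i in range(500)]

    # Blueprint: items to draw from each (topic, difficulty) cell -- invented counts.
    blueprint = {(f"topic{t}", d): n
                 for t in range(5)
                 for d, n in [("easy", 3), ("moderate", 5), ("hard", 2)]}

    def build_form(bank, blueprint):
        """Sample the required number of items from every blueprint cell."""
        cells = defaultdict(list)
        for item in bank:
            cells[(item["topic"], item["difficulty"])].append(item)
        form = []
        for cell, count in blueprint.items():
            if len(cells[cell]) < count:
                raise ValueError(f"Not enough items in bank for {cell}")
            form.extend(random.sample(cells[cell], count))
        random.shuffle(form)   # present the selected items in random order
        return form

    form = build_form(bank, blueprint)
    print(len(form), "items selected")   # 5 topics x (3 + 5 + 2) = 50 items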

Is there anything audience members should do to prepare for this session?

Come with an open mind and a willingness to think outside of the box.

How will your session help audience members ensure their randomized tests are fair?

I will give them the tools to use, starting with a quick review of using the Angoff method to set a cut score and then a discussion of the inner workings of the spreadsheet that I developed to ensure each test is fair and equal.

***

See more details about the conference program here and register soon.