2014 South African Users Conference – Addressing Compliance

We are back from the first South African Users Conference, which was hosted by Bytes People Solutions. As with all of our users conferences, the most valuable aspect of this gathering was hearing from our customers and potential customers, through presentations as well as informal conversations.

Many attendees manage assessment programs for large academic or commercial institutions, and I was struck by their teams’ organizational skills. From my conversations, it sounds as if many of these program managers have to strike a balance between traditional practices at their organizations and the need to adopt innovative strategies to improve measurement practices. For example, one program manager spoke about helping item writers transition from writing items in MS Word to writing them in Questionmark Live. The people I spoke to appeared to be pushing the envelope of their assessment capabilities, helping their stakeholders through technological transitions, while simultaneously delivering thousands of assessments. It was impressive.

Compliance was a recurring theme. In the U.S., test developers are always collecting evidence to demonstrate the legal defensibility of their assessments, and we often turn to The Standards for Educational and Psychological Testing for guidance (the latest edition was released just last week). Though the legal and cultural expectations for test development may differ slightly in other regions, no modern test developer is exempt from accountability. Demonstrating compliance with organizational or legal requirements seemed to be a big consideration for many attendees.

Regardless of what compliance means to different organizations, one thing was the same for everyone: demonstrating compliance means having accurate, easily accessed data. I noticed that many clients were able to cite data-backed evidence for the decisions they made in their testing programs to meet their stakeholders’ compliance requirements. Some of these data came from Questionmark through our APIs and assessment results, but these presenters had also clearly researched other important factors that affect the validity of the results.

For example, presenters talked about the evidence they gathered to support the use of computer-based testing over paper and pencil tests. Another presenter shared qualitative data from interviewing subject matter experts about their impressions of Questionmark’s authoring tools. These decisions affect the delivery mode and task models of the assessment, which directly relate to the validity of the results, so it is encouraging to see test developers documenting their rationales for these kinds of decisions.

All in all, it was an impressive group of professionals who gathered in Midrand, and I am sure that I learned just as much (if not more) from the participants as they did from me. Special thanks to everyone who attended and presented!

Get trustable results: Require a topic score as a prerequisite to pass a test

If you are taking an assessment to prove your competence as a machine operator, and you get all the questions right except the health and safety ones, should you pass the assessment? Probably not. Some topics can be more important than others, and assessment results should reflect that fact.

In most assessments, it’s acceptable to define a pass or cut score, and all that is required to pass the assessment is for the participant to achieve the passing score or higher. The logic for this is that success on one item can make up for failure on another item, so skills in one area are substitutable for skills in another. However, there are other assessments where some skills or knowledge are critical, and here you might want to require a passing score or even a 100% score in the key or “golden” topics as well as a pass score for the test as a whole.

This is easy to set up in Questionmark when you author your assessments. When you create the assessment outcome that defines passing the test, you define some topic prerequisites.

Here is an illustrative example, showing 4 topics. As well as achieving the pass score on the test, the participant must achieve 60% in three topics: “Closing at end of day”, “Operations” and “Starting up”, and 100% in one topic: “Safety”.


If you need to ensure that participants don’t pass a test unless they have achieved scores in certain topics, topic prerequisites are the way to achieve this.
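The logic behind the example above can be sketched in a few lines of Python. This is only an illustration of the rule, not Questionmark's implementation: the `TopicPrerequisite` class, the `passes` function, the 70% overall pass score and the participant's scores are all assumptions made up for the sketch. Only the four topic names and their 60% and 100% thresholds come from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicPrerequisite:
    """Hypothetical record: a topic and the minimum percentage required in it."""
    topic: str
    min_percent: float


def passes(overall_percent, topic_percents, pass_percent, prerequisites):
    """Return True only if the overall score and every topic prerequisite are met."""
    if overall_percent < pass_percent:
        return False
    # A missing topic score counts as 0, so it fails any positive prerequisite.
    return all(
        topic_percents.get(p.topic, 0.0) >= p.min_percent for p in prerequisites
    )


# Topic thresholds from the example; the 70% overall pass score is assumed.
PREREQUISITES = [
    TopicPrerequisite("Closing at end of day", 60),
    TopicPrerequisite("Operations", 60),
    TopicPrerequisite("Starting up", 60),
    TopicPrerequisite("Safety", 100),
]

# Hypothetical participant: strong overall, but not perfect on Safety.
scores = {
    "Closing at end of day": 90,
    "Operations": 95,
    "Starting up": 100,
    "Safety": 80,
}

print(passes(91, scores, pass_percent=70, prerequisites=PREREQUISITES))  # False
```

The participant clears the overall pass score comfortably but still fails, because the Safety prerequisite is not met. That is the non-substitutable behaviour a "golden" topic is meant to enforce.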

Case Study: Live monitoring offers security for online tests

Thomas Edison State College (TESC) is one of the oldest schools in the country designed specifically for adults. The college’s 20,000+ students, many of them involved with careers and families, live all over the world and favor courses that enable online study.

In setting up online midterm and final exams, the college wanted to give distance learners the same kind of security that on-campus students experience at more traditional institutions. At the same time, it was essential to give students some control over where and when they take tests.

Online proctoring offered a way to achieve both of these goals.

Working with Questionmark and ProctorU has enabled TESC to administer proctored exams to students at their home or work computers.

Proctors connect with test takers via webcam and audio hook-ups, verify each test taker’s identity, initiate the authentication process, ensure the students are not using any unauthorized materials or aids, and troubleshoot technical problems. The college can now run secure tests while meeting the needs of busy students for flexible access to exams.

You can read the full case study here.

Item Development – Managing the Process for Large-Scale Assessments

Whether you work with low-stakes assessments, small-scale classroom assessments or large-scale, high-stakes assessments, understanding and applying some basic principles of item development will greatly enhance the quality of your results.

This is the first in a series of posts setting out item development steps that will help you create defensible assessments. Although I’ll be addressing the requirements of large-scale, high-stakes testing, the fundamental considerations apply to any assessment.

You can find previous posts here about item development including how to write items, review items, increase complexity, and avoid bias. This series will review some of what’s come before, but it will also explore new territory. For instance, I’ll discuss how to organize and execute different steps in item development with subject matter experts. I’ll also explain how to collect information that will support the validity of the results and the legal defensibility of the assessment.

In this series, I’ll take a look at:

[Figure: Item Dev.]

These are common steps (adapted from Crocker and Algina’s Introduction to Classical and Modern Test Theory) taken to create the content for an assessment. Each step requires careful planning, implementation, and documentation, especially for high-stakes assessments.

This looks like a lot of steps, but item development is just one slice of assessment development. Before item development can even begin, there’s plenty of work to do!

In their article, Design and Discovery in Educational Assessment: Evidence-Centered Design, Psychometrics, and Educational Data Mining, Mislevy, Behrens, Dicerbo, and Levy provide an overview of Evidence-Centered Design (ECD). In ECD, test developers must define the purpose of the assessment, conduct a domain analysis, model the domain, and define the conceptual assessment framework before beginning assessment assembly, which includes item development.

Once we’ve completed these preparations, we are ready to begin item development. In the next post, I will discuss considerations for training our item writers and item reviewers.

How many test or exam retakes should you allow? Part 2

In my last post, I offered some ideas about what to consider when determining your retake policy regarding a certification assessment measuring competence and mastery. Some of the issues to balance are test security, fairness, a delay between retakes and the impact of retakes on test preparation. In this conclusion to the post, I’ll share what a few other organizations do and how you might approach deciding the number of retakes to allow.

Here is how a few respected certification programmes manage retakes

SAP have the following rules in their certification programme:

No candidate may participate in the same examination for the same release more than three times. A candidate who has failed at an examination three times for a release may not attempt that examination again until the next release. 

Microsoft allow up to 5 attempts in a 12-month period and then impose a 12-month waiting period. They also have gaps of several days between retakes, with the number of days increasing for subsequent retakes.

The US financial regulator FINRA requires a waiting time of 30 days between exams, but if you fail an exam three or more times in succession, you must wait 6 months before taking it again.

What’s the right answer for you?

The right answer depends on your circumstances. Many programmes allow retakes but have rules in place to limit the delivery rate of the assessment in order to limit content exposure.

1. You should communicate your retake policy to participants and to stakeholders who see the results of the assessments.

2. If you release scores, you also need to decide whether scores for all attempts are released or (as many organizations do) only the score for the successful attempt. Section 11.2 of the Standards for Educational and Psychological Testing states:

“Test users or the sponsoring agency should explain to test takers their opportunities, if any, to retake an examination; users should also indicate whether the earlier as well as later scores will be reported to those entitled to receive score reports.”

3. You should not allow people to retake a test they have passed.

4. You should consider requiring a period of time to elapse before someone retakes an exam if they fail. This allows time for them to update their learning. You can easily set this up when scheduling within Questionmark software; for example, the dialog below gives a 7-day gap.

[Screenshot: Limit days between retakes]

5. Unless special circumstances apply, you will usually want to allow at least one retake and probably at least two retakes.

6. You may want to consider some intervention or stop procedure after a certain number of failed attempts. A common number I’ve heard anecdotally is three attempts, but it will depend on each assessment program’s own individual factors and use cases. If this is an internal compliance exam, you might want to organize some remedial training or job review. If this is a public exam, you might want to ensure a longer time period to allow reflection and re-learning.
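To make points 3, 4 and 6 concrete, here is a minimal Python sketch of how such a retake policy might be checked. It is not how Questionmark scheduling works internally; the function name `retake_decision`, its parameters and the returned messages are all hypothetical. The 7-day gap and the three-attempt threshold are simply the figures mentioned above, used as defaults that each programme would adjust.

```python
from datetime import date, timedelta


def retake_decision(attempts, today, min_gap_days=7, intervention_after=3):
    """Decide whether a participant may take the exam again.

    attempts: list of (attempt_date, passed) tuples in chronological order.
    """
    # Point 3: no retakes once the exam has been passed.
    if any(passed for _, passed in attempts):
        return "denied: already passed"
    # Point 6: stop and intervene after a set number of failed attempts.
    if len(attempts) >= intervention_after:
        return "intervention: review required before another attempt"
    # Point 4: require a minimum gap after the most recent failed attempt.
    if attempts:
        earliest = attempts[-1][0] + timedelta(days=min_gap_days)
        if today < earliest:
            return f"wait: next attempt allowed on {earliest.isoformat()}"
    return "allowed"


# Hypothetical participant who failed once on 2 June 2014.
history = [(date(2014, 6, 2), False)]
print(retake_decision(history, today=date(2014, 6, 5)))
print(retake_decision(history, today=date(2014, 6, 9)))
```

The first call falls inside the 7-day gap, so the participant is asked to wait; the second falls on the first permitted day, so the retake is allowed. What the "intervention" involves (remedial training, a job review, or a longer waiting period) is a policy decision the sketch deliberately leaves open.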

Please feel free to comment below if you have alternative thoughts on the number of retakes to allow.