5 Things I Learned at the European Association of Test Publishers Conference Last Week

Posted by John Kleeman

I attended the Association of Test Publishers' European conference (EATP), held last week in Madrid, and wanted to share some of what I learned.

The Association of Test Publishers (ATP) is the trade association for the assessment industry and promotes good practice in assessment. Questionmark have been members for a long time and I am currently on their board of directors. The theme of the conference was “Transforming Assessments: Challenge. Collaborate. Inspire.”

Panel at European Association of Test Publishers

As well as seeing a bit of Madrid (I particularly enjoyed the beautiful Retiro Park), I learned a great deal at the conference. Here are some highlights. (These are all my personal opinions, not endorsed by Questionmark or the ATP.)

1. Skills change. Assessments are often used to measure skills, so as skills change, assessments must change too. There were at least three strands of opinion at the conference. One is that workplace skills are changing rapidly: half of what you learn today will be out of date in five years, less if you work in technology. Another is that many important skills do not change at all: we need to collaborate with others, analyze information and show emotional resilience, and these and other important skills were needed 50 years ago and will still be needed in 50 years' time. A third, suggested by keynote speaker Lewis Garrad, is that change itself is not new: there has been rapid change ever since the industrial revolution, and that is still the case now. There is probably some truth in all three!

2. Artificial Intelligence (AI). Many sessions at the conference covered AI. Of course, a lot of what gets called AI is in fact just clever marketing of smart computer algorithms. Nevertheless, machine learning and other techniques that might genuinely be called AI are on the rise and will be a useful tool to make assessments better. The industry needs to be open and transparent in its use of AI. In particular, any use of AI to score people or to identify anomalies that could indicate test cheating needs to be very well built to defend against the potential for bias.

3. Debate is a good way to learn. There were several debates at the conference, where experts argued issues such as performance testing, fraud detection and privacy vs. test security, with the audience voting before and after. As the Ancient Greeks knew, this is a good format for learning, as you get to see the arguments on both sides presented with passion. I'd encourage others to use debates for learning.

4. Privacy and test security genuinely need balance. I participated in the privacy vs. test security debate, and it's clear that there is a genuine challenge in balancing the privacy rights of individual test-takers against the need of testing organizations to ensure results are valid and have integrity. There is no single right answer. Test-taker rights are not unlimited, and testing organizations cannot do absolutely anything they want in the name of security. The rise of privacy laws, including the GDPR, has brought this discussion to the forefront as everyone seeks to give test-takers their mandated privacy rights whilst still processing the data needed to ensure test results have integrity. A way forward seems to be emerging where test-takers have privacy and yet testing organizations can assert legitimate interests to resist cheating.

5. Tests have to be useful as well as valid, reliable and fair. One of the highlights of the conference was a CEO panel, where Marten Roorda, CEO of ACT, Norihisa Wada, a senior executive at EduLab in Japan, Sangeet Chowfla, CEO of the Graduate Management Admission Council, and Saul Nassé, CEO of Cambridge Assessment, gave their views on how assessment is changing. I moderated this panel (see picture) and it was great to hear these very smart thought leaders talk of the future. There is widespread agreement that validity, reliability and fairness are key tenets for assessments, but also a reminder that we need "efficacy" too, i.e. that tests need to be useful for their purpose and valuable to those who use them.

There were many other conference conversations, including sessions on online proctoring, test translation, the update to the ISO 10667 standard, new guidelines on technology-based assessment and much, much more.

I found it challenging, collaborative and inspiring and I hope this blog gives you a small flavor of the conference.

xAPI: A Way to Enable Learning Analytics

Posted by John Kleeman

Many organizations train and test individuals to ensure they have the right skills and competencies. In doing so, they amass vast amounts of data, which can be used to identify further training opportunities and improve performance. One way of managing this data is to use the Experience API (or xAPI) to pass data from disparate systems into a central Learning Record Store.

xAPI is maintained by the United States Advanced Distributed Learning Initiative (see www.adlnet.gov) and many Questionmark users have requested that we support xAPI so that they can export test data for analysis. For this reason, we’re pleased to let you know that earlier this year, we released our xAPI Connector for OnPremise and OnDemand customers. The integration lets the Questionmark platform connect and ‘talk’ to Learning Record Stores, creating an agile and effective learning and development ecosystem.

The challenges organizations face

For any organization, measuring the competence of employees or consultants through assessment is an essential element of ensuring the team is capable and fit for purpose. During this process, organizations collect large amounts of data that must be stored in compliance with strict data privacy regulations.

Once employers have control of learning and assessment data, it can be interrogated to analyze employee performance and the effectiveness of training programs. With Questionmark's xAPI integration, customers will now be able to transfer data from the assessment platform to their Learning Record Store.

What xAPI does

xAPI provides a standard means for collecting data from training and assessment experiences. The specification allows different systems to communicate and share data, which can then be stored and analyzed. This helps organizations to make better decisions by collecting, tracking, and quantifying learning activities to see what works and what doesn’t.
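To make this concrete, here is a minimal sketch of what an xAPI statement looks like and how a system might send one to a Learning Record Store over HTTP. The LRS URL, credentials and activity details are invented placeholders, and this illustrates the specification in general rather than the inner workings of Questionmark's connector.

```python
import requests

# An xAPI statement is a JSON record with an "actor - verb - object" shape.
# This example describes a test-taker passing an assessment.
statement = {
    "actor": {
        "objectType": "Agent",
        "name": "Jane Learner",
        "mbox": "mailto:jane.learner@example.com",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/passed",
        "display": {"en-US": "passed"},
    },
    "object": {
        "objectType": "Activity",
        "id": "http://example.com/assessments/safety-101",
        "definition": {"name": {"en-US": "Safety 101 Exam"}},
    },
    "result": {"score": {"scaled": 0.85}, "success": True},
}

# POST the statement to the LRS's statements endpoint.
# The URL and credentials below are placeholders.
response = requests.post(
    "https://lrs.example.com/xapi/statements",
    json=statement,
    headers={"X-Experience-API-Version": "1.0.3"},
    auth=("lrs_user", "lrs_password"),
)
response.raise_for_status()
print("Statement stored with id:", response.json()[0])
```

Because every system emits statements in this same shape, a Learning Record Store can aggregate activity from many different tools and report on it uniformly.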

Organizations are increasingly investing in Learning Record Stores to host and analyze learning and assessment data. With xAPI, Questionmark customers will now be able to send assessment data directly to their Learning Record Stores, so that they can measure the impact of learning and development activities and maximize the return on their investment.

xAPI offers universal integration, meaning users can store data wherever suits them. Reporting across multiple geographies is easy, so users can analyze, compare and contrast data. The data is also expressed in a universal format, making it easy to understand and interpret. This provides a solid starting point for big data learning analytics. And, as an assessment technology provider, Questionmark has widened its footprint in the wider learning ecosystem by releasing the xAPI functionality.

If you’d like to find out more about the full range of assessment features that Questionmark offers, contact us or request a demo.



Workplace Exams 101: How to Prevent Cheating

Posted by John Kleeman

A hot topic in the assessment world today is cheating and what to do to prevent it. Many organizations test their employees, contractors and other personnel to check their competence and skills. These include compliance tests, on-boarding tests, internal certification tests, end-of-course tests and product knowledge quizzes.

There are two reasons why cheating matters in workplace exams:

Issue #1: Validity

Firstly, the validity of the test or exam is compromised; any decision made as a result of the test is unsound. For example, you may use a test to check whether someone is ready to sell your products; if they cheated, you have no evidence that they are. Or you may be checking whether someone can safely perform a task; if they cheated, safety is compromised. Tests and exams are used to make important decisions about people with business, financial and regulatory consequences. If someone cheats on a test or exam, you are making those decisions based on bad data.

Issue #2: Integrity

Secondly, people who cheat on tests or exams have demonstrated a lack of integrity. If they will cheat on a test or exam, what else might they be willing to lie about or defraud your organization over? Will falsifying a record or report be next? Regulators often have rules requiring integrity and impose sanctions on those who demonstrate a lack of it.

For example, in the financial sector, FINRA’s Rule 2010 requires individuals to “observe high standards of commercial honor” and is used to ban people found cheating at exams or continuing education tests. In the accountancy sector, both AICPA and CIMA require accountants to have integrity and those found cheating at tests have been banned or otherwise sanctioned. And in the medical and pharmaceutical field, regulators have codes of conduct which include honesty. For example, the UK General Medical Council requires doctors to “always be honest about your experience, qualifications and current role” and interprets cheating at exams as a violation of this.

The well-respected International Test Commission Guidelines on the Security of Tests, Exams and Other Assessments suggest six categories of cheating threat, shown below, each with my own examples of how it can occur in the workplace.


ITC category: Using test content pre-knowledge
– An employee takes the test and passes questions to a colleague still to take it
– Someone authoring questions leaks them to test-takers
– A security vulnerability allows questions to be seen in advance

ITC category: Receiving expert help while taking the test
– One employee sits with and coaches another during the test
– Help by instant message or phone while taking the test
– A manager or proctor supervising the test helps a struggling employee

ITC category: Using unauthorized test aids
– Access to the Internet allows googling the answers
– Unauthorized study guides brought into the test

ITC category: Using a proxy test taker
– A manager sends an assistant or secretary to take the test in their place
– Other situations where one colleague stands in for another

ITC category: Tampering with answer sheets or stored test results
– Technically minded employees subvert communication with the LMS or other corporate systems and change their results

ITC category: Copying answers from another test-taker
– Two people sitting near each other share or copy answers
– Organized answer sharing within a cohort or group of trainees


If you are interested in learning more about any of the threats above, I've shared approaches to mitigating them in the workplace in our webinar, Workplace Exams 101: How to Prevent Cheating. You can download the webinar recording and slides HERE.

Ten Key Considerations for Defensibility and Legal Certainty for Tests and Exams

Posted by John Kleeman

In my previous post, Defensibility and Legal Certainty for Tests and Exams, I described the concepts of Defensibility and Legal Certainty for tests and exams. Making a test or exam defensible means ensuring that it can withstand legal challenge. Legal certainty relates to whether laws and regulations are clear and precise and people can understand how to conduct themselves in accordance with them. Lack of legal certainty can provide grounds to challenge test and exam results.

Questionmark has just published a new best practice guide on Defensibility and Legal Certainty for Tests and Exams. This blog post describes ten key considerations when creating tests and exams that are defensible and encourage legal certainty.

1. Documentation

Without documentation, it will be very hard to defend your assessment in court, as you will have to rely on people’s recollections. It is important to keep records of the development of your tests and ensure that these records are updated so that they accurately reflect what you are doing within your testing programme. Such records will be powerful evidence in the event of any dispute.

2. Consistent procedures

Testing is more a process than a project. Tests are typically created and then updated over time, and it's important that procedures stay consistent. For example, a question added to the test after its initial development should go through similar procedures to those used when the test was first developed. If you adopt an ad hoc approach to test design and delivery, you are exposing yourself to an increased risk of successful legal challenge.

3. Validity

Validity, reliability and fairness are the three generally accepted principles of good test design. Broadly speaking, validity is how well the assessment matches its purpose. If your tests and exams lack validity, they will be open to legal challenge.

4. Reliability

Reliability is a measure of the precision and consistency of an assessment and is also critical. There are many posts explaining reliability and validity on this blog; a useful one is Understanding Assessment Validity and Reliability.
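To give a concrete flavor, one common reliability statistic is Cronbach's alpha, which estimates a test's internal consistency from item-level scores. Below is a minimal sketch, not part of any Questionmark product; the score matrix is invented for illustration.

```python
import numpy as np

def cronbachs_alpha(scores: np.ndarray) -> float:
    """Estimate internal consistency from an (n_test_takers, n_items) score matrix."""
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Example: five test-takers, four dichotomously scored items (1 = correct).
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(f"Cronbach's alpha: {cronbachs_alpha(scores):.2f}")
```

A higher alpha indicates that the items hang together consistently; being able to show such evidence is part of demonstrating a defensible test.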

5. Fairness (or equity)

Probably the biggest cause of legal disputes over assessments is whether they are fair or not. The international standard ISO 10667-1:2011 defines equity as the "principle that every assessment participant should be assessed using procedures that are fair and, as far as possible, free from subjectivity that would make assessment results less accurate". A significant part of fairness/equity is that a test should not advantage or disadvantage individuals because of characteristics irrelevant to the competence or skill being measured.

6. Job and task analysis

The skills and competences needed for a job change over time. Job and task analysis are techniques used to analyse a job and identify the key tasks performed and the skills and competences needed. If you use a test for a job without some kind of analysis of job skills, it will be hard to defend the claim that the test is actually appropriate for measuring someone's competence and skills for that job.

7. Set the cut or pass score fairly

It is important that you have evidence to reasonably justify that the cut score used to divide pass from fail genuinely distinguishes the minimally competent from those who are not competent. You should not just choose a score of 60%, 70% or 80% arbitrarily, but instead work out the cut score based on the difficulty of the questions and what you are measuring.
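For example, in the widely used (modified) Angoff method, a standard standard-setting technique not described in detail above, subject-matter experts estimate for each question the probability that a minimally competent candidate would answer it correctly, and those judgments determine the cut score. A minimal sketch follows, with invented ratings:

```python
# Modified Angoff standard setting: each subject-matter expert rates, per item,
# the probability (0-1) that a minimally competent candidate answers correctly.
# The ratings below are invented for illustration.
ratings = {
    "item_1": [0.90, 0.85, 0.80],
    "item_2": [0.60, 0.55, 0.65],
    "item_3": [0.70, 0.75, 0.70],
    "item_4": [0.50, 0.45, 0.55],
}

# Expected score for a minimally competent candidate = sum of mean item ratings.
expected_score = sum(sum(r) / len(r) for r in ratings.values())
cut_score_pct = 100 * expected_score / len(ratings)

print(f"Cut score: {expected_score:.2f} of {len(ratings)} items ({cut_score_pct:.0f}%)")
```

Documenting the expert judgments behind the cut score gives you exactly the kind of evidence that survives a legal challenge.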

8. Test more than just knowledge recall

Most real-world jobs and skills require more than just knowing facts. Questions that test remember/recall skills are easy to write, but they only measure knowledge. For most tests, it is important to include a wider range of skills. This can be done with conventional questions that test beyond knowledge recall, or with other kinds of tests such as observational assessments.

9. Consider more than just multiple choice questions

Multiple choice tests can assess well; however, in some regions multiple choice questions sometimes get a "bad press". As you design your test, you may want to consider including richer stimulus material and a variety of question types (e.g. matching, fill-in-blanks, etc.) to reduce measurement error and enhance stakeholder satisfaction.

10. Robust and secure test delivery process

A critical part of the chain of evidence is to be able to show that the test delivery process is robust, that the scores are based on answers genuinely given by the test-taker and that there has been no tampering or mistakes. This requires that the software used to deliver the test is reliable and dependably records evidence including the answers entered by the test-taker and how the score is calculated. It also means that there is good security so that you have evidence that the right person took the test and that risks to the integrity of the test have been mitigated.
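As one illustration of how delivery software can make tampering detectable (a generic technique, not a description of how any particular product works), each answer event can be written to an append-only log in which every entry includes a hash of the previous entry, so any later alteration breaks the chain:

```python
import hashlib
import json
import time

def append_answer(log: list, test_taker: str, question_id: str, answer: str) -> None:
    """Append an answer event whose hash chains to the previous entry."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry = {
        "test_taker": test_taker,
        "question_id": question_id,
        "answer": answer,
        "timestamp": time.time(),
        "prev_hash": prev_hash,
    }
    # The entry's hash covers its content plus the previous entry's hash,
    # so modifying any earlier entry invalidates every later hash.
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)

def verify(log: list) -> bool:
    """Recompute each hash; any edit to an earlier entry is detected."""
    prev = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log: list = []
append_answer(log, "jane.learner", "Q1", "B")
append_answer(log, "jane.learner", "Q2", "D")
print("Log intact:", verify(log))
```

Whatever mechanism is used, the point is the same: the delivery system should produce evidence that the recorded answers and scores are the ones the test-taker actually gave.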

For more on these considerations, please check out our best practice guide on Defensibility and Legal Certainty for Tests and Exams, which also contains some legal cases to illustrate the points. You can download the guide HERE – it is free with registration.

Defensibility and Legal Certainty for Tests and Exams

Posted by John Kleeman

Questionmark has just published a new best practice guide on Defensibility and Legal Certainty for Tests and Exams. Download the guide HERE.

We are all familiar with the concept of a chain of custody for evidence in a criminal case. If the prosecution seeks to provide evidence to a court of an object found at a crime scene, they will carefully document its provenance and what has happened to it over time, to show that the object offered as evidence at court is the object recovered from the crime scene.

There is a useful analogy between this concept and defensibility and legal certainty in tests and exams. Assessments have a "purpose" or a "goal", for example the need to check a person's competence before allowing them to perform a job task. It is important that an assessment programme defines its purpose clearly, ensures that this purpose is then enshrined in the design of the test or exam, and checks that assessment delivery is consistent with the defined purpose. Essentially, there should be a chain from purpose to design to delivery to decision, which makes the end decision defensible. If you follow that chain, your assessments may be defensible and legally certain; if that chain has breaks or gaps, then your assessments are likely to become less certain and more legally vulnerable.

Defensibility of assessments

Defensibility, in the context of assessments, concerns the ability of a testing organisation to withstand legal challenges. These legal challenges may come from individuals or groups who claim that the organisation itself, the processes followed (e.g., administration, scoring, setting pass scores, etc.), or the outcomes of the testing (e.g., a person is certified or not) are not legally valid. Essentially, defensibility has to do with the question: “Are the assessment results, and more generally the testing program, defensible in a court of law?”.

Ensuring that assessments are defensible means ensuring that assessments are valid, reliable and fair and that you have evidence and documentation available to demonstrate the above, in case of a challenge.

Legal certainty for assessments

Legal certainty ("Rechtssicherheit" in German) means that the law (or other rules) must be certain, in that the law is clear and precise and its legal implications foreseeable. If there is legal certainty, people should understand how to conduct themselves in accordance with the law. This contrasts with legal indeterminacy, where the law is unclear and may require a court's ruling to determine what it means.

  • Lack of legal certainty can provide grounds to challenge assessment results. For instance, many organisations have rules for how they administer assessments or make decisions based on the results, and a test-taker might claim that the organisation has not followed its own rules or that the rules are ambiguous.
  • Some public bodies are constrained by law, in which case they can only deliver assessments in a way that laws and regulations permit; if they veer from this, they can be challenged under legal certainty.
  • Legal certainty issues can also arise if the exam process goes awry. For example, someone might claim that their answers were swapped with those of another test-taker, or that the exam was unfair because the user interface was confusing, e.g. they unintentionally submitted their answers and finished the exam before they intended to.

The best practice guide describes the principles and key steps to make assessments that are defensible and that provide legal certainty, and which are less likely to be successfully challenged in courts. The guide focuses primarily on assessments used in the workplace and in certification. It focuses particularly on legal cases and issues in Europe but will also be relevant in other regions.

You can download the guide HERE – it is free with registration.

FAQ – “Testing Out” of Training

Posted by Kristin Bernor

Let's explore what it means to "test out", what the business benefits are and how Questionmark enables you to do this in a simple, timely and valid manner.

"Testing out" of training saves time and money by allowing participants to skip training they don't need. It also makes training more valid and respected, and so more likely to change behavior, because it focuses training on the people who need it while freeing those who already know the material to acquire additional knowledge, skills and abilities.

The key to "testing out" of training is that the test properly measures what the training teaches. If it does, then someone who passes the test has demonstrated that they already know the material and doesn't need the training. Testing out can be a hard sell when the test doesn't really measure the same outcomes as the training: in that case, passing the test doesn't prove you know the material. So the key is to write a good test.

Online assessments are about both staying compliant with regulatory requirements AND giving business value. Assessments help ensure your workforce is competent and reduce risk, but they also give business value in improved efficiency, knowledge and customer service.

What does it mean to “test out” of training?

Many organizations create tests that allow participants to "test out" of training if they pass. Essentially, if you already know the material being taught, you don't need to spend time in the training. Testing people on material they already know wastes time and resources, whereas directing them to the training they do need ensures they are motivated and feel their time is well spent. Everyone wins!

Why is this so important? What are the advantages of incorporating "testing out"?

The key advantage of this approach is that you save time when people don’t have to attend the training that they don’t need. Time is money for most organizations, and saving time is an important benefit.

Suppose, for example, you have 1,000 people who need to take some training that lasts 2 hours. That is 2,000 hours of people's time. Now, suppose you can give a 20-minute test that 25% of people pass, allowing them to skip the training. The total time taken is 333 hours for the test and 1,500 hours for the training, which adds up to 1,833 hours. So having one-fourth of the test-takers skip the training saves about 8% of the time that would have been required for everyone to attend.
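The same arithmetic is easy to rerun with your own numbers; here is a quick sketch using the figures above:

```python
# Break-even arithmetic for "testing out" (numbers from the example above).
people = 1000
training_hours = 2.0
test_hours = 20 / 60          # a 20-minute test
pass_rate = 0.25              # share who test out and skip the training

baseline = people * training_hours                                   # 2,000 hours
with_test = people * test_hours + people * (1 - pass_rate) * training_hours
savings = baseline - with_test

print(f"Baseline: {baseline:.0f} h, with test-out: {with_test:.0f} h")
print(f"Saved: {savings:.0f} h ({100 * savings / baseline:.0f}%)")
```

The savings grow with the pass rate and with the length of the training relative to the test, so the approach pays off most for long courses that many people could already pass.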

In addition to saving time, using diagnostic tests in this way helps people who do attend training focus their attention on the areas they don't know well, making them more receptive to the training that benefits them.

Is it appropriate to allow “testing out” of all training?

Obviously if you follow this approach, you’ll need to ensure that your tests are appropriate and sufficient – that they measure the right knowledge and skills that the training would otherwise cover.

You’ll need to check your regulations to confirm that this is permissible for you, but most regulators will see sense here.

How Questionmark can be used to “test out”

Online assessments are a consistent and cost-effective means of validating that your workforce knows the law, your procedures and your products. If you are required to document training, they are the most reliable way of doing so. When creating and delivering assessments within Questionmark, it's quite simple to qualify a candidate once they reach a score threshold; if they answer a series of items correctly and pass the assessment, this indicates that further training is not needed. It is imperative that the assessment accurately tests for the requisite knowledge and skills that are part of the training objectives.
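As a hypothetical illustration of that threshold logic (the function name and the 80% cut score are invented for this sketch, not Questionmark's actual API):

```python
def route_candidate(score_pct: float, cut_score_pct: float = 80.0) -> str:
    """Decide whether a candidate tests out of training (hypothetical cut score)."""
    if score_pct >= cut_score_pct:
        return "passed: training waived, result recorded for compliance"
    return "not passed: assign training, then reassess"

print(route_candidate(85.0))  # training waived
print(route_candidate(62.0))  # training assigned
```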

The candidate can then focus on training that is pertinent, worthwhile and beneficial both to themselves and to the company. If they answer incorrectly and are unable to pass the assessment, then training is needed until they can master the material and demonstrate it in a test.