Ten Key Considerations for Defensibility and Legal Certainty for Tests and Exams

Posted by John Kleeman

In my previous post, Defensibility and Legal Certainty for Tests and Exams, I described the concepts of Defensibility and Legal Certainty for tests and exams. Making a test or exam defensible means ensuring that it can withstand legal challenge. Legal certainty relates to whether laws and regulations are clear and precise and people can understand how to conduct themselves in accordance with them. Lack of legal certainty can provide grounds to challenge test and exam results.

Questionmark has just published a new best practice guide on Defensibility and Legal Certainty for Tests and Exams. This blog post describes ten key considerations when creating tests and exams that are defensible and encourage legal certainty.

1. Documentation

Without documentation, it will be very hard to defend your assessment in court, as you will have to rely on people’s recollections. It is important to keep records of the development of your tests and ensure that these records are updated so that they accurately reflect what you are doing within your testing programme. Such records will be powerful evidence in the event of any dispute.

2. Consistent procedures

Testing is more a process than a project. Tests are typically created and then updated over time. It's important that procedures are consistent over time. For example, a question added to the test after its initial development should go through procedures similar to those used when the test was first developed. If you adopt an ad hoc approach to test design and delivery, you are exposing yourself to an increased risk of successful legal challenge.

3. Validity

Validity, reliability and fairness are the three generally accepted principles of good test design. Broadly speaking, validity is how well the assessment matches its purpose. If your tests and exams lack validity, they will be open to legal challenge.

4. Reliability

Reliability is a measure of the precision and consistency of an assessment and is also critical. There are many posts explaining reliability and validity on this blog; a useful one is Understanding Assessment Validity and Reliability.
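To make this a little more concrete, one widely used reliability statistic is Cronbach's alpha, which estimates internal consistency from item scores. The sketch below computes it for a small set of made-up, dichotomously scored responses; the data and function name are purely illustrative, not part of any particular assessment product.

```python
# Illustrative sketch: Cronbach's alpha, a common internal-consistency
# reliability statistic. Responses are invented (1 = correct, 0 = incorrect).

def cronbachs_alpha(scores):
    """scores: one row per test-taker, one column per question."""
    n_items = len(scores[0])

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    # Variance of each item's scores across test-takers
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the total test scores
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbachs_alpha(responses)
print(round(alpha, 3))  # higher values (closer to 1) indicate more consistent items
```

A low alpha is a warning sign worth investigating before relying on the scores in a dispute.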

5. Fairness (or equity)

Probably the biggest cause of legal disputes over assessments is whether they are fair or not. The international standard ISO 10667-1:2011 defines equity as the “principle that every assessment participant should be assessed using procedures that are fair and, as far as possible, free from subjectivity that would make assessment results less accurate”. A significant part of fairness/equity is that a test should not advantage or disadvantage individuals because of characteristics irrelevant to the competence or skill being measured.

6. Job and task analysis

The skills and competences needed for a job change over time. Job and task analysis are techniques used to analyse a job and identify the key tasks performed and the skills and competences needed. If you use a test for a job without some kind of analysis of the job's skills, it will be hard to demonstrate and defend that the test is actually appropriate for measuring someone's competence and skills for that job.

7. Set the cut or pass score fairly

It is important that you have evidence to justify that the cut score used to divide pass from fail genuinely distinguishes the minimally competent from those who are not competent. You should not choose a score of 60%, 70% or 80% arbitrarily; instead, you should work out the cut score based on the difficulty of the questions and what you are measuring.
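One established way to work out a cut score based on question difficulty is the modified Angoff method: subject matter experts estimate, for each question, the probability that a minimally competent candidate would answer it correctly, and the cut score is derived from the average of those estimates. The sketch below shows the arithmetic; the judge names and ratings are invented for illustration only.

```python
# Hypothetical sketch of the modified Angoff cut-score calculation.
# Each judge rates, per question, the probability (0-1) that a minimally
# competent candidate answers correctly. All values below are made up.

judge_ratings = {
    "Judge A": [0.9, 0.7, 0.5, 0.8, 0.6],
    "Judge B": [0.8, 0.6, 0.4, 0.9, 0.7],
    "Judge C": [0.85, 0.65, 0.55, 0.8, 0.6],
}

n_questions = 5
# Average all ratings across judges and questions to get the cut score
total_rating = sum(sum(ratings) for ratings in judge_ratings.values())
cut_score = total_rating / (len(judge_ratings) * n_questions)
print(f"Recommended cut score: {cut_score:.1%}")
```

Documenting the judges' ratings and the resulting calculation gives you exactly the kind of evidence a court would expect to see behind a pass mark.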

8. Test more than just knowledge recall

Most real-world jobs and skills need more than just knowing facts. Questions which test remember/recall skills are easy to write, but they only measure knowledge. For most tests, it is important that a wider range of skills is included. This can be done with conventional questions that test above the knowledge level or with other kinds of tests, such as observational assessments.

9. Consider more than just multiple choice questions

Multiple choice tests can assess well; however, in some regions, multiple choice questions sometimes get a “bad press”. As you design your test, you may want to consider including enhanced stimulus material and a variety of question types (e.g. matching, fill-in-the-blank, etc.) to reduce the possibility of measurement error and enhance stakeholder satisfaction.

10. Robust and secure test delivery process

A critical part of the chain of evidence is to be able to show that the test delivery process is robust, that the scores are based on answers genuinely given by the test-taker and that there has been no tampering or mistakes. This requires that the software used to deliver the test is reliable and dependably records evidence including the answers entered by the test-taker and how the score is calculated. It also means that there is good security so that you have evidence that the right person took the test and that risks to the integrity of the test have been mitigated.

For more on these considerations, please check out our best practice guide on Defensibility and Legal Certainty for Tests and Exams, which also contains some legal cases to illustrate the points. You can download the guide HERE – it is free with registration.

Defensibility and Legal Certainty for Tests and Exams

Posted by John Kleeman

Questionmark has just published a new best practice guide on Defensibility and Legal Certainty for Tests and Exams. Download the guide HERE.

We are all familiar with the concept of a chain of custody for evidence in a criminal case. If the prosecution seeks to provide evidence to a court of an object found at a crime scene, they will carefully document its provenance and what has happened to it over time, to show that the object offered as evidence at court is the object recovered from the crime scene.

There is a useful analogy between this concept and defensibility and legal certainty in tests and exams. Assessments have a “purpose” or a “goal”, for example, the need to check a person’s competence before allowing them to perform a job task. It is important that an assessment programme defines its purpose clearly, ensures that this purpose is then enshrined in the design of the test or exam, and checks that the assessment and delivery is consistent with the defined purpose. Essentially, there should be a chain from the purpose to design to delivery to decision, which makes the end decision defensible. If you follow that chain, your assessments may be defensible and legally certain; if that chain has breaks or gaps, then your assessments are likely to become less certain and more legally vulnerable.

Defensibility of assessments

Defensibility, in the context of assessments, concerns the ability of a testing organisation to withstand legal challenges. These legal challenges may come from individuals or groups who claim that the organisation itself, the processes followed (e.g., administration, scoring, setting pass scores, etc.), or the outcomes of the testing (e.g., a person is certified or not) are not legally valid. Essentially, defensibility has to do with the question: “Are the assessment results, and more generally the testing program, defensible in a court of law?”.

Ensuring that assessments are defensible means ensuring that assessments are valid, reliable and fair and that you have evidence and documentation available to demonstrate the above, in case of a challenge.

Legal certainty for assessments

Legal certainty (“Rechtssicherheit” in German) means that the law (or other rules) must be certain: the law is clear and precise, and its legal implications foreseeable. If there is legal certainty, people should understand how to conduct themselves in accordance with the law. This contrasts with legal indeterminacy, where the law is unclear and may require a court’s ruling to determine what it means.

  • Lack of legal certainty can provide grounds to challenge assessment results. For instance, many organisations have rules for how they administer assessments or make decisions based on the results of assessments. A test-taker might claim that the organisation has not followed its own rules or that the rules are ambiguous.
  • Some public bodies are constrained by law, in which case they can only deliver assessments in a way that laws and regulations permit; if they veer from this, they can be challenged under legal certainty.
  • Legal certainty issues can also arise if the exam process goes awry. For example, someone might claim that their answers have been swapped with those of another test-taker, or that the exam was unfair because the user interface was confusing, e.g. they unintentionally pressed a button that submitted their answers and finished the exam before they actually intended to do so.

The best practice guide describes the principles and key steps to make assessments that are defensible and that provide legal certainty, and which are less likely to be successfully challenged in courts. The guide focuses primarily on assessments used in the workplace and in certification. It focuses particularly on legal cases and issues in Europe but will also be relevant in other regions.

You can download the guide HERE – it is free with registration.