Ten Tips to Translate Tests Thoughtfully

Posted by John Kleeman

Tests and exams are used for serious purposes and have a significant impact on people’s lives; if they are translated badly, the result can be real distress. As a topical illustration, poor translation of an important medical admissions test in India was the subject of a major law case ruled on by the Indian Supreme Court last week.

Because languages and cultures vary, fairly translating tests and exams is hard. I recently attended a seminar organized by the OECD on translating large-scale assessments, which gave me a lot of insight into the test translation process. If you are interested in the OECD seminar, Steve Dept of Questionmark partner cApStAn has written a blog here, and the seminar presentations are available on the OECD website.

Here are some tips from what I’ve learned at the seminar and elsewhere on good practice in translating tests and exams.

  1. Put together a capable translation management team. A team approach works well when translating tests. For example, a subject matter expert, a linguist/translator, a business person and a testing expert would work well together as a review and management committee.
  2. Think through the purpose of your translation. Experts say that achieving perfect equivalence of a test in two languages is close to impossible, so you need to define your goals. For example, are you seeking to adapt the test to measure the same thing, or are you looking for a literal translation? The former may be more realistic, especially if your test includes culturally specific examples or context. Usually you will be looking for the two language versions to be comparable, so that a pass score in either language means a similar level of competence.
  3. Define a glossary for your project. If your test is on a specialist or technical subject, it will have some words specific to the content area. You can save time and increase the quality of the translation if you identify the expected translation of these words in advance. This will guide the translating team and ensure that test takers see consistent vocabulary.
  4. Use a competent translator (or translation company). A translator must be a native speaker of the target language and also needs current cultural knowledge, ideally from living in the target locale. A translator who is not a native speaker will not be effective, and a translator who does not know the culture may miss references in question content (e.g. local names or slang). An ideal translator will also have subject matter knowledge and assessment knowledge.
  5. Export to allow a translator to use their own tools. Translators have many automated tools available to them, including translation memories, glossaries and automated checking systems. For simple translation you can translate interactively within an assessment system, but you will get more professional results if you export from your assessment management system (for example into XLIFF XML), allow the translator to translate in their own system, and then re-import the result, as shown in the diagram. A minimal sketch of this round trip appears after this list.
  6. Put in place a verification procedure. Translators are human and make mistakes, and questions can rely on context or knowledge that a translator may not have. A verification process involves manual review by stakeholders, looking at things like accuracy, style, country issues and culture, and checking that the choices give no clues away, that the right choice is not obviously longer than the other choices, and that consistent wording is used across the stem and choices.
  7. Also review by piloting and looking at item difficulty. Linguistic review is helpful, but you should also look at item performance in practice. The difficulty of a translated item will vary slightly between languages; generally, small errors go up and down and roughly cancel out. You want to catch the big errors, where ambiguity or mis-translation makes a material difference to test accuracy. You can catch some of these by running a small pilot with 50 (or even 25) participants and comparing the p-value (item difficulty, i.e. the proportion who get the item right) in the two languages. This can flag questions with significant differences in difficulty; such questions need review, as they may well be badly translated. A sketch of this comparison also appears after the list.
  8. Consider using bilingual reviewers. If you have access to bilingual people (who speak both the target and the source language), it can be worth asking them to look at both versions of the questions and comment. This shouldn’t be your only verification procedure, but it can be very helpful in spotting issues.
  9. Update translations as questions change. In any real world test, questions in your item bank get updated over time, and that means you need to update the translations and keep track of which ones have been updated in which languages. It can be helpful  to use a translation management system, for example the one included within Questionmark OnDemand to help you manage this process, as it’s challenging and error-prone to manage manually.
  10. Read community guidelines. The International Test Commission has produced well-regarded guidelines on adapting/translating tests – you can access them here. The OECD PISA guidelines, although specific to the international PISA tests, contain good practice applicable to other programs. I personally like the heading of one of the sections in the PISA guidance: “Keep in mind that some respondents will misunderstand anything that can be misunderstood”!
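To make tip 5 more concrete, here is a minimal sketch of exporting question text as an XLIFF 1.2 file using Python’s standard library. The item texts and identifiers are invented for illustration, and real assessment systems (including Questionmark) have their own export formats and tooling, so treat this purely as an outline of the round trip.

```python
import xml.etree.ElementTree as ET

# Invented example items; a real export would come from your assessment system.
items = {
    "q1_stem": "What is the capital of France?",
    "q1_choice_a": "Paris",
    "q1_choice_b": "Lyon",
}

xliff = ET.Element("xliff", version="1.2")
file_el = ET.SubElement(xliff, "file", {
    "original": "itembank",
    "source-language": "en",
    "target-language": "fr",
    "datatype": "plaintext",
})
body = ET.SubElement(file_el, "body")
for item_id, text in items.items():
    unit = ET.SubElement(body, "trans-unit", id=item_id)
    ET.SubElement(unit, "source").text = text
    ET.SubElement(unit, "target").text = ""  # to be filled in by the translator's own tools

ET.ElementTree(xliff).write("export.xliff", encoding="utf-8", xml_declaration=True)
```

The translator works on export.xliff in their own translation environment and returns it with the target elements filled in, ready to be re-imported into the assessment system.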

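And as a sketch of the pilot comparison in tip 7, the snippet below computes the p-value (proportion correct) of each item in two language versions and flags large gaps for review. The scores, the 0.2 gap threshold and the function name are illustrative assumptions only; with pilots of 25-50 participants you should expect some noise, so flagged items need human review rather than automatic rejection.

```python
# Item scores from a small pilot in each language (1 = correct, 0 = wrong); invented data.
pilot_scores = {
    "en": {"q1": [1, 1, 0, 1, 1], "q2": [1, 0, 1, 1, 0]},
    "fr": {"q1": [1, 1, 1, 0, 1], "q2": [0, 0, 1, 0, 0]},
}

def p_value(scores):
    """Item difficulty: the proportion of participants answering correctly."""
    return sum(scores) / len(scores)

for item in pilot_scores["en"]:
    p_en = p_value(pilot_scores["en"][item])
    p_fr = p_value(pilot_scores["fr"][item])
    flag = "REVIEW" if abs(p_en - p_fr) > 0.2 else "ok"
    print(f"{item}: en={p_en:.2f}  fr={p_fr:.2f}  -> {flag}")
```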
I hope you found this post interesting – all suggestions are personal and not validated by the OECD or others. If you did find it interesting, you may also want to read my earlier blog post: Twelve tips to make questions translation ready.

To learn more about Questionmark OnDemand and Questionmark’s translation management system, see here or request a demo.

The Nineteen Responsibilities of an Assessment Data Controller under the GDPR

Posted by John Kleeman

Back in 2014, Questionmark produced a white paper covering what at the time was a fairly specialist subject – what assessment organizations needed to do to ensure compliance with European data protection law. Now that the GDPR is in place in 2018, with its extra-territorial reach and potential for large fines, data protection law compliance is an issue that all assessment users need to consider seriously.

Data Controller with two Data Processors, one of which has a Sub-Processor

Questionmark Associate Legal Counsel Jamie Armstrong, Questionmark CEO Eric Shepherd and I have now rewritten the white paper to cover the GDPR and published it this week. The white paper is called “Responsibilities of a Data Controller When Assessing Knowledge, Skills and Abilities”. I’m pleased to give you a summary in this blog article.

To remind you, a Data Controller is the organization responsible for making decisions about personal data, whereas a Data Processor is an organization that processes data on behalf of the Data Controller. As shown in the diagram, a Data Processor may have Sub-Processors. In the assessment context, examples of Data Controllers might be:

  • A company that tests its personnel for training or regulatory compliance purposes;
  • A university or college that tests its students;
  • An awarding body that gives certification exams.

Data Processors are typically companies like Questionmark that provide services to assessment sponsors. Data Processors have significant obligations under the GDPR, but the Data Controller has to take the lead.

The nineteen responsibilities of an assessment Data Controller under the GDPR are:

  1. Ensure you have a legitimate reason for processing personal data
  2. Be transparent and provide full information to test-takers
  3. Ensure that personal data held is accurate
  4. Review and deal properly with any rectification requests
  5. Respond to subject access requests
  6. Respond to data portability requests
  7. Delete personal data when it is no longer needed
  8. Review and deal properly with any erasure requests
  9. Put in place strong security measures
  10. Use expert processors and contract with them wisely
  11. Adopt privacy by design measures
  12. Notify personal data breaches promptly
  13. Consider whether you need to carry out a Data Protection Impact Assessment
  14. Follow the rules if moving data out of Europe
  15. If collecting “special” data, follow the particular rules carefully
  16. Include meaningful human input as well as assessment results in making decisions
  17. Respond to restriction and objection requests
  18. Train your personnel effectively
  19. Meet organisational requirements

Back in 2014, we considered there were typically 12 responsibilities for an assessment Data Controller. Our new white paper suggests there are now 19. The GDPR significantly expands the responsibilities Data Controllers have, as well as making it clearer what needs to be done and what the likely penalties are if it is not done.

The 25-page white paper:

  • Gives a summary of European data protection law
  • Describes what we consider to be the 19 responsibilities of a Data Controller (see diagram)
  • Gives Data Controllers a checklist of the key measures they need from a Data Processor to be able to meet these responsibilities
  • Shares how Questionmark helps meet the responsibilities
  • Comments on how the GDPR, by pushing for accuracy of personal data, might encourage more use of valid, reliable and trustworthy assessments and so benefit us all

The white paper is useful reading for anyone who delivers tests and exams to people in Europe – whether using Questionmark technology or not. Although we hope it will be helpful, like all our blog articles and white papers, this article and the white paper are not a substitute for legal advice specific to your organization’s circumstances. You can see and download all our white papers at www.questionmark.com/learningresources and you can directly download this white paper here.

Six tips to increase reliability in competence tests and exams

Posted by John Kleeman

Reliability (how consistent an assessment is in measuring something) is a vital criterion on which to judge a test, exam or quiz. This blog post explains what reliability is and why it matters, and gives a few tips on how to increase it when using competence tests and exams within regulatory compliance and other work settings.

What is reliability?

Picture of a kitchen scale

An assessment is reliable if it measures the same thing consistently and reproducibly.

If you were to deliver an assessment with high reliability to the same participant on two occasions, you would be very likely to reach the same conclusions about the participant’s knowledge or skills. A test with poor reliability might result in very different scores across the two instances.

It’s useful to think of a kitchen scale. If the scale is reliable, then when you put a bag of flour on the scale today and the same bag of flour on it tomorrow, it will show the same weight. But if the scale is not working properly and is not reliable, it could give you a different weight each time.

Why does reliability matter?

Just like a kitchen scale that doesn’t work, an unreliable assessment does not measure anything consistently and cannot be trusted as a measure of competence.

As well as reliability, it’s also important that an assessment is valid, i.e. measures what it is supposed to. Continuing the kitchen scale metaphor, a scale might consistently show the wrong weight; in such a case, the scale is reliable but not valid. To learn more about validity, see my earlier post Six tips to increase content validity in competence tests and exams.

How can you increase the reliability of your assessments?

Here are six practical tips to help increase the reliability of your assessment:

  1. Use enough questions to assess competence. Although you need a sensible balance to avoid tests being too long, reliability increases with test length. In their excellent book, Criterion-Referenced Test Development, Shrock and Coscarelli suggest a rule of thumb of 4-6 questions per objective, with more for critical objectives. You can also get guidance from an earlier post on this blog, How many questions do I need on my assessment?
  2. Have a consistent environment for participants. For test results to be consistent, it’s important that the test environment is consistent – try to ensure that all participants have the same amount of time to take the test and a similar environment. For example, if some participants are taking the test in a hurry in a public and noisy place while others are taking it at leisure in their office, this could impact reliability.
  3. Ensure participants are familiar with the assessment user interface. If a participant is new to the user interface or the question types, then they may not show their true competence due to the unfamiliarity. It’s common to provide practice tests to participants to allow them to become familiar with the assessment user interface. This can also reduce test anxiety, which influences reliability too.
  4. If using human raters, train them well. If you are using human raters, for example in grading essays or in observational assessments that check practical skills, make sure to define your scoring rules very clearly and as objectively as possible. Train your observers/raters, review their performance, give practice sessions and provide exemplars.
  5. Measure reliability. There are a number of ways of doing this, but the most common is to calculate what is called “Cronbach’s Alpha”, which measures internal consistency reliability (the higher, the better). It’s particularly useful if all questions on the assessment measure the same construct. You can easily calculate this for Questionmark assessments using our Test Analysis Report; a minimal sketch of the calculation appears after this list.
  6. Conduct regular item analysis to weed out ambiguous or poorly performing questions. Item analysis is an automated way of flagging weak questions for review and improvement. Questions that are developed through sound procedures, well crafted and unambiguously worded are more likely to discriminate well and so contribute to a reliable test. Running regular item analysis is the best way to identify poorly performing questions. If you want to learn more about item analysis, I recently gave a webinar on “Item Analysis for Beginners”, and you can access the recording of it here.
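To illustrate tip 5, here is a minimal sketch of how Cronbach’s Alpha could be computed from a matrix of item scores. The data is invented and the NumPy-based implementation is my own assumption; Questionmark’s Test Analysis Report calculates this for you, so the sketch is only to show what the statistic is doing.

```python
import numpy as np

def cronbachs_alpha(scores):
    """Internal consistency reliability for a participants x items score matrix."""
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item's scores
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of participants' total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Example: 5 participants x 4 items, scored 1 (correct) or 0 (wrong).
responses = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(round(cronbachs_alpha(responses), 2))  # about 0.79 for this invented data
```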


I hope this blog post reminds you why reliability matters and gives some ideas on how to improve reliability. There is lots more information on how to improve reliability and write better assessments on the Questionmark website – check out our resources at www.questionmark.com/learningresources.

Judgement is at the Heart of nearly every Business Scandal: How can we Assess it?

Posted by John Kleeman
How does an organization protect itself from serious mistakes and resultant corporate fines?

An excellent Ernst & Young report on risk reduction explains that an organization needs rules and that they are immensely important in defining the parameters in which teams and individuals operate. But the report suggests that rules alone are not enough; it’s how they are adopted by people when making decisions that matters. Culture is a key part of such decision making. And ultimately, when things go wrong, “judgement is at the heart of nearly every business scandal that ever occurred”.

Clearly judgement is important for almost every job role, not just to prevent scandals but to improve results. But how do you measure it? Is it possible to test individuals to identify how they would react to dilemmas and what judgement they would apply? And is it possible to survey an organization to discover what people think their peers would do in difficult situations? One answer to these questions is that you can use Situational Judgement Assessments (SJAs) to measure judgement, both for individuals and across an organization.

Questionmark has published a white paper on Situational Judgement Assessments, written by myself and Eugene Burke. The white paper describes how to assess judgement where you:

  1. Identify job roles and competencies or aspects of that role in your organization or workforce where judgement is important.
  2. Identify dilemmas which are relevant to your organization, each of which requires a choice to be made, where that choice is linked to the relevant job role.
  3. Build questions based on the dilemmas which ask someone to select from the choices – SJA (Situational Judgement Assessment) questions.

There are two ways of presenting such questions, either to survey someone or to assess individuals on their judgement.

  • You can present the dilemma and survey your workforce on how they think others would behave in such a situation. For example: “Rate how you think people in the organization are likely to behave in a situation like this. Use the following scale to rate each of the options below: 1 = Very Unlikely 2 = Unlikely 3 = Neutral 4 = Likely 5 = Very Likely”.
  • You can present the dilemma and test individuals on what they personally would do in such a situation, for example as shown in the screenshot below.

You work in the back office in the team approving new customers, ensuring that the organization’s procedures have been followed (such as credit rating and know your customer). Your manager is away on holiday this week. A senior manager in the company (three levels above you) comes into your office and says that there is an important new customer who needs to be approved today. They want to place a big order, and he can vouch that the customer is good. You review the customer details, and one piece of information required by your procedures is not present. You tell the senior manager and he says not to worry, he is vouching for the customer. You know this senior manager by reputation and have heard that he got a colleague fired a few months ago when she didn’t do what he asked. You would:

  A. Take the senior manager’s word and approve the customer
  B. Call your manager’s cellphone and interrupt her holiday to get advice
  C. Tell the manager you cannot approve the customer without the information needed
  D. Ask the manager for signed written instructions to override standard procedures to allow you to approve the customer

You can see this question “live” with other examples of SJA questions in one of our example assessments on the Questionmark website at www.questionmark.com/go/example-sja.

Once you deliver such questions, you can easily report on the results segmented by attributes of participants (such as business function, location and seniority as well as demographics such as age, gender and tenure). Such reports can help indicate whether compliance will be acted out in the workplace, evaluate where compliance professionals need to focus their efforts and measure whether compliance programs are gaining traction.
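As a rough illustration of that kind of segmentation (not Questionmark’s own reporting, and with invented data and column names), a minimal pandas sketch might look like this:

```python
import pandas as pd

# Invented SJA results with participant attributes.
results = pd.DataFrame({
    "function": ["Sales", "Sales", "Back office", "Back office", "Compliance"],
    "location": ["London", "Paris", "London", "Paris", "London"],
    "sja_score": [62, 70, 85, 80, 92],
})

# Average judgement score per business function, highlighting where compliance effort is most needed.
print(results.groupby("function")["sja_score"].mean().sort_values())
```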

SJAs can be extremely useful as a tool in a compliance programme to reduce regulatory risk. If you’re interested in learning more about SJAs, read Questionmark’s white paper “Assessing for Situational Judgment”, available free (with registration) at https://www.questionmark.com/sja-whitepaper.

Item Analysis for Beginners – Getting Started

Posted by John Kleeman
Do you use assessments to make decisions about people? If so, then you should regularly run Item Analysis on your results.  Item Analysis can help find questions which are ambiguous, mis-keyed or which have choices that are rarely chosen. Improving or removing such questions will improve the validity and reliability of your assessment, and so help you use assessment results to make better decisions. If you don’t use Item Analysis, you risk using poor questions that make your assessments less accurate.

Sometimes people can be fearful of Item Analysis because they are worried it involves too much statistics. This blog post introduces Item Analysis for people who are unfamiliar with it, and I promise no maths or stats! I’m also giving a free webinar on Item Analysis with the same promise.

An assessment contains many items (another name for questions) as figuratively shown below. You can use Item Analysis to look at how each item performs within the assessment and flag potentially weak items for review. By keeping only stronger questions in the assessment, the assessment will be more effective.

Picture of a series of items with one marked as being weak

Item Analysis looks at the performance of all your participants on the items, and calculates how easy or hard people find each item (“item difficulty” or “p-value”) and how well the scores on each item correlate with the scores on the assessment as a whole (“item discrimination” or item-total correlation). A minimal sketch of these two statistics appears after the list below. Some of the problematic questions that Item Analysis can identify are:

  • Questions almost all participants get right, and so which are very easy. You might want to review these to see if they are appropriate for the assessment. See my earlier post Item Analysis for Beginners – When are very Easy or very Difficult Questions Useful? for more information.
  • Questions which are difficult, where a lot of participants get the question wrong. You should check such questions in case they are mis-keyed or ambiguous.
  • Multiple choice questions where some choices are rarely picked. You might want to improve such questions to make the wrong choices more plausible.
  • Questions where there is a poor correlation between getting the question right and doing well on the assessment as a whole. For example, Item Analysis will flag questions that high-performing participants perform poorly on. You should look at such questions in case they are ambiguous, mis-keyed or off-topic.
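To make the two statistics concrete, here is a minimal sketch, with invented data, of how item difficulty and a simple item-total correlation could be computed and weak items flagged. Real reports, including Questionmark’s, use more refined statistics and thresholds, so the 0.2 cut-off here is only an illustrative assumption.

```python
import numpy as np

# Rows = participants, columns = items; 1 = correct, 0 = wrong (invented data).
responses = np.array([
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
])

total_scores = responses.sum(axis=1)
for i in range(responses.shape[1]):
    difficulty = responses[:, i].mean()  # p-value: proportion of participants answering correctly
    discrimination = np.corrcoef(responses[:, i], total_scores)[0, 1]
    flag = "review" if discrimination < 0.2 else "ok"
    print(f"item {i + 1}: p = {difficulty:.2f}, discrimination = {discrimination:.2f} -> {flag}")
```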

There is a huge wealth of information available in an Item Analysis report, and assessment experts will delve into the report in detail. But much of the key information in an Item Analysis report is useful to anyone creating and delivering quizzes, tests and exams.

The Questionmark Item Analysis report includes a graph which plots the difficulty of items against their discrimination, as in the example below. It flags questions by marking them amber or red if they fall into categories which may need review. For example, in the illustration below, four questions are marked in amber as having low discrimination and so are potentially worth looking at.

Illustration of Questionmark item analysis report showing some questions green and some amber

If you are running an assessment program and not using Item Analysis regularly, this throws doubt on the trustworthiness of your results. By using it to identify and improve weak questions, you should be able to improve your validity and reliability.

Item Analysis is surprisingly effective in practice. I’m one of the team responsible at Questionmark for managing our data security test which all employees have to take annually to check their understanding of information security and data protection. We recently reviewed the test and ran Item Analysis and very quickly found a question with poor stats where the technology had changed but we’d not updated the wording, and another question where two of the choices could be considered right, which made it hard to answer. It made our review faster and more effective and helped us improve the quality of the test.

If you want to learn a little more about Item Analysis, I’m running a free webinar on the subject “Item Analysis for Beginners” on May 2nd. You can see details and register for the webinar at https://www.questionmark.com/questionmark_webinars. I look forward to seeing some of you there!


Seven Ways Assessments Fortify Compliance

Posted by John Kleeman
Picture of a tablet being used to take an assessment with currency symbols adjacent

Why do most of the world’s banks, pharmaceutical companies, utilities and other large companies use online assessments to test the competence of their employees?

It’s primarily because compliance fines around the world are high and assessments reduce the risk of regulatory compliance failures. Assessments also give protection to the organization in the event of an individual mis-step, by proving that the organization had checked the individual’s knowledge of the rules prior to the mistake.

Here are seven reasons companies use assessments from my experience:

1. Regulators encourage assessments 

Some regulators require companies to test their workforce regularly. For example the US FDIC says in its compliance manual:

“Once personnel have been trained on a particular subject, a compliance officer should periodically assess employees on their knowledge and comprehension of the subject matter”

And the European Securities and Markets Authority says in its guidelines for assessment of knowledge and competence:

“ongoing assessment will contain updated material and will test staff on their knowledge of, for example, regulatory changes, new products and services available on the market”

Other regulators focus more on companies ensuring that their workforce is competent, rather than specifying how companies ensure it, but most welcome clear evidence that personnel have been trained and have shown understanding of the training.

People sitting at desks with computers taking tests

2. Assessments demonstrate commitment to your workforce and to regulators

Many compliance errors happen because managers pay lip service to following the rules but indicate in their behavior they don’t mean it. If you assess all employees and managers regularly, and require additional training or sanctions for failing tests, it sends a clear message to your workforce that knowledge and observance of the rules is genuinely required.

Some regulators also take an organization’s commitment to compliance into account when setting the level of fines, and may reduce fines if there is serious evidence of compliance activities, of which assessments can be a useful part. For example, the German Federal Court recently ruled that fines should be lower if there is evidence of effective compliance management.

3. Assessments find problems early

Online assessments are one of the few ways in which a compliance team can touch all employees in an organization. You can see results by team, department, location or individual, identify who understands what, and focus on weak areas to improve. There is no better way to reach all employees.

4. Assessments document understanding after training

Many regulators require training to be documented. Giving someone an assessment after training doesn’t just confirm that he or she attended the course; it confirms they understood the training.

5. Assessments increase retention of knowledge and reduce forgetting

Can you remember everything you learned? Of course, none of us can!

There is good evidence that quizzes and tests increase retention and reduce forgetting. This is partly because people study for tests and so remind themselves of the knowledge they learned, which helps retain it. And it is partly because retrieving information in a quiz or test makes it easier to retrieve the same information in future, and so more likely that it can be applied in practice when needed.

6. By allowing testing out, assessments reduce the time and cost of compliance training

Take test. If pass, skip training. Otherwise do training.

Many organizations permit employees to “test out” of compliance training. People can take a test and if they demonstrate good enough knowledge, they don’t need to attend the training. This concentrates training resources and employee time on areas that are needed, and avoids demoralizing employees with boring compliance training repeating what they already know.

7. Assessments reduce human error which reduces the likelihood of a compliance mis-step

Many compliance failures arise from human error. Root cause analysis of human error suggests that a good proportion of errors are caused by people not understanding training, training being missing or people not following procedures. Assessments can pick up and prevent mistakes caused by people not understanding what they should do or how to follow procedures, and so reduce the risk of error.


If you are interested in learning more about the reasons online assessments mitigate compliance risk, Questionmark are giving a webinar “Seven Ways Assessments Fortify Compliance” on April 11th. To register for this or our other free webinars, go to www.questionmark.com/questionmark_webinars.