Ten tips on recommended assessment practice – from San Antonio, Texas

Posted by John Kleeman

One of the best parts of Questionmark user conferences is hearing about good practice from users and speakers. I shared nine tips after our conference in Barcelona, but Texas has to be bigger and better (!), so here are ten things I learned last week at our conference in San Antonio.

1. Document your decisions and processes. I met people in San Antonio who’d taken over programmes from colleagues. They valued all the documentation on decisions made before their time and sometimes wished for more. I encourage you to document the piloting you do, the rationale behind your question selection, item changes and cut scores. This will help future colleagues and also give you evidence if you need to justify or defend your programmes.

2. Pilot with non-masters as well as masters. Thanks to Melissa Fein for this tip. Some organizations pilot new questions and assessments just with “masters”, for example the subject matter experts who helped compile them. It’s much better if you can pilot with a wider sample that includes participants who are not experts/masters. That way you get better item analysis data to review, and you will also get more useful comments about the items.

3. Think about the potential business value of OData. It’s easy to focus on the technology of OData, but it’s better to think about the business value of the dynamic data it can provide you. Our keynote speaker, Bryan Chapman, made a powerful case at the conference about getting past the technology. The real power is in working out what you can do with your assessment data once it’s free to connect with other business data. OData lets you link assessment and business data to help you solve business problems.

4. Use item analysis to identify low-performing questions. The most frequent and easiest use of item analysis is to identify low-performing questions. Many Questionmark customers use it regularly to identify questions that are too easy, too hard or not sufficiently discriminating. Once you identify these questions, you can modify or remove them depending on what your review finds. This is an easy win and makes your assessments more trustworthy.

5. Retention of learning is a challenge and assessments help. Many people shared that retention was a key challenge. How do you ensure your employees retain compliance training to use when they need it? How do you ensure your learners retain their learning beyond the final exam? There is a growing realization that using Questionmark assessments can significantly reduce the forgetting curve.

6. Use performance data to validate and improve your assessments. I spoke to a few people who were looking at improving their assessments and their selection procedure by tracking back and connecting admissions or onboarding assessments with later performance. This is a rich vein to mine.

7. Topic feedback and scores. Topic scores and feedback are actionable. If someone gets an item wrong, it might just be a mistake or a misunderstanding. But if someone is weak in a topic area, you can direct them to remediation. A lot of organizations have great success dividing assessments into topics, then giving feedback and analyzing results by topic.

8. Questionmark Community Spaces is a great place to get advice. Several users shared that they’d posed a question or problem in the forums there and got useful answers. Customers can access Community Spaces here.

9. The Open Assessment Platform is real. We promote Questionmark as the “Open Assessment Platform,” allowing you to easily link Questionmark to other systems, and it’s not just marketing! As one presenter said at the conference, “The beauty of using Questionmark is you can do it all yourself.” If you need to build a system that includes assessments, check out the myriad ways in which Questionmark is open.

10. Think of your Questionmark assessments like a doctor thinks of a blood test. A doctor relies on a blood test to diagnose a patient. By using Questionmark’s trustable processes and technology, you can start to think of your assessments in a similar light, and rely on your assessments for business value.

I hope some of these tips might help you get more business value out of your assessments.
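To make tip 4 a little more concrete, here is a minimal sketch of the kind of calculation that sits behind item analysis. It is not Questionmark's own implementation: the function names, the sample data and the thresholds (0.9, 0.2 and 0.2) are all assumptions chosen for illustration. Difficulty is taken as the proportion of participants who answered an item correctly, and discrimination as the difference in that proportion between the top-scoring and bottom-scoring halves of the group.

```python
def item_analysis(responses):
    """responses: one list of 0/1 item scores per participant (hypothetical format)."""
    n_items = len(responses[0])
    # Rank participants by total score, then compare the bottom and top halves.
    ranked = sorted(responses, key=sum)
    half = len(ranked) // 2
    bottom, top = ranked[:half], ranked[-half:]
    stats = []
    for i in range(n_items):
        # Difficulty: proportion of all participants who got item i right.
        difficulty = sum(r[i] for r in responses) / len(responses)
        # Discrimination: top-half proportion correct minus bottom-half proportion.
        discrimination = (sum(r[i] for r in top) - sum(r[i] for r in bottom)) / half
        stats.append((difficulty, discrimination))
    return stats


def flag_items(stats, too_easy=0.9, too_hard=0.2, min_discrimination=0.2):
    """Return {item index: reasons} for items worth reviewing.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    flags = {}
    for i, (difficulty, discrimination) in enumerate(stats):
        reasons = []
        if difficulty > too_easy:
            reasons.append("too easy")
        if difficulty < too_hard:
            reasons.append("too hard")
        if discrimination < min_discrimination:
            reasons.append("low discrimination")
        if reasons:
            flags[i] = reasons
    return flags


# Made-up pilot data: six participants, three items.
pilot = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
    [1, 0, 0],
]
print(flag_items(item_analysis(pilot)))
```

A flagged item is only a prompt for review. As the tip says, whether you modify or remove it depends on what the review finds.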

Evidence that topic feedback correlates with improved learning

Posted by John Kleeman

It seems obvious that topic feedback helps learners, but it’s great to see some evidence!

Here is a summary of a paper, “Student Engagement with Topic-based Facilitative Feedback on e-Assessments” (see here for full paper) by John Dermo and Liz Carpenter of the University of Bradford, presented at the 2013 International Computer Assisted Assessment conference.

Dermo and Carpenter  delivered a formative assessment in Questionmark Perception over a period of 3 years to 300 students on an undergraduate biology module.  All learners were required to take the assessment once, and were allowed to re-take it as many times as they wanted. Most took the test several times. The assessment didn’t give question feedback, but gave topic feedback on the 11 main topic areas covered by the module.

The intention was for students to use the topic feedback as part of their revision and study to diagnose weaknesses in their learning: the comments provided might be able to direct students in their learning. The students were encouraged to incorporate this feedback into their study planners and to take the test repeatedly, in the expectation that students who engage with their feedback and are “mindful” of their learning will benefit most.
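As a rough illustration of the idea of topic feedback (and not a description of how the Bradford assessment or Questionmark Perception works), here is a minimal sketch that rolls question results up into topic scores and lists the weakest topics first. The topic names, the data format and the 0.6 threshold are all invented for the example.

```python
from collections import defaultdict


def topic_scores(results):
    """results: (topic, is_correct) pairs from one attempt (hypothetical format)."""
    totals = defaultdict(lambda: [0, 0])  # topic -> [number correct, number asked]
    for topic, is_correct in results:
        totals[topic][0] += int(is_correct)
        totals[topic][1] += 1
    return {topic: correct / asked for topic, (correct, asked) in totals.items()}


def weakest_topics(scores, threshold=0.6):
    """Topics scoring below the (assumed) threshold, weakest first."""
    weak = [topic for topic, score in scores.items() if score < threshold]
    return sorted(weak, key=lambda topic: scores[topic])


# Made-up attempt covering three topics.
attempt = [
    ("Cells", True), ("Cells", True),
    ("Genetics", False), ("Genetics", True),
    ("Enzymes", False), ("Enzymes", False),
]
scores = topic_scores(attempt)
print(weakest_topics(scores))
```

The point of the design is that the learner sees where to focus revision rather than which individual questions were missed.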

Here is an example end of test feedback screen.

Assessment Feedback screen showing topic feedback

As you can see, learners achieved “Distinction”, “Merit”, “Pass” or “Fail” for each topic. They were also given a topic score and some guidance on how to improve. The authors then correlated time spent on the tests, questions answered and distribution of taking the test over time with each student’s score on the end-of-module summative exam. They found a correlation between taking the test and doing well on the exam. For example, the correlation coefficient between the number of attempts on the formative assessment and the score on the summative exam was 0.29 (Spearman rank-order correlation, p<0.01).
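For readers unfamiliar with the statistic, a Spearman rank-order correlation converts each variable to ranks and then computes an ordinary (Pearson) correlation on those ranks. The sketch below shows that calculation in plain Python. The attempt counts and exam scores are made up for illustration and are not the study's data, and the function names are my own.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: formative attempts and summative exam scores per student.
attempts = [1, 2, 2, 4, 5, 7]
exam_scores = [48, 61, 55, 58, 72, 70]
print(round(spearman(attempts, exam_scores), 2))
```

A value near 1 means students with more attempts tend to rank higher on the exam; a value of 0.29, as in the paper, indicates a positive but fairly modest association.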

You can see some of their results below, with learners divided into a top, middle and bottom scoring group on the summative exam. This shows that the top scoring group answered more questions, spent more time on the test, and spread the effort over a longer period of time.

Clustered bar charts showing differences between top middle and bottom scoring groups on the dependent variables time, attempts, and distribution

The researchers also surveyed the learners, 82% of whom agreed or strongly agreed that “I found the comments and feedback useful”. Many students also pointed out that the assessment and feedback let them focus their revision time on the topics that needed most attention. For example, one student said:

“It showed clearly areas for further improvement and where more work was needed.”

There could be other reasons why learners who spent time on the formative assessments did well on the summative exam: they might, for instance, have been more diligent in other things. So this research offers evidence of correlation, not proof of cause and effect. However, it does provide evidence pointing to topic feedback being useful and valuable in improving learning by telling learners which areas they are weak in and need to work on more. This seems likely to apply to the world of work as well as to higher education.

The impact of feedback on learning and retention

Posted by Joan Phaup

Each year at the Questionmark Users Conference we like to include at least one breakout session relating to cognitive learning research – and 2013 is no exception.

John Kleeman, Questionmark’s founder and chairman, takes a special interest in learning research and has been focusing lately on the role feedback plays in improving the value of quizzes and tests.

John will lead a best practices session when we meet in Baltimore March 3 – 6, on Assessment Feedback – What Can We Learn from Psychology Research?

I spent a few minutes asking John about his presentation.

John Kleeman

What research have you been following on the effects of feedback on learning and retention?

My main role is as chairman of Questionmark, but I keep an active eye on relevant research, and I follow a number of researchers who are looking into how learning and retention work – and I’m particularly interested in how assessments fit into that. For example, I’ve been following Professor Roddy Roediger at Washington University in St Louis, Missouri and several of his colleagues across the U.S.

(Click here to read one of John’s interviews with Professor Roediger.)

What would you say are the key findings from this research?

What we kind of know but don’t always put into practice is that we forget a surprising amount of what we learn. People know about the forgetting curve as an idea, but don’t always think it applies to them! We think we are going to be better at remembering things later than we actually are. A quiz or test can force you into practicing retrieving and that makes it more likely for things to stick in your mind.

When you learn something – whether in a formal or informal context – you won’t remember a lot of it in a month or six months. Taking a quiz or test helps you retain that learning by providing retrieval practice, which slows the forgetting curve – and quizzes with feedback slow it more effectively than quizzes without feedback.

Will you discuss topic feedback as well as question feedback?

A lot of the research covers question feedback because it’s very easy to measure how well people do on a specific fact. But there is also evidence about topic feedback, and yes, I will be covering topic feedback as well as question feedback.

What would you like your audience to take away from your session?

I aim to practice what I preach, so I will use interactive techniques to help people remember what I talk about! I don’t want just to provide theory: I also want to give actionable ideas that people can apply to their Questionmark assessments to improve retention.

I’d like to add that I’ve found from talking with customers that the conference is a fantastic place to learn. People who come to the conference get a lot of formal learning – for instance by presentations from assessment experts and Questionmark staff who explain effective ways to use our technologies – but they also get a lot of informal learning from interacting with other users. I’m especially looking forward to presentations from our expert customers. Some of our case study presenters have been using our software for many years and have a lot of experience and wisdom to share. So I think I’ll learn a lot from those presentations myself!

You can save $100 if you register for the conference by January 18th. Check out the conference agenda and sign up soon!


Feeding back from eAssessment Scotland

Posted by Steve Lay

eAssessment Scotland is an annual event hosted by the University of Dundee in Scotland.

This year’s conference had a very clear theme: Feeding Back, Forming the Future. I have to say that the programme was managed very well to fit with this theme and that the theme also fits well with the current mood of the wider community. For example, in the UK as a whole the JISC have an ongoing programme on assessment and feedback, and this event provided an opportunity for some of those projects to report on their progress.

I do find that “feedback” can be a very general term. In the opening keynote, Professor David Boud of the University of Technology Sydney provided an analysis of the subject through a 3-generation model of feedback. At one point he encouraged us to “position feedback as part of learning and not as an adjunct to assessment”.

I sensed that assessment was being used in an “Assessment of Learning” sense here. This contrasts with “Assessment for Learning”; these phrases are simpler ways of expressing the basic idea behind summative and formative assessment respectively. It is the latter which generates the type of feedback that could potentially meet the challenge posed by Dr Steve Draper of the University of Glasgow: What If Feedback Only Counted When it Changed the Learner?

From the tone of the discussion at the conference, I do sense that the higher-education community is trying hard to adapt to the new perceptions of formal, informal and experiential learning reflected in the 70:20:10 model of education and development — by continuing to embrace the value of formal learning while adopting other modes of learning.

The 10% is sometimes summarised as being the part of our learning effected by formal courses (and reading). Feedback is reserved for the 20% where we learn from our peers. Many of the presentations were about embracing social systems to attempt to exploit these modes of feedback.

Clearly, assessment can have an important role to play in assessment for learning but I took away the impression that this community sometimes needs reminding that understanding the purpose of an assessment is vital to its success. Combining assessment for learning and assessment of  learning may not be fruitful.

Professor Roddy Roediger on applying the retrieval practice effect to creating and administering assessments

Posted by John Kleeman

In the first part of this interview, I asked Professor Roediger to explain how quizzes and tests give retrieval practice that helps you learn.  Now, he moves on to giving some practical advice on how to use this effectively when creating and administering assessments.

Roddy Roediger

Is it better to have formal quizzes or self-retrieval practice?

They both work great, but there is a puzzle inherent in using self-testing types of retrieval practice.

In self-testing, students can base their studying on their self-knowledge. That is, we think we can accurately judge what we know and don’t know, so we will practice the material we don’t know until we know it. However, one outcome that cognitive psychologists have repeatedly shown is that often our own judgments of learning are not a sure guide to what we really know (as measured on a test).

Jeff Karpicke at Purdue University has done experiments comparing learning under the students’ own control with learning under a computer algorithm that imposes the same schedule of testing on all students. You might imagine that if you let people test themselves by giving them a stack of cards and saying, “Just learn this material until you are sure you know it,” they could outpace a computer program. After all, the student has awareness of his or her knowledge states to guide study whereas the computer does not. The computer might just have each person practice the fact three times. Still, what Karpicke finds is that when people are tested some days later, the computer schedule leads to better retention than when people are permitted to study according to their own schedule. The computer schedule that tests everyone on everything three times leads to better long-term learning.

Karpicke finds that when you put the ability to restudy and test under the students’ control, they usually do not test themselves enough. They retrieve something once and think they have it permanently without realizing that repeated testing is the key to long-term memory. Getting it once isn’t enough. You need repeated retrieval practice for something critical. If you retrieve it many times, the fact becomes much more easily retrieved in the future. So people who use retrieval practice often don’t do it that effectively, because they stop before they should.

How many times should one get people to retrieve things, and how soon after learning?

We’re just exploring this area – it probably depends on what kinds of material you’re trying to learn. Mary Pyc and Katherine Rawson at Kent State University showed that for simple things like foreign language vocabulary, retrieving about 5 to 7 times is about right — if you test people a week later you wouldn’t see much difference between having tested people 7 times or 10 times, but you do see gains going up to the range of 5-7 times. After that it just levels off. But most people would only practice once or twice, so the idea of going up to 5 or 7 retrievals seems like too much to many people. Of course, to keep knowledge at your mental fingertips, you would need continued spaced retrieval practice, too.

So would you advise people training sales people or factory workers to give people retrieval practice after training?

Yes, absolutely. Giving quizzes, tests or other retrieval practice will help people retain learning. Again, spaced retrieval is the key.

Besides too little retrieval practice, another mistake trainers sometimes make (in sports, in education, in the military, in industry) is to train people using blocks of practice on the same task. If you have 5 tasks you have to train a person to do, often teachers will train people up on task one and get them to practice until they get to 100% performance. They then train task two and bring them to 100%, and so forth. That’s fine — you get students up to speed quickly with this kind of training — but when you test them after a long delay, they also show rapid forgetting. The better training technique for long-term retention is to interleave the practice; that is, practice task one for a little while, then go to task three, then go to two and so forth.  That slows down initial learning and frustrates learners a bit; it frustrates teachers too because they don’t see a great initial improvement. However, many studies show that when you provide this interleaved practice, where you have to switch back and forth, it leads to much better retention later on. When you are practicing retrieval, or even if you’re practicing a motor skill, it’s better to skip around between tasks than to train on tasks one by one.
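The difference between the two training schedules can be sketched in a few lines. This is only an illustration of the ordering idea, under the assumption that each task gets the same number of practice slots; the function names and task labels are invented, and a real schedule might also vary the task order from round to round.

```python
def blocked_schedule(tasks, reps):
    """Blocked practice: finish all repetitions of one task before the next."""
    return [task for task in tasks for _ in range(reps)]


def interleaved_schedule(tasks, reps):
    """Interleaved practice: switch tasks after every repetition."""
    return [task for _ in range(reps) for task in tasks]


tasks = ["task 1", "task 2", "task 3"]
print(blocked_schedule(tasks, 2))
print(interleaved_schedule(tasks, 2))
```

Both schedules contain exactly the same practice. Only the order differs, which is what makes interleaving feel slower at first while helping retention later.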

Is that an example of desirable difficulty?

Yes! (A desirable difficulty is a difficulty introduced during learning that slows initial learning but increases long-term retention and transfer.)

That’s really interesting… How important and useful is feedback?

Feedback is critical in the testing world. Taking a test when you can’t remember something is of no use, but if you get feedback, that can help a lot. For anything difficult, especially for tests after a delay, you need feedback. Feedback never hurts!

If you’re at a cocktail party and you can’t remember someone’s name after being introduced it’s hard to get feedback. But in most cases — if you’re a sales person learning your products or a student learning about the muscular system — you can arrange to get feedback, and I always advise people to get feedback. We’ve never seen studies where it hurts; it only helps.

In some cases, such as with multiple choice and true/false tests, we have seen some negative consequences of delivering tests without feedback. Suppose you answer a question with choice D, and you’re sure that is correct. Well, suppose it is actually wrong. We know people learn from answering questions on tests, so if you’ve responded and made an error, you will have stamped that error into memory. Because retrieval enhances learning, you will continue remembering that error. So especially in tests like true/false and multiple choice, where error is invited, feedback is very important. In fact, studies by Andrew Butler of Duke University show that potential negative effects of multiple-choice tests can be completely removed by providing feedback soon after the tests. As long as you give feedback on multiple choice quizzes and tests, they provide learning benefits.

What are you looking into now and what research still needs to be done?

There’s a lot to be done. Although the first experiments showing the retrieval practice or testing effect go back 100 years, it’s only been in the last 10 years that researchers have really begun to dig into the issues and have started exploring the parameters. Many researchers besides me are doing work on retrieval practice. One question being asked in my own lab now, by Adam Putnam, is, “Does it matter how you do the retrieval?” Does it matter if you write down your answer, type it into a computer, say it out loud or just think it? Is thinking the answer as effective as writing or typing it for later retention? Right now, the early evidence is that they’re all about the same… but we’re just beginning this line of work.

You can see more on Professor Roediger’s work at http://psych.wustl.edu/memory/publications/.  Research by Professor Roediger and colleagues is sponsored by the James S. McDonnell Foundation – a foundation left by James McDonnell, the famous aircraft pioneer, who had wide ranging interests including psychology. He is also sponsored by the Cognition and Student Learning Program of the U.S. Department of Education and by Dart Neuroscience.

Control access to surveys but keep the results anonymous

Posted by John Kleeman

When delivering a course evaluation survey or an employee engagement survey, it’s usually best to make the survey anonymous, as this will encourage people to answer candidly and so give you the feedback you want.  But how do you make the results anonymous yet still control who can take the survey and ensure they can only take it once?

Questionmark Perception lets you make a survey anonymous as one of the options when creating the assessment.

Anonymous setting screenshot

What some Questionmark users don’t know is that even if a survey is anonymous, you can still schedule people to it individually, as shown in the diagram below.

All the scheduling capabilities of Questionmark (who can take the survey, when they can do it, limiting to a single attempt) work normally with an anonymous survey. It’s just that the assessment delivery software doesn’t store the person’s name or any other identifying information with the results. So you can limit attempts, you can use Email Broadcast to send out an invitation to participants and you can even remind people who’ve not taken the survey to take it. But in the results database, names are replaced with the text “Anonymous,” and so none of the answers and comments your participants give will be identifiable.
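The general idea of controlling access while keeping results anonymous can be sketched as two separate records: one that knows who has taken the survey, and one that holds the answers with no identity attached. The code below is a simplified illustration of that principle only. It is not how Questionmark Perception is implemented, and the class, method and field names are hypothetical.

```python
class AnonymousSurvey:
    """Toy model: attempts are tracked per person, answers are stored namelessly."""

    def __init__(self, invited, max_attempts=1):
        self.invited = set(invited)
        self.max_attempts = max_attempts
        self.attempts = {}   # participant -> attempt count, kept apart from answers
        self.results = []    # answers only, with the name replaced

    def submit(self, participant, answers):
        if participant not in self.invited:
            raise PermissionError("participant is not scheduled for this survey")
        if self.attempts.get(participant, 0) >= self.max_attempts:
            raise PermissionError("attempt limit reached")
        self.attempts[participant] = self.attempts.get(participant, 0) + 1
        # The stored result carries no link back to the participant.
        self.results.append({"name": "Anonymous", "answers": answers})

    def not_yet_taken(self):
        """People who could be sent a reminder."""
        return sorted(self.invited - set(self.attempts))


survey = AnonymousSurvey(invited=["ana", "ben", "cho"])
survey.submit("ana", {"q1": "Agree", "comment": "More examples, please"})
print(survey.not_yet_taken())
print(survey.results)
```

In this toy model, reminders and attempt limits rely only on the attempt record, while reporting sees only the results list.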

You can also use anonymous surveys in a similar way when running a survey via Single Sign On, from SharePoint or from a Learning Management System. The external system will authenticate the participant, but Perception will not store the identities of people with their results, and so instructors and other reporting users will see the answers and comments as anonymous.

If you are relying on this, you should check it for yourself: take a dummy survey and confirm that you cannot see who submitted the results when reporting. One thing to be aware of is Special fields, which can sometimes contain identifying information. There is a system setting that lets you control whether these are captured for anonymous surveys. (Questionmark Software Support Plan customers can see details here.)

We at Questionmark use this capability ourselves to deliver anonymous surveys to our own employees, and I hope it might be helpful to you, too.