Three-Day Symposium in New Zealand: Focus on Assessment Best Practices

Posted by Rafael Lami Dozo

I am looking forward to an exciting three-day learning event in Auckland, New Zealand, an Online Assessments Symposium organized by Business Toolbox. The first two days will be devoted to instruction on best practices in creating assessments, and the third will bring together industry experts to share advice about moving assessments online.


Participants take a break from learning at the Online Assessments Symposium in Colombia last month

Last month I presented a similar program in Colombia, and the response was tremendous. A good number of people from both New Zealand and Australia have already registered for the Auckland symposium, and we still have room for a few more. You may opt to register for one or both workshops, but registration will end soon, so I encourage you to act now if you would like to join us. Not only is this a great learning opportunity; it will also give you the chance to network with other learning and assessment professionals in the region.

Here are some of the activities we have planned:

August 10 and 11 (Monday and Tuesday) will feature an intensive 2-day workshop: Creating Assessments that Get Results. This course is designed to instruct and engage you in applying best practices for online assessment design, development and delivery.

On August 12 (Wednesday) we will hold a seminar on Putting Your Assessments Online. The seminar will feature an excellent line-up of industry experts and practitioners who will talk about assessment best practices and their own experiences with moving from paper-based to online assessments.

This is a great opportunity for you to see examples of what other organizations are doing and to network with your industry peers. I hope to see many of you at the symposium!

Please feel free to post your comments and questions below or you can contact me directly.

Research Survey for Test Takers: You Can Help


Posted by Greg Pope

I am working with Dr. Bruno Zumbo, professor at the University of British Columbia, on a research study about the beliefs of people who are waiting to take, or have taken, a certification or licensure examination.

In this initial study we want to document people’s attitudes and beliefs regarding taking these exams as well as issues in the area of certification and licensure testing. This research is designed to help certification and licensing organizations improve high-stakes exams by shedding light on test takers’ perspectives.

To complete our research, we need input from anyone who is planning to take or has already taken a certification or licensing exam. If you are a test taker we thank you in advance for answering a 35-question survey that will take 5 or 10 minutes to complete. This is an opportunity to weigh in on important issues in the testing industry. If you are a test taker, please take the survey!  If you know certification or licensing exam participants, we’d appreciate it if you could encourage them to take it too.

We will report on the results of our research this fall and appreciate your help!

Understanding Common eLearning Standards


Posted by Tom King

I’ve prepared a video podcast which is your introduction to key interoperability standards for elearning. It also serves as my introduction to video podcasts. Your feedback on both the content and the style will be put to use as I continue the series—so please post comments or send email.

The video for Part 1 provides a quick overview of the need for interoperability standards, the names of the key standards, and the types of interoperability they support. Part 1 addresses AICC, ADL SCORM, IEEE LTSC and IMS specifications at a high level. It introduces the concepts of run-time communication, content packaging, and metadata.

I hope you find it a good refresher if you are already somewhat knowledgeable about these standards, and an excellent introduction if you are new to most of this.

Item Analysis Analytics Part 4: The Nitty-Gritty of Item Analysis



Posted by Greg Pope

In my previous blog post I highlighted some of the essential things to look for in a typical Item Analysis Report. Now I will dive into the nitty-gritty of item analysis, looking at example questions and explaining how to use the Questionmark Item Analysis Report in an applied context for a State Capitals Exam.

The Questionmark Item Analysis Report first produces an overview of question performance both in terms of the difficulty of questions and in terms of the discrimination of questions (upper minus lower groups). These overview charts give you a "bird's eye view" of how the questions composing an assessment perform. In the example below we see that we have a range of questions in terms of their difficulty ("Item Difficulty Level Histogram"), with some harder questions (the bars on the left), many average-difficulty questions (the bars in the middle), and some easier questions (the bars on the right). In terms of discrimination ("Discrimination Indices Histogram") we see that we have many questions with high discrimination, as evidenced by the bars being pushed up to the right (more questions on the assessment have higher discrimination statistics).


Overall, if I were building a typical criterion-referenced assessment with a pass score around 50% I would be quite happy with this picture. We have more questions functioning at the pass score point with a range of questions surrounding it and lots of highly discriminating questions. We do have one rogue question on the far left with a very low discrimination index, which we need to look at.
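The two statistics behind these overview histograms are easy to compute from a scored response matrix. Here is a minimal Python sketch; the synthetic data and the top/bottom 27% grouping rule are illustrative assumptions on my part, not Questionmark's actual implementation:

```python
import numpy as np

def item_overview(scores):
    """scores: participants x items matrix of 0/1 scored responses.
    Returns per-item difficulty (p-values) and upper-minus-lower
    discrimination indices."""
    n, k = scores.shape
    difficulty = scores.mean(axis=0)  # proportion correct per item

    # Form upper/lower groups by total score; top and bottom 27%
    # is a common convention (assumption, not Questionmark's rule).
    totals = scores.sum(axis=1)
    order = np.argsort(totals)
    g = max(1, int(round(0.27 * n)))
    lower, upper = order[:g], order[-g:]
    discrimination = scores[upper].mean(axis=0) - scores[lower].mean(axis=0)
    return difficulty, discrimination

# Illustrative data: 200 participants, 20 items, one latent ability
rng = np.random.default_rng(0)
ability = rng.normal(size=200)
demo = (rng.normal(size=(200, 20)) + ability[:, None] > 0).astype(int)
p, d = item_overview(demo)
```

Plotting histograms of `p` and `d` reproduces the shape of the two overview charts described above.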

The next step is to drill down into each question to ensure that each question performs as it should. Let’s look at two questions from this assessment, one question that performs well and one question that does not perform so well.

The question below is an example of a question that performs nicely. Here are some reasons why:

  • Going from left to right, first we see that the “Number of Results” is 175, which is a nice sample of participants to evaluate the psychometric performance of this question.
  • Next we see that everyone answered the question ("Number not Answered" = 0), which means there probably wasn't a problem with people not finishing or finding the questions confusing and giving up.
  • The “P Value Proportion Correct” shows us that this question is just above the pass score where 61% of participants ‘got it right.’ Nothing wrong with that: the question is neither too easy nor too hard.
  • The “Item Discrimination” indicates good discrimination, with the difference between the upper and lower group in terms of the proportion selecting the correct answer of ‘Salem’ at 48%. This means that of the participants with high overall exam scores, 88% selected the correct answer versus only 40% of the participants with the lowest overall exam scores. This is a nice, expected pattern.
  • The "Item Total Correlation" backs the Item Discrimination up with a strong value of 0.40. This means that across all participants who answered the question, the pattern of high scorers getting the question right more often than low scorers holds true.
  • Finally we look at the Outcome information to see how the distracters perform. We find that each distracter pulled some participants, with ‘Portland’ pulling the most participants, especially from the “Lower Group.” This pattern makes sense because those with poor state capital knowledge may make the common mistake of selecting Portland as the capital of Oregon.
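The per-question statistics walked through above are standard classical test theory calculations. This sketch shows the item-total (point-biserial) correlation and the upper/lower distractor breakdown; function names and the 27% grouping fraction are my own illustrative choices, not Questionmark's code:

```python
import numpy as np

def point_biserial(item, totals):
    """Item-total correlation: Pearson r between 0/1 item scores and
    total scores. (Uses the uncorrected total, which includes the item
    itself; a corrected version would subtract the item first.)"""
    return np.corrcoef(item, totals)[0, 1]

def distractor_table(choices, totals, frac=0.27):
    """choices: array of selected option labels, one per participant.
    Returns, for each option, the proportion of the upper and lower
    score groups that selected it."""
    n = len(choices)
    g = max(1, int(round(frac * n)))
    order = np.argsort(totals)
    lower, upper = order[:g], order[-g:]
    return {opt: (float(np.mean(choices[upper] == opt)),
                  float(np.mean(choices[lower] == opt)))
            for opt in np.unique(choices)}
```

For a healthy question like the one above, the keyed answer shows a much higher upper-group proportion than lower-group proportion, and each distracter pulls a share of the lower group.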

The psychometricians, SMEs, and test developers reviewing this question all have smiles on their faces when they see the item analysis for this item.


Next we look at that rogue question that does not perform so well in terms of discrimination: the one we saw in the Discrimination Indices Histogram. When we look into the question we understand why it was flagged:

  • Going from left to right, first we see that the “Number of Results” is 175, which is again a nice sample size: nothing wrong here.
  • Next we see everyone answered the question, which is good.
  • The first red flag comes from the "P Value Proportion Correct," as this question is quite difficult (only 35% of participants selected the correct answer). This is not in and of itself a bad thing, so we can keep it in mind as we move on.
  • The “Item Discrimination” indicates a major problem, a negative discrimination value. This means that participants with the lowest exam scores selected the correct answer more than participants with the highest exam scores. This is not the expected pattern we are looking for: Houston, this question has a problem!
  • The “Item Total Correlation” backs up the Item Discrimination with a high negative value.
  • To find out more about what is going on we delve into the Outcome information area to see how the distracters perform. We find that the keyed-correct answer of Nampa is not showing the expected pattern of upper minus lower proportions. We do, however, find that the distracter "Boise" is showing the expected pattern of the Upper Group (86%) selecting this response option much more than the Lower Group (15%). Wait a second… I think I know what is wrong with this one: it has been mis-keyed! Someone accidentally assigned a score of 1 to Nampa rather than Boise.
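The tell-tale mis-key pattern, a distracter that discriminates better than the keyed answer, can be flagged automatically. A hypothetical sketch (the function name and grouping rule are my assumptions, not a Questionmark feature):

```python
import numpy as np

def flag_possible_miskey(choices, key, totals, frac=0.27):
    """Return the option with the largest upper-minus-lower selection
    difference; if that option is not the keyed answer, the item may
    be mis-keyed, so return it for review. Returns None if the key
    itself discriminates best."""
    n = len(choices)
    g = max(1, int(round(frac * n)))
    order = np.argsort(totals)
    lower, upper = order[:g], order[-g:]
    best, best_diff = None, -np.inf
    for opt in np.unique(choices):
        diff = np.mean(choices[upper] == opt) - np.mean(choices[lower] == opt)
        if diff > best_diff:
            best, best_diff = opt, diff
    return best if best != key else None
```

In the Oregon/Boise example, high scorers cluster on "Boise" while the key is set to "Nampa," so the function would surface "Boise" as the likely intended key.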


No problem: the administrator pulls the data into the Results Management System (RMS), changes the keyed correct answer to Boise, and presto, we now have defensible statistics that we can work with for this question.


The psychometricians, SMEs, and test developers reviewing this question frowned at first, but those frowns turned upside down when they realized it was just a simple mis-keyed question.

In my next blog post I would like to share some observations on the relationship between Outcome Discrimination and Outcome Correlation.

Are you ready for some light relief after pondering all these statistics? Then have some fun with our own State Capitals Quiz.

Questionmark User Group Meetings in September and October

Posted by Joan Phaup

Questionmark customers will be getting together with Questionmark managers in September and October for a series of regional user group meetings. We started organizing these meetings a few years ago in conjunction with our annual series of complimentary Breakfast Briefings that introduce people to the basics of using Questionmark Perception. During user group meetings, our customers get an in-depth look at new product features and have the opportunity to share their views about our products and services. They also have a great time networking with each other and learn a lot in the process.

Here’s the schedule for morning briefings and afternoon user group meetings:

September 1: Boston
September 3: New York
September 24: Chicago
October 6:  Dallas
October 8:  Washington, DC
October 20: Atlanta
October 22: Los Angeles

Online registration for the user group meetings is available here.

People who would like a basic introduction to the use of Perception can sign up for a briefing.

Podcast: Managing Test Data Effectively

Posted by Joan Phaup

End-of-course tests taken by more than 100,000 students per year give the Arkansas Department of Career Education enormous amounts of data to process. Managing and reporting on that data effectively is essential not only for promoting classroom improvement and measuring student performance but also for reporting to government regulatory agencies and acquiring federal Perkins funding.

In this podcast, Karen Chisholm and Keith Peterson from the department's Office of Assessment and Curriculum explain how they administer tests to so many students, determine what data is actionable and then disseminate the right data to the right people in the right form, all of which helps career and technical education in the state keep pace with the needs of industry.

Listen to our conversation: