How the sample of participants being tested affects item analysis information


Posted by Greg Pope

Ever think about who took the test when you are interpreting your item analysis report? Maybe you should! Classical Test Theory (CTT) item analysis information is very much based on the sample of participants who took the test.

Hold on a second, what is a sample? And what is the difference between a sample and a population? A sample is a selection from a population. If your population is all 1.5 million people in the United States who will write a college entrance exam in a given year, a sample of this population could be 1,000 people selected based on certain criteria (e.g., age, gender, ethnicity). If we want to beta test questions that we hope to include on an upcoming college entrance exam, it is usually not possible or practical to test all 1.5 million people in the population, so one or more representative samples are selected to beta test the questions.

As I mentioned, the sample of participants taking an assessment has an impact on the difficulty and discrimination statistics you will obtain in your CTT item analysis. For example, if you administered the college entrance exam beta test to a sample of gifted students who are the best and brightest, the Item Analysis Report is going to come back showing that all your questions are easy (p-values close to 1), and you probably won't get very high discrimination statistics. However, we know that the population of people taking college entrance exams is not composed entirely of the best and brightest, so this sample is not an accurate representation of the population (we say the sample is not representative). It would not be wise to build the actual college entrance exam form from the results of only this one sample of bright students, because the item statistics would not reflect the population of students that will be tested.
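To make the sample dependence concrete, here is a minimal Python sketch (illustrative, not Questionmark code) of the two CTT statistics discussed above: item difficulty (the p-value, i.e. the proportion of correct responses) and discrimination (the point-biserial correlation between item score and total score). The response data are invented for illustration.

```python
# Minimal sketch (not Questionmark code): how CTT item statistics
# depend on the sample. Item scores are 1 (correct) / 0 (incorrect).
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """CTT difficulty (p-value): proportion of participants answering correctly."""
    return mean(item_scores)

def point_biserial(item_scores, total_scores):
    """Discrimination: correlation between item score and total test score."""
    n = len(item_scores)
    mi, mt = mean(item_scores), mean(total_scores)
    si, st = pstdev(item_scores), pstdev(total_scores)
    if si == 0 or st == 0:
        return 0.0  # no variance (e.g., everyone correct) -> report 0
    cov = sum((i - mi) * (t - mt) for i, t in zip(item_scores, total_scores)) / n
    return cov / (si * st)

# A representative, mixed-ability sample: the item separates high/low scorers.
item  = [1, 0, 1, 0, 1, 1, 0, 0]
total = [9, 3, 8, 4, 7, 9, 2, 5]
print(item_difficulty(item))                 # 0.5 -> moderate difficulty
print(round(point_biserial(item, total), 2)) # high positive discrimination

# The same item given only to a "gifted" sample: nearly everyone is correct.
gifted = [1, 1, 1, 1, 1, 1, 1, 0]
print(item_difficulty(gifted))               # 0.875 -> item now looks easy
```

The same question yields a very different p-value depending on who answers it, which is exactly the sample dependency described above.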

Using strong sampling methods will help ensure that the statistics you get are appropriate. Typing a search term like "sampling" into your favorite online book store will yield numerous suggestions for some fun reading on this subject. If you don't have the time or inclination for that, start with the obvious: think about the target population of test takers, and if you are beta testing questions, try to obtain samples that reflect that population. In a previous blog post I talked more about beta testing.
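As one concrete example of such a method, here is a short Python sketch of proportionate stratified sampling: the sample is drawn so that each stratum appears in roughly the same proportion as in the population. The "region" attribute and all data here are hypothetical, purely for illustration.

```python
# Illustrative sketch of proportionate stratified sampling (hypothetical data).
import random

def stratified_sample(population, key, n, seed=0):
    """Draw a sample of ~n records whose strata mirror the population's mix."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        # Each stratum contributes in proportion to its share of the population.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 75% urban, 25% rural test takers.
population = [{"id": i, "region": "urban" if i % 4 else "rural"}
              for i in range(1000)]
sample = stratified_sample(population, "region", 100)
print(len(sample))  # 100 people: 75 urban + 25 rural, mirroring the population
```

Real sampling designs consider many more criteria (age, gender, ethnicity, and so on), but the principle is the same: make the sample's make-up match the population's.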

As an aside, Item Response Theory (IRT) advocates will be quick to point out that IRT doesn’t have the same sample dependency challenges as CTT. I’ll discuss that at another time!

Embedding Questionmark Assessments in Wetpaint

Embed a Questionmark Perception survey or quiz inside your Wetpaint page.

  • To see how this would look, see a snapshot of an assessment embedded within a Wetpaint IFrame.
  • Check out this How-to on our developer Web site.

Wetpaint provides social network and wiki hosting services, combining the features of wikis, blogs, forums and social networks to help you create your own social Web site. Embedding an assessment into a Wetpaint page is simple: Wetpaint lets you add HTML to a Web page as a custom widget, so you can embed a Questionmark assessment by adding an IFrame widget to your Wetpaint page.

Questionmark: a new SAP Partner


Posted by John Kleeman

I’m pleased to let you know that Questionmark is now an SAP software solution partner.

As you’ll perhaps have seen from SAP’s advertising, many of the world’s best-run businesses run SAP software. Best-run organizations need their employees to perform, learn and be certified, and it makes a lot of sense for us to integrate Questionmark software with SAP so that people can seamlessly move from SAP user interfaces to Questionmark ones and back again.

I’ve had the privilege of working closely with SAP over the last two years. We’ve developed our Connector for use with SAP Learning Solution and deployed it with customers. I’m now working within Questionmark to encourage deeper links with SAP software, so that organizations using SAP will be able to integrate Questionmark assessments even more easily in the future.

Questionmark partners with many other great companies, but I’m very proud that we are a formal partner with SAP.

SAP and the SAP partner logo are trademarks or registered trademarks of SAP AG in Germany and in several other
countries all over the world.

When and where should I use randomly delivered assessments?


Posted by Greg Pope

I am often asked my psychometric opinion regarding when and where random administration of assessments is most appropriate.

To refresh memories, this is a feature in Questionmark Perception Authoring Manager that allows you to select questions at random from one or more topics when creating an assessment. Rather than administering the same 10 questions to all participants, you can give each participant a different set of questions that are pulled at random from the bank of questions in the repository.

So when is it appropriate to use random administration? I think that depends on the answer to this question: what are the assessment’s stakes and purpose? If the stakes are low and the assessment scores are used to help reinforce information learned, or to give participants a rough sense of how they are doing in an area, I would say that using random administration is defensible. However, if the stakes are medium/high and the assessment scores are used for advancing or certifying participants, I usually caution against random administration. Here are a few reasons why:

  • Expert review of the assessment form(s) cannot be conducted in advance, because each participant gets a unique form. Generally SMEs, psychometricians, and other experts thoroughly review a test form before it goes into live production, to ensure that the form meets difficulty, content and other criteria before being administered to participants in a medium/high stakes context. With randomly administered assessments this advance review is not possible, as every participant receives a different set of questions.
  • There are issues with the calculation of question statistics using Classical Test Theory (CTT). Smaller numbers of participants answer each individual question: rather than all 200 participants answering all 50 questions in a fixed-form test, a randomly administered test generated from a bank of 100 questions may have only a few participants answering each question. As we saw in a previous blog post, sample size has an effect on the robustness of item statistics; with fewer participants taking each question it becomes difficult to have confidence in the stability of the statistics generated.
  • Equivalency of assessment scores is difficult to achieve and prove. An important assumption of CTT is equivalence, or parallelism, of forms. In assessment contexts where more than one form of an exam is administered, a great deal of time is spent ensuring that the forms are parallel in every way possible (e.g., difficulty of questions, blueprint coverage, question types) so that the scores participants obtain are equivalent. With random administration it is not possible to control and verify in advance that the forms are parallel, because the questions are pulled at random. If one participant got 2/10 on a randomly administered assessment and another got 8/10 on the same assessment, it would be difficult to know whether the first participant scored low because they happened, by chance, to draw harder questions, or because they actually did not know the material. Using meta tags can mitigate this issue to some degree (e.g., by randomly administering questions within topics by difficulty ranges and other meta tag data), but it cannot completely guarantee randomly equivalent forms.
  • There are issues with the calculation of test reliability statistics using CTT. Statistics such as Cronbach’s Alpha have trouble with randomly administered assessments: random administration produces a lot of missing data (not all participants answer all questions), which traditional psychometric statistics rarely handle well.
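The thinning of per-item data behind several of these points can be seen with a quick back-of-the-envelope simulation (illustrative Python, with made-up numbers): 200 participants each drawing 10 questions at random from a 100-question bank.

```python
# Rough simulation (made-up numbers): per-item response counts shrink
# when forms are drawn at random instead of being fixed.
import random
from collections import Counter

rng = random.Random(42)
bank = list(range(100))              # a bank of 100 question IDs
counts = Counter()

for _ in range(200):                 # 200 participants
    for q in rng.sample(bank, 10):   # each draws 10 unique random questions
        counts[q] += 1

# Fixed form: every item would be answered by all 200 participants.
# Random forms: each item is answered only ~200 * 10 / 100 = 20 times on
# average, and the least-exposed items get even fewer responses, so their
# p-values and discrimination indices are unstable.
print(sum(counts.values()) / len(bank))       # 20.0 responses per item on average
print(min(counts.values()), max(counts.values()))
```

Twenty responses per item, versus two hundred for a fixed form, is the difference between statistics you can trust and statistics you cannot.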

There are alternatives to random administration, depending on the need. For example, if random administration is being considered as a way to curb cheating, options such as shuffling answer choices and randomizing question presentation order could serve this need, making it very difficult for participants to copy answers from one another.

It is important for an organization to look at its own context to determine what is best. Questionmark provides many options for our customers when it comes to assessment solutions, and we invite them to work with us in adopting the solution that fits.

Embedding Questionmark Assessments in Netvibes

Embed a Questionmark Perception survey or quiz inside your Netvibes page.

  • To see how this would look, see a snapshot of an assessment embedded within a Netvibes IFrame.
  • Check out this How-to on our developer Web site.

Netvibes is a customizable homepage or personal web portal. A key feature of Netvibes is the capability to add widgets, which lets you easily add your IFrame code and embed an assessment in your Netvibes page.

So much to learn at the European Users Conference in October

Posted by Mel Lynch

In just over a month, Questionmark users will gather in Amsterdam for the 2010 Questionmark European Users Conference on October 3-5. We have had a very positive response so far and are really looking forward to seeing everyone in the Netherlands this year!

The program is pretty close to being finalised, and we are happy to have some excellent case study presenters from across Europe. They will be covering the following topics:

  • Capturing Learning Progress Using Questionmark
  • Self Invigilation of Assessments
  • Online Delivery of High-volume Assessments with Web Forms and Questionmark Printing & Scanning
  • Using JQuery to Extend the Functionality of Questionmark Perception
  • Adoption of Questionmark Perception at the Open University Nederland
  • Developing Advanced Item Types
  • Implementing Questionmark Perception the Fast Way at Rotterdam University

The agenda is also packed with best practice and technical training presentations, discussions and drop-in demos. Not to mention the scenic canal cruise scheduled for the Monday Evening Event!

Registration is open until Wednesday the 29th of September, but be sure not to wait too long: places are filling up fast.

Register now or visit the conference Web site for further information.

Hope to see you in Amsterdam!