How much do you know about defensible assessments?

Posted by Julie Delazyn

This quiz is a re-post of a very popular blog entry published by John Kleeman.

Readers told us that it was instructive and engaging to take quizzes on using assessments, and we like to listen to you! So here is the second in our series of re-published quizzes on assessment topics. This one was authored in conjunction with Neil Bachelor of Pure Questions. You can see the first quiz on Cut Scores here.

As always, we regard resources like this quiz as a way of contributing to the ongoing process of learning about assessment. In that spirit, please enjoy the quiz below and feel free to comment if you have any suggestions to improve the questions or the feedback.

Is a longer test likely to be more defensible than a shorter one? Take the quiz and find out. Be sure to look for your feedback after you have completed it!

NOTE: Some people commented on the first quiz that they were surprised to lose marks for getting questions wrong. This quiz uses True/False questions, where it is easy to guess at answers, so we’ve set it to subtract a point for each question you get wrong, to illustrate that this is possible. Negative scoring like this encourages you to answer “Don’t Know” rather than guess; that is particularly helpful in diagnostic tests, where you want participants to be as honest as possible about what they do or don’t think they know.
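
To make that scoring rule concrete, here is a minimal Python sketch of negative marking for True/False questions with a “Don’t Know” option. The answer key, responses, and function name are hypothetical; Questionmark’s actual scoring engine may work differently.

```python
# Hypothetical answer key for a five-question True/False quiz.
ANSWER_KEY = [True, False, True, True, False]

def score_with_penalty(responses):
    """+1 for a correct answer, -1 for a wrong one, 0 for "Don't Know"."""
    total = 0
    for given, correct in zip(responses, ANSWER_KEY):
        if given == "dk":  # "Don't Know": no marks gained or lost
            continue
        total += 1 if given == correct else -1
    return total

# Guessing wrong costs a point, so answering "dk" is the safer choice
# when unsure.
print(score_with_penalty([True, "dk", False, True, False]))  # -> 2
```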

5 Responses to “How much do you know about defensible assessments?”

  1. basdenleco says:

    Should have read the fine print and realised “Don’t Know” was acceptable…

  2. Jane says:

    Interesting quiz – made me think!

  3. Jone says:

    Two of the items (2 and 5) do not appear to have the correct answers —
    On Item 2: Cronbach’s alpha is a measure of internal consistency, not internal SCORE consistency. Cronbach’s alpha measures the consistency (relationship) of the items on a scale (when a test contains multiple scales) or of the items on a test as a whole. Theoretically, Cronbach’s alpha is the average of the reliability coefficients that would be obtained from all possible split-half divisions of the items. (A computational sketch of alpha appears after the comments below.)

    On Item 5: First, tests themselves are not validated; the use of test scores for a particular purpose is what is validated. Second, this item is a drastic over-simplification of what is necessary to defend the use of a test score. Reliability is an important but insufficient parameter for defending the use of a test for a particular purpose. The most reliable test available is not defensible if it does not measure constructs (KSAOs) that predict success on the job (validity).

    Further, in classical test theory, simply adding items does not increase reliability; the added items must have sound psychometric properties to increase it. There are times when items are added and reliability goes down. I think that is what is alluded to in the phrase “not always but.” (See the Spearman-Brown sketch after the comments for how reliability is predicted to change with test length.)

    On Item 4: Part of the explanation for the response is incorrect. The correlation of scores between two similar exams is a measure of construct validity, not concurrent validity. (Such a correlation might also be calculated in an equating study for two exams, but that process involves more than just the correlation.) Concurrent validity is a method of establishing validity in which a test is administered to current job incumbents and the test scores are correlated with measures of those incumbents’ criterion job performance.

  4. Julie says:

    Hi Jone, thank you for your comment and observations.

  5. I think best practice is not to penalize for a wrong answer. I believe there is a correlation with risk-taking that confuses matters.
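
To illustrate the internal-consistency point raised in Jone’s comment, here is a minimal Python/NumPy sketch of Cronbach’s alpha computed from a respondents-by-items score matrix. The function name and the data are hypothetical, not from the original post or the quiz.

```python
import numpy as np

def cronbachs_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical 0/1 scores for 4 respondents on 3 items.
scores = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [0, 0, 0],
                   [1, 1, 0]])
print(round(cronbachs_alpha(scores), 3))  # -> 0.632
```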
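On test length and reliability: Jone’s point is often formalised with the Spearman-Brown prophecy formula, which predicts how reliability changes when a test is lengthened, but only under the strong assumption that the added items are parallel to the existing ones. A minimal sketch with hypothetical numbers:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`,
    assuming the new items are parallel to the existing ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Doubling a test with reliability 0.70 predicts roughly 0.82, but only
# if the added items are as sound as the originals; weak items can pull
# observed reliability down instead.
print(round(spearman_brown(0.70, 2.0), 3))  # -> 0.824
```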
