Understanding Assessment Validity and Reliability

Posted by Julie Chazyn

Assessments are not all created equal. Those that are both reliable and valid support learning and measure knowledge most effectively. But how can authors make sure they are producing valid, reliable assessments?

I picked up some tips about this in revisiting the Questionmark White Paper, Assessments through the Learning Process.

So, what is a reliable assessment? One that works consistently. If a survey indicates that employees are satisfied with a course of instruction, it should show the same result if administered again three days later. (This type of reliability is called test-retest reliability.) If a course instructor rates employees taking a performance test, their scores should be the same no matter which course instructor scores their performances. (This is called inter-rater reliability.)
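To make those two ideas concrete, here is a minimal sketch in Python. It rests on assumptions that are not from the white paper: test-retest reliability is estimated with a Pearson correlation between two administrations (a common convention), and inter-rater reliability is approximated by the simple share of identical ratings. The function names and all of the scores are hypothetical.

```python
from math import sqrt


def pearson(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    covariance = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    spread_1 = sqrt(sum((a - mean_1) ** 2 for a in first))
    spread_2 = sqrt(sum((b - mean_2) ** 2 for b in second))
    if spread_1 == 0 or spread_2 == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return covariance / (spread_1 * spread_2)


def percent_agreement(rater_a, rater_b):
    """Share of examinees who received the same score from both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty group")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical satisfaction ratings from the same five employees,
# collected on day 1 and again three days later.
day_1 = [4, 5, 3, 4, 2]
day_4 = [4, 5, 3, 4, 2]

# Hypothetical performance-test scores from two course instructors.
instructor_a = [3, 4, 2, 5, 4]
instructor_b = [3, 4, 3, 5, 4]

test_retest = pearson(day_1, day_4)
agreement = percent_agreement(instructor_a, instructor_b)
print(f"test-retest correlation: {test_retest:.2f}")
print(f"inter-rater agreement:   {agreement:.0%}")
```

A correlation near 1 suggests the survey behaves consistently over time, and a high agreement rate suggests scores do not depend much on who does the rating. Simple percent agreement does not correct for agreement that happens by chance, so treat it as a rough first check only.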

And what is a valid assessment? One that measures what it is supposed to measure. If a test or survey is administered to happy people, the results should show that they’re all happy. Similarly, if a group of people who are all knowledgeable is tested, the test results should reveal that they’re all knowledgeable.

If an assessment is valid, it looks like the job, and the content aligns with the tasks of the job in the eyes of job experts. This type of validity is known as content validity. To ensure this validity, the assessment author must first undertake a job task analysis, surveying subject matter experts (SMEs) or people on the job to determine what knowledge and skills are needed to perform job-related tasks. That information makes it possible to produce a valid test.

Good assessments are both reliable and valid. If we gave a vocabulary test twice to a group of nurses, and the scores came back exactly the same both times, the test would be considered highly reliable. However, this reliability does not mean that the test is valid. To be valid, it would need to measure nursing competence in addition to being reliable.

Imagine administering a test of nursing skills to a group of skilled and unskilled nurses and finding that each examinee’s scores differ every time. The test is clearly unreliable. If it’s not reliable, it cannot be valid; fluctuating scores for the same test takers cannot be measuring anything in particular. So the test is both unreliable and invalid. The reliable and valid test of nursing skills is one that yields similar scores every time it is given to the same group of test takers and discriminates every time between good and incompetent nurses. It is consistent and it measures what it is supposed to measure.
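The nursing examples can be played out in a small Python sketch. Everything in it is an assumption made for illustration: the scores are invented, the 5-point tolerance for "similar scores" is arbitrary, and "discriminates" is reduced to the crude check that every skilled nurse outscores every unskilled one. Real validation work relies on more careful statistics.

```python
def is_consistent(first, second, tolerance=5):
    """True if every examinee's two scores differ by at most `tolerance`."""
    return all(abs(a - b) <= tolerance for a, b in zip(first, second))


def separates_groups(scores, skilled):
    """True if every skilled examinee scores above every unskilled one."""
    high = [s for s, flag in zip(scores, skilled) if flag]
    low = [s for s, flag in zip(scores, skilled) if not flag]
    return min(high) > max(low)


# Hypothetical group: two skilled nurses followed by two unskilled nurses.
skilled = [True, True, False, False]

# Vocabulary test: identical scores both times, unrelated to nursing skill.
vocab_1 = [70, 72, 71, 69]
vocab_2 = [70, 72, 71, 69]

# Nursing skills test: similar scores both times, skilled nurses score higher.
skills_1 = [88, 92, 55, 60]
skills_2 = [90, 91, 57, 58]

for name, first, second in [
    ("vocabulary", vocab_1, vocab_2),
    ("nursing skills", skills_1, skills_2),
]:
    reliable = is_consistent(first, second)
    # A test has to be consistent before it can be measuring anything.
    valid = reliable and separates_groups(first, skilled) and separates_groups(second, skilled)
    print(f"{name}: reliable={reliable}, valid={valid}")
```

With these made-up numbers, the vocabulary test comes out reliable but not valid, while the nursing skills test is both.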

Assessments that are both reliable and valid hit the bullseye!


For more detail on validity and reliability, check out another of our white papers, Defensible Assessments: What You Need to Know.

Comments

  1. Samantha
    June 5th, 2009 | 4:11 pm

    Interesting post about assessment testing. Is this true of all types of assessment tests (education, career, etc.)? Many companies in search of new employees will use a PEO to administer an assessment test, which is supposed to help employers narrow down their searches. It would be interesting to see if those tests were valid and reliable for all candidates. If not, that creates an unfair evaluation of all candidates, and one person could be denied when they are really right for the position.
