Face Validity: Participants Connecting Assessments to Constructs

Posted by Austin Fossey

I’d like to follow up my April 10 post about argument-based validity with details about face validity and a note about how these two concepts relate to each other.

The concept of face validity has been around for a while, but in his 1947 article, “A Critical Examination of the Concepts of Face Validity,” Charles Mosier defined what had previously been a nebulous buzzword.

Nowadays, we generally think of face validity as the degree to which an instrument measures a construct in a way that is meaningful to the layperson. To put it another way, is it clear to your participants how the test relates to the construct? Do they understand how the assessment design relates to what it claims to measure?

For an example of assessments that may have face validity issues, let’s consider college entrance exams. Many students find fault with these assessments, correctly noting that multiple-choice vocabulary and math items are not the only indicators of intelligence. But here is the catch: these are not tests of intelligence!

Many such assessments are designed to correlate with academic performance during the first year of college. So while the assessment is very useful for college entrance committees, the connection between the instrument and its consequences is not immediately apparent to many of the participants. In this case, we have high criterion validity and lower face validity.

There are cases when we may not want face validity. For example, a researcher may be delivering a survey where he or she does not want participants to know specifically what is being measured. In such a scenario, the researcher may be concerned that knowledge of the construct might lead participants to engage in hypothesis guessing, which is a threat to the external validity of the study. In such cases, the researcher may design the survey instrument to deliberately obfuscate the construct, or the researcher may use items that correlate with the construct but don’t reference the construct directly.

Face validity is an issue that many of us put on the back burner because we need to focus on criterion, construct, and content validity. Face validity is difficult to measure, and it should have little bearing on the inferences or consequences of the assessment. However, for those of us who are accountable to our participants (e.g., organizations selling certification assessments), face validity can play a big part in customer satisfaction and public perception.

Here is where I believe argument-based validity can be very helpful. Many people can understand the structure of argument-based validity, even if they may not understand the warrants and rebuttals. By using argument-based validity to frame our validity documentation, we map out how performance on the assessment relates to the construct inferences and to the consequences that matter to the participant.

Understanding Assessment Validity: Content Validity


Posted by Greg Pope

In my last post I discussed criterion validity and showed how an organization can go about doing a simple criterion-related validity study with little more than Excel and a smile. In this post I will talk about content validity, what it is and how one can undertake a content-related validity study.

Content validity deals with whether the assessment content and composition are appropriate, given what is being measured. For example, does the test content reflect the knowledge and skills required to do a job, or to demonstrate that one grasps the course content sufficiently? In the sales course exam example I discussed in the last post, one would want to ensure that the questions on the exam cover each area of the course content in appropriate proportions. For example, if 40% of the four-day sales course deals with product demo techniques, then we would want about 40% of the questions on the exam to measure knowledge and skills in that area.

I like to think of content validity in two slices. The first slice of the content validity pie is addressed when an assessment is first being developed: content validity should be one of the primary considerations in assembling the assessment. Developing a “test blueprint” that outlines the relative weightings of content covered in a course, and how those weightings map onto the number of questions in an assessment, is a great way to help ensure content validity from the start. Questions are, of course, classified into specific topics and subtopics as they are authored. Before an assessment is put into production to be administered to actual participants, an independent group of subject matter experts (SMEs) should review the assessment and compare the questions included on it against the blueprint. An example of a test blueprint is provided below for the sales course exam, which has 20 questions in total.

[Figure: test blueprint for the sales course exam]
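To make the blueprint idea concrete, here is a minimal Python sketch that turns topic weightings into whole question counts for a 20-question exam. Only the 40% weighting for product demo techniques comes from the example above; the other topic names and percentages, and the function name `allocate_questions`, are hypothetical and chosen purely for illustration. When the percentages do not divide evenly, the sketch assumes a largest-remainder rule for handing out the leftover questions, which is one reasonable choice among several.

```python
def allocate_questions(weights_pct, total_questions):
    """Convert blueprint weights (whole percentages) into question counts."""
    if sum(weights_pct.values()) != 100:
        raise ValueError("blueprint weights must add up to 100%")

    counts, remainders = {}, {}
    for topic, pct in weights_pct.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        counts[topic], remainders[topic] = divmod(pct * total_questions, 100)

    # Give any leftover questions to the topics with the largest remainders.
    leftover = total_questions - sum(counts.values())
    for topic in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical blueprint: only the 40% demo weighting comes from the post.
blueprint = {
    "Product demo techniques": 40,
    "Prospecting": 25,
    "Negotiation": 20,
    "Closing": 15,
}

counts = allocate_questions(blueprint, 20)
for topic, n in counts.items():
    print(f"{topic}: {n} questions")
```

With these assumed weights, product demo techniques receives 8 of the 20 questions, matching the 40% share of course time. SMEs reviewing a draft exam could compare the actual number of questions per topic against counts like these.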

The second slice of content validity is addressed after an assessment has been created. There are a number of methods available in the academic literature outlining how to conduct a content validity study. One way, developed by Lawshe in the mid-1970s, is to get a panel of subject matter experts to rate each question on an assessment in terms of whether the knowledge or skills measured by that question are “essential,” “useful, but not essential,” or “not necessary” to the performance of what is being measured (i.e., the construct). The more SMEs who agree that items are essential, the higher the content validity. Lawshe also developed a funky formula called the “content validity ratio” (CVR) that can be calculated for each question. The average of the CVR across all questions on the assessment can be taken as a measure of the overall content validity of the assessment.

[Figure: content validity ratio (CVR) formula]
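Lawshe’s ratio is usually written as CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists who rated the question “essential” and N is the total number of panelists. It runs from -1 (nobody rated the question essential) through 0 (exactly half did) to +1 (everybody did). The Python sketch below computes the CVR for each question and the average across the assessment. The panel size, question labels, rating counts, and function names are all hypothetical, made up only to illustrate the arithmetic.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR for one question: (n_e - N/2) / (N/2)."""
    if n_panelists <= 0:
        raise ValueError("the panel needs at least one SME")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half_panel = n_panelists / 2
    return (n_essential - half_panel) / half_panel


def mean_cvr(essential_counts, n_panelists):
    """Average the per-question CVRs across the whole assessment."""
    ratios = [
        content_validity_ratio(n, n_panelists)
        for n in essential_counts.values()
    ]
    return sum(ratios) / len(ratios)


# Hypothetical data: how many of 10 SMEs rated each question "essential".
PANEL_SIZE = 10
essential_counts = {"Q1": 10, "Q2": 8, "Q3": 5, "Q4": 2}

for question, n_essential in essential_counts.items():
    cvr = content_validity_ratio(n_essential, PANEL_SIZE)
    print(question, round(cvr, 2))
print("mean CVR:", round(mean_cvr(essential_counts, PANEL_SIZE), 2))
```

In this made-up panel, a question rated essential by 8 of 10 SMEs gets (8 - 5) / 5 = 0.6, while one rated essential by only 2 of 10 gets a negative ratio and is a candidate for review.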

You can use Questionmark Perception to easily conduct a CVR study by taking an image of each question on an assessment (e.g., sales course exam) and creating a survey question for each assessment question to be reviewed by the SME panel, similar to the example below.

[Figure: example survey question for SME review]

You can then use the Questionmark Survey Report or other Questionmark reports to review and present the content validity results.

So how does “face validity” relate to content validity? Well, face validity is more about the subjective perception of what the assessment is trying to measure than about conducting validity studies. For example, if our sales people sat down after the four-day sales course to take the sales course exam and all the questions on the exam were asking about things that didn’t seem related to the information they just learned on the course (e.g., what kind of car they would like to drive or how far they can hit a golf ball), the sales people would not feel that the exam was very “face valid,” as it wouldn’t appear to measure what it is supposed to measure. Face validity, therefore, has to do with whether an assessment looks valid or feels valid to the participant. It still matters, though: if participants or instructors don’t buy in to the assessment being administered, they may not take it seriously, they may complain about and appeal their results more often, and so on.

In my next post I will turn the dial up to 11 and discuss the ins and outs of construct validity.

4 Tips to Help Ensure the Security of Intellectual Property

Posted by Julie Chazyn

Protecting the intellectual property contained in a test or exam is essential, not only because of the time, effort and cost of creating assessments but also because IP theft undermines the accurate measurement of knowledge and skills.

Protecting intellectual property protects the credibility of tests. Here are four tips for helping to ensure the security of intellectual property:

Create and administer multiple test forms

Rather than having only one form of the assessment being administered, delivering multiple forms of the same exam can help limit item exposure. This method also allows for the possibility of interspersing large-scale integrated beta test questions within the forms to collect psychometric information on newly developed questions.

Restrict and control administration of beta test items

Beta testing questions is an important part of high-stakes assessment, ensuring the psychometric quality of questions before they appear on actual assessments. However, it is vital that a well-conceptualized beta test model is in place to limit the exposure of newly developed questions to participants.

Update exam forms periodically

Letting exam forms become stale can over-expose questions to participants, increasing the likelihood of IP theft. An organization could consider retiring old exam forms and turning them into exam prep materials that can be sold to participants. In this way, participants could periodically expect new practice questions.

Produce exam prep materials

Organizations should consider making exam prep materials available to participants before an assessment. This will help reduce participants’ incentive to obtain exam questions via illegal means, as they will already have access to the type of questions that will be asked on the actual assessment.

For more details on this subject, plus information about various means for deploying a wide range of assessment types with assurance, download our White Paper: Delivering Assessments Safely and Securely.