# Understanding Validity: More on Construct Validity

Posted by Greg Pope

In my last post I introduced construct validity. Here I will talk more about some specific aspects of construct validity.

I previously mentioned “unidimensionality,” and some readers may want to know what this term means. In the assessment realm, unidimensionality refers to the measurement of a single psychological dimension (also called a trait, construct, attribute, skill, or ability). For example, if we have an assessment that is designed to measure the construct “math ability” we would want to ensure that all the questions in the assessment measure this construct and only this construct. One can investigate the degree to which an assessment is unidimensional by using well-known statistical methods such as principal component analysis (PCA) or factor analysis (FA) to explore or confirm which dimensions each question in the assessment loads onto. I have done a few of these in my day using software like SPSS, and I can tell you from firsthand experience that they are a ton of fun.
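As a rough illustration of the idea (not the SPSS procedure itself), the sketch below approximates the first principal component of an item correlation matrix using power iteration in plain Python. The item scores, the function names, and the choice of power iteration are all assumptions made for this example; a real analysis would normally use a statistics package and examine every component, not just the first.

```python
from math import sqrt

# Hypothetical scores: each list is one question, each position one participant.
items = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 5, 4, 5],
    [1, 3, 3, 4, 4, 5],
    [2, 1, 4, 4, 5, 4],
]

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def first_component(corr, iterations=200):
    """Approximate the largest eigenvalue and its eigenvector by power iteration."""
    k = len(corr)
    v = [1 / sqrt(k)] * k  # starting guess; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue, v

corr = [[pearson(a, b) for b in items] for a in items]
eigenvalue, vector = first_component(corr)

# The eigenvalues of a correlation matrix sum to the number of items,
# so this ratio is the share of total variance on the first component.
proportion = eigenvalue / len(items)
# Loadings are eigenvector entries scaled by the square root of the eigenvalue.
loadings = [round(x * sqrt(eigenvalue), 3) for x in vector]

print(f"Variance explained by first component: {proportion:.1%}")
print("Loadings on first component:", loadings)
```

If the first component accounts for most of the variance and every question loads strongly on it, that is consistent with a unidimensional assessment. How much is "enough" remains a judgment call.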

Unidimensionality is an important assumption in a number of areas of psychometrics and has implications for statistics like internal consistency reliability (e.g., Cronbach’s alpha tends to be higher when all items measure the same construct) and for the interpretation of participant scores (e.g., if an assessment is measuring a random smattering of dimensions, then what does a participant score really mean?). Here’s how this concept of unidimensionality fits in with the concept of construct validity: If an assessment is designed to measure only one dimension/construct, this should actually be the case, and this design assumption can be investigated/confirmed. If an assessment is composed of four topics, each of which is designed to measure only one dimension, then this can be investigated/confirmed as well. In other words, the dimensionality assumptions about the measurement of the construct(s) composing the assessment should be validated, for example using methods like PCA or FA.
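To make the link with internal consistency concrete, here is a minimal sketch of Cronbach's alpha using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The data are invented for illustration, and the function name `cronbach_alpha` is just a label chosen for this example.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; each inner list holds one item's scores across participants."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per participant
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical data: four items whose scores rise and fall together.
same_construct = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 5, 4, 5],
    [1, 3, 3, 4, 4, 5],
    [2, 1, 4, 4, 5, 4],
]

# Same first three items, with the fourth swapped for an unrelated item.
mixed = same_construct[:3] + [[3, 5, 1, 4, 2, 3]]

alpha_same = cronbach_alpha(same_construct)
alpha_mixed = cronbach_alpha(mixed)

print(f"Alpha, items on one construct: {alpha_same:.3f}")
print(f"Alpha, one unrelated item:     {alpha_mixed:.3f}")
```

With these made-up scores, swapping in a single unrelated item lowers alpha, which is the pattern described in the paragraph above.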

Moving on to the details of convergent and discriminant validity, these two things are in some ways flip sides of the same coin. With convergent validity, we are seeing whether our math ability test correlates with other well-known math ability tests that are being used. With discriminant validity, we are seeing whether our math ability test does not correlate with well-known tests that measure something different from math ability, like verbal ability.

Using our trusty statistical analysis program, Excel, we can conduct a mini convergent and discriminant validity study. For the convergent validity piece, we had the same ten participants take four math ability tests: Our math ability test and three other well-known tests. For the discriminant validity piece, the same ten participants also took three well-known verbal ability tests. To investigate convergent validity, we correlated the scores from Our math ability test with the scores from the three other math ability tests and found high correlations with each: 0.966 with the EU math ability test, 0.960 with the Australian math ability test, and 0.962 with the US math ability test.

To investigate discriminant validity, we correlated the scores from Our math ability test with the scores from the three verbal ability tests and found very low correlations with each: 0.008 with the Canadian verbal ability test, -0.053 with the US verbal ability test, and -0.078 with the British verbal ability test.
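The same kind of mini study can be sketched outside of Excel. The example below computes Pearson correlations in plain Python. The scores are invented for illustration and are not the data from the study described here, so the resulting correlations will not match the figures reported in this post; the variable names are also assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for the same ten participants on three tests.
our_math = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
other_math = [52, 54, 63, 64, 72, 74, 79, 88, 89, 97]
verbal = [70, 60, 80, 65, 75, 72, 68, 80, 60, 72]

# Convergent validity: expect a high correlation with another math test.
convergent = pearson(our_math, other_math)
# Discriminant validity: expect a correlation near zero with a verbal test.
discriminant = pearson(our_math, verbal)

print(f"Our math vs. other math test: {convergent:.3f}")
print(f"Our math vs. verbal test:     {discriminant:.3f}")
```

With only ten participants, correlations like these are quite unstable, so a sketch of this size is best read as a demonstration of the logic and not as evidence on its own.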

Nomological validity needs to be addressed when new research is being conducted into an existing construct or a new construct. We would want to ensure that current research is consistent with previous research. If a completely new construct is being proposed, we would need to justify how it is similar to and different from the constructs examined in earlier research.

In my next post I will discuss modern perspectives on validity.

Hi! I was recently tasked with developing a questionnaire. This is the first time I will be doing this type of research. I know that there are reliability and validity tests I need to do, and the validity test worries me a great deal. I’ve read in your post that you have done such tests in SPSS. Do you know of any site or materials you could share with me so I can learn the basics? I’m not really a statistics person, so if you know of any material with a simple yet clear explanation, please let me know. Thank you for taking the time to read this. Good day.

Ivee

Dear Ivee,

Thank you for your note, and I am glad to hear that you are addressing reliability and validity in the design of your questionnaire! There are many different kinds of validity, and each one has its own method for demonstrating validity, but I think the most important thing to remember is that validity refers to the inferences you make about the results and how you use those results. Another way of thinking about this is to ask ourselves if the instrument is measuring what we think it’s measuring. If you are interested in a less statistical approach to demonstrating validity, you may enjoy our post about argument-based validity.

As Greg noted, one of the issues that may affect our inferences about the assessment is whether or not the test is unidimensional; i.e., are we measuring one construct or are we inadvertently measuring multiple constructs? If this is a concern for your instrument, you can use statistical software packages like SPSS to conduct a principal components analysis or an exploratory factor analysis (EFA).

Both of these analyses can be done in the standard SPSS package, but they are not straightforward parametric tests per se. Both require some reformatting of the data (e.g., creating standard scores) and some exploration to determine how to interpret the results (e.g., examining a scree plot to determine how many components to retain in PCA).

If you are concerned about the unidimensionality of your instrument, and you want to learn about EFA or PCA, I would recommend the book Analyzing Multivariate Data by Lattin, Carroll, and Green. It is a statistics textbook, but it is written for people who work in marketing, and I find that it is not overly technical or dense.

I hope this is helpful! If you (or any of our other readers) are interested in us doing some blog posts about multivariate analyses like EFA or PCA, please let us know!

Sincerely,

Austin Fossey

Reporting and Analytics Manager

Questionmark

I want to establish face validity, content validity, and construct validity in our survey. What should the sample size and study design be?

Hi Sunil,

That’s great that you are planning these validity studies! There are many different ways to demonstrate content and construct validity, and a lot of it will depend on your instrument’s design. There’s far more research and guidance on validity than I can provide in this response, but I would recommend Michael Kane’s chapter on validation in Educational Measurement (4th Edition), edited by Robert Brennan, as a start. You may also want to check out Introduction to Classical and Modern Test Theory by Linda Crocker and James Algina, as well as the chapter on validity in the 2014 edition of the Standards for Educational and Psychological Testing from AERA, APA, and NCME.

In general though, I often see content validity studies use indices of correct classification of items to demonstrate content validity. These typically are not statistical tests, so your sample is going to be the items on the assessment and the ratings from subject matter experts. Construct validity, on the other hand, is sometimes investigated with a bifactor confirmatory factor analysis (CFA) with a multi-trait, multi-method model. With a CFA, the required sample size depends not so much on a fixed number of participants as on the number of parameters that need to be estimated. Face validity is not as easy to measure, and I am not sure I can provide guidance on how to go about measuring it (though I am certain some have tried). In general, I think people are now using an argument-based approach to validity, which in turn can support face validity. Michael Kane’s aforementioned chapter discusses argument-based approaches to validity.
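As one possible illustration of a classification-based index for content validity, the sketch below computes, for each item, the proportion of subject matter experts who assigned it to its intended topic. This is only a simple agreement proportion, not a specific published index; the item names, topics, ratings, and the 0.75 review threshold are all hypothetical.

```python
# Hypothetical blueprint: the topic each item was written to measure.
intended = {
    "Q1": "algebra",
    "Q2": "algebra",
    "Q3": "geometry",
    "Q4": "geometry",
}

# Hypothetical ratings: the topic each of four experts assigned to each item.
ratings = {
    "Q1": ["algebra", "algebra", "algebra", "algebra"],
    "Q2": ["algebra", "geometry", "algebra", "algebra"],
    "Q3": ["geometry", "geometry", "geometry", "geometry"],
    "Q4": ["algebra", "algebra", "geometry", "algebra"],
}

def classification_agreement(intended, ratings):
    """Proportion of experts who placed each item in its intended topic."""
    return {
        item: sum(topic == intended[item] for topic in topics) / len(topics)
        for item, topics in ratings.items()
    }

agreement = classification_agreement(intended, ratings)

# Arbitrary cutoff for this example; items below it are flagged for review.
REVIEW_THRESHOLD = 0.75
flagged = [item for item, value in agreement.items() if value < REVIEW_THRESHOLD]

for item, value in agreement.items():
    print(f"{item}: {value:.2f}")
print("Items to review:", flagged)
```

Here the "sample" is the set of items and expert ratings, which matches the point above that these studies are usually not statistical tests on participant data.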

Good luck with your studies! I hope you will keep us posted on your findings!

Sincerely,

Austin Fossey

Reporting and Analytics Manager

Questionmark