# Understanding Internal Validity in Statistics

Reliability and validity describe desirable psychometric characteristics of research instruments. The concept of validity is also applied to research studies and their findings. Internal validity examines whether the study design, conduct, and analysis answer the research questions without bias. External validity examines whether the study findings can be generalized to other contexts. Ecological validity examines, specifically, whether the study findings can be generalized to real-life settings; thus ecological validity is a subtype of external validity. These concepts are explained using examples so that readers may understand why the consideration of internal, external, and ecological validity is important for designing and conducting studies, and for understanding the merits of published research.

Statistics can be a tricky subject to grasp. While numbers and calculations seem straightforward, understanding what those numbers actually mean requires diving deeper. One important concept in statistics is internal validity. But what does it actually mean and why does it matter? In this article, we’ll break down internal validity, look at examples, and discuss how it impacts statistical analysis.

## What Is Internal Validity?

Internal validity refers to how well an experiment establishes cause and effect between the independent and dependent variables. A study is internally valid to the extent that no other variables interfered with the experiment and caused the observed effect on the dependent variable.

High internal validity means the study design accurately demonstrates that the independent variable caused changes in the dependent variable. Low internal validity means other possible explanations for the results cannot be ruled out.

For example, imagine you conduct an experiment to see if drinking coffee improves memory performance. The coffee drinking is the independent variable while memory score is the dependent variable. High internal validity means you can confidently state that coffee consumption directly led to an increase in memory performance. Low internal validity means other factors besides the coffee could have caused the change in memory.

When designing studies, researchers strive for high internal validity. This strengthens the conclusions that can be drawn since it rules out alternative explanations for the results. However, internal validity often trades off against external validity, which is the ability to generalize results to other settings.

## 3 Requirements for Internal Validity

Three key requirements must be met for an experiment to have strong internal validity:

1. The independent and dependent variables covary – as one changes, so does the other.

2. The cause (independent variable) preceded the effect (dependent variable) in time.

3. No other variables provide plausible alternative explanations for the results.

Let’s look at each of these requirements in more detail:

### Covariation of Variables

The independent and dependent variables must covary, meaning as the independent variable changes, the dependent variable changes too. For example, in a drug trial, patients receiving the drug treatment should show clinical improvements compared to untreated patients.

Covariation on its own does not guarantee causality though – the variables may be related through some other factor. Covariation simply shows an association exists between the variables.
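The point that two variables can covary without one causing the other can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variable names (`z`, `x`, `y`), the noise levels, and the sample size are hypothetical choices, not taken from any real study. Here `z` is a shared cause of both `x` and `y`, and `x` has no effect on `y` at all.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_common_cause(n=5000, seed=0):
    """Generate x and y that both depend on z; x never influences y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return z, x, y

z, x, y = simulate_common_cause()

# x and y covary even though neither causes the other.
raw_r = pearson_r(x, y)

# Subtracting the shared cause leaves only independent noise.
adjusted_r = pearson_r(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print(f"correlation of x and y: {raw_r:.2f}")
print(f"after removing z:       {adjusted_r:.2f}")
```

In this setup the raw correlation is clearly positive, while the correlation after removing `z` is close to zero. In a real study the shared cause is rarely known this precisely, which is why design-based controls matter.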

### Temporal Precedence

For causality, the cause must occur before the effect. The independent variable must precede the dependent variable in time.

In our drug trial example, patients received the treatment first, and then clinical improvements followed. If clinical improvements occurred before the treatment, we clearly could not claim the treatment caused the improvements.

### No Plausible Alternatives

High internal validity means no other variables can plausibly explain the results. You must rule out confounding and extraneous variables as alternative explanations through careful experimental design, for example by holding them constant or measuring them as control variables.

For example, imagine that most patients in the drug group took their dose in the morning, while most patients in the comparison group took theirs at night. If the drug group improved more, time of day provides an alternative explanation, threatening internal validity. Randomly assigning patients to groups, or standardizing the dosing time, would control for this.
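To see why randomization helps with a nuisance variable such as time of day, here is a minimal sketch. It assumes a hypothetical two-arm trial (drug versus placebo) in which each patient has a fixed dosing-time habit; the function name, the 50% morning rate, and the sample size are illustrative assumptions only.

```python
import random

def randomize_and_check_balance(n_patients=10000, seed=1):
    """Randomly split patients into two arms and report the share of
    morning dosers in each arm."""
    rng = random.Random(seed)

    # Hypothetical nuisance variable: True if the patient doses in the morning.
    morning = [rng.random() < 0.5 for _ in range(n_patients)]

    # Random assignment: shuffle patient indices, then split in half.
    order = list(range(n_patients))
    rng.shuffle(order)
    half = n_patients // 2
    drug_arm, placebo_arm = order[:half], order[half:]

    share_drug = sum(morning[i] for i in drug_arm) / len(drug_arm)
    share_placebo = sum(morning[i] for i in placebo_arm) / len(placebo_arm)
    return share_drug, share_placebo

share_drug, share_placebo = randomize_and_check_balance()
print(f"morning dosers, drug arm:    {share_drug:.3f}")
print(f"morning dosers, placebo arm: {share_placebo:.3f}")
```

Because assignment ignores dosing habit, the two arms end up with nearly the same share of morning dosers, so time of day cannot easily explain a difference between arms. Balance is only expected on average; small trials can still be imbalanced by chance.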

## Internal vs. External Validity

Internal validity is not the same as external validity, which concerns the ability to generalize results to other situations and populations. Internal validity is about the accuracy of conclusions within the study itself.

High internal validity is required for any causal conclusions drawn. However, tightly controlling variables can make study conditions unrepresentative, which reduces external validity. Improving generalizability may therefore require some tradeoff with internal validity.

For example, studying student volunteers may yield high internal validity, but the results apply only to students, not the general public. Using a more diverse public sample would improve external validity despite potentially lowering internal validity.

## Threats to Internal Validity

Several issues can threaten internal validity and invalidate an experiment’s conclusions. Researchers must identify and mitigate these issues through proper study design and controls. Common threats include:

• History – Events outside the study that occur during the experiment and affect the results.

• Maturation – Natural changes over time that alter outcomes, like growing older.

• Regression – Statistical tendency for extreme scores to become less extreme on retesting.

• Selection bias – Systematic differences between comparison groups.

• Experimental mortality – Loss of participants altering group characteristics.

• Testing effects – Influence of taking a pre-test on post-test scores.

• Instrumentation – Changes in measurement tools over the study.

Researchers use techniques like control groups, randomization, blinding, and standardized procedures to control for these threats and strengthen internal validity. No study can be 100% free of validity threats though. The goal is simply to minimize the potential issues as much as realistically possible.
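Of the threats listed above, regression toward the mean is perhaps the least intuitive, so a short simulation may help. This is a sketch under simple assumptions: each participant has a stable "ability" plus independent measurement noise on each of two tests, and nothing happens between the tests. All names and parameters are hypothetical.

```python
import random

def regression_to_mean_demo(n=10000, top_fraction=0.1, seed=2):
    """Select the top scorers on test 1 and compare their mean score
    on test 1 with their mean score on an unrelated retest."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    test1 = [a + rng.gauss(0, 1) for a in ability]  # ability + noise
    test2 = [a + rng.gauss(0, 1) for a in ability]  # same ability, new noise

    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: test1[i], reverse=True)[:k]

    mean_test1 = sum(test1[i] for i in top) / k
    mean_test2 = sum(test2[i] for i in top) / k
    return mean_test1, mean_test2

mean_test1, mean_test2 = regression_to_mean_demo()
print(f"top group, first test: {mean_test1:.2f}")
print(f"top group, retest:     {mean_test2:.2f}")
```

The selected group scores lower on the retest even though no intervention took place, because part of their extreme first score was noise. A study that enrolls only extreme scorers and lacks a control group could mistake this drift for a treatment effect.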

## Why Internal Validity Matters

Strong internal validity is essential for determining causality between variables. Without high internal validity, experiments cannot provide credible evidence that altering the independent variable caused observed changes in the dependent variable.

Any statements about causal effects would be purely speculative since alternative explanations cannot be ruled out. The very purpose of most experiments is identifying causal relationships. So without internal validity, the research fails to achieve its goal.

Enhancing internal validity should be a key consideration when designing studies. Researchers must ask themselves – could other variables be influencing the results? If so, how can we control for them? Without answering these types of questions, the conclusions drawn from a study may rest on shaky ground.

Paying close attention to internal validity leads to higher quality research and more accurate conclusions. It helps move results from mere observation to credible evidence that can shape understanding of real world phenomena and systems. The scientific method depends on drawing valid causal inferences from experiments, making internal validity vital to the process.

## Examples of Internal Validity

To better understand the concept of internal validity, let’s look at some examples of studies with high and low validity:

### High internal validity

• Drug trial with placebo control, randomization, blinding, and standardized protocols.

• Lab experiment manipulating single variable with all else tightly controlled.

• Study of different teaching methods using standardized curriculum and testing.

### Low internal validity

• Survey asking people to self-report effects of a drug treatment they chose themselves.

• Study correlating income level with test scores without controlling demographic factors.

• Experiment with no control group and many confounding variables.

The high validity examples isolate the effects of the independent variable using tight controls. The low validity examples have issues like self-selection, confounding variables, lack of standardization, etc. that undermine any causal conclusions.

Good research design enhances internal validity. But no study can achieve perfect validity. The goal is simply to maximize validity wherever realistically feasible.

## Tips for Higher Internal Validity

Here are some tips to help boost internal validity when designing studies:

• Use control groups, blinded trials, and randomization to isolate effects of independent variables.

• Eliminate confounding factors through exclusion criteria or statistical controls.

• Standardize procedures and equipment or account for any necessary changes.

• Check for selection bias by comparing baseline characteristics across comparison groups.

• Take steps to reduce experimental mortality and missing data.

• Run pilot tests to uncover potential problems with protocols or measurements.

• Replicate findings under different conditions to assess consistency.

• Consider how maturation over time could influence results.

• Power studies appropriately to avoid erroneous conclusions from small sample sizes.

No single tactic can protect validity on its own. Strong internal validity stems from utilizing multiple research best practices in unison throughout the entire experimental process.
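As one concrete illustration of the tip about powering studies, the sketch below estimates the sample size per group for comparing two group means. It uses the common normal-approximation formula n = 2(z₁₋α/₂ + z_power)² / d², where d is the standardized effect size. The function name and default values are assumptions for illustration; a real study would normally use dedicated power-analysis software and a design-specific formula.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of
    two means, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

# A medium standardized effect (d = 0.5) at 5% significance and 80% power.
print(n_per_group(0.5))

# Smaller effects need far larger samples.
print(n_per_group(0.2))
```

Under this approximation a standardized effect of 0.5 needs 63 participants per group; exact t-test calculations give a slightly larger number.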

Internal validity is a cornerstone of experimental research and statistical analysis. Without confidence in the causal relationship between variables, researchers cannot draw credible conclusions from studies.

While achieving high internal validity often requires substantial effort, it forms the bedrock supporting meaningful findings that advance knowledge and provide solutions. Through careful research design and execution, scientists can demonstrate clear causal links, opening new doors to progress.

## Did CATIE Have External Validity?

The answer is both yes and no. CATIE[1] was designed as an effectiveness study; that is, a study with relevance to real-world settings. The CATIE findings are relevant to clinical practice in the USA but are of questionable relevance in India. One reason is that, in the USA, where CATIE was conducted, the primary outcome, time to all-cause treatment discontinuation, is substantially patient-influenced, whereas in India, where families supervise treatment, it is largely caregiver-determined. Another and more important reason is that the healthcare delivery system in clinical practice is strikingly different in the two countries. Thus CATIE has good external validity for clinical practice in the USA but not in India.

Reliability and validity are concepts that are applied to instruments such as rating scales and screening tools. Validity describes how well an instrument does what it is supposed to do. For example, does an instrument that screens for depression do so with high sensitivity and specificity? Reliability describes the consistency with which results are obtained. For example, if an instrument that rates the severity of depression is administered to the same patient twice within the span of an hour, are the scores obtained closely similar? Different types of reliability and validity describe desirable psychometric properties of research and clinical instruments.[2,3] Validity can also be applied to laboratory and clinical studies and to their findings, as the sections below show.
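The sensitivity and specificity of a screening instrument can be computed from a simple two-by-two table. The sketch below assumes hypothetical counts for a depression screen checked against a reference diagnosis; the numbers are invented for illustration and do not describe any real instrument.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: depressed patients the screen flags      (true positives)
    fn: depressed patients the screen misses     (false negatives)
    tn: non-depressed patients the screen clears (true negatives)
    fp: non-depressed patients the screen flags  (false positives)
    """
    sensitivity = tp / (tp + fn)  # share of true cases detected
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return sensitivity, specificity

# Hypothetical counts: 50 depressed and 100 non-depressed patients.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=80, fp=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up counts the screen detects 90% of true cases and correctly clears 80% of non-cases.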

Internal validity examines whether the manner in which a study was designed, conducted, and analyzed allows trustworthy answers to the research questions in the study. For example, improper randomization, inadvertent unblinding of patients or raters, excessive use of rescue medication, and missing data can all undermine the fidelity of the results and conclusions of a randomized controlled trial (RCT). That is, the internal validity of the RCT is compromised. Internal validity is based on judgment and is not a computed statistic.

Internal validity examines the extent to which systematic error (bias) is present. Such systematic error can arise through selection bias, performance bias, detection bias, and attrition bias.[4] If internal validity is compromised, it can occasionally be improved, for example, by a modified plan of analysis. However, biases can often be fatal, as when double-blind ratings were not obtained in an RCT.

External validity examines whether the findings of a study can be generalized to other contexts.[4] Studies are conducted on samples, and if sampling was random, the sample is representative of the population, and so the results of a study can validly be generalized to the population from which the sample was drawn. But results may not be generalizable to other populations. Thus external validity is poor for studies with sociodemographic restrictions; studies that exclude severely ill and suicidal patients, or patients with personality disorders, substance use disorders, and medical comorbidities; studies that disallow concurrent treatments; and so on. External validity is also limited in short-term studies of patients who need to be treated for months to years. External validity, like internal validity, is based on judgment and is not a computed statistic.

Ecological validity examines whether the results of a study can be generalized to real-life settings.[5] How is this different from external validity? External validity asks whether the findings of a study can be generalized to patients with characteristics that are different from those in the study, or patients who are treated in a different way, or patients who are followed up for longer durations. In contrast, ecological validity specifically examines whether the findings of a study can be generalized to naturalistic situations, such as clinical practice in everyday life. Ecological validity is, therefore, a subtype of external validity. The ecological validity of an instrument can be computed as a correlation between ratings obtained with that instrument and an appropriate measure in naturalistic practice or in everyday life. The ecological validity of a study is a judgment and is not a computed statistic.
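As a rough illustration of how the ecological validity of an instrument might be expressed as a correlation, the sketch below computes a Pearson coefficient between instrument ratings and a measure of everyday functioning. The five paired values are hypothetical and far too few for a real validation study; they are there only to show the calculation.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical paired scores for five patients.
instrument_ratings = [1, 2, 3, 4, 5]   # scores on the instrument
everyday_measure = [2, 4, 5, 4, 5]     # functioning observed in daily life

r = pearson_r(instrument_ratings, everyday_measure)
print(f"correlation with everyday functioning: r = {r:.2f}")
```

A higher correlation would suggest that the instrument tracks what happens in naturalistic settings more closely. Whether Pearson's coefficient is the right choice depends on the scale type of both measures, which is an assumption here.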

Ecological validity was originally invoked in the context of laboratory studies that needed to be generalized to real-life situations.[5] Thus, laboratory studies of the neuropsychological and psychomotor impairments produced by psychotropic drugs have poor ecological validity because what is studied in relaxed, rested, and healthy subjects tested in a controlled environment is very different from the demands that stressed patients face in everyday life. In fact, these cognitive and psychomotor tests, especially when based on computerized tasks, have no parallel in everyday life. How much less ecological validity, then, would research in animal models of different neuropsychiatric states have for patients in clinical practice? This explains why drugs that work in animal models often fail in humans.[6]

On a parting note, a good understanding of the concepts of internal, external, and ecological validity is necessary to properly design and conduct studies and to evaluate the merits and applications of published research.

## Internal Validity

What is internal validity in research?

Internal validity describes the soundness of research. Essentially, it expresses whether the research was conducted appropriately and whether there’s a real cause-and-effect relationship between research variables. The higher the internal validity, the more trustworthy the causal conclusions are.

What is internal and external validity?

Internal and external validity are two concepts used to evaluate conclusions about cause-and-effect relationships. Internal validity refers to the degree of confidence that the causal relationship being tested is trustworthy and not influenced by other factors or variables. External validity refers to the extent to which the results can be generalized to other contexts.

What is a high internal validity study?

When a study has high internal validity, it establishes a cause-and-effect relationship between the independent variable (treatment or intervention) and the dependent variable (outcome). This provides confidence that changes in the dependent variable are genuinely due to the manipulation of the independent variable.

How do you test the internal validity of a study?

There are several factors you can check to assess the internal validity of a study. First, there should be no confounding variables: an important condition of validity is that no extraneous factors are present, because they can affect the outcomes of your research and skew conclusions. Second, changes in your variables should correlate.