How to Avoid Researcher Bias: 10 Tips for More Objective Research

1Department of Clinical, Educational and Health Psychology, Division of Psychology and Language Sciences, University College London, London, WC1H 0AP UK

2Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK

3MRC Integrative Epidemiology Unit at the University of Bristol, Bristol Medical School, University of Bristol, Bristol, UK

5Centre for Academic Mental Health, Population Health Sciences, University of Bristol, Bristol, UK

6NIHR Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol, UK

Analysis of secondary data sources (such as cohort studies, survey data, and administrative records) has the potential to provide answers to science and society’s most pressing questions. However, researcher biases can lead to questionable research practices in secondary data analysis, which can distort the evidence base. While pre-registration can help to protect against researcher biases, it presents challenges for secondary data analysis. In this article, we describe these challenges and propose novel solutions and alternative approaches. Proposed solutions include approaches to (1) address bias linked to prior knowledge of the data, (2) enable pre-registration of non-hypothesis-driven research, (3) help ensure that pre-registered analyses will be appropriate for the data, and (4) address difficulties arising from reduced analytic flexibility in pre-registration. For each solution, we provide guidance on implementation for researchers and data guardians. The adoption of these practices can help to protect against researcher bias in secondary data analysis, to improve the robustness of research based on existing data.

Secondary data analysis has the potential to provide answers to science and society’s most pressing questions. An abundance of secondary data exists, including cohort studies, surveys, administrative data (e.g., health records, crime records, census data), financial data, and environmental data, and these can be analysed by researchers in academia, industry, third-sector organisations, and the government. However, secondary data analysis is vulnerable to questionable research practices (QRPs), which can distort the evidence base. These QRPs include p-hacking (i.e., exploiting analytic flexibility to obtain statistically significant results), selective reporting of statistically significant, novel, or “clean” results, and hypothesising after the results are known (HARK-ing, i.e., presenting unexpected results as if they were predicted) [1]. Indeed, findings obtained from secondary data analysis are not always replicable [2, 3], reproducible [4], or robust to analytic choices [5, 6]. Preventing QRPs in research based on secondary data is therefore critical for scientific and societal progress.

A primary cause of QRPs is common cognitive biases that affect the analysis, reporting, and interpretation of data [7–10]. For example, apophenia (the tendency to see patterns in random data) and confirmation bias (the tendency to focus on evidence that is consistent with one’s beliefs) can lead to particular analytical choices and selective reporting of “publishable” results [11–13]. In addition, hindsight bias (the tendency to view past events as predictable) can lead to HARK-ing, so that observed results appear more compelling.

The scope for these biases to distort research outputs from secondary data analysis is perhaps particularly acute, for two reasons. First, researchers now have increasing access to high-dimensional datasets that offer a multitude of ways to analyse the same data [6]. Such analytic flexibility can lead to different conclusions depending on the analytical choices made [5, 14, 15]. Second, current incentive structures in science reward researchers for publishing statistically significant, novel, and/or surprising findings [16]. This combination of opportunity and incentive may lead researchers—consciously or unconsciously—to run multiple analyses and only report the most “publishable” findings.
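To make the cost of analytic flexibility concrete, the sketch below estimates how often at least one of several analyses reaches p < 0.05 when no true effect exists. It is an illustration only: it assumes the analyses are statistically independent (analyses of the same dataset usually are not) and that p-values are uniformly distributed under the null. The function names are hypothetical.

```python
import random


def false_positive_rate(n_analyses, alpha=0.05):
    """Chance that at least one of n independent null analyses has p < alpha."""
    return 1 - (1 - alpha) ** n_analyses


def simulate_false_positive_rate(n_analyses, alpha=0.05, n_studies=10_000, seed=0):
    """Monte Carlo check: keep only the smallest p-value from each simulated study."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Under the null, each p-value is a uniform draw on [0, 1).
        smallest_p = min(rng.random() for _ in range(n_analyses))
        if smallest_p < alpha:
            hits += 1
    return hits / n_studies


# Compare the closed-form rate with the simulated rate for 1, 5 and 20 analyses.
for k in (1, 5, 20):
    print(k, round(false_positive_rate(k), 3), simulate_false_positive_rate(k))
```

Under these assumptions, a single analysis produces a false positive 5% of the time, whereas reporting the best of 20 analyses does so roughly 64% of the time.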

One way to help protect against the effects of researcher bias is to pre-register research plans [17, 18]. This can be achieved by pre-specifying the rationale, hypotheses, methods, and analysis plans, and submitting these to either a third-party registry (e.g., the Open Science Framework [OSF]; https://osf.io/), or a journal in the form of a Registered Report [19]. Because research plans and hypotheses are specified before the results are known, pre-registration reduces the potential for cognitive biases to lead to p-hacking, selective reporting, and HARK-ing [20]. While pre-registration is not necessarily a panacea for preventing QRPs (Table ), meta-scientific evidence has found that pre-registered studies and Registered Reports are more likely to report null results [21–23], smaller effect sizes [24], and be replicated [25]. Pre-registration is increasingly being adopted in epidemiological research [26, 27], and is even required for access to data from certain cohorts (e.g., the Twins Early Development Study [28]). However, pre-registration (and other open science practices; Table ) can pose particular challenges to researchers conducting secondary data analysis [29], motivating the need for alternative approaches and solutions. Here we describe such challenges, before proposing potential solutions to protect against researcher bias in secondary data analysis (summarised in Fig. ).

Researcher bias can negatively impact the validity and reliability of research findings. As researchers, we often have preconceived notions and preferences that can influence how we design studies, collect and analyze data, and interpret results. Minimizing bias is essential for producing high-quality, credible research that has a greater chance of being replicated. In this guide, we provide 10 tips to help researchers avoid bias and conduct more objective research.

What is Researcher Bias?

Researcher bias refers to the influence of a researcher’s personal values, perceptions, and interests on the research process. It often arises from:

Selective observation and recording of data
Improper sampling methods
Leading questions that steer participants towards certain responses
Misinterpretation or skewed analysis of data to align with the researcher’s existing views
Omission or disregard of unexpected findings that contradict the researcher’s hypotheses

Researcher bias decreases the validity of the research and its acceptance by the larger scientific community. That’s why it is critical for researchers to acknowledge their biases and act to minimize them through proper study design, data collection, analysis, and reporting.

10 Tips to Avoid Researcher Bias

Here are 10 best practices researchers should implement to reduce bias in their work:

1. Create a Thorough Research Plan

Develop a detailed protocol covering sampling, data collection tools, and statistical analysis methods before starting the research.
Stick closely to the plan and document any post-hoc changes.

2. Use Proper Sampling Techniques

Simple random sampling or systematic sampling decreases the likelihood of sampling bias.
Ensure the sample is representative of the target population.
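The two sampling techniques named above can be sketched in a few lines. This is a minimal illustration that assumes the sampling frame is an in-memory list of participant IDs; the function names are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th unit (k = population size // n)."""
    if n <= 0 or n > len(population):
        raise ValueError("n must be between 1 and the population size")
    step = len(population) // n
    rng = random.Random(seed)
    start = rng.randrange(step)  # random offset within the first interval
    return [population[start + i * step] for i in range(n)]


participant_ids = list(range(1, 101))  # hypothetical sampling frame of 100 IDs
print(simple_random_sample(participant_ids, 10, seed=42))
print(systematic_sample(participant_ids, 10, seed=42))
```

Fixing the seed makes the draw reproducible, which lets the sampling step itself be documented and checked. Systematic sampling can still introduce bias if the frame is ordered in a repeating pattern that lines up with the step size.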

3. Evaluate Your Hypothesis

Scrutinize your hypothesis for any pre-existing biases or assumptions.
Be open to rejecting your original hypothesis based on the data.

4. Ask General Questions First

Ask open-ended questions before more specific ones during interviews/surveys.
Avoid leading questions that nudge participants towards a response.

5. Categorize Topics Beforehand

Develop a coding scheme to categorize qualitative data into themes beforehand.
Apply the coding scheme consistently to all data.

6. Analyze Context When Summarizing

Consider the original contextual meaning when summarizing qualitative data.
Use participants’ own words rather than paraphrasing.

7. Peer Review Analysis

Have colleagues review coded data and themes to check for selective interpretation of data.
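One way to make such a review checkable is to have a second coder apply the scheme independently and then quantify agreement. The sketch below uses Cohen's kappa, a common chance-corrected agreement statistic; the guide itself does not prescribe a particular measure, so treat the choice as an assumption. The theme labels and function name are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders who labelled the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(codes_a)
    # Proportion of items on which the two coders assigned the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


coder_a = ["cost", "cost", "trust", "trust", "access", "access"]
coder_b = ["cost", "cost", "trust", "access", "access", "access"]
print(cohens_kappa(coder_a, coder_b))
```

Low agreement does not show which coder is right, but it does flag excerpts where interpretation may be driven by individual expectations and should be discussed.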

8. Report All Findings

Completely report all results, including unexpected findings that run counter to hypotheses.
Do not omit non-significant results.

9. Use Blinding Techniques

Blind participants and researchers to which intervention each participant received, to avoid performance bias.
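A minimal sketch of how allocation might be concealed is shown below. It assumes a simple two-arm design in which participants are randomised to neutral group codes, while the key linking codes to arms is held by someone outside the study team. The function name, arm names, and group labels are hypothetical.

```python
import random


def blinded_allocation(participant_ids, arms=("intervention", "control"), seed=None):
    """Randomise participants to neutral group codes; return the unblinding key separately."""
    rng = random.Random(seed)
    # Neutral labels hide which group corresponds to which arm.
    codes = [f"group_{i + 1}" for i in range(len(arms))]
    shuffled_arms = list(arms)
    rng.shuffle(shuffled_arms)
    key = dict(zip(codes, shuffled_arms))  # to be stored away from the study team
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal the shuffled participants round-robin so group sizes stay near-equal.
    allocation = {pid: codes[i % len(codes)] for i, pid in enumerate(ids)}
    return allocation, key


allocation, key = blinded_allocation(range(1, 11), seed=7)
print(allocation)  # this is all that researchers and participants would see
```

Only `allocation` would be shared during data collection and analysis; `key` would be opened once the analysis is locked.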

10. Share Limitations Openly

Discuss limitations of sampling, methodology, and analysis in published reports.
Outline steps taken to minimize bias.

Signs of Potential Researcher Bias

Here are some common red flags that may indicate researcher bias is present:

Using a small, non-random sample that lacks representation
Asking leading, closed-ended questions
Focusing observations/interviews to align with expected outcomes
Selectively reporting or omitting data that contradicts hypotheses
Ignoring outliers or negative cases
Overstating positive findings and understating negative findings
Making unsupported conclusions that extend beyond data

Importance of Minimizing Bias

While complete elimination of bias is very difficult, researchers have an ethical responsibility to minimize bias. Here’s why it matters:

Helps produce more rigorous, replicable research
Increases credibility and acceptance of findings
Reduces errors and improves accuracy of conclusions
Ensures reporting of complete results instead of selective outcomes
Provides greater objectivity unaffected by researcher preferences
Upholds research integrity and prevents questionable research practices

By being aware of their own biases, following protocol, scrutinizing results, and engaging in peer-review, researchers can reduce bias substantially and uphold research excellence.

Summary

Researcher bias is a significant threat to valid and reliable research. Using proper sampling techniques, asking non-leading questions, blinding, developing categorical schemes, peer-reviewing data, and reporting all results can help minimize bias. We must acknowledge our own biases, remain open to unexpected findings, and take concrete steps to avoid bias through the entire research process if we want to produce credible, replicable research untainted by subjective researcher preferences.

Challenge: Research may not be hypothesis-driven

Observational research arguably does not need to have a hypothesis to benefit from pre-registration. For studies that are descriptive or focused on estimation, we recommend pre-registering research questions, analysis plans, and criteria for interpretation. Analytic flexibility will be limited by pre-registering specific research questions and detailed analysis plans, while post hoc interpretation will be limited by pre-specifying criteria for interpretation [50]. The potential for HARK-ing will also be minimised because readers can compare the published study to the original pre-registration, where a-priori hypotheses were not specified.

Detailed guidance on how to pre-register research questions and analysis plans for secondary data is provided in Van den Akker’s [29] tutorial. To pre-specify conditions for interpretation, it is important to anticipate – as much as possible – all potential findings, and state how each would be interpreted. For example, suppose that a researcher aims to test a causal relationship between X and Y using a multivariate regression model with longitudinal data. Assuming that all potential confounders have been fully measured and controlled for (albeit a strong assumption) and statistical power is high, three broad sets of results and interpretations could be pre-specified. First, an association between X and Y that is similar in magnitude to the unadjusted association would be consistent with a causal relationship. Second, an association between X and Y that is attenuated after controlling for confounders would suggest that the relationship is partly causal and partly confounded. Third, a minimal, non-statistically significant adjusted association would suggest a lack of evidence for a causal effect of X on Y. Depending on the context of the study, criteria could also be provided on the threshold (or range of thresholds) at which the effect size would justify different interpretations [51], be considered practically meaningful, or the smallest effect size of interest for equivalence tests [52]. While researcher biases might still affect the pre-registered criteria for interpreting findings (e.g., toward over-interpreting a small effect size as meaningful), this bias will at least be transparent in the pre-registration.
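The three pre-specified interpretations in the example above could be written down as an explicit decision rule before the data are analysed. The sketch below is one hypothetical way to do so: the `alpha` and `tolerance` thresholds are illustrative assumptions rather than recommended values, and the rule simplifies "minimal, non-statistically significant" to a significance check alone.

```python
def interpret_adjusted_association(unadjusted, adjusted, adjusted_p,
                                   alpha=0.05, tolerance=0.2):
    """Map regression results onto interpretations fixed in the pre-registration."""
    if unadjusted == 0:
        raise ValueError("unadjusted association must be non-zero")
    if adjusted_p >= alpha:
        return "lack of evidence for a causal effect of X on Y"
    # Share of the unadjusted association that remains after adjustment.
    ratio = adjusted / unadjusted
    if abs(ratio - 1) <= tolerance:
        return "consistent with a causal relationship"
    if 0 < ratio < 1 - tolerance:
        return "partly causal and partly confounded"
    # Sign reversals or strengthened associations were not anticipated here.
    return "not covered by the pre-specified criteria"


print(interpret_adjusted_association(0.50, 0.48, 0.001))
print(interpret_adjusted_association(0.50, 0.25, 0.010))
print(interpret_adjusted_association(0.50, 0.02, 0.600))
```

The final branch shows why it helps to anticipate as many outcomes as possible: any result that falls outside the registered criteria has to be flagged as such rather than interpreted after the fact.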

Challenge: Pre-registered analyses are not appropriate for the data

With pre-registration, there is always a risk that the data will violate the assumptions of the pre-registered analyses [17]. For example, a researcher might pre-register a parametric test, only for the data to be non-normally distributed. However, in secondary data analysis, the extent to which the data shape the appropriate analysis can be considerable. First, longitudinal cohort studies are often subject to missing data and attrition. Approaches to deal with missing data (e.g., listwise deletion; multiple imputation) depend on the characteristics of missing data (e.g., the extent and patterns of missingness [34]), and so pre-specifying approaches to dealing with missingness may be difficult, or extremely complex. Second, certain analytical decisions depend on the nature of the observed data (e.g., the choice of covariates to include in a multiple regression might depend on the collinearity between the measures, or the degree of missingness of different measures that capture the same construct). Third, much secondary data (e.g., electronic health records and other administrative data) were never collected for research purposes, so can present several challenges that are impossible to predict in advance [35]. These issues can limit a researcher’s ability to pre-register a precise analytic plan prior to accessing secondary data.

How can we reduce bias in qualitative research?

To reduce bias and deliver better research, let’s explore its primary sources. By focusing on the human elements of the research process and looking at the nine core types of bias (driven by the respondent, the researcher, or both), we can minimize the potential impact of bias on qualitative research.

How to avoid researcher bias in a research study?

By framing questions using certain strategies, you can minimize the chance of introducing bias into a research study. In this article, we define what researcher bias is, describe the different types, and provide some key steps for avoiding it during the research process.

Is it hard to avoid bias in quantitative research?

“Qualitative research relies more on the experience and judgment of the researcher. Also, the type of data collected is subjective and unique to the person or situation. So it is much harder to avoid bias than in quantitative research.” “Are there ways to avoid bias?” “A good start is to recognize that bias exists in all research.”

What is bias in research?

Bias, commonly understood to be any influence that provides a distortion in the results of a study (Polit & Beck, 2014), is a term drawn from the quantitative research paradigm.