Endogeneity (econometrics)

In econometrics, endogeneity broadly refers to situations in which an explanatory variable is correlated with the error term.

In its simplest form, endogeneity arises when a variable used to explain an outcome is itself influenced by that outcome. For example, education can affect income, but income can also affect how much education a person obtains. In such cases a regression of income on education cannot separate the two directions of influence, so the estimated effect of education is unreliable.

The concept originates from simultaneous equations models, in which one distinguishes variables whose values are determined within the economic model (endogenous) from those that are predetermined (exogenous).

Ignoring simultaneity in estimation leads to biased and inconsistent ordinary least squares estimators, because it violates the exogeneity assumption of the Gauss–Markov theorem. The problem is often overlooked in non-experimental research, which limits the validity of causal inference and of any policy recommendations drawn from it.
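The inconsistency can be illustrated with a small simulation. The sketch below is not taken from any particular study: it assumes a hypothetical two-equation system y = beta*x + u and x = gamma*y + v with independent standard normal errors, and the parameter values (beta = gamma = 0.5), the seed, and the function names are illustrative choices. Under these assumptions the least squares slope of y on x converges to (gamma + beta) / (gamma^2 + 1) = 0.8 rather than to beta = 0.5.

```python
import numpy as np


def simulate_simultaneous(n, beta, gamma, rng):
    """Draw (x, y) from the assumed system y = beta*x + u, x = gamma*y + v.

    The two equations are solved jointly (reduced form), so x depends on u.
    """
    u = rng.normal(size=n)  # structural error of the y equation
    v = rng.normal(size=n)  # structural error of the x equation
    denom = 1.0 - beta * gamma
    x = (gamma * u + v) / denom
    y = (beta * v + u) / denom
    return x, y


def ols_slope(x, y):
    """Slope from a simple least squares regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


rng = np.random.default_rng(0)  # fixed seed, illustrative only
x, y = simulate_simultaneous(200_000, beta=0.5, gamma=0.5, rng=rng)

slope = ols_slope(x, y)
print(f"true beta = 0.5, OLS slope = {slope:.3f}")
```

Increasing the sample size does not move the estimate toward 0.5, which is what distinguishes inconsistency from ordinary sampling noise.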

A common remedy for endogeneity is instrumental variable estimation, which yields consistent estimators by using additional variables (instruments) that are correlated with the endogenous explanatory variable but uncorrelated with the error term.
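A minimal sketch of the idea, assuming a single endogenous regressor x, a single instrument z, and a hypothetical data-generating process chosen purely for illustration (true coefficient 2.0, an error u that also enters x). With one instrument, the instrumental variable estimator reduces to the ratio Cov(z, y) / Cov(z, x); the function names are not from any library.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple least squares regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def iv_slope(z, x, y):
    """Simple IV estimator with one instrument: Cov(z, y) / Cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))


rng = np.random.default_rng(1)  # fixed seed, illustrative only
n = 200_000

z = rng.normal(size=n)                # instrument: independent of u by construction
u = rng.normal(size=n)                # error term of the outcome equation
v = 0.8 * u + rng.normal(size=n)      # part of x that is correlated with u
x = 1.0 * z + v                       # endogenous regressor, relevant instrument
y = 2.0 * x + u                       # outcome, assumed true coefficient 2.0

ols = ols_slope(x, y)
iv = iv_slope(z, x, y)
print(f"OLS = {ols:.3f}, IV = {iv:.3f}, true = 2.000")
```

In this setup the least squares slope is pushed above 2.0 because x and u are positively correlated, while the instrumental variable estimate recovers the assumed coefficient. The result depends on both conditions holding: if z were only weakly related to x, or correlated with u, the estimator would not be reliable.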

Besides simultaneity, correlation between explanatory variables and the error term can arise when an unobserved or omitted variable is confounding both independent and dependent variables, or when independent variables are measured with error.
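Both mechanisms can be sketched with simulated data. The example below assumes hypothetical linear models with unit coefficients and standard normal variables; these values, the seed, and the variable names are illustrative assumptions. In the first case a confounder w is omitted, and the slope converges to 1 + Cov(x, w) / Var(x) = 1.5. In the second case the regressor is observed with classical measurement error, and the slope is attenuated toward Var(x) / (Var(x) + Var(m)) = 0.5.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple least squares regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


rng = np.random.default_rng(2)  # fixed seed, illustrative only
n = 200_000

# Case 1: omitted variable. w affects both x and y but is left out of the
# regression, so it ends up in the error term and is correlated with x.
w = rng.normal(size=n)
x = w + rng.normal(size=n)
y = 1.0 * x + 1.0 * w + rng.normal(size=n)
slope_omitted = ols_slope(x, y)

# Case 2: measurement error. The outcome depends on x_true, but only the
# noisy measurement x_obs = x_true + m is available to the researcher.
x_true = rng.normal(size=n)
y2 = 1.0 * x_true + rng.normal(size=n)
x_obs = x_true + rng.normal(size=n)
slope_attenuated = ols_slope(x_obs, y2)

print(f"omitted variable:  slope = {slope_omitted:.3f} (true 1.0)")
print(f"measurement error: slope = {slope_attenuated:.3f} (true 1.0)")
```

The two cases bias the estimate in different directions here: the omitted confounder inflates it, while measurement error in the regressor shrinks it toward zero.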