Data Generating Process (DGP)
True wage equation: \(\ln W_i = \alpha + \beta\, E_i + \gamma\, A_i + \delta\, S_i + \varepsilon_i\), \(\varepsilon_i \sim \mathcal{N}(0,\sigma_\varepsilon^2)\)
Optimal education (from the FOC of \(\max_{E_i}\; \ln W_i - C_i\), with cost \(C_i = T E_i^2 S_i^2 / A_i\)):
\(E_i^* = \dfrac{\beta\, A_i}{2\,T\, S_i^2}\)
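Spelled out, the first-order condition can be worked through as follows. This sketch assumes \(T > 0\) and \(A_i > 0\) (not stated explicitly in the source, but needed for an interior maximum) and that the individual takes \(A_i\), \(S_i\), and \(\varepsilon_i\) as given:
\[
\frac{\partial}{\partial E_i}\left[\alpha + \beta E_i + \gamma A_i + \delta S_i + \varepsilon_i - \frac{T E_i^2 S_i^2}{A_i}\right]
= \beta - \frac{2\,T\,S_i^2}{A_i}\,E_i = 0
\;\;\Longrightarrow\;\;
E_i^* = \frac{\beta\, A_i}{2\,T\, S_i^2}
\]
The second derivative, \(-2\,T\,S_i^2 / A_i\), is negative under these assumptions, so \(E_i^*\) is a maximum. For \(\beta > 0\), \(E_i^*\) rises with \(A_i\) and falls with \(S_i\).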
Misspecified Mincer regression (OLS): \(\ln W_i = \hat\alpha + \hat\beta_{\text{OLS}}\, E_i + u_i\)
OVB decomposition:
\(\hat\beta_{\text{OLS}} \;\xrightarrow{p}\; \beta \;+\; \underbrace{\gamma \cdot \dfrac{\text{Cov}(E,A)}{\text{Var}(E)}}_{\text{ability bias}\;(+)} \;+\; \underbrace{\delta \cdot \dfrac{\text{Cov}(E,S)}{\text{Var}(E)}}_{\text{skills bias}\;(-)}\)
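Key insight: The empirical bias \((\hat\beta_{\text{OLS}} - \beta)\) computed from the simulation should match the OVB formula decomposition up to sampling noise (the sample term \(\widehat{\text{Cov}}(E,\varepsilon)/\widehat{\text{Var}}(E)\), which shrinks toward zero as the sample grows), confirming that the bias is due entirely to the omitted variables \(A\) and \(S\).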
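The check described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the simulation behind the panels: the parameter values, the uniform distributions for \(A\) and \(S\), and the function name `simulate_ovb` are all assumptions made for the example, since the source does not state them.

```python
import numpy as np


def simulate_ovb(n=100_000, alpha=1.0, beta=0.08, gamma=0.05, delta=0.10,
                 T=0.01, sigma_eps=0.3, seed=0):
    """Simulate the DGP and decompose the OLS bias.

    All default values and the distributions of A and S are hypothetical.
    """
    rng = np.random.default_rng(seed)

    # Assumed distributions (not given in the source): independent uniforms.
    A = rng.uniform(1.0, 2.0, n)   # ability
    S = rng.uniform(1.0, 2.0, n)   # skills

    # Optimal education from the first-order condition.
    E = beta * A / (2.0 * T * S**2)

    # True wage equation.
    eps = rng.normal(0.0, sigma_eps, n)
    ln_w = alpha + beta * E + gamma * A + delta * S + eps

    def cov(x, y):
        # Sample covariance with the same normalisation (1/n) everywhere.
        return np.mean((x - x.mean()) * (y - y.mean()))

    var_e = cov(E, E)

    # Slope of the misspecified regression of ln W on E alone.
    beta_ols = cov(E, ln_w) / var_e

    # Terms of the OVB decomposition, using sample moments.
    ability_bias = gamma * cov(E, A) / var_e
    skills_bias = delta * cov(E, S) / var_e
    noise_term = cov(E, eps) / var_e   # vanishes only as n grows

    return {
        "beta_ols": beta_ols,
        "empirical_bias": beta_ols - beta,
        "ability_bias": ability_bias,
        "skills_bias": skills_bias,
        "formula_bias": ability_bias + skills_bias,
        "noise_term": noise_term,
    }


result = simulate_ovb()
for name, value in result.items():
    print(f"{name:>14s}: {value: .6f}")
```

In any finite sample, `empirical_bias` equals `formula_bias + noise_term` up to floating-point error. The gap between `empirical_bias` and `formula_bias` is therefore the sampling noise alone, and it shrinks as `n` increases.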
[Figure panels: Panel 1, Education vs Log Wages with OLS Lines; Panel 2, Bias Decomposition (Waterfall); Panel 3a, Ability (A) Distribution; Panel 3b, Skills (S) Distribution; Panel 3c, Optimal Education (E*) Distribution]
Panel 4: Summary Statistics & Bias Comparison
| Statistic | Value |
|---|---|