Income and Family Background: Are we Using the Right Models?

2019 
Social scientists have long been interested in the relationship between parental factors and later child income. Finding the best characterization of this relationship for the question at hand is, however, fraught with choices. In this paper we use machine learning methods to assess the ‘completeness’ of one popular modelling approach. Here, completeness refers to how well the model summarizes the total predictive relationship between multiple parental factors and a single child outcome. Machine learning methods enable us to depart from functional form assumptions, allowing flexible interactions between a large set of possible parental factors. Using our most flexible complete model as a benchmark, we assess the popular ‘rank-rank’ model relating parent and child incomes. Applying our approach to high-quality Norwegian administrative data, we demonstrate that the rank-rank model explains 68% of the total explainable variation in child income rank, based on a broad set of potential parental factors entering a neural network. Parental wealth and parental education explain the majority of the remaining explainable variation. For an extremely tractable model, we consider this to be a relatively high level of completeness. In light of our country-wide estimates, we explore how this measure of completeness varies across regions of Norway, finding patterns broadly similar to those found at the national level. Our results imply that comparisons of regions based on rank-rank mobility measures may indeed reflect differences in broader notions of equality of opportunity.
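To make the notion of completeness concrete, the following is a minimal sketch on simulated data. It assumes one plausible reading of the abstract: completeness is the out-of-sample R² of the rank-rank model divided by the out-of-sample R² of the most flexible benchmark. The abstract does not state the exact formula, so this ratio is an assumption. The benchmark in the paper is a neural network; here an ordinary least squares fit on several parental ranks stands in for it to keep the example dependency-free. All variable names, coefficients, and the data-generating process are hypothetical.

```python
import numpy as np


def to_rank(x):
    """Convert values to percentile ranks in (0, 1]."""
    order = np.argsort(np.argsort(x))
    return (order + 1) / len(x)


def fit_ols(X, y):
    """Least squares fit with an intercept; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predictions from a coefficient vector produced by fit_ols."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Simulated (hypothetical) data: child income depends on parental income,
# wealth, and education. None of these numbers come from the paper.
rng = np.random.default_rng(0)
n = 10_000
parent_income = rng.normal(size=n)
parent_wealth = 0.5 * parent_income + rng.normal(size=n)
parent_educ = 0.3 * parent_income + rng.normal(size=n)
child_income = (
    0.4 * parent_income
    + 0.2 * parent_wealth
    + 0.2 * parent_educ
    + rng.normal(size=n)
)

child_rank = to_rank(child_income)
X_full = np.column_stack(
    [to_rank(parent_income), to_rank(parent_wealth), to_rank(parent_educ)]
)
X_rank = X_full[:, :1]  # rank-rank model: parental income rank only

# Fit on one half, evaluate on the other, so both R^2 values are out-of-sample.
train = np.arange(n) < n // 2
test = ~train

coef_rank = fit_ols(X_rank[train], child_rank[train])
coef_flex = fit_ols(X_full[train], child_rank[train])  # stand-in for the neural network

r2_rank = r_squared(child_rank[test], predict_ols(coef_rank, X_rank[test]))
r2_flex = r_squared(child_rank[test], predict_ols(coef_flex, X_full[test]))

# Assumed definition: share of the benchmark's explained variation
# that the rank-rank model already captures.
completeness = r2_rank / r2_flex
print(f"R2 rank-rank: {r2_rank:.3f}, R2 benchmark: {r2_flex:.3f}, "
      f"completeness: {completeness:.2f}")
```

Under this reading, a completeness of 0.68 would mean the single-regressor rank-rank model recovers 68% of what the flexible benchmark can predict from the full set of parental factors.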