A Bayesian cohort model for estimating out-of-school rates and populations

Abstract: The out-of-school rate is a critical indicator for monitoring global progress towards universal education. It quantifies the population of children and youth excluded from each level of the education system. As with many education indicators, historical out-of-school reporting has relied exclusively on imperfect administrative data. Recently, the education community has turned to survey data as a supplement to administrative data to overcome its gaps and weaknesses. Producing such consolidated estimates of the out-of-school rate globally, however, is challenging because of the diversity in enrolment patterns, systematic differences in the nature and reliability of administrative and survey-based data, and the heavy presence of invalid administrative observations resulting from enrolment counts that exceed the corresponding population estimates. In this paper, we introduce a cohort-based Bayesian hierarchical model to address these challenges and produce complete time series of out-of-school rates for 192 countries. The model uses a flexible spline-based process for the underlying cohort out-of-school rate curves, which are smoothed through cohort progression and over time. Observations are related to these values using a dual likelihood setup in which each data source has distinct bias and variance components. The administrative side includes a structure that propagates the uncertainty information contained in invalid data to avoid understating uncertainty. Validation exercises and sensitivity analyses suggest that the model is reasonably well calibrated and offers a material improvement over simpler approaches. The model is currently used by UNESCO to monitor out-of-school rates for all countries with available data.
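
As a minimal illustration of why administrative observations can be invalid, the sketch below computes an administrative out-of-school rate as one minus the ratio of enrolment to population, and flags cases where enrolment exceeds the population estimate. The function name and the example figures are hypothetical and are not taken from the paper; the proposed model does not discard such observations but propagates the uncertainty information they contain.

```python
from typing import Optional


def admin_out_of_school_rate(enrolled: float, population: float) -> Optional[float]:
    """Return 1 - enrolled / population, or None if the result is invalid.

    Hypothetical helper for illustration only: an observation is treated as
    invalid when enrolment exceeds the population estimate, which would
    imply a negative out-of-school rate.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    rate = 1.0 - enrolled / population
    return rate if rate >= 0.0 else None


# Made-up example figures: (label, enrolled, population).
observations = [
    ("A", 900.0, 1000.0),   # valid: enrolment below population
    ("B", 1050.0, 1000.0),  # invalid: enrolment exceeds population
]

results = {label: admin_out_of_school_rate(e, p) for label, e, p in observations}

for label, rate in results.items():
    status = "invalid" if rate is None else f"{rate:.3f}"
    print(f"{label}: {status}")
```

Simply flagging and dropping such cases, as this sketch does, would throw away information; the abstract's point is that the administrative likelihood is structured to retain it.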

Additional details on the proposed model can be found here.