A Bayesian Cohort Model for Estimating SDG Indicator 4.1.4: Out-of-School Rates

Abstract: The out-of-school rate is an essential component of the Sustainable Development Goal (SDG) 4 monitoring framework used to track progress towards universal access to education. Producing estimates of out-of-school rates globally is challenging because of the diversity in enrolment patterns, systematic differences in the nature and reliability of administrative and survey-based data, and the heavy presence of invalid administrative observations, which arise when enrolment counts exceed the corresponding population estimates. In this paper we introduce a cohort-based Bayesian hierarchical model to address these challenges and produce complete time series of out-of-school rates for 192 countries. The model uses a flexible spline-based process for the underlying cohort out-of-school rate curves, which are smoothed through cohort progression and over time. Observations are related to these values using a dual likelihood setup in which each data source has distinct bias and variance components. The administrative side includes a structure that propagates the uncertainty information contained in invalid data to avoid understating uncertainty. Validation exercises and sensitivity analyses indicate that the model is well calibrated and offers a material improvement over simpler approaches. The model is currently used by the United Nations to monitor out-of-school rates for all countries with available data.
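To make the notion of an invalid administrative observation concrete, the following minimal sketch computes an administrative out-of-school rate as one minus the ratio of enrolment to population, and flags the cases where enrolment exceeds the population estimate. The function name, the record layout, and the figures are hypothetical and chosen for illustration only; they are not taken from the paper or its data. The sketch only detects invalid observations, whereas the model described above retains the uncertainty information they carry.

```python
from typing import Optional


def admin_out_of_school_rate(enrolled: float, population: float) -> Optional[float]:
    """Return 1 - enrolled / population, or None if the observation is invalid.

    An observation is treated as invalid when the enrolment count exceeds
    the population estimate, which would imply a negative out-of-school rate.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    rate = 1.0 - enrolled / population
    return rate if rate >= 0.0 else None


# Hypothetical figures, for illustration only.
observations = [
    {"year": 2015, "enrolled": 90_000, "population": 100_000},
    {"year": 2016, "enrolled": 104_000, "population": 101_000},  # enrolment > population
]

# Map each year to its rate, with None marking an invalid observation.
rates = {
    obs["year"]: admin_out_of_school_rate(obs["enrolled"], obs["population"])
    for obs in observations
}
```

Here the 2016 record is flagged because its enrolment count is larger than its population estimate; a simple pipeline might drop such a record, which is the loss of information the administrative likelihood in the abstract is designed to avoid.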

Additional details on the proposed model can be found here.