
PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, ECONOMICS AND FINANCE. (c) Oxford University Press USA, 2016. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 19 June 2018

The Effect of Education on Health and Mortality: A Review of Experimental and Quasi-Experimental Evidence

Summary and Keywords

Education is strongly associated with better health and longer lives. However, the extent to which education causes health and longevity is widely debated. We develop a human capital framework to structure the interpretation of the empirical evidence and review evidence on the causal effects of education on mortality and its two most common preventable causes: smoking and obesity. We focus attention on evidence from randomized controlled trials, twin studies, and quasi-experiments. There is no convincing evidence of an effect of education on obesity, and the effects on smoking are only apparent when schooling reforms affect individuals’ track or their peer group, but not when they simply increase the duration of schooling. An effect of education on mortality exists in some contexts but not in others and seems to depend on (i) gender, (ii) the labor market returns to education, (iii) the quality of education, and (iv) whether education affects the quality of individuals’ peers.

Keywords: education, health, mortality, health behaviors, health disparities, causality


More educated individuals live longer, healthier lives. A large literature has documented substantial associations between education and mortality, health (self-reported health, obesity, etc.), and health behaviors (smoking, excessive drinking, exercise, preventive care use, etc.). Although the strength of these relationships varies, such associations have been observed in many countries and time periods. Moreover, the gap between the educated and less educated appears to be growing (Pappas, Queen, Hadden, & Fisher, 1993; Meara, Richards, & Cutler, 2008). These disparities in health are large. For example, in the United States, at age 25, those with more than a college degree can expect to live up to seven years longer than those without a college degree (Meara et al., 2008; Hummer & Hernandez, 2013). Substantial theoretical and empirical research has been devoted to understanding the causes of these disparities.

The earliest theoretical contribution was made by Michael Grossman, who developed a theory in which health is a form of capital that individuals can produce (Grossman, 1972). Grossman’s seminal theory laid the foundation for a substantial body of empirical and theoretical research on the determinants of health. It was also the first work to discuss the various ways in which education might affect health and longevity. Here, a theoretical framework based on this early model but inspired by recent advances in theory is developed to investigate the relationship between human capital, schooling, mortality, and health behaviors. This theory updates the early Grossman model in some important aspects. First, health, skills, health behavior, schooling, and longevity are all treated as endogenous, and they are allowed to vary as a function of initial endowments, such as genes and parental characteristics. Second, inspired by the seminal work of Heckman and others (e.g., Heckman, Stixrud, & Urzua, 2006; Cunha & Heckman, 2007), a distinction is made between skills developed in school and time spent in school. Finally, the role of laws and institutions that affect schooling is explicitly modeled, enabling predictions to be made about the effects of compulsory schooling and other education policies on health outcomes. Changes in such policies have been widely exploited in the empirical literature to estimate the causal effects of education on health.

A very large and recent literature has investigated whether the association between education and health is causal. Does going to school increase longevity? Does it improve health behaviors? The growing availability of large data sets, in combination with a revolution in empirical methods addressing causality, has produced much new evidence on this old question. This review discusses papers written since 2005 that investigate how education affects (i) mortality and (ii) smoking and obesity, the two most important determinants of preventable death and disease in the United States and most developed countries (Mokdad, Marks, Stroup, & Gerberding, 2004). Attention is given only to papers assessing causality using randomized controlled trials, twin differences, or quasi-experiments. The vast majority of empirical studies use data from the developed world, which should be kept in mind when interpreting results.

The findings are not uniform. While discrepancies in findings across studies can result from differences or flaws in the empirical methodologies applied, there is substantial evidence of genuine heterogeneity in the estimated effects of education. The focus here is on understanding this heterogeneity, in light of the theoretical model and of findings in this large literature. A formal meta-analysis was not conducted, but broad conclusions can nevertheless be drawn from the existing empirical evidence. No convincing evidence of an effect of education on obesity is found. There appear to be effects of education on smoking and mortality, but these are observed only in some contexts and not in others. Among other factors, the effects seem to differ by (i) gender, (ii) the labor market returns to education, (iii) the quality of education, and (iv) whether education affects the quality of individuals’ peers. These findings are discussed extensively and suggestions for future research are presented.

A Theory of Schooling, Skills, and Health

Theoretical Formulation

The theory discussed draws substantially on Galama and Van Kippersluis (2015, 2018). Consider a model in which we decompose human capital into two components: health H(t) and skill θ(t), where skill includes both cognitive and non-cognitive skills (Cunha & Heckman, 2007; Conti, Heckman, & Urzua, 2010).1 Although health cannot directly be bought, it can be produced by purchasing inputs such as food and medical care. Similarly, skills can be enhanced by investments, for example by spending time in school. As in Grossman’s 1972 model, individuals make optimal lifetime decisions concerning skill and health inputs. These choices determine skills and health, and thus affect labor-market outcomes and consumption choices, which in turn determine health and mortality.

Individuals derive utility from consumption, health, and not being in school, and make choices in order to maximize their lifetime utility


where Z(t) is the set of choice variables, t=0 corresponds to the age at which individuals start school,2 S denotes years of schooling, T denotes the age of death, and ρ is a subjective discount factor. Individual preferences are represented by a concave utility function U[.] that increases with consumption goods and services XC(t) and health H(t), and decreases with being in school cS(t).
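Based on the definitions above, the objective (1) can be sketched in LaTeX as follows. This is a reconstruction from the surrounding text, not the authors' typeset equation; in particular, the exponential discounting term is an assumption about how ρ enters.

```latex
% Sketch of objective (1), assuming exponential discounting at rate \rho
\begin{equation}
  \max_{Z(t)} \int_{0}^{T} U\!\left[X_C(t),\, H(t),\, c_S(t)\right] e^{-\rho t}\, dt
  \tag{1}
\end{equation}
```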

Individuals have access to assets and labor income, and must use them over their lifetime to pay for the goods and services they consume. In addition, they can also spend time producing skills and improving their health. The intertemporal budget constraint for assets A(t) is given by



Assets A(t) (equations 2 and 3) provide a return r (the rate of return on capital), increase with income Y(t), and decrease with purchases of skill inputs Xθ(t), health inputs XH(t), and consumption goods XC(t), bought at prices pθ(t), pH(t), and pC(t), respectively. During the schooling period (up to S) individuals pay a tuition fee pS(t) but receive a state (or parental) transfer that is a fraction λS(t) of tuition, where both are exogenously determined. Thus, [1 − λS(t)]pS(t) is the effective net-of-subsidy tuition rate. Individuals incur a fine f*(t)I[t ≤ S_] if they enter the labor market before the minimum school-leaving age S_, where I[t ≤ x] is the indicator function, which is 1 for 0 ≤ t ≤ x and zero otherwise.3
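One way to write the asset dynamics described in this paragraph is sketched below. The assignment of the tuition term to the schooling phase and of the fine to the post-schooling phase is a reading of the text, and the exact layout of the original equations (2) and (3) is an assumption.

```latex
% Sketch of the budget constraint: (2) while in school, (3) after leaving school
\begin{align}
  \frac{\partial A}{\partial t} &= rA(t) + Y(t) - p_\theta(t)X_\theta(t) - p_H(t)X_H(t)
      - p_C(t)X_C(t) - \left[1-\lambda_S(t)\right]p_S(t),
      & 0 \le t \le S, \tag{2} \\
  \frac{\partial A}{\partial t} &= rA(t) + Y(t) - p_\theta(t)X_\theta(t) - p_H(t)X_H(t)
      - p_C(t)X_C(t) - f^{*}(t)\,\mathbb{I}\!\left[t \le \underline{S}\right],
      & S < t \le T. \tag{3}
\end{align}
```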

Income Y(t) ≡ Y[t,S,θ(t),H(t);S_] consists of the product of the wage rate w[t,S,θ(t);S_] and the time spent working τw[H(t)]



We further implicitly assume that skills θ(t) and the schooling duration S (largely) determine the wage rate, while health capital H(t) (largely) determines the time spent working. Wages depend explicitly on the exogenous minimum school-leaving age S_. For example, labor laws might impose a fine on those who employ individuals younger than S_, so that wages are lower below that age if the law is enforced.

When an individual is still in school (equation 4), years of schooling is measured by t (i.e., the wage rate is not a function of optimal years of schooling S). After schooling has been completed (equation 5), the wage rate w(t) is an (increasing) function of years of completed schooling S and the individual’s level of skill θ(t).4 While schooling S is often used as a proxy for skills θ(t), there are reasons to believe that schooling S may capture benefits beyond skill formation. For example, the effect of schooling or college graduation S on wages may reflect a potential signaling effect as skills, perseverance, and work ethic are difficult to observe by the employer (Arrow, 1973; Spence, 1973; Lang & Kropp, 1986; Bedard, 2001). Conversely, while time in school may increase skill, it does not always do so. But employers observe S, whereas skills are at best partially observed.
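The two income equations referred to here can be sketched as below. This is a reconstruction under stated assumptions: income is the product of the wage rate and working time as described above, and replacing S by t in (4) is one reading of the statement that years of schooling are measured by t while in school.

```latex
% Sketch of income: (4) while in school, (5) after schooling is completed
\begin{align}
  Y(t) &= w\!\left[t,\, t,\, \theta(t);\, \underline{S}\right]\,\tau_w\!\left[H(t)\right],
      & t \le S, \tag{4} \\
  Y(t) &= w\!\left[t,\, S,\, \theta(t);\, \underline{S}\right]\,\tau_w\!\left[H(t)\right],
      & t > S. \tag{5}
\end{align}
```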

The time constraint during the schooling and working phases of life is given by





The total available time Ω (in, say, a day) is divided between work τw(t), time spent in school τS(t) or devoted to skill investment τθ(t), time investments in health τH(t), and time lost due to illness s[H(t)] (assumed to be a decreasing function of health). A fixed amount of time τS(t) is spent in school; it can be used productively for skill investments τθ(t) but also unproductively τL(t) (see 8). The “time lost” τL(t) is the difference between compulsory hours spent in school and the desired hours of skill investment, τS(t) − τθ(t). While unproductive in terms of skill formation, we allow τL(t) to decrease the disutility of schooling, cS[t;τL(t)]. Intuitively, some children experience low returns and high costs of investing in their skill. They may prefer “staring out of the window” τL(t) over paying attention τθ(t), for instance if the teacher is bad. This distinction is important because it implies that skill does not automatically increase with longer (compulsory) schooling duration. These intuitive constructs allow us to better understand heterogeneity in responses to policy changes aimed at encouraging schooling.
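The time allocation just described can be sketched as follows. The split into a schooling-phase and a working-phase constraint follows the text; writing (8) with a non-negativity condition on the time lost is an assumption.

```latex
% Sketch of the time constraints: (6) schooling phase, (7) working phase, (8) time lost in school
\begin{align}
  \Omega &= \tau_w(t) + \tau_S(t) + \tau_H(t) + s\!\left[H(t)\right], & t \le S, \tag{6} \\
  \Omega &= \tau_w(t) + \tau_\theta(t) + \tau_H(t) + s\!\left[H(t)\right], & t > S, \tag{7} \\
  \tau_L(t) &= \tau_S(t) - \tau_\theta(t) \ge 0, & t \le S. \tag{8}
\end{align}
```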

The levels (stocks) of skills θ(t) and health H(t) evolve over time. Individuals can increase them through investments (using production functions gθ(t) and gH(t)), but they also depreciate, according to the following equations



While in school, individuals invest in skill capital gθ(t) through outlays Xθ(t) (e.g., books) and time investments τθ(t) (e.g., paying attention in class, devoting effort to study).5 While working, individuals learn on the job, and devote goods and services and time to learning outside of work Xθ(t), τθ(t) (Becker, 1964). Individuals invest in health gH(t) (equation 10) through expenditures XH(t) and time investments τH(t) (e.g., medical care, flu shots, exercise).

The efficiency of the production functions gθ(t), gH(t) is assumed to be a function of the stocks of skill θ(t) and health H(t). This allows us to model self-productivity, where skills produced at one stage augment skills at later stages, and dynamic complementarity, where skills produced at one stage raise the productivity of investment at later stages (Cunha & Heckman, 2007). At a deeper level, the production functions also depend on ξ(t), which are time-invariant and predetermined endowments, including genes and family background (e.g., one’s parents’ education), and time-varying exogenous characteristics of the schooling and work environment (for instance, one’s peers, or the quality of one’s teachers).

The stock of skill deteriorates at the rate dθ(t), assumed to be a function of age t, the level of skill θ(t), and endowments ξ(t). The stock of health deteriorates at the rate dH(t), assumed to be a function of age t, the level of health H(t), consumption XC(t), and endowments ξ(t).6 Consumption goods and services can be healthy (e.g., consumption of fruits and vegetables) or unhealthy (e.g., smoking, consumption of fatty and sugary foods). Healthy consumption provides utility and slows down health depreciation, ∂dH/∂XC < 0, as in Case and Deaton (2005) and Galama and Van Kippersluis (2018). Unhealthy consumption also provides utility but instead increases health depreciation, ∂dH/∂XC > 0.
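Collecting the arguments named in the last three paragraphs, the laws of motion (9) and (10) can be sketched as investment minus deterioration. The additive form and the exact argument lists are assumptions based on the text.

```latex
% Sketch of the capital-stock dynamics: (9) skill, (10) health
\begin{align}
  \frac{\partial \theta}{\partial t} &=
      g_\theta\!\left[t,\, \theta(t),\, H(t),\, X_\theta(t),\, \tau_\theta(t);\, \xi(t)\right]
      - d_\theta\!\left[t,\, \theta(t);\, \xi(t)\right], \tag{9} \\
  \frac{\partial H}{\partial t} &=
      g_H\!\left[t,\, \theta(t),\, H(t),\, X_H(t),\, \tau_H(t);\, \xi(t)\right]
      - d_H\!\left[t,\, H(t),\, X_C(t);\, \xi(t)\right]. \tag{10}
\end{align}
```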

We assume individuals start life with a given level of health H0, assets A0, and skills θ0, which may be influenced by genetic endowments and parental investments. Following the health-capital literature (Grossman, 1972), life cannot be sustained below a certain minimum health: H(T)=Hmin. Also, we assume individuals spend all of their assets during their lives, that is, A(T)=0. By contrast, following the human-capital literature, the stock of skill at the end of life can be chosen freely, that is, θ(T) is unconstrained (e.g., Ben-Porath, 1967).

Thus, individuals maximize lifetime utility (1), subject to the constraints above, by choosing the set of control variables Z(t) ≡ {S,T,Xθ,τθ,XH,τH,XC}. The Lagrangian (Seierstad & Sydsaeter, 1987; Caputo, 2005) of this problem is:


where qθ(t), qH(t), and qA(t) are the co-state variables associated with, respectively, the dynamic equation (9) for skill capital θ(t), (10) for health H(t), and (2, 3) for assets A(t). These co-state variables can be interpreted as the marginal values (or shadow prices) of the relevant capital stock (see Galama & Van Kippersluis, 2015). For example, qθ(t) is the marginal value in terms of the remaining lifetime utility (from t onward) of an additional increment of skill capital θ(t); λL(t) is the Lagrange multiplier associated with restriction (8), and λHmin(t) is the multiplier associated with the restriction H(t) ≥ Hmin(t).
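A hedged sketch of the Lagrangian (11), built from the co-states and multipliers defined above, is given below. The ordering of the terms (utility first, skill second) matches the later remark that the second term in (11) vanishes at the end of life; the precise form of the two multiplier terms is an assumption.

```latex
% Sketch of the Lagrangian (11); the last two terms are assumed forms of the constraints
\begin{equation}
  \Im = U[\cdot]\,e^{-\rho t}
      + q_\theta(t)\,\frac{\partial \theta}{\partial t}
      + q_H(t)\,\frac{\partial H}{\partial t}
      + q_A(t)\,\frac{\partial A}{\partial t}
      + \lambda_L(t)\left[\tau_S(t) - \tau_\theta(t)\right]
      + \lambda_{H_{\min}}(t)\left[H(t) - H_{\min}\right]
  \tag{11}
\end{equation}
```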

Comparison With the Literature

Our theoretical formulation builds upon and extends workhorse human-capital models. In Ben-Porath’s (1967) model, individuals invest in human capital throughout life to increase their productivity, but human capital is one-dimensional (skill), a schooling period does not feature in the model, and longevity is exogenous. Card (2001) models a schooling period similar to ours, but does not distinguish between schooling (spending time in school) and skill formation, and does not include investments in health and longevity.

The seminal health-capital model of Grossman (1972) treats health as a capital stock that individuals invest in, but assumes longevity T and education are exogenously given. Ehrlich and Chuma (1990), Galama (2015), and Galama and Van Kippersluis (2018) model both health and longevity as endogenous but treat education as exogenously given. Becker (2007) presents a two-period model where individuals can invest in skill, and health, but has no schooling period and health investments solely affect mortality risk. A closely related model is presented in Strulik (2018), where both health and human capital are endogenously determined, and in which individuals accrue so-called health deficits as opposed to facing health depreciation.

As in Galama and Van Kippersluis (2015) and Strulik (2018), we jointly model health H(t), skill θ(t), optimal schooling S, and optimal longevity T. Hence, we distinguish explicitly between schooling duration S and skill θ(t). Schooling S is the choice to spend a certain number of years in school, while skill is a capital stock subject to investment and depreciation. In contrast to Strulik (2018), in which the number of years spent in school is the only endogenous input into skill capital, we allow for (i) other inputs into skill capital besides schooling duration and (ii) the possibility that the stock of skill does not increase with schooling duration if those years are spent unproductively.

While skill and schooling are separate concepts, the two are clearly connected given that schooling is a period of life that is characterized by a low opportunity cost of time, encouraging time investments in skill. This is because skills are low early in life and hence so are wages, and schooling imposes a fixed amount of time τS(t) that cannot be devoted to work. We argue that both schooling and skills determine wages, as skills (at least initially) are hard to observe for employers and so schooling may serve as a signal for skills. However, only skills influence the production function of skill formation and health, not the duration of schooling per se.

Importantly, the model of “Theoretical Formulation” explicitly adds to Galama and Van Kippersluis (2015) the role of institutions and laws that regulate prices, wages, whether individuals are legally required to be in school and until what age, and the penalties and enforcement associated with such laws. This allows us to derive theoretical predictions about the institutional reforms that are often exploited as quasi-experiments in empirical research, an important focus here. However, as is true for most models that investigate this question, ours does not incorporate general equilibrium effects.

Optimal Schooling, Consumption, and Longevity

Schooling: Assuming that the (dis)utility of schooling is additively separable from consumption and health, U[·] = U[XC(t),H(t)] − cS(t,τL(t))I[t ≤ S], and making some further simplifying assumptions detailed in the Appendix, we obtain a condition for the optimal years of schooling S,


where qθ/a(t) ≡ qθ(t)/qA(t), S− indicates the limit in which S is approached from below, and S+ when approached from above, and we have replaced S− and S+ with S for functions that are continuous in S. This condition states that individuals will join the labor market at the age S, the age at which the (net) benefits of work exceed the (net) benefits of staying in school. The left-hand side (LHS) represents the benefits of entering the labor market, while the right-hand side (RHS) represents the benefits of staying in school. The benefits of entering the labor market consist of (i) additional labor income Y(S+) − Y(S−): individuals have more time to work, because they no longer have to spend a fixed amount of time in school τS(t);7 (ii) they no longer suffer disutility from schooling cS[S,τL(S)];8 (iii) they do not have to pay (subsidized) tuition [1 − λS(S)]pS(S); and (iv) they incur fewer monetary costs related to skill formation pθ(S)[Xθ(S−) − Xθ(S+)].9

The benefits of staying in school are (i) increased future earnings, (ii) not incurring a fine f*(S) before the minimum school-leaving age S_, and (iii) the value of additional skill investment while in school qθ/a(S)[gθ(S−) − gθ(S+)].

Investment: The first order conditions for investment in skill and health are given by




where qθ/a(t) is the relative marginal value of skill, qθ/a(t) ≡ qθ(t)/qA(t), qh/a(t) is the relative marginal value of health, qh/a(t) ≡ qH(t)/qA(t), πθ(t) is the marginal cost of skill investment, and πH(t) is the marginal cost of health investment (see equations 23, 24, 26, and 27 for explicit expressions). The marginal cost of skill investment πθ(t) and the marginal cost of health investment πH(t) increase in the level of investment in skill and health (due to diminishing returns to scale in investment), increase in prices pθ(t), pH(t), and increase in the wage rate w(t) (opportunity cost of time inputs, see Galama & Van Kippersluis [2015, 2018] for details). Leaving prices and wages aside, a higher relative marginal value of health qh/a(t) or skill qθ/a(t) implies higher investment in health or skill (see 13 and 14).
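Given these definitions, the two first-order conditions presumably equate the relative marginal value of each stock with the marginal cost of investing in it. The sketch below states that reading; it is a reconstruction, not the authors' typeset equations.

```latex
% Sketch of the investment conditions: marginal value equals marginal cost
\begin{align}
  q_{\theta/a}(t) &= \pi_\theta(t), \tag{13} \\
  q_{h/a}(t) &= \pi_H(t). \tag{14}
\end{align}
```

Because the marginal costs rise with the level of investment, a higher value on the left-hand side can only be matched by more investment, which is the comparative statement made in the text.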

Consumption: The first-order condition for consumption is given by


Consider an unhealthy consumption good (e.g., cigarettes) that increases the health deterioration rate, ∂dH/∂XC > 0. The marginal benefits (LHS of 15) consist of the discounted marginal utility of the consumption good, while the marginal costs (RHS) consist of the monetary cost pC(t) and the health cost qh/a(t)(∂dH/∂XC). The health cost is the product of the relative marginal value of health, qh/a(t), and the amount of health lost due to unhealthy consumption, ∂dH/∂XC.
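The structure of condition (15) described here can be sketched as below. The right-hand side follows the text directly; the exact form of the discounting on the left-hand side is an assumption (here, marginal utility discounted at rate ρ and expressed in monetary units by dividing by the marginal value of wealth).

```latex
% Sketch of the consumption condition (15): discounted marginal utility = price + health cost
\begin{equation}
  \frac{1}{q_A(t)}\,\frac{\partial U}{\partial X_C}\,e^{-\rho t}
  = p_C(t) + q_{h/a}(t)\,\frac{\partial d_H}{\partial X_C}
  \tag{15}
\end{equation}
```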

Longevity: Optimal longevity is determined by the point where there are no longer net benefits of staying alive (this is the point where the Lagrangian is zero; see 28 in the Appendix),


The relative marginal value of skill qθ/a(t) approaches zero near the end of life as individuals can choose the terminal level of skill optimally. They then choose its level such that small increments in skill no longer have value at the end of life, that is, qθ/a(T) equals zero.10 Thus, the second term in (11) vanishes. Health declines near the end of life as it approaches the minimum health level Hmin, and assets eventually decline as declining health reduces earnings and increases medical expenditure, and because terminal assets are constrained to zero. With declining health and declining assets, the second and third terms in (16) compete with the additional utility provided by adding an increment of longevity (first term in 16). The optimal point of death T occurs when the utility of consumption XC(T) and health H(T) no longer outweighs the increasing costs of maintaining health.
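Read together with the statement that the Lagrangian is zero at T and that the skill term vanishes, condition (16) can be sketched with three terms: utility, declining health, and declining assets. The normalization (for example, whether the expression is divided by the marginal value of wealth) is an assumption.

```latex
% Sketch of the longevity condition (16): the Lagrangian evaluated at t = T equals zero
\begin{equation}
  U\!\left[X_C(T),\, H(T)\right] e^{-\rho T}
  + q_H(T)\,\left.\frac{\partial H}{\partial t}\right|_{t=T}
  + q_A(T)\,\left.\frac{\partial A}{\partial t}\right|_{t=T}
  = 0
  \tag{16}
\end{equation}
```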

Implications for Empirical Analyses

Supply-Side Reforms and Compliers

Equation (12) provides a useful reference to understand the effect of model parameters on the optimal schooling decision, and to understand heterogeneity in schooling choices. In particular, the expression suggests there are various ways to encourage additional schooling. First, the government or parents can fund schooling λS(t)pS(t) (e.g., financial aid, conditional on being in school). Second, labor laws might stipulate that it is illegal to employ individuals younger than age S_, so that earnings Y(S) are effectively zero below that age if the law is enforced. Third, in addition to the schooling subsidy λS(t)pS(t), the government may subsidize skill investment inputs pθ(t)Xθ(t) (e.g., books, access to libraries, additional classes, uniforms, computers). Fourth, the government may set a minimum number of hours of instruction or a minimum term length, operating through τS(t). Last, governments often set a minimum school-leaving age S_, and individuals may incur a fine f*(S) if they drop out of school before this age.

All supply-side reforms work to lower the costs or to increase the benefits of schooling (equation 12), and can often be considered as plausibly exogenous for a given individual. The most popular supply-side reform is undoubtedly a change in the minimum school-leaving age, whose effects are discussed here.

Equation (12) highlights that increasing the minimum school-leaving age S_ has two effects: it increases the period over which individuals are exposed to a possible fine f*(S) (second term on the RHS of 12), and potentially raises the legal age at which one can earn wages, and hence income Y(S+) − Y(S−) (first term on the LHS of 12). These effects alter the net benefits for each individual of staying in school longer. Taking the derivative of equation (12) with respect to S_, and assuming that individuals value skill qθ/a(t) more than health qh/a(t) early in life (see 32 in the Appendix), the comparative dynamic effect of the minimum school-leaving age on schooling can be summarized as:


where + indicates a positive effect, − indicates a negative effect, +/− denotes an ambiguous effect, and βS_ is defined in (32). The terms on the RHS represent various effects of raising the minimum school-leaving age on the optimal schooling decision: wealth effects (first term),11 increased relative marginal value of skill (second term), a higher stock of skill (third term), better health (fourth term), a longer life (fifth term), and effects operating through labor laws (sixth term).12

Expression (17) captures a characteristic of the comparative dynamic analyses that many effects operate through the marginal values of wealth, skill, and health. In what follows, it is useful to adopt the usual assumption of diminishing returns to wealth, skill, and health. This implies that the marginal value of wealth qA(t) is smaller and the relative marginal values of skill qθ/a(t) and of health qh/a(t) are larger for those with greater wealth. Thus, if raising the minimum schooling age affects lifetime wealth, then these effects are operational.

For some individuals, raising the minimum school-leaving age S_ does not affect their schooling decision S, because they would have chosen to complete far more or far less schooling anyway. Consider an individual who starts life endowed with skill and ability to learn, excels in school, and is in an environment where her skill formation is enabled and valued. She absorbs and internalizes the information provided in class efficiently and has the necessary materials at her disposal; the quality of the teaching and the school and home environment are conducive to skill formation; and she potentially enjoys being in school (utility rather than disutility cS(t)). As a result of these many factors, her skills θ(t) grow: the process of skill formation is “productive” (raises skill) and “efficient” (uses the inputs time and goods/services effectively). Further, she is healthy and expects to live a long life T, and the institutional environment (few barriers, such as discrimination, corruption, crime, etc.) and the labor market (high wages, high employment) enable and value the use of her skills (high returns to skill, ∫_S^T (∂Y/∂S) dt). She would not be influenced by the minimum school-leaving age S_ because she would optimally choose a far greater number of years of schooling S > S_. Raising the minimum schooling age S_ would not have an effect on her choice S. Effectively, because there is no change in schooling, and thereby in wealth, all terms in equation (17) are zero (no effects on the marginal values of wealth, skill, and health).13

Another way of understanding this result is by noting that condition (12) is not the usual equilibrium equation, but a switching equation. The optimal level of schooling S occurs when the net marginal benefits of staying in school equal, for the first time, the net marginal costs. In this case, the net marginal benefits of staying in school outweigh the net marginal costs for all ages before and much past the minimum school-leaving age S_. Increasing the benefits of schooling for ages well before the optimal schooling age S for this individual does not affect her choice because marginal benefits already exceeded marginal costs for those ages.

The second type of individual who is unlikely to alter her schooling decision is one who is endowed with few skills and less ability to learn to begin with, which renders investments in skill rather unproductive. Further, she strongly dislikes going to school (maybe her teachers are bad), and in order to decrease the disutility of schooling, she chooses to spend time unproductively τL(t) rather than making the effort to acquire skills τθ(t). Before the change in the minimum school-leaving age, she already dropped out well before the legal age (despite the fine). Increasing the net marginal benefits of staying in school at ages well after the age S at which she optimally dropped out of school does not affect her choice (net marginal costs already exceeded net marginal benefits well before the old minimum age).14

Now consider the marginal individual. She would prefer to enter the labor market to make more money, but at the same time does not want to incur a fine for dropping out early and recognizes that investing in skill increases her future earnings. She optimally decides to drop out of school exactly at S_, the minimum school-leaving age. Now, if the government raises the minimum school-leaving age S_, this individual has to re-optimize. For this individual, the net benefits of staying in school longer were negative under the old regime. Hence, the fine has to be set such that the individual will comply with the reform and choose to stay in school until the new school-leaving age. This group of individuals at the margin forms the group of “compliers.”
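The three types of individuals discussed above can be illustrated with a toy switching rule in Python. This is a minimal sketch under strong assumptions, not the model itself: the linear net benefit of staying in school, the `Person` class, the `leaving_age` and `reform_response` functions, and all numbers are hypothetical and chosen only to reproduce the qualitative pattern.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    base_benefit: float    # hypothetical net benefit of staying in school at t = 0
    decline: float = 1.0   # hypothetical yearly fall in that net benefit


def leaving_age(person: Person, min_age: int, fine: float, max_age: int = 30) -> int:
    """Return the first age at which leaving school is better than staying.

    Staying is worth base_benefit - decline * age; while age < min_age,
    leaving would also trigger the fine, which adds to the value of staying.
    """
    for age in range(max_age + 1):
        net_stay = person.base_benefit - person.decline * age
        if age < min_age:
            net_stay += fine
        if net_stay < 0:
            return age
    return max_age


def reform_response(person: Person, old_min: int = 10, new_min: int = 11,
                    fine: float = 4.0) -> tuple[int, int]:
    """Schooling before and after the minimum school-leaving age is raised."""
    return leaving_age(person, old_min, fine), leaving_age(person, new_min, fine)


people = [Person("always_taker", 20.0), Person("never_taker", 3.0), Person("complier", 8.0)]
for p in people:
    before, after = reform_response(p)
    print(f"{p.name}: leaves at {before} before the reform, {after} after")
```

Only the individual whose switching point sits exactly at the old minimum age changes her schooling when the minimum is raised; the other two are unaffected because their switching points lie far above or far below it.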

Equation (17) illustrates the various factors entering the decision whether or not to comply. First, the costs of dropping out at the previous optimal S (i.e., before the new S_) increase because she would incur a fine and forgo a potential subsidy. This provides incentives to comply and increase schooling to S_, and is reinforced by the potential wealth effects associated with an extra year in school if additional schooling increases skills and earnings (terms 1 to 3 on the RHS). Second, she might invest more in skill because of the lower opportunity cost of time as she needs to spend a given amount of time τS(t) in school (term 2). If, for the marginal individual, the institutions and labor-market conditions are such that the additional schooling increases lifetime skills, and skills lead to higher earnings, then the resulting greater investment in health, healthier behavior, and longer lives may provide additional benefits of staying in school longer (terms 4 and 5). Finally, labor laws may stipulate that employment is not gainful before the new minimum school-leaving age (term 6). Thus, the complier increases (by definition) her level of schooling, but only if additional schooling increases skills and earnings do we expect to observe improvements in health, health behaviors, and longevity. The magnitude of these health effects among compliers depends on institutions and on economic and social conditions that may vary by cohort, gender, level of a country’s development, and more. It is also important that individuals correctly perceive these benefits, and are not credit-constrained, in order for them to change their decisions. The theoretical model assumes perfect information and perfect credit markets, but in practice both assumptions may not hold for certain subgroups of the population.

Effects of Compulsory Schooling on Consumption and Mortality

Consumption: The effect of a change in the minimum school-leaving age S_ on (unhealthy) consumption can be summarized by:


where β1,C to β3,C are defined in (33).

The effect of the minimum school-leaving age S_ on unhealthy consumption is ambiguous. The derivative can be decomposed into two main terms.15 The first term represents a wealth effect, and is positive among compliers: increased schooling leads to an increase in wealth enabling more unhealthy consumption (∂qA(S)/∂S_ < 0 and β1,C < 0).

Yet, an increase in schooling, and the associated wealth effect, also leads to a higher marginal value of health relative to wealth qh/a(t) (second term on the RHS). This is quite intuitive (see also Hall & Jones, 2007): due to diminishing marginal utility of consumption, richer people eventually start caring less about consumption and more about other goods, such as health, given that health extends life (adding additional periods of utility rather than marginal improvements in utility from higher consumption). A higher relative marginal value of health increases the health cost of unhealthy consumption [qh/a(t) ∂dH/∂XC], and therefore reduces the demand for unhealthy consumption. This health cost effect competes with the wealth effect, and the net effect of the minimum school-leaving age S_ on unhealthy consumption XC(t) is ambiguous.

A bit more can be said on the competition between the wealth effect and the health cost effect. The health cost of a consumption good increases in the severity of its impact on health, ∂dH/∂XC (the degree of “unhealthiness” of the good). This suggests that for moderately unhealthy goods the direct wealth effect dominates, while for severely unhealthy goods the indirect wealth effect (the health cost effect) dominates (Van Kippersluis & Galama, 2014; Galama & Van Kippersluis, 2018).

In sum, a rise in the minimum school-leaving age may impact unhealthy behaviors such as smoking and poor diet through wealth effects (schooling leads to higher lifetime earnings enabling more consumption) and health-cost effects (schooling increases the relative marginal value of health through better job prospects and higher lifetime wealth, thereby reducing demand for unhealthy consumption). Because the health cost increases in the degree of unhealthiness of the good, we expect the wealth effect to dominate among moderately unhealthy goods, and the health cost effect to dominate among severely unhealthy goods. Note that the results depend on the assumption that individuals are perfectly informed about the precise health consequences of consumption goods, ∂d/∂XC. If people are unaware of the health effects of unhealthy behavior, the term capturing its effect in the optimization would vanish and the wealth effect would dominate, in which case more schooling would lead to more unhealthy behavior. This could, for example, explain why more highly educated individuals were more likely to smoke before the 1964 Surgeon General’s report warned about its effects.

Longevity: The condition for the effect of the minimum school-leaving age S_ on optimal length of life T can be written as (see (37) and (40) in the Appendix)


where β1,T to β3,T are defined in (40).

Conditions (19) and (40) illustrate the various ways through which a rise in the minimum school-leaving age S_ may affect longevity. The first term on the RHS is the wealth effect, which operates among compliers if extra schooling boosts lifetime wealth, that is, if schooling enhances skills that are valued in the labor market or raises earnings through signaling. Individuals with more resources can afford to devote resources to increasing life expectancy (first term on the RHS, β1,T < 0 and ∂qA(T)/∂S_ < 0).16 However, there also exist scenarios in which wealthier individuals would choose to spend more resources on unhealthy consumption, and live shorter lives as a result; hence the ambiguous sign. The second term on the RHS of (19) and (40) shows that, to the extent that an increase in the minimum school-leaving age boosts skills among compliers, these additional skills improve skill and health production, and increase earnings, all of which enable life extension through greater lifetime resources and better health (second term on the RHS, β2,T > 0 and ∂θ/∂S_ > 0). Finally, through diminishing returns to wealth and skill, or because of improved health knowledge, a higher minimum school-leaving age may lead individuals to value health relatively more compared with wealth, invest more in health (see 14), and live longer as a result (third term on the RHS, β3,T ∂qh/a(T)/∂S_|T,S > 0).

In sum, a rise in the minimum school-leaving age may impact longevity through wealth effects, improved skill formation, and a higher relative marginal value of health, which improves health behavior. The magnitude of these effects, and in the case of the wealth effect also its sign, depends on the relative importance of terms in equation (19). Studying these terms more thoroughly, using equation (40) in the Appendix, reveals that the effect of a rise in the minimum school-leaving age on longevity among compliers will be stronger when (i) the labor market returns to schooling are high, (ii) the quality of education and the motivation of students are such that additional schooling boosts skill formation, and (iii) the non-monetary returns to this additional skill are large (e.g., skills improving skill formation and skills improving health production). The magnitude and importance of these terms vary across time periods and settings, and our theory suggests there will be cases where a rise in the minimum school-leaving age does not improve skills, increase wealth, or improve health outcomes. Finally, there may be other channels through which minimum school-leaving laws affect health outcomes that are not captured in our model; for example, they may affect an individual’s peers. We discuss this more thoroughly below.

Empirical Evidence

In the following review of the literature, the effects of measures of skill and schooling on mortality, smoking, and obesity are considered. Mortality is an unambiguous and precise measure of health that captures circumstances throughout the lifetime. For the living, a large number of health measures are available but there is no unique health indicator, aside from self-reported health status. Because self-reported health status is a subjective measure, we instead focus on the first and second leading causes of preventable disease and death in the United States: smoking and obesity (Mokdad, Marks, Stroup, & Gerberding, 2004). To compare across studies, we focus on whether or not an individual smokes currently (a binary indicator), and whether the person’s body mass index (BMI) reaches the threshold for obesity (BMI ≥ 30). Both obesity and smoking are unambiguous measures of bad health.17

Randomized controlled trials (RCTs) are arguably the gold standard for establishing causal effects (Imbens, 2010). Unfortunately, there are only a few RCTs for childhood education, and the samples are small and not representative. For this reason, we rely mostly on quasi-experimental methods (sometimes also referred to as natural experiments), which seek to replicate experimental conditions. We separately discuss (i) twin studies, where within-twin-pair differences in education are related to within-twin-pair differences in health outcomes, implicitly accounting for all genetic and family characteristics shared by twin pairs; and (ii) quasi-experiments, where instrumental variables (IV) or regression discontinuity designs (RDD) are used to estimate a treatment effect of education on health outcomes.

There are alternative approaches to assess causality. In the absence of random assignment, one can use observational data to infer causal effects based on a number of econometric methods. The control-function approach to solving the omitted variable bias problem allows researchers to control for essential observable and unobservable variables, often using data from a specific cohort (e.g., Conti, Heckman, & Urzua, 2010; Savelyev, 2014; Bijwaard, Van Kippersluis, & Veenman, 2015). A limitation of the control-function approach is that it relies on generalizations of the conditional independence assumption. We do not include papers that follow this approach here because they vary widely in terms of methodology and are therefore hard to classify. Finally, the literature on ordinary least squares (OLS) estimations is large and diverse in terms of included control variables, which makes it challenging to summarize. The reader is referred to Grossman (2015) for a review of methods that do not rely on experimental or quasi-experimental randomization.

We include studies that (i) were published after 2005; (ii) focus on the causal effect of education on all-cause mortality,18 current smoking or obesity; and (iii) used an RCT, twin difference design, or quasi-experiment.19


Randomized Controlled Trials

Randomized experiments are the preferred method to establish treatment effects. When a random subset of individuals is treated, there are no average differences in the characteristics of the treated and the controls. Therefore, the controls can provide a counterfactual outcome. For practical and ethical reasons, there are no RCTs that give individuals “one more year” of formal schooling. There is, however, useful indirect evidence based on interventions that provide incentives to individuals to attend school.
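The logic of random assignment can be illustrated with a small simulation. The sketch below is hypothetical: the function name `simulate_rct`, the smoking probabilities, and the assumed treatment effect of -0.10 are illustrative assumptions, not estimates from any study reviewed here. Because assignment is a coin flip that is independent of the unobserved trait, the difference in mean outcomes between treated and controls should land close to the assumed effect.

```python
import random
import statistics


def simulate_rct(n=40000, true_effect=-0.10, seed=0):
    """Return the treated-minus-control difference in smoking rates.

    All numbers are illustrative assumptions, not values from the literature.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Unobserved background trait that raises the propensity to smoke.
        disadvantage = rng.random()
        # Coin-flip assignment, independent of the trait by construction.
        is_treated = rng.random() < 0.5
        p_smoke = 0.2 + 0.4 * disadvantage + (true_effect if is_treated else 0.0)
        smokes = 1 if rng.random() < p_smoke else 0
        (treated if is_treated else control).append(smokes)
    return statistics.mean(treated) - statistics.mean(control)


estimate = simulate_rct()
print(f"difference in smoking rates: {estimate:.3f}")
```

In this stylized setting the controls serve as the counterfactual for the treated even though the background trait is never observed or controlled for.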

In the United States, the most studied randomized interventions of this type are the Perry Preschool Program (implemented in the 1960s) and the Abecedarian (ABC) program (implemented in the 1970s), both of which offered early childhood education (ECE). Perry provided access to preschool education to a random subset of children between the ages of three and five. It offered intensive cognitive and language skill activities. The ABC program treated children between ages zero and five, and a subset up to age eight. In addition to cognitive stimulation, children received nutritional support and other health services. Because the Perry and ABC interventions occurred decades ago, it is now possible to investigate how they affected formal years of schooling, IQ, wages, and health in adulthood. But it is too early to use these to investigate mortality.

Table 1 summarizes the results of RCT interventions. Males randomly assigned to Perry were less likely to smoke at age 27 (Heckman, Pinto, & Savelyev, 2013) and less likely to be a daily smoker at age 40 (Conti, Heckman, & Pinto, 2016) than males in the control group. This was not true for females. ABC recipients were less likely to be obese than controls, though these effects are not statistically significant (Campbell et al., 2014; Conti et al., 2016). The likely mechanisms by which these two ECE interventions improved health are interesting. Neither Perry nor ABC had long-lasting statistically significant impacts on IQ, and Perry did not affect formal measures of education (years of school). But the interventions affected non-cognitive skills, such as externalizing behaviors, and also increased achievement scores, such as grades, which reflect motivation in addition to skill and knowledge. In adulthood, treated individuals had higher employment, higher earnings, lower participation in crime, and higher rates of marriage than untreated individuals.

Table 1. Effect of Randomized Education Interventions on Smoking and Obesity

| Study | Data & Sample | Outcomes (Overall Means) | Treatment | Treatment Effect (M=Males, F=Females, P=Pooled) |
| --- | --- | --- | --- | --- |
| A. Smoking | | | | |
| Conti et al. (2016) | Perry Preschool Program; follow-up at age 40; male children age 3–5; birth cohorts early 1960s; N = 66 | Not a daily smoker at age 40 | Perry Preschool Program | [Table 4, Column 6] |
| Heckman et al. (2013) | Perry Preschool Program; follow-up at age 27; male children age 3–5; birth cohorts early 1960s; N = 51 | Tobacco use at age 27 (0.48) | Perry Preschool Program | [Table 1, Column 1] |
| Jensen & Lleras-Muney (2012) | Dominican Republic; survey data (2001–2005); male students around age 14; birth cohort 1986–1987; N = 2,011 | Currently smoking around age 18 (0.05) | Providing information on the returns to education to Dominican youths | [Table 4, Column 4] |
| B. Obesity | | | | |
| Conti et al. (2016) | Carolina Abecedarian Project; male children age 0–5; birth cohorts 1972–1977; N = 26 | Obesity at age 35 (0.56) | Carolina Abecedarian Project | [Table 5, Column 6] |
| Campbell et al. (2014) | Carolina Abecedarian Project; children age 0–5; birth cohorts 1972–1977; N = 26 / 40 (c) | Obesity at age 35 (0.56/0.73) | Carolina Abecedarian Project | [Table 1/2, Column 3] |

(a) Female estimates are not significant and not reported in the paper.

(b) Square brackets are p-values, parentheses are standard errors.

(c) Men/Women, respectively.

Although these findings provide strong evidence of a causal effect of ECE, they have several limitations—we point out a few. First, the Perry and ABC studies are each based on small samples of about 100 observations. Second, these studies concentrate on very disadvantaged children from poor families. In addition, Perry preschool targeted only African Americans within an IQ range of 70 to 85—many would be considered cognitively impaired today—so it is unclear to what extent results apply to less disadvantaged and higher IQ individuals. ABC participants were not exclusively African Americans, and they had higher IQs than those in Perry, but were still drawn from poor populations with low baseline IQs. Lastly, the ABC intervention included a large health component, so effects on health and health behaviors do not necessarily stem from education alone.
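To see why samples of this size limit what can be learned, consider a back-of-the-envelope calculation. The sketch below computes the standard error of a difference in two proportions. The equal split of a sample of 26 into two arms of 13 is an assumption made purely for illustration (the actual arm sizes are not given here), the baseline rate of 0.56 is the overall obesity mean reported in Table 1, and the helper name `se_diff_proportions` is hypothetical.

```python
import math


def se_diff_proportions(p, n_treated, n_control):
    """Standard error of a difference in proportions under a common rate p."""
    return math.sqrt(p * (1 - p) * (1 / n_treated + 1 / n_control))


# Assumed for illustration: 26 men split evenly, baseline obesity rate 0.56.
se = se_diff_proportions(0.56, 13, 13)
# Smallest difference that would reach the 5% significance level (two-sided).
detectable = 1.96 * se
print(f"standard error: {se:.3f}, approx. detectable difference: {detectable:.2f}")
```

Under these assumptions, only a difference in obesity rates approaching 40 percentage points would be statistically distinguishable from zero, which is consistent with sizable point estimates remaining statistically insignificant.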

We also report the results of one more RCT. Jensen and Lleras-Muney (2012) investigate the effects of an intervention in the Dominican Republic. The intervention informed male students at the end of middle school (around age 14) of the wage increases associated with attending high school (the randomization was done at the school level). Treated boys were less likely to smoke by age 18 (and delayed the onset of alcohol consumption). This study also investigated mechanisms. Treated boys were more likely to stay in school, worked less, and earned less pocket money while in school. They were also less likely to interact with smokers. The intervention did not affect participants’ knowledge of the harms of smoking, nor did it affect their discount rates or their attitudes toward risk. This study also has limitations: the children were disadvantaged, drawn from poor neighborhoods with low high school attendance rates; females were not included; and all measures, except for schooling, are based on self-reports.

Despite their limitations, some interesting conclusions can be drawn from these interventions. First, interventions that provide some form of schooling appear to improve some health behaviors. Because the ABC intervention included a health component that could directly explain the obesity results, the most conservative conclusion is that education, early in life or in high school, appears to lower smoking among disadvantaged males (more evidence is needed for obesity). Second, results differ for males and females, though it is not entirely clear why this is so. Third, in all cases, earnings, social skills, and connections appear to have been improved by the intervention, suggesting they could be mediators. Lastly, the Perry preschool program affected some important non-cognitive skills and motivation, despite having no effect on IQ or formal years of schooling, in line with predictions of the model (better use of school time, e.g., because of better quality teachers, would result in greater skill without increases in schooling duration). It appears that important non-cognitive skills were formed by participation in these programs. These non-cognitive skills provided important benefits such as greater earnings, greater social connections, and more stable marriages later in life. These improvements may explain better health behaviors in later life.

Twin Studies

An alternative approach to RCTs is to mimic the results of experimental variation by finding a setting where almost all determinants of health are identical, but there is some variation in education that is close to random. Because important determinants of health can be traced to family inputs early in life, a substantial literature has pursued a within-twin estimation approach. The intuition behind this approach is that twins face very similar conditions—they share food, parents, neighbors, age, and genetic traits, and have the same number of siblings. Further, identical twins have the same gender and identical genetic endowments. Under the assumption that differences in education between twins are due to random factors, differences in adult outcomes across twins can be thought of as mostly resulting from differences in their education.

Table 2 reports the findings of studies that use twins to estimate the effect of education on mortality and smoking. The data for these studies come mostly from twin registries in various developed countries and contain a large number of observations (at least a thousand). The two U.S. studies rely on the Midlife in the U.S. Survey and have smaller samples (of about 350 and 650). We report results for identical twins.

Table 2. Health Effects of Education From Twin Studies

| Study | Data & Sample (a) | Variables (mean) | Estimates (M=Males, F=Females, P=Pooled) (b): OLS | Estimates (b): Fixed Effects |
| --- | --- | --- | --- | --- |
| A. Mortality | | | | |
| Lundborg et al. (2016) | Swedish Twin Registry; birth cohorts 1886–1958; N = 8,602/10,084 (d) | H = Mortality by 2007, hazard ratio; E = Years of education (9.74/9.59) | [Table 3, Column 3] | [Table 3, Column 4] |
| Van den Berg et al. (2012) (e) | Danish Twin Registry; birth cohorts 1888–1897; N = 2,839/2,856 | H = Mortality (0.62); E = Eligible for reform that expanded voluntary education | [Table 11/12, Column 1] | [Table 11/12, Column 3] |
| Behrman et al. (2011) | Danish Twin Registry; birth cohorts 1921–1950; N = 3,234/2,060 | H = Mortality by 2003 (0.16); E = Years of education (10.8) | Cohort 1921–1935; Cohort 1936–1950 [Table 6, Column 2] | Cohort 1921–1935; Cohort 1936–1950 [Table 6, Column 4] |

(a) Sample size is for monozygotic twins, if available.

(b) Report for monozygotic twins, if available.

(c) Means for Male/Female, respectively.

(d) Men/Women, respectively.

(e) Working paper.

(f) Correlated frailty model, where unobservables within twins are assumed correlated.

Table 2 (continued)

| Study | Data & Sample | Variables (mean) | Estimates (M=Males, F=Females, P=Pooled): OLS | Estimates: Fixed Effects |
| --- | --- | --- | --- | --- |
| Madsen et al. (2010) | Danish Twin Registry; Statistics Denmark; Census of Death Registry; birth cohorts 1921–1950; N = 5,260 | H = Mortality 1980–2008 (hazard ratio); E = Less than 7 years of education | Cohort 1921–1935; Cohort 1936–1950 [Table 3, Column 4] | Cohort 1921–1935; Cohort 1936–1950 [Table 3, Column 7] |
| B. Smoking | | | | |
| Amin et al. (2013) | TwinsUK Database; women, birth cohorts 1924–1974; N = 1,482 | H = Currently smoking (0.12); E = Years of education (13.44) | [Table 4, Column 2] | [Table 4, Column 6] |
| Lundborg (2013) | Midlife in the U.S. Survey; birth cohorts 1921–1970; N = 664 | H = Currently smoking (0.213); E = Years of education (13.68) | [Table 3, Column 1] | [Table 3, Column 2] |
| Fujiwara & Kawachi (2009) | Midlife in the U.S. Survey; birth cohorts 1920–1970; N = 168/183 | H = Currently smoking (odds ratio); E = Years of education | [Table 3, Column 1/3] | [Table 4, Column 1/3] |

(a) Own education instrumented by twin’s report.

Twin studies usually start by reporting the effect of education using an OLS regression. These estimates compare the health of those with higher and lower levels of schooling. Thus the differences in outcomes reflect differences in education within and across twins. All OLS estimates in Table 2 find that for both men and women, education is associated with lower mortality and reduced smoking. The twin fixed-effect (FE) estimates use only within-twin variation in education for identification, controlling for shared genetic and family characteristics.
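The contrast between OLS and twin fixed-effect estimates can be sketched with simulated data. Everything in the example below is an assumption chosen for illustration: the names `slope` and `simulate_twins`, the size of the shared family endowment, and the "true" effect of -0.05 on a generic mortality-risk index. The shared endowment raises education and lowers risk, so OLS overstates the protective effect, whereas differencing within pairs removes the shared component.

```python
import random


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_twins(n_pairs=40000, true_effect=-0.05, seed=1):
    """Return (OLS slope, within-pair slope) for simulated twin pairs."""
    rng = random.Random(seed)
    edu, risk, d_edu, d_risk = [], [], [], []
    for _ in range(n_pairs):
        family = rng.gauss(0, 1)  # endowment shared by both twins
        pair = []
        for _ in range(2):
            e = 10 + family + rng.gauss(0, 1)
            # The shared endowment also lowers risk directly, confounding OLS.
            r = true_effect * e - 0.2 * family + rng.gauss(0, 1)
            pair.append((e, r))
            edu.append(e)
            risk.append(r)
        # Within-pair differences: the shared family term cancels out.
        d_edu.append(pair[0][0] - pair[1][0])
        d_risk.append(pair[0][1] - pair[1][1])
    return slope(edu, risk), slope(d_edu, d_risk)


ols, fe = simulate_twins()
print(f"OLS: {ols:.3f}, within-pair: {fe:.3f}")
```

In this constructed example the within-pair estimate is close to the assumed effect while OLS is roughly three times as large in magnitude. Whether real OLS estimates are biased, and in which direction, is an empirical question that the studies in Table 2 answer differently.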

Four studies report effects of education on mortality (Table 2, panel A). A Swedish study (Lundborg, Lyttkens, & Nystedt, 2016) finds statistically significant twin FE estimates that are about the same size as OLS estimates. The results suggest that education lowers mortality very substantially for both men and women—a year of education is estimated to reduce mortality by 4 to 5%—and also that the bias in OLS estimates is low.

Three studies use Danish data. Behrman et al. (2011) find statistically insignificant twin FE effects on mortality that are also of “the wrong sign.” Van den Berg, Janys, and Christensen (2012) report within-twin FE estimates of education on mortality that are negative and statistically significant for men, and these are similar to OLS estimates. Finally, Madsen, Andersen, Christensen, Andersen, and Osler (2010) use a larger sample of Danish twins and measure duration until death (rather than whether death has occurred by 2003 as in Behrman et al. 2011). This study finds mixed results. The within-twin FE estimates are not statistically significant, but they are larger than OLS in magnitude for the oldest male cohort, and smaller and close to zero for the younger male cohort. In all three Danish studies the effects for women are insignificant and within-pair estimates are always smaller than OLS estimates (or have the wrong sign).

The three studies for smoking (Table 2, panel B) find a negative, but very small and statistically insignificant, effect of education on smoking once twin fixed effects are included. However, these studies have very small sample sizes. Finally, there are no studies that report effects on obesity.20

There are some well-known methodological issues with twin approaches, which have been extensively discussed elsewhere (e.g., Bound & Solon, 1999). First, twin studies have important power limitations. The variation in education within twins is small, because the share of twins with identical education levels is very high (more than 40% in all studies). And, when they exist, the observed differences in education are often due to measurement error.
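One standard consequence of measurement error, attenuation toward zero, is more severe in within-pair differences than in levels, because differencing removes much of the true variation in education but none of the reporting noise. The sketch below illustrates this under purely assumed variances (family component 1, individual component 0.25, reporting error 0.25) and an assumed true effect of -0.2; the function names are hypothetical, and there is deliberately no confounding so that any gap reflects measurement error alone.

```python
import random


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_misreporting(n_pairs=40000, true_effect=-0.2, seed=4):
    """Return (levels slope, within-pair slope) using misreported education."""
    rng = random.Random(seed)
    rep, risk, d_rep, d_risk = [], [], [], []
    for _ in range(n_pairs):
        family = rng.gauss(0, 1)  # twins share most of their schooling level
        pair = []
        for _ in range(2):
            true_edu = 10 + family + rng.gauss(0, 0.5)
            reported = true_edu + rng.gauss(0, 0.5)  # noisy self-report
            r = true_effect * true_edu + rng.gauss(0, 1)
            pair.append((reported, r))
            rep.append(reported)
            risk.append(r)
        d_rep.append(pair[0][0] - pair[1][0])
        d_risk.append(pair[0][1] - pair[1][1])
    return slope(rep, risk), slope(d_rep, d_risk)


levels, within = simulate_misreporting()
print(f"levels: {levels:.3f}, within-pair: {within:.3f}")
```

With these assumed variances the levels estimate retains about five sixths of the true effect, while the within-pair estimate retains only about half of it.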

A second issue in twin studies is whether they can control for birthweight and for health conditions since conception. Even though identical twins share identical genetic material, they have differential access to nutrition in the womb, and one twin is typically born larger and heavier than his or her counterpart. Because nutrition is different in utero, traits that are affected by nutrition, such as IQ or heart function, could differ across identical twins. Moreover, parents might make compensating or reinforcing investments in response to these initial differences (Aizer & Cunha, 2012). Therefore, even after controlling for birthweight, there could be unobserved differences in inputs or environments within identical twins resulting in higher schooling and better health. The fundamental difficulty is that it is not clear in these studies why one twin has more schooling than the other, that is, the source of variation is unclear.

A final question regarding twin studies is their external validity. Even though twins come from selected parents, they are on average much closer to the general population in their characteristics and outcomes than the participants of RCTs previously mentioned. However, they have unique experiences relative to children from singleton births with siblings—for instance their in utero environment is shared, and they are typically born early and with low birthweights. Growing up, twins have an identical sibling, also a unique experience. Thus, twin studies cannot estimate the average treatment effect (or ATE) of one more year of schooling.

Where does this leave us? The study with the largest sample of identical twins is the Swedish study by Lundborg et al. (2016). This study also includes birthweight and height—two proxies for nutrition in the womb and in childhood—and has excellent mortality measures. It also uses register data from administrative sources, which typically contain less measurement error in education. Thus, the concerns raised are likely less severe in the Lundborg et al. (2016) analysis. This study finds large effects of education on mortality for both women and men. The results of the three Danish studies are consistent with these findings for men, but inconclusive for women and generally statistically insignificant (except in Van den Berg et al., 2012). Thus, some evidence of effects of education on mortality exists for men, subject to the important caveat of all twin studies, that within-twin differences in education are not necessarily random. There is some hint that these effects vary by birth cohort and that they might apply to females as well, though potentially effects are smaller for females. For both males and females, the evidence suggests no effects of education on smoking, but these findings are less conclusive because they come from small samples.

Quasi-Experimental Evidence

Rather than investigating populations with similar characteristics, as in twin studies, one can seek to identify circumstances where differences in education in a certain population did not result from individual and family choices but are due to external factors outside of the control of the individual and family. These are often referred to as quasi-experiments, because they attempt to mimic the ideal experiment where an outside force causes some individuals to obtain more schooling and others not.

Many studies have investigated the effect of compulsory schooling legislation (CSL hereafter) as a quasi-experiment. The intuition is that when the law is changed, typically to require individuals to stay more years in school, the affected cohort is very similar to the unaffected cohort, at least within a given state or geographic unit and close to the cutoff date specified by the legislation. Many countries around the world have implemented compulsory schooling legislation. If the legislation is shown to be binding (that is, it indeed forced individuals to attend school for longer), then one can compare individuals forced to go to school for X years with those forced to go to school for X+ΔX years, and assess whether those who are forced to attend school longer live longer lives and have better health and improved health behaviors.
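Why cohorts close to the cutoff are compared, and what a binding law looks like in the data, can be sketched as follows. The simulation is hypothetical: the function name `schooling_jump`, the secular trend of 0.05 years of schooling per birth year, and the jump of 0.4 years at the cutoff are assumptions for illustration only. A comparison of all affected with all unaffected cohorts mixes the jump with the trend, while a narrow window around the cutoff isolates most of the jump.

```python
import random


def schooling_jump(window, n=200000, jump=0.4, trend=0.05, seed=2):
    """Mean schooling after minus before the cutoff, within +/- window years."""
    rng = random.Random(seed)
    before, after = [], []
    for _ in range(n):
        # Birth date in years relative to the first affected cohort.
        rel_birth = rng.uniform(-10, 10)
        affected = rel_birth >= 0
        years = 10 + trend * rel_birth + (jump if affected else 0.0) + rng.gauss(0, 1)
        if abs(rel_birth) <= window:
            (after if affected else before).append(years)
    return sum(after) / len(after) - sum(before) / len(before)


wide = schooling_jump(window=10)   # all cohorts: jump plus ten years of trend
narrow = schooling_jump(window=1)  # cohorts within one year of the cutoff
print(f"wide window: {wide:.2f}, narrow window: {narrow:.2f}")
```

The narrow comparison trades bias for precision, since far fewer observations fall inside the window; a small residual trend bias remains in this simple difference in means.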

Quasi-experimental studies typically start by reporting the OLS association with education and then report the two-stage least squares (2SLS) estimates of education, using minimum school-leaving ages as instrumental variables. The estimated effect of increasing the minimum school-leaving age should be interpreted as the treatment effect among the group of compliers, often referred to as a local average treatment effect (LATE; see Imbens & Angrist, 1994). If different individuals have different returns to school, then this LATE estimate will not coincide with the average treatment effect that OLS seeks to estimate. Therefore, apart from selection bias, another reason why OLS estimates may differ from estimates derived from quasi-experiments (supply-side reforms) is that the affected population is different.

The type of individual that is induced to go to school longer by supply-side reforms could also differ across quasi-experiments. Therefore 2SLS estimates across settings could differ as well. These LATE estimates do inform policy regarding the effects of compulsory schooling in a given setting, but not necessarily in other settings, and are not necessarily informative about the effects of other policies that would also raise educational attainment.21
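The distinction between the LATE and a population-average effect can be made concrete with a simulated reform. The sketch below uses the simplest IV estimator with a binary instrument, the ratio of the reduced-form difference in outcomes to the first-stage difference in schooling (a Wald estimator). All names and numbers are assumptions for illustration: 40% of individuals are compliers with a per-year effect of -1.0 on a generic risk index, and everyone else has an effect of -2.0 and is unaffected by the reform.

```python
import random


def simulate_reform(n=200000, seed=3):
    """Return (first stage, Wald estimate, population-average effect)."""
    rng = random.Random(seed)
    complier_effect, other_effect = -1.0, -2.0  # assumed per-year effects
    share_compliers = 0.4
    # z -> [sum of schooling, sum of risk, count]
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0  # exposed to the higher leaving age
        ability = rng.gauss(0, 1)           # unobserved, lowers risk directly
        if rng.random() < share_compliers:
            # Compliers leave at the minimum age, so the reform adds one year.
            schooling, effect = 9 + z, complier_effect
        else:
            # Others stay beyond the minimum regardless of the reform.
            schooling, effect = 10 + (1 if ability > 0 else 0), other_effect
        risk = effect * (schooling - 9) - ability + rng.gauss(0, 1)
        cell = sums[z]
        cell[0] += schooling
        cell[1] += risk
        cell[2] += 1
    first_stage = sums[1][0] / sums[1][2] - sums[0][0] / sums[0][2]
    reduced_form = sums[1][1] / sums[1][2] - sums[0][1] / sums[0][2]
    wald = reduced_form / first_stage
    average_effect = (share_compliers * complier_effect
                      + (1 - share_compliers) * other_effect)
    return first_stage, wald, average_effect


first_stage, wald, average_effect = simulate_reform()
print(f"first stage: {first_stage:.2f}, Wald: {wald:.2f}, "
      f"population average: {average_effect:.2f}")
```

In this constructed case the Wald estimate recovers the compliers' effect (about -1.0) rather than the population average (-1.6), and a reform that shifted a different group of compliers would recover a different number.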

Table 3 summarizes the findings, which diverge substantially across studies: some studies find large causal effects while others find none. The mortality studies are reviewed in detail, after which the smoking and obesity studies are discussed more concisely.

Table 3. Health Effects of Education From Quasi-Experiments

| Study | Data & Sample | Variables (mean, full sample if available) | Identification, first stage (a) | Estimates (M=Males, F=Females, P=Pooled) |
| --- | --- | --- | --- | --- |
| A. Mortality | | | | |
| Meghir et al. (2017) | Swedish Education Register; National Cause of Death Register; birth cohorts 1940–1957; N = 812,719/749,702 | H = Mortality, hazard ratio in 1973–2015 (0.200); E = Years of education (11.4) | CSL 7 or 8 to 9 (b) in 1948; first stage: 0.313/0.177 | [Table 2, Column 1]; [Table 4, Column 1] |
| Buckles et al. (2016) | U.S. Vital Statistics Mortality; U.S. Census and NHIS (c); white men, cohorts 1942–1953; G = 600 (birth state-cohort); N = 1,994,459 | H = Mortality rate per 1,000, 1981–2007 (138.58); E = Years of college (1.99) | Vietnam War draft avoidance; first stage: 11.41 (2.40) | [Table 3, Column 1]; [Table 3, Column 5] |
| Davies et al. (2016) (d) | UK Biobank; National Death Statistics; birth cohorts 1935–1971; N = 9,699/12,439 (f) | H = Mortality rate 2006–2014, risk difference (0.9%); E = Leaving school after age 15 | CSL (e) 15 to 16 (1972); F-stat: 629/689 | [Table S6/S7, Column 8]; [Table S6/S7, Column 2] |
| Fletcher (2015) | U.S. Diet & Health Study 1995–1996, low education; birth cohorts 1925–1945; N = 127,550 | H = 10-year mortality rate (0.18); E = Self-reported education categories (12 years) | Various CSL by state; F-stat: 16.36 | [Table 3, Column 8]; [Table 3, Column 9] |

(a) The F-statistic is reported in cases of multiple instruments or unreported first-stage.

(b) Extension of compulsory schooling from 7 or 8 depending on municipality to 9 nationally.

(c) National Health Interview Survey.

(d) Working paper.

(e) Compulsory Schooling Laws.

(f) Men/Women, respectively.

Table 3 (continued)

| Study | Data & Sample | Variables (mean, full sample if available) | Identification, first stage | Estimates (M=Males, F=Females, P=Pooled) |
| --- | --- | --- | --- | --- |
| Gathmann et al. (2015) | European Social Survey; Human Mortality Database; birth cohorts 1880–1986; N = 21,979/27,237 | H = 20-year mortality (odds ratio); E = Years of education | Various CSL by country; first stage: 0.50/0.54 | [Table 2/3, Column 2]; [Table 5, Column 2/7] |
| Clark & Royer (2013) | U.K. National Death Statistics; Health Survey for England; General Household Survey; Census in 2001; birth cohorts 1926–1940, 1950–1965 | H = Mortality 1970–2007 (log odds of death); E = Years of education | CSL 14 to 15 in 1947, first stage: 0.45 (0.04); CSL 15 to 16 in 1972, first stage: 0.35 (0.06) | [Table 3, Column 1] |
| Fischer et al. (2013) | Sweden National School Authority; Census in 1935; Swedish Death Index; birth cohorts 1924–1931; G = 400 (region-cohort-gender); N = 731,791 | H = 10-year mortality rate in region (0.007); E = Share of schools adopting CSL in region (0.4) | Differential CSL (1936) | [Table 5, Column 1] |
| Lager & Torssander (2012) | Swedish Census of 2007; birth cohorts 1943–1955; N = 639,473/608,394 | H = Mortality after 40 (hazard ratio); E = Having been through the reform 1949–1962 | CSL 8 to 9, differential roll-out (1949–1962) | [Table 2, Column 1/2] |
| Van Kippersluis et al. (2011) | Statistics Netherlands; survey and administrative data; men, birth cohorts 1912–1922; N = 66,891 | H = Mortality rate 81–87; E = Years of education | CSL 6 to 7 in 1928; first stage: 1.039 (0.05) | [Table 1, Column 3]; [Table 4, Column 2] |

(a) Survey of Health, Ageing and Retirement in Europe.

(b) 1947/1972 reform, respectively.

(c) Calculated by dividing reduced-form estimate by the first stage estimate.

(d) Reduced form estimate.

Table 3 (continued)

| Study | Data & Sample | Variables (mean, full sample if available) | Identification, first stage | Estimates (M=Males, F=Females, P=Pooled) |
| --- | --- | --- | --- | --- |
| Albouy & Lequien (2009) | French EDP (a); cohorts 1950–1955 for Berthoin, N = 47,337; cohorts 1920–1925 for Zay, N = 35,828 | H = Survival at 50 for the Berthoin reform; H = Survival at 80 for the Zay reform; E = Min. school leaving age | CSL 14 to 16, Berthoin, first stage: 0.28 (0.06); CSL 13 to 14, Zay, first stage: 0.11 (0.06) | [Table 2, Column 2/1]; [Table 6, Column 3/2] |
| Glied & Lleras-Muney (2008) | U.S. Census, 1980 and 1990; Mortality Detail Files; white U.S.–born cohorts 1901–1925; N = 119,975 | H = 4-year mortality rate; E = Years of education | Various CSL by state; first stage: 0.054/0.041 | [Table 4, Column 3] |
| Lleras-Muney (2005) | U.S. Census 1960, 1970, 1980; birth cohorts 1901–1925; G = 4,792 (cohort-gender-state); N = 814,806 | H = 10-year mortality rate 1960–1970 & 1970–1980 (0.11); E = Years of education | Various CSL by state; first stage: 0.05 (0.008) | [Table 3, Column 1]; [Table 4, Column 4] |
| B. Smoking | | | | |
| Davies et al. (2016) (d) | U.K. Biobank; Office of National Statistics; birth cohorts 1935–1971; N = 9,681/12,405 | H = Currently smoking; E = Leaving school after 15 | CSL 15 to 16 (1972); F-stat: 625/702 | [Table S6/S7, Column 8]; [Table S6/S7, Column 2] |
| Heckman et al. (2016) (d) | National Longitudinal Survey of Youth 1979; men, birth cohort 1979; N = 2,242 | H = Currently smoking; E = College graduate (vs. high school dropout) | Unemployment rate, college presence, college tuition | [Table 3, Column 1]; [Table 3, Column 7] |

(a) Echantillon Démographique Permanent.

(b) Estimates for the Berthoin/Zay reform, respectively.

(c) National Health and Nutrition Examination Survey Epidemiologic Follow-up Study.

(d) Working paper.

Huang (2016)a

Chinese Family Panel Studies

H = Currently smoking (0.26)

CSL to 9 in 1986,



Health and Nutrition Survey

E = Years of education (8.86)

differential adoption by



Birth cohorts 1955–1993




N = 104,634

F-stat: 25.78

[Table 5, Column 3]

[Table 5, Column 3]

James (2015)

Health Survey of England

H = Currently smoking

Expansion of post-



Birth cohorts 1962–1980

E = Age left school

compulsory education



N = 27,927

during 1980s



F-stat: 19.02

[Table 5, Column 2]

[Table 5, Column 2]

Li & Powdthavee (2015)

Household, Income and Labour

H = Currently smoking (0.204)

CSL age 14 to 15



Dynamics in Australia (HILDA)

E = Years of education (12.171)

differential adoption by



Birth cohorts 1939–1972

state in 1970s



N = 9,099

F-stat: 71,601

[Table 3, Column 1]

[Table 3, Column 1]

Silles (2015)

General Household Survey

H = Currently smoking

CSL 1940s and 1970s



for Great Britain, 1978–2004


schooling age from 14

Great Britain

Continuous Household Survey

E = Years of education

to 16



for Northern Ireland, 1983–2004





Birth cohorts 1923–1981

F-stat: 61.22/109.95

[Table 5, Column 1]

[Table 5, Column 1]

N = GB: 79,271/90,666


Northern Ireland

NI: 15,298/19,629

F-stat: 21.91/23.39





[Table 6, Column 1]

[Table 6, Column 1]

Clark & Royer (2013)

U.K. Office of National Statistics

H = Currently smoking

CSL 14 to 15 in 1947



Health Survey for England


First stage: 0.45 (0.04)



General Household Survey

E = Years of education

CSL 15 to 16 in 1972



Cohorts 1926–1940 & 1950–1965


First stage: 0.35 (0.06)

[Table 5A, Column 2]

[Table 5A, Column 4]

N = 49,421/47,177 (b)

(a) Working paper.

(b) 1947/1972 reform, respectively.

Braakmann (2011)

British Labour Force Survey

H = Currently smoking (0.3)

CSL and birth-month



Health Survey in England

E = Having any degree

First stage: 0.021 (0.006)



Birth cohorts 1957–1970

(CSE/O-level or above) (0.84)



N = 15,822

[Table 8, Column 1]

[Table 8, Column 1]

Etilé & Jones (2011)

EPCV and ES in France (a)

H = Currently smoking (0.31)

Postwar expansion



Birth cohorts 1945–1965

E = Years of education (9.68)

of education in France



N = 18,785/20,335

F-stat: 502/283



[Table 1, Column 7/3]

[Table 1, Column 8/4]

Jürges et al. (2011)

German Micro-census

H = Currently smoking

Differential expansion of



for 1999 and 2003


education in Germany



Birth cohorts 1940–1980

E = Years of education




N = 71,388/71,353


First stage: 41.2/51.8

[Table 2, Column 1]

[Table 3, Column 1]


Kemptner et al. (2011)

German Micro-census

H = Currently smoking (0.36/0.24)

Differential roll-out



in 1989–2003

E = Years of education (9.42/9.17)

of CSL 8 to 9 from



Birth cohorts 1930–1960




N = 121,318/124,314

First stage: 0.616/0.654

[Table 4, Column 1]

[Table 5, Column 2]


Reinhold & Jürges (2010)

German Microcensus

H = Currently smoking (0.37/0.28)

Abolition of secondary



for 1999 and 2003

E = Years of education (10.16/10.00)

school fees in 1950s



Birth cohorts 1934–1982

First stage: −0.11/−0.10



N = 85,698/86,405

F-stat: 13.91/14.69

[Table 2, Column 1]

[Table 3, Column 1]

Park & Kang (2008)

Labor & Income Panel

H = Refrains from smoking (0.304)

Expansion of secondary



Korean men

E = Years of education (12.479)

education and birth order



High school cohorts 1965–1985




N = 1,611

F-stat: 6.65

[Table 2, Column 2]

[Table 4, Column 2]


(a) EPCV: Permanent Survey on the Conditions of Living of French Households 1996–2003; ES: Health Surveys 1992 and 2003.

Grimard & Parent (2007)

U.S. CPS (a) Tobacco Supplements

H = Smoking regularly

Vietnam war draft



Birth cohorts 1935–1974

E = Years of education

avoidance (dummy for



N = 227,027

male cohorts, 1945–1950)



First stage: 0.161 (0.045)

[Table 2, Column 2]

[Table 2, Column 2]

De Walque (2007)

U.S. National Health Interview Survey

H = Currently smoking

Vietnam War draft



Birth cohorts 1937–1956

E = Years of education

avoidance (induction risk)



N = 73,952

past high school

First stage: 1.779 (0.23)



[Table 1, Column 5]

[Table 2, Column 5]

Kenkel et al. (2006)

NLSY 1979

H = Currently smoking

State education policies



Birth cohort 1979


F-stat: 35.84/19.22



N = 3,205/3,286

E = High school graduate




[Table 3, Column 2/6]

[Table 3, Column 3/7]

C. Obesity

James (2015)

Health Survey of England (HSE)

H = Obese (0.147)

Expansion of post-



Birth cohorts 1962–1980

E = Age left school

compulsory education



N = 25,888

during 1980s



F-stat: 16.50

[Table 3, Column 3]

[Table 3, Column 3]

Li & Powdthavee (2015)

Household, Income and Labour

H = Obese (0.279)

CSL age 14 to 15 in 1970s



Dynamics in Australia (HILDA)

E = Years of education

differential adoption by



Birth cohorts 1939–1972





N = 8,873

F-stat: 1.5 × 10^7

[Table 5, Column 4]

[Table 5, Column 4]

Brunello et al. (2013)


H = Obese (0.14/0.13)

Various CSL by country




E = Years of education






First stage: 0.42/0.57



Birth cohorts 1927–1970


[Table 5, Column 7/3]

[Table 5, Column 8/4]

N = 9,013/11,872


(a) Current Population Survey.

(b) Separate dummies for high school graduates and GED. We focus on high school graduates.

(c) European Community Household Panel.

(d) Survey for Health, Ageing and Retirement in Europe.

(e) English Longitudinal Study of Ageing.

Jürges et al. (2011)

German Micro-census

H = Obese (0.12/0.09)

Differential expansion of



for 1991 and 2003

E = Years of education (10.15/10.02)

education in Germany



Birth cohorts 1940–1980

First stage: 35.2/52.73



N = 61,892/60,139


[Table 2, Column 5]

[Table 3, Column 5]

Kemptner et al. (2011)

German Micro-census

H = Obese (0.16/0.13)

CSL 8 to 9 1949–1969



in 1989–2003

E = Years of education (9.42/9.20)

First stage: 0.595/0.663



Birth cohorts 1934–1960




N = 48,640/49,225

[Table 4, Column 1]

[Table 5, Column 2]

Reinhold & Jürges (2010)

German Micro-census

H = Obese (0.10/0.12)

Abolition of secondary



years 1999, 2003

E = Years of education (10.16/10.00)

school fees in 1950s



Birth cohorts 1934–1982

First stage: −0.11/−0.010



N = 74,638/73,233

F-stat: 11.67/12.61

[Table 2, Column 5]

[Table 3, Column 5]

Kenkel et al. (2006)

NLSY 1979

H = Obese (0.25/0.27)

State education policies



Birth cohort 1979

E = High school graduate (0.72) (a)

F-stat: 36.87/21.33



N = 3,248/3,274



[Table 6, Column 1/5]

[Table 6, Column 3/7]


(a) Separate dummies for high school graduates and GED. We focus on high school graduates.


The first paper to investigate the causal effect of education on mortality, using compulsory schooling laws as instruments for education, was the study by Lleras-Muney (2005) of white cohorts born in the United States between 1900 and 1925. It found large effects of education on adult mortality, which was measured by tracking cohort sizes across successive censuses. The instrumental variables (IV hereafter) estimates were negative and large, though not statistically different from their OLS counterparts. No substantive differences across gender were found.

Several studies have investigated these results in the United States using the same approach. These studies have pointed out many limitations of the original data, methodology, and findings. Mazumder (2008) documented that the results are not robust to including state-specific trends. Black, Hsu, and Taylor (2015) use more precise aggregate mortality data from vital statistics and document that almost all of the variation in mortality is explained by cohort and state fixed effects. Finally, a recent study by Fletcher (2015) shows that despite large samples and precise individual measures of age at death, the effects of education are not precisely estimated, though they appear to be large.22

The fundamental limitation of the U.S.–based studies is that CSLs in the United States had small effects on the average education of the population (e.g., in Lleras-Muney’s study one more year of compulsory schooling resulted in 0.05 years of additional schooling on average, or equivalently, only 1 in 20 individuals obtained one more year of schooling). Small effects are difficult to separate from the large secular improvements in education and mortality that occurred over the 20th century. This results in weak instruments, particularly when a large number of fixed effects (cohort, state) and trends (state-specific) are included. Additionally, only a small, and selected, number of people were treated (effectively forced to go to school longer). In the United States, these are typically individuals at the low end of the education distribution who likely come from particular backgrounds. Even with sufficient power, the CSL approach estimates not the average treatment effect in the population but the local average treatment effect (LATE) for these compliers.
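The consequence of a small first stage can be illustrated with the simple ratio (Wald) form of the IV estimator. The sketch below is only an illustration: the first stage of 0.05 comes from the text, while the reduced-form effect, its standard error, and the comparison first stage of 0.45 are hypothetical values chosen for the example, and the standard-error formula is a first-order approximation that ignores sampling noise in the first stage itself.

```python
def wald_iv(reduced_form, first_stage):
    """Wald/IV estimate: effect of the law on the outcome divided by
    the effect of the law on years of schooling."""
    if first_stage == 0:
        raise ValueError("first stage is zero; the instrument is irrelevant")
    return reduced_form / first_stage


def wald_se_approx(reduced_form_se, first_stage):
    """Approximate standard error of the Wald estimate, treating the
    first stage as known (an assumption made for simplicity)."""
    return reduced_form_se / abs(first_stage)


if __name__ == "__main__":
    first_stage_us = 0.05      # from the text: 0.05 years per year of CSL
    first_stage_strong = 0.45  # hypothetical stronger reform, for comparison
    rf, rf_se = -0.003, 0.002  # hypothetical reduced-form effect and SE

    for label, fs in [("weak", first_stage_us), ("strong", first_stage_strong)]:
        est = wald_iv(rf, fs)
        se = wald_se_approx(rf_se, fs)
        print(f"{label}: first stage={fs}, IV estimate={est:.4f}, approx SE={se:.4f}")
```

With the same reduced-form precision, dividing by 0.05 rather than 0.45 scales both the estimate and its approximate standard error by a factor of nine, which is one way to see why small first stages yield imprecise IV estimates.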

Other papers have therefore investigated settings where the reforms had larger impacts on education—a larger first stage improves the statistical properties of the estimators and makes them more likely to be representative of average population effects. The country with the largest effects of compulsory schooling reforms on education is Great Britain, where years of education increased by 0.45 (0.35) years when the school-leaving ages were increased from 14 to 15 (15 to 16) in 1947 (1972). Yet, in an excellent study of the effects of these reforms, Clark and Royer (2013) find no decrease in mortality.

Clark and Royer’s results are based on large representative aggregate mortality rates from vital statistics, and are estimated using a regression discontinuity approach, which compares individuals born right before and right after the cutoff birth dates specified by the law. This approach is therefore less susceptible to the issue of cohort trends that IV studies face, because the RDD compares individuals very close to a well-specified month of birth cutoff. A limitation of this study is that, due to data limitations, it cannot estimate the OLS effect of education on mortality, so one cannot fully assess the extent of the bias for this population.
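The logic of a regression discontinuity comparison around a birth-date cutoff can be sketched as follows. This is a minimal illustration on synthetic, noise-free data, not the specification used by Clark and Royer: the linear trends, the bandwidth of 12 months, and the function names are assumptions, and only the 0.45-year jump in schooling is taken from the text.

```python
def ols_line(xs, ys):
    """Fit y = a + b*x by least squares; return (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rd_jump(running, outcome, cutoff=0, bandwidth=12):
    """Local linear RD: fit separate lines on each side of the cutoff
    within the bandwidth and return the gap between them at the cutoff."""
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r < cutoff + bandwidth]
    a_left, _ = ols_line(*zip(*left))
    a_right, _ = ols_line(*zip(*right))
    return a_right - a_left


if __name__ == "__main__":
    # Month of birth relative to the cutoff (hypothetical cohorts).
    months = list(range(-24, 24))
    # Schooling trends smoothly and jumps by 0.45 years at the cutoff.
    schooling = [14.5 + 0.005 * m + (0.45 if m >= 0 else 0.0) for m in months]
    # Mortality trends smoothly with no jump (a "zero effect" scenario).
    mortality = [0.30 - 0.0005 * m for m in months]

    first_stage = rd_jump(months, schooling)
    reduced_form = rd_jump(months, mortality)
    print("jump in schooling:", round(first_stage, 3))
    print("jump in mortality:", round(reduced_form, 3))
    print("implied effect per year of schooling:", round(reduced_form / first_stage, 3))
```

Because each side is fitted separately near the cutoff, a smooth cohort trend in the outcome does not show up as a jump, which is the sense in which the design is less exposed to trend misspecification.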

A recent re-examination of the British 1972 CSL reform by Davies, Dickson, Smith, Van den Berg, and Windmeijer (2016) finds that it resulted in statistically significant declines in mortality. Methodological issues may explain the discrepancy in the estimates between Clark and Royer (2013) and Davies et al. (2016). Davies et al. (2016) use U.K. Biobank data that includes only people who volunteered to participate. Importantly, they observe only a small number of deaths between 2006 and 2014 (only 191 out of 22,138 die), whereas Clark and Royer use vital statistics. Not surprisingly, given the very low prevalence of mortality, results in Davies et al. (2016) appear to be sensitive to the choice of bandwidth and the degree of the polynomials fitted to account for trends. The main issue is that the cohorts affected by the 1972 reform are still relatively young and have low mortality rates. So, ultimately, Clark and Royer’s results based on the 1947 experiment are most informative, and these are precisely estimated zeroes.
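A back-of-the-envelope calculation suggests why so few deaths limit what can be learned. The sketch below takes the 191 deaths among 22,138 participants from the text and asks how large a raw difference in death rates between two groups would need to be to be detected. The equal split into two groups, the 5% two-sided test, and the 80% power target are assumptions made for illustration, and the calculation concerns a simple difference in proportions rather than the estimators actually used in the study.

```python
import math


def min_detectable_diff(events, n_total, z_alpha=1.96, z_power=0.84):
    """Return (baseline rate, minimum detectable difference, ratio to baseline)
    for a comparison of two equally sized groups (an assumed design)."""
    p = events / n_total
    n_each = n_total // 2
    # Standard error of a difference between two proportions near p.
    se = math.sqrt(2 * p * (1 - p) / n_each)
    mde = (z_alpha + z_power) * se
    return p, mde, mde / p


if __name__ == "__main__":
    p, mde, rel = min_detectable_diff(191, 22138)
    print(f"baseline death rate: {p:.4f}")
    print(f"minimum detectable difference: {mde:.4f}")
    print(f"as a share of baseline: {rel:.2f}")
```

Under these assumptions, only a difference on the order of 40% of baseline mortality would be reliably detected, so modest true effects could easily go unnoticed or appear fragile across specifications.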

Many other papers use CSLs in other contexts. Some suffer from weak instruments, for example, Albouy and Lequien (2009), who investigate compulsory schooling reforms in France. But in most other studies, the first stage is large and the instruments do not appear to be weak.23 These studies find a range of effects.

Three studies look at Sweden. Lager and Torssander (2012) and Meghir et al. (2017) investigated Swedish reforms that affected cohorts born between 1940 and 1960 and increased their education by 0.25–0.30 years. They both find very small and statistically insignificant declines in mortality. Fischer, Karlsson, and Nilsson (2013) investigated reforms that affected earlier cohorts in Sweden (born in 1924–1931) and had a substantially larger effect on educational attainment (about twice as large as in the more recent study by Meghir et al., 2017). They do find larger and statistically significant (at the 10% level) declines in mortality.

Van Kippersluis, O’Donnell, and Van Doorslaer (2011) use a Dutch reform in the early 20th century (1928) that raised the compulsory years of schooling from six to seven, which for many implied having to attend at least one year of secondary school. Mortality is observed between ages 81 and 87, which is a limitation because the sample is heavily selected by mortality before age 81. The upside is that the mortality rate in this cohort is rather high (around 50% for the pivotal cohort born in 1917), providing power to detect an effect, should one exist, in this population. Using administrative data in a two-sample two-stage least squares (TS2SLS) setting, the study finds a 2.5 percentage point reduction in the mortality rate for an additional year of schooling, an effect that is statistically significant among men at the 1% level. The Swedish and Dutch findings are consistent with the twin findings: effects seem larger in magnitude for older cohorts.

It is not clear whether the differences in findings across studies are due to methodology or reflect true heterogeneity in the effects of schooling. Gathmann, Jürges, and Reinhold (2015) conduct a useful systematic investigation of heterogeneity by pooling data from 19 European countries and exploiting CSL reforms at various times during the 20th century. Because the same methods are applied to all countries, the differences in the findings across countries are not methodological. The downside of the exercise is that it is difficult to account correctly for all the institutional details across countries. In addition, the samples in this study are not as large as those used in some of the individual-country studies. They find significant effects of education on mortality for men, but not for women. Among men, effects appear sizable in some countries, such as Belgium and the Netherlands, but there are no effects in other countries, such as Spain or Italy. This paper also finds that older cohorts tend to have larger effects, and the effects are also larger for poorer countries or those with initially higher mortality baselines. But surprisingly they are not larger for countries that start with lower levels of compulsory schooling.

One study uses a different quasi-experiment. Buckles, Hagemann, Malamud, Morrill, and Wozniak (2016) use the Vietnam War to generate experimental variation in college attendance. The main difficulty with this approach is that those who were drafted ended up serving, so that one must separate the effect of going to college from the effect of serving in Vietnam, both of which were affected by drafting procedures. The authors use variation in the risk of induction (the risk of being called to serve) during the war to generate two instruments that independently predict college attendance and veteran status, both of which are endogenous. This is the only study that investigates the causal effect of college, and it finds that college education reduces the mortality rate in middle age by 2.6 percentage points. Interestingly, it also does not find much bias in the OLS estimates.

In sum, there appears to be evidence that education leads to lower mortality for men, but this is true only for certain times and places. These effects appear to be larger for men born around the turn of the 20th century (when GDP per capita was substantially lower and mortality was higher) and smaller more recently. Findings for women are substantially less robust, and statistically insignificant or small when significant.

Smoking and Obesity

The results for smoking and obesity also vary across studies, as shown in Table 3, panels B and C. A few issues are worth mentioning from the outset. We review only studies that focused on whether an individual is a current smoker, and whether an individual is currently obese. There are undoubtedly important differences between such studies and those that focus on the initiation and cessation of smoking, or BMI and overweight, respectively. Information on smoking and obesity comes from survey data, rather than from vital statistics or censuses. This results in several problems. First, these outcomes are typically (though not always) self-reported, and have known reporting biases,24 though not much is known about whether education levels predict misreporting. Second, some survey data are not representative of the population. Finally, some of these studies have much smaller sample sizes than the vital statistics and censuses used in the mortality studies. This poses a problem for studies that rely on aggregate reforms such as CSLs, particularly if one wants to flexibly control for trends: such studies are likely to be underpowered.

Clark and Royer (2013) and Davies et al. (2016) both find that laws raising the minimum school-leaving age did not significantly affect smoking prevalence in Britain. Their point estimates are significantly smaller than the corresponding OLS estimates and are relatively precisely estimated zeroes. Braakmann (2011) exploits a slightly different feature of schooling laws in England, where children born in February are 3% more likely to obtain a qualification than those born in January due to the timing of exams. The study also finds no statistically significant effect on smoking prevalence. Here, the 2SLS estimates are of similar magnitude as the OLS results, but imprecisely estimated. Finally, although James (2015) uses a slightly less convincing design—education effects are identified by checking whether deviations from long-term trends of education for specific cohorts are associated with similar deviations in smoking—he reaches the same conclusion: educational expansions in Great Britain have not led to reduced smoking prevalence. Interestingly, Silles (2015) reports that education lowers smoking for males in Northern Ireland, a much poorer country.

In Germany, a group of authors (Reinhold & Jürges, 2010; Jürges, Reinhold, & Salm, 2011; Kemptner, Reinhold, & Jürges, 2011) used various sources of exogenous variation that increased years of education (the abolition of fees in academic track schools, academic track school construction that differed across German provinces, and compulsory-schooling laws, respectively) in conjunction with the sizable German Micro-census (between 70,000 and 120,000 observations). Interestingly, their papers reach different conclusions. Reinhold and Jürges (2010) and Kemptner et al. (2011) show, in line with the British evidence, no statistically significant effect of an extra year of secondary schooling on smoking behavior. Jürges et al. (2011), however, investigate the construction of new academic track schools, which, apart from increasing years of schooling, also changed the composition of students within the different school tracks. The induced changes, in both the quantity and quality of schooling, led to a 40% and a 21% decrease in smoking prevalence for men and women, respectively, although the authors acknowledge that the results for men are not robust.

Etilé and Jones (2011) use a French expansion of secondary schooling in a difference-in-differences (DiD) design, using those with the highest level of education as a control group. They find sizable, 6–8%, effects of an extra year of education on current smoking. Heckman, Humphries, and Veramendi (2016) compare college graduates with high school graduates and use a structural equation framework with largely plausible exclusion restrictions. They also find sizable effects on smoking prevalence.25

In sum, very convincing natural experiments in Great Britain and Germany suggest that another year in secondary school does not affect smoking prevalence. Li and Powdthavee (2015) draw a similar conclusion for Australia, albeit using a smaller sample (N < 10,000).26 However, it seems that exposure to a completely different type of schooling (e.g., completing high school as in Kenkel, Lillard, & Mathios, 2006; college versus high school as in Grimard & Parent, 2007; De Walque, 2007; Heckman et al., 2016; or academic track versus regular track as in Jürges et al., 2011) can lead to sizable reductions in smoking prevalence. This suggests that the schooling track and the associated peer group may be more important in smoking decisions than spending a certain number of years in secondary school. This interpretation would be consistent with the RCT results previously reported: schooling affects smoking if it changes one’s peers but not otherwise. However, the IV strategies used for these comparisons typically involve stronger assumptions than the RDD approaches that study an increase in the compulsory years of schooling, and the validity of the IV assumptions cannot be directly assessed.

Only a few studies on the effects of schooling on obesity met our criteria, in part because we limited attention to obesity, rather than including BMI or overweight as outcomes. The OLS results, without exception, indicate a negative correlation between education and obesity, of the order of a two-percentage-point reduction in the probability to be obese for every year of schooling. These results hold even though the studies cover substantially different birth cohorts and time periods.

Evidence for a causal protective effect of education on obesity is, however, at best very weak. In part because of the relatively small sample sizes, the standard errors of the 2SLS estimates are often large, rendering the effects insignificant. Only Kemptner et al. (2011) and James (2015) find a protective effect of an extra year of schooling in Germany and the United Kingdom, respectively. However, James’s effect sizes seem implausibly large (68% reduction in the probability to be obese for an extra year of schooling), and the effects in Kemptner et al. (2011) are present only for men and not robust to including a higher-order polynomial in state-specific trends. Further, Jürges et al. (2011) obtain a statistically significant result (at the 1% level) that an additional year of education increases, not decreases, obesity. In all other cases, the point estimates are insignificant, close to zero (e.g., Kenkel et al., 2006; Brunello, Fabbri, & Fort, 2013), and in some cases positive though not significantly different from zero (e.g., Reinhold & Jürges, 2010; Li & Powdthavee, 2015). Hence, there seems to be no convincing evidence to date that education causally reduces the likelihood of being obese.

Understanding Heterogeneity and Mechanisms

Why do the quasi-experimental estimates differ so much from the ordinary least squares (OLS) estimates and from each other? Some differences can be due to methodology. OLS estimates indicate a strong association between education and health outcomes, which declines somewhat but remains strong even with an extensive set of control variables. In some cases, the point estimates obtained through 2SLS using educational reforms as instrumental variables (IV) are larger than the corresponding OLS estimate, which is counterintuitive if we believe the selection bias of OLS estimates is positive. This apparent contradiction could be explained by larger returns for the compliers of reforms than for the average individual (see the discussion about local average treatment effect [LATE] in the section “Quasi-Experimental Evidence”). Alternatively, it could reflect publication bias or weak instruments. In two cases, the 2SLS estimates of the effect of education on mortality are precisely estimated zeroes (Clark & Royer, 2013; Meghir, Palme, & Simeonova, 2017). In most cases, in line with the OLS results, the 2SLS estimates suggest positive effects of education on health outcomes, but have wide confidence intervals that cannot rule out the OLS estimates.

At a more substantial level, what the findings suggest is that there is substantial true heterogeneity in these effects across time, space, and population. Our reading of the evidence is that there is an effect of education on mortality, but not for all populations. Evidence for women is somewhat more limited but suggests there is no, or a much smaller, effect on mortality for females than for males. Effects appear to be larger for earlier schooling reforms and smaller for later ones. The effects of education on smoking are frequently statistically insignificant, but not in randomized controlled trials (RCTs) and not in a few IV studies. There seems to be no convincing evidence that education causally reduces the likelihood of obesity.

How, in light of the theory, might we interpret the finding that education sometimes reduces mortality or smoking? As the section “Effects of Compulsory Schooling on Consumption and Mortality” illustrated, in our theory education can affect unhealthy consumption and mortality through two prominent channels. First, additional schooling may raise lifetime earnings and wealth, which lowers the marginal value of wealth ($\partial q_A(t)/\partial A(t) < 0$). This wealth (or permanent income) effect raises the relative marginal value of health ($\partial q_{h/a}(t)/\partial A(t) > 0$) and thereby the health benefit of health investment and the health cost of unhealthy consumption, both improving healthy behavior. A second important channel through which schooling may affect health behavior and longevity is through skill formation improving the efficiency of health production $g_H(\cdot)$ (see equation (10); e.g., health knowledge, access to high-quality medical care, connections, etc.).

The Role of Income and Access to Resources

A first reason why education may lead to better health is that it increases lifetime earnings (in our model, higher earnings Y(t), smaller marginal value of wealth qA(t)), which in turn can be used to purchase health inputs. The monetary returns to education in the labor market vary. In the United States they were very high in the beginning of the 20th century, plummeted to their lowest level in the early 1970s, and have been rising rapidly since then (Goldin & Katz, 2009). Our model suggests the effect of education on health should follow a similar pattern, if the permanent income pathway to health is of relative importance. Indeed, our review of the literature found larger effects on mortality for earlier schooling reforms. In line with this reasoning, Cutler, Huang, and Lleras-Muney (2015) find that the health returns to education are larger for cohorts graduating in bad economic conditions, partly because those without education suffer large employment and wage declines if they enter the labor market during bad times (Oreopoulos et al., 2012). In other words, the returns to education are larger for those graduating in bad times.

Some papers investigate directly whether increases in education due to compulsory schooling legislation (CSL) raised incomes. For instance, Devereux and Hart (2010) find that the British CSL reforms of 1947 (studied also by Clark & Royer, 2013) did not result in an increase in wages for women but did increase wages by 3–4% for men. Other papers document small or no returns to education on earnings using CSLs. Meghir and Palme (2005) and Fischer, Karlsson, Nilsson, and Schwarz (2016) find rather small income increases for men in Sweden (2%), and no effects for women, as a result of the 1948 CSL reforms. Perhaps not surprisingly, the corresponding effects on mortality in Sweden are small or not significantly different from zero. The effects of schooling on income are substantially lower in Great Britain and Sweden than those found in the United States for older cohorts. Clay, Lingwall, and Stephens (2016) find wage returns for white males born between 1885 and 1912 to be between 6.5 and 8%, perhaps explaining why Lleras-Muney (2005) and Fletcher (2015) cannot rule out relatively large effects on mortality.

These results suggest one possible explanation for the observed heterogeneity in effects: perhaps the returns to education in the labor market are small for some cohorts and some populations. Low income returns to education in Great Britain, for example, are consistent with a high fraction of individuals dropping out of school exactly at the minimum school-leaving age—this is indeed what the model predicts individuals would do if perceived returns were low but enforcement penalties for non-compliance are high. This suggests that, while settings where a lot of people are affected by CSL are likely to estimate average treatment effects, they could also precisely be the settings where returns are expected to be small.27 It also suggests that the effects of education on health are small for cohorts for whom the monetary returns to an additional year of education are small because of institutional and labor-market conditions (e.g., high barriers, such as discrimination, corruption, crime; low labor-market returns such as low wages, high unemployment, low quality of schools, etc.). In contrast, it could be that in the United States returns were high and enforcement penalties low. Thus, the small group of compliers were individuals who were credit constrained and had higher returns.

Another reason education might not have large health returns in Great Britain and other European countries is that it is not associated with other (non-wage) benefits in the labor market. In most European countries, health and disability insurance is publicly provided, making income differences potentially less important. In the United States, for example, the college educated have better health insurance, but in most European countries healthcare access is universal.

Related to the discussion about the LATE, the discrepancy across studies could also stem from the fact that different populations are affected by the reforms. Many studies find that the monetary returns to education, estimated using CSL reforms, are much larger for low socioeconomic status (SES) men. This is true in Sweden (Meghir & Palme, 2005) and in the United States (Clay et al., 2016). It also appears to be the case in the United Kingdom. A recent paper by Barcellos, Carvalho, and Turley (2017) finds that while the average effect of the U.K. 1972 reform on body mass index (BMI) is zero, consistent with Clark and Royer (2013), the effects of the reform are larger at the bottom of the BMI distribution, where it resulted in reductions in obesity. Davies, Dickson, Smith, Van den Berg, and Windmeijer (2016) also report that the same 1972 U.K. reform increased incomes at the bottom of the income distribution but not at the top. In the United States, Lleras-Muney and Shertzer (2015) find that reforms that limited work in adolescence had greater wage returns for immigrant children than for natives, and immigrants tend to be of lower SES.

The discussion so far assumes that resources, such as income and health insurance, are the main mechanisms whereby education causes health. The theory and the empirical literature suggest that this is, however, not the only, or even most important, channel. As already discussed, RCTs find effects on incomes but also document that education interventions are associated with improvements in non-cognitive skills and in peers. These could easily affect health. Further, there are many health behaviors that are not immediately determined by income, such as wearing a seat belt or exercising. This suggests that factors other than resources could be at play.

The Role of Quality of Education, Skills, and the Difference Between Time in School and Skills

Effects of education on health could also differ because of variation in the quality of schools across time, place, and population. When CSL reforms are implemented, and when they affect a lot of individuals, it is possible that they lower the quality of instruction and therefore the skills associated with time in school. Evidence from large expansions of education in Italy suggests this can occur. Bianchi (2016) investigates an Italian reform in 1961 that dramatically increased university enrollment in science fields. But wages of affected individuals did not increase because congestion (higher pupil/teacher ratios and overall lower per pupil resources) and peer effects lowered overall learning rates (measured by grades). Many scholars have pointed out that the returns to education (on any outcome) must depend on the quality of education and the skills it imparts (e.g., Bold et al., 2017). But measuring school quality (separate from the characteristics of students who attend school) is extremely difficult.28

A related question is whether more or different education leads to different occupations, which depends on the type of skills that are acquired in school—some occupations are much more beneficial to health than others. Secondary schooling increased access to white-collar jobs at the beginning of the 20th century, but not so much at the end. These jobs were substantially safer than were jobs in agriculture and manufacturing. This could also explain why older cohorts saw larger returns from staying in school.

A related issue is whether we are measuring the effect of what happens in school or the effects of what happens outside of school. When children left school in the past, they worked in agriculture and in manufacturing, occupations that were detrimental to children’s health and growth. Or they roamed the streets. As compulsory schooling expanded, so did the nutrition, vaccination, and other health programs provided in school. As in the Abecedarian (ABC) program, schools and school policies evolved to guarantee and improve the health of their students in an attempt to better their learning capacities. It is possible that school led to better health in some periods because it kept children in safe and healthy environments relative to the alternative.

It is also possible that in some contexts compulsory schooling extended time in school without increasing the skills individuals benefit from. For example, Pischke and Von Wachter (2008) study the effect of compulsory schooling changes in Germany in the post-WWII period and find they had no impact on wages. They hypothesize this is because “the basic skills most relevant for the labor market are learned earlier in Germany than in other countries.” This is consistent with the distinction in our model between skills and schooling. Schooling is in essence time, and if time is not used productively, then additional schooling may not increase skills.

This provides a segue into a related, but more disturbing, possibility: that compulsory schooling is in fact a bad experiment for assessing the causal effect of education, because these laws force individuals who are not interested in staying in school longer to do exactly that. In the absence of market failures, such as financial constraints, and if individuals are rational, informed, and forward looking, as in our theory, the variation in schooling in the population is driven by voluntary attendance decisions; that is, individuals attend school because its benefits exceed its costs. In terms of equation (8), when there is no disutility of schooling and/or skill investment is productive, individuals spend all available time in school on skill production: $\tau_S(t) = \tau_\theta(t)$. Hence, when empirical researchers compare individuals with different levels of schooling in an OLS setting, implicitly they are comparing individuals with different “effective” schooling durations and skills. In contrast, most IV studies use compulsory schooling reforms, which amount to an increase in $\underline{S}$. The complier population consists of individuals who would have liked to drop out of school in the absence of the reform. It is not clear whether skill production is as productive for this group as for those who voluntarily remain in school. They may invest less (or not at all) in their skill during this additional year than do others: $\tau_\theta(t) < \tau_S(t)$. Hence, in compulsory schooling settings, schooling duration and skill capital are potentially weakly connected, and the skill capital gained may be lower in school than in other settings (e.g., individuals may be better off learning on the job).
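
The contrast between OLS comparisons and IV estimates based on compulsory schooling reforms can be illustrated with a minimal numerical sketch. All numbers below (group sizes, baseline outcomes, and gains from the extra year) are hypothetical and chosen only for exposition, and the function names are illustrative assumptions rather than part of the model. The sketch assumes two groups: voluntary stayers, who take the extra year with or without the reform and gain more from it, and compliers, who take it only when compelled and gain less.

```python
# Hypothetical illustration: OLS contrast vs. Wald (IV) estimate under a
# compulsory schooling reform. All numbers are made up for exposition.

def mean(values):
    return sum(values) / len(values)

def build_cohort(reform, n_stayers=600, n_compliers=400):
    """Return a list of (schooling, outcome) pairs for one birth cohort.

    Voluntary stayers take the extra year regardless of the reform;
    compliers take it only when the reform compels them.
    """
    stayer_baseline, stayer_gain = 2.0, 1.0        # assumed values
    complier_baseline, complier_gain = 1.0, 0.2    # assumed values
    cohort = []
    for _ in range(n_stayers):
        cohort.append((1, stayer_baseline + stayer_gain))
    for _ in range(n_compliers):
        s = 1 if reform else 0
        cohort.append((s, complier_baseline + complier_gain * s))
    return cohort

def ols_contrast(cohort):
    """Difference in mean outcomes by schooling (OLS slope, binary regressor)."""
    high = [y for s, y in cohort if s == 1]
    low = [y for s, y in cohort if s == 0]
    return mean(high) - mean(low)

def wald_estimate(pre, post):
    """Reduced form divided by first stage, with the reform as instrument."""
    reduced_form = mean([y for _, y in post]) - mean([y for _, y in pre])
    first_stage = mean([s for s, _ in post]) - mean([s for s, _ in pre])
    return reduced_form / first_stage

pre_reform = build_cohort(reform=False)
post_reform = build_cohort(reform=True)

print("OLS contrast (pre-reform cohort):", round(ols_contrast(pre_reform), 3))
print("Wald/IV estimate:", round(wald_estimate(pre_reform, post_reform), 3))
```

In this constructed example the OLS contrast (2.0) mixes baseline differences between the two groups with the stayers' larger gain, whereas the Wald estimate (0.2) recovers only the compliers' smaller gain. Neither number need equal the average effect of schooling in the population.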

The broader question is whether compulsory schooling solves a market failure: why are individuals not attending school to begin with? If they are not attending because they are credit constrained, because they are poorly informed about its returns, or because they have different objectives than their parents or legal guardians (e.g., children want to go to school but their parents don’t want them to), then CSL could be beneficial. There could also be externalities that rationalize increasing education beyond what appears to be optimal for the individual (societal benefits). But in the absence of such failures, economic theory predicts that CSL could have negative returns.

The Role of Period Effects: Disease, Technology, and Information

Only two studies investigate sources of heterogeneity beyond gender. Gathmann, Jürges, and Reinhold (2015) report that earlier reforms appear to have had larger effects than did later reforms, but not necessarily at lower levels of schooling. Earlier cohorts had higher mortality rates, which suggests that for more recent cohorts, effects might not materialize or become statistically detectable until older ages. This might explain why studies that investigate pre-WWII cohorts are more likely to find effects. Also, effects appear larger in poorer countries. Glied and Lleras-Muney (2008) suggest another reason why the effects of education may differ across time periods: they find that CSLs lowered mortality in the United States, but these effects were much larger for diseases for which there was more medical innovation.

Scholars have documented that the health returns to education measured by OLS are also quite heterogeneous, particularly if one investigates these effects over long spans of time and across different populations. For example, the association between education and BMI varies with the level of development. Among poor countries, education is associated with higher weight, but the relationship becomes negative as countries get richer and fatter (Cutler & Lleras-Muney, 2014). Thus the effect of education depends on the stage of the “nutrition transition”: in countries where nutrition levels are low and infectious disease is prevalent, being overweight is an advantage, so the more educated and those with more resources will be overweight. But as nutrition levels improve and infectious diseases disappear, being overweight becomes costly as it leads to chronic diseases, such as cardiovascular disease. Those with knowledge and resources avoid being overweight. Indeed, a recent meta-analysis by Smith, Anderson, Salinas, Horvatek, and Baker (2015) finds that the OLS effect of education on chronic disease is, on average, negative, but it varies systematically from 0 to a negative number depending on the stage of the epidemiological transition. Our theory suggests an alternative (but not necessarily competing) hypothesis, operating solely through wealth, rather than knowledge or chronic disease. When countries are poor, due to diminishing returns to wealth, improvements in wealth raise the demand for unhealthy consumption more than the health costs decrease it. At higher levels of wealth (in more developed countries) the opposite happens: gains in wealth raise the demand for unhealthy consumption by less than the health costs decrease it.

Similarly, the effect of education on smoking has evolved over time. In the United States in the 1950s, more educated individuals were more likely to smoke, but this reversed as knowledge of the harms of smoking disseminated. What these examples suggest is that wealth, and the availability of information, play an important role in determining whether individuals undertake healthy behaviors.

Multiple Inputs Into Health

Clearly, smoking is determined by several factors, including knowledge but also income, wealth, cigarette taxes and prices, prices of complementary (or substitute) goods, peers, and possibly many other factors, including genetic risk. A different set of factors might affect (over)eating and exercise and therefore obesity. Thus, at a particular point in time, one could observe educated individuals undertaking one but not the other behavior because different factors are at play.

Indeed, the bias in OLS estimates of the effect of schooling seems to differ across outcomes, even within studies. Almost all studies considered here find statistically insignificant effects of education on smoking, and the IV estimates are generally smaller than their OLS counterparts. By contrast, many studies investigating mortality find IV effects that are larger than (or statistically identical to) OLS effects.

Several papers report IV estimates for more than one outcome, using the same sample and identification strategy. The pattern that emerges from these studies is somewhat puzzling but consistent with the idea that different outcomes have different inputs. Consider the results by Kemptner, Jürges, and Reinhold (2011). They find that for men OLS estimates of the effect of education on smoking are upward biased, whereas OLS estimates for obesity are downward biased. Jürges, Reinhold, and Salm (2011) find that for women the OLS estimate of the effect of education on smoking is downward biased but obesity estimates are upward biased.

This suggests that the relevant omitted variables are different for different outcomes (peers may matter for smoking but maybe less for exercise), or that the same omitted variable has different effects on different outcomes (e.g., income may increase smoking but also increase exercise and thus lower obesity). Thus, simple stories that suggest that a single factor, such as IQ, generates upward bias in the effect of education are too simplistic.29 Advances in the genetics and biology of health behaviors may improve understanding of why certain, otherwise identical, individuals engage more or less in specific unhealthy behaviors.
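
The second possibility, that one omitted variable biases OLS in opposite directions for different outcomes, can be made concrete with a small numerical sketch. It relies on the standard omitted-variable result that the OLS slope equals the true effect plus the omitted factor's effect on the outcome multiplied by the slope from regressing the omitted factor on schooling. The data and coefficients below are hypothetical, and labeling the omitted factor as income is an assumption made only to mirror the example in the text.

```python
# Hypothetical illustration of omitted-variable bias with one omitted factor
# that affects two outcomes in opposite directions. All numbers are made up.

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

schooling = [0, 0, 1, 1, 2, 2]
omitted = [0, 1, 1, 2, 2, 3]   # e.g., income; rises with schooling

true_effect = -0.1             # assumed causal effect of schooling on both outcomes
effect_on_smoking = 0.3        # assumed: omitted factor raises smoking
effect_on_obesity = -0.3       # assumed: omitted factor lowers obesity

smoking = [true_effect * s + effect_on_smoking * a
           for s, a in zip(schooling, omitted)]
obesity = [true_effect * s + effect_on_obesity * a
           for s, a in zip(schooling, omitted)]

ols_smoking = ols_slope(schooling, smoking)
ols_obesity = ols_slope(schooling, obesity)

print("True effect:", true_effect)
print("OLS slope, smoking:", round(ols_smoking, 3))
print("OLS slope, obesity:", round(ols_obesity, 3))
```

With these made-up inputs the OLS slope for smoking lies above the true effect while the OLS slope for obesity lies below it, even though the same omitted factor and the same true effect underlie both outcomes.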

Gender, Culture, and Peers

Many of the studies considered find substantial differences across gender, with the evidence typically weaker for women. Why this is the case is not clear. One reason could be that more educated women are more likely to delay births, have fewer children, and are more likely to use modern hormone-based contraceptives. These behaviors are thought to increase the chance of cancers of the reproductive system, because these cancers are a function of lifetime exposure to hormones.30

Pregnancy, which only women experience, is related to important changes in weight, health-seeking behaviors, and labor force participation. In many developed countries pregnant women and women with small children make more use of the healthcare system, regardless of their level of education; this may somewhat attenuate differences between more and less educated women. For instance, all women today are strongly discouraged from smoking during pregnancy. Moreover, pregnancy causes persistent decreases in hourly earnings and labor supply (e.g., Lundborg, Plug, & Rasmussen, 2017), potentially rendering the relationship between education and lifetime earnings, and, in turn, health, weaker among women than among men.

Another possibility is that the returns to education for women operated through the marriage market, rather than through their own labor income, given that labor force participation of women was low until recently. Finally, it is worth noting that some behaviors, such as smoking or drinking, were considered taboo for females at the turn of the 20th century, but as women entered the labor force these behaviors became acceptable and, in some cases, when adopted by the most successful women, became symbols of independence (Amos & Haglund, 2000). Thus, for more recent cohorts of women, it is possible that more successful women undertake more unhealthy behaviors, despite their health cost. This discussion suggests that for women it is key to incorporate the marriage and fertility effects of education, in addition to other behaviors that are partially culturally determined, to understand the net effect of education on women’s health.

More broadly, economic theories largely ignore social ties and norms as determinants of health. Many empirical papers suggest that peers are a key determinant of some behaviors, such as eating, smoking, and drinking (e.g., Cawley & Ruhm, 2012). And, as already discussed, the evidence from RCTs and IV studies suggests that one reason why schooling may affect smoking is that, in some instances, it alters the peer group. But research rarely considers the effects of isolation and networks on health directly, independently of these behaviors. Many empirical studies find strong associations between social and community ties and mortality (starting with Berkman & Syme, 1979; see Kawachi, Subramanian, & Kim, 2008; Holt-Lunstad, Smith, & Layton, 2010 for reviews). Education, and the type of education individuals receive, affect the size and quality of their social connections and their integration into the communities in which they live. There are important differences across gender in how ties are formed and in the types of ties men and women have. Women also appear to be differently affected by social connections. For example, the effect of marriage on mortality is very different by gender: marriage lowers mortality more among men than among women, and widowhood increases mortality more for men than for women (Smith & Christakis, 2008).

Some Concluding Thoughts and Directions for Future Research

An attempt has been made to provide some insight into the findings of a large literature seeking to understand the effect of education on health outcomes. While there are some discrepancies across studies that appear to be driven by methodological differences, there appears to be substantial true heterogeneity in the effect of education on health outcomes. This heterogeneity varies depending on the outcome studied, the country, and the time period. The theoretical determinants of health have been investigated to try to understand this heterogeneity. Additionally, an effort has been made to gain insight into this heterogeneity by hypothesizing ex-post about the factors that might explain observed differences across empirical studies.

Our conclusion, based on reviewing twin studies and instrumental variables (IV) studies, is that education lowers mortality among men, but only for some populations and time periods. However, the results based on IV are often very imprecise so it remains possible that differences in the findings across studies are spurious. A second conclusion is that education does not appear to affect smoking in a causal sense, except for disadvantaged populations, or for those whose peer group changes with the amount or type of schooling they obtain. Third, the evidence on the effect of education on obesity is weak. Finally, across all studies, methods, and outcomes, there are substantial differences between men and women, with men’s outcomes depending more strongly on education than women’s.

Next, a few broader conclusions are summarized. There is an important difference between time spent in school, which is an input, and the actual skills that are developed during that time. It may very well be that additional schooling does not lead to additional skills (that is, the complier affected by a particular reform does not learn much from schooling), or that the skills learned are not useful to the complier given her labor-market prospects. Research should seek to move toward a better understanding of observed heterogeneity in the effects of education by evaluating the benefits of reforms to those affected.

Relatedly, the quasi-experiments used to estimate the effects of education on health outcomes almost exclusively derive from changes in the minimum school-leaving age. Whereas these reforms often provide convincing, random-like variation in attended years of schooling for cohorts born (in some cases) only a few months apart, they affect only a specific group of compliers at a particular margin: those for whom the minimum school-leaving age happened to coincide with the end of their educational career. The theory suggests there are various other policy instruments encouraging skill formation that are plausibly exogenous to the individual, for example prices $p_S$ and $p_\theta$ and subsidies $\lambda_S$, or the term length $\tau_S$. It would be informative to see how these other types of reforms, with different types of compliers and at different margins, affect later-life health outcomes.31

Future work should also link short-, medium-, and long-term outcomes. Most studies of mortality do not observe important determinants of mortality, such as health behaviors or income across the life cycle, and few studies include measures of what occurred in school in terms of nutrition, and cognitive and non-cognitive skill development. Studies must move beyond asking whether or not “there is an effect” to tracing the pathways by which particular reforms impact individuals in a comprehensive manner. This deeper investigation, linking short- and long-term outcomes, can shed light on why there is what we believe to be true heterogeneity across studies in the effects of education on health.

Studies could also benefit from investigating the returns to the skills that are learned in school—these returns depend greatly on economic and social circumstances and institutions. Greater attention should be paid to the fact that different health behaviors or health outcomes have different determinants: the simple notion that education has the same (beneficial or detrimental) effect across outcomes does not seem to be supported by the data.

The measures available in, and the sample sizes of, existing data sources are probably too limited to undertake these kinds of detailed studies, particularly if one wants to use quasi-experiments for identification. Thus, a possible future direction for research would be to combine results from studies using large-scale administrative databases, which have wide population coverage but a small number of outcomes, with analyses using data that have smaller sample sizes but richer survey measures.

In general, heterogeneity is investigated without much theoretical guidance. Even the simple theory presented here makes rich predictions as to how and why effects of education may differ. A greater integration of theoretical and experimental work would be useful.32 In particular, a promising avenue seems to be the exploration, guided by theory, of “interactions” between determinants of education and health. For instance, education might make more of a difference when there is information about the dangers of smoking, but less so when there are smoking bans.

Both our theory and our review of the empirical evidence fall short in some specific areas. One important area is the modeling and documenting of the role of peer effects and social factors in the determination of health gaps by education. Our model operates at the individual level, and important factors, such as the influence of spouses and friends, are not incorporated. We also do not model, and did not discuss, genetic factors. A new, important, and exciting area of research investigates the role of genetic risk, and more specifically the interaction between genes and the environment. Genetic make-up is typically treated as a potential omitted variable, and is implicitly taken into account when using within-twins designs or carefully conducted IV studies. But this approach assumes genetic factors are additive and do not interact with the environment. This view of the effect of genetic endowments has proved to be overly simplistic: it is increasingly appreciated that complex interactions exist between genes and environment (Turkheimer, 2000; Heckman, 2007). Lastly, many common health behaviors, such as smoking and eating, have important addictive characteristics, whereas our theory and analysis of the empirical evidence are based on a set of strong assumptions that include perfect information and rational decision-making. Greater incorporation of bounded rationality into health decision-making and a greater understanding of how education affects imperfect decision-making would be useful.

A key difficulty with quasi-experiments is that they rely on variation over time or across cohorts. Health and mortality vary substantially with age. As Davies et al. (2016, p. 14) explain: “people affected by the reform, are one year younger than our control group, those unaffected by the reform. (. . .) Many of the outcomes we investigated increase linearly or log-linearly over time. This means it is difficult to determine if any differences we observed are due to an additional year of aging or the reform.” Similarly, there have been large secular improvements in health resulting in much lower mortality and much better health at any age among recent cohorts. It is very difficult to identify the effects of reforms separately from age effects and from secular trends across cohorts, particularly if one wants to account for these trends flexibly, in a non-parametric fashion. This suggests that exploiting cross-sectional variation, that is, comparing individuals born around the same time but who obtain different levels of schooling through a randomized or natural experiment, could be very informative because it alleviates the need to control for secular trends over cohorts and time.
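
The age confound described above can be sketched with a few lines of arithmetic. The example below is purely hypothetical: the survey year, birth-year cutoff, linear age slope, and zero true effect are assumptions chosen for exposition, and the function names are illustrative. It compares the first cohort exposed to a reform with the last unexposed cohort in a single survey year, and then compares treated and untreated individuals within one birth cohort.

```python
# Hypothetical illustration: in a single survey year, cohorts exposed to a
# reform are younger than unexposed cohorts, so age trends contaminate the
# comparison. All numbers are made up for exposition.

def outcome(age, treated, age_slope, true_effect):
    """Outcome that rises linearly with age, plus a treatment effect."""
    return age_slope * age + true_effect * treated

def cohort_gap(survey_year, cutoff_birth_year, age_slope, true_effect):
    """Compare the first exposed birth cohort with the last unexposed one."""
    exposed_age = survey_year - cutoff_birth_year
    unexposed_age = survey_year - (cutoff_birth_year - 1)
    return (outcome(exposed_age, 1, age_slope, true_effect)
            - outcome(unexposed_age, 0, age_slope, true_effect))

def within_cohort_gap(survey_year, birth_year, age_slope, true_effect):
    """Compare treated and untreated individuals from the same birth cohort."""
    age = survey_year - birth_year
    return (outcome(age, 1, age_slope, true_effect)
            - outcome(age, 0, age_slope, true_effect))

gap_across = cohort_gap(2010, 1950, age_slope=0.5, true_effect=0.0)
gap_within = within_cohort_gap(2010, 1950, age_slope=0.5, true_effect=0.0)

print("Gap across adjacent cohorts:", gap_across)
print("Gap within one cohort:", gap_within)
```

With a true effect of zero, the across-cohort gap equals minus one year of the assumed age trend, whereas the within-cohort gap is zero. The sketch abstracts from sampling noise and from secular trends across cohorts, both of which complicate real applications further.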

Finally, in compiling our tables it was difficult to compare across studies because the information empirical researchers report varies greatly. Not all studies report the number of observations, or basic summary statistics, such as the mean and standard deviation of the dependent variable or of education. Some quasi-experimental studies do not report OLS results, and some papers do not report results separately by gender. The choice of the dependent variable and the functional form used across studies also varies widely. Studies do not systematically report whether results are sensitive to these choices or compare their estimates to existing estimates in the literature. Neither do studies report how results vary when one includes covariates that were included in other existing studies. This lack of uniformity makes it very challenging to summarize and compare findings, particularly the magnitudes of their effects. Future work should try to systematically report basic information, and include results of models that are identical to those previously estimated, for greater transparency and so that comparisons can be made more easily.

Acknowledgments


Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under Award Numbers K02AG042452 (Galama), RF1AG055654 (Galama), R01AG037398 (Galama and Van Kippersluis). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Van Kippersluis thanks the Netherlands Organization of Scientific Research for financial support (NWO Veni grant 016.145.082). Titus Galama is grateful to the School of Economics of Erasmus University Rotterdam for a Visiting Professorship in the Economics of Human Capital. We would like to thank Carolina Arteaga, Jingyi Fang, and Keyoung Lee for excellent research assistance, and Martin Karlsson, Martin Salm, Peter Savelyev, and two anonymous reviewers for excellent comments.

Further Reading

Almond, D., Currie, J., & Duque, V. (2017). Childhood Circumstances and Adult Outcomes: Act II (No. w23017). National Bureau of Economic Research.

Cawley, J., & Ruhm, C. J. (2012). The Economics of Risky Health Behaviors. In Mark V. Pauly, Thomas G. McGuire, & Pedro Pita Barros (Eds.), Handbook of Health Economics (chapter 3, pp. 95–199). Amsterdam: North Holland.

Cutler, D. M., & Glaeser, E. (2005). What Explains Differences in Smoking, Drinking, and Other Health-Related Behaviors? The American Economic Review, 95(2), 238–242.

Cutler, D. M., & Lleras-Muney, A. (2014). Education and health: Insights from international comparisons. In A. J. Culyer (Ed.), Encyclopedia of health economics (vol. 1, pp. 232–245). San Diego, CA: Elsevier.

Galama, T. J., & Van Kippersluis, H. (2015). A Theory of Education and Health. CESR Working Paper #2015-001.

Grossman, M. (2015). The Relationship between Health and Schooling: What’s New? Nordic Journal of Health Economics, 3(1), 7–17.

Heckman, J. J., Humphries, J. E., & Veramendi, G. (2016). Returns to education: The causal effects of education on earnings, health, and smoking. National Bureau of Economic Research Working Paper No. 22291.

Lochner, L. (2011). Nonproduction Benefits of Education: Crime, Health, and Good Citizenship. In E. Hanushek, S. Machin, & L. Woessmann (Eds.), Handbook of the Economics of Education (vol. 4, chapter 2). Amsterdam: Elsevier Science.

References


Aizer, A., & Cunha, F. (2012). The production of human capital: Endowments, investments and fertility. National Bureau of Economic Research Working Paper No. w18429.

Albouy, V., & Lequien, L. (2009). Does compulsory education lower mortality? Journal of Health Economics, 28, 155–168.

Almond, D., Currie, J., & Duque, V. (2017). Childhood circumstances and adult outcomes: Act II. National Bureau of Economic Research Working Paper No. w23017.

Amin, V., Behrman, J. R., & Spector, T. D. (2013). Does more schooling improve health outcomes and health related behavior? Evidence from U.K. twins. Economics of Education Review, 35, 134–148.

Amos, A., & Haglund, M. (2000). From social taboo to “torch of freedom”: The marketing of cigarettes to women. Tobacco Control, 9(1), 3–8.

Arrow, K. J. (1973). Higher education as a filter. Journal of Public Economics, 2(3), 193–216.

Barcellos, S. H., Carvalho, L. S., & Turley, P. (2017). Distributional effects of education on health. Mimeo.

Becker, G. S. (1964). Human capital: A theoretical and empirical analysis, with special reference to education. New York: National Bureau of Economic Research.

Becker, G. S. (2007). Health as human capital: Synthesis and extensions. Oxford Economic Papers, 59(3), 379–410.

Bedard, K. (2001). Human capital versus signaling models: University access and high school dropouts. Journal of Political Economy, 109(4), 749–775.

Behrman, J. R., Kohler, H. P., Jensen, V. M., Pedersen, D., Petersen, I., Bingley, P., & Christensen, K. (2011). Does more schooling reduce hospitalization and delay mortality? New evidence based on Danish twins. Demography, 48, 1347–1375.

Ben-Porath, Y. (1967). The production of human capital and the life cycle of earnings. Journal of Political Economy, 75(4, Part 1), 352–365.

Berkman, L. F., & Syme, S. L. (1979). Social networks, host resistance, and mortality: A nine-year follow-up study of Alameda County residents. American Journal of Epidemiology, 109(2), 186–204.

Bianchi, N. (2016). The indirect effects of educational expansions: Evidence from a large enrollment increase in STEM majors. Working paper.

Bijwaard, G. E., Van Kippersluis, H., & Veenman, J. (2015). Education and health: The role of cognitive ability. Journal of Health Economics, 42, 29–43.

Black, D. A., Hsu, Y. C., & Taylor, L. J. (2015). The effect of early-life education on later-life mortality. Journal of Health Economics, 44, 1–9.

Black, S. E., Devereux, P. J., & Salvanes, K. G. (2008). Staying in the classroom and out of the maternity ward? The effect of compulsory schooling laws on teenage births. Economic Journal, 118(530), 1025–1054.

Bold, T., Filmer, D., Martin, G., Molina, E., Stacy, B., Rockmore, C., et al. (2017). Enrollment without learning: Teacher effort, knowledge, and skill in primary schools in Africa. Journal of Economic Perspectives, 31(4), 185–204.

Bound, J., & Solon, G. (1999). Double trouble: On the value of twins-based estimation of the return to schooling. Economics of Education Review, 18(2), 169–182.

Braakmann, N. (2011). The causal relationship between education, health and health related behavior: Evidence from a natural experiment in England. Journal of Health Economics, 30, 753–763.

Brunello, G., Fabbri, D., & Fort, M. (2013). The causal effect of education on body mass: Evidence from Europe. Journal of Labor Economics, 31(1), 195–223.

Buckles, K., Hagemann, A., Malamud, O., Morrill, M., & Wozniak, A. (2016). The effect of college education on mortality. Journal of Health Economics, 50, 99–144.

Campbell, F., Conti, G., Heckman, J. J., Moon, S. H., Pinto, R., Pungello, E., & Pan, Y. (2014). Early childhood investments substantially boost adult health. Science, 343, 1478–1485.

Card, D. (2001). Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica, 69(5), 1127–1160.

Case, A., & Deaton, A. S. (2005). Broken down by work and sex: How our health declines. In D. A. Wise (Ed.), Analyses in the economics of aging (pp. 185–212). Chicago: University of Chicago Press.

Cawley, J., & Ruhm, C. J. (2012). The economics of risky health behaviors. In M. V. Pauly, T. G. McGuire, & P. P. Barros (Eds.), Handbook of health economics (pp. 95–199). Amsterdam: North Holland.

Caputo, M. R. (2005). Foundations of Dynamic Economic Analysis. Cambridge: Cambridge University Press.

Chetty, R. (2009). Sufficient statistics for welfare analysis: A bridge between structural and reduced-form methods. Annual Review of Economics, 1, 451–488.

Conti, G., Heckman, J. J., & Pinto, R. (2016). The effects of two influential early childhood interventions on health and healthy behaviour. Economic Journal, 126, F28–F65.

Conti, G., Heckman, J. J., & Urzua, S. (2010). The education-health gradient. American Economic Review Papers and Proceedings, 100, 234–238.

Contoyannis, P., Jones, A. M., & Rice, N. (2004). The dynamics of health in the British Household Panel Survey. Journal of Applied Econometrics, 19(4), 473–503.

Clark, D., & Royer, H. (2013). The effect of education on adult mortality and health: Evidence from Britain. American Economic Review, 103(6), 2087–2120.

Clay, K., Lingwall, J., & Stephens, M., Jr. (2016). Laws, educational outcomes, and returns to schooling: Evidence from the Full Count 1940 Census. National Bureau of Economic Research Working Paper No. w22855.

Connor Gorber, S., Tremblay, M., Moher, D., & Gorber, B. (2007). A comparison of direct vs. self-report measures for assessing height, weight and body mass index: A systematic review. Obesity Reviews, 8, 307–326.

Cunha, F., & Heckman, J. (2007). The technology of skill formation. American Economic Review, 97(2), 31–47.

Cutler, D. M., & Glaeser, E. (2005). What explains differences in smoking, drinking, and other health-related behaviors? American Economic Review, 95(2), 238–242.

Cutler, D. M., Huang, W., & Lleras-Muney, A. (2015). When does education matter? The protective effect of education for cohorts graduating in bad times. Social Science & Medicine, 127, 63–73.

Cutler, D. M., & Lleras-Muney, A. (2010). Understanding differences in health behaviors by education. Journal of Health Economics, 29(1), 1–28.

Cutler, D. M., Lleras-Muney, A., & Vogl, T. (2011). Socioeconomic status and health: Dimensions and mechanisms. In S. Glied & P. C. Smith (Eds.), The Oxford Handbook of Health Economics (pp. 124–163). Oxford: Oxford University Press.

Dalgaard, C. J., & Strulik, H. (2014). Optimal aging and death: Understanding the Preston curve. Journal of the European Economic Association, 12(3), 672–701.

Davies, N. M., Dickson, M., Smith, G. D., Van den Berg, G., & Windmeijer, F. (2016). The causal effects of education on health, mortality, cognition, well-being, and income in the UK Biobank. Preprint.

Deaton, A. (2010). Instruments, randomization, and learning about development. Journal of Economic Literature, 48(2), 424–455.

Devereux, P. J., & Hart, R. A. (2010). Forced to be rich? Returns to compulsory schooling in Britain. Economic Journal, 120(549), 1345–1364.

de Walque, D. (2007). Does education affect smoking behaviors? Evidence using the Vietnam draft as an instrument for college education. Journal of Health Economics, 26, 877–895.

Ehrlich, I., & Chuma, H. (1990). A model of the demand for longevity and the value of life extension. Journal of Political Economy, 98(4), 761–782.

Etilé, F., & Jones, A. M. (2011). Schooling and smoking among the baby boomers—An evaluation of the impact of educational expansion in France. Journal of Health Economics, 30, 811–831.

Finkelstein, A., Hendren, N., & Luttmer, E. F. P. (2015). The value of Medicaid: Interpreting results from the Oregon Health Insurance Experiment. NBER Working Paper No. 21308.

Fischer, M., Karlsson, M., & Nilsson, T. (2013). Effects of compulsory schooling on mortality: Evidence from Sweden. International Journal of Environmental Research and Public Health, 10, 3596–3618.

Fischer, M., Karlsson, M., Nilsson, T., & Schwarz, N. (2016). The sooner the better? Compulsory schooling reforms in Sweden. IZA Working Paper No. 10430.

Flegal, K. M., Kit, B. K., Orpana, H., & Graubard, B. I. (2013). Association of all-cause mortality with overweight and obesity using standard body mass index categories: A systematic review and meta-analysis. Journal of the American Medical Association, 309, 71–82.

Fletcher, J. M. (2015). New evidence of the effects of education on health in the US: Compulsory schooling laws revisited. Social Science & Medicine, 127, 101–107.

Fujiwara, T., & Kawachi, I. (2009). Is education causally related to better health? A twin fixed-effects study in the USA. International Journal of Epidemiology, 38, 1310–1322.

Galama, T. J. (2015). A contribution to health-capital theory. CESR Working Paper No. 2015-004.

Galama, T. J., & Van Kippersluis, H. (2015). A theory of education and health. CESR Working Paper No. 2015-001.

Galama, T. J., & Van Kippersluis, H. (2018). A theory of socioeconomic disparities in health over the life cycle. Economic Journal.

Gathmann, C., Jürges, H., & Reinhold, S. (2015). Compulsory schooling reforms, education and mortality in twentieth century Europe. Social Science & Medicine, 127, 74–82.

Glied, S., & Lleras-Muney, A. (2008). Technological innovation and inequality in health. Demography, 45(3), 741–761.

Goldin, C. D., & Katz, L. F. (2009). The race between education and technology. Cambridge, MA: Harvard University Press.

Grimard, F., & Parent, D. (2007). Education and smoking: Were Vietnam War draft avoiders also more likely to avoid smoking? Journal of Health Economics, 26, 896–926.

Grossman, M. (1972). On the concept of health capital and the demand for health. Journal of Political Economy, 80(2), 223–255.

Grossman, M. (2015). The relationship between health and schooling: What’s new? Nordic Journal of Health Economics, 3(1), 7–17.

Hall, R. E., & Jones, C. I. (2007). The value of life and the rise in health spending. Quarterly Journal of Economics, 122(1), 39–72.

Hanushek, E. A., & Zhang, L. (2009). Quality-consistent estimates of international schooling and skill gradients. Journal of Human Capital, 3(2), 107–143.

Heckman, J. J. (2007). The economics, technology, and neuroscience of human capability formation. Proceedings of the National Academy of Sciences, 104(33), 13250–13255.

Heckman, J. J., Humphries, J. E., & Veramendi, G. (2016). Returns to education: The causal effects of education on earnings, health, and smoking. National Bureau of Economic Research Working Paper No. 22291.

Heckman, J., Pinto, R., & Savelyev, P. (2013). Understanding the mechanisms through which an influential early childhood program boosted adult outcomes. American Economic Review, 103(6), 2052–2086.

Heckman, J. J., Stixrud, J., & Urzua, S. (2006). The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior. Journal of Labor Economics, 24(3), 411–482.

Heckman, J. J., & Urzua, S. (2010). Comparing IV with structural models: What simple IV can and cannot identify. Journal of Econometrics, 156(1), 27–37.

Holt-Lunstad, J., Smith, T. B., & Layton, J. B. (2010). Social relationships and mortality risk: A meta-analytic review. PLoS Medicine, 7(7), e1000316.

Huang, W. (2016). Understanding the effects of education on health: Evidence from China. Mimeo.

Hummer, R. A., & Hernandez, E. M. (2013). The effect of educational attainment on adult mortality in the United States. Population Bulletin, 68(1), 1–16.

Imbens, G. W. (2010). Better LATE than nothing: Some comments on Deaton (2009) and Heckman and Urzua (2009). Journal of Economic Literature, 48(2), 399–423.

Imbens, G. W., & Angrist, J. D. (1994). Identification and estimation of local average treatment effects. Econometrica, 62(2), 467–475.

James, J. (2015). Health and education expansion. Economics of Education Review, 49, 193–215.

Jensen, R., & Lleras-Muney, A. (2012). Does staying in school (and not working) prevent teen smoking and drinking? Journal of Health Economics, 31, 644–657.

Jürges, H., Reinhold, S., & Salm, M. (2011). Does schooling affect health behavior? Evidence from educational expansion in Western Germany. Economics of Education Review, 30, 862–872.

Kawachi, I., Subramanian, S. V., & Kim, D. (2008). Social capital and health. In I. Kawachi, S. V. Subramanian, & D. Kim, (Eds.), Social capital and health (pp. 1–26). New York: Springer.

Kemptner, D., Jürges, H., & Reinhold, S. (2011). Changes in compulsory schooling and the causal effect of education on health: Evidence from Germany. Journal of Health Economics, 30, 340–354.

Kenkel, D., Lillard, D., & Mathios, A. (2006). The roles of high school completion and GED receipt in smoking and obesity. Journal of Labor Economics, 24(3), 635–660.

Khang, Y. H., Lynch, J. W., Yang, S., Harper, S., Yun, S. C., Jung-Choi, K., et al. (2009). The contribution of material, psychosocial, and behavioral factors in explaining educational and occupational mortality inequalities in a nationally representative sample of South Koreans: Relative and absolute perspectives. Social Science & Medicine, 68(5), 858–866.

Lager, A. C. J., & Torssander, J. (2012). Causal effect of education on mortality in a quasi-experiment on 1.2 million Swedes. Proceedings of the National Academy of Sciences, 109(22), 8461–8466.

Lang, K., & Kropp, D. (1986). Human capital versus sorting: The effects of compulsory attendance laws. Quarterly Journal of Economics, 101(3), 609–624.

Lemieux, T. (2006). The “Mincer equation” thirty years after schooling, experience, and earnings. In S. Grossbard (Ed.), Jacob Mincer: A Pioneer of Modern Labor Economics (pp. 127–145). New York: Springer.

Leuven, E., Plug, E., & Rønning, M. (2016). Education and cancer risk. Labour Economics, 43, 106–121.

Lleras-Muney, A. (2005). The relationship between education and adult mortality in the United States. Review of Economic Studies, 72(1), 189–221.

Lleras-Muney, A., & Shertzer, A. (2015). Did the Americanization movement succeed? An evaluation of the effect of English-only and compulsory schooling laws on immigrants. American Economic Journal: Economic Policy, 7(3), 258–290.

Li, J., & Powdthavee, N. (2015). Does more education lead to better health habits? Evidence from the school reforms in Australia. Social Science & Medicine, 127, 83–91.

Lochner, L. (2011). Nonproduction benefits of education: Crime, health, and good citizenship. Handbook of the Economics of Education, 4, 183–262.

Lochner, L., & Moretti, E. (2004). The effect of education on crime: Evidence from prison inmates, arrests, and self-reports. American Economic Review, 94(1), 155–189.

Lundborg, P. (2013). The health returns to schooling—What can we learn from twins? Journal of Population Economics, 26, 673–701.

Lundborg, P., Lyttkens, C. H., & Nystedt, P. (2016). The effect of schooling on mortality: New evidence from 50,000 Swedish twins. Demography, 53, 1135–1168.

Lundborg, P., Plug, E., & Rasmussen, A. W. (2017). Can women have children and a career? IV evidence from IVF treatments. American Economic Review, 107(6), 1611–1637.

Mackenbach, J. P., Huisman, M., Andersen, O., Bopp, M., Borgan, J. K., Borrell, C., et al. (2004). Inequalities in lung cancer mortality by the educational level in 10 European populations. European Journal of Cancer, 40(1), 126–135.

Madsen, M., Andersen, A. M. N., Christensen, K., Andersen, P. K., & Osler, M. (2010). Does educational status impact adult mortality in Denmark? A twin approach. American Journal of Epidemiology, 172(2), 225–234.

Mazumder, B. (2008). Does education improve health? A reexamination of the evidence from compulsory schooling laws. Economic Perspectives, 32(2), 2.

Meara, E. R., Richards, S., & Cutler, D. M. (2008). The gap gets bigger: Changes in mortality and life expectancy, by education, 1981–2000. Health Affairs, 27(2), 350–360.

Meghir, C., & Palme, M. (2005). Educational reform, ability, and family background. American Economic Review, 95(1), 414–424.

Meghir, C., Palme, M., & Simeonova, E. (2017). Education and mortality: Evidence from a social experiment. American Economic Journal: Applied Economics.

Mincer, J. (1974). Schooling, experience and earnings. New York: National Bureau of Economic Research.

Mokdad, A. H., Marks, J. S., Stroup, D. F., & Gerberding, J. L. (2004). Actual causes of death in the United States. Journal of the American Medical Association, 291(10), 1238–1245.

Oreopoulos, P., Von Wachter, T., & Heisz, A. (2012). The short- and long-term career effects of graduating in a recession. American Economic Journal: Applied Economics, 4(1), 1–29.

Palme, M., & Simeonova, E. (2015). Does women’s education affect breast cancer risk and survival? Evidence from a population based social experiment in education. Journal of Health Economics, 42, 115–124.

Pappas, G., Queen, S., Hadden, W., & Fisher, G. (1993). The increasing disparity in mortality between socioeconomic groups in the United States, 1960 and 1986. New England Journal of Medicine, 329(2), 103–109.

Park, C., & Kang, C. (2008). Does education induce health lifestyle? Journal of Health Economics, 27, 1516–1531.

Pischke, J. S., & Von Wachter, T. (2008). Zero returns to compulsory schooling in Germany: Evidence and interpretation. Review of Economics and Statistics, 90(3), 592–598.

Reinhold, S., & Jürges, H. (2010). Secondary school fees and the causal effect of schooling on health behavior. Health Economics, 19, 994–1001.

Savelyev, P. A. (2014). Conscientiousness, education, and longevity of high-ability individuals (Unpublished manuscript). Vanderbilt University, Department of Economics.

Seierstad, A., & Sydsaeter, K. (1987). Optimal Control Theory With Economic Applications. In C. J. Bliss & M. D. Intriligator (Eds.), Advanced Textbooks in Economics (Vol. 24), Series Editors. North Holland: Elsevier.

Silles, M. (2015). The causal effect of schooling on smoking behavior. Economics of Education Review, 48, 102–116.

Smith, K. P., & Christakis, N. A. (2008). Social networks and health. Annual Review of Sociology, 34(1), 405–429.

Smith, W. C., Anderson, E., Salinas, D., Horvatek, R., & Baker, D. P. (2015). A meta-analysis of education effects on chronic disease: The causal dynamics of the Population Education Transition Curve. Social Science & Medicine, 127, 29–40.

Spence, M. (1973). Job market signaling. Quarterly Journal of Economics, 87(3), 355–374.

Strulik, H. (2018). The return to education in terms of wealth and health. Journal of the Economics of Ageing, 12(2018), 1–14.

Todd, P. E., & Wolpin, K. I. (2006). Assessing the impact of a school subsidy program in Mexico: Using a social experiment to validate a dynamic behavioral model of child schooling and fertility. American Economic Review, 96(5), 1384–1417.

Turkheimer, E. (2000). Three laws of behavior genetics and what they mean. Current Directions in Psychological Science, 9(5), 160–164.

Van den Berg, G., Janys, L., & Christensen, K. (2012). The causal effect of education on mortality. Mimeo.

Van Kippersluis, H., & Galama, T. J. (2014). Wealth and health behavior: Testing the concept of a health cost. European Economic Review, 72, 197–220.

Van Kippersluis, H., O’Donnell, O., & Van Doorslaer, E. (2011). Long-run returns to education: Does schooling lead to an extended old age? Journal of Human Resources, 46(4), 695–721.

Webbink, D., Martin, N. G., & Visscher, P. M. (2010). Does education reduce the probability of being overweight? Journal of Health Economics, 29(1), 29–38.

Weiss, A. (1995). Human capital vs. signalling explanations of wages. Journal of Economic Perspectives, 9(4), 133–154.

Appendix: Derivations

Optimality Conditions

Optimal schooling age S: The condition for the optimal length of schooling S follows from the dynamic envelope theorem (e.g., Caputo, 2005, p. 293):


where S− indicates the limit in which S is approached from below, and S+ the limit in which it is approached from above. Noting that state and co-state functions are continuous in S, and λL(S)[τS(S)τθ(S)τL(S)]=0, we obtain


where we have replaced the limits S− and S+ with S for functions that are continuous in S. The left-hand side (LHS) of (21) represents the benefits of entering the labor market consisting of gains in labor income Y(S+) − Y(S−) (e.g., no longer does time need to be devoted to schooling τS(t) and/or wages may be higher after graduation than while in school), the monetary value of no longer suffering disutility from being in school cS(t)0, and not having to pay tuition pS(t). The right-hand side (RHS) represents the benefit of staying in school, consisting of the schooling subsidy λS(S)pS(S) (first term), not having to pay a fine if younger than the minimum school-leaving age S_ (second term), higher lifetime earnings from additional schooling (third term), and the value of higher levels of skill investment and of health investment while in school due to the possibly lower opportunity cost of time resulting from lower wages (e.g., if laws constrain labor before the school-leaving age) and from less time that can be devoted to work during schooling years (fourth and fifth term). The sixth term reflects the possibility that the cost of skill investment pθ(t)Xθ(t) is subsidized when in school (t<S), providing another benefit of schooling. Finally, if time substitutes for goods and services XH(t) in the production of health investment, then the final term on the RHS represents the benefit of schooling in terms of reduced expenditures on health investment if the opportunity cost of time is lower during schooling.

Condition (21) weighs multiple costs and benefits of schooling that factor into the decision of the optimal age at which to leave school and enter the labor market. Some of these costs and benefits are arguably not as important as others for most individuals. For example, school-age individuals are generally in good health, resulting in a relatively low marginal value of health qh/a(t) and few medical expenditures pH(t)XH(t) (e.g., Galama & Van Kippersluis, 2018). Therefore, it seems safe to assume that terms involving health production are small.

Optimal skill investment: The first order condition for skill investment is given by


where πθ(t) is the marginal cost of skill investment


and qθ/a(t) is the relative marginal value of skill, which evolves according to


Optimal health investment: The first-order condition for health investment is given by


where πH(t) is the marginal cost of health investment


and qh/a(t) is the relative marginal value of health, which evolves according to


Optimal consumption: The condition for optimal consumption is obtained by taking the derivative of the Lagrange function (11) with respect to XC. This leads to equation (15).

Optimal length of life T: The condition for the optimal length of life T follows from the dynamic envelope theorem (e.g., Caputo, 2005, p. 293):


This leads to equation (16).

Comparative Dynamic Analyses

Optimal length of schooling S: Using condition (20), we can explore the effect of variations in model parameters Z0 on the optimal schooling decision. Note that L(S)=0 implies that


This can be rewritten as


We can further develop (30) into


where the last three rows represent the direct effect of specific variations with respect to prices pθ(S), pH(S), pS(S) and institutional characteristics S_, λS(S), and f*(S). By contrast, the other terms in the expression apply more generally to all types of variations δZ0. These reflect indirect effects on the marginal value of wealth qA(S), the relative marginal value of skill qθ/a and of health qh/a(S), and the stocks of skill θ(S) and of health H(S).

Further, by substituting Z0=S_, and assuming that individuals value skill qθ/a(t) more than health qh/a(t) early in life, we obtain the comparative dynamic effect of the minimum school-leaving age on schooling:


where the first term on the LHS, ∂L(S)/∂S|S_,T, is positive under the assumption of diminishing returns to schooling. Note that ∂T/∂S_ is also a function of ∂qA(S)/∂S_, ∂qθ/a(S)/∂S_, etc. However, when fully substituting ∂T/∂S_ using the information in (37) and (38), the expression becomes too cumbersome to work with. We therefore choose to informally discuss the full effect using (17) in the main text.

Consumption: The comparative dynamic effect of the minimum school-leaving age S_ on the optimal consumption path XC(t) is given by


where the term on the LHS of equation (33) is positive under the plausible assumptions of diminishing marginal utility of consumption and constant or increasing returns to scale in the health cost of unhealthy consumption (see Van Kippersluis & Galama, 2014). The term β1,C in (18) is given by the coefficient in front of ∂qA(t)/∂S_, and the term β2,C is given by the coefficient in front of ∂qh/a(t)/∂S_. The signs of both ∂²U/∂XC∂H and ∂²dH/∂H∂XC are unknown, and hence β3,C is hard to sign.

Optimal length of life T: Varying an initial condition, end condition, or model parameter Z0 in equation (28), we have


From (34) we have


Consistent with diminishing returns to life extension (Ehrlich & Chuma, 1990), we assume


(see Galama & Van Kippersluis, 2015, 2018), in which case we can identify the sign of the variation in life expectancy from


Taking the first derivative of the optimality condition (T)=0 (see 11 and 28) with respect to an initial condition, end condition, or model parameter Z0, and holding length of life T and schooling S fixed, we obtain


where (T)/θ=qθ(t)/t|t=T, A(T)/Z0=H(T)/Z0=0 because A(T) and H(T) are fixed, and qθ(T)/Z0|T=0 because qθ(T)=0, regardless of Z0. Further, (T)/ξ(t)|S,T=0 for any control function ξ(t), because these are the necessary first-order conditions.33

The condition for the effect of the minimum school-leaving age S_ on optimal length of life T can be written as (see 37)


Hence we can infer the sign of ∂T/∂S_ by studying


where β1,T in (19) is given by the coefficient in front of ∂qA(T)/∂S_, β2,T is given by the coefficient in front of ∂θ/∂S_, etc. Because health H(t) declines by definition near the end of life (approaching Hmin from above) and assets A(t) do too (given that individuals tend to build wealth early in life and spend it later in life), the first term is positive if increasing the minimum schooling age S_ increases schooling S and schooling in turn generates wealth, extending life (wealth effect). The second term is positive because the marginal value of skills declines near the end of life (skill effect). The third term is positive (marginal value of health effect) because ∂H(t)/∂t|t=T < 0 and ∂qh/a(T)/∂S_|T,S < 0 (see Galama & Van Kippersluis, 2018). The last term is negligible, because individuals tend to reduce working hours in old age (a retirement phase), and earnings at the point of death are zero, Y(T)=0.
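
The sign arguments for length of life rest on implicit differentiation of an optimality condition. As a generic sketch only (F, T*, and Z_0 are stand-in notation introduced here for illustration, not a reproduction of the authors' numbered equations), suppose the optimal length of life T* solves F(T, Z_0) = 0. Then:

```latex
F\bigl(T^{*}(Z_0),\, Z_0\bigr) = 0
\;\Longrightarrow\;
\frac{\partial F}{\partial T}\,\frac{dT^{*}}{dZ_0}
  + \left.\frac{\partial F}{\partial Z_0}\right|_{T} = 0
\;\Longrightarrow\;
\frac{dT^{*}}{dZ_0}
  = -\,\frac{\left.\partial F/\partial Z_0\right|_{T}}{\partial F/\partial T}.
```

If ∂F/∂T < 0, as one would expect under diminishing returns to life extension, the sign of dT*/dZ_0 equals the sign of ∂F/∂Z_0 evaluated with T held fixed. This appears to be the logic that allows the sign of the effect on length of life to be read off the fixed-T derivative alone.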


(1.) We assume that health and skills are orthogonal traits. There are of course exceptions to this separation, for example, dementia and other health conditions that impair skills, which we ignore here for simplicity.

(2.) We assume here that parents make the decision about when to start school but that individuals decide when to end it. We ignore issues regarding possible conflicts between parents and children, and model children as fully rational decision makers.

(3.) A useful interpretation of the fine f*(t) is as a probability of getting caught and having to pay a cost when the individual is not in school but should have been (S<S_). For this reason it is modeled not as a one-off cost but as a cost that operates for as long as the individual has not reached the minimum schooling age. Historically, fines were the way in which compulsory schooling laws were enforced.

(4.) In our formulation, we ignore possible sheepskin (or diploma) effects of completing a certain level of education.

(5.) Further, while in school, individuals may be provided with healthy nutritious foods, or simply be kept off the street, which may keep them out of trouble (incarceration effect; Weiss, 1995; Lochner & Moretti, 2004; Black, Devereux, & Salvanes, 2008; Lochner, 2011). This would suggest that skills and health could also be a function of τS(t) as well as τθ(t).

(6.) An important criticism of the Grossman model, made by Dalgaard and Strulik (2014), is that health decline is arguably smaller for people in better health. Our health production function is a flexible function of health and health investment, and encompasses various specifications, including the original Grossman model as well as Dalgaard and Strulik’s health deficit model.

(7.) The wage rate may also change right after dropping out of school if the minimum school-leaving age coincides with labor laws.

(8.) Of course, many people enjoy being in school, and for them this would represent a cost rather than a benefit of working.

(9.) Theory predicts skill investment drops after graduation as the opportunity cost of time investment is lower during schooling because individuals have to spend a fixed amount of time τS(t) in school. If there are complementarities between goods/services Xθ(t) and time inputs τθ(t) then investment goods will also be lower.

(10.) This is not true for wealth A(T) and health H(T) as individuals cannot choose their terminal levels optimally. Hence, the transversality conditions for the co-state equations are: qθ(T)=0, while qH(T)0 and qA(T)0.

(11.) The sign of this term depends on whether time in school provides utility or disutility, cS(t).

(12.) The very last term represents the possibility that wages depend explicitly on the exogenous minimum school-leaving age S_: labor-laws might impose a fine on those who employ individuals younger than S_ so that wages are lower below that age if the law is enforced. This would represent a discontinuity in the wage rate at S_. The effect would operate only on those for whom the optimal schooling age S and the minimum school-leaving age S_ coincide, that is, it affects those individuals who are potential compliers.

(13.) If the completion of schooling by the compliers (those at the margin) increases competition in the labor force, then economy-wide (general equilibrium) effects may reduce the returns to her schooling, resulting in a negative wealth effect.

(14.) But she is not unaffected. When the minimum school-leaving age is increased, and the individual does not increase her schooling duration, she now faces an even longer duration over which she pays a fine and/or cannot earn wages. This negatively affects her lifetime wealth, and hence her skill and health production (terms 1–4 on the RHS of 17). The model also allows for situations in which this group of non-compliers minimally increases or decreases their schooling duration, with ambiguous effects on lifetime wealth, skill, and health.

(15.) The third term on the RHS of (18) shows that an increase in the minimum school-leaving age may impact health at later ages, which, depending on whether unhealthy consumption and health are complements or substitutes in utility and in the rate of depreciation, may reinforce either the wealth or health-cost effect (see 18). This term is therefore difficult to sign.

(16.) For the never-takers who drop out of school before the minimum school-leaving age, the wealth effect is negative. For them this may lead to poorer health and lower life expectancy.

(17.) Although many studies report effects of education on BMI or overweight (BMI between 25 and 30) these results are difficult to interpret. Changes in BMI within normal ranges (18.5–25) are not strongly associated with poor health and mortality. Similarly, overweight individuals appear to have lower mortality than individuals in the normal BMI range (e.g., Flegal, Kit, Orpana, & Graubard, 2013), also making this an ambiguous measure of health. Obesity, on the other hand, is clearly associated with increased morbidity and mortality (Flegal et al., 2013).
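
As an illustration of the categories referred to in note 17, the following is a minimal Python sketch. The normal (18.5–25) and overweight (25–30) ranges are taken from the note; the obesity cutoff of 30, the "underweight" label, and the treatment of boundary values are assumptions based on the standard BMI classification, and the function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Map a BMI value to a category.

    Lower bounds are inclusive (an assumption): 18.5 <= normal < 25,
    25 <= overweight < 30, and obese from 30 upward.
    """
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"
```

For example, `bmi_category(bmi(70, 1.75))` falls in the normal range, so a study reporting a change in BMI within that range would not move an individual across any of these thresholds.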

(18.) Palme and Simeonova (2015) and Leuven, Plug, and Rønning (2016) focus on the effect of education on cancer risk and mortality.

(19.) The exact search command in Google Scholar we used was “instrumental variable” OR “regression discontinuity” OR “natural experiment” OR “exogenous variation,” AND “causal effect of education on mortality” for the quasi-experiment studies, where mortality was replaced by obesity or smoking for the other outcomes. This delivered 64 hits for mortality, 30 for obesity, and 24 for smoking. For the twin studies we used “twin difference” OR “twin fixed effects” OR “co-twin,” AND “causal effect of education on mortality,” where again mortality (5 hits) was replaced by obesity (1 hit) and smoking (1 hit) for the other outcomes. We manually went through all of these hits, and the references therein, and applied our selection criteria to arrive at the selection of papers reviewed here. We have done our utmost to identify all relevant papers but recognize that we may have missed a few in the process.
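
The search strings described in note 19 can be assembled programmatically. The Python sketch below only builds the query text from the terms quoted in the note; the function and variable names are hypothetical, and it does not contact Google Scholar or reproduce the reported hit counts.

```python
# Search terms quoted in note 19; helper names are illustrative only.
QUASI_EXPERIMENT_TERMS = [
    "instrumental variable",
    "regression discontinuity",
    "natural experiment",
    "exogenous variation",
]
TWIN_TERMS = ["twin difference", "twin fixed effects", "co-twin"]
OUTCOMES = ["mortality", "obesity", "smoking"]


def build_query(method_terms, outcome):
    """Join the quoted method terms with OR, then AND them with the outcome phrase."""
    methods = " OR ".join(f'"{term}"' for term in method_terms)
    return f'{methods} AND "causal effect of education on {outcome}"'


def all_queries():
    """Return one query string per (study design, outcome) pair."""
    designs = {"quasi-experiment": QUASI_EXPERIMENT_TERMS, "twin": TWIN_TERMS}
    return {
        (design, outcome): build_query(terms, outcome)
        for design, terms in designs.items()
        for outcome in OUTCOMES
    }
```

Calling `all_queries()` yields six strings, one for each combination of study design and outcome.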

(20.) Within twin pairs, some studies find that education substantially reduces overweight for men (Webbink, Martin, & Visscher, 2010) but not for women (Webbink et al., 2010; Amin, Behrman, & Spector, 2013). Lundborg (2013) finds large but statistically insignificant effects of education on BMI.

(21.) This has led to a substantial amount of criticism of quasi-experiments because LATE estimates are often uninformative about other settings and other policies (Heckman & Urzua, 2010; Deaton, 2010).

(22.) Mazumder (2008) uses similar data to Lleras-Muney (2005), and Black et al. (2015) argue that once fixed effects are added there is no variation left to estimate the additional impact of CSL. Therefore we do not report these findings in the table.

(23.) Though of course, as discussed above, the strength of the instrument is a function of the covariates that are included and in particular the number of aggregate controls for cohort, age, location, and trends.

(24.) Smoking is typically under-reported; reporting biases for height and weight are more complex and differ by gender. See for example Connor Gorber, Tremblay, Moher, and Gorber (2007).

(25.) Park and Kang (2008) also use a structural modeling approach, combined with exclusion restrictions. They obtain sizable point estimates, but their small sample size results in very imprecisely estimated coefficients.

(26.) In a recent working paper, Huang (2016) presents evidence that an extra year of education in China led to a 5% reduction in smoking prevalence, although this estimate is only marginally significant at 10%.

(27.) Of course, a high fraction of individuals at the margin could also be consistent with other explanations, such as high opportunity costs of schooling, or a large fraction of parents who do not appreciate the returns to schooling for their kids.

(28.) Nevertheless, wage returns to schooling are much larger once quality is accounted for (Hanushek & Zhang, 2009).

(29.) Cutler and Glaeser (2005) also point out that the correlation between healthy behaviors within individuals is small, so a single factor is unlikely to explain differences across individuals.

(30.) Indeed, Glied and Lleras-Muney (2008) and Meghir et al. (2017) document that CSLs increase mortality rates from reproductive cancers.

(31.) Fischer, Karlsson, Nilsson, and Schwarz (2016), for example, show that increases in the total time spent in school, as a result of increases in the term length, affect later-life income much more than do comparable increases in time spent in school due to raising the minimum school-leaving age.

(32.) Many recent authors have moved in this direction. For instance, Todd and Wolpin (2006) estimate a structural model using experimental variation from Progresa, an experiment in Mexico. Work in other fields is also increasingly using this combined approach (e.g., Heckman, Pinto, & Savelyev, 2013; Finkelstein, Hendren, & Luttmer, 2015). Another approach that tries to combine both is the work by Chetty (2009) on the use of sufficient statistics.

(33.) Note that we distinguish in notation between ∂f(t)/∂t|t=T, which represents the derivative with respect to time t at time t=T, and ∂f(t)/∂T|t=T, which represents variation with respect to the parameter T at time t=T.