A Theory of Human Capital Investment

Panhaboth Kun, economic theory

Preface (written 17 Dec 2023)

I conceived of this model during my earlier years of study at the LSE. In its infancy, this essay was no more than a list of questions waiting to be answered. How should we distribute funds for education? How do we measure differences in student productivity? Over the years, with marginally greater mathematical know-how, I would return to these questions, making many refinements and correcting many mistakes along the way. Still, the results were a little disappointing to me personally: they left more questions than answers, and even then, the answers they provided rested on assumptions too hastily made. The dream was to explore those questions in consideration of a Mincerian earnings function, and the motivation behind this was simple. A framework for distributing free education that insists on maximising GDP is sure to attract a nod from the school of thought that has long prevailed in this industry. This dream did not work out once the effect of schooling on future earnings proved too complicated to measure, and so I have settled for a more modest proposal. What I did was take a smaller leap from cause to effect.

The most profound thing I've learned from writing this essay is the duality of theory and its power. A great theory, symbols tied together by the soundest of logic, is defeated the moment one cannot measure the quantities that these symbols represent. The remedy to this defeat, as all economic theory employs, is to pretend. In writing this essay, I was hardly prepared to pretend, but life goes on and pretenses have to be made, in spite of our determination to do otherwise. Diminishing marginal returns to investment in human capital I could measure only at students' earliest years in formal education. I pretend that in students this does not change over time, so that a scholarship policy could be devised off of this measure, even for a student that is about to enter into tertiary education. In this essay there is a whole list of other such pretenses, and I leave the reader to pick them out. Owing to my instructor's comments, new insights I have developed, and new ways of doing things I have discovered, I intend to return to this essay some day, tackling the disappointments I originally had. Many refinements are in the plan.

I write this essay in dedication to my country, Cambodia, to the hidden talents in its distant villages, to all the untapped human capital.

Abstract

Given a fixed education investment fund, its optimal distribution is an issue of concern to a multitude of agents, from the government, to scholarship-endowing organisations, and to the household. In this essay I model the firm as a producer of human capital (one may think of this firm as any of the aforementioned agents) and show that efficiency conditions systematically require investment to be smoothed out among individuals or groups of individuals in the economy. Then, I develop a richer model by showing how differing individual or group characteristics skew this result. Namely, more funding must go to students who are more efficient at transforming it into educational outcomes. This richer model paves the way towards a general solution for distributing education investment funds. However, measuring the characteristics that truly cause improvements in student efficiency is rife with selection biases, an issue for which this essay proposes a candidate solution.

Introduction

This essay aims to devise an effective way of distributing scholarship funds so that the sum total scholastic output among a pool of students is maximised. To achieve this goal, it is crucial to understand the effects of school input on academic achievement. However, it is important to note that scholastic output is not solely determined by school input; as will be seen, it is potentially also influenced by peers and the individual students’ circumstances at home. Therefore, to determine an unbiased estimate of the effects of school on achievement, all of these factors must be taken into consideration.

The first part of this essay deals with the state of current studies. In particular, we look into how a class of functions called ‘education production functions’ plays a role in determining how we are to design scholarship distribution mechanisms.

The second part of this essay develops a theoretical relationship between school quality and students’ academic performance. We use school fees as a proxy for school quality, and in so doing presume that there is a one-to-one relationship between fees and quality. The relationship between school quality and the academic performance it induces from students is captured in what I call the ‘investment transformation function’. By picking a particular functional form, we also account for diminishing marginal returns to investment. Then, by looking at a simple two-student case, I demonstrate the derivation of the solution to the optimisation problem.

In the third part of this essay, I propose a method to isolate the marginal effects of school input, considered in the scheme of the education production function. Since scholastic output or academic achievement is potentially determined also by peer and familial influence, if students from advantageous family backgrounds tend to be sent to schools that command higher fees, then a selection bias arises if advantageous family backgrounds also affect scholastic output. I remedy this issue by using achievement profiles from before school enrolment to control for familial influence.

Finally, this essay touches on some policy implications. While the optimisation problem gives way to policy design, this extension allows us to analyse existing policies. Because, as will be seen, optimal policy depends on relative productivity in the pool of students, we analyse a policy by looking at the relative productivity that it presupposes under the assumption that it is efficient. The policy is therefore inefficient to the extent that the relative productivity it presupposes deviates from the true relative productivity.

Literature Review

Current studies focus on educational output as determined by an ‘education production function’, agreeing that it takes on the following form: $y_i = h_i(S, P, F)$.1 Hanushek points out that peer influence ($P$) and familial influence ($F$) are deemed to affect educational output.2 While family background and peer inputs are typically characterised by socio-demographic characteristics, school inputs ($S$), a point of interest in this essay, include the quality of a school’s facilities and teacher background. In a cross section of schools, these inputs vary too little to generate variations meaningful enough to be used in regression analyses.3 In this sense education production functions differ from their traditional counterparts. While counts of capital and labour are permissible instruments in accounting for differences in economic output, one must rely on measuring school quality to account for differences in scholastic output. This essay will thus rely on the assumption that expenditure on education infrastructure is positively correlated with its inherent quality.

While the study pioneered by Mincer delved into the relation between the quantity of education received and life outcomes,4 the choice to use scholastic output as the outcome variable is not entirely frivolous. Indeed, it would be if the aim of the study were to devise an education infrastructure around test scores in such a way as to maximise lifetime earnings. However, to the extent that scholastic output at any point in time determines that at some later date, the education infrastructure a student is under today could affect their academic performance in the future. This is especially useful in devising scholarship programmes based on past academic performance.

Models

We begin by defining the problem faced by a producer of human capital. In doing so, we enrich our model with objectives and constraints defined over two distinct classes of objects: given parameters and policy parameters. The policy parameters are our solution of interest, and the optimal values they take on depend on the given parameters. Before we begin the task, it is imperative I emphasise that our problem exists within the fabric of many interconnected problems. As such, while given parameters are taken to be constant in the definition of this problem, they may have been policy parameters in and of themselves in some other problem.5

The idea of the problem is as follows: given an investment fund for education $E$, how should it be distributed among a pool of students in order to maximise the total returns in terms of educational output? For both geometric and algebraic simplicity, I restrict the pool to only two students, but the insights that will be developed come without loss of generality. Let us formalise the problem. Consider the idea that students transform levels of investment in their education into future scholastic achievement. Thus, each student is equipped with what I will call an ‘investment transformation function’, whose argument is the level of funds invested in their education today and whose value is future scholastic output. I will denote this transformation function as $f_i(e_i)$ for student $i$. I now give shape to this important function.

First, it is safe to assume that $f_i$ increases in $e_i$ for all $i$.

Assumption 1 (Strict Positive Monotonicity): The returns to investment increase when more funding is available for any student. That is, $f_i'(e_i) > 0$ for all $i$.

Second, we assume diminishing marginal returns to investment at the individual level. For all individuals, the next dollar invested yields less of a return than the previous dollar.

Assumption 2 (Diminishing Marginal Returns to Investment): $f_i''(e_i) < 0$ for all $i$.

Finally, we give $f_i$ a position in the Cartesian plane. For now, it suffices to impose an assumption that we later relax:

Assumption 3 (Equal Footing): The educational outcome of zero investment is nil for all students (in short, $f_i(0) = 0$ for all $i$).

One functional form that satisfies the above assumptions, which in this essay we work with, is:

$$f_i(e_i) = e_i^\alpha \quad \text{for any } 0 < \alpha < 1.$$

I emphasise here that $\alpha$ is a given parameter in this problem and is not a solution to any other problem that the producer of human capital faces. This parameter defines the sharpness of $f_i$’s curvature, which in turn reflects the degree of diminishing marginal returns to investment in education. Our position shall be that such a thing is no object of choice to any being at any point in time. Below is an example of an investment transformation function, plotted with $\alpha = 0.45$.

Plot of an investment transformation function

We will assume that $\alpha$ is the same for everyone. Diminishing marginal returns to educational investment are modelled like gravity: set in stone.
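The three assumptions can be checked numerically for this functional form. The following is a minimal sketch rather than part of the model itself: the function names, the investment levels and the finite-difference step are assumptions made for illustration, with the curvature set to the plotted value of 0.45.

```python
def transform(e, alpha=0.45):
    """Investment transformation function f(e) = e**alpha, for e >= 0."""
    return e ** alpha


def marginal_return(e, alpha=0.45, h=1e-6):
    """Central-difference approximation of f'(e), valid for e > h."""
    return (transform(e + h, alpha) - transform(e - h, alpha)) / (2 * h)


levels = [100, 400, 1600]  # hypothetical investment levels
outputs = [transform(e) for e in levels]
slopes = [marginal_return(e) for e in levels]

# Assumption 3: zero investment yields zero output.
print(transform(0))
# Assumption 1: output rises with investment.
print(outputs)
# Assumption 2: each extra unit of funding adds less than the one before.
print(slopes)
```

The printed outputs rise along the list while the printed slopes fall, which is the shape that Assumptions 1 and 2 describe.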

The government’s objective is to maximise the sum total of future scholastic output by choosing the right levels of $e_i$, subject to the constraint that the sum total of investments does not exceed the available fund. With the objective function denoted by $F$, the government solves the following constrained optimisation problem:

$$\begin{aligned} \max_{e_1, e_2} \quad & F(e_1, e_2) = e_1^\alpha + e_2^\alpha\\ \text{s.t.} \quad & e_1 + e_2 \leq E \end{aligned}$$

This type of production function is additively separable. It is expressible as a sum of single-variable functions of each of its inputs. This kind of function is not commonly used in production theory because it implies that not all of its inputs are required in the production process. In the context of a typical coffee shop, this would make for a poor model of production if its inputs are capital and labour. Without the necessary tools, there can be no coffee. But neither too can there be any if there are no baristas.6 This functional form is completely fine in our model, as splurging the investment fund on one person will not necessarily lead to zero educational output in the future.

For the constraint, prices do not come into it because $e_1$ and $e_2$ are already expressed in nominal terms: the cost of £100 of investment is simply £100. Furthermore, it is timely to note that the $E$ parameter is a given in this problem, but of course is not set in stone. What proportion of profit or revenue should be allocated to educational investment was once itself a policy parameter, or a solution to some scholarship-endowing firm’s optimisation problem. We will deduce three properties of $F$ and two properties of the constraint set that will be of great help to us in visualising the problem.

Proposition 1. $F$ is increasing in $e_i$ for all $i$.

Proof. $F$ increases in $e_i$ for all $i$ if its partial derivatives are all greater than zero.

$$\begin{aligned} \frac{\partial F(e_1, e_2)}{\partial e_1} & = \alpha e_1^{\alpha - 1}\\ & = \frac{\alpha}{e_1^{1 - \alpha}} > 0,\\ \frac{\partial F(e_1, e_2)}{\partial e_2} & = \alpha e_2^{\alpha - 1}\\ & = \frac{\alpha}{e_2^{1 - \alpha}} > 0 \end{aligned}$$

since $0 < \alpha < 1$ and $e_1, e_2 > 0$.

Proposition 2. $F$-level sets are convex towards the origin.

Proof. The proposition holds iff the slope of every $F$-level set is negative and increasing.

We know that the gradient vector in the $e_1e_2$-space points in the direction of $F$’s steepest increase. Therefore, a direction vector orthogonal to the gradient must be tangent to the level set. The gradient of $F$, $\nabla F$, is

$$\nabla F \coloneqq \begin{pmatrix} \frac{\partial F(e_1, e_2)}{\partial e_1} \\ \frac{\partial F(e_1, e_2)}{\partial e_2} \end{pmatrix}.$$

Then, if $\vec{v} \cdot \nabla F = 0$ and $\vec{v} = \begin{pmatrix} 1 \\ \lambda \end{pmatrix}$, then $\lambda$ is the instantaneous rate of change of $e_2$ with respect to $e_1$ along the level set. From the definition of the gradient, the orthogonality condition, and the scaling of the direction vector orthogonal to the gradient, we compute $\lambda$:

$$\begin{aligned} \frac{\partial F(e_1, e_2)}{\partial e_1} + \lambda \frac{\partial F(e_1, e_2)}{\partial e_2} & = 0\\ \lambda(e_1, e_2) & = - \frac{\frac{\partial F(e_1, e_2)}{\partial e_1}}{\frac{\partial F(e_1, e_2)}{\partial e_2}} \end{aligned}$$

So, $\lambda(e_1, e_2) < 0$ for all $(e_1, e_2)$ since both the numerator and denominator are greater than zero, as proven in Proposition 1.

Now we need to differentiate $\lambda$ with respect to $e_1$ and see to it that this is indeed greater than zero, so that convexity towards the origin follows.

$$\frac{\partial \lambda(e_1, e_2)}{\partial e_1} = - \frac{\frac{\partial^2 F(e_1, e_2)}{\partial e_1^2} \frac{\partial F(e_1, e_2)}{\partial e_2} - \frac{\partial^2 F(e_1, e_2)}{\partial e_1 \partial e_2} \frac{\partial F(e_1, e_2)}{\partial e_1}}{\left(\frac{\partial F(e_1, e_2)}{\partial e_2}\right)^2}.$$

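Because $F$ is additively separable, its cross derivatives must be zero. Moreover, because $0 < \alpha < 1$, the second partial derivative of $F$ with respect to $e_1$ must be less than zero. Finally, by Proposition 1, it follows that:

$$\frac{\partial \lambda(e_1, e_2)}{\partial e_1} > 0.$$

Moving along a level set also changes $e_2$, but this only reinforces the result: by the same reasoning $\partial \lambda / \partial e_2 < 0$, and since $e_2$ falls at rate $\lambda < 0$ as $e_1$ rises, the total change in slope, $\partial \lambda / \partial e_1 + \lambda \, \partial \lambda / \partial e_2$, is positive, as is required for all $F$-level sets to be convex towards the origin.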
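For the functional form used in this essay, the quantities in the proof can be written out explicitly. The following is a worked instance for $F(e_1, e_2) = e_1^\alpha + e_2^\alpha$, offered as an illustration rather than as part of the proof:

$$\begin{aligned} \lambda(e_1, e_2) & = -\frac{\alpha e_1^{\alpha - 1}}{\alpha e_2^{\alpha - 1}} = -\left(\frac{e_2}{e_1}\right)^{1 - \alpha} < 0, \\ \frac{\partial \lambda(e_1, e_2)}{\partial e_1} & = (1 - \alpha)\, e_2^{1 - \alpha}\, e_1^{\alpha - 2} > 0 \end{aligned}$$

for all $e_1, e_2 > 0$ and $0 < \alpha < 1$.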

Proposition 3. $F$-level sets and the boundary of the constraint set are symmetric about $e_1 = e_2$.

Proof. For this it suffices to show that for any $c = F(a, b)$, we also have $c = F(b, a)$, particularly where $a \neq b$.

$$\begin{aligned} F(a, b) & = a^\alpha + b^\alpha \\ F(b, a) & = b^\alpha + a^\alpha. \end{aligned}$$

Clearly, $c = F(a, b) = F(b, a)$ since $F(b, a)$ is just a re-arrangement of $F(a, b)$. Therefore, symmetry about $e_1 = e_2$ holds for all $F$-level sets.7

Proposition 4. The constraint set is a convex set.

Proof. A set is convex iff all convex combinations of any two elements in the set are also elements of the set.

Let $(a, b)$ and $(c, d)$ satisfy $E \geq e_1 + e_2$ for a given scalar $E$. Then, we construct a convex combination of $(a, b)$ and $(c, d)$:

$$\begin{pmatrix} \beta a\\ \beta b \end{pmatrix} + \begin{pmatrix} (1 - \beta)c\\ (1 - \beta)d \end{pmatrix} \equiv \begin{pmatrix} \beta(a - c) + c\\ \beta(b - d) + d \end{pmatrix}.$$

We need to show that the sum of the components still satisfies the constraint (i.e., the sum does not exceed $E$).

The sum of the components is:

$$\beta(a + b - (c + d)) + c + d.$$

By definition, $\beta$ is a real number in the interval $[0, 1]$. If $\beta = 1$ or $\beta = 0$, then the convex combination reduces to $(a, b)$ or $(c, d)$, respectively, which both satisfy the constraint by assumption. If instead $0 < \beta < 1$, then both $\beta$ and $1 - \beta$ are less than $1$, so that four and only four cases could ensue:

For all collectively exhaustive and mutually exclusive cases regarding the position of the two vectors of interest, any convex combination of them satisfies the constraint. Therefore, the constraint set is a convex set.
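The same conclusion can also be reached without enumerating cases. As a compact sketch, using only the assumptions $a + b \leq E$, $c + d \leq E$ and $0 \leq \beta \leq 1$:

$$\beta(a + b) + (1 - \beta)(c + d) \leq \beta E + (1 - \beta) E = E.$$

The left-hand side is the sum of the components of the convex combination, rearranged, so the combination lies in the constraint set.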

Implications of the Properties of $F$ and the Constraint Set

Propositions 1, 2 and 4 guarantee a unique solution at the boundary of the constraint set. In other words, if those three conditions hold, then the constraint binds at the maximum value of $F$. This, as we will soon see when we relax some of the assumptions, greatly simplifies the problem by allowing us to equivalently express it as an unconstrained optimisation problem. Furthermore, Proposition 3 implies that our solution is characterised by the following system of equations:

$$\begin{cases} e_1 = e_2 \\ e_1 + e_2 = E. \end{cases}$$

Since $E$ is an exogenous parameter, the system is one of two equations in two unknowns, yielding a unique solution if one exists. Substituting the first equation into the second, we get $2e_2 = E$ so that $e_2 = E/2$. Substituting this into the first equation, we get $e_1 = E/2$. From this it follows: the optimal distribution of educational investment is equal investment for all.

The solution is characterised as such entirely because of the assumptions we imposed on each of the relevant classes of objects in our model. Some of these assumptions I stated explicitly. For example, the smoothing of our investment is attributable to our assumption of diminishing marginal returns to investment at the individual level. If $e_1 > e_2$, then the marginal return to investment for individual 1 is always less than that for individual 2, such that forgoing an infinitesimally small amount of investment in individual 1 in favour of individual 2 guarantees a net increase in total future educational output. Vice versa, the same line of reasoning holds for $e_2 > e_1$. Therefore, allocative efficiency requires the investment levels in these individuals to tend towards each other. The exhaustion of the entire investment fund is attributable to our assumption of increasing investment transformation functions: if every dollar increase in investment now leads to some gain in future educational output, then it would not make sense to leave some of the dollars lying around uninvested.
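The equal-split result and the reallocation argument can both be checked numerically. The sketch below is a minimal illustration and not part of the derivation: the fund size of 1000, the curvature of 0.45, the grid resolution and the function names are all assumptions chosen for the example.

```python
def total_output(e1, e2, alpha):
    """Objective F(e1, e2) = e1**alpha + e2**alpha for two identical students."""
    return e1 ** alpha + e2 ** alpha


def best_split_on_grid(E, alpha, steps=1000):
    """Search allocations that exhaust the fund; return the best e1 on an even grid."""
    candidates = [E * k / steps for k in range(steps + 1)]
    return max(candidates, key=lambda e1: total_output(e1, E - e1, alpha))


E, alpha = 1000.0, 0.45  # hypothetical fund size and curvature
e1_star = best_split_on_grid(E, alpha)
print(e1_star, E - e1_star)

# Moving one unit from the better-funded student to the other raises total output.
unequal = total_output(700.0, 300.0, alpha)
nudged = total_output(699.0, 301.0, alpha)
print(nudged > unequal)
```

A grid search is used here only because it makes no use of the first-order conditions, so it serves as an independent check on them.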

The implicit assumption in this model is that all students in the economy are the same, since we have characterised each by exactly the same investment transformation function. Let us rectify this problem by explicitly provisioning further assumptions to our model. Namely, we scale each individual's investment transformation function by a productivity factor $\tau_i$ for student $i$.

A Richer Model

In order to admit potential differences between students in how successfully each can transform investment into future output, we develop a more general model that captures the previous one in its entirety. Such a model is needed because, as is often the case, one could be sent to a private school and still perform poorly in the future for a variety of reasons. We model this by including scale parameters in the students' investment transformation functions, taking these as given. In particular:

$$f_i(e_i) = \tau_i e_i^\alpha \quad \text{for any } 0 < \alpha < 1,\ \tau_i > 0.$$

We assume here that there is no possibility of 'investment debt', so that positive investment cannot worsen any student's educational output. From this it follows that the total contribution to future output for a given investment vector $(e_1, e_2)$ is:

$$F(e_1, e_2) = \tau_1 e_1^\alpha + \tau_2 e_2^\alpha. \tag{5}$$

The constraint set has not changed, so it is, as before, a convex set and the boundary has a constant slope of $-1$. Given (5), $F$ is still increasing in $e_i$ and its level sets are still convex towards the origin. However, its level sets are no longer symmetric about $e_1 = e_2$ in general. Let us, as an aside, determine the condition for such symmetry.

$F(e_1, e_2) = \tau_1 e_1^\alpha + \tau_2 e_2^\alpha$ is symmetric about $e_1 = e_2$ iff $F(a, b) = F(b, a)$ for all investment vectors $(a, b)$, where

$$\begin{aligned} F(a, b) & = \tau_1 a^\alpha + \tau_2 b^\alpha, \\ F(b, a) & = \tau_2 a^\alpha + \tau_1 b^\alpha. \end{aligned}$$

Clearly, $F(a, b) = F(b, a)$ for all $(a, b)$ iff $\tau_1 = \tau_2$. Furthermore, we know that, given the constraint, the optimisation problem will yield policy parameters restricted to $e_1 = e_2$ if the level sets are symmetric about $e_1 = e_2$. Therefore, in this model, the optimal policy parameters are $e_1 = e_2 = E/2$ if and only if $\tau_1 = \tau_2$. This insight serves to show that in the grand scheme of things, only relative productivity between the students will matter. We capture this in a theorem.

Theorem 1. Optimal human capital investment is invariant to identical scalar transforms of the individuals’ investment transformation function.

Having equipped ourselves with economic intuition and geometric insight, I turn our attention to their algebraic counterpart. First, recall that Propositions 1, 2 and 4 guarantee that a unique solution exists at the boundary. Because the constraint binds, the problem reduces to finding a stationary point of the Lagrangian, which we set up and solve as follows:

$$\begin{aligned} \arg\max_{e_1, e_2} & \hspace{0.75em} \tau_1 e_1^\alpha + \tau_2 e_2^\alpha \\ \text{s.t.} & \hspace{0.75em} e_1 + e_2 \leq E \end{aligned}$$

$$\mathcal{L}(e_1, e_2, \lambda) = \tau_1 e_1^\alpha + \tau_2 e_2^\alpha + \lambda[E - e_1 - e_2]$$

$$\begin{aligned} \frac{\partial \mathcal{L}(e_1, e_2, \lambda)}{\partial e_1} & = \alpha\tau_1 e_1^{\alpha - 1} - \lambda = 0 \\ \frac{\partial \mathcal{L}(e_1, e_2, \lambda)}{\partial e_2} & = \alpha\tau_2 e_2^{\alpha - 1} - \lambda = 0 \\ \frac{\partial \mathcal{L}(e_1, e_2, \lambda)}{\partial \lambda} & = E - e_1 - e_2 = 0 \end{aligned}$$

$$\vdots$$

$$\begin{aligned} e_1^*(E, \vec{\tau}, \alpha) & = \frac{E}{\left(\frac{\tau_1}{\tau_2}\right)^{\frac{1}{\alpha - 1}} + 1} \\ e_2^*(E, \vec{\tau}, \alpha) & = \frac{E}{\left(\frac{\tau_1}{\tau_2}\right)^{-\frac{1}{\alpha - 1}} + 1} \end{aligned} \tag{6}$$

Notice that the optimal investment policy reduces to $e_1 = e_2 = E/2$ if $\tau_1 = \tau_2$, as is consistent with Theorem 1. To see this, simply work with $\tau_1 = \tau_2$, which implies $\tau_1/\tau_2 = 1$. Substituting $\tau_1/\tau_2 = 1$ into our solution functions we see that $e_1 = e_2 = E/(1 + 1) = E/2$, because $1$ raised to any real power is $1$.

Furthermore, how the optimal policy responds to changes in the $\vec{\tau}$-parameters is captured entirely in how it responds to changes in the ratio between $\tau_1$ and $\tau_2$. Indeed, our solution functions can be expressed as taking in the scalar-valued ratio $\tau_1/\tau_2$ instead of the vector $\vec{\tau}$ as one of their parameters. Every instance of $\tau_1$ and $\tau_2$ on the right-hand side of both solution functions appears in ratio form. In that sense, it is not the absolute values of the $\vec{\tau}$-parameters that matter in the determination of optimal policy, but rather how $\tau_1$ and $\tau_2$ geometrically fare with each other. In other words, how investment funds for education shall be distributed depends on the relative productivity of the students in the pool we have at hand.
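The solution functions and the role of the productivity ratio can be illustrated with a short computation. This is a sketch under stated assumptions: the function names, the fund size of 1000, the curvature of 0.45 and the productivity values are hypothetical and chosen only for the example.

```python
def optimal_split(E, tau1, tau2, alpha):
    """Closed-form optimum (e1*, e2*) of tau1*e1**alpha + tau2*e2**alpha s.t. e1 + e2 = E."""
    r = (tau1 / tau2) ** (1.0 / (alpha - 1.0))
    e1 = E / (r + 1.0)
    e2 = E / (1.0 / r + 1.0)
    return e1, e2


def marginal_product(tau, e, alpha):
    """Derivative of tau * e**alpha with respect to e."""
    return alpha * tau * e ** (alpha - 1.0)


E, alpha = 1000.0, 0.45  # hypothetical values
e1, e2 = optimal_split(E, tau1=1.2, tau2=1.0, alpha=alpha)
print(e1, e2)

# First-order conditions: both marginal products equal the multiplier lambda.
print(marginal_product(1.2, e1, alpha), marginal_product(1.0, e2, alpha))

# Scaling both productivity parameters by the same factor leaves the split unchanged.
print(optimal_split(E, 12.0, 10.0, alpha))
```

The more productive student receives the larger share, the two marginal products coincide at the optimum, and multiplying both productivity parameters by ten returns the same allocation.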

Yet, part of this essay's literature review serves to show that educational output might not be entirely determined by school inputs. One may ask: in the absence of a perfectly controlled trial where an individual student is tested against many different investment levels in order to derive unbiased estimates of their investment transformation function, how does one go about designing experiments for this purpose? Before we answer this question, it is imperative I remark on the absence of time-related concepts in the relation $y_i = h_i(S, P, F)$. How does this relation hold for each student at a particular point in time? To this the original model has no answer. Perhaps peer influence weighs heavier as time passes. Perhaps, if a student, prior to their enrolment in the first year of school, is challenged with a scholastic test, the score they receive would not be a function of $S$ or $P$ at all.

Empirical Methods

Let us fine-grain the model by adding to it a concept of time in as general a manner as possible. Thus, for any student $i$,

$$y_{it} = h_{it}(S, P, F)$$

where $S$ is the quality of school inputs, $P$ is peer influence, $F$ is familial influence from home, and $t$ is a time index with $t = 0$ being the time prior to enrolment in formal education. $t$ shall increase by $1$ in every subsequent year.

Since at time $t = 0$ students are yet to enrol in any school and be subjected to peer influence, $y_{i0}$ is reducible to

$$y_{i0} = h_{i0}(F).$$

Therefore, at least at the very beginning of formal education, we could isolate school and familial effects on scholastic outcome if we have a scholastic outcome profile for each student prior to enrolment. If school fees are a perfect reflection of school input, then the average treatment effect in regressions for students grouped by their $y_{i0}$ gives us an unbiased estimate of the effects of school input on scholastic output. In effect, assuming $y_{i0} \in \{0, 1, 2, \ldots, 100\}$, the unbiased estimator $\hat{\alpha}$ of $\alpha$ in

$$f_i(e_i) = \tau_i e_i^\alpha,$$

at least pertaining to the first year in formal education, is the sample-size weighted average of the following regressions:

$$\begin{aligned} \log{y_{i1 | y_{i0} = 0}} & = A_0 + \alpha_0 \log{e_i} + \varepsilon_0 \\ \log{y_{i1 | y_{i0} = 1}} & = A_1 + \alpha_1 \log{e_i} + \varepsilon_1 \\ & \;\;\vdots \\ \log{y_{i1 | y_{i0} = 100}} & = A_{100} + \alpha_{100} \log{e_i} + \varepsilon_{100}. \end{aligned}$$
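Taking logarithms of $y = \tau e^\alpha$ gives $\log y = \log \tau + \alpha \log e$, so within a group the slope of a regression of log output on log investment estimates $\alpha$. The sketch below illustrates the grouping and the per-group regression. The record layout, the function names and the data are assumptions made for illustration: the data are synthetic and noiseless, generated with a curvature of 0.4.

```python
import math
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def alpha_by_group(records):
    """Group (y0, e, y1) records by pre-enrolment score y0 and regress log y1 on log e.

    Returns a dict mapping y0 to (slope estimate, group size).
    """
    groups = defaultdict(list)
    for y0, e, y1 in records:
        groups[y0].append((math.log(e), math.log(y1)))
    return {y0: (ols_slope(*zip(*pairs)), len(pairs)) for y0, pairs in groups.items()}


# Hypothetical noiseless data from y1 = tau * e**0.4, with tau differing by group.
records = [(45, e, 2.0 * e ** 0.4) for e in (1000, 1200, 1500)]
records += [(60, e, 3.0 * e ** 0.4) for e in (900, 1100, 1300, 1600)]

estimates = alpha_by_group(records)
for y0, (slope, size) in sorted(estimates.items()):
    print(y0, size, slope)
```

Because the synthetic data contain no noise, each group's slope recovers the curvature used to generate it; with real data the group sizes returned here would serve as the weights in the average.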

Numerical Example

Suppose we have data for students that have just completed their first year of education as in Table 1. For students that scored $45$ on their pre-enrolment exam, we perform OLS on the log-transformed $(e_i, y_{i1})$ pairs of the first three columns. In effect, the relevant dataset amounts to $\{(7, 4.01), (7.09, 4.17), (7.13, 4.19)\}$. Therefore:

$$\hat{\alpha}_{45} = 0.00148.$$

Transforming the rest of the rows and performing OLS to compute $\hat{\alpha}_{60}$ and $\hat{\alpha}_{68}$ we get:

$$\begin{aligned} \hat{\alpha}_{60} & = 0.342 \\ \hat{\alpha}_{68} & = 0.499 \end{aligned}$$

so that the sample-size weighted average, $\hat{\alpha}$, is:

$$\hat{\alpha} = 0.281.$$

Table 1. Example data.
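The weighting step can be sketched as follows. The group sizes are not stated in the text; three students per group is an assumption made here, suggested by the three pairs listed for the group scoring 45, and under it the weighted average is a simple mean. The function name is likewise illustrative.

```python
def weighted_average(estimates, sizes):
    """Average the per-group estimates, weighting each by its group's sample size."""
    total = sum(sizes[group] for group in estimates)
    return sum(estimates[group] * sizes[group] for group in estimates) / total


# Per-group estimates reported in the text, keyed by pre-enrolment score.
alpha_hats = {45: 0.00148, 60: 0.342, 68: 0.499}
# Assumption: three students in each group.
group_sizes = {45: 3, 60: 3, 68: 3}

alpha_hat = weighted_average(alpha_hats, group_sizes)
print(round(alpha_hat, 3))  # 0.281
```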

Implications on Public Policy

Returning to the two-student case where the optimal investment distribution is shown to be given by (6), the empirical methods in the previous section give us a complete education investment model, since optimal investment is wholly determined by the exogenous parameters $E$ and $\tau_1/\tau_2$ and the estimated parameter $\alpha$. Thus I turn our attention to the analysis of policy.

Often in practice, we see divergences between investment funds allocated to individuals or, on a larger scale, groups of individuals. These differences are attributable to a number of phenomena, whether they stem from traditional expectations of society or from poorly contrived economic policy. So, a question naturally arises: are these differences, never mind now the position of equity, even efficient? We can sketch an answer to this question with the results we have just now examined. In particular, take the difference between $e_1^*$ and $e_2^*$ as seen in (6). This difference, by construction, is the efficient difference in investment levels between two individuals, given their productivity ratio and the degree of diminishing marginal returns to investment:

$$e_1^*(E, \vec{\tau}, \alpha) - e_2^*(E, \vec{\tau}, \alpha) = \frac{E}{\left(\frac{\tau_1}{\tau_2}\right)^{\frac{1}{\alpha - 1}} + 1} - \frac{E}{\left(\frac{\tau_1}{\tau_2}\right)^{-\frac{1}{\alpha - 1}} + 1}.$$

Take note that if we divide both sides by $E$ and take the absolute value of the result, then we have compacted our object further by removing $E$ from the right-hand side. This formula is key: it is the efficient absolute difference in investment levels between the two individuals, expressed as a percentage of the total investment fund, given their productivity ratio. Rather than it being a mouthful, we will for conciseness refer to it as the 'summary of an efficient policy'. Let us perform the necessary algebraic manipulations so that we arrive at this key formula:

$$\left| \frac{e_1^*(\cdot) - e_2^*(\cdot)}{E} \right| = \left| \frac{\left(\frac{\tau_1}{\tau_2}\right)^{-\frac{1}{\alpha - 1}} - \left(\frac{\tau_1}{\tau_2}\right)^{\frac{1}{\alpha - 1}}}{2 + \left(\frac{\tau_1}{\tau_2}\right)^{-\frac{1}{\alpha - 1}} + \left(\frac{\tau_1}{\tau_2}\right)^{\frac{1}{\alpha - 1}}} \right| = \Gamma\left(\frac{\tau_1}{\tau_2}, \alpha\right).$$

This formula, as seen above, we view as a function denoted by $\Gamma$.
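To get a feel for the magnitudes involved, $\Gamma$ can be evaluated directly. The following is a minimal sketch: the function name `gamma` and the productivity ratios are hypothetical, and the curvature is set to the estimate from the numerical example.

```python
def gamma(ratio, alpha):
    """Efficient absolute investment gap as a share of the fund, given tau1/tau2 and alpha."""
    r = ratio ** (1.0 / (alpha - 1.0))
    return abs((1.0 / r - r) / (2.0 + 1.0 / r + r))


alpha = 0.281  # the estimate from the numerical example
for ratio in (1.0, 1.1, 1.5, 2.0):  # hypothetical productivity ratios
    print(ratio, round(gamma(ratio, alpha), 3))

# The gap depends only on how far apart the students are, not on who is labelled 1.
print(abs(gamma(2.0, alpha) - gamma(0.5, alpha)) < 1e-12)
```

The gap is zero when the ratio is one and grows as the ratio moves away from one in either direction, so a ratio and its reciprocal give the same value.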

Analysing Education Policies

The main point of constructing the Γ\Gamma function lies beyond the question of design. Indeed, if we had only wanted to derive the optimal policy parameters, we could have simply plugged in the exogenous and estimated parameters into our solution functions (6) where we solved for an interior maximum. Still, expressing our solution function in some other form (in this case the optimal percentage difference in investment levels) that one-to-one corresponds with the actual policy parameters provides us with greater depth of insight while making conversations about it more compact. What I want to finally turn our attention to is analysis as opposed to design.

Imagine we encounter a policy that discriminates education investment based on gender, so that the observed deviation of investment between the genders expressed as a percentage of the total fund is dd. Here, our model as it stands does not need further generalisation: we can read our two-individual interpretation as referring to two groups of individuals. How does dd fare with the optimum? While dd is an observed value, the optimum we have only just now derived, so that dd is the summary of an efficient policy if and only if

$$d = \Gamma\left(\frac{\tau_1}{\tau_2}, \alpha\right).$$

With the observed value $d$ and the estimated parameter $\alpha$, we can solve for the productivity ratio that must hold if the policy in place is indeed efficient. We denote this productivity ratio by $\left(\frac{\tau_1}{\tau_2}\right)^*$. We relate equation (7) to the productivity ratio more explicitly: a distribution of the education investment fund is efficient only if the productivity ratio between the groups that arises solely from the criteria for selection into the groups, $\frac{\tau_1}{\tau_2}$, satisfies (7). With this in mind, we can construct estimates of the true value of $\frac{\tau_1}{\tau_2}$. We denote these estimates by $\widehat{\frac{\tau_1}{\tau_2}}$. From this it follows:

Theorem 2. $d$ is the summary of an efficient policy if and only if $\widehat{\frac{\tau_1}{\tau_2}} = \left(\frac{\tau_1}{\tau_2}\right)^*$.
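The check described by Theorem 2 can be sketched in Python under some assumptions of my own. Writing $k = \left(\frac{\tau_1}{\tau_2}\right)^{-\frac{1}{\alpha - 1}}$, the quotient defining $\Gamma$ simplifies to $\left|\frac{k - 1}{k + 1}\right|$ (divide numerator and denominator by $1 + \frac{1}{k}$), which can be inverted in closed form. The sketch assumes $0 < \alpha < 1$ and $0 \le d < 1$. Because $\Gamma$ is unchanged when the two groups are swapped, a given $d$ is consistent with both a ratio and its reciprocal; the function below returns the one that is at least $1$. The names `implied_ratio` and `is_consistent`, and the tolerance, are hypothetical: with real data the comparison would be a statistical one rather than a numerical equality.

```python
import math


def gamma_summary(ratio: float, alpha: float) -> float:
    """Gamma in its simplified form |(k - 1) / (k + 1)|."""
    k = ratio ** (-1.0 / (alpha - 1.0))
    return abs((k - 1.0) / (k + 1.0))


def implied_ratio(d: float, alpha: float) -> float:
    """Productivity ratio (>= 1) under which the observed d is efficient.

    Assumes 0 <= d < 1 and 0 < alpha < 1 (assumptions of this sketch).
    """
    if not 0.0 <= d < 1.0:
        raise ValueError("d must lie in [0, 1)")
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    k = (1.0 + d) / (1.0 - d)   # solves d = (k - 1) / (k + 1) for k >= 1
    return k ** (1.0 - alpha)   # undo k = ratio^(1 / (1 - alpha))


def is_consistent(d: float, alpha: float, ratio_hat: float,
                  rel_tol: float = 1e-6) -> bool:
    """Compare an estimated ratio with the implied one, up to relabelling."""
    r_star = implied_ratio(d, alpha)
    return (math.isclose(ratio_hat, r_star, rel_tol=rel_tol)
            or math.isclose(ratio_hat, 1.0 / r_star, rel_tol=rel_tol))


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    r_star = implied_ratio(0.6, 0.5)
    print(r_star, gamma_summary(r_star, 0.5))
    print(is_consistent(0.6, 0.5, ratio_hat=1.0))
```

In this illustration, an observed gap of 60% of the fund with $\alpha = 0.5$ would be efficient only if one group were about twice as productive as the other; an estimated ratio of $1$ would fail the check.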

Conclusion

In this concluding remark, I first emphasise a particular weakness in the empirical methods proposal. The dynamics of how school inputs affect scholastic output may be wholly different in the lower years compared to the higher years. If they differ substantially, then an effect computed in the lower years, by isolating it from the students' pre-school circumstances, cannot validly be carried over to the higher years. This weakness is especially palpable if the model is used to determine scholarship endowments post-secondary school.

In the theory, however, we dabbled a little with education policies that discriminate purely based on gender. This need not be interpreted as discrimination in public policy. Traditions and cultures, while they do influence government policies, nevertheless come up with policies of their own. Our model is applicable to the analysis of policy-makers as an abstract entity. Thus, it is applicable to all institutions that engage in the activity of human capital investment, whether the free market or the state, the tradition or the household.

With regard to this brief thought experiment on discrimination, the implications of our theory are clear. The efficient variance in investment distribution between the genders increases in nature’s tendency to endow each gender with relatively different levels of productivity. Remember that the exercise requires us to come up with a ceteris paribus estimate of the true value of relative productivity. Group characteristics other than gender must be held constant, so as to remove all confounding reasons for investment differentials. If girls tend to be made to stay at home more than boys by virtue of societal traditions, and so may not have had much chance to hone their productivity from early on, then this does not necessarily justify differentials in government-financed investment. So, even without engaging in this exercise with absolute scrutiny, are we to believe that nature systematically endows our students with productivity levels conditional on their gender, or at least conditional on it enough to warrant so much focus on enforcing gender roles? If not, then the window within which investment should diverge is rather small, and our efficiency conditions systematically tend to smooth out investment.

Footnotes and References

Footnotes

  1. Douglas N. Harris. “Diminishing Marginal Returns and the Production of Education: An International Analysis”. In: Education Economics 15 (Mar. 2007), pp. 31–53. doi: 10.1080/09645290601133894.

  2. Eric A. Hanushek. The Economics of Education: A Comprehensive Overview. Chapter 13 - Education production functions. Ed. by Steve Bradley and Colin Green. 2nd ed. Academic Press, An Imprint Of Elsevier, 2020, pp. 161–170.

  3. Eric A. Hanushek. “Conceptual and Empirical Issues in the Estimation of Educational Production Functions”. In: The Journal of Human Resources 14 (1979), pp. 351–388. doi: 10.2307/145575. url: https://www.jstor.org/stable/145575?seq=1.

  4. Jacob Mincer. Schooling, experience, and earnings. National Bureau Of Economic Research, 1974.

  5. How much should we invest in teachers’ payroll? What about classroom resources and school facilities? These are all questions to be answered in the maximisation of educational outputs subject to an educational production function.

  6. Our production function differs from its Neoclassical counterpart in that the Inada conditions do not hold.

  7. This symmetry follows more generally from how $F$ is additively separable, and its constituent single-variable functions are of the same form. $E = e_1 + e_2$ is additively separable and its constituents are of the same functional form. Therefore, the constraint at the boundary is symmetric about $e_1 = e_2$.

© Panhaboth Kun