So a friend of mine, Franklin, recently posted about an econ/math optimization problem regarding foods and their nutritional values (see Stigler diet). The problem roughly goes like this: given food items \(F_1, F_2, F_3, \dots, F_n\) that cost \(c_1, c_2, c_3, \dots, c_n\) per unit, each with nutritional value \(N_1, N_2, \dots, N_m\) per unit, how do we minimize costs to reach one's daily nutritional goals \(G_1, G_2, \dots, G_m\) respectively? (A seemingly simple, naïve problem. Although I wouldn't want to consume something like Soylent every day.) An example would be the following array of nutritional information:

$$\begin{array}{c|lcr} \space & F_1 & F_2 \\ \hline N_1 & 0.2 & 0.3 \\ N_2 & 0.3 & 0.1 \\ \end{array}$$

That’s all normalized so that the units given are in proportion of daily value per unit cost. The blog post linked above details a case-by-case approach to this problem, so I won't delve into that. The post, however, ends off with a call for a do-it-all algorithm. I decided to look into this as I might have some ideas regarding a generalized form of this problem…

First of all, an algorithm for the simple form is pretty trivial given the overpoweredness of Mathematica:

Minimize[{F1 + F2, {{0.2, 0.3}, {0.3, 0.1}}.{F1, F2} == {1, 1}, F1 >= 0, F2 >= 0}, {F1, F2}]

which returns

{4.28571, {F1 -> 2.85714, F2 -> 1.42857}}

Just what the question called for. So that’s aside.

What if, though, food prices are not the same depending on how much of a certain food you buy? Say a nation, or even a community, needs to source food to feed its large population of constituents. Perhaps buying in bulk saves more money, or (on a larger scale) buying too much of one food costs more money as there is a limited supply of that food item. (Hey, it’s an econ problem so these are valid concerns.) Maybe, even, eating too much of one food diminishes its nutritional value. (I don’t know, digestion or something?) So let’s assume that the units of a food item you can buy is a \(C^1\) function of cost, and the units of nutrition is also a \(C^1\) function of units of food. We combine this \(N(F(c))\) into a single \(C^1\) function \(f(c)\) of cost. Consider this generalized diet problem now: Given food items \(F_1, F_2, \dots, F_n\) that respectively yield units of nutrition \(N_1, N_2, \dots, N_m\) given by the functions \(f_{i,j}(c_i)\) of cost \(c_1, c_2, \dots, c_n\) (for the \(i^{\mathrm{th}}\) food item and the \(j^{\mathrm{th}}\) nutritional value), how do we minimize cost \(C=\sum c_i\) to reach daily nutritional goals of \(G_1, G_2, \dots, G_m\)? In other words, say the nutritional table now looked like this:

$$\begin{array}{c|cccc} & F_1& F_2 & \cdots & F_n\\ \hline N_1 & f_{1,1}(c_1) & f_{1,2}(c_2) & \cdots & f_{1,n}(c_n) \\N_2 & f_{2,1}(c_1) & f_{2,2}(c_2) & \cdots & f_{2,n}(c_n) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ N_m & f_{m,1}(c_1) & f_{m,2}(c_2) & \cdots & f_{m,n}(c_n) \end{array}$$

This, now, translates into a typical optimization problem that we can solve using something relatively overpowered: Lagrange Multipliers. The setup is as such, we have have a function:

$$C(c_1,c_2,\dots,c_n)=c_1+c_2+\cdots +c_n$$

that we are trying to find critical points on (find a minima). We have \(n\) constraint functions (we assume here that we hit the dietary quotas perfectly):

$$N_1(c_1,\dots,c_n)=f_{1,1}(c_1)+f_{1,2}(c_2)+\cdots + f_{1,n}(c_n)-G_1=0$$

$$N_2(c_1,\dots,c_n)=f_{2,1}(c_1)+f_{2,2}(c_2)+\cdots + f_{2,n}(c_n)-G_2=0$$


$$N_m(c_1,\dots,c_n)=f_{m,1}(c_1)+f_{m,2}(c_2)+\cdots + f_{m,n}(c_n)-G_m=0$$

To combine all these equations, we use an auxiliary Lagrangian function:

$$\mathcal{L}(c_1,\dots,c_n,\lambda_1,\dots,\lambda_m)=C(c_1,\dots,c_n)-\sum_{k=1}^m \lambda_k N_k(c_1,\dots,c_n)$$

and solve

$$\nabla \mathcal{L}(c_1,\dots,c_n,\lambda_1,\dots,\lambda_m)=\mathbf{0}$$

which has \(n+m\) inequalities in \(n+m\) equations.

This is obviously overkill for linear \(f\) with constant derivative, but we can try using this to “nuke” the 3x3 dietary matrix example that Franklin mentioned just to prove that it works.

$$\begin{array}{c|ccc}\space & F_1 & F_2 & F_3 \\ \hline N_1 & 0.1 & 0.2 & 0.5 \\ N_2 & 0.5 & 0.4 & 0.1 \\ N_3 & 0.7 & 0.1 & 0.6 \\ \end{array}$$

Using the functions we defined above, we say that \(f_{1,1}(c_1)=0.1c_1, f_{1,2}(c_2)=0.2c_2\), and so on. The function that we are trying to minimize has derivative of a matrix of ones:

$$C(c_1,c_2,c_3)=c_1+c_2+c_3\qquad [\mathbf{D}C(c_1,c_2,c_3)]=\begin{bmatrix}1&1&1\end{bmatrix}$$

And our 3 constraints are as follows:




along with the 3 other equations from our auxiliary function:

$$\begin{align*}\nabla_{c_1} \mathcal{L}=&1-\lambda_1(\mathbf{D_1}N_1)-\lambda_2(\mathbf{D_1}N_2)-\lambda_3(\mathbf{D_1}N_3)\\=&1-\lambda_1(0.1)-\lambda_2(0.2)-\lambda_3(0.5)=0\end{align*}$$

$$\begin{align*}\nabla_{c_2} \mathcal{L}=&1-\lambda_1(\mathbf{D_2}N_1)-\lambda_2(\mathbf{D_2}N_2)-\lambda_3(\mathbf{D_2}N_3)\\=&1-\lambda_1(0.5)-\lambda_2(0.4)-\lambda_3(0.1)=0\end{align*}$$

$$\begin{align*}\nabla_{c_3} \mathcal{L}=&1-\lambda_1(\mathbf{D_3}N_1)-\lambda_2(\mathbf{D_3}N_2)-\lambda_3(\mathbf{D_3}N_3)\\=&1-\lambda_1(0.7)-\lambda_2(0.1)-\lambda_3(0.6)=0\end{align*}$$

(You'll realize that there are 3 equations solely with \(c_i\)s and 3 equations solely with \(\lambda_i\)s. This is because we're using the simple linear case, and is also why the simple case only really works with square matrices – otherwise you would be solving for too many variables with not enough equations, or too many equations for variables so there aren't any solutions.) You'll also realize that solving this is trivial as it's just the same 3 equations, and is the same solution detailed in Franklin's post; I won’t repeat it here. \(\blacksquare\)