The stochastic programming farming problem involves deciding how to allocate 500 acres of land among three crops: wheat, corn, and sugar beets. While the sales prices at harvest time are fixed, the yields per acre are not. The objective is to allocate the land so that expected profit is maximized; equivalently, in standard form, we minimize costs less revenue. To incorporate the uncertainty, we introduce randomly generated scenarios representing possible harvest-yield outcomes over which the expectation is taken.
To solve this problem, we were tasked with constructing a Simplex method solver and a Dantzig-Wolfe decomposition solver. Intuitively, we expect the Dantzig-Wolfe method to outperform the Simplex method, as the decomposition reduces the size of the matrices handled at each step by solving a sequence of smaller subproblems.
Before constructing any solver, we must formulate the problem. For a given instance of the farming problem, we require n harvest-yield cases corresponding to the n scenarios in the instance. As we assume each scenario is equally likely, a uniform distribution seems appropriate. To set the bounds of the uniform's support, we consider two subject-matter points. First, any yield must be bounded below by 0: we cannot plant crops and harvest a negative amount per acre. Second, from a practical standpoint, harvesting almost nothing or double the average should not be common phenomena. To reflect this, the bounds on our uniform are \((0.5,1.5)\), where 0.5 represents a crop yielding half the average and 1.5 represents a 50% increase over the average.
To generate a scenario, we draw three random values from the uniform distribution, one yield multiplier per crop. Three draws are made per scenario, rather than a single one, so that the crops can produce different yields. However, I decided that each scenario should reflect either a good harvest or a bad harvest, meaning that all yields in a scenario are above average or all are below average. This imposes an additional restriction on our uniform. To select the overarching condition of the scenario, we perform a Bernoulli trial with probability 0.5 that selects a good or bad harvest. If a good harvest is selected, three numbers are drawn from a Uniform(1,1.5) distribution; if a bad harvest is selected, three numbers are drawn from a Uniform(0.5,1) distribution. Equipped with the three yield multipliers, the average yields \([2.5,3,20]\) are multiplied element-wise by the multipliers to arrive at the yields for the specific scenario. These yields are then incorporated into the standard formulation of the farming problem.
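Per scenario, this draw (as implemented in CaseGenerator; see appendix) boils down to three lines:

    year = round(rand(1))/2;          %Fair coin: 0 = bad year, 0.5 = good year
    mult = 0.5 + year + rand(1,3)/2;  %Unif(0.5,1) in a bad year, Unif(1,1.5) in a good one
    yields = mult .* [2.5, 3, 20];    %Scale the average yields of wheat, corn, and beets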
The function CaseGenerator(n) is my implementation of this process. It first builds the common 500-acre constraint and then appends each scenario to the cost vector and constraint matrix as the for loop generates them. To produce a consistent form, all constraints are formulated as less-than-or-equal-to constraints. The code for this function is included in the appendix.
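Concretely, writing \(x\) for the acreage vector, \(y_{1s}, y_{2s}\) for the wheat and corn purchases, \(w_{1s},\dots,w_{4s}\) for the sales, and \(t_s\) for the yield vector of scenario \(s\), each scenario contributes the block that CaseGenerator emits:

\[
\begin{aligned}
-t_{1s}x_1 - y_{1s} + w_{1s} &\le -200,\\
-t_{2s}x_2 - y_{2s} + w_{2s} &\le -240,\\
-t_{3s}x_3 + w_{3s} + w_{4s} &\le 0,\\
w_{3s} &\le 6000,
\end{aligned}
\]

and adds \(\frac{1}{n}\left(238\,y_{1s} + 210\,y_{2s} - 170\,w_{1s} - 150\,w_{2s} - 36\,w_{3s} - 10\,w_{4s}\right)\) to the objective alongside the planting cost \(150x_1 + 230x_2 + 260x_3\).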
To solve the farming problem, we implemented two solvers: a Simplex method and a Dantzig-Wolfe decomposition. The code for both methods is included in the appendix. Here we discuss a few key implementation details.
For the Simplex method, first, instead of explicitly inverting the basis matrix at every iteration, we use the linsolve function to solve the equivalent systems of linear equations, as specified in the Revised Simplex method. Second, upon finding that the problem is unbounded, the extreme direction is returned along with a flag value in the cost to indicate that the LP was unbounded. Finally, the algorithm is initialized using the Big-M method. Because some coefficients in the b-vector are negative, we cannot simply introduce slack variables and use the identity matrix corresponding to the slack variables as our initial basis. Instead, after converting the problem to standard form, the constraints with negative right-hand sides are multiplied by -1 (setting the coefficients of some slack variables to -1), and the Big-M variables are then introduced. This yields an identity matrix, corresponding to the Big-M variables, to initialize with.
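Condensed, this setup is equivalent to the code in CustomSimplex (variable names abbreviated, with [m, ~] = size(A)):

    signs = (b > 0) - (b <= 0);     %+1 keeps a row as-is, -1 flips it
    A2 = [A, eye(m)] .* signs;      %Slack columns, then row sign flips
    b2 = b .* signs;                %Right-hand sides are now nonnegative
    A2 = [A2, eye(m)];              %Big-M artificial columns form the initial basis
    c2 = [c, zeros(1,m), 10000*ones(1,m)]; %Penalty cost M = 10,000 on the artificials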
In the Dantzig-Wolfe decomposition, a similar initialization strategy is employed. However, we first solve the subproblems with the linprog function to find initial corner points, and then initialize the Big-M variables; these are the first columns pushed out by the algorithm due to their high cost. The second noteworthy implementation detail is that our custom Simplex method is called at each iteration to solve the subproblems. Because we designed the Simplex implementation to output extreme directions upon detecting unboundedness, it flows nicely into the Dantzig-Wolfe decomposition. Finally, we impose a tolerance on the reduced costs. Due to finite-precision arithmetic, reduced costs that should have been zero were being computed as negative values on the order of \(10^{-12}\). Thus, any reduced cost with absolute value less than \(10^{-8}\) is set to zero when checking for termination. As the implementation terminates only upon finding optimality, rather than upon reaching a certain error bound, we cannot allow numerical noise to prevent termination.
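The initial corner point for each subproblem comes from a small phase-1 LP: append a slack and an artificial variable to the single constraint row, minimize the artificial variable, and keep the original variables. In essence (from DantzigWolfe in the appendix):

    options = optimoptions('linprog','Display','off'); %Suppress solver output
    p1c = [zeros(1,length(Sc{i})), 0, 1];  %Cost only on the artificial variable
    p1A = [SA{i}, 1, 1];                   %Constraint row, slack, artificial
    init = linprog(p1c, [], [], p1A, Sb{i}, zeros(1,length(p1c)), [], options);
    v{i} = init(1:end-2);                  %Keep the original variables only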
One final note concerns computational accuracy. In both methods, we use linsolve to compute the direction vectors. Quite frequently, linsolve would return elements in the solution vectors that were very close to, but not exactly, zero when they should have been exactly zero. These values would then trigger an incorrect entry into the basis. To correct for this, we set values with absolute value below a tolerance to zero.
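The fix is a one-line threshold applied before any sign test; the same pattern guards the reduced-cost check described above:

    d(abs(d) < tol) = 0;  %Zero out numerical noise before checking signs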
Both implementations excelled on the accuracy front. In the Timing and Testing section of the appendix, there is a script that loops over a set of values representing the number of subproblems. While recording the time each method took to solve the problem, I also compared each solution against the solution of the same problem obtained with MATLAB's linprog function. The Simplex method and the Dantzig-Wolfe decomposition were tested on problem sizes between 1 and 500 subproblems. No absolute difference greater than \(10^{-8}\) was observed between either method's solution and the linprog solution. For example, on a given run of a 100-subproblem case, all three algorithms produced the objective value -100,114.67. Given the consistency of our results, we can feel very comfortable about the solution quality of the solvers.
For the Dantzig-Wolfe decomposition, the key reason for this high quality is that accuracy was prioritized over speed in the implementation. The decomposition is allowed to run until termination, not merely until the bounds on the solution reach some target value: the algorithm terminates only once all reduced costs are nonnegative.
Having established the high solution quality, we now turn to the question of efficiency. While testing the accuracy of each method, the time to solve the LP was recorded for each subproblem count. The following plot shows the runtimes of the three methods on problem sizes up to 500.
We can see that the Simplex method and the Dantzig-Wolfe decomposition begin to diverge at a problem size of around 225. The runtime of the Simplex method grows much faster as a function of problem size than that of the decomposition; at the 500-subproblem mark it takes close to double the time. However, notice that linprog still terminates in under a second for all problem sizes. While the decomposition outperforms the Simplex method, neither comes anywhere close to the industrial solver.
The following table reports solve times for a subset of the problem sizes tested.
Size | linprog (s) | Simplex Method (s) | Dantzig-Wolfe (s) |
---|---|---|---|
1 | 0.0085603 | 0.0015545 | 0.0299023 |
2 | 0.0086845 | 0.0016559 | 0.0351054 |
3 | 0.0088273 | 0.0019626 | 0.0454192 |
4 | 0.0090289 | 0.0034692 | 0.0589206 |
5 | 0.0091014 | 0.0046650 | 0.0666460 |
10 | 0.0093767 | 0.0108294 | 0.1139530 |
15 | 0.0097107 | 0.0317854 | 0.1910992 |
25 | 0.0103996 | 0.1330424 | 0.4060585 |
50 | 0.0119999 | 0.6250298 | 1.0920966 |
100 | 0.0177685 | 5.8116758 | 6.5515200 |
150 | 0.0281627 | 18.4340736 | 16.2262651 |
250 | 0.1552917 | 125.2588135 | 85.8248788 |
500 | 0.1795418 | 1304.6220852 | 733.8879828 |
It is interesting to note that below 150 subproblems, the Dantzig-Wolfe decomposition was actually slower than the Simplex method. Even more interesting is that our custom Simplex method consistently outperforms linprog on very small cases, which makes it a smarter choice than linprog for solving the small subproblems inside our Dantzig-Wolfe decomposition. Beyond 5 subproblems, however, linprog is clearly the faster algorithm. To test the limits of the methods, we also ran both algorithms on a problem size of 1000. The Dantzig-Wolfe decomposition converged after 1.5 hours; the Simplex method had not converged after 6 hours and was manually terminated. These values are not included on the plot because they would stretch the scale far too much, but they illustrate that the Simplex method does not scale to very large problem sizes.
The natural next question is how we would go about improving the speed of our algorithms. A couple of immediately obvious efficiencies could be incorporated into the Dantzig-Wolfe decomposition. When solving an unbounded subproblem, the Simplex method often returns a standard basis vector as the extreme direction. Stepping through the solutions, I noticed it was often the case that there were two obvious extreme directions, as the cost vector would contain two negative coefficients corresponding to variables constrained only to be nonnegative. However, since standard basis vectors were returned as the output, it would take two iterations to process this structure. A customized Simplex method for the Dantzig-Wolfe decomposition that recognized such structure and returned a better extreme direction, as sketched below, could drastically reduce the number of iterations required. Secondly, we could exploit the fact that most of the subproblems share the same structure, so updates could be made in groups in certain instances.
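As a minimal sketch of the first idea, assume a subproblem with a single less-than-or-equal-to constraint row sa, cost row obj, and nonnegative variables; this pre-check is hypothetical and not part of the current implementation:

    ray = (obj < 0) & (sa <= 0);  %Coordinates that cut cost and preserve feasibility
    if any(ray)
        dir = double(ray).';      %One combined extreme direction covering all of them
    end

Because every qualifying coordinate has a non-positive constraint coefficient, any nonnegative combination of them is a recession direction with strictly negative cost, so a single column-generation step could capture structure that currently takes several.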
In summary, the Dantzig-Wolfe decomposition outperforms the Simplex method for large problem sizes. We observe this at subproblem counts up to 500; beyond that, the Simplex method is too slow to even attempt to run. While it beat the Simplex method, the Dantzig-Wolfe decomposition was not able to beat the built-in linprog function.
% function [A, b, c] = CaseGenerator(n)
% Generates n scenarios of the farming problem and converts them into
% matrices characterizing the linear program. The scenarios are drawn from
% a uniform distribution in the regions where all coordinates are above 1
% or all coordinates are below 1. This is to restrict the subproblems to
% cases where farming yields increased for the year or decreased for the
% year. Finally, each coordinate must have a value between [0.5, 1.5].
% The draw is then multiplied by the average yields [2.5, 3, 20] to
% construct the given subproblem's yields.
function [A, b, c] = CaseGenerator(n)
% First set the coupling constraints for the common x variables
A = [1, 1, 1]; %Acreage constraint
b = [500]; %500 total acres
c = [150, 230, 260]; %Planting costs
% Extend the constraint matrix A for each scenario
% First note that there are 6 new variables per scenario
% Thus, the number of columns of A is 3 + 6n
% We can construct a new matrix A2 to store the coefficients of the
% new variables. This will be appended to A later.
A2 = zeros(1,6*n);
% Next, notice that the new variables introduced in each scenario
% share the same coefficients. The only coefficients that vary between
% scenarios are those tied to the acreage variables. Thus, we can
% construct a diagonal block matrix to represent the introduced
% variables. These will be added to A2. For consistency, we will only
% be introducing less-than-or-equal-to constraints, each of the form
% sales - production - purchases <= right-hand side.
% Constraints
temp = [-1, 0, 1, 0, 0, 0; %Wheat Production Constraint
0, -1, 0, 1, 0, 0; %Corn Production Constraint
0, 0, 0, 0, 1, 1; %Beet Production Constraint
0, 0, 0, 0, 1, 0]; %High Price Beet Sales Constraint
A2 = [A2; kron(eye(n),temp)];
% Generate the yields for each of the n cases
for i = 1:n
% Randomly determine if this will be a positive or negative case
year = round(rand(1))/2;
% If year == 0, yields will generate Unif(0.5,1) values.
% If year == 0.5, yields will generate Unif(1,1.5) values.
yields = 0.5 + year + rand(1,3)/2;
yields = -diag(yields .* [2.5, 3, 20]); %Negative to enforce less than or equal to
A = [A; yields; zeros(1,3)];
end
%Finally, concatenate A and A2
A = [A, A2];
% Extend the constraint vector b for each scenario
btemp = repmat([-200; -240; 0; 6000],n,1);
b = [b; btemp];
% Extend the cost vector for each scenario
ctemp = repmat([238, 210, -170, -150, -36, -10],1,n)/n;
c = [c, ctemp];
end
% function [cost, xopt] = CustomSimplex(A, b, c)
% CustomSimplex implements the simplex algorithm for problems with all less
% than or equal to constraints. The inputs A, b, and c correspond to the
% constraint matrix, constraint right-hand sides and cost vectors
% respectively. This implementation solves systems of linear equations
% instead of inverting matrices as specified in the Revised Simplex Method.
function [cost, xopt] = CustomSimplex(A, b, c)
% Note the size of the matrix A
[m, ~] = size(A);
%The first step is to convert the problem to standard form. As all of
%the constraints are less than or equal to, concatenate an mxm identity
%matrix to the A matrix and add m zeroes to the cost vector
A2 = [A, eye(m)];
c2 = [c, zeros(1,m)];
%In order to use the Big M Method to initialize an x vector, we will
%need the constants on the right-hand side of the constraints to all be
%nonnegative. We will multiply the appropriate rows of A and b by
%-1 to accomplish this.
negatives = (b>0) - (b <= 0);
A2 = A2.*negatives;
b2 = b.*negatives;
%Introduce the Big M Variables to the constraint matrix and the cost
%vector. We will use M=10,000 as our penalty.
A2 = [A2, eye(m)];
c2 = [c2, 10000*ones(1,m)];
%Update the dimensions of A to those of A2
[m, n] = size(A2);
%Step 0: Initialization
%Solve for the initial basis vector xB
xB = linsolve(A2(:,(n-m+1):end),b2);
%Set the remaining x values to 0 under xN
xN = zeros(n-length(xB),1);
%Set the index lists of the x variables
idxN = 1:length(xN);
idxB = 1:length(xB);
idxB = length(xN) + idxB;
%Partition the A matrix and c vector
N = A2(:,1:length(xN));
B = A2(:,(n-m+1):end);
cN = c2(1:length(xN));
cB = c2((n-m+1):end);
cost = cN*xN + cB*xB;
%Counter for the number of steps
step = 0;
while 1
%Step 1: Optimality Check
%Find the reduced costs
pi = linsolve(B.',cB.');
r = cN - pi.'*N;
if all(r >= 0) %Optimal solution found
break
end
%Select a variable from the non-basis set to enter the basis
entering = idxN(r < 0);
xq = entering(1);
%Step 2: Descent Direction Generation
Nq = N(:,idxN == xq);
d = linsolve(B, -Nq);
d(abs(d) < 0.00001) = 0; %Correct for computation inaccuracies
temp = eye(length(xN));
eq = temp(:,idxN == xq);
%Check if the LP is unbounded
if all(d >= 0)
%disp("LP is unbounded");
cost = 1234567890;
[~, n] = size(A);
xopt = zeros(n,1);
%Reconstruct the extreme direction
for j = 1:n
if any(idxB == j)
xopt(j) = d(idxB == j);
else
xopt(j) = eq(idxN == j);
end
end
return
end
%Step 3: Step Length Generation
idxj = idxB(d < 0);
xj = xB(d < 0);
dj = d(d < 0);
minratio = -xj ./ dj;
alpha = min(minratio);
%Step 4: Improved Adjacent Basic Feasible Solution
xB = xB + alpha * d;
xN = xN + alpha * eq;
%Step 5: Basis update
exit = idxj(minratio == alpha);
exit = min(exit);
exitcol = B(:,idxB == exit);
exitc = cB(idxB == exit);
%Update the basis matrix, basis cost coefficients and index set
B(:,idxB == exit) = Nq;
cB(idxB == exit) = cN(idxN == xq);
xB(idxB == exit) = xN(idxN == xq);
idxB(idxB == exit) = xq;
%Update the non-basis matrix, cost coefficients and index set
N(:,idxN == xq) = exitcol;
cN(idxN == xq) = exitc;
xN(idxN == xq) = 0;
idxN(idxN == xq) = exit;
%Compute the new cost value
cost = cN*xN + cB*xB;
step = step + 1;
end
%disp(step); %Uncomment to report the iteration count
%Reconstruct xopt in the correct ordering
[~, n] = size(A);
xopt = zeros(n,1);
for j = 1:n
if any(idxB == j)
xopt(j) = xB(idxB == j);
else
xopt(j) = xN(idxN == j);
end
end
end
% function [cost, xopt] = DantzigWolfe(A, b, c, N, tol)
% DantzigWolfe implements the Dantzig-Wolfe decomposition algorithm for
% problems with the structure of the stochastic farming problem. The inputs
% A, b, and c correspond to the constraint matrix, constraint right-hand
% sides and cost vectors respectively. N is the number of scenarios in the
% problem and tol is the parameter controlling computation limitations.
function [cost, xopt] = DantzigWolfe(A, b, c, N, tol)
% First we must subdivide this problem into the master problem and
% the subproblems. The first constraint x1 + x2 + x3 <= 500 corresponds
% to the first subproblem. After that, the structure of the A matrix is
% such that the first 3 constraints of each scenario block are linking
% constraints and the fourth constraint, w3 <= 6000, is specific to the
% subproblem. Thus, we can extract the master problem and subproblem
% constraints as follows:
[~, col] = size(A);
MA = zeros(3*N,col);
Mb = zeros(3*N,1);
%Extract the linking constraints
for i = 1:N
MA((3*i-2):(3*i),:) = A((4*i-2):(4*i),:);
Mb((3*i-2):(3*i)) = b((4*i-2):(4*i));
end
%Extract the subproblem constraints. We will use cell arrays so we can
%access the subproblem constraints by subproblem number
SA = cell(1,N+1);
Sb = cell(1,N+1);
Sc = cell(1,N+1);
SA{1} = A(1,1:3);
Sb{1} = b(1);
Sc{1} = c(1:3);
for i = 1:N
SA{i+1} = A((4*i+1),((6*i)-2):((6*i)+3));
Sb{i+1} = b(4*i+1);
Sc{i+1} = c(((6*i)-2):((6*i)+3));
end
%Step 0: Initialization
%v holds the current extreme points in the solution and vidx indicates
%which subproblem they were generated from
vidx = zeros(1,N+1);
v = cell(1,N+1);
%Suppress linprog output for generating initial vectors
options = optimoptions('linprog','Display', 'off');
for i = 1:(N+1)
vidx(i) = i;
%Find the initial solution using the phase 1 method. Introduce the
%slack variables and artificial variables before using linprog.
p1c = [zeros(1,length(Sc{i})),0,1];
p1A = [SA{i},[1,1]];
p1b = Sb{i};
p1lb = zeros(1,length(p1c));
init = linprog(p1c,[],[],p1A,p1b,p1lb,[],options);
if init(length(init)) > 0
disp('Subproblem is infeasible');
return
end
v{i} = init(1:length(init)-2);
end
% To initialize the master problem constraint matrix, we first need to
% introduce slack variables to turn the inequalities into equalities. Then,
% we multiply each by -1 since the right hand side of the three linking
% constraints per subproblem are each non-positive. Finally, we will
% introduce Big-M artificial variables to initialize at the
% coefficients of b
%Note the size of the matrix MA
[m, n] = size(MA);
%Add the slack variables and multiply each row by -1
MA = [MA, eye(m)];
c = [c, zeros(1,m)];
MA = MA * -1;
Mb = Mb * -1;
%Add the Big-M variables
MA = [MA, eye(m)];
c = [c, 1000*ones(1,m)];
%Extract the L matrices into a cell array for easier access
L = cell(1,N+1);
L{1} = MA(:,1:3);
for i = 1:N
L{i+1} = MA(:,((6*i)-2):((6*i)+3));
end
%Construct the initial Basis matrix and basis index set. To distinguish
%between lambda/mu variables and Big-M variables, any index in the
%basis index set less than or equal to N+1 will indicate a subproblem
%number whereas any value greater than N+1 will correspond to a Big-M
%column.
Bidx = [vidx, (n+m+1):(n+m+m)];
B = zeros(m+N+1, length(Bidx));
fb = zeros(length(Bidx),1);
for val = 1:length(Bidx)
if Bidx(val) <= (N+1)
temp1 = L{val}*v{val};
temp2 = zeros(N+1,1);
temp2(val) = 1;
B(:,val) = [temp1; temp2];
fb(val) = Sc{val}*v{val};
else
temp = [MA(:,Bidx(val));zeros(N+1,1)];
B(:,val) = temp;
fb(val) = 1000;
end
end
%Set the initial x value
xB = [ones(N+1,1);Mb];
cost = fb' * xB;
step = 0;
while 1
%Step 1: Simplex Multiplier Generation
dual = linsolve(B', fb);
pi1 = dual(1:m);
%Step 2: Optimality Check
rstar = zeros(1,N+1);
a = zeros(m+N+1,1);
extreme = 0;
entering = 0;
ftemp = 0;
SPsols = cell(1,N+1);
for j = 1:(N+1)
%In certain cases, obj is producing numbers such as 2e-14
%instead of 0. To allow the algorithm to terminate, we will set
%these to 0 based on the tolerance input.
obj = Sc{j} - pi1' * L{j};
obj(abs(obj) < tol) = 0;
[r, dir] = CustomSimplex(SA{j}, Sb{j}, obj);
%if the problem is unbounded, stop iterating and return the
%extreme direction
if r == 1234567890
a(1:m) = L{j}*dir;
extreme = 1;
entering = j;
SPsols{j} = dir;
ftemp = Sc{j}*dir;
rstar(j) = r;
break
else
%If not, just return the reduced cost and the subproblem
%solution
rstar(j) = r - dual(m+j);
SPsols{j} = dir;
end
end
%Step 3: Column Generation
%If we did not find an extreme direction, choose the minimal r
%value from the subproblems
if extreme == 0
rmin = min(rstar);
if (rmin >= 0 || abs(rmin) <= tol)
%disp("Optimal Solution Found!");
break
end
ridx = find(rstar == rmin, 1);
entering = ridx;
a(1:m) = L{ridx}*SPsols{ridx};
a(m+ridx) = 1;
ftemp = Sc{ridx}*SPsols{ridx};
end
%Step 4: Descent Direction Generation
d = linsolve(B,-a);
d(abs(d) < tol) = 0; %Correct for computation inaccuracies
if all(d >= 0)
disp("LP is unbounded");
break
end
%Step 5: Step Length Generation
dl = d(d < 0);
xl = xB(d < 0);
ratio = -xl./dl;
alpha = min(ratio);
lidx = find(d < 0);
exit = find(ratio == alpha, 1);
lstar = lidx(exit);
%Step 6: Update Basic Variables and Basis
xB = xB + alpha*d;
%Step 7: Basis Update
B(:,lstar) = a;
Bidx(lstar) = entering;
xB(lstar) = alpha;
fb(lstar) = ftemp;
v{lstar} = SPsols{entering};
cost = fb' * xB;
step = step + 1;
end
%Reconstruct the optimal x solution
xopt = [];
for k = 1:(N+1)
kidx = find(Bidx == k);
if k == 1
kvec = zeros(3,1);
else
kvec = zeros(6,1);
end
for l = kidx
contribution = xB(l) * v{l};
kvec = kvec + contribution;
end
xopt = [xopt; kvec];
end
end
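% Timing and Testing
% Loops over a set of subproblem counts, solves each generated instance
% with linprog, CustomSimplex, and DantzigWolfe, records the wall time of
% each solver, and flags (by displaying the size) any instance where the
% objective values disagree by more than 1e-8.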
testcases = [1:99, 100:25:500]; %Subproblem counts for the timing runs
testcases2 = [750, 1000, 1500, 2000]; %Larger sizes reserved for the manual long-running tests
time1 = zeros(max(testcases),1); %Preallocate by the largest size since we index by i below
time2 = zeros(max(testcases),1);
time3 = zeros(max(testcases),1);
options = optimoptions('linprog','Display','off'); %Suppress linprog output in the loop
for i = testcases
[A,b,c] = CaseGenerator(i);
tic;
[~, fval1] = linprog(c,A,b,[],[],zeros(1,3+6*i),[],options);
time1(i) = toc;
tic;
[fval2, ~] = CustomSimplex(A,b,c);
time2(i) = toc;
tic;
[fval3, ~] = DantzigWolfe(A, b, c, i, 1e-8);
time3(i) = toc;
diff1 = fval1 - fval3;
diff2 = fval1 - fval2;
diff = max(abs(diff1),abs(diff2));
if diff > 1e-8
disp(i);
end
end