The Journal of Online Mathematics and Its Applications, Volume 8 (2008)

Integer Programming Model for the Sudoku Problem, Bartlett, Chartier, Langville, Rankin

Can we solve Sudoku puzzles mathematically? That is, can a mathematical process find the solution for us? There are two approaches that we could take to this end:

- Solution 1
- Write a computer program to execute the logic that a person uses to solve a Sudoku puzzle.
- Issue
- What if the programmer has yet to learn some level of logic that solves more difficult problems? This type of solution can necessitate a certain level of proficiency at solving such puzzles.
- Solution 2
- Formulate the problem in a way such that a mathematical algorithm can be applied to find the exact solution, if it exists.

Let us mathematically model the Sudoku puzzle found in Figure 1 as a linear program. More specifically, we will formulate a binary integer linear program (BILP) for general n \times n puzzles. Once the program is developed to solve the BILP for Figure 1, it can be easily adapted to solve *any* Sudoku puzzle.

To begin, we define our decision variables:

x_{ijk}= \left\{
\begin{array}{cl}
1, & \text{if element } (i,j) \text{ of the } n \times n \text{ Sudoku matrix contains integer } k {\rm ~~~~~}\\
0, & \text{otherwise}.
\end{array}
\right.

When the values of the decision variables are determined, we will know whether each integer k (1 \le k \le 9) appears in each element (i,j) of the n \times n Sudoku matrix. That is, the solution to the corresponding Sudoku puzzle will be defined.

We now turn to the ever important objective function and set of constraints. Notice how the constraints require only a knowledge of the rules of Sudoku puzzles (with the addition of one implicit constraint which is constraint (4) below) and do not require a proficiency with the logic necessary to solve such puzzles by hand. A BILP formulation suitable for Sudoku puzzles is as follows:

\begin{array}{llr}
\min {\rm ~~~~~}& \quad \mathbf{0}^T \mathbf{x} \qquad \qquad& \\
s.t. \quad & \displaystyle\sum_{i=1}^n x_{ijk}=1, \quad \mbox{$j$=1:$n$, $k$=1:$n$} \quad \text{(only one } k \text{ in each column)} \quad & (1) \\
\quad & \displaystyle\sum_{j=1}^n x_{ijk}=1, \quad \mbox{$i$=1:$n$, $k$=1:$n$} \quad \mbox{(only one $k$ in each row)} \quad & (2) \\
\quad & \displaystyle\sum_{j=mq-m+1}^{mq} \sum_{i=mp-m+1}^{mp} x_{ijk}=1, \quad \mbox{$k$=1:$n$, $p$=1:$m$, $q$=1:$m$} \quad \mbox{(only one $k$ in each submatrix)} \qquad & (3) \\
\quad & \displaystyle\sum_{k=1}^n x_{ijk}=1 \quad \mbox{$i$=1:$n$, $j$=1:$n$} \quad \mbox{(every position in matrix must be filled)} \qquad & (4) \\
\quad & x_{ijk}=1 \quad \forall \, (i,j,k) \in G \quad \mbox{(given elements $G$ in matrix are set "on")} \qquad & (5) \\
\quad &x_{ijk} \in \{0,1\} \quad & (6)
\end{array}

In this article, Matlab's `bintprog`

command will solve this BILP. (Note that the `bintprog`

command is available with Matlab's Optimization Toolbox.) Other software could also solve this BILP; examples include Excel (Weiss et.al., 2007), Xpress-Mosel (Chlond, 2005), and SAS (Devenezia et.al., 2007).

The Sudoku problem is actually a *satisfiability* problem or feasibility problem (also known as a constraint programming problem). The goal is to find at least one feasible solution satisfying constraints (1)-(6). In theory, because this is a satisfiability problem, no objective function is needed. However, in order to use Matlab's `bintprog`

command, an objective function is required. We chose \mathbf{0}^T as the vector of objective function coefficients.

Prove or disprove the following statement: Any uniform vector would work as the objective function of coefficients in the above BILP. (Solution)

Executing the above BILP model, a Matlab program named sudoku.m can solve this puzzle. We will see that the code can be easily adapted to solve other puzzles. The program requires that the user input the elements, called givens, provided at the start of the puzzle. The matrix of givens for our example is

\begin{eqnarray}
{\rm Givens} = \pmatrix{ 1 & 8 & 2 \cr
2 & 2 & 2 \cr
2 & 7 & 5 \cr
3 & 3 & 7 \cr
3 & 6 & 3 \cr
3 & 7 & 4 \cr
& \vdots & \cr
8 & 3 & 9 \cr
8 & 8 & 1 \cr
9 & 2 & 1},
\end{eqnarray}

where each line in this matrix gives the row index, column index, and integer value of each element appearing in the initial Sudoku matrix. The m-file sudoku.m calls the built-in Matlab function, `bintprog`

which is available through the Optimization Toolbox. On a Mac G5 with dual 2.7 GHz processors and 8 GB of memory, our program found the feasible solution, presented in Figure 2, in 16.08 seconds. Admittedly, this approach will generally be slower than a technique that executes the logical steps a person uses to solve a Sudoku puzzle. One such implementation is available at Sudoku Game - Wolfram Demonstrations Project and is based on the Mathematica implementation by Simons (Simons, 2005).

Clicking the graphic in Figure 2 will download a file sudokuMatlab.jar which is a Java application that allows the user to enter Sudoku puzzles and solve them using sudoku.m. Note a user must have both Matlab and the Optimization Toolbox to successfully utilize the Java application.

Matlab's `bintprog`

uses the classic branch and bound method to determine the optimal solution. The branch and bound method identifies a solution to a BILP by solving a series of LP-relaxation problems, by replacing (or relaxing) the binary constraint x_{ijk} = 1 or 0 to the weaker constraint 0 \le x_{ijk} \le 1. Initially, all the variables adhere to the relaxed constraint 0 \le x_{ijk} \le 1. At each branching step, the branch and bound method chooses a variable x_j whose current value for the LP-relaxation problem is not integral and adds a constraint x_j = 0 for one branch and x_j = 1 for another branch. For our problem, a branch of the search tree is discontinued if the LP-relaxation problem at the current node is infeasible. In our BILP formulation, the bounding of a branch and bound algorithm will not play a role given that our objective function equals 0 for any feasible solution. For additional information on the branch and bound algorithm, a reader is encouraged to reference the MATLAB documentation for the `bintprog`

function available at http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/bintprog.html.

Given the structure of a branch and bound algorithm, each additional constraint enforced by a given element of the opening Sudoku matrix greatly reduces the search space of the branch and bound procedure and makes the problem tractable. It's easy to understand why each given element of the matrix helps the branch and bound procedure. For example, because the (1,3) element in our 9 \times 9 Sudoku example from Figure 1 is 7, this means that x_{137}=1, which implies that

- x_{13k}=0, \quad \forall k \ne 7 (there can be no other integer in the (1,3) position)
- x_{i37}=0, \quad \forall i \ne 1 (there can be no other 7 in the third column)
- x_{1j7}=0, \quad \forall j \ne 3 (there can be no other 7 in the first row)
- x_{217}=0, x_{227}=0, x_{317}=0, x_{327}=0 (there can be no other 7 in the (1,1) submatrix)

Thus, for each given element, at most 29 (in general, 3(n-1)+(m-1)^2+1)) of the n^3=729 decision variables are determined.

Why can't we say 29 exactly? (Solution)

Without considering the constraints, the total search space for our model is 2^{n^3} because each decision variable can take on two values, and there are n^3 decision variables. For n=9, 2^{729} \approx 2.82 \times 10^{219}. Fortunately, constraint (5) supplies many givens (at least 17, see Section 4.1), each of which drastically reduces the search space. Suppose only 9 givens are provided, where each given appears in its own row, column, and submatrix (so that there is no double counting), then 9*29=261 of the 729 decision variables are determined, leaving a search space of 2^{468} \approx 7.62 \times 10^{140}. Of course, considering the remaining givens further reduces the search space, and makes the problem tractable for the IP technique.

Our BILP finds at least one feasible solution, if it exists. While Sudoku puzzles are defined as having a unique solution, the authors occasionally discovered puzzles in which this constraint was not met. Such faulty puzzles have multiple solutions. In such cases, we discovered that our solution while correct was distinct from the solution provided by the puzzle creators.

The n=4 Sudoku puzzle below has two solutions. (Note, the underscore is included in the matrix for entries that are not included as givens for the initial tableau.)

\text{Puzzle} =
\left(
\begin{array}{cccc}
\underline{~~} & 2 & \underline{~~} & \underline{~~} \\
3 & \underline{~~} & \underline{~~} & \underline{~~} \\
\underline{~~} & \underline{~~} & 4 & 3 \\
\underline{~~} & 3 & \underline{~~} & \underline{~~} \\
\end{array}
\right),
\quad \text{Solution }
=
\left(
\begin{array}{cccc}
1 & 2 & 3 & 4 \\
3 & 4 & 2 & 1 \\
2 & 1 & 4 & 3 \\
4 & 3 & 1 & 2 \\
\end{array}
\right),
\quad \text{or} \quad
\left(
\begin{array}{cccc}
1 & 2 & 3 & 4 \\
3 & 4 & 1 & 2 \\
2 & 1 & 4 & 3 \\
4 & 3 & 2 & 1 \\
\end{array}
\right).

Find all solutions to the Sudoku puzzle below. (Hint: there are 3.)

\text{Puzzle } =
\left(
\begin{array}{cccc}
\underline{~~} & 2 & \underline{~~} & 4 \\
3 & \underline{~~} & \underline{~~} & \underline{~~} \\
\underline{~~} & \underline{~~} & 4 & \underline{~~} \\
\underline{~~} & \underline{~~} & \underline{~~} & \underline{~~} \\
\end{array}
\right).

(Solution)

Create a program to find all solutions to a Sudoku puzzle. One way to do this is to modify the sudoku.m file to find all optimal solutions to our BILP. (Yato and Seta have shown that this problem is something they define as ASP-complete (Yato et al., 2002), which implies NP-completeness and demonstrates the challenge and complexity of the all solutions problem.)

Generally, the values of a group of cells in the puzzle can be determined from the givens in the initial matrix using only very simple logic. Identifying such values essentially increases the number of givens for the puzzle and generally aids in the efficiency of the BILP. Write an algorithm that determines the values of cells from the givens (again using only simply logic on the placement of the givens) and insert this as a preprocessing step in the sudoku.m file so that the matrix of `Givens`

is appended. Compare the computation time required with and without this preprocessing step.

Although it is a fun application for classes that study optimization, it is not necessary to use a BILP or even optimization techniques to solve Sudoku puzzles. Can you use other mathematical or computational techniques to solve these puzzles? Write your own algorithm.

Can you create a 4 \times 4 puzzle with a unique solution using only 5 givens? 4 givens? 3 givens? Which is more important, the position or the value of the givens? Formulate a proposition that summarizes your findings. For example, "any 4 \times 4 puzzle whose 5 givens satisfy...has a unique solution." (Solution)

Prove the proposition(s) you formulated in Exercise 4.