## # Predicting Penalty Coefficients for QUBO formulations

Lucas 2014, Gloever 2018, these are the legendary papers on QUBO formulations. There is also a major list of QUBO formulations to be found here. Often in these QUBO formulations, there are a set of coefficients (A, B, C, ...) that weigh different constraints in a certain way. Most of the time, there is also a certain set of valid coefficient values that are allowed - else, the QUBO does not map correctly to the original problem and may result in invalid solutions when solved.

An examplary QUBO formulation (for Number Partitioning) might look as follows: $A(\sum_{j=1}^{m}n_{j}x_{j})^2$, where $n_j$ is the j-th number in the set that needs to be partitioned into two sets. This is a trivial example, since the only constraint for A is very simple: A needs to be greater than 0. A more interesting example is Set Packing: $A(\sum_{i,j:V_i\cap{}V_{j}\neq{}\emptyset{}}x_{i}x_{j}) - B\sum_{i}x_i$. Here, the constraint is also rather simple, but still, something to consider: $B < A$. Now, for both examples, it is still interesting to know - which value should A or B take? Should A be set to 1, and B to 0.5? In the literature, this is rarely mentioned.

As a matter of fact, it turns out - this is quite an important factor to consider. It is in fact possible to greatly improve the approximation ratio returned by a quantum computer when using optimized penalty values. Before delving into it, it should be noted that approximation ratio is the ratio of optimal solutions returned by the D-Wave machine. We used the D-Wave Advantage 4.1 system.

The way it works is as follows. We predict the penalty value that is associated with the maximum minimum spectral gap. By maximizing the minimum spectral gap, the chance of the quantum annealer to jump into an excited state while annealing and staying there is minimized. Jumping into such state would result in a suboptimal solution after finishing the annealing. Thus, by maximizing the minimum spectral gap, we expect to improve the overall approximation ratio. We find empirically that this expectation is valid. Note that we investigated problems that have exactly two penalty coefficients (A and B).

The Figure above shows for Set Packing the approximation ratio of a set of 100 problem instances for which a neural network regression model (red line) predicted the best penalty values including the 95% confidence interval, versus a random process that samples random penalty values for 50 iterations and keeps the rolling best approximation ratio achieved. It is clear that only after 30 iterations does the random process reach the model, thus, it can be said that the model saves us 50 costly D-Wave calls.

The next Figure shows for six problems the $R^2$ coefficient of a neural network model achieved while training. The problems are:

- Knapsack with Integer Weights (KP)
- Minimum Exact Cover (MECP)
- Set Packing (SPP)
- Maximum Cut (MCP)
- Minimum Vertex Cover (MVCP)
- Binary Integer Linear Programming (BILP)

We generated for each problem 3000 problem instances.

For SPP, MCP, MVCP and BILP the model is easily able to predict the penalty value associated with the maximum minimal spectral gap. KP and MECP are tougher, as seen in the next Figure, which shows the clustering of the datasets. Especially for KP it is clear that predicting the best penalty values (in the Figure shown as the penalty ratio $B / A$) is much more complex than for the other problems.

To summarize, it is clear that optimizing the penalty coefficients of QUBO formulations leads to a better minimum spectral gap, which induces a better approximation ratio on the D-Wave Advantage machine.

This work is based on previous work found here.