## # Resolution and Indexing

This is a seminar paper I did in December 2019.

The post surveys the origins of resolution and a selected set of variants: Binary resolution with theory unification, hyper-resolution and parallel unit resulting resolution. Performance improvements enabled by indexing data structures such as substitution trees or abstraction trees are explored. Finally, related work and further topics are listed.

## Introduction

As early as in the 18th century Leibniz formulated the idea of automatic resolution with his calculus ratiocinator', in which he described a vision of machines being able to automatically resolve any logical problem51, i.e. for example the satisfiability of a propositional formula or a theorem of logic. Now, centuries later, resolution, an essential part of solving these kind of problems, enjoys a broad range of applications, though Leibniz' vision has not come into reality so far: Theorem provers do not have the ability to solve any such given problem. An excellent case suggesting that this may never become reality is Hilbert's 10th problem, in which an algorithm is to be found that can decide whether a diophantine equation is solvable: Matiyasevich showed in 1970 that it is indeed impossible to devise such an algorithm31 30.

Still, since its first formal definition by Robinson in 1965, many areas have been found to profit off of the resolution principle41. Examples include automatic theorem provers36 such as Vampi.e.cite{RV99}, SAT solvers such as BerkMin17 or logic programming. For the latter, PROLOG is a prominent example: A PROLOG program is defined as a logic formula and refutation using SLD resolution15 21 is the computation' of the program25. As for SAT solvers, resolution helps in averting combinatorial explosion and is used as a preprocessing step35. By creating new clauses that may show the unsatisfiability earlier the total state space that is visited is lessened. Specifically hyper-resolution has been shown to improve SAT solvers successfully in terms of efficiency5 20. Notable examples include HyPre5, HyperBinFast16, PrecoSAT7 or CryptoMiniSAT46.

The remainder of this post is structured as follows. First, binary resolution is presented. On top of that, three variants are explored: Equational-theory resolution, hyper-resolution and unit resulting resolution. In each chapter a note on possible performance improvements through indexing is made. Lastly, concluding remarks and references to related topics are listed.

## Binary Resolution

To derive a proof for a hypothesis given certain axioms, resolution can be of particular use. The hypothesis can be encoded as a proof of contradiction and resolution repeatedly applied. If the result is an empty clause, the theorem cannot be fulfilled if the hypothesis is negated. This contradiction is the proof that the hypothesis is actually correct. The opposite is true if the result is not an empty clause. Another classic application is the satisfiability of a certain problem that consists of multiple premises (consisting of multiple boolean variables) that have to be fulfilled. The resolution principle defines an elementary rule to confirm the (un-) satisfiability of that problem, i.e. whether there is a certain configuration of variables that can fulfill the problem. In the following, the rule is presented in detail.

Binary resolution41 is especially of interest for its support of predicate logic. With predicate logic it is possible to define relations between variables, which makes it more expressive than (finite) propositional logic6. While propositional resolution works analogously to binary resolution, using a similar propositional resolution principle11, there is a major difference to binary resolution — it makes use of unification. Now, in this context propositional resolution is not covered. Thus, in what follows it is reasoned in predicate logic exclusively. To dissect binary resolution, a few definitions are needed.

• A term is a constant or a variable or an n-ary function consisting of n terms.
• An n-ary predicate symbol on n terms is a literal. The negation of a literal is also a literal.
• A finite set of literals is a clause. The set can also be seen as a disjunction of the literals.

Binary resolution introduces the following rule (see Example 1): If, given two clauses, in this case $\{A(x), C(x)\}$ and $\{\neg A(x), D(x)\}$ , one contains a positive and the other a corresponding negative literal, all else being equal, the two clauses resolve into a new clause, also called the resolvent. In this case the resolvent is $\{C(x), D(x)\}$ . So far this is similar to propositional resolution.

A positive literal and its negative version is also called a complement. Now, binary resolution additionally relaxes the notion of a complement by allowing a negative and its positive literal to be resolved if they can merely be unified. See Example 2 below.

It can be seen that $A(x)$ and $\neg A(y)$ are unifiable by the substitution $\{y \mapsto x\}$ . This substitution is then applied to the resolvent, yielding $\{C(y), D(y)\}$ . It is important to note that this is always the substitution of a most general unifier19.

There is still a limitation. For example, the two clauses $\{A(x, a)\}$ and $\{\neg A(b, x)\}$ cannot be satisfied, since this is akin to $\forall x A(x, a)$ and $\forall x \neg A(b, x)$ (where the two $x$ variables are different) and assuming $x$ in the first clause becomes $b$ and $x$ in the second clause becomes $a$ , $\{A(b, a)\}$ and $\{\neg A(b, a)\}$ become unsatisfiable. Yet, binary resolution in its current form is not able to resolve to an empty clause. To fix this, a renaming step in the first clause is added32. Then, $x$ may become $y$ (resulting in $\{A(y, a)\}$ ) and thus unification and resolution is successful.

There is still a case that cannot be resolved correctly: The two clauses $\{A(x), A(y)\}$ and $\{\neg A(x), \neg A(y)\}$ are unsatisfiable, yet an empty clause cannot be derived, as all literals of the first clause are positive and all literals of the second clause are negative. For this purpose, factoring is introduced. The main point of factoring is deduplication. If some of the literals of a clause are unifiable, the clause after substitution of the most general unifier is a factor. A resolution can then commence using factors. In the above case, the substitution $\{x \mapsto y\}$ is a most general unifier. Thus, after applying it, the result is the factors $\{A(y)\}$ and $\{\neg A(y)\}$ , since a unification removes duplicates. Now, in a simple resolution step an empty clause is derived.

It is possible to use indexing to speed up resolution by a small factor. With indexing, redundant terms are shared, decreasing the total amount of memory used, and lookup of unifiable terms is faster. To realize indexing, structures such as discrimination trees19 or substitution trees18 are used. As Graf mentions19, an index could be built in advance, for instance, by adding all literals into a discrimination tree and thus enabling fast retrieval of complementary literals for unification for a given literal in a resolution step. An exemplary tree of a set of clauses can be seen in Figure 1. Traversing the tree, to for example look for a complementary literal for $B(x)$ , is faster than testing all possible literals each time a complementary candidate is required.

## Binary Resolution with Theory Unification

Standard binary resolution does not have the power to resolve clauses for which certain axioms or theorems are available, such as associativity or commutativity in an equation. Consider the two clauses $\{A(a, b)\}$ and $\{\neg A(b, a)\}$ . If commutativity holds, i.e. $A(a, b) = A(b, a)$ , binary resolution cannot resolve the clauses, since syntactic unification3, the unification without theories that binary resolution is based on, cannot change constants and does not know about the rules of commutativity. Now, a naive approach would be to generate all possible clauses that arise from the theory and then try them all until one succeeds. However, for the clause $\{A(x, y, z)\}$ that would already result in six total clauses, considering that all possible combinations that could arise from commutativity need to be covered. It is immediately clear that this does not scale. To this end, binary resolution can be coupled with equational theory unification37, also called E-unification, i.e. if a clause including an equation is given together with a certain theory that holds, a resolution can be found efficiently.

Next, E-unification is explored in more detail. The notion of most general unifier is similar to normal unification, though to a limited extent. This means that there is an ordering of substitutions — some are more general than others: For two given substitutions $\sigma$ and $\tau$ , $\sigma$ is more general if there is a $\upsilon$ where $x\tau$ = $x\sigma\upsilon$ and the given theory still holds. The substitution is the most general E-unifier, if it is also the most general substitution. Syntactic unification can have multiple most general unifiers, however this is taking into account variants or renaming variables. Ignoring this, a syntactically unifiable set of clauses has just one most general unifier13. This is comparable to unitary problems that also only have one most general E-unifier. However, and this is the main difference between syntactic unification and E-unification, there are cases in which E-unifiable clauses do not have a single most general E-unifier3. For example, the equational clause $A(x, y) = A(a, b)$ (with commutativity) has two most general E-unifiers $\{x \mapsto a, y \mapsto b\}$ and $\{x \mapsto b, y \mapsto a\}$ . Strictly speaking, none of them is then the most general E-unifier, since both are not related (e.g. if one was the instance of the other), both have the same number of substitutions and for both a more general substitution does not exist. Thus, E-unification reasons with sets of E-unifiers. A set of ordered E-unifiers is complete, if it contains all E-unifiers that are applicable to the problem or at least all most general ones, i.e. for each applicable E-unifier there is at least one more general unifier in the set. Now, a minimal set of E-unifiers only contains unifiers that cannot compared, i.e. for a given E-unifier there is no more or less general one in the set. For a given minimum set there are three types of problems3:

1. Unitary, if the theory is empty (so no additional theory holds, just equality) and there is just one most general E-unifier (which is a syntactic one, as no theory holds).
2. Finitary, if the set is finite and has more than one E-unifier. For example, commutativity is finite: There are only so many combinations for a given problem43.
3. Infinitary, if the set is infinite.
4. Zero, if the set is empty.

The type of problem is interesting since an algorithm might have to deal with having infinitely possible E-unifiers. Now, a list of theories is given in Table 1. It is interesting to note that there are some theories that are not decidable. Also, not every theory has published algorithms for E-unification. These are potentially open problems.

Theory Type Decidability Algorithm
Associativity (A) Infinitary Decidable 37
Commutativity (C) Finitary Decidable 43
Distributivity (D) Infinitary$*$ (Un) Decidable 42
(A) and (D) Unitary/Finitary Decidable 47
(A), (D) and idempotency (I) Unitary/Finitary Decidable 2
Abelian groups Unitary/Finitary Decidable 22
Commutative rings Zero Undecidable None
Boolean rings Unitary/Finitary Decidable 8
Endomorphisms Unitary/Finitary (Un) Decidable None

Table 1: List of theories with respective facts3

Types denoted by $*$ only apply to some problems such as ones containing constants or arbitrary function symbols.

Now, equational theory is restricted in that clauses need to contain some kind of equation. However, it has been found to be possible to create specific unification algorithms, similar to the ones referenced in Table 1: Z-resolution12 compiles an axiom by generating a Lisp program that can then automatically deduce that axiom. An axiom that can be Z-resolved has to be in the form of a two literal clause (also called Z-clause), which together with another clause resolves to some other clause. It can be seen immediately that this gives considerable freedom in what the clauses contain and thus the axiom they represent. In fact, they can represent theories from E-unification, such as commutativity. Compiling an axiom can be compared to a human remembering that if $A=B$ , then $B=A$ , and thus only one rule needs to be saved in memory. The other one is deduced when needed.

The idea of such specific unification algorithms is not applicable to recursive clauses. For instance, one such clause is $\{A(x), \neg A(f(x))\}$ . It resolves with itself to the new clause $\{A(x), \neg A(f(f(x)))\}$ , also called a self-resolvent. Now, it is immediately clear that there are infinite such resolvents. Generating a unification algorithm for them all is infeasable, however Ohlbach showed that abstraction trees can be used to encode those resolvents efficiently34. Furthermore, standard operations such as merge do not have to be adapted for infinite clauses. Self-resolvents can be compared to instances of the original clause. With this, a deterministic procedure for generating new ones is possible. The clauses can be added into an abstraction tree with continuations, i.e. when a node with a continuation is reached while traversing, the next level can be generated with the procedure and by appending a new tree of same structure. Thus, it is possible to represent infinite sets of unifiers.

Refer to Figure 2 for an example. The clauses $\{A(x, a)\}$ and $\{A(a, x)\}$ with associativity over A can be resolved with an infinite set of E-unifiers, e.g. $\{\{x \mapsto a\}$ , $\{x \mapsto A(a, a)\}$ , $\{x \mapsto A(a, A(a, a))\}$ , ...$\}$ . Now, when traversing the tree the continuation is generated whenever a continuation identifier, in this case denoted by a $C$ , is encountered.

## Hyper-resolution

In the context of theorem proving, especially for huge problems, the number of clauses generated by resolution may become substantially huge. In fact, the number grows exponentially in terms of when unsatisfiability\footnotemark{} can be confirmed4. \footnotetext{\ The attentive reader might question why it only grows in terms of unsatisfiability. Well, predicate logic is undecidable10 49 and thus it is not clear which complexity satisfiability to denote with4. } Additionally, part of the clauses will mostly be tangential to the proof9. Storing them wastes memory and considering them as proof candidates wastes CPU cycles, causing the theorem prover to slow down. In the following, variants of binary resolution that deal with this are explored.

A prominent one is hyper-resolution. It offers improvements in terms of speed: There are less resolvents generated and thus less steps needed to get to a result23. Published in 1965 by Robinson40, it has since then found wide use as part of SAT solvers, going as far as to make problems solvable that were previously unsolvable5. Next, a short overview is given. First, a few definitions are needed. A positive clause is a clause containing only positive literals (i.e. no literal with a negation sign). A negative clause then only contains negative ones. Since the resolvent is positive, this is also called positive hyper-resolution. The corresponding negative hyper-resolution would result in a negative resolvent. The hyper-resolution rule is now as follows. Given a list of strictly positive clauses, also called electrons, and a clause that is not positive (i.e. either completely or partly negative), also called the nucleus, the clauses (all of them) can be resolved in one step: The nucleus and the electrons are unified simultaneously. Figure 3 illustrates this. Given are the four clauses $\{\neg A(x), B(x)\}$ , $\{C(y)\}$ , $\{A(x), B(x)\}$ and $\{\neg B(z), \neg C(z)\}$ . Bold clauses are the respective nuclei and there are two steps until the empty clause is reached. Multiple electrons (or clauses) are resolved at each step. After each step the new clause is added to all clauses. It can immediately be seen that binary resolution would require three steps in this case. After unification, the negative literals of the nucleus are required to have counterparts in all of the electrons, causing all these complements to be removed. Since all negative literals are gone, a positive clause (or an empty clause) remains.

At this point, however, finding the unifier is the main computational issue, as the possible combinations grow exponentially with the amount of electrons for each negative literal in the nucleus19. For this, indexing comes into play, specifically substitution trees. In fact, many theorem provers such as LEO-III50 24 rely on substitution trees among others for speed-up. A substitution tree allows for efficient retrieval of terms unifiable with a given term. Given two substitution trees, the merge operation then returns the substitutions that are compatible, i.e. where the codomain of the same variables can be unified. This can be extended to multiple substitution trees and is called a multi-merge, which is especially relevant for hyper-resolution, since a set of electrons and a nucleus need a simultaneous unifier. Now, Graf19 proposes to keep a substitution tree for each literal of the nucleus, i.e. all required substitutions for that literal can be looked up. When the simultaneous unifier is needed, a multi-merge on the substitutions trees of all negative literals of the nucleus together with the respective electrons is executed.

Before diving deep into Unit Resulting Resolution, a tangential improvement should be noted. Often, there are multiple clauses that can be used as nucleus at the current step. However, some result in a faster resolution than others. An efficient selection can be achieved using an ordering of the literals (after unification). The idea of an ordering can be traced back to Slagle and Reynolds in 196744: For instance, clauses containing the conclusion (if looking for a certain theorem to be proven) can be prioritized that way, increasing the speed of resolution27.

## Unit Resulting Resolution

Unit resulting resolution (UR resolution) works according to the following rule. Given is a set of $n$ unit clauses (i.e. clauses containing only one literal) and another clause with $n+1$ literals, also called the nucleus, in which each unit clause has a complementary literal (with unification). After resolution with a simultaneous unifier, only one unit clause remains. For example, given the clauses $\{\neg A(z), B(x), C(y)\}$ , $\{A(x)\}$ and $\{\neg C(z)\}$ , UR resolution resolves to $\{B(x)\}$ . In contrast to hyper-resolution, the resulting unit clause may be positive or negative, since the multi-literal clause may have positive or negative literals. Now, UR resolution is a good candidate for parallelization, as Aßmann described in his PhD thesis in 19931. As a matter of fact, using multiple processes is very relevant to current developments: Latest CPUs are not improving in terms of performance due to their clock speed, but due to their core count48. Thus, focusing on this area yields favorable runtimes. Additionally, it is clear that redundancy of calculations is reduced, if for example there are multiple big clauses overlapping in terms of their literals.

Aßmann's idea consists of 3 steps. First, a clause graph is generated. It connects the literals of the nucleus to the complementary unit clauses, while also keeping track of the unifier between both. More specifically, each edge in the graph is enriched by the unifier split into two components: The positive component contains the substitutions that are applied to the positive literal of a complementary pair, the negative component the ones that are applied to the negative literal. Splitting the unifier up is useful for the actual search algorithm. A simple example can be seen in Figure 4. The dashed box represents the nucleus.

Now, for each clause (or node in the graph) a process is started that executes Algorithm 1. It should be noted that the function `smgu' returns the simultaneous most general unifier. Additionally, the part of the unifier between the nucleus and a unit clause that belongs to the nucleus is called test substitution, while the one belonging to the unit clause is the send substitution. Finally, the core of the algorithm is a two-step update from the nucleus towards the unit clauses. After the nucleus receives the unifier from a given unit clause, all other unit clauses are sent their send substitution modified with the smgu of the currently received substitutions and the test substitution (see line 11). The intuition of this is as follows. The clauses need to be kept updated with currently known substitutions. To do so, the substitution that operates on their respective literal is updated. Lastly, this loop is repeated until a simultaneous most general unifier between all substitutions is found (see line 5).

Parallel UR Resolution (PURR) by Graf and Meyer33 improves upon Aßmann's clause process by increasing the degree of parallelization even further. Now, each edge between a nucleus literal and a unit clause (instead of just a clause) in the clause graph is assigned a process — the resolution process. Its task can be compared to the inner most loop in Aßmann's clause process. Additionally, substitution trees instead of single substitutions are shared across processes. This enables the use of efficient operations such as the multi-merge operation. Lastly, the terminator process, which runs on a single clause, confirms whether a simultaneous most general unifier has been found. In detail, the resolution process now keeps substitution trees for each literal of the nucleus (except for the one its edge is connected to) cached. Whenever a substitution tree is received, it subsumes19 the cached version of it, i.e. all instances of the substitutions in the cached substitution tree are removed in the new version. The same is done the other way around: The cached version is updated in that instances of a given new substitution are removed. The union of both (so no substitutions are lost) is then merged with all other cached substitution trees, resulting in a substitution tree that contains most general unifiers between the cached substitutions and the new substitution. The unit clause literal is finally updated by applying the lightest substitutions (i.e. substitutions with the smallest amount of symbols) from the new substitution tree to its part of the substitution (recall that substitutions are kept split in two on each edge of the graph). Finally, terminator processes check for the desired simultaneous most general unifier of all clauses (similar to line 5 in Algorithm 1): If all collected substitutions for a clause can be merged, such a unifier is found. Subsumption is also applied here.

It is clear at this point that PURR heavily depends on substitutions trees. However, this also enables efficient operations such as subsumption that aid the merge operation in that unnecessary instances of substitutions are filtered beforehand and improves the performance of PURR.

## Conclusion

The introduction of the resolution principle has had a major impact — it is now a widely used technique in many areas of computer science and related fields. Performance is of particular concern, and work towards improving it has been plentiful in the past: Much attention to it was devoted by Graf in his book Term Indexing. This includes data structure such as discrimination trees, abstraction trees or substitution trees. Additionally, many algorithmic adaptions or extensions based on binary resolution have since then been proposed, such as hyper-resolution, resolution with equational theory, or unit resulting resolution.

This post covered much of that. First, the resolution principle was presented. An extension to it, binary resolution with equational theory, was explored: Now, knowledge about axioms in equational clauses can be used to create specific fast algorithms for fast resolution. Hyper-resolution, which reduces the amount of clauses generated, was demonstrated. And finally, PURR was reviewed: Unit resulting resolution combined with parallelization leads to good resource utilization and thus efficient resolution of unit clauses.

Still, more areas can be surveyed. General improvements to resolution are possible, such as faster unification algorithms29 14. As for further adaptions to classic binary resolution, there are multiple: Examples include paramodulation39, in which a clause containing an equation can be used as a substitution in another clause, linear resolution26 28 or demodulation52. Lastly, the reader interested in more related work on theorem provers could explore alternative proof procedures for (un)satisfiability, like analytic tableaux45 and its variants.

## References

1. Ulrich Assmann. A model for parallel deduction.

2. Franz Baader and Wolfram Buettner. Unification in commutative idempotent monoids.

3. Franz Baader and Wayne Snyder. Unification theory.

4. Matthias Baaz and Alexander Leitsch. Complexity of resolution proofs and function introduction.

5. Fahiem Bacchus and Jonathan Winter. Effective preprocessing with hyper-resolution and equality reduction.

6. Jon Barwise. Handbook of mathematical logic, volume~90.

7. Armin Biere. P $re, i$ cosat@ sc‚Äô09.

8. Wolfram Buttner and Helmut Simonis. Embedding boolean expressions into logic programming.

9. Chin-Liang Chang and Richard Char-Tung Lee. em Symbolic logic and mechanical theorem proving.

10. Alonzo Church. A note on the entscheidungsproblem.

11. Martin Davis and Hilary Putnam. A computing procedure for quantification theory.

12. John~K Dixon. Z-resolution: theorem-proving with compiled axioms.

13. Elmar Eder. Properties of substitutions and unifications.

14. Gonzalo Escalada-Imaz and Malik Ghallab. A practically efficient and almost linear unification algorithm.

15. Jean~H. Gallier. Logic for Computer Science: Foundations of Automatic Theorem Proving.

16. Roman Gershman and Ofer Strichman. Cost-effective hyper-resolution for preprocessing cnf formulas.

17. Eugene Goldberg and Yakov Novikov. Berkmin: A fast and robust sat-solver.

18. Peter Graf. Substitution tree indexing.

19. Peter Graf and D~Fehrer. Term indexing.

20. Marijn~JH Heule, Matti Jaervisalo, and Armin Biere. Revisiting hyper binary resolution.

21. Robert Kowalski. Predicate logic as programming language.

22. D.S. Lankford, G.~Butler, and B.~Brady. Abelian group unification algorithms for elementary terms.

23. Theodor Lettmann. Aussagenlogik: Deduktion und Algorithmen: Deduktion und Algorithmen.

24. Tomer Libal and Alexander Steen. Towards a substitution tree based index for higher-order resolution theorem provers.

25. John~W Lloyd. Foundations of logic programming.

26. Donald~W Loveland. A linear format for resolution.

27. Donald~W Loveland. Automated Theorem Proving: a logical basis.

28. David Luckham. Refinement theorems in resolution theory.

29. Alberto Martelli and Ugo Montanari. An efficient unification algorithm.

30. Yuri~V Matiyasevich and Jens~Erik Fenstad. Hilbert's tenth problem, volume 105.

31. Yuri~Vladimirovich Matiyasevich. The diophantineness of enumerable sets.

32. Bernard Meltzer. Theorem-proving for computers: some results on resolution and renaming.

33. Christoph Meyer. Parallel unit resulting resolution.

34. Hans~Juergen Ohlbach. Compilation of recursive two-literal clauses into unification algorithms.

35. Duc~Nghia Pham. Modelling and Exploiting Structures in Solving Propositional Satisfiability Problems.

36. Alain Pirotte. Automatic theorem proving based on resolution.

37. Gordon Plotkin. Building-in equational theories.

38. Alexandre Riazanov and Andrei Voronkov. Vampire.

39. G~Robinson and L~Wos. Paramodulation and theorem-proving in first-order theories with equality.

40. John~Alan Robinson. Automatic deduction with hyper-resolution.

41. John~Alan Robinson et~al. A machine-oriented logic based on the resolution principle.

42. Manfred Schmidt-Schau$\beta$. A decision algorithm for distributive unification.

43. Joerg~H. Siekmann. Unification of commutative terms.

44. James~R Slagle. Automatic theorem proving with renamable and semantic resolution.

45. Raymond~M Smullyan. First-order logic.

46. Mate Soos. The cryptominisat 5 set of solvers at sat competition 2016.

47. Mark~E. Stickel. A complete unification algorithm for associative-commutative functions.

48. Thomas~N Theis and H-S~Philip Wong. The end of moore's law: A new beginning for information technology.

49. Alan~M. Turing. Turing. on computable numbers, with an application to the entscheidungsproblem.

50. Max Wisniewski, Alexander Steen, and Christoph Benzmueller. The leo-iii project.

51. Bruno Woltzenlogel~Paleo. Leibniz's Characteristica Universalis and Calculus Ratiocinator Today.

52. LT~Wos, George~A Robinson, Daniel~F Carson, and Leon Shalla. The concept of demodulation in theorem proving.

Published on