# Improving the Generation of Random Modal Formulae for Testing Decision Procedures

by
Peter F. Patel-Schneider & Roberto Sebastiani

# Abstract

The recent emergence of heavily-optimised modal decision procedures has lead to a number of generation methods for modal formulae. However, the generation methods developed so far are not satisfactory. To cope with this fact, we propose a much improved version of one of the best-known methods, the random CNF[]m test. The new method drastically reduces the influence of a major flaw of the previous method, and allows us to generate a wider variety of problems covering much more of the input space. This produces more interesting test sets for the current generation of modal decision procedures.

Talk presented at IJCAR-2001 Workshop on Issues in the Design and Experimental Evaluation of Systems for Modal and Temporal Logics, Sienna, Italy, June 2001.

# Empirical Testing of Modal Decision Procedures

Why:
to see how a decision procedure performs
How:
try the procedure on lots of formulae
Problem:
generating a reasonable test set
• reasonable difficulty
• good coverage
• no problem formulae

# Desirable Features for Good Test Sets

Significance
• significant portion of entire input space
• include different sources of difficulty
Difficulty
• problems should not be too easy to solve
• problems should not be too hard to solve
• degree of difficulty should be easy to adjust
Valid vs. non-valid balance
• should include both formulae that are valid and formulae that are not valid
• valid and non-valid formulae should have similar difficulty
Reproducibility
• tests should be easily reproducible
• test results should be easily to discern

# Achieving Desirable Features

Parameterisation
• to allow capture of significant portion of input space
• to allow for easy change of difficulty
• to allow for easy change of valid vs. non-valid balance
Control
• have parameters that monotonically adjust various features
Modal vs. Propositional Balance
• a way of capturing different sources of difficulty
Data organisation
• qualitative behaviour of the tests should be easily visible

# Problems with Test Sets

Redundancy
• formulae should not contain redundant pieces
• formulae should not contain small pieces that dictate the result on that formula
Triviality
• formulae should require the full power of the logic to solve
Artificiality
• test sets should correspond closely to application inputs
Over-size
• formulae should not be too big

# Using Randomly-generated Modal Formulae

Why:
easy to generate lots of formulae
How:
generate formulae according to some recipe, typically
• a conjunction of some number of clauses
• each clause has (usually) three disjuncts
• each disjunct is a (possibly negated) propositional variable or modal formula
• contents of each modal formula is the same as a clause, until a maximum depth is reached
Parameters:
• maximum modal depth (d)
• number of propositional variables (N)
• probability of propositional variables (p)
• number of modal boxes (M; usually 1)
• number of disjuncts in a clause (C; usually 3)
• number of clauses (L; usually varied within each test)

# Using Randomly-generated Modal Formulae

Problem:
repeated propositional variables in a clause
• reduce effective size and complexity
Solution:
don't repeat

Problem:
complementary literals in a clause
• in negated modal formulae can make entire formula unsatisfiable
Solution:
don't repeat

# Using Randomly-generated Modal Formulae

Problem:
small collection of clauses may be propositionally inconsistent
Solution:
eliminate propositional variables (except at maximum depth)
Problem:
makes the formulae unnatural and very hard
Solution:
new generation methodology

# First Change

• restrict variability in number of propositional literals per clause
• less than one different from pC
• average number of propositional variables is pC
• eliminates strictly-propositional clauses
• sharply reduces the number of problem formulae

# Old Method, Depth 2, Prop. Prob. 0.5

Median difficulty for DLP (version 4.1).

# Old Method, Depth 2, Prop. Prob. 0.5

Median difficulty for *SAT (version 1.3).

# Old Method, Depth 2, Prop. Prob. 0.5

Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

# New Method, Depth 2, Prop. Prob. 0.5

Median difficulty for DLP (version 4.1).

# New Method, Depth 2, Prop. Prob. 0.5

Median difficulty for *SAT (version 1.3).

# New Method, Depth 2, Prop. Prob. 0.5

Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

# Second Change

• allow the number of disjuncts in a clause to vary
• manner similar to previous method
• then determine the number of propositional literals in each clause based on the number of disjuncts in that particular clause
• allows for easier test sets
• allows for a new source of variability

# New Method, Depth 2, Prop. Prob. 0.5, 2.5 Clauses

Median difficulty for DLP (version 4.1).

# New Method, Depth 2, Prop. Prob. 0.5, 2.5 Clauses

Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

# Third Change

• direct specification of probability distributions
• for the number of disjuncts in a clause
• for the number of propositional literals in a clause
• can depend on modal depth and on number of disjuncts in the clause
• example with
• m=1
• C=[[1,8,1]]
• p=[[[1,0],[0,3,0],[0,3,3,0]]]
• N=3,4; d=3,4

# C=[[1,8,1]]; p=[[[1,0],[0,3,0],[0,3,3,0]]]; N=3,4; d=3,4

Median difficulty for DLP (version 4.1).

# C=[[1,8,1]]; p=[[[1,0],[0,3,0],[0,3,3,0]]]; N=3,4; d=3,4

Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

# Using the New Generator

Basic Usage
Settings
• single or adjacent clause sizes, same at all modal depths
• single or adjacent propositional disjuncts
• same propositional probability at all places
Effect
• similar to previous generators
• no propositional flaws
• also allows fractional clause sizes
Characteristics
• more interesting tests
• allows greater modal depths
• small number of parameters
Settings
• varying or unusual distributions for clause sizes or propositional probabilities
Effect
• unusual distributions, unusual results
Characteristics
• very large parameter space
• allows even large modal depths

# Improvements

• new parameters (e.g., clause size)
• many different possibilities for test sets
• more control over difficulty
• better control over modal vs. propositional balance
• all strictly-propositional problems eliminated
• trivial unsatisfiability reduced
• new hard but not big formulae

# Summary

• New generator for random modal formulae
• Benefits
• no ``errors'' in formulae
• can generate test sets with fewer problem formulae
• can generate reasonable test sets with deeper modal depth
• Drawback
• more parameters, so harder to exhaustively explore parameter space
• Available at http://ect.bell-labs.com/who/pfps/dlp