Bell Labs logo

Improving the Generation of Random Modal Formulae
for Testing Decision Procedures


by
Peter F. Patel-Schneider & Roberto Sebastiani

Abstract

The recent emergence of heavily-optimised modal decision procedures has lead to a number of generation methods for modal formulae. However, the generation methods developed so far are not satisfactory. To cope with this fact, we propose a much improved version of one of the best-known methods, the random CNF[]m test. The new method drastically reduces the influence of a major flaw of the previous method, and allows us to generate a wider variety of problems covering much more of the input space. This produces more interesting test sets for the current generation of modal decision procedures.

Talk presented at IJCAR-2001 Workshop on Issues in the Design and Experimental Evaluation of Systems for Modal and Temporal Logics, Sienna, Italy, June 2001.

Empirical Testing of Modal Decision Procedures

Why:
to see how a decision procedure performs
How:
try the procedure on lots of formulae
Problem:
generating a reasonable test set
  • reasonable difficulty
  • good coverage
  • no problem formulae

Desirable Features for Good Test Sets

Significance
  • significant portion of entire input space
  • include different sources of difficulty
Difficulty
  • problems should not be too easy to solve
  • problems should not be too hard to solve
  • degree of difficulty should be easy to adjust
Valid vs. non-valid balance
  • should include both formulae that are valid and formulae that are not valid
  • valid and non-valid formulae should have similar difficulty
Reproducibility
  • tests should be easily reproducible
  • test results should be easily to discern

Achieving Desirable Features

Parameterisation
  • to allow capture of significant portion of input space
  • to allow for easy change of difficulty
  • to allow for easy change of valid vs. non-valid balance
Control
  • have parameters that monotonically adjust various features
Modal vs. Propositional Balance
  • a way of capturing different sources of difficulty
Data organisation
  • qualitative behaviour of the tests should be easily visible

Problems with Test Sets

Redundancy
  • formulae should not contain redundant pieces
  • formulae should not contain small pieces that dictate the result on that formula
Triviality
  • formulae should require the full power of the logic to solve
Artificiality
  • test sets should correspond closely to application inputs
Over-size
  • formulae should not be too big

Using Randomly-generated Modal Formulae

Why:
easy to generate lots of formulae
How:
generate formulae according to some recipe, typically
  • a conjunction of some number of clauses
  • each clause has (usually) three disjuncts
  • each disjunct is a (possibly negated) propositional variable or modal formula
  • contents of each modal formula is the same as a clause, until a maximum depth is reached
Parameters:
  • maximum modal depth (d)
  • number of propositional variables (N)
  • probability of propositional variables (p)
  • number of modal boxes (M; usually 1)
  • number of disjuncts in a clause (C; usually 3)
  • number of clauses (L; usually varied within each test)

Using Randomly-generated Modal Formulae

Problem:
repeated propositional variables in a clause
  • reduce effective size and complexity
Solution:
don't repeat

Problem:
complementary literals in a clause
  • in negated modal formulae can make entire formula unsatisfiable
Solution:
don't repeat

Using Randomly-generated Modal Formulae

Problem:
small collection of clauses may be propositionally inconsistent
Solution:
eliminate propositional variables (except at maximum depth)
Problem:
makes the formulae unnatural and very hard
Solution:
new generation methodology

First Change

Old Method, Depth 2, Prop. Prob. 0.5

??
Median difficulty for DLP (version 4.1).

Old Method, Depth 2, Prop. Prob. 0.5

??
Median difficulty for *SAT (version 1.3).

Old Method, Depth 2, Prop. Prob. 0.5

??
Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

New Method, Depth 2, Prop. Prob. 0.5

??
Median difficulty for DLP (version 4.1).

New Method, Depth 2, Prop. Prob. 0.5

??
Median difficulty for *SAT (version 1.3).

New Method, Depth 2, Prop. Prob. 0.5

??
Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

Second Change

New Method, Depth 2, Prop. Prob. 0.5, 2.5 Clauses

??
Median difficulty for DLP (version 4.1).

New Method, Depth 2, Prop. Prob. 0.5, 2.5 Clauses

??
Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

Third Change

C=[[1,8,1]]; p=[[[1,0],[0,3,0],[0,3,3,0]]]; N=3,4; d=3,4

??
Median difficulty for DLP (version 4.1).

C=[[1,8,1]]; p=[[[1,0],[0,3,0],[0,3,3,0]]]; N=3,4; d=3,4

??
Fraction of formulae that can be determined unsatifiable (or satisfiable) by purely propositional means.

Using the New Generator

Basic Usage
Settings
  • single or adjacent clause sizes, same at all modal depths
  • single or adjacent propositional disjuncts
  • same propositional probability at all places
Effect
  • similar to previous generators
  • no propositional flaws
  • also allows fractional clause sizes
Characteristics
  • more interesting tests
  • allows greater modal depths
  • small number of parameters
Advanced Usage
Settings
  • varying or unusual distributions for clause sizes or propositional probabilities
Effect
  • unusual distributions, unusual results
Characteristics
  • very large parameter space
  • allows even large modal depths

Improvements

Summary