[Termtools] A proposal for the next competition

Wed Feb 15 17:43:25 CET 2006

Hi all,

Claude and I discussed some of these matters before, and we decided that
Claude will formulate his proposal about time limits, and I will
formulate my proposal about the output format. So here is my proposal.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+

It turned out that several proofs generated in the competition were
wrong. Of course this should be avoided, and a natural way to do so is
to do some verification on the generated proofs. For sure time is not
yet ready to require full formal verification, but a first step should
be appropriate. Until now any kind of proof sketch was allowed, and
completely left to the tool authors. For instance a tool may refer to
unpublished techniques with only a vague reference, while the details of
the technique are left unclear. To avoid this my proposal for the 2006
competition is as follows:

* In the generated proofs only established techniques are allowed. Any
established technique either has been published (in a journal or
conference paper, or technical report, or in some text on some ones home
page), or is found on a list of techniques archived by the competition
organization. Just like secret problems and the latest tool versions,
participants may submit techniques / theorems to this list just before
the start of the competition, by which new techniques can be kept secret
until the start of the competition. For all established techniques the
criteria should be clear and mathematically precise. In this way for
future errors it should be possible to detect whether the technique (or
a particular case of it) itself is wrong (and hence every proof
referring to it), or the criteria were not fulfilled.

The reference to these techniques should be clear. This can be done by
adjusting the tool output, but alternatively this can be done by
separately formulating the underlying theorems as they are used.

To my opinion this would be a valuable first step towards verifying
generated proofs. For the competition participants I have the following
questions: 
* Do you agree? 
* Are you willing to accept these rules and adjust your tool and/or
submit a list of techniques used? 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+

About Claude's proposal: I am not in favour of it. It will complicate
the rules with hardly any result on the outcome. Moreover, it would
stimulate the tool developers to tune their tool not only for one time
limit but for several time limits. The main motivation for the whole
competition is to stimulate people to work in this area and develop  and
extend techniques. I have the feeling that this goal is not served by
complicated rules and tuning towards several time limits.

Results on systems in TPDB that were not solved by any tool within 1
minute may be interesting, but I expect there will be only few of such
results. May be if one has results of this type (like Dieter and
Johannes solved z086.srs, see RTA open problem 104) it would be nice to
add a reference to such a result in the competition result page.

            Best regards, Hans Zantema.

_________________________________________________________
Dr Hans Zantema            
Technische Universiteit Eindhoven, Department of Computer Science 
P.O. Box 513, 5600 MB Eindhoven, The Netherlands 
e-mail: H.Zantema at tue.nl, homepage: www.win.tue.nl/~hzantema
office: Hoofdgebouw room 6.73, tel: (040)2472749