[Termtools] A proposal for the next competition

Wed Feb 15 16:42:03 CET 2006

Hi all,

I would like to propose, for the next competition, a new system for
limiting the time of search.

The general idea comes from two facts which happened in the last
competition:

1) on a majority of examples, some tools are able to solve the problem
   in a very short time, less than 1 second, whereas others tools are
   not able to solve the problem and thus, in general, spend the full
   time allowed which was 60 seconds, to finally find nothing. 
	=> *This is boring*

2) there were examples which were not solved by any tool (91 over 773
   in the standard TRS category), for which it would be interesting to
   know whether increasing the time limit would help.

Consequently, I propose a new way to run the competition, which would
give more time to try to solve the "difficult" problems. The idea is
to run several global rounds :

1) first round would be as the last competition, running every tool
on every examples, but with a shorter time limit, say 5 seconds or
even less. We could expect this first round to run quite quickly, not
more than one day probably.

Then from this first round, some of the problems would be definitively
considered as "solved". In a first approximation, each problem solved
by at least one tool could be considered as solved. But this could
relaxed a bit, I will discussed that below.

If you look at last year's results, the first examples which would be
unsolved would be :

TRS/AProVE - AAECC-ring : solved by none
TRS/AProVE - AAECC : solved only by aprove but in 14 sec.
TRS/AProVE - JFP_Ex31: solved only by TEPARLA but in 40 sec.

2) second round would be then to rerun the tools but only on
   non-solved problems, for a larger time limit, say 1 minute.

   some new problems would be considered solved, like 

   TRS/AProVE - AAECC 
   TRS/AProVE - JFP_Ex31 

3) a third round could be made, for say a 5 minutes timeout

We may run rounds over and over like this, by multiplying the time
limit by some coefficient each round (so we would have a
non-terminating termination competition !)

Now, let me relax a bit the condition on which a problem is considered
solved. Look at for example 

TRS/AProVE - JFP_Ex51

it is solved in 4.82 sec by CiME, 6.45 by TTT and 9.05 by aprove. That
would mean that it would considered "solved" in the first round, and
TTT and aprove would be considered unable to solve it. To avoid this
kind of unfair phenomenon, We could consider a problem solved if some
stricter conditions occur :

. the time for solving it should be clearly less than the time limit,
  say less than half of it.
. maybe there should be at least 2 tools able to solve it, or 3.

Please remember that the idea behind this is to allow more time to
spend on "difficult" examples, so I am only trying to find a criterion
for defining "difficult".

I'd like to hear your opinions on this proposal, in particular from
last year participants. Do you see a major drawback ? Do you have any
better idea ?

- Claude

-- 
| Claude Marché           | mailto:Claude.Marche at lri.fr |
| LRI - Bât. 490          | http://www.lri.fr/~marche/  |
| Université de Paris-Sud | phoneto: +33 1 69 15 64 85  |
| F-91405 ORSAY Cedex     | faxto: +33 1 69 15 65 86    |
_______________________________________________
Termtools mailing list
Termtools at serveur-listes.lri.fr
http://serveur-listes.lri.fr/mailman/listinfo/termtools