[Termtools] A proposal for the next competition

Wed Feb 15 17:18:14 CET 2006

Dear Claude,

I like the idea of successive increasing the time for the problems. I
think the fast tools should profit from this. That's why I would propose
a slightly different approach with two rounds.

1) first round: with 5 seconds time-out
2) second round: run each tool on the problems that this tool has not
yet solved with time-out timeLeft/problemsLeft

Thereby fast tools would profit. For example if Aprove solves 600 of 750
problems within 1 second average, it would have 700min/150 = 4.6min for
each problem in the second round.

Kind regards,
Joerg

Claude Marche wrote:
> Hi all,
> 
> I would like to propose, for the next competition, a new system for
> limiting the time of search.
> 
> The general idea comes from two facts which happened in the last
> competition:
> 
> 1) on a majority of examples, some tools are able to solve the problem
>    in a very short time, less than 1 second, whereas others tools are
>    not able to solve the problem and thus, in general, spend the full
>    time allowed which was 60 seconds, to finally find nothing. 
> 	=> *This is boring*
> 
> 2) there were examples which were not solved by any tool (91 over 773
>    in the standard TRS category), for which it would be interesting to
>    know whether increasing the time limit would help.
> 
> Consequently, I propose a new way to run the competition, which would
> give more time to try to solve the "difficult" problems. The idea is
> to run several global rounds :
> 
> 1) first round would be as the last competition, running every tool
> on every examples, but with a shorter time limit, say 5 seconds or
> even less. We could expect this first round to run quite quickly, not
> more than one day probably.
> 
> Then from this first round, some of the problems would be definitively
> considered as "solved". In a first approximation, each problem solved
> by at least one tool could be considered as solved. But this could
> relaxed a bit, I will discussed that below.
> 
> If you look at last year's results, the first examples which would be
> unsolved would be :
> 
> TRS/AProVE - AAECC-ring : solved by none
> TRS/AProVE - AAECC : solved only by aprove but in 14 sec.
> TRS/AProVE - JFP_Ex31: solved only by TEPARLA but in 40 sec.
> 
> 2) second round would be then to rerun the tools but only on
>    non-solved problems, for a larger time limit, say 1 minute.
> 
>    some new problems would be considered solved, like 
> 
>    TRS/AProVE - AAECC 
>    TRS/AProVE - JFP_Ex31 
> 
> 3) a third round could be made, for say a 5 minutes timeout
> 
> We may run rounds over and over like this, by multiplying the time
> limit by some coefficient each round (so we would have a
> non-terminating termination competition !)
> 
> Now, let me relax a bit the condition on which a problem is considered
> solved. Look at for example 
> 
> TRS/AProVE - JFP_Ex51
> 
> it is solved in 4.82 sec by CiME, 6.45 by TTT and 9.05 by aprove. That
> would mean that it would considered "solved" in the first round, and
> TTT and aprove would be considered unable to solve it. To avoid this
> kind of unfair phenomenon, We could consider a problem solved if some
> stricter conditions occur :
> 
> . the time for solving it should be clearly less than the time limit,
>   say less than half of it.
> . maybe there should be at least 2 tools able to solve it, or 3.
> 
> Please remember that the idea behind this is to allow more time to
> spend on "difficult" examples, so I am only trying to find a criterion
> for defining "difficult".
> 
> I'd like to hear your opinions on this proposal, in particular from
> last year participants. Do you see a major drawback ? Do you have any
> better idea ?
> 
> - Claude
> 
> 

_______________________________________________
Termtools mailing list
Termtools at serveur-listes.lri.fr
http://serveur-listes.lri.fr/mailman/listinfo/termtools