[Termtools] A proposal for the next competition

Johannes Waldmann waldmann at imn.htwk-leipzig.de
Thu Feb 16 12:24:18 CET 2006


Claude, thanks for your suggestions on the competition.
Here's my opinion. In short:

I agree with successively increasing the time limit for unsolved
problems. In any case the tool should know the time limit
(given on the command line, as it already is)
because it might want to adapt some parameters.

I don't think I agree with cutting the time limit.


(More detail follows, and finally, a suggestion.)

Why exactly do you say "it is boring". You don't have to watch
it all day .. do you think your CPU is bored? Or overworked?
Is it a question  of computing resources? If we have,  say,
1000 problems  and 10 solvers and each one takes 1 minute,
then this gives 10000 minutes = 1 week.
That still means 51 weeks per year idle time for your computer :-)

If you don't have a dedicated server available,
then perhaps some other research group can provide one.
Possibly, we might find some industrial sponsor
(that buys us some CPU time at some server farm).


Compare to the SAT solver competition:
http://www.lri.fr/%7Esimon/satex/satex.php3
They list a total computing time of 84 days.

They also have a well thought-out system of ranking the solvers,
see http://www.satcompetition.org/2005/whatsnew05.html
(Computing Scores), perhaps we want to consider this.


There are a few more interesting suggestions there, which I think
should all be applied to the termination competition as well
since their rationale is no way problem-specific.

begin quote: ----------------------------------------------------------

* Competition Division:

... The authors must allow the SAT competition to release the source
code of their solver for research purpose on its web site.
Rationale: Every material submitted to the SAT competition will be
available for research purposes after the competition.

* Demonstration division

The demonstration division is intended to provide greater flexibility
for exposure of solvers when the authors cannot or wish not to comply
with the requirements for the competition division.

Solvers in the demonstration division might not be tested in identical
execution environments; they will not be rated or compared with other
solvers. There will not be winners. Their test results simply will be
recorded and publicized.

* Randomness and pseudorandomness

Randomness is not permitted. Each solver should produce repeatable
results when run on the same input formula with the same conditions of
time and space availability. Pseudorandomness is perfectly acceptable if
the program controls it by specifying a random seed whose calculation is
repeatable under the same conditions as just mentioned.

end quote ----------------------------------------------------------


Perhaps we can also look at other competitions
( e. g. http://www.cs.miami.edu/~tptp/CASC/20/Design.html )
from what I read, SAT is modeled after CASC
so this seems to be well-established practice.

Perhaps a representative of the SAT/CASC community
can be invited to the Workshop on Termination
to give a short (informal) talk and discuss some questions.
(very convenient since the SAT and IJCAR are also a part of FLoc).


Best regards,
-- 
-- Johannes Waldmann -- Tel/Fax (0341) 3076 6479/80 --
---- http://www.imn.htwk-leipzig.de/~waldmann/ -------



More information about the Termtools mailing list