Some of the key points that came out of the ensuing discussions relate to organisational details (e.g. how it will be decided which competitor skips which puzzle set) and to how exactly the scoring will work (e.g. how to compensate for any possible imbalance between different sets). These questions are being followed up: the proposed solutions are currently being reviewed by quite a few people before they get published. So bear with us; the proposal should be communicated publicly within a day or two.
In today's post I want to clarify the process of creating the puzzles for this "80 puzzles" concept. A number of comments, questions and concerns in the feedback on the initial post suggested that the puzzle sets provided by the author teams will be very different in style or in difficulty, or that they will all contain the same types and be boring, or that they will contain only abstract innovations that will not be very entertaining. These comments seemed to assume that the four authors are working in a completely independent and uncontrolled way, with no coordination to ensure that their product meets Championship standards (or, if the authors are speaking to each other, that they know each other’s plans in advance, which would be unfair). These assumptions are largely untrue, and hopefully the description of our internal process helps everybody understand why.
Selection of authors
I (Zoltan N here) had a long list of potential contributors, based on their history as competitors and puzzle authors. Since I also coordinated last year’s 24HPC, where the majority of the puzzle authors were international, it was easy to ensure that all authors of 80P have already proven that they can contribute a single set of high-quality puzzles while working remotely with a core team against well-defined requirements (details). While 24HPC is very different from, and much smaller in scope than, a WPC, a track record of smooth cooperation and positive feedback from competitors on the actual puzzles is certainly a good indicator of further success in working with these authors again.
Once the authors confirmed that they were happy to participate in this programme, I provided them with thorough guideline documentation on the expectations for the puzzle set that they had volunteered to provide. The guidelines were similar in nature to the 24H guidelines, although probably a little more restrictive. Requirements include:
- Specifying the framework (60 minutes, 20 puzzles).
- Puzzle sets should offer a balanced set of puzzles in terms of difficulty, ranging from easy puzzles that should be accessible for any WPC competitor or even for the public, to difficult puzzles that challenge even serious title contenders.
- Puzzle sets should offer a balanced set of puzzles on the scale of novelty, ranging from very traditional and well-known types to new variations or innovations.
- Puzzle sets should include puzzles from all available genres, e.g. magic squares, pathfinders, arithmetics, crosswords, paintings and logic, just to name a few. That is, a range of solving skills should be required to do well in these sets.
- Since a separate World Sudoku Championship is held just a few days before the WPC, Sudoku and its closest variations should be avoided.
Coordinating puzzle types
The first deliverable for the authors was a list of their planned puzzle types and/or the structure of their set, without creating any actual puzzles at this stage. Once all four plans were received, we collated them into a single sheet. We were trying to ensure that:
- The genre variety and novelty range are right
- There is no puzzle type overlap between any two sets within 80P
- There is no puzzle type overlap between 80P sets and the types of puzzles that would play a crucial role in the rounds designed by us, if any
Any duplicates were communicated to the affected authors, who were asked to find a replacement, without revealing who else got priority on that particular puzzle type, i.e. we only sent “don’t include Battleships” types of messages. In some cases this took quite a few iterations. This process ensures that authors do not know any substantial information about the puzzle sets of the other authors ("someone else seems to be considering Battleships" is probably not an overwhelming amount of information).
Although we ended up being lenient and allowed some exceptions to the overlap constraints above, for reasons that include puzzle set theme, sheer puzzle beauty, etc., there is no significant overlap between the four sets as a result.
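The overlap check described above can be sketched in a few lines of Python. This is only an illustration, not the actual sheet we used: the author labels, the planned type lists (apart from Battleships) and the `priority` order are hypothetical, and the post does not say how priority was really decided.

```python
from collections import defaultdict


def find_overlaps(plans):
    """Map each puzzle type planned by more than one author to those authors."""
    authors_by_type = defaultdict(set)
    for author, types in plans.items():
        for puzzle_type in types:
            # Normalise names so "Battleships" and "battleships" count as one type.
            authors_by_type[puzzle_type.strip().lower()].add(author)
    return {t: sorted(a) for t, a in authors_by_type.items() if len(a) > 1}


def replacement_requests(plans, priority):
    """Build anonymous "don't include X" messages for authors who lose priority.

    `priority` is an assumed ordering of authors (earlier = keeps the type).
    """
    requests = defaultdict(list)
    for puzzle_type, authors in find_overlaps(plans).items():
        keeper = min(authors, key=priority.index)
        for author in authors:
            if author != keeper:
                # The message never names the author who keeps the type.
                requests[author].append(f"don't include {puzzle_type}")
    return dict(requests)


# Hypothetical plans, for illustration only.
plans = {
    "author_a": ["Battleships", "Kakuro", "Slitherlink"],
    "author_b": ["battleships", "Masyu"],
    "author_c": ["Kakuro", "Tapa"],
}
priority = ["author_a", "author_b", "author_c"]
requests = replacement_requests(plans, priority)
```

Each author only ever sees their own list of requests, which mirrors the point that nobody learns who else planned the same type.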
Testing the sets
For testing, at least three people from the core team solved each of the sets under competition conditions. Test solvers were Pal and the three Zoltans, i.e. team HUN-A that finished 4th at last year's WPC. We recorded all our solving times for each of the puzzles and provided these results to the authors as collated feedback (each author only received feedback about their own set, obviously).
There were instances where some or all of us found certain puzzles so difficult that they were not much fun to solve. There were other instances where we felt that a puzzle was not suitable for the circumstances. In these cases, we asked the authors to adjust and/or replace those puzzles. (I need to point out that all the authors did a very mature and quality job, so it was only a minority of the puzzles in any of the sets that we had to ask to be tweaked or replaced.)
As a result of these iterations, we feel that the puzzle sets are now balanced on difficulty and on all other measures as much as it is reasonable to expect. In particular, the solving times of each test solver fall within a 10% range across the sets. This is surely not a lot of data points to go by, but it is also how we timed WPC rounds in 2005 and 2011 and 24HPC rounds in many other years, and those events were (mostly) reasonably predictable in how rounds were scored and timed.
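As a rough illustration of the 10% criterion, here is a minimal Python sketch. It assumes one particular reading of "within a 10% range": for each test solver, the gap between their slowest and fastest set is at most 10% of the fastest. The solver labels and times are made up for the example and are not our real test data.

```python
def relative_spread(times):
    """Gap between slowest and fastest time, relative to the fastest."""
    fastest, slowest = min(times), max(times)
    return (slowest - fastest) / fastest


def sets_look_balanced(times_by_solver, tolerance=0.10):
    """True if every solver's per-set total times stay within the tolerance."""
    return all(
        relative_spread(list(per_set.values())) <= tolerance
        for per_set in times_by_solver.values()
    )


# Hypothetical total solving times in minutes, for illustration only.
times_by_solver = {
    "solver_1": {"set_1": 50.0, "set_2": 52.0, "set_3": 54.0, "set_4": 51.0},
    "solver_2": {"set_1": 60.0, "set_2": 63.0, "set_3": 61.0, "set_4": 65.0},
}
balanced = sets_look_balanced(times_by_solver)
```

Comparing each solver only against their own times keeps differences in solver speed from being mistaken for differences in set difficulty.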
In case the difficulty of the sets still turns out to show some differences, that's where the scoring recommendation, details of which are to be posted in a day or two as discussed above, will come into play.
To ensure a consistent graphic appearance of puzzles during the entire Championship, the core team has agreed to re-draw all the puzzles that the 80 puzzles’ authors create. We also review the instructions of the puzzles, for a similar reason.