# Psychic Modeling (Fast Version)

In Psychic Modeling, I described a reasonably understandable implementation of a ticket generator for the Psychic Modeling Problem. While this version is not overly slow, it is not amazingly fast either.

As I’m refreshing my C skills, I thought it would be interesting to try and implement a version as fast as possible.

### Design

I represent a subset as bit patterns in a 32-bits integer. This means I am limited to 32 different values (in other words, $n$ must be no larger than 32). The upside is that I have extremely fast intersection (&) and union (|) operations, among others.

### Memory Management

I use a work memory allocated at the beginning of the search; additional memory is allocated on the stack (using C99 features), and the selected tickets are just printed to avoid having to remember them.

The work memory is large enough to store $1 + \beta$ times a block large that can hold the complete set of $j$-subsets. The first block keeps the remaining $j$-subsets, and there’s an extra block for each random ticket: each time (for a total of $\beta$) a random ticket is generated, the $j$-subsets that are not covered yet is computed for this ticket; after I have generated $\beta$ tickets, I copy the work block of the best one over the first one.

I could have use just 3 blocks, a reference, the best so far, and one for the current random ticket, and copy from the current to the best each time the current ticket is better. There would be more copy operations, but perhaps less movement between the cache and the memory. The current design requires less than 2M, and only one copy operation per random ticket.

### Non portable features

I am using a few GCC built-in bit-level operations (number of bits, index of least significant 1 bit, and count of trailing zeroes); Bit Twiddling Hacks and Hacker’s Delight have portable alternatives.

I also use /dev/random as a source of random numbers; replacing dev_random by random would restore portability (but the output would always be the same, and the random state is reset when the program starts).

### Performance

So, is it fast?

The program found 71 tickets covering all 7-subsets with at least 6 numbers in less than a second. Even when the conditions are not that good, it remains fast:

Here it generated 1077 tickets using the smaller ticket size from Younas and Skiena paper; the paper had a 1080 tickets solution, so my version is effective.

Of course, it would be useless and unfair to compare the speed of this version against the numbers from the paper; more relevant is the difference with the Haskell version: while the latter was not meant to be fast, it is hundreds of times slower. I suppose it would be interesting to try and make it faster, but I suspect it would be just as ugly or uglier than the C version. And I like to keep using Haskell as a design and exploratory tool.

### Overview of the code

#### solve

The main function, solve, is more complex than in the Haskell version. It allocates the work memory, and fills it with init. A first ticket is used in init to filter out $j$-subsets.

Then the loop for the other tickets starts. It of course stops when there are no remaining $j$-subsets.

The subset of remaining numbers is computed with funion (fold union), and the digits array prepared to be used in sample. It consists of the individual bits of the number representing the remaining numbers subset. It is computed by repeatedly isolating the rightmost 1 bit (with d & -d), then clearing this bit (with d &= d -1).

A first ticket is randomly generated and its uncovered set computed. It is also set as the best new ticket (and indeed is the best so far). Then for the remaining $\beta-1$ new tickets, the uncovered set is computed as well, and if the new set is smaller than the best’s, the new ticket becomes the best as well.

The best ticket is printed, the main work memory is updated with the best uncovered set, and if there are any remaining $j$-subsets to find, we loop.

#### init

init’s purpose it to avoid wasting a loop over the $j$-subsets by merging the generation of $j$-subsets with the coverage of a first permutation (defined as [1..k] in solve). The returned value is not size of the not yet covered set of $j$-subsets.

If all tickets had to be generated randomly, 0 could be passed instead of a ticket to keep all $j$-subsets.

#### check_cover

check_cover has a similar design as init, but reads the $j$-subsets from the work memory from instead of generating them.

#### sample

sample is very similar to the Hashell version (indeed they are both based on the same algorithm); here the digits array plays the role that ds played in the Haskell version.

#### next_perm

The next_perm is from Bit Twiddling Hacks, and explained here.

#### Compiling and running

Using gcc, the necessary option is -std=c99 to activate C99 support; -O3 gives much (really) better performance, while -Wall is in general a good idea:

To run it, just pass the $n$, $k$, $j$ and $l$ parameters on the command line. There is no checks, so avoid mistakes. The program outputs the generated tickets:

### Wrapping up

After I completed the Haskell version, I found it not overly difficult to implement the C one. I was lucky to have discovered Bit Twiddling Hacks the week before; the code fragments there were very helpful in writing efficient set oriented functions over words.

Surprisingly, I had just one bug to track (I was using a variable both as parameter and temporary storage in one of the function); that was lucky as I’m not sure I could have debugged such code.