In Psychic Modeling, I described a reasonably understandable implementation of a ticket generator for the Psychic Modeling Problem. While this version is not overly slow, it is not amazingly fast either.
As I’m refreshing my C skills, I thought it would be interesting to try and implement a version as fast as possible.
Design
I represent a subset as bit patterns in a 32-bits integer. This means
I am limited to 32 different values (in other words, $n$ must be no
larger than 32). The upside is that I have extremely fast intersection
(&
) and union (|
) operations, among others.
Memory Management
I use a work memory allocated at the beginning of the search; additional memory is allocated on the stack (using C99 features), and the selected tickets are just printed to avoid having to remember them.
The work memory is large enough to store $1 + \beta$ times a block large that can hold the complete set of $j$-subsets. The first block keeps the remaining $j$-subsets, and there’s an extra block for each random ticket: each time (for a total of $\beta$) a random ticket is generated, the $j$-subsets that are not covered yet is computed for this ticket; after I have generated $\beta$ tickets, I copy the work block of the best one over the first one.
I could have use just 3 blocks, a reference, the best so far, and one for the current random ticket, and copy from the current to the best each time the current ticket is better. There would be more copy operations, but perhaps less movement between the cache and the memory. The current design requires less than 2M, and only one copy operation per random ticket.
Non portable features
I am using a few GCC built-in bit-level operations (number of bits, index of least significant 1 bit, and count of trailing zeroes); Bit Twiddling Hacks and Hacker’s Delight have portable alternatives.
I also use /dev/random
as a source of random numbers; replacing
dev_random
by random
would restore portability (but the output
would always be the same, and the random state is reset when the
program starts).
Performance
So, is it fast?
1 2 3 4 5 |
|
The program found 71 tickets covering all 7-subsets with at least 6 numbers in less than a second. Even when the conditions are not that good, it remains fast:
1 2 3 4 5 |
|
Here it generated 1077 tickets using the smaller ticket size from Younas and Skiena paper; the paper had a 1080 tickets solution, so my version is effective.
Of course, it would be useless and unfair to compare the speed of this version against the numbers from the paper; more relevant is the difference with the Haskell version: while the latter was not meant to be fast, it is hundreds of times slower. I suppose it would be interesting to try and make it faster, but I suspect it would be just as ugly or uglier than the C version. And I like to keep using Haskell as a design and exploratory tool.
Overview of the code
solve
The main function, solve
, is more complex than in the Haskell
version. It allocates the work memory, and fills it with init
. A
first ticket is used in init
to filter out $j$-subsets.
Then the loop for the other tickets starts. It of course stops when there are no remaining $j$-subsets.
The subset of remaining numbers is computed with funion
(fold
union), and the digits
array prepared to be used in sample
. It
consists of the individual bits of the number representing the
remaining numbers subset. It is computed by repeatedly isolating the
rightmost 1 bit (with d & -d
), then clearing this bit (with d &= d -1
).
A first ticket is randomly generated and its uncovered set computed. It is also set as the best new ticket (and indeed is the best so far). Then for the remaining $\beta-1$ new tickets, the uncovered set is computed as well, and if the new set is smaller than the best’s, the new ticket becomes the best as well.
The best ticket is printed, the main work memory is updated with the best uncovered set, and if there are any remaining $j$-subsets to find, we loop.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 |
|
init
init
’s purpose it to avoid wasting a loop over the $j$-subsets by
merging the generation of $j$-subsets with the coverage of a first
permutation (defined as [1..k]
in solve
). The returned value is
not size of the not yet covered set of $j$-subsets.
If all tickets had to be generated randomly, 0 could be passed instead of a ticket to keep all $j$-subsets.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
check_cover
check_cover
has a similar design as init
, but reads the
$j$-subsets from the work memory from
instead of generating them.
1 2 3 4 5 6 7 8 9 10 |
|
sample
sample
is very similar to the Hashell version (indeed they are both
based on the same algorithm); here the digits
array plays the role
that ds
played in the Haskell version.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
next_perm
The next_perm
is from Bit Twiddling Hacks, and explained here.
1 2 3 4 5 |
|
Compiling and running
Using gcc
, the necessary option is -std=c99
to activate C99
support; -O3
gives much (really) better performance, while -Wall
is in general a good idea:
1
|
|
To run it, just pass the $n$, $k$, $j$ and $l$ parameters on the command line. There is no checks, so avoid mistakes. The program outputs the generated tickets:
1 2 3 |
|
Wrapping up
After I completed the Haskell version, I found it not overly difficult to implement the C one. I was lucky to have discovered Bit Twiddling Hacks the week before; the code fragments there were very helpful in writing efficient set oriented functions over words.
Surprisingly, I had just one bug to track (I was using a variable both as parameter and temporary storage in one of the function); that was lucky as I’m not sure I could have debugged such code.
Complete Code
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 |
|