The problem is finding those 3-5 items. Choosing every possible combination of 5 would take too long. Choosing every combination of 3 is fast enough, but leaves a lot of extras. The simple solution is to first try all the combos of 3, then try all the combos of 4 on the remainder, then try all the combos of 5 on the remainder of that. The smaller and smaller pile makes the larger and larger choices feasible.
My boss, who is a competent practical programmer, suggests we have 3 procedures: One that does a 3 nested loop, one that does a 4, and one that does a 5. Like this (in Tcl):
set items {apple orange banana grape strawberry} set count 5 for {set i 0} {$i < $count} {incr i} { for {set j [expr $i + 1]} {$j < $count} {incr j} { for {set k [expr $j + 1]} {$k < $count} {incr k} { set string [lindex $items $i] lappend string [lindex $items $j] lappend string [lindex $items $k] puts $string } } }
That's just the 3 level loop because the other two look almost the same. The 4 and 5 level loops are identical except for having additional levels. As a programmer who values elegance over readability, the phrase "identical except for" is a red flag. Why have 3 separate procedures when all you are adjusting is a single parameter? What I need is a new control structure that is basically a for loop but lets me control how many nested fors there are.
Tcl makes it easy to create new control structures. Here's the control structure definition:
# Usage: # indexcount - how many items are being chosen # itemcount - how many items are being chosen from # indexvar - variable to hold current combination of indexes # body - code to execute for each combination proc chooseloop {indexcount itemcount indexvar body} { if {$indexcount > $itemcount} { error "More indexes than items" } if {$indexcount == 0} { return } set indexes {} for {set i 0} {$i < $indexcount} {incr i} { lappend indexes $i } set maxindexval [expr $itemcount - 1] while {1} { # make new body that sets indexvar first set newbody "set $indexvar {$indexes}\n$body" # do this iteration uplevel 1 [list eval $newbody] # find incrementable index set found no for {set i 0} {$i < $indexcount} {incr i} { set index [lindex $indexes end-$i] set thismax [expr $maxindexval - $i] if {$index < $thismax} { set found yes break } } if {!$found} { break } # increment this index and set all following ones set incrindex [expr ($indexcount - 1) - $i] set precedingval [lindex $indexes $incrindex] for {set j $incrindex} {$j < $indexcount} {incr j} { lset indexes $j [expr $precedingval + 1] set precedingval [lindex $indexes $j] } } }
Now to get my 3 level loop, I can call like this:
chooseloop 3 5 indexes { set str "" foreach index $indexes { append str "[lindex $items $index] " } puts $str }
In order to do 3, then 4, then 5, I can call like this:
for {set i 3} {$i < 6} {incr i} { chooseloop $i 5 indexes { set str "" foreach index $indexes { append str "[lindex $items $index] " } puts $str } puts "--" }
The output of that last one is:
apple orange banana apple orange grape apple orange strawberry apple banana grape apple banana strawberry apple grape strawberry orange banana grape orange banana strawberry orange grape strawberry banana grape strawberry -- apple orange banana grape apple orange banana strawberry apple orange grape strawberry apple banana grape strawberry orange banana grape strawberry -- apple orange banana grape strawberry --
Tcl represent.
ReplyDeleteI thought you'd like that.
ReplyDeleteAnd the great news is that for a given nesting level it only takes 10 times longer than the nested loop version! There's nothing Tim loves more than putting elegance over speed, so I expect a big raise.
I made this program mainly as a joke, but actually now I'm a little puzzled: Why exactly is it so slow? If I put the exprs in {}, I shave off about 10-20%, I can see that.
ReplyDeleteThe main problem seems to be looping through the indexes in the caller's body. If I could somehow set some named variables, I think it would be nearly as fast as the nested version. But of course I'd have to have a constant-size list of named variables, which doesn't help me.