Lazy Cartesian Product: Iterating combinations of arrays without calculating them all first.

posted 2012-Feb-27

Summary

The nth entry in the Cartesian product of arrays can be calculated directly (without pre-computing all entries) as:

That looks imposing when formally written, but what it says is, “For each array, divide the index desired by the lengths of all later arrays multiplied together, round the result down to the nearest integer, and then divide by the length of this array and use the remainder as the index into this array.”

In practice, you can pre-compute the dividends and moduli in a single pass backwards through the array. After that, computing the nth entry is just a matter of a map:

# Pseudocode to precalculate factors; $sets is an array of arrays
set $divs  = []
set $mods  = []
set factor = 1
for i from $sets.length-1 downto 0:
  set items to $sets[i].length
  $dividends[i] = factor 
  $moduli[i]    = items
  factor        = factor * items

# Map each 
def entry(n):
  set combo = []
  for i from 0 upto $sets.length-1:
    combo[i] = $sets[i][ floor(n/$dividends[i]) % $moduli[i] ]
  return combo

JavaScript Code

Here’s an implementation in JavaScript that lets you iterate through a Cartesian product of arrays forward, backward, pick out random items, and stop at any time:

// sets: an array of arrays
function LazyProduct(sets){
  for (var dm=[],f=1,l,i=sets.length;i--;f*=l){ dm[i]=[f,l=sets[i].length] }
  this.length = f;
  this.item = function(n){
    for (var c=[],i=sets.length;i--;)c[i]=sets[i][(n/dm[i][0]<<0)%dm[i][1]];
    return c;
  };
};

// Usage, assuming count/kinds/moods are all arrays
var combos = new LazyProduct( [count,kinds,moods] );

// Print out ten random items
for (var i=0;i<10;++i){
  var combo = combos.item( Math.floor( Math.random()*combos.length ) );
  console.log( combo ); // Triplet of an item from count, kinds, or moods
}

// Iterate backwards, stopping when you find what you want
for (var i=combos.length;i--;){
  var combo = combos.item(i);
  if () break;
}

Alternatively, here is a much faster version if you only need to iterate through all combinations in ‘normal’ order:

// sets: an array of arrays
// f: your callback function
// context: [optional] the `this` to use for your callback
function lazyProduct(sets,f,context){
  if (!context) context=this;
  var p=[],max=sets.length-1,lens=[];
  for (var i=sets.length;i--;) lens[i]=sets[i].length;
  function dive(d){
    var a=sets[d], len=lens[d];
    if (d==max) for (var i=0;i<len;++i) p[d]=a[i], f.apply(context,p);
    else        for (var i=0;i<len;++i) p[d]=a[i], dive(d+1);
    p.pop();
  }
  dive(0);
}

// Usage, assuming count/kinds/moods are all arrays
lazyProduct( [count,kinds,moods], function(n,kind,mood){
  // Your function is yielded unique combinations of values from the arrays
  console.log(n,kind,mood);
});

Background and Motivation

The Cartesian product of multiple sets of values is the set of all entries that have exactly one value from each set (in the order specified). For example, here is the Cartesian product of three arrays:

  var mood = ["happy","sad","lonely"];
  var mean = ["nice","mean"];
  var kind = ["cats","dogs","hogs"];
  var combos = cartesianProduct( mood, mean, kind ); // 18 ordered combinations
  combos.forEach(function(a){ console.log(a.join(' ')) });
  // "happy nice cats"
  // "happy nice dogs"
  // "happy nice hogs"
  // "happy mean cats"
  // …
  // "lonely nice hogs"
  // "lonely mean cats"
  // "lonely mean dogs"
  // "lonely mean hogs"

A quick web search reveals a good JavaScript implementation of cartesianProduct(), returning a full array of arrays.

Sometimes, however, you want to run through the combinations without generating them all first. For example, say that you’re looking for any 7-digit number of non-zero digits where the sum of digits adds up to 13 (e.g. 1111117).

var digits=[1,2,3,4,5,6,7,8,9];
var combos = cartesianProduct(digits,digits,digits,digits,digits,digits,digits);
for (var i=0,len=combos.length;i<len;++i){
  if (sum(combos[i])==13){
    console.log(combos[i]);
    break;
  }
}
function sum(a){
  for (var s=0,i=a.length;i--;) s+=a[i];
  return s;
}

The above code must create an array of 4,782,969 arrays before you even begin looking through it. Computers are fast, and memory is cheap, but that’s sort of ridiculous. The 7th combination you check is the winner.

Instead—using the LazyProduct constructor above, you can write almost the same code…

var combos = new LazyProduct([digits,digits,digits,digits,digits,digits,digits]);
for (var i=0,len=combos.length;i<len;++i){
  var combo = combos.item(i);
  if (sum(combo)==13){
    console.log(combo);
    break;
  }
}

…and it will start almost immediately and be done before you can even blink.

Ruby supports lazy Cartesian products out of the box using Array#product:

digits.product(digits,digits,digits,digits,digits,digits) do |nums|
  if nums.inject(:+)==13
    p nums
    break
  end
end
Isak
05:27PM ET
2012-Jul-12

I like the approach. Regarding the ruby version, however, it is not lazy, because Array#product returns an array.

See http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-product

Gavin Kistner
09:26PM ET
2012-Jul-12

@Isak The non-block form returns an array, but the block form does not, and is lazy. From the docs we both l linked to:

"If given a block, product will yield all combinations and return self instead."
Isak
03:18PM ET
2012-Jul-22

Ah you are right, my mistake. I was thinking of the clojure type of lazy where you return a lazy sequence. Here is what I ended up going with in ruby:

def lazy_cartesian_product colls colls.first.enum_for :product, *colls.drop(1) end

net.mind details contact résumé other
Phrogz.net