A Look at Cartesian Products
Copyright 2012 by Shawn H Corey. Some rights reserved.
Licence under CC BY-SA 3.0
Problem: You have a number of sets and you want all the combinations when choosing one element from each set.
In mathematics, these combinations are called the Cartesian product. They are also known as cross-products. In the database world, they are sometimes called cross-joins.
This article is to show how to create them using Perl's glob
function and how to create them in a subroutine.
Perl's glob
Function
Perl's glob
uses csh
's wildcard expansion.
When given a list,
it will insert each item from the list,
one at a time,
into the specification.
For example:
$ perl -e'@a=glob(q[g{a,e,i,o,u}t]);print "@a\n"' gat get git got gut
This ability can be used to create Cartesian products:
$ perl -e'@a=glob(q[{foo,bar}{1,2,3}]);print "@a\n"' foo1 foo2 foo3 bar1 bar2 bar3
A Subroutine Using Recursion
Cartesian products are easy to do with recursion:
#!/usr/bin/env perl use 5.16.0; use strict; use warnings; use Data::Dumper; use Storable qw( dclone ); # Make Data::Dumper pretty $Data::Dumper::Sortkeys = 1; $Data::Dumper::Indent = 1; # Set maximum depth for Data::Dumper, zero means unlimited local $Data::Dumper::Maxdepth = 0; # -------------------------------------- # Name: cartesian_product # Usage: $products = cartesian_product( \@sets ); # Purpose: Generate the Cartesian products of the given sets via recursion. # Parameters: \@sets -- an AoA of the sets # Returns: $products -- an AoA of the products # sub cartesian_product { my $sets = shift @_; # step 1: handle the degenerate case(s) return [[]] unless @$sets; # step 2: reduce the problem by one my $first_set = shift @$sets; # step 3: recurse to get the partial results my $partial_products = cartesian_product( $sets ); # step 4: combine the partial results to get the full results my $products = []; for my $item ( @$first_set ){ for my $product ( @$partial_products ){ push @$products, [ $item, @$product ]; } } return $products; } # -------------------------------------- # Main my @sets = ( [qw( foo bar )], [qw( 1 2 3 )], ); my $products = cartesian_product( dclone( \@sets )); print Dumper \@sets, $products;
Main
The main part of this script creates some sets for testing similar to those used in the previous example. It uses dclone()
from Storable
to copy the sets and calls the subroutine to create the Cartesian products. This is necessary since the subroutine is destructive in its use of the sets.
Cartesian Product Subroutine
The subroutine follow the four steps of recursion.
- The first step handles the degenerate case. If there are no sets, an array holding an empty array is returned.
- The second set saves the first set.
- The third step calls the subroutine recursively. It stores the partial results.
- The final step is to combine the saved set with the partial results. This is done with a loop within a loop to create new sets by adding each element of the saved set to the sets of the partial results.
A Subroutine Using Loops
#!/usr/bin/env perl use 5.16.0; use strict; use warnings; use Data::Dumper; # Make Data::Dumper pretty $Data::Dumper::Sortkeys = 1; $Data::Dumper::Indent = 1; # Set maximum depth for Data::Dumper, zero means unlimited local $Data::Dumper::Maxdepth = 0; # -------------------------------------- # Name: cartesian_product # Usage: $products = cartesian_product( \@sets ); # Purpose: Generate the Cartesian products of the given sets via looping. # Parameters: \@sets -- an AoA of the sets # Returns: $products -- an AoA of the products # sub cartesian_product { my $sets = shift @_; my $products = [[]]; for my $set ( reverse @$sets ){ my $partial_products = $products; $products = []; for my $item ( @$set ){ for my $product ( @$partial_products ){ push @$products, [ $item, @$product ]; } } } return $products; } # -------------------------------------- # Main my @sets = ( [qw( foo bar )], [qw( 1 2 3 )], ); my $products = cartesian_product( \@sets ); print Dumper \@sets, $products;
Main
Again, the main in this script creates a set of sets for demonstrate but unlike the recursion version, this one does not need to clone it. This subroutine is non-destructive.
Cartesian Product Subroutine
The first step of this subroutine is to assume there are no sets and the products are assign an array holding only an empty array. Then the sets are scanned in reverse order. This is just to ensure the result are the same as the previous script.
The partial products are saved and a new array is created to hold the new products.
The inner two loops that follow are the same as those in the previous script. They create the final products for the set.
Conclusion
Cartesian products at first may seem daunting but all that is need is two loops and a little recursion. Or three loops, if with a little more brain power.
Al alternative approach can be found in my module, https://metacpan.org/module/Set::CartesianProduct::Lazy, which can be quite useful in certain cases, especially when the cartesian product could be substantially larger than the "source" data.
ReplyDelete