Thursday, January 19, 2012

Generate Subsets

Question:
How to generate a list of subsets with restrictions?
Input
{1,2,3}

Output
{{1},{2,3}}
{{2},{1,3}}
{{3},{1,2}}

Input
{1,2,3,4}

Output
{{1},{2,3,4}}
{{2},{1,3,4}}
{{3},{1,2,4}}
{{4},{1,2,3}}
{{1,2},{3,4}}
{{1,3},{2,4}}
{{1,4},{2,3}}

Input
{1,2,2,3}

Output
{{1},{2,2,3}}
{{2},{1,2,3}}
{{3},{1,2,2}}
{{1,2},{2,3}}
{{1,3},{2,2}}

Great Answer :
---------------
If you were generating all subsets you would end up generating 2n subsets for a list of length n. A common way to do this is to iterate through all the numbers i from 0 to 2n-1 and use the bits that are set in i to determine which items are in the ith subset. This works because any item either is or is not present in any particular subset, so by iterating through all the combinations of n bits you iterate through the 2n subsets.

For example, to generate the subsets of (1, 2, 3) you would iterate through the numbers 0 to 7:

0 = 000b → ()
1 = 001b → (1)
2 = 010b → (2)
3 = 011b → (1, 2)
4 = 100b → (3)
5 = 101b → (1, 3)
6 = 110b → (2, 3)
7 = 111b → (1, 2, 3)

In your problem you can generate each subset and its complement to get your pair of mutually exclusive subsets. Each pair would be repeated when you do this so you only need to iterate up to 2n-1 - 1 and then stop.

1 = 001b → (1) + (2, 3)
2 = 010b → (2) + (1, 3)
3 = 011b → (1, 2) + (3)

To deal with duplicate items you could generate subsets of list indices instead of subsets of list items. Like with the list (1, 2, 2, 3) generate subsets of the list (0, 1, 2, 3) instead and then use those numbers as indices into the (1, 2, 2, 3) list. Add a level of indirection, basically.

Here's some Python code putting this all together.

#!/usr/bin/env python

def split_subsets(items):
subsets = set()

for n in xrange(1, 2 ** len(items) / 2):
# Use ith index if ith bit of n is set.
l_indices = [i for i in xrange(0, len(items)) if n & (1 << i) != 0]
# Use the indices NOT present in l_indices.
r_indices = [i for i in xrange(0, len(items)) if i not in l_indices]

# Get the items corresponding to the indices above.
l = tuple(items[i] for i in l_indices)
r = tuple(items[i] for i in r_indices)

# Swap l and r if they are reversed.
if (len(l), l) > (len(r), r):
l, r = r, l

subsets.add((l, r))

# Sort the subset pairs so the left items are in ascending order.
return sorted(subsets, key = lambda (l, r): (len(l), l))

for l, r in split_subsets([1, 2, 2, 3]):
print l, r

Output:

(1,) (2, 2, 3)
(2,) (1, 2, 3)
(3,) (1, 2, 2)
(1, 2) (2, 3)
(1, 3) (2, 2)

Reference : Stack Overflow

No comments:

Post a Comment