
I am trying to solve the problem posed in this question. There the OP asks that, given a 15-element set, its 2^15-element power set be generated so that the subsets are ordered and grouped by cardinality (the number of elements in the subset).

He suggested that generating the power set by the binary counter method would not produce the desired order, so an alternative counting rule has to be devised. This can be done very easily in Python with a library, as other users suggested -

import itertools
# named `items` so the built-in `set` type is not shadowed
items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
for n in range(len(items) + 1):
    for subset in itertools.combinations(items, n):
        print(subset)

Alternatively, it can be done in C by writing the whole program out by hand, which is the approach I wanted to take.

I wrote the following which seems to get the job done -

#include <stdio.h>

void length (char * set, int * n)
{
    *n = 0;
    for (int i = 0; set[i] != '\0'; i++) { *n = *n + 1; }
}

void expo (int * n, int * x)
{
    *x = 1;
    for (int i = 0; i < *n; i++) { *x = *x * 2; }
}

void bin_cof (int * n, int * k, int * nCk)
{
    int _k; *nCk = 1;
    if (*k > *n - *k) { _k = *n - *k; }
    else { _k = *k; }
    for (int i = 0; i < _k; i++) { *nCk = *nCk * (*n - i) / (1 + i); }
}

void edges (char * set, char ** current, char ** last, int * n, int * k)
{
    for (int i = 0; i < *k; i++) { current[i] = set + i; }
    current[*k] = set + *n;
    for (int i = 0; i < *k; i++) { last[i] = set + i + *n - *k; }
    last[*k] = set + *n;
}

void successor (char * set, char ** current, char ** last, int * k)
{
    int con;
    for (int i = *k - 1; i >= 0; i--)
    {
        if (current[i] != last[i]) { con = i; break; }
    }
    char * i = set;
    for (;i != current[con]; i++) {}
    int poi = (int) (i - set);
    for (; con < *k; con++) { current[con] = set + 1 + poi; poi++; }
}

void gen_comb (char * set, int *n, int * k, int * nCk, char ** pointer)
{
    char * current[*k + 1], * last[*k + 1];
    edges (set, current, last, n, k);
    for (int j = 0; j < *nCk; j++)
    {
        for (int i = 0; i < *k + 1; i++) { pointer[j][i] = *(current[i]); }
        successor (set, current, last, k);
    }
}

void powerset (char * set)
{
    int n; length (set, &n);
    for (int k = 0; k <= n; k++)
    {
        int nCk; bin_cof (&n, &k, &nCk);
        char combination[nCk][k+1], * pointer[nCk];
        for (int i = 0; i < nCk; i++) { pointer[i] = combination[i]; }
        gen_comb (set, &n, &k, &nCk, pointer);
        printf ("\nSet x has %d subset with cardinality %d :\n{ %s", nCk, k, pointer[0]);
        for (int i = 1; i < nCk; i++) { printf (", %s", pointer[i]); }
        printf (" }\n");
    }
    printf ("\n");
}

int main (void)
{
    for (;;)
    {
        char set [30];
        printf ("Give me a string set, x = ");
        if (scanf("%s", set) == 0) { printf ("\nError!!"); break; }
        int n, x;
        length (set, &n); expo (&n, &x);
        printf ("\n\nIn total set x has %d subsets which are sorted in the order of low to \nhigh cardinality below ->\n ", x);
        powerset(set); printf ("\n");
    }
    return 1;
}

If I give it input ABCDEF

Give me a string set, x = ABCDEF

Then the output becomes

In total set x has 64 subsets which are sorted in the order of low to high cardinality below ->

Set x has 1 subset with cardinality 0 : { }

Set x has 6 subset with cardinality 1 : { A, B, C, D, E, F }

Set x has 15 subset with cardinality 2 : { AB, AC, AD, AE, AF, BC, BD, BE, BF, CD, CE, CF, DE, DF, EF }

Set x has 20 subset with cardinality 3 : { ABC, ABD, ABE, ABF, ACD, ACE, ACF, ADE, ADF, AEF, BCD, BCE, BCF, BDE, BDF, BEF, CDE, CDF, CEF, DEF }

Set x has 15 subset with cardinality 4 : { ABCD, ABCE, ABCF, ABDE, ABDF, ABEF, ACDE, ACDF, ACEF, ADEF, BCDE, BCDF, BCEF, BDEF, CDEF }

Set x has 6 subset with cardinality 5 : { ABCDE, ABCDF, ABCEF, ABDEF, ACDEF, BCDEF }

Set x has 1 subset with cardinality 6 : { ABCDEF }

I am able to obtain the powerset of a finite set of 15 elements grouped in a way the OP demanded they should be if I give input ABCDEFGHIJKLMNO.

In this example, each "grouping in accordance with cardinality" is in fact an array of strings, which the code represents by the matrix combination[nCk][k+1] within the function powerset(). By initializing an array of pointers * pointer[nCk] within powerset() with the addresses of the individual strings contained in the matrix, we were able to pass the matrix combination[nCk][k+1] from function to function, using the address of the array of pointers * pointer[nCk] as the reference char ** pointer.

In the example presented above for a set of cardinality 6, there are 7 matrices. Let's name them combination0[1][1], combination1[6][2], combination2[15][3], combination3[20][4], combination4[15][5], combination5[6][6], and combination6[1][7]. How can we gather all of these 7 arrays of strings into another array and use a triple pointer like char *** tripointer in order to pass that array of array of strings around from function to function, just like we did to combination[nCk][k+1] by using char ** pointer as reference?

In summary, how do I make an array of matrices of varying dimensionality and pass it around from function to function using a triple pointer?

If the matrices happened to have the same dimensionality this wouldn't be a problem at all. We could use the same method we used for combination[nCk][k+1]. But the problem arises due to the matrices not being of equal dimensionality.

7
  • 3
    If ever you think "triple pointer", your next thought should be "nope". Never say never, but triple and deeper indirection is a huge red flag. Commented Aug 17 at 13:27
  • 2
    "the OP asks that [...] it's 2^15 element power set be generated in a manner such that the elements of the power set are ordered and grouped together in accordance of their cardinality. He suggested that generating powerset by binary counter method is not going to produce his desired order." -- um, no? Maybe you linked the wrong question, but the one you did link says nothing about grouping by cardinality, and it suggests that the binary counter method would be viable. You can of course ask about variations on that, but do take care about putting words in others' mouths. Commented Aug 17 at 13:46
  • 2
    This code is unreadable. Please use consistent code formatting in a style used by other C programmers, do not invent your own coding style. And format your code consistently. Commented Aug 18 at 6:28
  • 1
    I'd probably go with a 1D array long enough to fit all strings, separated by NULLs. That would make any combination a char* (at some easy-to-compute offset from the beginning), and any set of such combinations a char**. Only double indirection. Why do you need to store combination as a 2D array without flattening it? Commented Aug 18 at 15:10
  • 1
    @user4157124 Audit items are randomly chosen by the system, unfortunately. It is a known issue with the site/system, at least among the Meta community. Commented Aug 26 at 15:53

3 Answers

7

The simplest way to handle this is to make all of the subarrays the same size.

Given n elements, you'll have n+1 for the outer dimension. The maximum size of the middle dimension will be nC(n/2), and the maximum size of the inner dimension is n+1.

So create your array as:

int max_nCk = nCk(n, n/2);
char combination[n+1][max_nCk][n+1];

For N=15, this will result in an array of about 1.6M elements. This might be big for the stack, so you can dynamically allocate it:

char (*combination)[max_nCk][n+1] = malloc(sizeof(char[n+1][max_nCk][n+1]));

Doing this will increase memory usage by a factor of just under 4, but simplifies how you work with it.


1 Comment

Alternately, with less redundancy: char (*combination)[max_nCk][n+1] = malloc((n+1) * sizeof(*combination));
6

The best way to do the allocation you asked about, for what you're using it for, is to not do it at all. Not even a little bit.

If you've got a finite number of items and need a powerset of them then mapping 1-bits in an element number is the most efficient way to express any element. Google will tell you how to iterate through numbers with k bits set. It's from hakmem iirc.

"Google it" and links are normally not great in an answer, but as an explanation, not an apology, hakmem is a damn gold mine in about the richest vein out there for little hacks like this, and studying it will train your mind.

It might be valuable to bring the old-school exposition there a few steps towards the modern age, where it's good to just expect optimizers to find the obvious speedups that only get in the way of exposition and comprehension, so:

C unsigned operations' results are defined to be bit-identical with signed two's complement (as God intended, a fact demonstrated most cogently by one of the hakmem entries). So:

/* Find the lowest set bit in a number: */
static unsigned lsb(unsigned c) { return c&-c; }

/* Find the next more-significant zero bit above the lsb: */
static unsigned nsb(unsigned c) { return lsb(c+lsb(c)); }

and to iterate to the next larger number with the same number of 1 bits you want to set the nsb, clear the lsb and shift any 1 bits between them to the bottom. It's easier to shift the lsb out at the end than isolate and reset it first.

static unsigned lsb(unsigned c) { return c & -c; }
static unsigned nsb(unsigned c) { return lsb(c + lsb(c)); }
static unsigned nextkbit(unsigned c,unsigned lim)  // lim must be 2**n-1
{   // next larger c with the same number of 1 bits, wrapping when > lim
    if (!c) return c;
    unsigned highbits = c+nsb(c) & -nsb(c) & lim;
    unsigned lowbits = (nsb(c)-1) / lsb(c);
    return nsb(c)&lim? highbits | lowbits>>1 : lowbits;
}

Now, your C implementation is doing a lot more formatting and decoration than the python one. Duplicating your python implementation in C:

#include <stdio.h>

// these four get us what `import itertools` gets the python version:
static unsigned lsb(unsigned c) { return c & -c; }
static unsigned nsb(unsigned c) { return lsb(c + lsb(c)); }
static unsigned nextkbit(unsigned c,unsigned lim)  // lim must be 2**n-1
{   // next larger c with the same number of 1 bits, wrapping when > lim
    if (!c) return c;
    unsigned highbits = c+nsb(c) & -nsb(c) & lim;
    unsigned lowbits = (nsb(c)-1) / lsb(c);
    return nsb(c)&lim? highbits | lowbits>>1 : lowbits;
}
static void print_powerset_entry(char **items, unsigned pick)
{   // try to duplicate python itertools set element printing exactly
    const char *sfx = ""; if ( pick && pick==lsb(pick) ) sfx=",";
    putchar('(');
    for ( int i=0; pick; ++i, pick>>=1 )
        if (pick&1) printf("%s%s",items[i],pick>1?", ":sfx);
    puts(")");
}

// the main event
int main(int argc, char **argv)
{
    char **items = argv+1; unsigned n=argc-1;
    const unsigned lim = (1u<<n)-1;
    for (int k=0; k<=n; ++k) {
        unsigned least = (1u<<k)-1, element = least;
        do print_powerset_entry(items,element);
        while ( k && (element=nextkbit(element,lim)) != least );
    }
}

and you run it like ./a.out {1..15} in bash to see the same results (the k-bits-set runs are sequenced differently, as if the element bits were mapped differently; sort the outputs and they match).

To get more like what your C program produces, but also handle variable-length items like the python does, try:

#include <stdio.h>
#include <stdlib.h>

// combination iteration/printing
static unsigned lsb(unsigned c) { return c & -c; }
static unsigned nsb(unsigned c) { return lsb(c + lsb(c)); }
static unsigned nextkbit(unsigned c,unsigned lim)  // lim must be 2**n-1
{   // next larger c with the same number of 1 bits, wrapping when > lim
    if (!c) return c;
    unsigned highbits = c+nsb(c) & -nsb(c) & lim;
    unsigned lowbits = (nsb(c)-1) / lsb(c);
    return nsb(c)&lim? highbits | lowbits>>1 : lowbits;
}
static void print_powerset_entry(char **items, unsigned pick)
{
    for ( unsigned i=0; pick; ++i, pick>>=1 )
        if (pick&1) printf("%s",items[i]);
}

// scaffolding+pretties implemented later
static unsigned long choose(unsigned n,unsigned k);
static void get_items_or_die(int argc, char **argv, char ***items, unsigned *nitems);

// the main event
int main(int argc, char **argv)
{
    char **items; unsigned n;
    get_items_or_die(argc,argv,&items,&n);
    printf("%u items can be chosen in %u combinations, "
        "in ascending order of cardinality they are:\n",
        n,1u<<n
    );
    const unsigned lim = (1u<<n)-1;
    for (int k=0; k<=n; ++k) {
        unsigned least = (1u<<k)-1;
        unsigned element = least;
        printf("cardinality %u has %lu subsets:\n{ ",k,choose(n,k));
        do {    printf("%s",element==least?"":",");
            print_powerset_entry(items,element);
        } while ( k && (element=nextkbit(element,lim)) != least );
        puts(" }");
    }
}

static unsigned long choose(unsigned n,unsigned k) // this is just for pretties
{
    if (k==0) return 1;
    return (n*choose(n-1,k-1))/k;
}

static void get_items_or_die(int argc, char **argv, char ***items, unsigned *nitems)
{
    // if you really want to read in the items from a file do it here
    *nitems = argc-1;
    *items = argv+1;
    if (*nitems>24) { puts("Yeah, \"no.\"."); exit(1); }
}

and it wants to be run like ./a.out do re mi fa sol la ti do.

Note that unless Python is a lot less cleverly implemented than I think it is, it never generates the full powerset itself either. Some quick checks say the C reimplementation is only about four times quicker, so itertools looks to be iterating the same way.


5

There are multiple problems in the posted code:

  • passing all variables by reference is confusing and inefficient.
  • scanf("%s", set) is risky as user input longer than 29 characters will overflow the 30-byte buffer. Use scanf("%29s", set) to avoid this problem. Also test for a return value different from 1 instead of just equal to 0, to catch the end-of-file case, for which scanf() returns EOF.
  • the pointer array defined in powerset is redundant, you can pass combination directly and adjust the prototype of gen_comb().
  • current and last should be arrays of int to simplify the code.

Here is a simplified version of the code:

#include <stdio.h>
#include <string.h>

int bin_coef(int n, int k)
{
    int nCk = 1;
    if (k > n - k) { k = n - k; }
    for (int i = 0; i < k; i++) {
        nCk = nCk * (n - i) / (1 + i);
    }
    return nCk;
}

void edges(const char *set, int *current, int *last, int n, int k)
{
    for (int i = 0; i < k; i++) { current[i] = i; }
    for (int i = 0; i < k; i++) { last[i] = i + n - k; }
}

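void successor(const char *set, int *current, int *last, int k)
{
    // find the rightmost position that can still be advanced
    int con = k - 1;
    while (con >= 0 && current[con] == last[con]) {
        con--;
    }
    if (con < 0) {
        // already at the last combination (or k == 0): nothing to advance
        return;
    }
    // advance that position and reset every position to its right
    int next = current[con] + 1;
    for (; con < k; con++) {
        current[con] = next++;
    }
}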

void gen_comb(const char *set, int n, int k, int nCk, char combination[nCk][k+1])
{
    // k + 1 entries so the arrays never have zero length when k == 0
    int current[k + 1];
    int last[k + 1];
    edges(set, current, last, n, k);
    for (int j = 0; j < nCk; j++) {
        for (int i = 0; i < k; i++) {
            combination[j][i] = set[current[i]];
        }
        combination[j][k] = '\0';
        successor(set, current, last, k);
    }
}

void powerset(const char *set)
{
    int n = strlen(set);
    int x = 1 << n;
    printf ("\n\nIn total set x has %d subsets which are sorted in the order of low to\n"
            "high cardinality below ->\n ", x);
    for (int k = 0; k <= n; k++) {
        int nCk = bin_coef(n, k);
        char combination[nCk][k+1];
        gen_comb(set, n, k, nCk, combination);
        printf("\nSet x has %d subsets with cardinality %d:\n{ %s",
               nCk, k, combination[0]);
        for (int i = 1; i < nCk; i++) {
            printf(", %s", combination[i]);
        }
        printf(" }\n");
    }
    printf("\n\n");
}

int main(int argc, char *argv[])
{
    if (argc > 1) {
        // if command line arguments were passed, use those
        // instead of prompting the user for string sets.
        for (int i = 1; i < argc; i++) {
            powerset(argv[i]);
        }
        return 0;
    }
    for (;;) {
        char set[30];
        printf("Give me a string set, x = ");
        if (scanf("%29s", set) != 1) {
            printf("\nError!!\n");
            break;
        }
        powerset(set);
    }
    return 0;
}

Here is an alternative approach: you can generate all the subsets in a single pass using the binary selection and dispatch each subset to the correct position in the array so it is sorted by length:

  • define the array of all subsets as char combination[x][n+1], where x = 2^n;
  • define an array of n+1 pointers to arrays of n+1 characters, char (*pointer[n+1])[n+1];(*) and initialize each pointer[k] to the position inside the combination array where the subsets of length k begin;
  • use an auxiliary array int count[n+1], initialized to 0, to keep track of how many subsets of each length have been stored so far;
  • this uses much less memory than defining a 3D array char combination[n+1][max_nCk][n+1]: 512KB for 15 byte sets instead of 1.6MB. Memory could be reduced by almost another 50% by making the entries k+2 bytes and cramming the subsets of k and n-k characters in the same entries.

Here is an implementation:

#include <stdio.h>
#include <string.h>

int bin_coef(int n, int k)
{
    int nCk = 1;
    if (k > n - k) { k = n - k; }
    for (int i = 0; i < k; i++) {
        nCk = nCk * (n - i) / (1 + i);
    }
    return nCk;
}

int set_subset(const char *set, int x, char *dest)
{
    int k = 0;
    for (int i = 0; x; i++) {
        if (x & 1) dest[k++] = set[i];
        x >>= 1;
    }
    dest[k] = '\0';
    return k;
}

void powerset(const char *set)
{
    int n = strlen(set);
    int x = 1 << n;
    char combination[x][n+1];
    char (*pointer[n+1])[n+1];
    int count[n+1];
    char subset[n+1];

    // initialize the pointers and counts
    for (int i = 0, k = 0; k <= n; k++) {
        pointer[k] = &combination[i];
        count[k] = 0;
        i += bin_coef(n, k);
    }
    // generate all subsets and dispatch them by size
    for (int i = 0; i < x; i++) {
        int k = set_subset(set, i, subset);
        strcpy(pointer[k][count[k]], subset);
        count[k] += 1;
    }
    // output the subsets by cardinality
    printf("\n\nIn total set %s has %d subset%.*s which are sorted in the order of low to\n"
           "high cardinality below ->\n", set, x, x != 1, "s");
    for (int k = 0; k <= n; k++) {
        int nCk = count[k];
        printf("\nSet %s has %d subset%.*s with cardinality %d:\n",
               set, nCk, nCk != 1, "s", k);
        printf("{ %s", pointer[k][0]);
        for (int i = 1; i < nCk; i++) {
            printf(", %s", pointer[k][i]);
        }
        printf(" }\n");
    }
    printf("\n");
}

int main(int argc, char *argv[])
{
    if (argc > 1) {
        // if command line arguments were passed, use those
        // instead of prompting the user for string sets.
        for (int i = 1; i < argc; i++) {
            powerset(argv[i]);
        }
        return 0;
    }
    for (;;) {
        char set[30];
        printf("Give me a string set, x = ");
        if (scanf("%29s", set) != 1) {
            printf("\nError!!\n");
            break;
        }
        powerset(set);
    }
    return 0;
}

(*) this is advanced usage of C99 variable length arrays, which might not be supported by all C compilers. The definition looks weird but can be pronounced using the spiral rule: char (*pointer[n+1])[n+1]; defines pointer as an array of n+1 pointers to arrays of n+1 bytes.

2 Comments

In your main() what is being done through the 1st loop involving argv, argc? Powerset(argv[i]), what is being done through this?
@uran: I modified the main function to handle command line arguments: if you run the program with one or more arguments, it will compute and display the powerset of each of these string arguments. It is easier to test a program with command line arguments than by typing the input interactively.
