I don't think you can write much faster program than you have already. The main problem is the output size: it has order of n!*2^n (number of subsets * average number of permutations for one subset), which is already > 10^9 for a string of 10 different characters.
Since STL's next_permutation adds very limited complexity for such small strings, your program's time complexity is already nearly O(output size).
But you can make your program a bit simpler. In particular, for( k =1; k<=n; k++) loop seems unnecessary: you already calculate size of subset in variable c inside. So, just have int k = c instead of if (c == k). (You'll also need to consider case of empty subset: i == 0)
edit
Actually, there's only 9864100 outputs for n == 10 (not ~ 10^9). Still, my point remains the same: your program already wastes only "O(next_permutation)" time for each output, which is very, very little.