
Sets don't have a deterministic order in Python. Why then can you do tuple unpacking on a set in Python?

To demonstrate the problem, take the following in CPython 3.10.12:

```python
a, b = {"foo", "bar"}  # may set a = "bar", b = "foo"
a, b = {"foo", "baz"}  # may set a = "foo", b = "baz"
```

I recognize that the literal answer is that Python tuple unpacking works on any iterable. For example, you can do the following:

```python
def f():
    yield 1
    yield 2

a, b = f()
```

But why is there not a check used by tuple unpacking that the thing being unpacked has deterministic ordering?

  • Why do you call it "tuple unpacking"? Commented Jan 15 at 21:10
  • It works for the same reason you can do for i in {"foo", "bar"}:. Even though the order isn't deterministic, we allow you to iterate over sets. It's your responsibility to understand that a and b will get arbitrary elements of the set. Commented Jan 15 at 21:23
  • I don't understand your concern. If there are two things packed in a box, you can unpack them. Determinism isn't even a remote consideration for unpacking. Commented Jan 15 at 21:26
  • @Zags 1) That's still a terrible site no one should use. 2) They show it with a tuple, so yeah, in that case it's tuple unpacking. But you don't have a tuple, so it's not, and that page doesn't apply. 3) Google can't find "tuple unpacking" anywhere in the Python docs. Commented Jan 15 at 21:26
  • (BTW, please consider "that's a terrible site" echoed re: w3schools; in the community where I personally spent a long time offering support, it was one of the two worst offenders in terms of spreading outdated or simply wrong information -- the other being TLDP. While W3Schools' name may imply affiliation with the W3C, they have no such affiliation and are not an official resource for anything.) Commented Jan 15 at 22:39

3 Answers


Python just doesn't have a way to perform the check you want. It'd certainly help avoid some bugs, if it existed, and you could do a, b = tuple(whatever) if you really wanted to unpack a semantically unordered iterable, but the necessary API to perform the check doesn't exist.

It'd be a lot of work to implement such a check. It'd require a lot of manually-written __isordered__ methods, and probably a hook at the level of individual generator functions to get things right for generators, and you still wouldn't be able to get it right for generator expressions. It'd cost extra execution time to perform the checks, too.
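To make the cost concrete, here is a minimal sketch of what such a check would have to look like in user code today. The helper name `strict_unpack` is invented for illustration; Python has no such hook, and note that this denylist approach is exactly backwards from a real protocol: it can only reject types it knows about, which is why the answer's hypothetical `__isordered__` method would need to be implemented everywhere.

```python
# Hypothetical helper (not part of Python): refuse to unpack iterables
# that are semantically unordered. A denylist like this only catches the
# types it names -- a dict view, a generator over a set, etc. slip through.
def strict_unpack(iterable):
    if isinstance(iterable, (set, frozenset)):
        raise TypeError("refusing to unpack an unordered iterable")
    return iterable

a, b = strict_unpack(["x", "y"])    # fine: lists have a defined order
# a, b = strict_unpack({"x", "y"})  # would raise TypeError
```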

Even then, you'd still have cases like a, b, c = [x**2 for x in some_set], where the unpacking works even though the underlying iteration was over an unordered object.
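That loophole is easy to demonstrate; assuming a small example set, the comprehension materializes a list (in arbitrary order), so any ordering check on the unpacked object itself would pass:

```python
some_set = {1, 2, 3}

# The comprehension iterates the unordered set, but produces a list,
# so the unpack target is an "ordered" type and would evade the check.
a, b, c = [x**2 for x in some_set]

print(sorted((a, b, c)))  # [1, 4, 9] -- which name got which value is arbitrary
```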



The core "why" is: Because all features start at -100 points, and nobody thought it was worth preventing sets from being used in this context.

Every new feature costs developer resources to write it, write tests for it, code review it, and then maintain it forever. There has to be a significant benefit to the feature to justify it. "Preventing people from doing something that is potentially useful in niche contexts to avoid (possibly accidental) misuse in other niche contexts" is essentially neutral on pros and cons.

You could propose a feature that would enable this. If someone came up with a significant benefit that would not only cancel out the -100 points all features start at, but also cancel out the negative points applied because this would definitely break existing code in use right now, then they might deprecate (with a warning) iterable unpacking using sets and other unordered iterables, and in a year or three some new version of Python could eventually forbid it. I don't see it happening.

Fundamentally:

  1. It is useful for sets to be iterable (even if unordered iteration is bad in your opinion, sorted(someset) relies on being able to iterate sets to produce the list that it then sorts). So that's not going away.
  2. Iterable unpacking applies to all iterables; you'd need to special-case unordered iterables to explicitly block it.
  3. You'll never prevent all forms of the misuse you seem to dislike; something as simple as a, b = list(theset) will keep the "misuse" from being detected.
  4. There are always valid use cases this would needlessly block, e.g. checking for a single-element set by unpacking, [obj] = theset (with try/except to handle sets of any other size), or destructively pulling one arbitrary element at a time, e.g. first, *collection = collection (where collection starts as a set but becomes a list as a side effect).
  5. Even if they put in an explicit means of detecting unordered iterables, e.g. an __unordered__ attribute on the class with C level support so it can be checked efficiently, that's still slowing down a highly optimized code path for little to no benefit.
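Point 4's single-element idiom is worth spelling out: unpacking doubles as an assertion that the set has exactly one member, with ValueError signalling any other size. A minimal sketch (the helper name `sole_element` is mine, not a standard function):

```python
# Single-element unpacking as a size check: [obj] = s succeeds only
# when s yields exactly one item; otherwise it raises ValueError.
def sole_element(s):
    try:
        [obj] = s
    except ValueError:
        raise ValueError(f"expected exactly one element, got {len(s)}")
    return obj

print(sole_element({"only"}))  # only
```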

So it's a feature that would never catch all cases of "misuse", slows down legitimate uses in order to prevent it, breaks existing code, and is only arguably a benefit in the first place. So they haven't done it, and almost certainly never will.

20 Comments

It's not really about preventing misuse - it's about preventing accidental bugs. Like how if (a=b) in C is occasionally reasonable, but almost always a bug. Most people trying to unpack a set are probably making a mistake.
@user2357112: I don't consider the two uses that different, but sure. I've definitely used unpacking with sets on purpose, and I've never done it by accident to my knowledge, simply because I don't use sets that often in the first place, and the uses are often pretty short-lived, so there's little opportunity to make a mistake like that. I'd be against such a change for lack of consistency; iterable unpacking works with arbitrary iterables, and putting in extra effort to stop it in a subset of cases makes the construct harder to reason about.
I'm not sure the history/direction is correct. Unpacking didn't always support all iterables. Before Python 1.5, "the object had to be a tuple" (doc). And 1.5 then said it must be a "sequence". So at some point, somehow, support for sets was added. So 'worth preventing sets from being used" might be backwards. Might've been a question of "worth supporting sets" there.
@nocomment: sets as a top-level built-in didn't exist until 2.4 (with 2.3 introducing them as a dedicated module that was discarded in short order). The iterator protocol didn't exist until 2.2; before then, iteration only worked via __getitem__, that's why sequences were the only option. But AFAICT, when the iterator protocol was introduced in 2.2, it was immediately applied to support unpacking; when sets were introduced, they were iterable, so they got the behavior for free, and it would have taken extra work to prevent it.
@dumbass I just tried it with CPython 3.13, sum of ['a'] * 300000 took about 0.16 seconds and ['a'] * 600000 took about 0.64 seconds. Very much looks quadratic. (And of course even linear time is still O(n²)). Code.

Turning the question around, why should an order be required to do unpacking?

```python
import random

def foo(a, b):
    return (a, b) if random.random() >= 0.5 else (b, a)
```

2 Comments

I have absolutely no idea why this would be useful. Unpacking implies it could be stable (obviously it isn't with sets) but this is just unworkable?
it certainly implies it could be stable (and imo it's likely any implementation which is really taking advantage of it is at least worse off for being a less-obvious solution than expressly shuffling, using the set directly, etc.), but I don't see why it should be required just to do the unpack when it clearly works fine and is implemented in a very friendly and generic way without adding additional runtime checks
