I've recently gotten into investigating how various data structures are implemented in Python in order to make my code more efficient. In investigating how lists and deques work, I found that I can get benefits when I want to shift and unshift, reducing the time from O(n) with lists to O(1) with deques (lists being implemented as dynamic arrays, so inserting something at the front forces every existing element to be shifted over, etc...). What I can't seem to find are the specifics of how a deque is implemented, and the specifics of its downsides vs. lists. Can someone enlighten me on these two questions?
4 Answers
https://github.com/python/cpython/blob/v3.8.1/Modules/_collectionsmodule.c
A dequeobject is composed of a doubly-linked list of block nodes.
So yes, a deque is a (doubly-)linked list as another answer suggests.
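As a rough illustration (a simplified Python sketch, not the actual C implementation; the names Block and BLOCKLEN are only loosely modeled on _collectionsmodule.c):

```python
# Simplified sketch of CPython's block-based deque layout.
# The real implementation is in C; this is illustrative only.
BLOCKLEN = 64  # CPython uses 64 slots per block

class Block:
    def __init__(self):
        self.data = [None] * BLOCKLEN  # fixed-size slot array
        self.prev = None               # link to the block on the left
        self.next = None               # link to the block on the right

# A deque keeps pointers to the leftmost and rightmost blocks plus
# indices into them, so append/appendleft just fill a free slot and
# only allocate a new block when an end block is full -- which is why
# both ends are amortized O(1).
```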
Elaborating: what this means is that Python lists are much better for random access and fixed-length operations, including slicing, while deques are much more useful for pushing and popping things off the ends; indexing (but not slicing, interestingly) is possible on a deque, but slower than with lists.
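A quick demonstration of that indexing-but-not-slicing behavior:

```python
from collections import deque
import itertools

d = deque(range(10))
print(d[3])  # indexing works -> 3 (O(1) near the ends, O(n) in the middle)

try:
    d[2:5]   # slicing is not supported on deques
except TypeError:
    print("slicing raised TypeError")

# To take a "slice" of a deque, use itertools.islice instead:
print(list(itertools.islice(d, 2, 5)))  # -> [2, 3, 4]
```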
7 Comments

- deque is definitely the right way to go.
- deques in CPython don't really handle thread safety either; they just benefit from the GIL making their operations atomic (and in fact, append and pop from the end of a list have the same protections).
- In practice, if you're just using a stack, both list and deque have effectively identical performance in CPython; the block allocations are more frequent with deque (but not plain-linked-list frequent; you'd only end up allocating/freeing every time you crossed a 64-member boundary in the CPython implementation), but the lack of huge intermittent copies compensates.

Check out collections.deque. From the docs:
Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.
Just as it says, using pop(0) or insert(0, v) incurs a large penalty with list objects. You can't use slicing on a deque (and indexing away from the ends is slow), but you can use popleft/appendleft, operations deque is optimized for. Here is a simple benchmark to demonstrate this:
import time
from collections import deque

num = 100000

def append(c):
    for i in range(num):
        c.append(i)

def appendleft(c):
    if isinstance(c, deque):
        for i in range(num):
            c.appendleft(i)
    else:
        for i in range(num):
            c.insert(0, i)

def pop(c):
    for i in range(num):
        c.pop()

def popleft(c):
    if isinstance(c, deque):
        for i in range(num):
            c.popleft()
    else:
        for i in range(num):
            c.pop(0)

for container in [deque, list]:
    for operation in [append, appendleft, pop, popleft]:
        c = container(range(num))
        start = time.time()
        operation(c)
        elapsed = time.time() - start
        print("Completed %s/%s in %.2f seconds: %.1f ops/sec"
              % (container.__name__, operation.__name__, elapsed, num / elapsed))

Results on my machine:
Completed deque/append in 0.02 seconds: 5582877.2 ops/sec
Completed deque/appendleft in 0.02 seconds: 6406549.7 ops/sec
Completed deque/pop in 0.01 seconds: 7146417.7 ops/sec
Completed deque/popleft in 0.01 seconds: 7271174.0 ops/sec
Completed list/append in 0.01 seconds: 6761407.6 ops/sec
Completed list/appendleft in 16.55 seconds: 6042.7 ops/sec
Completed list/pop in 0.02 seconds: 4394057.9 ops/sec
Completed list/popleft in 3.23 seconds: 30983.3 ops/sec

10 Comments
- list appends are slightly faster than deque appends.
- You can use a deque just as you would a list.
- list pops were slower than deque's (likely due to list's higher cost of intermittently resizing as it shrinks, where deque is just freeing blocks back to a free list or small-object pool), so when selecting a data structure for a stack (a.k.a. LIFO queue), the empty-to-full-to-empty performance looks slightly better for deque (averaging 6365K ops/sec for append/pop vs. list's 5578K ops/sec). I suspect deque would do slightly better in the real world, as deque's freelist means growing for the first time is more expensive than growing after shrinking.
- deque will not actually free up to 16 blocks (module-wide, not per deque), instead putting them in a cheap array of available blocks for reuse. So when growing a deque for the first time, it always has to pull new blocks from malloc (making append more expensive), but if it's constantly expanding for a bit, then shrinking for a bit, and back and forth, it will usually not involve malloc/free at all so long as the length stays roughly within a range of 1024 elements (16 blocks on the free list, 64 slots per block).

The documentation entry for deque objects spells out most of what you need to know, I suspect. Notable quotes:
Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
But...
Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.
I'd have to take a look at the source to tell whether the implementation is a linked list or something else, but it sounds to me as though a deque has roughly the same characteristics as a doubly-linked list.
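The ends-vs-middle claim is easy to check empirically. Here is a quick sketch using timeit (absolute numbers will vary by machine; only the relative gap matters):

```python
# Compare indexed access at the end of a deque vs. in the middle.
# Per the docs, the former is O(1) and the latter O(n), since the
# middle can only be reached by walking the chain of blocks.
import timeit
from collections import deque

d = deque(range(1_000_000))

end_time = timeit.timeit(lambda: d[0], number=1000)
mid_time = timeit.timeit(lambda: d[500_000], number=1000)

print(f"end:    {end_time:.6f}s")
print(f"middle: {mid_time:.6f}s")  # typically orders of magnitude slower
```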
In addition to all the other helpful answers, here is some more information comparing the time complexity (Big-Oh) of various operations on Python lists, deques, sets, and dictionaries. This should help in selecting the right data structure for a particular problem.
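As one concrete example of the kind of difference such a comparison highlights: membership testing is O(n) for a list but O(1) on average for a set or dict. A small timeit sketch (timings are machine-dependent):

```python
# Membership test: list (linear scan) vs. set (hash lookup).
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

needle = n - 1  # worst case for the list: the last element
list_time = timeit.timeit(lambda: needle in as_list, number=100)
set_time = timeit.timeit(lambda: needle in as_set, number=100)

print(f"list membership: {list_time:.6f}s")
print(f"set membership:  {set_time:.6f}s")  # typically far faster
```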