trying to understand recursive run of generators

Question

I'm trying to figure out how to plot the run of this code onto a recursion tree, because im not quite sure how it operates, even when I'm debugging. what each yield is doing and why do I need them both?

Ive tried creating a somewhat tree that connects each run to its next recursively but I dont know what follows yield.data where the head is 'c'

class Node: def __init__(self, data, next=None): self.data = data self.next = next def get_reverse_iterator(head): if head.next: for datum in get_reverse_iterator(head.next): yield datum yield head.data lst = Node('a', Node('b', Node('c'))) for x in get_reverse_iterator(lst): print(x)

the result should be: c b a

The first loop with yield can be written shortly by yield from get_reverse_iterator(head.next). — Andrej Kesely
– Andrej Kesely, Commented Jun 21, 2019 at 18:14
Can you explain more about how you expect this to work? I don't quite understand where your understanding breaks down. — Magnus Hoff
– Magnus Hoff, Commented Jun 21, 2019 at 18:27

Booboo · Accepted Answer · 2019-06-21 22:36:42Z

To understand how it works you need to understand the basic idea of recursion. Let's suppose that we are not dealing with a generators; we just wish to print all the nodes of a list in reverse given the head node. We call the function print_reverse passing the node as the argument. If the node's next field is empty we just print the field's data value. But if next is not empty, it is pointing to a node that must be printed before the current node is printed. So we recursively call print_reverse again to first print that node. When print_reverse returns we can now print the current node. Of course, when we call print_reverse recursively to print the next node, it may discover that there is yet another node that it points to which must first be printed and we will be calling print_reverse recursively yet again. So we have:

class Node: def __init__(self, data, next=None): self.data = data self.next = next def print_reverse(head): if head.next: print_reverse(head.next) print(head.data) lst = Node('a', Node('b', Node('c'))) print_reverse(lst)

The above code must be understood before the generator problem can be understood. Instead of creating a function print_reverse that prints the node's data field, we wish instead to create a generator function that yields the value. So, it makes sense to rename the function and to replace the print function with a yield statement and the recursive call with a yield from statement:

class Node: def __init__(self, data, next=None): self.data = data self.next = next def get_reverse_iterator(head): if head.next: #print_reverse(head.next) yield from get_reverse_iterator(head.next) #print(head.data) yield head.data lst = Node('a', Node('b', Node('c')))

Now we can use the generator as in:

for x in get_reverse_iterator(lst): print(x)

or:

l = [x in get_reverse_iterator(lst)]

But an alternative to using recursion that avoids creating multiple generator objects, would be:

def get_reverse_iterator(head): stack = [] while head.next: stack.append(head) head = head.next yield head.data while len(stack): head = stack.pop() yield head.data

that's a great comparison, thank you. a small question - why do I need two yields using a generator, whereas when we were not using generators all that was needed is to print the data?
There is one generator function, get_reverse_iterator. When we were just printing we had one print statement and one recursive call back to "our self". In the generator case, the call to the print function becomes a yield statement and the recursive call to our self becomes a yield from "our self".
You "accepted the other answer and said, "brilliant." Presumably you understood the explanation. As mentioned in a previous comment, the yield from is an alternate, simplified syntax for your original for datum in get_reverse_iterator(head.next): yield datum. I think it parallels the recursive call I had in the printing example more closely besides being a simpler syntax.

Green Cloak Guy · Accepted Answer · 2019-06-21 18:43:54Z

Whenever you call the method as a generator (e.g. for x in get_reverse_iterator()), python starts executing that method line by line. Whenever it hits a yield, it stops cold and returns that. When it gets asked for a next() value in the next iteration of the for loop, it continues to execute.

This looks like a fairly straightforward linked-list-traversal idiom, where each element of the list contains data that is itself a list (or some other iterable value, like a string):

list[0].data = [1, 2, 3, 4] list[1].data = [5, 6, 7, 8] ... list[9].data = [37, 38, 39, 40]

So what the code is doing here is printing those sub-lists from the back of the main list to the front of the main list. The output should look something like this:

37 38 39 40 33 34 35 36 ... 5 6 7 8 [1, 2, 3, 4]

which becomes evident when you look at how the code executes. I'll rewrite it in words:

func get_reverse_iterator(head) { if head isn't the last element of the list, then call this function on the next element of the list (head.next) for every element in the return value of that, yield that element yield this element's data

The 'base case' is the last element of the list, which doesn't have a .next. So its data, which is iterable, gets returned to the second-to-last element. The second-to-last element yields every element of that data in turn, and then returns its own data to the third-to-last element. The third-to-last element yields every element of that data in turn, and so on, until finally you get to the first element of the list. Every single yield statement thus far has passed one element up the chain, recursively, and so that inner for loop for the first element has yielded 36 values so far. Finally, all the later elements in the list are done passing values through, and so the first element gets to the last statement of the function and yields its own data.

But there's nothing left to catch that yielded data and parse it by individual element, so it gets printed as the list it was in the first place. Or, at least, that is for my example presented above.

In your case, it's more straightforward, because when you iterate over a string each item is still a string. But it's the same thing on a smaller scale:

get_reverse_iterator() is called on the root node of lst
The root node (I'll call it NodeA) has a .next
get_reverse_iterator() is called on the next node, which I'll call NodeB
NodeB has a .next
get_reverse_iterator() is called on the next node, which I'll call NodeC
NodeC does not have a .next
get_reverse_iterator(NodeC) skips the for loop and yields NodeC.data, which is 'c'`
get_reverse_iterator(NodeB) catches 'c' inside the for loop and yields it
get_reverse_iterator(NodeA) catches 'c' inside the for loop and yields it
'c' gets assigned to x, and it gets printed.
The next iteration of the outer loop happens, and execution returns to get_reverse_iterator(NodeB)
The for loop ends, because get_reverse_iterator(NodeC) has stopped yielding things
get_reverse_iterator(NodeB) ends the for loop, exits the if block, and finally yields NodeB.data, which is 'b'
get_reverse_iterator(NodeA) catches 'b' inside the for loop and yields it
'b' gets assigned to x, and it gets printed.
The next iteration of the outer loop happens, and execution returns to get_reverse_iterator(NodeA)
The for loop ends, because get_reverse_iterator(NodeC) has stopped yielding things
get_reverse_iterator(NodeA) ends the for loop, exits the if block, and finally yields NodeA.data, which is 'a'
'a' gets assigned to x, and it gets printed
The outer for loop finishes, as get_reverse_iterator(NodeA) has stopped yielding things.

brilliant! that sums up all I needed!. one small thing - in step 12/13 can you explain why the for loop ends and exits the if block?
A for loop in python continues until there ceases to be a "next element". In step 12, what's happening is get_reverse_iterator(NodeB) asks for the next value of get_reverse_iterator(NodeC). But since get_reverse_iterator(NodeC) has completely finished its execution, there's no next value to yield, and so get_reverse_iterator(NodeB) decides the for loop is over and moves on. Because that's the only thing in the if block, it exits the if block in the process - the next line of code to be executed is the last one in the function.

Collectives™ on Stack Overflow

trying to understand recursive run of generators

2 Answers 2

3 Comments

2 Comments

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

2 Comments

Related