7.3 Understanding Recursive Functions: Partial Tracing

7.3 Understanding Recursive Functions: Partial Tracing#

We say that the call to sum_nested inside the for loop is a recursive function call, since it appears in the body of sum_nested itself. Such function calls are handled in the same way as all other function calls in Python, but the nature of recursion means that a single initial function call often results in many different calls to the exact same function.

When thinking about a function call on a complex nested list argument, beginners will often attempt to trace through the code carefully, including tracing what happens on each recursive call. Their thoughts go something like “Well, we enter the loop and make a recursive call. That recursive call will make this other recursive call, which will make this other recursive call,” and so on. This type of literal tracing is what a computer does, but it’s also extremely time-consuming and error-prone for humans to do.

Instead, whenever we trace a recursive function, we use the technique of partial tracing, which we’ll describe now. There are two cases.

Case 1: The input corresponds to a base case#

For example, suppose we want to trace the call sum_nested(5). The input 5 is an integer, the simplest kind of nested list. In our recursive function, the if condition is true, and so to trace our code we can simply trace the if branch, completely ignoring the else branch.

def sum_nested(obj: int | list) -> int:
    """Return the sum of the numbers in a nested list <obj>.
    """
    if isinstance(obj, int):
        # obj is an integer
        return obj
    else:
        ...

Tracing this is pretty easy: the line of code return obj simply returns the input, which was 5. This is the correct result: sum_nested(5) should return 5, since the sum of a single integer is just the integer itself.

Case 2: The input corresponds to a recursive case#

For example, suppose we want to trace the call sum_nested([1, [2, [3, 4], 5], [6, 7], 8]) and verify that the output is the correct value of 36. For this input, the if condition is false, and we need to trace the else branch of sum_nested, which is shown below:

    else:
        # obj is a list of nested lists: [lst_1, ..., lst_n]
        s = 0
        for sublist in obj:
            # each sublist is a nested list
            s += sum_nested(sublist)
        return s

This code is an instance of the accumulator pattern, in which we loop over obj, and for each value update the accumulator variable s. You’ve done this many times before, using ordinary loops. But now, at each loop iteration we make a recursive call to sum_nested. Tracing this in full detail would be time-consuming and prone to eror, because each recursive call may make its own recursive calls, as might they, and so on. We would have to trace through all of this before finally updating the accumulator s.

The key idea of partial tracing is the following: we’ll trace through the code, but every time there’s a recursive call, instead of tracing into it, we assume it is correct, and simply use the correct return value and continue tracing the rest of our code.[2] Think of this as no different than tracing code that calls a built-in method, such as list.sort. We don’t bother tracing the call to sort (and how would we, since we don’t have the code?). We just assume it will work and focus on tracing our own code.

To keep track of our partial tracing, we use a table of values, that we build up as follows.

First, we take our input obj, [1, [2, [3, 4], 5], [6, 7], 8], and identify each sub-nested-list. Note that there are only four of them (we don’t count sub-nested-lists of sub-nested-lists).

sublist

1

[2, [3, 4], 5]

[6, 7]

8
Next, beside each one we write down what sum_nested should return on each input. Remember that we aren’t doing any tracing here; instead, we’re filling this in based on the documentation for sum_nested.

sublist

sum_nested(sublist)

1

1

[2, [3, 4], 5]

14

[6, 7]

13

8

8
Finally, we trace through the code from the original else block, updating the value of the accumulator s using the above table. We show these updates in tabular form below.

sublist

sum_nested(sublist)

s

N/A

N/A

0 (initial value)

1

1

1 (s += 1)

[2, [3, 4], 5]

14

15 (s += 14)

[6, 7]

13

28 (s += 13)

8

8

36 (s += 8)

From our table, we see that after the loop completes, the final value of s is 36, and this is the value returned by our original call to sum_nested. It also happens to be the correct value!

Why does partial tracing work?#

The key idea in partial tracing is to not trace into the recursive call. Instead, we just assume it will do as promised in the docstring. How can this make any sense? Haven’t we been learning to be skeptical about our code, and to test it thoroughly? Let’s examine this reasoning carefully, to see if it holds up. We’ll put our familiar sum_nested function under the microscope:

def sum_nested(obj: int | list) -> int:
    """Return the sum of the numbers in a nested list <obj>.
    """
    if isinstance(obj, int):
        return obj
    else:
        s = 0
        for sublist in obj:
            s += sum_nested(sublist)
        return s

Lists of depth 0#

We’ll start by checking whether the function works on the simplest possible input we can give it: an integer. What happens if we call the function with, say, 3? The if-condition is satisfied, and we return 3. That’s the correct answer, according to the docstring. So great, it works on 3.

What about all the other integers? Let’s trace it on 0. The if-condition is satisfied, and we return 0, the correct answer. What about -101? The if-condition is satisfied, and we return -101. Again, this is the correct answer. Do you feel the need to check any more integers? If so, go ahead.

Eventually, it should become clear that we don’t need to check any more integers because the same thing always happens: The if-condition is satisfied, we return the integer that was passed in, and this is the correct answer. So we have convinced ourselves that sum_nested works on any integer, that is, on any nested list of depth 0. Great!

Lists of depth 1#

What about deeper lists? Let’s trace the function on a list of depth 1, say [3, 5, 9]. The if-condition is not satisfied, so we iterate over the sublists, recursing on each one, and adding up the values returned. We need to know what each recursive call will do.

The first call is nested_list(3). We already traced this, and know it will return 3.
The second call is nested_list(5). We didn’t trace this, but we already convinced ourselves that the function works on any integer, so it will return 5.
The third call is nested_list(9), and similarly, we know it will return 9.

The summing up will yield 17, which is then returned. This is the correct answer. So the function works on [3, 5, 9].

What about all the other lists of depth 1? Let’s check [10, 1, -4, 8]. Again, the if-condition is not satisfied, and we recurse on each sublist.

The first call is nested_list(10). It doesn’t matter whether or not we’ve traced this. We convinced ourselves that the function works on any integer, so it will return 10.
The second call is nested_list(1), and similarly, we know it will return the right answer, which is 1.
The third call is nested_list(-4), and again, we know it will return -4.
The fourth call is nested_list(8), and again, we know it will return 8.

The summing up will yield 15, which is the correct answer.

What about some edge cases? For a list of just one item, eg [99], if-condition is not satisfied but there will be only one recursive call, on the integer 99, which we know will correctly return 99. The summing up will yield just that, which is the correct answer. For an empty list, the if-condition is not satisfied, but there are no iterations and we return 0. Again, this is the correct answer.

Do you need to trace more examples of depth 1 lists? Perhaps not. Perhaps you are already bored. It should be coming clear that every recursive call is on a list of depth 0, and we already convinced ourselves that these always work. And since the summing up code is correct, the final answer will be correct. This means that nested_sum works on any list of depth 1.

Lists of depth 2#

Let’s go on to check lists of depth 2, such as [148, [1, 2], [], 10]. This will cause four recursive calls:

The first call is nested_list(148). It doesn’t matter whether or not we’ve traced this. We convinced ourselves that the function works on any list of depth 0, so it will return 148.
The first call is nested_list([1, 2]). It doesn’t matter whether or not we’ve traced this. We convinced ourselves that the function works on any list of depth 1, so it will return 3.
The first call is nested_list([]). This is a list of depth 1, so we know it will return the correct answer, in this case, 0.
The first call is nested_list(10). This is a list of depth 0, so we know it will return the correct answer, in this case, 10.

The summing up will yield 161, which is correct.

If you feel the need to trace more examples of depth 2, go ahead. If you are getting bored again, it’s probably because a pattern is coming clear: Every recursive call is on a list of depth either 0 or 1, and we already convinced ourselves that these always work. And since the summing up code is correct, the final answer will be correct. In other words, nested_sum works on any list of depth 2.

Lists of depth 3#

Now let’s look at lists of depth 3, such as [2, [4, 1], [[5]], 8, [1, [2], [3, 4, 5]]] This will cause five recursive calls:

The first call is nested_list(2). We convinced ourselves that the function works on any list of depth 0, so it will return 2.
The second call is nested_list([4, 1]). We convinced ourselves that the function works on any list of depth 1, so it will return 5.
The third call is nested_list([[5]]). We convinced ourselves that the function works on any list of depth 2, so it will return 5.
The fourth call is nested_list(8). We convinced ourselves that the function works on any list of depth 0, so it will return 8.
The fifth call is nested_list([1, [2], [3, 4, 5]]). We convinced ourselves that the function works on any list of depth 2, so it will return 15.

The summing up will yield 35, which is the correct answer. Again, there is a pattern here: Every recursive call is on a list of depth either 0, 1 or 2 (that’s all that can go into a list whose overall depth is 3!) and we already convinced ourselves that these always work. And since the summing up code is correct, the final answer will be correct. In other words, nested_sum works on any list of depth 3.

Lists of depth 4, and so on#

If you are getting bored again, that’s great, because there is another pattern at play here.

If we know the function works on lists of depth up to 3, the same reasoning will allow us to conclude that it works on lists of depth 4.
And if we know the function works on lists of depth up to 4, the same reasoning will allow us to conclude that it works on lists of depth 5.
And so on.

We can keep applying the same reasoning as many times as we want. In other words, for any depth (5, 22, 8103167 – any depth at all), we can continue this reasoning to show that the function works at that depth.

Or, we can skip all that, and be satisfied that the reasoning is correct and we don’t have to think through 8103166 smaller depths to believe that depth 8103167 works. We have convincingly argued that nested_sum works for any depth greater than or equal to 0!

So why does partial tracing work?#

Returning to our original, question: How can it make sense not to trace into the recursive calls – to simply assume they will work? But notice that we haven’t actually assumed the recursive calls will work. What we have done something quite different. We constructed an argument that if the recursive calls do work, the function will work.

This is induction!#

This idea is formalized in the Principle of Mathematical Induction, a formal proof technique that you may learn about in other courses, such as CSC165/240. If you’ve already encountered induction, you might like to see the structure of our argument expressed more formally:

Let C(n) represent the statement “function sum_nested works on lists of depth n.”

We showed that the following are true:

C(0)
For any k >= 0, if C(i) is true for all 0 <= i <= k, then C(k+1) is true.

By the principle of induction, we can conclude that for all n >= 0, C(n) is true.

`sublist`	`sum_nested(sublist)`	`s`
`N/A`	`N/A`	`0` `(initial value)`
`1`	`1`	`1` `(s += 1)`
`[2, [3, 4], 5]`	`14`	`15` `(s += 14)`
`[6, 7]`	`13`	`28` `(s += 13)`
`8`	`8`	`36` `(s += 8)`