Understanding Traceback in Python

Friday, December 31, 2021

Last Updated on December 24, 2021

When an exception occurs in a Python program, a traceback is often printed. Knowing how to read the traceback can help you identify the error quickly and make a fix. In this tutorial, we are going to see what the traceback can tell you.

After completing this tutorial, you will know:

  • How to read a traceback
  • How to print the call stack without an exception
  • What is not shown in the traceback

Let’s get started.

Understanding Traceback in Python
Photo by Marten Bjork, some rights reserved

Tutorial Overview

This tutorial is divided into 4 parts; they are:

  1. The call hierarchy of a simple program
  2. Traceback upon exception
  3. Triggering traceback manually
  4. An example in model training

The call hierarchy of a simple program

Let’s consider a simple program:

def indentprint(x, indent=0, prefix="", suffix=""):
    if isinstance(x, dict):
        printdict(x, indent, prefix, suffix)
    elif isinstance(x, list):
        printlist(x, indent, prefix, suffix)
    elif isinstance(x, str):
        printstring(x, indent, prefix, suffix)
    else:
        printnumber(x, indent, prefix, suffix)

def printdict(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + "{")
    for n, key in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(x[key], indent+2, str(key)+": ", comma)
    print(spaces + "}" + suffix)

def printlist(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + "[")
    for n, item in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(item, indent+2, "", comma)
    print(spaces + "]" + suffix)

def printstring(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + '"' + str(x) + '"' + suffix)

def printnumber(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + str(x) + suffix)

data = {
    "a": [{
        "p": 3, "q": 4,
        "r": [3,4,5],
    },{
        "f": "foo", "g": 2.71
    },{
        "u": None, "v": "bar"
    }],
    "c": {
        "s": ["fizz", 2, 1.1],
        "t": []
    },
}

indentprint(data)

This program prints the Python dictionary data with indentation. Its output is the following:

{
  a: [
    {
      p: 3,
      q: 4,
      r: [
        3,
        4,
        5
      ]
    },
    {
      f: "foo",
      g: 2.71
    },
    {
      u: None,
      v: "bar"
    }
  ],
  c: {
    s: [
      "fizz",
      2,
      1.1
    ],
    t: [
    ]
  }
}

This is a short program, but its functions call each other. If we add a line at the beginning of each function, we can reveal how the output is produced as control flows through them:

def indentprint(x, indent=0, prefix="", suffix=""):
    print(f'indentprint(x, {indent}, "{prefix}", "{suffix}")')
    if isinstance(x, dict):
        printdict(x, indent, prefix, suffix)
    elif isinstance(x, list):
        printlist(x, indent, prefix, suffix)
    elif isinstance(x, str):
        printstring(x, indent, prefix, suffix)
    else:
        printnumber(x, indent, prefix, suffix)

def printdict(x, indent, prefix, suffix):
    print(f'printdict(x, {indent}, "{prefix}", "{suffix}")')
    spaces = " " * indent
    print(spaces + prefix + "{")
    for n, key in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(x[key], indent+2, str(key)+": ", comma)
    print(spaces + "}" + suffix)

def printlist(x, indent, prefix, suffix):
    print(f'printlist(x, {indent}, "{prefix}", "{suffix}")')
    spaces = " " * indent
    print(spaces + prefix + "[")
    for n, item in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(item, indent+2, "", comma)
    print(spaces + "]" + suffix)

def printstring(x, indent, prefix, suffix):
    print(f'printstring(x, {indent}, "{prefix}", "{suffix}")')
    spaces = " " * indent
    print(spaces + prefix + '"' + str(x) + '"' + suffix)

def printnumber(x, indent, prefix, suffix):
    print(f'printnumber(x, {indent}, "{prefix}", "{suffix}")')
    spaces = " " * indent
    print(spaces + prefix + str(x) + suffix)

The output is now cluttered with the extra information:

indentprint(x, 0, "", "")
printdict(x, 0, "", "")
{
indentprint(x, 2, "a: ", ",")
printlist(x, 2, "a: ", ",")
  a: [
indentprint(x, 4, "", ",")
printdict(x, 4, "", ",")
    {
indentprint(x, 6, "p: ", ",")
printnumber(x, 6, "p: ", ",")
      p: 3,
indentprint(x, 6, "q: ", ",")
printnumber(x, 6, "q: ", ",")
      q: 4,
indentprint(x, 6, "r: ", "")
printlist(x, 6, "r: ", "")
      r: [
indentprint(x, 8, "", ",")
printnumber(x, 8, "", ",")
        3,
indentprint(x, 8, "", ",")
printnumber(x, 8, "", ",")
        4,
indentprint(x, 8, "", "")
printnumber(x, 8, "", "")
        5
      ]
    },
indentprint(x, 4, "", ",")
printdict(x, 4, "", ",")
    {
indentprint(x, 6, "f: ", ",")
printstring(x, 6, "f: ", ",")
      f: "foo",
indentprint(x, 6, "g: ", "")
printnumber(x, 6, "g: ", "")
      g: 2.71
    },
indentprint(x, 4, "", "")
printdict(x, 4, "", "")
    {
indentprint(x, 6, "u: ", ",")
printnumber(x, 6, "u: ", ",")
      u: None,
indentprint(x, 6, "v: ", "")
printstring(x, 6, "v: ", "")
      v: "bar"
    }
  ],
indentprint(x, 2, "c: ", "")
printdict(x, 2, "c: ", "")
  c: {
indentprint(x, 4, "s: ", ",")
printlist(x, 4, "s: ", ",")
    s: [
indentprint(x, 6, "", ",")
printstring(x, 6, "", ",")
      "fizz",
indentprint(x, 6, "", ",")
printnumber(x, 6, "", ",")
      2,
indentprint(x, 6, "", "")
printnumber(x, 6, "", "")
      1.1
    ],
indentprint(x, 4, "t: ", "")
printlist(x, 4, "t: ", "")
    t: [
    ]
  }
}

So now we know the order in which each function is invoked. This is the idea of a call stack: at any point in time, when we run a line of code in a function, we want to know what invoked this function.
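
To make the idea concrete, here is a minimal sketch, not part of the original program, that asks Python which functions led to the current one. It relies on traceback.extract_stack() from the standard library; the function names outer, middle, and inner are hypothetical and chosen only for illustration.

```python
import traceback

def inner():
    # extract_stack() returns the current call stack, oldest call first;
    # the last entry is this function itself
    stack = traceback.extract_stack()
    return [frame.name for frame in stack]

def middle():
    return inner()

def outer():
    return middle()

names = outer()
# The last three entries are the chain of calls that led into inner()
print(names[-3:])
```

The last three names are outer, middle, and inner, in that order. Anything before them depends on how the script itself was started.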

Traceback upon exception

If we make one typo in the code like the following:

def printdict(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + "{")
    for n, key in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(x[key], indent+2, str(key)+": ", comma)
    print(spaces + "}") + suffix

The typo is on the last line: the closing parenthesis should be at the end of the line, not before the +. The print() function returns None, and adding a string to None triggers an exception.
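
The failure can be reproduced in isolation. The following sketch is separate from the program above and only shows that print() returns None and that adding a string to None raises a TypeError; the variable names are illustrative.

```python
# print() writes its argument to the screen and returns None
result = print("}")

try:
    result + ","            # the same as None + ","
except TypeError as exc:
    message = str(exc)

print(message)
```

The captured message is the same text that appears on the last line of the traceback.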

If you run the program with this typo using the Python interpreter, you will see the following:

{
  a: [
    {
      p: 3,
      q: 4,
      r: [
        3,
        4,
        5
      ]
    }
Traceback (most recent call last):
  File "tb.py", line 52, in <module>
    indentprint(data)
  File "tb.py", line 3, in indentprint
    printdict(x, indent, prefix, suffix)
  File "tb.py", line 16, in printdict
    indentprint(x[key], indent+2, str(key)+": ", comma)
  File "tb.py", line 5, in indentprint
    printlist(x, indent, prefix, suffix)
  File "tb.py", line 24, in printlist
    indentprint(item, indent+2, "", comma)
  File "tb.py", line 3, in indentprint
    printdict(x, indent, prefix, suffix)
  File "tb.py", line 17, in printdict
    print(spaces + "}") + suffix
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

The lines starting with “Traceback (most recent call last):” are the traceback. It is the call stack of your program at the moment it encountered the exception. In the example above, the traceback is in “most recent call last” order, so the top-level call is at the top and the call that triggered the exception is at the bottom. Hence we know the issue is inside the function printdict().

Usually you will see the error message at the end of the traceback. In this example, it is a TypeError triggered by adding None and a string. But the traceback’s help stops here: you need to figure out which operand is None and which is the string. By reading the traceback from the bottom up, we also know that the exception-triggering function printdict() was invoked by indentprint(), which was in turn invoked by printlist(), and so on.
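
The same information can also be read programmatically. The sketch below assumes a small hypothetical call chain (run, load, parse) rather than the program above, and uses traceback.extract_tb() on the exception's __traceback__ attribute to list the functions in the same "most recent call last" order that the printed traceback uses.

```python
import traceback

def parse(text):
    return int(text)        # raises ValueError for non-numeric text

def load(text):
    return parse(text)

def run():
    try:
        load("abc")
    except ValueError as exc:
        # One entry per frame, from the frame that caught the exception
        # down to the frame that raised it
        frames = traceback.extract_tb(exc.__traceback__)
        return [f.name for f in frames], type(exc).__name__

calls, error = run()
print(" -> ".join(calls))
print(error)
```

Reading the list from right to left gives the same chain you would follow in the printed traceback from the bottom up.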

If you run the same faulty program in a Jupyter notebook, the output is the following:

{
  a: [
    {
      p: 3,
      q: 4,
      r: [
        3,
        4,
        5
      ]
    }
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2508041071.py in <module>
----> 1 indentprint(x)

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in indentprint(x, indent, prefix, suffix)
      1 def indentprint(x, indent=0, prefix="", suffix=""):
      2     if isinstance(x, dict):
----> 3         printdict(x, indent, prefix, suffix)
      4     elif isinstance(x, list):
      5         printlist(x, indent, prefix, suffix)

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in printdict(x, indent, prefix, suffix)
     14     for n, key in enumerate(x):
     15         comma = "," if n!=len(x)-1 else ""
---> 16         indentprint(x[key], indent+2, str(key)+": ", comma)
     17     print(spaces + "}") + suffix
     18 

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in indentprint(x, indent, prefix, suffix)
      3         printdict(x, indent, prefix, suffix)
      4     elif isinstance(x, list):
----> 5         printlist(x, indent, prefix, suffix)
      6     elif isinstance(x, str):
      7         printstring(x, indent, prefix, suffix)

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in printlist(x, indent, prefix, suffix)
     22     for n, item in enumerate(x):
     23         comma = "," if n!=len(x)-1 else ""
---> 24         indentprint(item, indent+2, "", comma)
     25     print(spaces + "]" + suffix)
     26 

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in indentprint(x, indent, prefix, suffix)
      1 def indentprint(x, indent=0, prefix="", suffix=""):
      2     if isinstance(x, dict):
----> 3         printdict(x, indent, prefix, suffix)
      4     elif isinstance(x, list):
      5         printlist(x, indent, prefix, suffix)

/var/folders/6z/w0ltb1ss08l593y5xt9jyl1w0000gn/T/ipykernel_37031/2327707064.py in printdict(x, indent, prefix, suffix)
     15         comma = "," if n!=len(x)-1 else ""
     16         indentprint(x[key], indent+2, str(key)+": ", comma)
---> 17     print(spaces + "}") + suffix
     18 
     19 def printlist(x, indent, prefix, suffix):

TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

The information is essentially the same, but it also shows you the lines before and after each function call.

Triggering traceback manually

The easiest way to print a traceback is to add a raise statement to manually create an exception. But this will also terminate your program. If we want to print the stack at any time even without any exception, we can do so like the following:

import traceback

def printdict(x, indent, prefix, suffix):
    spaces = " " * indent
    print(spaces + prefix + "{")
    for n, key in enumerate(x):
        comma = "," if n!=len(x)-1 else ""
        indentprint(x[key], indent+2, str(key)+": ", comma)
    traceback.print_stack()    # print the current call stack
    print(spaces + "}" + suffix)

The line traceback.print_stack() will print the current call stack.
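
A few variations may be useful in practice. The sketch below, with hypothetical functions worker and manager, assumes you want the stack on standard output rather than the default standard error, and only the innermost frames. It uses the limit and file arguments of traceback.print_stack(), and traceback.format_stack() to get the same text as strings.

```python
import sys
import traceback

def worker():
    # Print only the two innermost frames, to stdout instead of stderr
    traceback.print_stack(limit=2, file=sys.stdout)
    # format_stack() returns the same text as a list of strings
    return "".join(traceback.format_stack(limit=2))

def manager():
    return worker()

text = manager()
```

Keeping the stack as a string makes it easy to send to a log file instead of the console.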

More often, though, we want to print the stack only when there is an error, so that we can learn more about its cause. The more common use case is the following:

import traceback
import random

def compute():
    n = random.randint(0, 10)
    m = random.randint(0, 10)
    return n/m

def compute_many(n_times):
    try:
        for _ in range(n_times):
            x = compute()
        print(f"Completed {n_times} times")
    except Exception:
        print("Something wrong")
        traceback.print_exc()

compute_many(100)

This is a typical pattern for evaluating a function repeatedly, as in a Monte Carlo simulation. If we are not careful, we may run into an error; in the example above, a division by zero occurs whenever m is 0. In a more complicated computation the flaw is not easy to spot because, as here, the issue is buried inside the call to compute(). It is therefore helpful to see how we arrived at the error, while still handling it rather than letting the entire program terminate. When we catch an exception with try-except, the traceback is not printed by default, so we need to call traceback.print_exc() to print it manually.
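
If you would rather keep the traceback than print it, for example to write it to a log, traceback.format_exc() returns the same text as a string. The following is a minimal sketch with a deterministic division by zero; the helper safe_compute() and the two-argument compute() are hypothetical and differ from the random version above.

```python
import traceback

def compute(n, m):
    return n / m

def safe_compute(n, m):
    try:
        return compute(n, m), None
    except ZeroDivisionError:
        # format_exc() returns what print_exc() would have printed
        return None, traceback.format_exc()

value, report = safe_compute(1, 0)
print(report)
```

The string starts with the usual "Traceback (most recent call last):" header and ends with the ZeroDivisionError message.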

We can also make the traceback more elaborate. Because the traceback is the call stack, we can examine each frame in it and check the local variables at each level. In complicated cases, this is the function I usually use for a more detailed trace:

def print_tb_with_local():
    """Print the stack trace with local variables. Call this while an
    exception is being handled. Output goes to stderr via print().
    """
    import traceback, sys
    tb = sys.exc_info()[2]
    stack = []
    while tb:
        stack.append(tb.tb_frame)
        tb = tb.tb_next    # tb_next is an attribute, not a method
    traceback.print_exc()
    print("Locals by frame, most recent call last", file=sys.stderr)
    for frame in stack:
        print("Frame {0} in {1} at line {2}".format(
            frame.f_code.co_name,
            frame.f_code.co_filename,
            frame.f_lineno), file=sys.stderr)
        for key, value in frame.f_locals.items():
            # keep the value on the same line as the variable name
            print("\t%20s = " % key, end="", file=sys.stderr)
            try:
                print(repr(value), file=sys.stderr)
            except Exception:
                print("<error while printing value>", file=sys.stderr)
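
The standard library offers a shorter route to similar output. As an alternative sketch, not the function above, traceback.TracebackException.from_exception() accepts capture_locals=True and then includes the local variables of each frame in the formatted traceback. The functions divide() and report_with_locals() are hypothetical.

```python
import traceback

def divide(numerator, denominator):
    return numerator / denominator

def report_with_locals():
    try:
        divide(5, 0)
    except ZeroDivisionError as exc:
        # capture_locals=True stores the repr() of each frame's locals
        te = traceback.TracebackException.from_exception(
            exc, capture_locals=True)
        return "".join(te.format())

text = report_with_locals()
print(text)
```

Under each frame, the formatted text lists the local variables, such as numerator = 5 and denominator = 0 for divide().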

An example in model training

The call stack reported in the traceback has a limitation: you can only see Python functions. That is fine for programs you wrote yourself, but many large Python libraries have parts written in another language and compiled into binary. An example is TensorFlow, whose underlying operations are compiled for performance. Hence, if you run the following code, you will see something different:

import numpy as np

sequence = np.arange(0.1, 1.0, 0.1)  # 0.1 to 0.9
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

# define model
import tensorflow as tf
from tensorflow.keras.layers import LSTM, RepeatVector, Dense, TimeDistributed, Input
from tensorflow.keras import Sequential, Model

model = Sequential([
    LSTM(100, activation="relu", input_shape=(n_in+1, 1)),
    RepeatVector(n_in),
    LSTM(100, activation="relu", return_sequences=True),
    TimeDistributed(Dense(1))
])
model.compile(optimizer="adam", loss="mse")

model.fit(sequence, sequence, epochs=300, verbose=0)

The input_shape parameter to the first LSTM layer in the model should be (n_in, 1) to match the input data, rather than (n_in+1, 1). This code will print the following error once you invoke the last line:

Traceback (most recent call last):
  File "trback3.py", line 20, in <module>
    model.fit(sequence, sequence, epochs=300, verbose=0)
  File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.9/site-packages/tensorflow/python/framework/func_graph.py", line 1129, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    File "/usr/local/lib/python3.9/site-packages/keras/engine/training.py", line 878, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.9/site-packages/keras/engine/training.py", line 867, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.9/site-packages/keras/engine/training.py", line 860, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.9/site-packages/keras/engine/training.py", line 808, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/usr/local/lib/python3.9/site-packages/keras/engine/input_spec.py", line 263, in assert_input_compatibility
        raise ValueError(f'Input {input_index} of layer "{layer_name}" is '

    ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 10, 1), found shape=(None, 9, 1)

If you look at the traceback, you cannot see the complete call stack. For example, the top frame shows that you called model.fit(), but the second frame is from a function named error_handler(), and you cannot see how fit() led to it. This is because TensorFlow is highly optimized: a lot is hidden in compiled code and is not visible to the Python interpreter.

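In this case, it is essential to read the traceback patiently and look for clues to the cause. Usually the error message itself gives useful hints as well; here it reports that the layer expected shape (None, 10, 1) but found (None, 9, 1), which points to the input_shape argument.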
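
You can observe the same limitation without TensorFlow. The sketch below assumes CPython, where the built-in sorted() is written in C. When the key function fails, the traceback contains the Python caller and the Python key function, but no frame for sorted() itself. The functions bad_key, run_sort, and frames_seen are hypothetical.

```python
import traceback

def bad_key(item):
    return item["missing"]      # raises KeyError

def run_sort():
    # sorted() is compiled code in CPython; it calls bad_key() for us
    return sorted([{"a": 1}, {"a": 2}], key=bad_key)

def frames_seen():
    try:
        run_sort()
    except KeyError as exc:
        return [f.name for f in traceback.extract_tb(exc.__traceback__)]

names = frames_seen()
print(names)
```

The list jumps directly from run_sort to bad_key, even though sorted() sits between them in the actual chain of calls.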

Summary

In this tutorial, you discovered how to read and print the traceback from a Python program.

Specifically, you learned:

  • What information the traceback tells you
  • How to print a traceback at any point of your program without raising an exception

In the next post, we will see how we can navigate the call stack inside the Python debugger.

The post Understanding Traceback in Python appeared first on Machine Learning Mastery.



via https://AIupNow.com

Adrian Tam, Khareem Sudlow