A guide to logging in Python

Logging is a critical way to understand what's going on inside an application.

For a developer, there is little worse than trying to figure out why an application is not working without knowing what is going on inside it. Sometimes you can't even tell whether the system is working as designed at all.

When applications are running in production, they become black boxes that need to be traced and monitored. One of the simplest, yet most important, ways to do so is by logging. Logging allows us, at the time we develop our software, to instruct the program to emit information while the system is running that will be useful for us and our sysadmins.

 


Just as we document code for future developers, we should design new software to generate adequate logs for developers and sysadmins. Logs are a critical part of the documentation of an application's runtime status. When instrumenting your software with logs, think of it as writing documentation for the developers and sysadmins who will maintain the system in the future.

 

Some purists argue that a disciplined developer who uses logging and testing should hardly need an interactive debugger. If we cannot reason about our application during development with verbose logging, it will be even harder to do it when our code is running in production.

This article looks at Python's logging module, its design, and ways to adapt it for more complex use cases. It is not intended as documentation for developers; rather, it is a guide to show how the Python logging module is built and to encourage the curious to delve deeper.

Why use the logging module?

A developer might ask: why aren't simple print statements sufficient? The logging module offers multiple benefits, including:

  • Multi-threading support
  • Categorization via different levels of logging
  • Flexibility and configurability
  • Separation of the how from the what

This last point, the separation of what we log from how we log it, enables collaboration between different parts of the software. As an example, it allows the developer of a framework or library to add logs and lets the sysadmin, or whoever is in charge of the runtime configuration, decide what should be logged at a later point.

What's in the logging module

The logging module beautifully separates the responsibility of each of its parts (following the Apache Log4j API's approach). Let's look at how a log line travels around the module's code and explore its different parts.

Logger

Loggers are the objects a developer usually interacts with. They are the main APIs that indicate what we want to log.

Given an instance of a logger, we can categorize and ask for messages to be emitted without worrying about how or where they will be emitted.

For example, when we write logger.info("Stock was sold at %s", price) we have the following model in mind:

 

[Figure: Python logging model diagram 1]

We request a line, and we assume some code is executed in the logger that makes that line appear in the console or a file. But what is actually happening inside?

Log records

Log records are packages that the logging module uses to pass all required information around. They contain information about the function where the log was requested, the string that was passed, arguments, call stack information, etc.
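
For illustration, here is a minimal sketch that builds a LogRecord by hand, something the module normally does for us, just to inspect what it carries (the values are made up):

import logging

# Loggers normally create these objects internally; we build one
# manually only to look inside it.
record = logging.LogRecord(
    name="parent.child",
    level=logging.INFO,
    pathname="example.py",
    lineno=42,
    msg="Stock was sold at %s",
    args=(42.5,),
    exc_info=None,
)
print(record.name, record.levelname, record.getMessage())
# parent.child INFO Stock was sold at 42.5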

These are the objects that are being logged. Every time we invoke our loggers, we are creating instances of these objects. But how do objects like these get serialized into a stream? Via handlers!

Handlers

Handlers emit the log records to any output. They take log records and handle them according to the purpose they were built for.

As an example, a FileHandler will take a log record and append it to a file.

The standard logging module already comes with multiple built-in handlers, like the following (a short attachment sketch appears after the list):

  • Multiple file handlers (TimedRotatingFileHandler, RotatingFileHandler, WatchedFileHandler) that can write to files
  • StreamHandler, which can target a stream like stdout or stderr
  • SMTPHandler, which sends log records via email
  • SocketHandler, which sends LogRecords to a streaming socket
  • SysLogHandler, NTEventLogHandler, HTTPHandler, MemoryHandler, and others
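
As a brief sketch, this is how two of those built-in handlers can be attached to a logger (the file path is illustrative):

import logging

logger = logging.getLogger("example")
logger.setLevel(logging.DEBUG)

# Send records to stderr and, in parallel, append them to a file.
logger.addHandler(logging.StreamHandler())
logger.addHandler(logging.FileHandler("/tmp/example.log"))  # illustrative path

logger.info("This record reaches both handlers")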

We now have a model that's closer to reality:

 

[Figure: Python logging model diagram 2]

But most handlers work with simple strings (SMTPHandler, FileHandler, etc.), so you may be wondering how those structured LogRecords are transformed into easy-to-serialize bytes…

Formatters

Let me present the Formatters. Formatters are in charge of serializing the metadata-rich LogRecord into a string. There is a default formatter if none is provided.

The generic formatter class provided by the logging library takes a template and a style as input. Then placeholders can be declared for all the attributes of a LogRecord object.

As an example: '%(asctime)s %(levelname)s %(name)s: %(message)s' will generate logs like 2017-07-19 15:31:13,942 INFO parent.child: Hello EuroPython.

Note that the attribute message is the result of interpolating the log's original template with the arguments provided. (e.g., for logger.info("Hello %s", "Laszlo"), the message will be "Hello Laszlo").

All default attributes can be found in the logging documentation.
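
As a minimal sketch, this is how the template above can be wired into a handler:

import logging

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("parent.child")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("Hello %s", "EuroPython")
# e.g., 2017-07-19 15:31:13,942 INFO parent.child: Hello EuroPython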

OK, now that we know about formatters, our model has changed again:

 

[Figure: Python logging model diagram 3]

Filters

The last objects in our logging toolkit are filters.

Filters allow for finer-grained control of which logs should be emitted. Multiple filters can be attached to both loggers and handlers. For a log to be emitted, all filters should allow the record to pass.

Users can declare their own filters as objects, implementing a filter method that takes a record as input and returns True/False as output.
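
As a small sketch, here is a filter that only lets through records whose final message contains a given substring (the criterion is purely illustrative):

import logging

class SubstringFilter(logging.Filter):
    def __init__(self, text):
        super().__init__()
        self.text = text

    def filter(self, record):
        # Return True to let the record through, False to drop it.
        return self.text in record.getMessage()

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("example")
logger.addFilter(SubstringFilter("payment"))

logger.info("payment received")  # emitted
logger.info("heartbeat")         # dropped by the filter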

With this in mind, here is the current logging workflow:

 

[Figure: Python logging model diagram 4]

The logger hierarchy

At this point, you might be impressed by the amount of complexity and configuration that the module is hiding so nicely for you, but there is even more to consider: the logger hierarchy.

We can create a logger via logging.getLogger(<logger_name>). The string passed as an argument to getLogger can define a hierarchy by separating the elements using dots.

As an example, logging.getLogger("parent.child") will create a logger "child" with a parent logger named "parent." Loggers are global objects managed by the logging module, so they can be retrieved conveniently anywhere during our project.

Logger instances are also known as channels, and the dotted names allow the developer to define the hierarchy of channels.
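
A quick sketch of how that hierarchy materializes:

import logging

parent = logging.getLogger("parent")
child = logging.getLogger("parent.child")

print(child.parent is parent)                # True
print(parent.parent is logging.getLogger())  # True: the root logger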

After the log record has been passed to all the handlers within the logger, the parents' handlers will be called recursively until we reach the top logger (defined as an empty string) or a logger has configured propagate = False. We can see it in the updated diagram:

 

[Figure: Python logging model diagram 5]

Note that the parent logger is not called, only its handlers. This means that filters and other code in the logger class won't be executed on the parents. This is a common pitfall when adding filters to loggers.

Recapping the workflow

We've examined the split in responsibility and how we can fine-tune log filtering. But there are two other attributes we haven't mentioned yet:

  1. Loggers can be disabled, thereby not allowing any record to be emitted from them.
  2. An effective level can be configured in both loggers and handlers.

As an example, when a logger is configured with a level of INFO, only records at INFO level and above will be passed. The same rule applies to handlers.
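
A short sketch of both levels in action (the names are illustrative):

import logging

logger = logging.getLogger("example")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()
handler.setLevel(logging.WARNING)
logger.addHandler(handler)

logger.debug("dropped by the logger")  # below INFO: no record is created
logger.info("dropped by the handler")  # passes the logger, stopped by the handler
logger.warning("emitted")              # passes both levels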

With all this in mind, the final flow diagram in the logging documentation looks like this:

 

[Figure: Python logging model diagram 6]

How to use logging

Now that we've looked at the logging module's parts and design, it's time to examine how a developer interacts with it. Here is a code example:

 

import logging

def sample_function(secret_parameter):
    logger = logging.getLogger(__name__)  # __name__=projectA.moduleB
    logger.debug("Going to perform magic with '%s'",  secret_parameter)
    …
    try:
        result = do_magic(secret_parameter)
    except IndexError:
        logger.exception("OMG it happened again, someone please tell Laszlo")
    except:
        logger.info("Unexpected exception", exc_info=True)
        raise
    else:
        logger.info("Magic with '%s' resulted in '%s'", secret_parameter, result, stack_info=True)

This creates a logger using the module __name__. It will create channels and hierarchies based on the project structure, as Python modules are concatenated with dots.

The logger variable references the "projectA.moduleB" logger, which has "projectA" as its parent, whose parent in turn is the root logger.

On line 5, we see how to perform calls to emit logs. We can use one of the methods debug, info, warning, error, or critical to log using the appropriate level.

When logging a message, in addition to the template arguments, we can pass keyword arguments with specific meaning. The most interesting are exc_info and stack_info. These will add information about the current exception and the stack frame, respectively. For convenience, a method exception is available in the logger objects, which is the same as calling error with exc_info=True.

These are the basics of how to use the logging module. ʘ‿ʘ But it is also worth mentioning some uses that are usually considered bad practices.

Greedy string formatting

Using logger.info("string template {}".format(argument)) should be avoided whenever possible in favor of logger.info("string template %s", argument). This is a better practice, as the actual string interpolation happens only if the log will be emitted. Not doing so wastes cycles whenever the message's level is below the configured one, as the interpolation still occurs even though nothing is emitted.
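
To make the difference visible, here is a small sketch where the interpolation has an observable side effect (the Expensive class is purely illustrative):

import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("example")

class Expensive:
    def __str__(self):
        print("interpolating!")  # visible side effect
        return "value"

logger.debug("eager {}".format(Expensive()))  # prints "interpolating!" although nothing is logged
logger.debug("lazy %s", Expensive())          # the level check happens first; no interpolation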

Capturing and formatting exceptions

Quite often, we want to log information about the exception in a catch block, and it might feel intuitive to use:

 

try:
    ...
except Exception as error:
    logger.info("Something bad happened: %s", error)

But that code can give us log lines like Something bad happened: 'secret_key'. This is not that useful. If we use exc_info as illustrated previously, it will produce the following:

 

try:
    ...
except Exception:
    logger.info("Something bad happened", exc_info=True)

 

Something bad happened
Traceback (most recent call last):
  File "sample_project.py", line 10, in code
    inner_code()
  File "sample_project.py", line 6, in inner_code
    x = data["secret_key"]
KeyError: 'secret_key'

This not only contains the exact source of the exception, but also the type.

Configuring our loggers

Instrumenting our software is easy, but we also need to configure the logging stack and specify how those records will be emitted.

There are multiple ways to configure the logging stack.

BasicConfig

This is by far the simplest way to configure logging. Just doing logging.basicConfig(level="INFO") sets up a basic StreamHandler that will log everything at the INFO level and above to the console. There are arguments to customize this basic configuration. Some of them are:

  • filename: Specifies that a FileHandler should be created, using the specified filename, rather than a StreamHandler. Example: /var/logs/logs.txt
  • format: Use the specified format string for the handler. Example: '%(asctime)s %(message)s'
  • datefmt: Use the specified date/time format. Example: "%H:%M:%S"
  • level: Set the root logger to the specified level. Example: "INFO"

This is a simple and practical way to configure small scripts.

Note that basicConfig only works the first time it is called in a process. If you have already configured your root logger, calling basicConfig will have no effect.
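
As a minimal sketch combining the arguments above (the filename is the illustrative one from the list):

import logging

logging.basicConfig(
    filename="/var/logs/logs.txt",  # illustrative path
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%H:%M:%S",
    level="INFO",
)
logging.getLogger(__name__).info("Application started")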

DictConfig

The configuration for all elements and how to connect them can be specified as a dictionary. This dictionary should have different sections for loggers, handlers, formatters, and some basic global parameters.

Here's an example:

 

config = {
    'disable_existing_loggers': False,
    'version': 1,
    'formatters': {
        'short': {
            'format': '%(asctime)s %(levelname)s %(name)s: %(message)s'
        },
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'formatter': 'short',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        '': {
            'handlers': ['console'],
            'level': 'ERROR',
        },
        'plugins': {
            'handlers': ['console'],
            'level': 'INFO',
            'propagate': False
        }
    },
}
import logging.config
logging.config.dictConfig(config)

When invoked, dictConfig disables all existing loggers unless disable_existing_loggers is set to false. Setting it to false is usually desired, as many modules declare a global logger that is instantiated at import time, before dictConfig is called, and would otherwise be silenced.

The full schema accepted by the dictConfig method is described in the logging.config documentation. Often, this configuration is stored in a YAML file and loaded from there. Many developers prefer this over fileConfig, as it offers better support for customization.
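
For example, a sketch of loading the configuration from YAML, assuming the third-party PyYAML package and a hypothetical logging.yaml containing the dictionary above:

import logging.config
import yaml  # third-party PyYAML, assumed installed

with open("logging.yaml") as config_file:  # hypothetical file
    logging.config.dictConfig(yaml.safe_load(config_file))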

Extending logging

Thanks to the way it is designed, it is easy to extend the logging module. Let's see some examples:

Logging JSON

If we want, we can log JSON by creating a custom formatter that transforms the log records into a JSON-encoded string:

 

import logging
import logging.config
import json
ATTR_TO_JSON = ['created', 'filename', 'funcName', 'levelname', 'lineno',
                'module', 'msecs', 'msg', 'name', 'pathname', 'process',
                'processName', 'relativeCreated', 'thread', 'threadName']
class JsonFormatter:
    # Any object exposing a format(record) method can act as a formatter here.
    def format(self, record):
        obj = {attr: getattr(record, attr)
               for attr in ATTR_TO_JSON}
        return json.dumps(obj, indent=4)

handler = logging.StreamHandler()
handler.formatter = JsonFormatter()
logger = logging.getLogger(__name__)
logger.addHandler(handler)
logger.error("Hello")

Adding further context

In a formatter's template, we can reference any attribute present on the log record.

We can inject attributes in multiple ways. In this example, we abuse filters to enrich the records.

 

import logging
import logging.config
GLOBAL_STUFF = 1

class ContextFilter(logging.Filter):
    def filter(self, record):
        global GLOBAL_STUFF
        GLOBAL_STUFF += 1
        record.global_data = GLOBAL_STUFF
        return True

handler = logging.StreamHandler()
handler.formatter = logging.Formatter("%(global_data)s %(message)s")
handler.addFilter(ContextFilter())
logger = logging.getLogger(__name__)
logger.addHandler(handler)

logger.error("Hi1")
logger.error("Hi2")

This effectively adds an attribute to all the records that go through that handler. The formatter will then include it in the log line.

Note that this impacts all log records in your application, including those from libraries or other frameworks whose logs pass through that handler. It can be used to log things like a unique request ID on all log lines to track requests, or to add extra contextual information.

Starting with Python 3.2, you can use setLogRecordFactory to capture all log record creation and inject extra information. The extra attribute and the LoggerAdapter class may also be of interest.
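
As a sketch, setLogRecordFactory can wrap the default factory to stamp every record (the request_id value is a placeholder):

import logging

old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    record = old_factory(*args, **kwargs)
    record.request_id = "hypothetical-id"  # placeholder value
    return record

logging.setLogRecordFactory(record_factory)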

Buffering logs

Sometimes we would like to have access to debug logs when an error happens. This is feasible by creating a buffered handler that will log the last debug messages after an error occurs. See the following code as a non-curated example:

 

import logging
import logging.handlers

class SmartBufferHandler(logging.handlers.MemoryHandler):
    def __init__(self, num_buffered, *args, **kwargs):
        kwargs["capacity"] = num_buffered + 2  # +2: one for the current record, one for pre-pop headroom
        super().__init__(*args, **kwargs)

    def emit(self, record):
        if len(self.buffer) == self.capacity - 1:
            self.buffer.pop(0)
        super().emit(record)

handler = SmartBufferHandler(num_buffered=2, target=logging.StreamHandler(), flushLevel=logging.ERROR)
logger = logging.getLogger(__name__)
logger.setLevel("DEBUG")
logger.addHandler(handler)

logger.error("Hello1")
logger.debug("Hello2")  # This line won't be logged
logger.debug("Hello3")
logger.debug("Hello4")
logger.error("Hello5")  # As error will flush the buffered logs, the two last debugs will be logged

For more information

This introduction to the logging library's flexibility and configurability aims to demonstrate the beauty of how its design splits concerns. It also offers a solid foundation for anyone interested in a deeper dive into the logging documentation and the how-to guide. Although this article isn't a comprehensive guide to Python logging, here are answers to a few frequently asked questions.

My library emits a "no logger configured" warning

Check how to configure logging in a library from "The Hitchhiker's Guide to Python."

What happens if a logger has no level configured?

The effective level of the logger will then be defined recursively by its parents.

All my logs are in local time. How do I log in UTC?

Formatters are the answer! You need to set the converter attribute of your formatter to generate UTC times: use converter = time.gmtime.
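
A minimal sketch:

import logging
import time

handler = logging.StreamHandler()
formatter = logging.Formatter("%(asctime)s %(message)s")
formatter.converter = time.gmtime  # render timestamps in UTC
handler.setFormatter(formatter)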
