Drain3 Log Template Miner

0.9.11 · active · verified Wed Apr 15

Drain3 is a Python library for mining log templates from raw log messages, designed for stream processing. It's based on the Drain algorithm and is suitable for real-time log analysis. The library is actively maintained with frequent patch releases, currently at version 0.9.11.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to initialize Drain3 with a basic configuration, feed it log messages, and inspect the resulting log clusters. For production use, consider configuring persistence (file, Kafka, or Redis) by passing a persistence handler to `TemplateMiner`, so the model state can be saved and restored across restarts.

from drain3 import TemplateMiner
from drain3.template_miner_config import TemplateMinerConfig

# Configure Drain3. The constructor applies sane defaults; to read settings
# from an ini file instead, use:
#   config.load('drain3.ini')
config = TemplateMinerConfig()
config.drain_sim_th = 0.4  # similarity threshold for joining an existing cluster
config.drain_depth = 4     # depth of the fixed-depth parse tree

# For file persistence (ensure the directory exists and is writable),
# pass a handler to TemplateMiner instead:
#   from drain3.file_persistence import FilePersistence
#   template_miner = TemplateMiner(FilePersistence('drain3_state.bin'), config)
template_miner = TemplateMiner(config=config)

log_messages = [
    "081109 203619 143 INFO dfs.DataNode$PacketResponder: PacketResponder "
    "0 for block blk_3886504917409280145 terminating",
    "081109 203619 369 INFO dfs.DataNode$PacketResponder: PacketResponder "
    "0 for block blk_-6755409170280820986 terminating",
    "081109 203620 357 INFO dfs.DataNode$PacketResponder: PacketResponder "
    "2 for block blk_814013142207908518 terminating",
    "081109 203620 543 INFO dfs.DataNode$DataXceiver: Receiving block blk_-6755409170280820986 "
    "src: /10.250.9.141:50106 dest: /10.250.9.141:50010",
]

# add_log_message() returns a dict describing the result, including
# 'cluster_id', 'change_type', and the mined template.
for log_message in log_messages:
    result = template_miner.add_log_message(log_message)
    print(f"Cluster ID: {result['cluster_id']} -> {result['template_mined']}")

print("\n--- Current Clusters ---")
for cluster in template_miner.drain.clusters:
    print(cluster)
