Day 10: Automating Log Analysis with a Bash Script - Log Analyzer and Report Generator

Log files are essential for troubleshooting, monitoring, and keeping systems healthy, but combing through them by hand is tedious and error-prone. In this blog, we'll look at a Bash script that automates log file analysis and produces a daily summary report. The script counts errors, flags critical events, ranks the most frequent error messages, and archives each log once it has been processed. Let's break down how it works, step by step.


Key Features of the Script

  1. Input Handling: Accepts the log file path as a command-line argument.

  2. Error Analysis: Counts the total number of error messages in the log file.

  3. Critical Events: Finds and lists lines containing critical events, along with their line numbers.

  4. Top Error Messages: Identifies and ranks the top 5 most frequent error messages.

  5. Summary Report: Generates a detailed summary report with all analysis results.

  6. Archiving Logs: Automatically moves processed log files to an archive directory for safekeeping.
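
To make these features concrete, the examples in this post assume a common plain-text log layout in which the severity keyword is followed by a colon and the message. A hypothetical excerpt (timestamps and messages invented for illustration) might look like this:

2025-01-15 09:12:03 INFO: Service started
2025-01-15 09:14:27 ERROR: Disk quota exceeded
2025-01-15 09:14:31 ERROR: Disk quota exceeded
2025-01-15 09:20:45 CRITICAL: Database connection lost

Lines containing ERROR are still counted even if they deviate from this shape, but the message-extraction step later in the script relies on the "ERROR: <message>" pattern.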


The Script

Here’s the complete Bash script for log analysis:

#!/bin/bash

# Function to display usage information
function display_usage {
    echo "Usage: $0 /path/to/log_file"
}

# Check if the log file is provided and exists
if [ $# -eq 0 ] || [ ! -f "$1" ]; then
    echo "Error: Please provide a valid log file path."
    display_usage
    exit 1
fi

# Assign variables
log_file="$1"
report_file="log_summary_$(date '+%Y-%m-%d').txt"
archive_dir="./archived_logs"
temp_file="/tmp/errors.tmp"

# Step 1: Initialize variables
error_keyword="ERROR"
critical_keyword="CRITICAL"
declare -A error_count

# Step 2: Count total lines in the log file
total_lines=$(wc -l < "$log_file")

# Step 3: Count error occurrences and store error messages
total_errors=$(grep -c "$error_keyword" "$log_file")
grep "$error_keyword" "$log_file" > "$temp_file"
while IFS= read -r line; do
    message=$(echo "$line" | sed -E "s/.*$error_keyword: (.*)/\1/")
    error_count["$message"]=$((error_count["$message"] + 1))
done < "$temp_file"

# Step 4: Extract critical events with line numbers
critical_events=$(grep -n "$critical_keyword" "$log_file")

# Step 5: Identify top 5 error messages
top_errors=$(for key in "${!error_count[@]}"; do
    echo "${error_count[$key]} $key"
done | sort -rn | head -n 5)

# Step 6: Generate the summary report
{
    echo "Log Analysis Report"
    echo "===================="
    echo "Date of Analysis: $(date)"
    echo "Log File: $log_file"
    echo "Total Lines Processed: $total_lines"
    echo "Total Errors Found: $total_errors"
    echo
    echo "Top 5 Most Common Error Messages:"
    echo "$top_errors"
    echo
    echo "Critical Events Found:"
    echo "$critical_events"
} > "$report_file"

# Step 7: Archive the log file (optional enhancement)
if [ ! -d "$archive_dir" ]; then
    mkdir -p "$archive_dir"
fi
mv "$log_file" "$archive_dir/$(basename "$log_file")_$(date '+%Y-%m-%d_%H-%M-%S')"

# Cleanup temporary files
rm -f "$temp_file"

# Notify the user
echo "Log analysis complete. Summary report saved to $report_file."
echo "Original log file archived in $archive_dir."
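
To give a sense of the end result, here is what the generated report could look like for the hypothetical log excerpt shown earlier (date, counts, and messages are illustrative):

Log Analysis Report
====================
Date of Analysis: Wed Jan 15 09:30:02 UTC 2025
Log File: /var/log/app.log
Total Lines Processed: 1284
Total Errors Found: 37

Top 5 Most Common Error Messages:
21 Disk quota exceeded
9 Connection timed out
4 Permission denied
2 Failed to write cache file
1 Invalid configuration value

Critical Events Found:
843:2025-01-15 09:20:45 CRITICAL: Database connection lost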

Step-by-Step Explanation

1. Input Validation

if [ $# -eq 0 ] || [ ! -f "$1" ]; then
    echo "Error: Please provide a valid log file path."
    display_usage
    exit 1
fi
  • The script checks if the user has provided a log file as a command-line argument.

  • If no argument is provided or the file doesn’t exist, an error message is displayed, and the script exits.
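
For example, running the script with no argument (or with a path that does not exist) prints the error and usage text and exits with a non-zero status:

$ ./log_analyzer.sh
Error: Please provide a valid log file path.
Usage: ./log_analyzer.sh /path/to/log_file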

2. Variable Initialization

log_file="$1"
report_file="log_summary_$(date '+%Y-%m-%d').txt"
archive_dir="./archived_logs"
temp_file="/tmp/errors.tmp"
  • The script assigns variables for the log file, report file, and archive directory.

  • temp_file is a temporary file for intermediate processing.
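
Because report_file embeds the output of date '+%Y-%m-%d', each run writes to a date-stamped report. For example, on 15 January 2025 the variable would expand to log_summary_2025-01-15.txt:

$ date '+%Y-%m-%d'
2025-01-15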

3. Line and Error Count

total_lines=$(wc -l < "$log_file")
total_errors=$(grep -c "$error_keyword" "$log_file")
  • Counts the total lines in the log file using wc -l.

  • Counts the total errors by searching for the keyword ERROR with grep -c.
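
Note that grep -c counts matching lines, not individual matches. If a line could mention the keyword more than once and you want every occurrence counted, a common alternative is grep -o piped into wc -l:

# Count every occurrence of the keyword rather than matching lines
total_errors=$(grep -o "$error_keyword" "$log_file" | wc -l)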

4. Storing Error Messages

grep "$error_keyword" "$log_file" > "$temp_file"
while IFS= read -r line; do
    message=$(echo "$line" | sed -E "s/.*$error_keyword: (.*)/\1/")
    error_count["$message"]=$((error_count["$message"] + 1))
done < "$temp_file"
  • Filters all error lines into a temporary file.

  • Extracts error messages using sed and tracks their frequency using an associative array.
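
To see the extraction in action, here is the sed expression applied to a single hypothetical log line; everything up to and including "ERROR: " is stripped, leaving just the message that becomes the array key:

$ echo "2025-01-15 09:14:27 ERROR: Disk quota exceeded" | sed -E "s/.*ERROR: (.*)/\1/"
Disk quota exceeded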

5. Critical Event Extraction

critical_events=$(grep -n "$critical_keyword" "$log_file")
  • Searches for lines containing CRITICAL and includes the line number using grep -n.
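
Because of the -n flag, each match is prefixed with its line number and a colon, so an entry in the report might look like this (line number and message hypothetical):

843:2025-01-15 09:20:45 CRITICAL: Database connection lost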

6. Top 5 Error Messages

top_errors=$(for key in "${!error_count[@]}"; do
    echo "${error_count[$key]} $key"
done | sort -rn | head -n 5)
  • Collects all error messages and their counts, sorts them by frequency in descending order, and picks the top 5.
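
If you prefer to avoid the associative array, the same ranking can be sketched with the classic sort | uniq -c pipeline, assuming the "ERROR: <message>" layout discussed earlier (the output format differs slightly, since uniq -c pads its counts):

# Alternative sketch: rank error messages without an associative array
top_errors=$(grep "$error_keyword" "$log_file" \
    | sed -E "s/.*$error_keyword: (.*)/\1/" \
    | sort | uniq -c | sort -rn | head -n 5)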

7. Generating the Report

{
    echo "Log Analysis Report"
    echo "===================="
    echo "Date of Analysis: $(date)"
    echo "Log File: $log_file"
    echo "Total Lines Processed: $total_lines"
    echo "Total Errors Found: $total_errors"
    echo
    echo "Top 5 Most Common Error Messages:"
    echo "$top_errors"
    echo
    echo "Critical Events Found:"
    echo "$critical_events"
} > "$report_file"
  • Compiles all analysis results into a well-structured summary report.
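
The braces group all of the echo commands so that a single redirection writes the whole report in one go. Without the grouping, each line would need its own redirection, roughly like this:

# Equivalent but more repetitive: write the report line by line
echo "Log Analysis Report" > "$report_file"
echo "====================" >> "$report_file"
echo "Date of Analysis: $(date)" >> "$report_file"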

8. Archiving Logs

if [ ! -d "$archive_dir" ]; then
    mkdir -p "$archive_dir"
fi
mv "$log_file" "$archive_dir/$(basename "$log_file")_$(date '+%Y-%m-%d_%H-%M-%S')"
  • Creates an archive directory if it doesn’t exist.

  • Moves the processed log file into the archive directory with a timestamped filename.
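
As a further optional enhancement, the archived copy could be compressed to save space. A minimal sketch, assuming gzip is available and using a hypothetical archived_log variable:

# Hypothetical extension: compress the log after moving it into the archive
archived_log="$archive_dir/$(basename "$log_file")_$(date '+%Y-%m-%d_%H-%M-%S')"
mv "$log_file" "$archived_log"
gzip "$archived_log"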

9. Cleanup and Notifications

rm -f "$temp_file"
echo "Log analysis complete. Summary report saved to $report_file."
echo "Original log file archived in $archive_dir."
  • Removes the temporary file used for error processing.

  • Notifies the user about the completion and locations of the report and archived log.
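
A more defensive variant would register the cleanup with trap near the top of the script, so the temporary file is removed even if the script exits early. A minimal sketch:

# Hypothetical hardening: remove the temp file on any exit, normal or not
trap 'rm -f "$temp_file"' EXIT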


How to Use the Script

  1. Save the script to a file, e.g., log_analyzer.sh.

  2. Make the script executable:

     chmod +x log_analyzer.sh
    
  3. Run the script with a log file as the argument:

     ./log_analyzer.sh /path/to/log_file.log
    

This Bash script simplifies log analysis and ensures systematic reporting and archiving, making it an indispensable tool for system administrators and developers alike.