Log Analyzer is a Python-based tool that analyzes nginx logs and generates HTML reports highlighting the most "problematic" URLs based on request processing times (`$request_time`).
Log Analysis:
- Processes logs (both gzip and plain formats) from a specified directory.
- Identifies the most time-consuming URLs and calculates key statistics.
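Reading both gzip and plain logs can be done with a single helper that picks the opener by file extension. This is a minimal sketch for illustration; the function name `read_log_lines` and the extension-based detection are assumptions, not the tool's actual code:

```python
import gzip
from pathlib import Path
from typing import Iterator

def read_log_lines(path: Path) -> Iterator[str]:
    """Yield decoded lines from a plain or gzip-compressed log file."""
    # Choose the opener by extension: .gz files are transparently decompressed.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, mode="rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")
```

Yielding lines lazily keeps memory usage flat even for multi-gigabyte logs.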
HTML Report Generation:
- Creates a sortable report in HTML format, including metrics like:
  - count: Total number of requests.
  - count_perc: Percentage of total requests.
  - time_sum: Total request processing time.
  - time_perc: Percentage of total request processing time.
  - time_avg: Average request processing time.
  - time_max: Maximum request processing time.
  - time_med: Median request processing time.
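All of these metrics can be derived from the request times grouped by URL. The following is an illustrative sketch, not the tool's actual implementation; the `url_stats` function and its input shape are assumptions:

```python
from statistics import median

def url_stats(times_by_url: dict[str, list[float]]) -> list[dict]:
    """Compute the report metrics from request times grouped by URL."""
    total_count = sum(len(t) for t in times_by_url.values())
    total_time = sum(sum(t) for t in times_by_url.values())
    rows = []
    for url, times in times_by_url.items():
        time_sum = sum(times)
        rows.append({
            "url": url,
            "count": len(times),
            "count_perc": round(100 * len(times) / total_count, 3),
            "time_sum": round(time_sum, 3),
            "time_perc": round(100 * time_sum / total_time, 3),
            "time_avg": round(time_sum / len(times), 3),
            "time_max": max(times),
            "time_med": median(times),
        })
    # Most time-consuming URLs first, as in the report.
    rows.sort(key=lambda r: r["time_sum"], reverse=True)
    return rows
```

Sorting by `time_sum` surfaces the URLs that consume the most total processing time, which is what "problematic" means here.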
Error Handling:
- Logs unparsable lines and exits if the configured error threshold is exceeded.
- Automatically skips logs if a report for the same date already exists.
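The threshold check can be sketched as follows. The `parse_lines` name, the simplified regex, and the exception type are assumptions for illustration; the real parser targets the full nginx log format:

```python
import re

# Simplified pattern: capture the request URL and the trailing $request_time field.
LINE_RE = re.compile(r'"\S+ (?P<url>\S+) \S+".* (?P<time>\d+\.\d+)$')

def parse_lines(lines: list[str], error_threshold: float) -> dict[str, list[float]]:
    """Group request times by URL; fail if too many lines are unparsable."""
    times_by_url: dict[str, list[float]] = {}
    errors = 0
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            errors += 1  # count the line as unparsable and move on
            continue
        times_by_url.setdefault(m.group("url"), []).append(float(m.group("time")))
    # Abort when the share of unparsable lines exceeds the configured threshold.
    if lines and errors / len(lines) > error_threshold:
        raise ValueError(f"too many unparsable lines: {errors}/{len(lines)}")
    return times_by_url
```

Counting errors instead of failing on the first bad line lets the analyzer tolerate occasional malformed entries while still refusing to report on a log it mostly cannot read.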
Custom Configuration:
- Supports configurable log directories, report sizes, and error thresholds via a JSON configuration file.
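Loading the JSON file and overlaying it on built-in defaults could look like this sketch (the `load_config` name and the default values are illustrative assumptions, not the tool's actual code):

```python
import json
from pathlib import Path

# Fallbacks used when a key is absent from the JSON file (values are illustrative).
DEFAULT_CONFIG = {
    "REPORT_SIZE": 1000,
    "ERROR_THRESHOLD": 0.2,
    "LOG_FILE": None,
    "LOG_DIR": "./logs",
    "REPORT_DIR": "./reports",
}

def load_config(path: str) -> dict:
    """Read a JSON config file and merge it over the defaults."""
    user_cfg = json.loads(Path(path).read_text(encoding="utf-8"))
    # Keys present in the file win; everything else falls back to the defaults.
    return {**DEFAULT_CONFIG, **user_cfg}
```

With this pattern the configuration file only needs to list the keys you want to override.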
- Python 3.12+
- Poetry for dependency management
- Clone the repository:
```shell
git clone https://github.com/trifonovtema/log_analyzer.git
poetry install
poetry run pre-commit install
```
```json
{
  "REPORT_SIZE": 100000000,
  "ERROR_THRESHOLD": 0.21,
  "LOG_FILE": null,
  "LOG_DIR": "./logs",
  "REPORT_DIR": "./reports"
}
```

To analyze logs and generate a report:
```shell
poetry run python -m app.main --config ./sample_config.json
```

- `--config`: Path to the configuration file (default: ./sample_config.json).
Run tests:

```shell
make test
```

Ensure code quality using the following commands:
Format code with black:

```shell
make format
```

Check imports with isort:

```shell
make lint
```

The HTML report includes a table summarizing URL statistics.
To run the analyzer in a Docker container:
```shell
docker-compose up --build
```

Run tests:

```shell
make test
```

Format code:

```shell
make format
```

Run the application:

```shell
make run
```