High-performance log filtering tool with boolean expression support and multi-threaded processing.
- 🔍 Boolean Expressions: Search with AND, OR, NOT operators for complex patterns
- ⚡ Multi-threaded: Parallel processing delivers a 5-10x speedup over the ~5,000 lines/sec single-threaded baseline (40,000+ lines/sec with 8 workers)
- 🏷️ Log Level Normalization: Automatically matches abbreviated levels (E→ERROR, W→WARN, etc.)
- 📊 Statistics: Built-in metrics tracking and performance monitoring
- 🗓️ Date/Time Filtering: Native support for date and time range filtering
- 🔧 Flexible Configuration: YAML config files, environment variables, CLI arguments
- 🐳 Docker Ready: Production-ready containers and Kubernetes manifests
- 🛡️ Type Safe: Full type hints for better IDE support
- ✅ Production Tested: 706 tests with 89.73% coverage, zero critical vulnerabilities
# Clone the repository
git clone https://github.com/RomaYushchenko/log-filter.git
cd log-filter
# Install in development mode
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"

# Or install from PyPI
pip install log-filter

# Display installed version
log-filter --version

# Search for errors
log-filter "ERROR" /var/log
# Works with abbreviated levels (E, W, I, D, T, F)
# A search for "ERROR" matches lines logged at either "ERROR" or "E" level
log-filter "ERROR" /var/log/production
# Boolean expression
log-filter "ERROR AND database" /var/log
# Complex query
log-filter "(ERROR OR CRITICAL) AND NOT test" /var/log
# Save results
log-filter "ERROR" /var/log -o errors.txt --stats
# Date filtering
log-filter "ERROR" /var/log --after 2024-01-01
# Show statistics
log-filter "ERROR" /var/log --stats
# Disable level normalization (match exact text only)
log-filter "ERROR" /var/log --no-normalize-levelsProcessing logs from /var/log...
✓ app.log (25 matches)
✓ system.log (13 matches)
✓ database.log (8 matches)
Statistics:
Files Processed: 127
Lines Processed: 1,234,567
Matches Found: 5,432
Processing Time: 45.67s
Throughput: 27,024 lines/sec
# Build image
docker build -t log-filter:latest .
# Run on local logs
docker run --rm \
  -v ${PWD}/test-logs:/logs:ro \
  -v ${PWD}/output:/output \
  log-filter:latest \
  ERROR /logs -o /output/errors.txt --stats

# Run with local logs
docker-compose -f docker-compose.local.yml run --rm log-filter-local
# Development mode with live reload
docker-compose -f docker-compose.dev.yml run --rm log-filter-dev

See Docker Deployment Guide for detailed instructions.
- Quick Start Guide - Learn the basics in 5 minutes
- Configuration - Complete configuration reference
- API Documentation - Python API reference
- Deployment Guide - Docker, Kubernetes, production setup
- Migration Guide - Upgrade from v1.x to v2.0
# Find all errors from today
log-filter "ERROR" /var/log --after today -o errors-today.txt
# Monitor specific application
log-filter "ERROR AND myapp" /var/log --stats# Extract database errors
log-filter "ERROR AND (database OR sql OR connection)" /var/log -o db-errors.txt
# Find slow queries
log-filter "slow query" /var/log/mysql --time-after 09:00 --time-before 17:00# Only business hours (9 AM - 5 PM)
log-filter "ERROR" /var/log \
--time-after 09:00 \
--time-before 17:00 \
-o business-hours-errors.txt# Search multiple directories
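The `--time-after` / `--time-before` flags keep only records whose time of day falls inside the window. A minimal sketch of that check (illustrative only, not log-filter's code; it assumes each record starts with a `YYYY-MM-DD HH:MM:SS` timestamp):

```python
from datetime import time

# Illustrative only: keep records whose time of day falls in [09:00, 17:00).
# Assumes each record starts with "YYYY-MM-DD HH:MM:SS".
def in_business_hours(line: str, start=time(9, 0), end=time(17, 0)) -> bool:
    try:
        hh, mm, ss = line.split(" ", 2)[1].split(".")[0].split(":")
        t = time(int(hh), int(mm), int(ss))
    except (IndexError, ValueError):
        return False  # no parsable timestamp -> excluded
    return start <= t < end

print(in_business_hours("2025-01-08 10:15:00 ERROR db timeout"))  # True
print(in_business_hours("2025-01-08 22:45:11 ERROR db timeout"))  # False
```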
log-filter "ERROR" /var/log/app /var/log/system /var/log/nginxProduction logs often use abbreviated log levels (E, W, I, D) to save space. Log Filter automatically normalizes these abbreviations, allowing you to search using full level names:
# Search for "ERROR" matches both "ERROR" and "E" in logs
log-filter "ERROR" /var/log/production
# Supported abbreviations:
# E → ERROR
# W → WARN (also WARN, WARNING)
# I → INFO
# D → DEBUG
# T → TRACE
# F → FATAL
# Example: Your production log format
# 2025-01-08 10:00:00.000+0000 E Database connection failed
# 2025-01-08 10:00:01.000+0000 W Connection pool exhausted
# Both will be matched by:
log-filter "ERROR OR WARN" /var/log
# Disable normalization if needed (exact match only)
log-filter "ERROR" /var/log --no-normalize-levels
# Configure in YAML
# processing:
#   normalize_log_levels: true   # default
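One way to picture the normalization (an illustrative sketch, not necessarily how log-filter implements it): a full level name in the query also accepts the abbreviated form of that level.

```python
# Illustrative sketch only -- not log-filter's implementation.
# A full level name in the query also matches its single-letter abbreviation.
ABBREVIATIONS = {
    "ERROR": "E", "WARN": "W", "WARNING": "W",
    "INFO": "I", "DEBUG": "D", "TRACE": "T", "FATAL": "F",
}

def level_matches(query_level: str, line: str) -> bool:
    tokens = line.split()
    return query_level in tokens or ABBREVIATIONS.get(query_level) in tokens

print(level_matches("ERROR", "2025-01-08 10:00:00.000+0000 E Database connection failed"))  # True
print(level_matches("ERROR", "2025-01-08 10:00:05.000+0000 ERROR Disk full"))               # True
```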
Create `config.yaml`:

search:
  expression: "ERROR OR CRITICAL"
  ignore_case: false

files:
  path: "/var/log"
  include_patterns:
    - "*.log"
  exclude_patterns:
    - "*.gz"
  max_depth: 3
  max_file_size: 100     # Skip files > 100 MB
  max_record_size: 512   # Skip records > 512 KB

output:
  output_file: "/var/log-filter/errors.txt"
  overwrite: true
  no_path: false      # Include file paths
  highlight: false    # Highlight matches
  stats: true
  verbose: false
  quiet: false
  dry_run: false

processing:
  max_workers: 8
  buffer_size: 32768
  encoding: "utf-8"
  normalize_log_levels: true   # Enable level normalization (default)
  debug: false
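As a quick sanity check before handing the file to the CLI, the example config can be parsed with PyYAML (a hypothetical helper shown for illustration, not part of log-filter):

```python
import yaml  # pip install pyyaml

# Hypothetical sanity check for the example config above -- not part of log-filter.
with open("config.yaml", encoding="utf-8") as fh:
    cfg = yaml.safe_load(fh)

for section in ("search", "files", "output", "processing"):
    assert section in cfg, f"missing section: {section}"

print(cfg["search"]["expression"])       # ERROR OR CRITICAL
print(cfg["processing"]["max_workers"])  # 8
```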
Run with config:

log-filter --config config.yaml

# Pull image
docker pull log-filter/log-filter:2.0.0
# Run
docker run --rm \
  -v /var/log:/logs:ro \
  -v $(pwd)/output:/output \
  log-filter:2.0.0 \
  "ERROR" "/logs" "-o" "/output/errors.txt" "--stats"

Example `docker-compose.yml`:

version: '3.8'
services:
  log-filter:
    image: log-filter:2.0.0
    volumes:
      - /var/log:/logs:ro
      - ./output:/output
    environment:
      - LOG_FILTER_WORKERS=8
    command: ["ERROR", "/logs", "-o", "/output/errors.txt", "--stats"]

Example Kubernetes CronJob manifest:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-filter-hourly
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: log-filter
              image: log-filter:2.0.0
              args: ["--config", "/config/config.yaml"]
              volumeMounts:
                - name: logs
                  mountPath: /logs
                  readOnly: true
          restartPolicy: OnFailure

| Workload | Throughput | Workers | Time (1 GB) |
|---|---|---|---|
| Single-threaded | 5,000 lines/sec | 1 | 180s |
| Multi-threaded | 40,000 lines/sec | 8 | 25s |
| High-performance | 80,000 lines/sec | 16 | 12s |
- Scaling: Linear with CPU cores up to 16 workers
- Memory: ~50-100 MB base + ~10 MB per worker
- Tested: Up to 100 GB of logs with consistent performance
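For capacity planning, the table's own numbers imply roughly 900,000 lines per GB (about 1.1 KB per record), so expected wall-clock time can be estimated with simple arithmetic (a back-of-the-envelope sketch; actual figures depend on record size and hardware):

```python
# Back-of-the-envelope estimate derived from the table above.
# "1 GB in 180 s at 5,000 lines/sec" implies ~900,000 lines per GB.
LINES_PER_GB = 900_000

def estimated_seconds(gigabytes: float, lines_per_sec: int) -> float:
    return gigabytes * LINES_PER_GB / lines_per_sec

print(f"{estimated_seconds(10, 40_000):.0f} s")  # ~225 s for 10 GB with 8 workers
print(f"{estimated_seconds(10, 80_000):.0f} s")  # ~112 s for 10 GB with 16 workers
```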
# Clone repository
git clone https://github.com/RomaYushchenko/log-filter
cd log-filter
# Create virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest
# With coverage
pytest --cov=log_filter --cov-report=html
# Run specific test
pytest tests/test_parser.py -v

# Format code
black src/ tests/
# Sort imports
isort src/ tests/
# Type checking
mypy src/
# Linting
pylint src/
flake8 src/

log-filter/
├── src/log_filter/
│ ├── core/ # Expression parsing & evaluation
│ ├── domain/ # Business models & filters
│ ├── config/ # Configuration management
│ ├── infrastructure/ # File I/O & handlers
│ ├── processing/ # Multi-threaded pipeline
│ ├── statistics/ # Metrics & reporting
│ └── utils/ # Logging, progress, highlighting
├── tests/ # Comprehensive test suite
└── docs/ # Sphinx documentation
Contributions are welcome! Please read our Contributing Guide for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Homepage: https://github.com/RomaYushchenko/log-filter
- Documentation: https://log-filter.readthedocs.io
- PyPI: https://pypi.org/project/log-filter/
- Bug Tracker: https://github.com/RomaYushchenko/log-filter/issues
- Discussions: https://github.com/RomaYushchenko/log-filter/discussions
- Version: 2.0.0
- Status: Production Ready
- Python: 3.10+ required
- Tests: 706 tests, 89.73% coverage
- Security: Zero critical vulnerabilities
- Performance: 5,000+ lines/sec (single), 40,000+ (multi-threaded)
Developed by Roman Yushchenko with contributions from the community.
Special thanks to all contributors, testers, and users who provided feedback.
- Documentation: https://log-filter.readthedocs.io
- Issues: https://github.com/RomaYushchenko/log-filter/issues
- Discussions: https://github.com/RomaYushchenko/log-filter/discussions
- Email: yushenkoromaf7@gmail.com
Made with ❤️ by Roman Yushchenko