Welcome to Day 5 of our journey into distributed systems! Today, we're going to build something that forms the backbone of many production systems: a log storage mechanism with rotation capabilities. This seemingly simple component plays a crucial role in system reliability, debugging, and data analysis.
Why Log Storage Matters in Distributed Systems
Imagine you're running a restaurant with 20 chefs working simultaneously. If they all shouted their activities without recording them, you'd have chaos! Similarly, in distributed systems, components across multiple servers generate information constantly. Without proper storage and organization of these logs, troubleshooting becomes nearly impossible.
Real-world examples you might recognize:
Netflix uses sophisticated log management to monitor their streaming services across thousands of servers
Online games track player actions to detect cheating and improve gameplay
Banking apps record every transaction for security and compliance
Understanding Log Rotation
Think of log rotation like changing notebooks when one gets full. Without rotation:
Files grow endlessly, consuming disk space
Searching through massive log files becomes painfully slow
System performance degrades
You risk completely filling storage and crashing your application
Log rotation allows us to:
Cap file sizes
Organize logs by time periods
Automatically delete old logs
Compress older logs to save space
Where Log Storage Fits in System Design
In our distributed log processing system, the log storage component sits between log collection and log analysis. It acts as the persistent layer that ensures we don't lose valuable information even if processing components fail.
Today's component will later connect with:
The log parser we built yesterday
Future components like indexing and search
Analytics and visualization tools we'll build later
Building Our Log Storage System
Let's create a Python-based log storage system with rotation policies. Our system will:
Write logs to flat files
Rotate based on file size or time elapsed
Support basic compression of rotated logs
Maintain a configurable retention policy
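Before diving into the steps, here is a rough sketch of the shape we're aiming for: a stripped-down LogStorage that composes a pluggable rotation policy with an append-only writer. The names and signatures here are illustrative, not the exact repository code:

```python
import os

class LogStorage:
    """Minimal sketch: `rotation_policy` is any object with a
    should_rotate(path) method; `on_rotate` is a callback that handles
    the rename/compress/retention work when rotation triggers."""

    def __init__(self, log_dir, filename, rotation_policy, on_rotate):
        self.path = os.path.join(log_dir, filename)
        self.rotation_policy = rotation_policy
        self.on_rotate = on_rotate
        os.makedirs(log_dir, exist_ok=True)

    def write(self, line: str) -> None:
        if self.rotation_policy.should_rotate(self.path):
            self.on_rotate(self.path)    # rotate before the new write lands
        with open(self.path, "a") as f:  # append-only for speed
            f.write(line + "\n")
```

Keeping the policy and the rotation action as injected collaborators is what lets us swap size-based for time-based rotation without touching the writer.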
Source code on GitHub: https://github.com/sysdr/course/tree/main/day5
Step 1: Create the Base Project Directory
First, let's create the main project directory:
# Create the main project directory
mkdir log_storage_system
cd log_storage_system
Step 2: Create the Project Structure
Now, let's create all the necessary subdirectories and files:
# Create directory structure
mkdir -p src/
mkdir -p tests/
mkdir -p logs/
mkdir -p docker/
# Create __init__.py files to make directories into packages
touch src/__init__.py
touch tests/__init__.py
Step 3: Create the Core Project Files
Let's create all the necessary Python files and other configuration files:
# Create source files
touch src/log_storage.py
touch src/rotation_policy.py
touch src/retention_policy.py
# Create test files
touch tests/test_log_storage.py
# Create main application file
touch main.py
# Create Docker files
touch docker/Dockerfile
touch docker/docker-compose.yml
# Create other project files
touch requirements.txt
touch README.md
Step 4: Create the Log Inspector Tool
# Create the log inspector tool
touch log_inspector.py
Step 5: Verify Your Project Structure
Your project structure should now look like this:
log_storage_system/
├── src/
│ ├── __init__.py
│ ├── log_storage.py
│ ├── rotation_policy.py
│ └── retention_policy.py
├── tests/
│ ├── __init__.py
│ └── test_log_storage.py
├── logs/ # Directory where logs will be stored
├── docker/
│ ├── Dockerfile
│ └── docker-compose.yml
├── main.py # Main application entry point
├── log_inspector.py # Log inspection tool
├── requirements.txt
└── README.md
Step 6: Add Content to Each File
Now, you'll need to add the code we provided earlier to each file. Here's a quick summary of what goes where:
src/rotation_policy.py: Contains the RotationPolicy, SizeBasedRotationPolicy, and TimeBasedRotationPolicy classes
src/retention_policy.py: Contains the RetentionPolicy, CountBasedRetentionPolicy, and AgeBasedRetentionPolicy classes
src/log_storage.py: Contains the main LogStorage class
main.py: Contains the demo application that generates logs
tests/test_log_storage.py: Contains the unit tests
log_inspector.py: Contains the log inspection tool
docker/Dockerfile: Contains the Docker configuration
docker/docker-compose.yml: Contains the Docker Compose configuration
requirements.txt: Currently empty, since we have no external dependencies
You can copy and paste the code from the article into the appropriate files.
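As a reference point, the two concrete rotation policies might look roughly like this. This is an illustrative sketch that matches the class names in the summary above, not necessarily the exact repository code:

```python
import os
import time

class RotationPolicy:
    """Base interface: decide whether a log file should be rotated."""
    def should_rotate(self, file_path: str) -> bool:
        raise NotImplementedError

class SizeBasedRotationPolicy(RotationPolicy):
    """Rotate once the file reaches a byte threshold."""
    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes

    def should_rotate(self, file_path: str) -> bool:
        return (os.path.exists(file_path)
                and os.path.getsize(file_path) >= self.max_bytes)

class TimeBasedRotationPolicy(RotationPolicy):
    """Rotate once the file is older than a fixed age."""
    def __init__(self, max_age_seconds: float):
        self.max_age_seconds = max_age_seconds

    def should_rotate(self, file_path: str) -> bool:
        return (os.path.exists(file_path)
                and time.time() - os.path.getmtime(file_path) >= self.max_age_seconds)
```

Both policies answer the same question through the same interface, which is what keeps them interchangeable from the storage layer's point of view.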
Step 7: Running the Application
To run locally:
mkdir -p logs
python main.py
To run with Docker:
cd docker
docker-compose up --build
To verify it's working:
Let the application run for a minute
Check the logs directory
You should see the current log file and several rotated and compressed logs
Step 8: Run the Tests
To verify that everything is set up correctly, run the tests:
# Run from the project root
python -m unittest discover tests
If everything is set up correctly, the tests should pass.
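If you'd like to sanity-check your own policy code, a minimal self-contained test case looks like this. The inline SizeBasedRotationPolicy is only a stand-in so the example runs on its own; in your project you would import the real class from src.rotation_policy instead:

```python
import os
import tempfile
import unittest

# Stand-in for the class under test; in the real project, replace this with:
#   from src.rotation_policy import SizeBasedRotationPolicy
class SizeBasedRotationPolicy:
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
    def should_rotate(self, path):
        return os.path.exists(path) and os.path.getsize(path) >= self.max_bytes

class TestSizeBasedRotation(unittest.TestCase):
    def test_small_file_is_not_rotated(self):
        # Write 11 bytes to a temporary file, then check both thresholds.
        with tempfile.NamedTemporaryFile(delete=False) as f:
            f.write(b"short line\n")
            path = f.name
        try:
            self.assertFalse(SizeBasedRotationPolicy(1024).should_rotate(path))
            self.assertTrue(SizeBasedRotationPolicy(5).should_rotate(path))
        finally:
            os.unlink(path)
```

Run it with `python -m unittest` from the project root, the same way as the suite above.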
Step 9: Run the Application Locally
Now you can run the application to generate some logs:
# Run the main application
python main.py
Let it run for a minute or so, then interrupt it with Ctrl+C.
Step 10: Inspect the Generated Logs
Use the log inspector tool to view the logs:
# List all log files
python log_inspector.py --list
# Read a specific log file (replace with an actual filename from the list command)
python log_inspector.py --read application.log
# Search for error messages
python log_inspector.py --search "ERROR"
Step 11: Build and Run with Docker
To build and run the application inside a Docker container:
# Navigate to the docker directory
cd docker
# Build and start the container
docker-compose up --build
Let it run for a minute or so, then interrupt it with Ctrl+C.
Step 12: Verify Docker Output
Check that logs are being generated inside the container by examining the mounted volume:
# Check the logs directory (from project root)
ls -la logs/
You should see both the current log file and several rotated log files.
Troubleshooting Tips
No logs being generated: Make sure the logs directory exists and is writable.
Docker issues: Ensure Docker and Docker Compose are installed and running.
Import errors: Verify that you're running the scripts from the project root directory.
Test failures: Check that you've copied the code correctly and that all required files exist.
That's it! You now have a fully functional log storage system with rotation policies, complete with tests, Docker containerization, and a log inspector tool.
Key Insights for Understanding Log Storage
Now that we've implemented our log storage system, let's discuss some key insights that will help you understand why this component is so crucial in distributed systems:
1. Reliability Through Persistence
In distributed systems, logs serve as the persistent record of what happened and when. If your in-memory processing pipeline crashes, these logs become your only source of truth. By implementing proper storage with rotation, you ensure that:
You don't lose historical data
Your system can recover after failures
You maintain compliance with data retention requirements
2. Resource Management
Log files can grow extremely quickly in production environments. A busy web server might generate gigabytes of logs daily. Our rotation mechanism prevents resource exhaustion by:
Limiting individual file sizes
Preventing disk space from filling up
Compressing older logs to save space
Automatically removing logs that are no longer needed
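A count-based retention pass can be as simple as sorting the rotated files by modification time and deleting the surplus. This sketch assumes rotated files share a glob pattern such as "application.log.*", which is an assumption about your naming scheme:

```python
import glob
import os

def apply_count_retention(log_dir: str, pattern: str, keep: int) -> list:
    """Delete all but the `keep` most recent rotated logs matching `pattern`.
    Returns the paths that were removed."""
    rotated = sorted(glob.glob(os.path.join(log_dir, pattern)),
                     key=os.path.getmtime, reverse=True)
    removed = []
    for path in rotated[keep:]:  # everything beyond the newest `keep` files
        os.remove(path)
        removed.append(path)
    return removed
```

An age-based policy is the same loop with a different filter: remove files whose modification time is older than the cutoff.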
3. Performance Considerations
The way we design our log storage affects the overall performance of our system:
Writing to the end of a file is much faster than modifying the middle
Our implementation uses append-only operations for maximum performance
The rotation happens in a controlled manner to minimize disruption
Compression happens after rotation to avoid slowing down active logging
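Those last points can be shown concretely: renaming the active file is cheap because no data is copied, compression runs on the rotated copy outside the hot path, and the writer sticks to append mode. This is a simplified sketch that ignores coordination with a concurrent writer:

```python
import gzip
import os
import shutil

def rotate_and_compress(active_path: str, rotated_path: str) -> None:
    """Rename the active log, then gzip the rotated copy out of the hot path.
    A real implementation would coordinate with the writer so no lines are
    lost between the rename and the next write."""
    os.rename(active_path, rotated_path)  # fast: no bytes are copied
    with open(rotated_path, "rb") as src, \
         gzip.open(rotated_path + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)      # compression happens after rotation
    os.remove(rotated_path)

def append_line(active_path: str, line: str) -> None:
    """Append-only write: 'a' mode always writes at the end of the file."""
    with open(active_path, "a") as f:
        f.write(line + "\n")
```

After the rename, the next append simply creates a fresh active file, so logging resumes immediately while compression runs.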
4. Operational Flexibility
Our design provides several operational advantages:
Rotation policies can be changed without modifying application code
Retention policies allow compliance with data regulations
The system works well in containerized environments
Log files can be easily moved or analyzed by external tools
Industry Best Practices
When working with log storage in production systems, consider these best practices:
Structured Logging: While we're using plain text logs, consider JSON or other structured formats that are easier to parse and query
Centralized Storage: In real distributed systems, logs from different components should be shipped to a centralized location
Monitoring Your Logs: Set up monitoring for log volume and error patterns to detect issues early
Security Considerations: Logs can contain sensitive information; ensure proper access controls and consider log redaction
Disaster Recovery: Include logs in your backup strategies to ensure you don't lose critical information
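For the structured-logging recommendation above, a minimal JSON formatter for Python's standard logging module looks like this; the field names are just one reasonable choice:

```python
import json
import logging
import sys
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line -- easy to parse and query later."""
    def format(self, record):
        return json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("user logged in")
```

Because every line is a self-describing object, downstream tools can filter on `level` or `logger` without fragile regex parsing.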
Future Extensions
As you continue building your distributed log processing system, consider these enhancements:
Add encryption for sensitive log data
Implement a log shipper to send logs to a central location
Create indexes for faster searching through logs
Add monitoring for log volume and error rates
Implement log compression algorithms more sophisticated than gzip
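On the compression point: Python's standard library already ships a stronger alternative to gzip. This quick comparison on repetitive, log-like text shows the idea (exact sizes will vary with the input):

```python
import gzip
import lzma

# Repetitive text, like real log output, compresses very well.
sample = ("2024-01-01 12:00:00 INFO request handled in 12ms\n" * 2000).encode()

gz = gzip.compress(sample)
xz = lzma.compress(sample)  # stdlib alternative with a higher compression ratio

print(f"raw: {len(sample)} bytes, gzip: {len(gz)} bytes, lzma: {len(xz)} bytes")
```

The trade-off is CPU time: lzma compresses more slowly than gzip, which matters if rotation happens frequently.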
Real-World Applications
Understanding log storage and rotation is foundational for many real-world applications:
Web Services: Track user behavior, errors, and performance metrics
IoT Systems: Collect data from sensors and devices
Financial Systems: Maintain audit trails for regulatory compliance
Security Systems: Detect and respond to suspicious activities
Machine Learning Pipelines: Store intermediate results and model performance metrics
Final Thoughts
Log storage might seem like a simple component, but it forms the foundation of any reliable distributed system. Without proper log management, troubleshooting becomes nearly impossible, and you risk losing valuable data.
As you build the rest of your distributed log processing system in the coming weeks, you'll appreciate how this foundation enables more advanced functionality like search, alerting, and analytics.
Remember, in distributed systems, logs are often your only window into what's happening across many servers. Take the time to design your log storage properly, and you'll thank yourself later when debugging complex issues!
In our next lesson, we'll build on today's work to implement log indexing for faster searches. This will allow us to quickly find specific events across our distributed system, making troubleshooting much more efficient.
Assignment: Building a Log Inspector Tool
Now it's your turn to build on what we've learned. Create a simple log inspector tool that:
Lists all available log files (current and rotated)
Allows reading the content of any log file (decompressing if necessary)
Implements a simple search feature to find log entries containing specific text
Assignment Steps:
Create a new Python file called log_inspector.py
Implement functions to:
List all log files in the logs directory
Read and display log content (handling compressed files)
Search for text within logs
Create a simple command-line interface to interact with these features
Test your inspector with logs generated by our main application
Solution to Assignment
Here's a solution for the log inspector tool (log_inspector.py):
To use the log inspector:
# List all log files
python log_inspector.py --list
# Read a specific log file (use the filename from the list command)
python log_inspector.py --read application.log
# Search for text in all log files
python log_inspector.py --search "ERROR"