Frostbeat is a lightweight log streaming tool designed for Kubernetes environments. Its primary purpose is to collect structured logs (JSON format) from pods matching a specific label selector, and route them into a target backend such as a file or an etcd cluster.
A prerequisite for using this tool is that logs must be structured as JSON objects and must contain both a `traceId` field and a `time` field (in ISO 8601 format, e.g., `"2025-06-10T08:25:41.63070246Z"`). These fields are used to group logs and preserve their original event timestamps.
It supports:
- Real-time log streaming using the Kubernetes pod log API
- Grouping logs by `traceId` for correlation and traceability
- Buffered writing with batching and retry logic on failure
- Dynamic watching of pods: automatically starts/stops log streaming as pods appear or disappear
- Graceful shutdown with context and signal handling
- Pluggable writer interface: write logs to any destination, such as:
  - etcd (default)
  - Files (local or mounted volume)
  - Elasticsearch
  - Kafka
  - Cloud storage (e.g., GCS, S3)
  - Any custom backend implementing the `LogWriter` interface
At a high level, Frostbeat works as follows:
- At startup, it connects to the Kubernetes API and retrieves all pods matching a configured label selector.
- It launches a dedicated streamer for each pod to consume logs in real-time.
- Logs are sent to a buffered channel, where a writer component batches and persists them.
- If write operations fail, the batch is retried via a dedicated channel.
- A pod watcher observes for any new or terminated pods and updates the streaming set accordingly.
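In Go with client-go, the startup flow might be sketched roughly as follows; the identifiers (`streamPodLogs`, `logCh`), the hard-coded namespace and selector, and the in-cluster config are all illustrative assumptions, not Frostbeat's actual code:

```go
package main

import (
	"bufio"
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	ctx := context.Background()

	// Assumes the tool runs inside the cluster with a suitable service account.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	// Shared channel consumed by the writer component.
	logCh := make(chan string, 1024)

	// Retrieve all pods matching the configured label selector.
	pods, err := clientset.CoreV1().Pods("demo-system").List(ctx, metav1.ListOptions{
		LabelSelector: "app=snowplow",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Launch a dedicated streamer per pod.
	for _, pod := range pods.Items {
		go streamPodLogs(ctx, clientset, pod, logCh)
	}

	// The writer and the pod watcher would be started here; block forever
	// for the purposes of this sketch.
	select {}
}

// streamPodLogs follows one pod's log stream and pushes each raw line
// onto the shared channel.
func streamPodLogs(ctx context.Context, cs kubernetes.Interface, pod corev1.Pod, out chan<- string) {
	stream, err := cs.CoreV1().Pods(pod.Namespace).
		GetLogs(pod.Name, &corev1.PodLogOptions{Follow: true}).
		Stream(ctx)
	if err != nil {
		log.Printf("stream %s: %v", pod.Name, err)
		return
	}
	defer stream.Close()

	scanner := bufio.NewScanner(stream)
	for scanner.Scan() {
		out <- scanner.Text()
	}
}
```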
Frostbeat uses a simple `LogWriter` interface:

```go
type LogWriter interface {
	WriteBatch(lines []string) error
}
```

This means you can easily plug in your own writer implementation to integrate with any external system.
For example:
- Write logs to Elasticsearch by implementing a writer that sends batched JSON via the bulk API.
- Push logs to Kafka or a cloud queue for async processing.
- Use a buffered channel-based writer for advanced processing pipelines.
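As a concrete illustration, here is a minimal file-based writer; the `fileWriter` type and the output path are hypothetical, not part of Frostbeat:

```go
package main

import (
	"log"
	"os"
	"strings"
)

// LogWriter is the interface from the section above.
type LogWriter interface {
	WriteBatch(lines []string) error
}

// fileWriter appends each batch of log lines to a local file.
type fileWriter struct {
	path string
}

func (w *fileWriter) WriteBatch(lines []string) error {
	f, err := os.OpenFile(w.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	// Write the batch in one call so a failure leaves it eligible for retry.
	_, err = f.WriteString(strings.Join(lines, "\n") + "\n")
	return err
}

func main() {
	w := &fileWriter{path: "frostbeat.out"}
	if err := w.WriteBatch([]string{`{"traceId":"abc","time":"2025-06-10T08:25:41Z"}`}); err != nil {
		log.Fatal(err)
	}
}
```

Because `WriteBatch` receives the whole batch at once, a failed write leaves the batch intact and it can simply be resubmitted by the retry mechanism.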
The `PodLogManager` is the component responsible for managing the lifecycle of individual log streamers (`PodLogStreamer`) for Kubernetes pods. It serves as the bridge between the Kubernetes event stream (watcher) and the log ingestion pipeline. It:
- Dynamically responds to pod creation and deletion events.
- Starts a new `PodLogStreamer` for each relevant pod that appears.
- Gracefully stops and cleans up log streamers for pods that are removed.
- Maintains an internal map (`activeStreamers`) to track and manage running log streams.
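The bookkeeping might look roughly like the sketch below; apart from the `activeStreamers` name, the types and methods are assumptions about the implementation:

```go
package frostbeat

import (
	"context"
	"sync"
)

// PodLogManager tracks one running streamer per pod (sketch; only the
// activeStreamers field name comes from the documentation above).
type PodLogManager struct {
	mu              sync.Mutex
	activeStreamers map[string]context.CancelFunc // keyed by pod name
}

func NewPodLogManager() *PodLogManager {
	return &PodLogManager{activeStreamers: make(map[string]context.CancelFunc)}
}

// OnPodAdded starts a streamer goroutine for a newly observed pod.
func (m *PodLogManager) OnPodAdded(ctx context.Context, podName string, stream func(context.Context)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, running := m.activeStreamers[podName]; running {
		return // already streaming this pod; avoids duplicate streams
	}
	streamCtx, cancel := context.WithCancel(ctx)
	m.activeStreamers[podName] = cancel
	go stream(streamCtx)
}

// OnPodDeleted cancels and forgets the streamer for a removed pod.
func (m *PodLogManager) OnPodDeleted(podName string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cancel, ok := m.activeStreamers[podName]; ok {
		cancel()
		delete(m.activeStreamers, podName)
	}
}
```

Cancelling the per-pod context signals the corresponding streamer to stop, so removed pods stop consuming resources.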
In the context of the overall system:
- The Kubernetes pods watcher detects new or deleted pods and notifies the `PodLogManager`.
- The `PodLogManager` creates or removes `PodLogStreamer` instances accordingly.
- Each `PodLogStreamer` opens a log stream from the Kubernetes API and pushes log lines to a shared log channel.
- A separate `LogWriter` component, consuming the shared channel, batches and persists logs to a destination (e.g., file, etcd, Elasticsearch).
- Failed batches are retried through a retry channel mechanism (see the sketch after this list).
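The batch/flush/retry loop could look like the following sketch; the channel names, buffer sizes, and retry policy are assumptions, not Frostbeat's actual code:

```go
package frostbeat

import (
	"log"
	"time"
)

type LogWriter interface {
	WriteBatch(lines []string) error
}

// RunWriter consumes the shared log channel, flushing when the batch
// is full or when the batch period elapses; failed batches loop back
// through retryCh.
func RunWriter(w LogWriter, logCh <-chan string, batchSize int, batchPeriod time.Duration) {
	retryCh := make(chan []string, 16)
	batch := make([]string, 0, batchSize)
	ticker := time.NewTicker(batchPeriod)
	defer ticker.Stop()

	flush := func(lines []string) {
		if len(lines) == 0 {
			return
		}
		if err := w.WriteBatch(lines); err != nil {
			log.Printf("write failed, requeueing %d lines: %v", len(lines), err)
			select {
			case retryCh <- lines: // hand the batch to the retry path
			default:
				log.Printf("retry channel full, dropping batch")
			}
		}
	}

	for {
		select {
		case line := <-logCh:
			batch = append(batch, line)
			if len(batch) >= batchSize {
				flush(batch)
				batch = make([]string, 0, batchSize)
			}
		case <-ticker.C:
			flush(batch) // BATCH_PERIOD elapsed: flush whatever has accumulated
			batch = make([]string, 0, batchSize)
		case failed := <-retryCh:
			flush(failed)
		}
	}
}
```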
This design:
- Automatically scales with your deployments.
- Avoids log duplication or loss during pod lifecycle transitions.
- Keeps the main loop clean by encapsulating all pod log lifecycle logic.
Frostbeat is configured through the following environment variables:

| Environment Variable | Default Value | Description |
|---|---|---|
| BATCH_SIZE | 20 | Maximum number of log entries to send in a single batch. |
| BATCH_PERIOD | 5s | Maximum time to wait before flushing a batch. |
| NAMESPACE | demo-system | Kubernetes namespace to watch for pods and collect logs from. |
| SELECTOR | app=snowplow | Label selector to filter pods for log collection. |
| ETCD_SERVERS | localhost:2379 | Comma-separated list of etcd endpoints for log storage. |
| TTL | 48h | TTL (Time-To-Live) duration for keys stored in etcd. |
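A common way to read these variables at startup, shown as a sketch (the `Config` struct and `getEnv` helper are illustrative, not Frostbeat's actual configuration code):

```go
package frostbeat

import (
	"os"
	"strconv"
	"strings"
	"time"
)

// Config mirrors the table above.
type Config struct {
	BatchSize   int
	BatchPeriod time.Duration
	Namespace   string
	Selector    string
	EtcdServers []string
	TTL         time.Duration
}

// getEnv returns the environment variable's value, or the fallback
// when it is unset or empty.
func getEnv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

func LoadConfig() (Config, error) {
	size, err := strconv.Atoi(getEnv("BATCH_SIZE", "20"))
	if err != nil {
		return Config{}, err
	}
	period, err := time.ParseDuration(getEnv("BATCH_PERIOD", "5s"))
	if err != nil {
		return Config{}, err
	}
	ttl, err := time.ParseDuration(getEnv("TTL", "48h"))
	if err != nil {
		return Config{}, err
	}
	return Config{
		BatchSize:   size,
		BatchPeriod: period,
		Namespace:   getEnv("NAMESPACE", "demo-system"),
		Selector:    getEnv("SELECTOR", "app=snowplow"),
		EtcdServers: strings.Split(getEnv("ETCD_SERVERS", "localhost:2379"), ","),
		TTL:         ttl,
	}, nil
}
```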

