# Performance Model
Kelora is designed to crunch large log streams quickly while staying responsive for interactive use. Understanding the performance levers helps you pick the right execution mode for ad-hoc investigations, CI jobs, and heavyweight batch pipelines.
## Execution Modes

| Mode | Command | When to use | Characteristics |
|---|---|---|---|
| Sequential (default) | `kelora ...` | Tail, streaming pipes, order-sensitive work | Strict input order, minimal buffering, deterministic output |
| Parallel | `kelora --parallel ...` | Batch jobs, archives, CPU-bound transforms | Workload split across worker threads, configurable batching |
### Sequential Mode

- Processes one event at a time and forwards it immediately.
- Ideal for `tail -f` pipelines, interactive filtering, or anything that needs deterministic ordering.
- Windowing (`--window`), context flags (`-A`/`-B`/`-C`), and Rhai metrics operate with minimal latency.
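A typical sequential invocation, sketched below, keeps latency low while following a live log. The log path and the `ERROR` pattern are placeholders; the flags are those documented above.

```shell
# Follow a live log, keep only matching lines, and show 2 lines of
# context around each match. /var/log/app.log is a placeholder path.
tail -f /var/log/app.log | kelora --keep-lines 'ERROR' -C 2
```

Because sequential mode forwards each event immediately, matches appear as soon as they are written upstream.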
### Parallel Mode

- Uses a worker pool (defaults to the logical CPU count) to parse, filter, and transform events concurrently.
- Requires buffering to preserve order unless you pass `--unordered` (faster, but only safe when ordering does not matter).
- Adjust batching:
    - `--batch-size <N>` – number of events per batch before flushing to workers.
    - `--batch-timeout <ms>` – flush partially filled batches after an idle period.
    - `--threads <N>` – override the thread count (0 = auto).
- Context windows and sliding windows still work, but they maintain per-worker buffers internally. Increase `--window` sparingly to avoid large per-thread allocations.
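Putting the batching knobs together, a tuned batch run might look like this sketch. The file names and numeric values are illustrative starting points, not recommendations:

```shell
# Parse a compressed archive across all cores, flushing batches of 2000
# events or after 100 ms of idle time; --threads 0 means auto-detect.
kelora --parallel --threads 0 --batch-size 2000 --batch-timeout 100 \
  -f combined archive.log.gz -o parsed.log
```

Larger batches amortize per-batch overhead; the timeout bounds how long a partially filled batch can sit before workers see it.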
## Sequential vs Parallel in Practice

```shell
kelora -f combined examples/web_access_large.log.gz \
  --stats
kelora -f combined examples/web_access_large.log.gz \
  --stats --parallel
```

Sequential run:

```
Detected format: combined
Lines processed: 1200 total, 0 filtered (0.0%), 0 errors (0.0%)
Events created: 1200 total, 1200 output, 0 filtered (0.0%)
Throughput: 46812 lines/s in 25ms
Timestamp: ts (auto-detected) - 1200/1200 parsed (100.0%).
Time span: 2025-10-04T08:27:22+00:00 (single timestamp)
Keys seen: bytes,ip,method,path,protocol,referer,request,status,ts,user,user_agent
```

Parallel run:

```
Lines processed: 1200 total, 0 filtered (0.0%), 0 errors (0.0%)
Events created: 1200 total, 1200 output, 0 filtered (0.0%)
Throughput: 42738 lines/s in 28ms
Timestamp: ts (auto-detected) - 1200/1200 parsed (100.0%).
Keys seen: bytes,ip,method,path,protocol,referer,request,status,ts,user,user_agent
```
On this small synthetic access log (1200 lines), parallel mode is actually slightly slower: the input is too short to amortize batching and thread startup overhead. On larger inputs the CPU-bound combined parser is spread across cores and parallel mode pulls ahead. Real-world gains depend on disk speed, decompression cost, and script workload.
## Pipeline Components That Affect Throughput

- **Input** – Kelora streams from files or stdin, automatically decompressing `.gz`. Network filesystems or slow disks can dominate runtime; consider using `pv`/`zcat` to monitor upstream throughput.
- **Parsing** – Structured formats like `combined`, `syslog`, or `cols:` spend more CPU cycles than `line` or `raw`. Parallel mode shines here.
- **Filtering** – Complex regexes (`--filter`, `--keep-lines`, `--ignore-lines`) benefit from batching; simple boolean predicates are cheap.
- **Transformation** – Rhai scripts execute for every event. Expensive operations (regex extraction, JSON parsing, cryptographic hashes) may need `--parallel` or optimized logic (e.g., caching in `--begin`).
- **Output** – `-F json` and the CSV/TSV encoders allocate more than the default key=value printer. Writing to disk (`-o file`) shifts the bottleneck to storage.
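To tell whether input I/O or Kelora itself is the bottleneck, you can interpose `pv` as the Input bullet suggests. A sketch, with `access.log.gz` as a placeholder file:

```shell
# pv reports the rate at which bytes leave the disk; if that rate is low,
# storage or the network is the bottleneck rather than parsing. zcat
# decompresses the stream before it reaches kelora on stdin.
pv access.log.gz | zcat | kelora -f combined --parallel -
```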
## Measuring Performance

- `--stats` or `-s` print throughput, error counts, time span, and key inventory. Compare sequential vs parallel runs with the same dataset.
- `--metrics` combined with `track_sum`/`track_bucket` can act as a lightweight profiler (e.g., sum `duration_ms` to estimate runtime distribution).
- Use `time`, `hyperfine`, or CI timers around your Kelora command for wall-clock baselines.
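For wall-clock comparisons, `hyperfine` (mentioned above) runs each command several times and reports mean and deviation, which smooths out caching effects. A sketch, with `big.log` as a placeholder:

```shell
# Benchmark sequential vs parallel on the same file; --warmup primes the
# page cache so cold-disk state does not skew the first measurement.
hyperfine --warmup 2 \
  'kelora -f combined big.log -o /dev/null' \
  'kelora -f combined big.log -o /dev/null --parallel'
```

Writing to `/dev/null` keeps terminal rendering out of the measurement.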
## Memory Considerations

- Multiline (`--multiline`) and windowing (`--window`, context flags) enlarge per-event buffers. Monitor with `--stats` and consider lowering `--batch-size` if memory grows uncontrollably in parallel mode.
- `--multiline all` or gigantic regex chunks can hold the entire file in RAM. Prefer incremental processing or pre-splitting input.
- `--metrics` keeps maps in memory until the run ends. Guard high-cardinality structures (`track_unique`) with filters.
- Stats/diagnostics cost CPU. Use `--silent` or `--no-diagnostics` to bypass per-event stats tracking when you only care about output files. This removes timestamp/key discovery and other counters from the hot path.
## Ordering Guarantees

- Sequential mode preserves input order exactly.
- Parallel mode preserves order by default through batch sequencing. Use `--unordered` only when output order is irrelevant (e.g., writing JSON lines to a file for downstream aggregation).
- A `--batch-size` that is too large can increase latency before the first events appear. Tune for the desired balance between throughput and interactivity.
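When order really is irrelevant, for example when feeding JSON lines into a downstream aggregator, a run like this sketch trades ordering for throughput (file names are placeholders):

```shell
# --unordered lets workers emit batches as soon as they finish instead
# of waiting for earlier batches to complete, at the cost of input order.
kelora --parallel --unordered -f combined archive.log.gz -F json -o events.jsonl
```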
## Streaming vs Batch Recommendations

| Scenario | Suggested Flags |
|---|---|
| Watching logs live | Sequential (default), `--stats` for quick counters |
| Importing nightly archives | `--parallel --batch-size 2000 -s` |
| CPU-heavy Rhai transforms | `--parallel --threads 0 --unordered` (if ordering does not matter) |
| Tail with alerts | Sequential + `--metrics` for low-latency thresholds |
## Troubleshooting Slow Pipelines

- **High CPU usage** – Profile Rhai scripts. Move static setup to `--begin` and eliminate redundant parsing inside `--exec`.
- **Low throughput in parallel mode** – Increase `--batch-size`, decrease `--batch-timeout`, or let Kelora run more threads with `--threads 0`.
- **Out-of-order events** – Ensure `--unordered` is not set. Multiline plus `--parallel` may delay chunk emission; reduce batch size.
- **Backpressure when writing to files** – Use `-o output.log` to avoid stdout buffering by other processes.
- **Gzip bottlenecks** – Pre-decompress with `zcat file.gz | kelora -f combined -` if CPU is the limiting factor and disk is fast.
## Quick Checklist

- Streaming workloads? Stay sequential and stream to stdout for the lowest latency.
- Batch archives? Combine `--parallel --stats` and tune `--batch-size`/`--batch-timeout` after inspecting skew.
- Heavy windowing? Keep `--window` small (50 or less) or sample upstream to cap memory.
- Verbose diagnostics? Drop to `-q` once the pipeline is stable to reduce stderr noise.
- Ordering critical? Avoid `--unordered`; otherwise, enabling it can flush parallel batches faster.
## Fast Paths and Practical Hints

- Prefer native flags over Rhai where available: e.g., `-l debug` is faster than `--filter "e.level == 'DEBUG'"`.
- Silence diagnostics when benchmarking or exporting: `--silent` (or at least `--no-diagnostics`) skips stats collection and trims per-event overhead.
- JSON ingest is a major cost. Keep filters simple, avoid unnecessary `--exec` work, and project only the keys you need.
- If you need Rhai filters, stick to pure comparisons to hit the native filter fast path; function calls fall back to the interpreter.
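The first hint above can be made concrete: both commands below select DEBUG events, but the native flag skips the Rhai interpreter entirely (`app.log` is a placeholder file):

```shell
# Native level filter: hits the fast path
kelora -l debug app.log
# Equivalent Rhai filter: evaluated per event by the interpreter
kelora --filter "e.level == 'DEBUG'" app.log
```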
## Troubleshooting Cheats

- Inspect parse hiccups with `-F inspect` or by raising `--verbose`.
- Timestamp drift? Pin down `--ts-field`, `--ts-format`, or `--input-tz` (see `kelora --help-time`).
- Rhai panics? Guard lookups with `e.get_path("field", ())` and conversions with `to_int_or`/`to_float_or`.
- Abundant `.gz` files? No extra tooling is needed: Kelora already detects and decompresses them automatically.
## Related Guides

- Metrics and Tracking Tutorial – build dashboards to observe throughput.
- Multiline Strategies – large multiline blocks can influence memory and batching.
- CLI Reference – Performance Options – full documentation for `--parallel`, `--threads`, and friends.