Power-User Techniques
Things Kelora does in one line that would otherwise need a custom script or a chain of tools. Skim the gallery, find the trick you didn't know existed, and follow the link when you want the full guide.
How to read this page
Each entry is a teaser: a problem, one command, and a link to the deep dive. Nothing here is the complete reference — that lives in the Function Reference.
Group similar errors — normalized()
"Failed to connect to 192.168.1.10" and "...10.0.5.23" are the same
error. normalized() swaps variable data (IPs, emails, UUIDs, numbers) for
placeholders so they collapse into one pattern.
→ Pair it with track_freq() to rank error patterns, or let
--drain mine templates automatically. Full pattern list
and options: normalized() reference.
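To make the idea concrete, here is a minimal Python sketch of placeholder normalization. It is not Kelora's implementation: the regexes, the placeholder names (`<ipv4>`, `<email>`, `<uuid>`, `<num>`), and the helper names are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical rules, applied in order (most specific first).
# Kelora's real pattern list and placeholder names may differ.
RULES = [
    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<uuid>"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "<email>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ipv4>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def normalize(message: str) -> str:
    """Replace variable data with placeholders so similar messages collapse."""
    for pattern, placeholder in RULES:
        message = pattern.sub(placeholder, message)
    return message


def rank_patterns(messages):
    """Count normalized patterns, most frequent first."""
    return Counter(normalize(m) for m in messages).most_common()
```

Ordering matters in a scheme like this: if the generic number rule ran first, it would break up IP addresses and UUIDs before the specific rules could see them.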
Discover log templates automatically — --drain
No normalization rules to maintain: Drain clusters raw lines into templates.
Formats: --drain (table), --drain=full (line ranges + samples), --drain=id (stable
IDs for diffs), --drain=json (programmatic). → --drain reference.
Deterministic sampling — bucket()
--head, sample_prob(), and rand() give different rows every run.
bucket() hashes a key to a stable integer, so the same request shows up in
every run, every rotation, every service.
Same key → same bucket, so you can also shard a huge file into N partitions
(bucket() % 4 == $i) for parallel processing. → Function Reference.
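The mechanism behind stable sampling can be sketched in a few lines of Python. This is an illustration only: SHA-256 is an assumed stand-in for whatever hash Kelora uses, and `in_sample` and `shard` are hypothetical helper names.

```python
import hashlib


def bucket(key: str) -> int:
    """Map a key to a stable non-negative integer.

    SHA-256 is an assumption for this sketch; any hash that is stable
    across processes and machines would serve the same purpose.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def in_sample(key: str, percent: int) -> bool:
    """Keep roughly `percent` % of keys, and always the same ones."""
    return bucket(key) % 100 < percent


def shard(key: str, shards: int) -> int:
    """Assign a key to exactly one of `shards` partitions."""
    return bucket(key) % shards
```

Python's built-in `hash()` would not work here, because string hashes are salted per process by default and so change between runs.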
Flatten deeply nested JSON — flattened()
Turn nested API payloads into flat, bracket-keyed fields ready for CSV or SQL.
kelora -j examples/deeply-nested.jsonl \
  --exec 'e.flat = e.api.flattened()' \
  --exec 'print(e.flat.to_json())' -q
{"queries[0].results.users[0].id":1,"queries[0].results.users[0].permissions.read":true,"queries[0].results.users[0].permissions.write":true}
{"queries[0].results.users[0].id":2,"queries[0].results.users[0].permissions.read":true,"queries[0].results.users[0].permissions.write":false,"queries[0].results.users[1].id":3,"queries[0].results.users[1].permissions.read":false,"queries[0].results.users[1].permissions.write":false}
{"queries[0].results.users[0].id":4,"queries[0].results.users[0].permissions.admin":true,"queries[0].results.users[0].permissions.read":true,"queries[0].results.users[0].permissions.write":true}
For arrays-within-arrays, chain emit_each() to fan out multiple levels into
flat rows. → Flatten Nested JSON for Analysis.
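The key format in the output above (dots for object keys, brackets for array indices) can be reproduced with a short recursive function. This Python sketch only mirrors the visible output; how Kelora treats empty containers or other edge cases is not covered, and `flatten` is a hypothetical name.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict with path-style keys.

    Object keys are joined with "." and list indices are written as "[i]".
    Empty dicts and lists produce no keys in this sketch.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Scalar leaf: the accumulated path becomes the field name.
        flat[prefix] = value
    return flat
```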
Inspect JWT claims — parse_jwt()
Read the header and claims for debugging, with no signature setup needed. The standard time
claims exp/iat/nbf come back as datetimes (expires_at, issued_at,
not_before), so you can format them or compare against now() directly.
kelora -j examples/auth-logs.jsonl \
  --filter 'e.has("token")' \
  --exec 'let jwt = e.token.parse_jwt();
          e.user = jwt.claims.sub;
          e.role = jwt.claims.role;
          e.expires = jwt.expires_at.to_iso();
          e.token = ()' \
  -k timestamp,user,role,expires
timestamp='2024-01-15T10:00:00Z' user='user123' role='admin' expires='2024-11-21T01:46:40+00:00'
timestamp='2024-01-15T10:05:00Z' user='user456' role='user' expires='2024-11-21T02:46:40+00:00'
timestamp='2024-01-15T10:10:00Z' user='user789' role='guest' expires='2023-11-14T22:13:20+00:00'
timestamp='2024-01-15T10:15:00Z' user='user111' role='moderator' expires='2024-11-21T03:46:40+00:00'
To find expired tokens, compare the decoded expiry (expires_at) against the current time (now()).
To flatten the claims straight onto the event in one step (dropping the token),
use absorb_jwt(), the JWT member of the absorb family.
Warning
Does not verify signatures — debugging / trusted tokens only.
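For readers curious what unverified decoding involves, here is a small Python sketch using only the standard library. It is not Kelora's code; the function names and the shape of the returned dict are assumptions. It decodes the first two base64url segments, converts `exp` to a UTC datetime, and ignores the signature, which is why the same warning applies.

```python
import base64
import json
from datetime import datetime, timezone


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_jwt_unverified(token: str) -> dict:
    """Return header and claims WITHOUT checking the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    claims = json.loads(b64url_decode(payload_b64))
    decoded = {"header": json.loads(b64url_decode(header_b64)), "claims": claims}
    if "exp" in claims:
        # exp is seconds since the Unix epoch (RFC 7519 NumericDate).
        decoded["expires_at"] = datetime.fromtimestamp(claims["exp"], tz=timezone.utc)
    return decoded


def is_expired(decoded: dict, now=None) -> bool:
    """True if the token carries an expiry that lies before `now`."""
    now = now or datetime.now(timezone.utc)
    return "expires_at" in decoded and decoded["expires_at"] < now
```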
Surgical string extraction — between / before / after
Pull fields out of semi-structured lines without writing a regex.
The helpers take an occurrence index: after(sep, 2) for the second occurrence, -1 for
the last. There is also between(), plus extract_regexes() for multiple matches. → Function Reference.
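As a rough mental model, the occurrence-based helpers behave like string splitting with an index. The Python sketch below encodes assumed semantics (1-based occurrence, `-1` meaning the last one, empty string when nothing matches); check the Function Reference for what Kelora actually returns in edge cases.

```python
def after(text: str, sep: str, nth: int = 1) -> str:
    """Text following the nth occurrence of sep (nth=-1 means the last)."""
    parts = text.split(sep)
    if nth == -1:
        nth = len(parts) - 1
    if nth < 1 or nth >= len(parts):
        return ""  # assumption: no such occurrence yields an empty string
    return sep.join(parts[nth:])


def before(text: str, sep: str, nth: int = 1) -> str:
    """Text preceding the nth occurrence of sep (nth=-1 means the last)."""
    parts = text.split(sep)
    if nth == -1:
        nth = len(parts) - 1
    if nth < 1 or nth >= len(parts):
        return ""
    return sep.join(parts[:nth])


def between(text: str, left: str, right: str) -> str:
    """Text between the first `left` and the next `right` after it."""
    _, found, rest = text.partition(left)
    if not found:
        return ""
    inner, found, _ = rest.partition(right)
    return inner if found else ""
```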
Fuzzy matching — edit_distance()
Levenshtein distance finds typo'd errors or config drift (prod-web vs
prd-web).
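Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A standard dynamic-programming version in Python, shown here as an illustration rather than as Kelora's implementation:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rolling rows of the DP table."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]
```

With this definition, prod-web and prd-web are one edit apart (a single deleted "o"), which is the kind of near-match a small distance threshold is meant to catch.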
Hashing & pseudonymization — hash() / pseudonym()
sha256 for integrity, xxh3 for fast bucketing, and pseudonym() for
consistent anonymous IDs (HMAC with KELORA_SECRET).
→ Sanitize Logs Before Sharing · Pseudonymize Identifiers.
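The difference between a plain hash and a keyed pseudonym is the secret: without it, nobody can rebuild the mapping by hashing guessed values. A minimal Python sketch of the keyed approach follows. HMAC-SHA256 and the 16-character truncation are assumptions for illustration; the source only states that pseudonym() uses an HMAC keyed with KELORA_SECRET.

```python
import hashlib
import hmac


def pseudonym(value: str, secret: bytes, length: int = 16) -> str:
    """Derive a stable, non-reversible ID for `value` under `secret`.

    The digest choice and output length are illustrative assumptions,
    not Kelora's documented format.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]
```

The same value and secret always give the same ID, so joins across files still work, while a different secret produces an unrelated set of IDs.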
Extract JSON & key-values from text — extract_json() / absorb_kv()
Lift structured data out of plain-text log lines.
echo '2024-01-15 ERROR: Failed with response: {"code":500,"message":"Internal error"}' | \
  kelora --exec 'e.data = e.line.extract_json()' \
    --filter 'e.has("data")' -k line,data
kelora hint: No input format detected; keeping whole lines as 'line'. For 'timestamp LEVEL message' app logs, extract fields with -f 'cols:ts(2) level *msg' (or a regex:). Mixed file? Cascade with repeated -f, e.g. -f json -f 'cols:ts(2) level *msg'. See --help-formats.
line='2024-01-15 ERROR: Failed with response: {"code":500,"message":"Internal error"}' data={"code":500,"message":"Internal error"}
extract_jsons() grabs every object; absorb_kv("line") promotes key=value
pairs to fields. → Function Reference.
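One way to picture both operations is the Python sketch below: scan for the first position where a valid JSON object starts, and separately collect key=value pairs with a regex. The function names and the regex are assumptions; Kelora's own parsing rules (quoting, type conversion) may differ.

```python
import json
import re


def extract_json(text: str):
    """Return the first JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        return obj
    return None


# Hypothetical pattern: key=value, where value is quoted or whitespace-free.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_kv(text: str) -> dict:
    """Collect key=value pairs; values stay strings in this sketch."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(text)}
```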
Histogram buckets — track_freq()
See the distribution, not just the average.
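The underlying technique is to round each value down to the start of its bucket and count how often each bucket occurs. A minimal Python sketch, with a hypothetical `histogram` helper and made-up latency values:

```python
from collections import Counter


def histogram(values, width: int) -> dict:
    """Count values per fixed-width bucket, keyed by the bucket's lower bound."""
    counts = Counter((int(v) // width) * width for v in values)
    return dict(sorted(counts.items()))


# Example: latencies in milliseconds grouped into 100 ms buckets.
latencies_ms = [12, 95, 130, 180, 420]
print(histogram(latencies_ms, 100))
```

Two values near 0 ms, two between 100 and 199 ms, and one outlier above 400 ms: the shape shows a tail that an average of about 167 ms would hide.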
Format conversion on the fly — to_json() / to_logfmt() / cascade
Convert between JSON, logfmt, and CSV mid-pipeline, or let cascade mode
(-f json,logfmt,line) auto-detect mixed streams line by line.
{"line":"2024-01-15 10:00:00 [INFO] Server starting","_format":"line"}
{"timestamp":"2024-01-15T10:00:01Z","level":"DEBUG","message":"Connection pool initialized","format":"json","connections":50,"_format":"json"}
{"timestamp":"2024-01-15T10:00:02Z","level":"info","msg":"Cache layer ready","format":"logfmt","size":1024,"_format":"logfmt"}
{"line":"<34>Jan 15 10:00:03 appserver syslog: Authentication module loaded","_format":"line"}
{"line":"web_1 | 2024-01-15 10:00:04 [INFO] HTTP server listening on port 8080","_format":"line"}
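The cascade idea is "try each parser in order, keep the first that succeeds, and fall back to the raw line". The Python sketch below mimics the `_format` tagging seen in the output above. The parsers are simplified stand-ins (the logfmt check in particular is an assumption), not Kelora's detection logic.

```python
import json
import shlex


def parse_json(line: str):
    """Return a dict if the whole line is a JSON object, else None."""
    try:
        obj = json.loads(line)
    except ValueError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_logfmt(line: str):
    """Return a dict if every token is key=value, else None.

    Values stay strings here; a real parser may convert numbers.
    """
    try:
        tokens = shlex.split(line)  # honors double-quoted values
    except ValueError:  # unbalanced quotes
        return None
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if not sep or not key:
            return None
        fields[key] = value
    return fields or None


def detect(line: str) -> dict:
    """Try parsers in cascade order and tag the event with the winner."""
    for name, parser in (("json", parse_json), ("logfmt", parse_logfmt)):
        fields = parser(line)
        if fields is not None:
            return {**fields, "_format": name}
    return {"line": line, "_format": "line"}
```

Order matters: stricter formats go first so a permissive fallback never claims a line that a more specific parser could have handled.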
Cross-event logic — state
When track_*() isn't enough — deduplication, request/response correlation,
session reconstruction, state machines — the state map remembers anything
across events.
Note
state is sequential-only (not available under --parallel). For simple
counting prefer track_*(), which works in parallel.
→ Full recipes (dedup, correlation, FSMs, session rebuild, memory management):
Cross-Event Logic with state.
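Deduplication is the simplest case of cross-event memory: a map that outlives each event records what has already been seen. The Python sketch below shows the pattern in general terms; it does not use Kelora's state API, and `dedupe` is a hypothetical helper.

```python
def dedupe(events, key):
    """Yield only the first event seen for each value of `key`."""
    state = {}  # persists across events, like a cross-event map
    for event in events:
        value = event.get(key)
        if value in state:
            state[value] += 1  # remember how many repeats were dropped
            continue
        state[value] = 1
        yield event


events = [{"id": 1, "msg": "a"}, {"id": 2, "msg": "b"}, {"id": 1, "msg": "a again"}]
print(list(dedupe(events, "id")))
```

The result depends on the order in which events arrive and on one shared map, which suggests why this kind of logic is restricted to sequential processing.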
Combine them
The payoff is composition — fan out nested orders, normalize errors, hash users, take a deterministic sample, and aggregate, in one command:
kelora -j api-responses.jsonl \
  --filter 'e.api_version == "v2"' \
  --exec 'emit_each(e.get_path("data.orders", []))' \
  --exec 'emit_each(e.items)' \
  --exec 'e.error_pattern = e.get("error_msg", "").normalized();
          e.user_hash = e.user_id.hash("xxh3");
          e.sample_group = e.order_id.bucket() % 10;
          e.user_id = ()' \
  --filter 'e.sample_group < 3' \
  --metrics \
  --exec 'track_freq("error_pattern", e.error_pattern)' \
  -k order_id,sku,quantity,error_pattern -F csv
See Also
- Cross-Event Logic with state — dedup, correlation, FSMs, sessions
- Advanced Scripting — multi-stage transforms
- Metrics and Tracking — aggregation patterns
- Function Reference — complete catalog
- Flatten Nested JSON · Sanitize Logs