Skip to content

Script Variables Reference

Kelora exposes several built-in variables to Rhai scripts. Their availability depends on which stage is executing (per-event filters/execs, span hooks, begin/end hooks, etc.). Use this page whenever you need to know which data is in scope.

Stage Overview

Stage / Feature Variables Available Notes
--filter, --exec, --exec-file line, e, meta, conf, state Per-event context. e and state are writable; the others are read-only. state only in sequential mode.
--filter / --exec with --window line, e, meta, conf, state, window Adds the sliding window array (current event at index 0; read-only). state only in sequential mode.
--begin conf, state Map to seed configuration before events arrive. conf is read-only; state is writable (sequential mode only).
--end metrics, conf, state Inspect final tracker totals in metrics and final state; conf and metrics are read-only. state only in sequential mode.
--span-close span, metrics, conf Summarises the closed span with per-span data (span) and cumulative totals (metrics). All read-only.

Reading a variable that does not exist in the current stage raises a Rhai error. Variables listed here are populated with meaningful data (individual fields inside maps may still be ()). Other globals such as line/e exist behind the scenes for compatibility but start empty in stages where they are not useful.

Common Variables

line

  • Type: String
  • Snapshot of the text the parser captured for the current record. Structured formats strip trailing newlines before storing it, -f raw preserves every byte (including terminators), and --multiline reflects the chunk assembled by the multiline stage. Present in every stage and read-only.
  • You’ll also see the same text at meta.line; that copy travels with saved metadata (window snapshots, span hooks, cloned meta values) so the original record is still available later on.

e

  • Type: Map
  • Event map during per-event stages. Mutating e inside --exec updates the emitted event; setting a field to () removes it; assigning e = () clears the entire event. Writable.

meta

  • Type: Map
  • Metadata derived from the pipeline and span system. Attempting to mutate meta fields has no effect; use event fields on e for custom annotations instead. Read-only.
Key Type Description
line String Same as the standalone line variable.
line_num Int 1-based line number when input comes from files.
filename String Source filename, if known.
parsed_ts DateTime or () Parsed UTC timestamp before any --filter/--exec scripts. Missing if the event had no timestamp.
span_status String "included", "late", "unassigned", or "filtered" when spans are enabled. Missing otherwise.
span_id String or () Span identifier for the current event.
span_start / span_end DateTime or () Span bounds for the current event.

conf

  • Type: Map
  • Configuration produced by --begin and CLI options; the map is frozen so mutation attempts are ignored. Read-only.

state

  • Type: StateMap (special wrapper, not a regular Map)
  • Mutable global map for complex state tracking across events. Available in all per-event stages (--filter, --exec, --begin, --end) in sequential mode only.

Mental Model

Think of state as a persistent notebook that travels with your log processor:

  • Each event can READ from and WRITE to this notebook
  • Previous events can leave notes for future events
  • The notebook persists across all files in a single run
  • When processing ends, the notebook is discarded

This differs from track_*() functions, which are write-only counters that accumulate metrics but can't influence per-event decisions.

When to Use

Use state for:

  • Deduplication (tracking seen IDs)
  • Cross-event dependencies and correlation
  • Storing complex objects (nested maps, arrays)
  • Conditional logic based on previous events
  • State machines and session reconstruction

Don't use state for:

  • Simple counting or metrics → use track_count(), track_sum(), etc.
  • These work in parallel mode too
  • More efficient (atomic operations vs RwLock)
  • Time-based or count-based batching → use spans
  • --span 5m for time windows, --span 1000 for event batches
  • Automatic per-window metrics with span.metrics

Comparison with track_*():

Feature state track_*()
Purpose Complex stateful logic Simple metrics & aggregations
Read access ✅ Yes (during processing) ❌ No (write-only, read in --end)
Parallel mode ❌ Sequential only ✅ Works in parallel
Storage Any Rhai value Any value (strings, numbers, etc.)
Performance Slower (RwLock) Faster (atomic/optimized)
Use for Deduplication, FSMs, correlation Counting, unique tracking, bucketing

State Lifecycle

State persists:

  • ✅ Across multiple input files in one invocation
  • ✅ From --begin through all events to --end
  • ✅ Between events within the same run

State resets:

  • ❌ Between separate kelora invocations (no persistence to disk)
  • ❌ When using state.clear() explicitly
  • ❌ Not per-file or per-batch (common misconception)

Example: Processing 3 files maintains one shared state:

# State persists across all 3 files
kelora -j file1.json file2.json file3.json \
  --exec 'state[e.user] = true'  # Deduplicates across ALL files

Operations

Direct operations (no conversion needed):

  • Indexing: state["key"] for get/set, returns () if key doesn't exist
  • Methods: contains(key), get(key), set(key, value), len(), is_empty(), keys(), values(), clear(), remove(key)
  • Operators: +=, mixin(map), fill_with(map)

For other map functions: Convert to regular map first using state.to_map(), then use any map function:

// Convert state to use functions like to_logfmt(), to_kv(), etc.
print(state.to_map().to_logfmt());
let json_str = state.to_map().to_json();

Performance Considerations

Memory usage: State grows with unique keys. For deduplication of millions of IDs, consider:

// Periodic cleanup
if state.len() > 100000 {
    state.clear();
}

// Time-based expiration
if !state.contains("cleanup_time") {
    state["cleanup_time"] = now();
}
if (now() - state["cleanup_time"]).as_secs() > 3600 {
    // Remove old entries
    for key in state.keys() {
        if should_expire(state[key]) {
            state.remove(key);
        }
    }
    state["cleanup_time"] = now();
}

Sequential processing: State requires sequential mode, which processes one event at a time. For large files (100M+ events), consider:

  • Using track_*() functions with --parallel when possible
  • Filtering before stateful processing to reduce volume
  • Breaking into smaller files for distributed processing

Debugging State

Inspect state at any point:

# Print state size periodically
kelora -j logs.json \
  --exec 'state["count"] = (state["count"] ?? 0) + 1;
           if state["count"] % 1000 == 0 {
             eprint("State size: " + state.len() + " keys");
           }'

# Dump final state as JSON
kelora -j logs.json \
  --exec 'state[e.user] = true' \
  --end 'print(state.to_map().to_json())' -q > state_dump.json

# Check state contents in --end stage
kelora -j logs.json \
  --exec 'state[e.level] = (state.get(e.level) ?? 0) + 1' \
  --end 'eprint("Final state: " + state.to_map().to_kv())'

Parallel Mode Restriction

Accessing state in --parallel mode causes a runtime panic with a clear error message. State requires sequential processing to maintain consistency:

# This will fail:
kelora -j logs.jsonl --parallel \
  --exec 'state["count"] += 1'
# Error: 'state' is not available in --parallel mode

For parallel-safe tracking, use track_*() functions instead.

Example Use Cases

Deduplication - track seen IDs:

if !state.contains(e.request_id) {
    state[e.request_id] = true;
    // Process first occurrence
} else {
    // Skip duplicate
    e = ();
}

Store complex nested state:

if !state.contains(e.user) {
    state[e.user] = #{login_count: 0, last_seen: (), errors: []};
}
let user_data = state[e.user];
user_data.login_count += 1;
user_data.last_seen = e.timestamp;
if e.has("error") {
    user_data.errors.push(e.error);
}
state[e.user] = user_data;

State machines for protocol analysis:

if !state.contains(e.conn_id) {
    state[e.conn_id] = "NEW";
}
let current_state = state[e.conn_id];

// State transitions
if current_state == "NEW" && e.event == "SYN" {
    state[e.conn_id] = "SYN_SENT";
} else if current_state == "SYN_SENT" && e.event == "SYN_ACK" {
    state[e.conn_id] = "ESTABLISHED";
} else {
    e.protocol_error = true;  // Invalid transition
}
e.connection_state = state[e.conn_id];

See examples/state_examples.rhai for more patterns.

Span Hooks (--span-close)

span

  • Type: Span (custom Rhai type)
  • Binding available only in --span-close. Read-only.
Property Type Description
span.id String Unique span identifier (#0, #1, ... for count spans; 2024-01-15T10:00:00Z/5m for time spans).
span.start / span.end DateTime or () Span boundary timestamps (time spans only).
span.size Int Number of events that survived filters and entered the span.
span.events Array<Map> Copy of each included event, with helper fields (line, line_num, filename, span_status, span_id, span_start, span_end). Read-only.
span.metrics Map Per-span deltas computed from track_*() calls since the span opened; zero-delta keys are omitted. Read-only.

--span-close runs without a "current event", so e and meta are empty. Inspect span.events when you need per-event details from inside the close hook.

metrics vs span.metrics

  • span.metrics: Only the delta accumulated while the span was open. After the hook runs, Kelora resets the baseline for the next span; mutations are discarded. Read-only.
  • metrics: The global tracker map used across the entire run (the same map exposed to --end). Values persist between spans and reflect cumulative totals; update via track_*() calls, not direct assignment. Read-only.

Windowed Scripts

  • window: Present when using --window features. It is an array of event maps representing the sliding window, with window[0] being the current event followed by prior events retained by the window manager. Read-only.

Begin / End Hooks

  • --begin: Runs before any event is parsed. e and meta start empty; use this stage for initialization and configuration population (conf).
  • --end: Receives the final metrics map so you can report overall totals after processing completes.