Function Reference¶

Complete reference for all 150+ built-in Rhai functions available in Kelora. Functions are organized by category for easy lookup.

Function Call Syntax

Rhai allows two styles: value.method(args) or function(value, args). Use whichever feels more natural.

String Functions - Text manipulation, parsing, encoding
Array Functions - Array operations, sorting, filtering
Map/Object Functions - Field access, manipulation, conversion
DateTime Functions - Time parsing, formatting, arithmetic
Math Functions - Numeric operations
Type Conversion - Safe type conversions
Utility Functions - Environment, files, pseudonyms
Tracking/Metrics - Counters, aggregations
File Output - Writing data to files
Event Manipulation - Field removal, fan-out
Span Context - Per-span metadata & rollups

String Functions¶

Extraction and Searching¶

`text.extract_regex(pattern [, group])`¶

Extract first regex match or capture group.

e.error_code = e.message.extract_regex(r"ERR-(\d+)", 1)  // "ERR-404" → "404"
e.full_match = e.line.extract_regex(r"\d{3}")            // First 3-digit number

`text.extract_regexes(pattern [, group])`¶

Extract all regex matches as array.

e.numbers = e.line.extract_regexes(r"\d+")             // All numbers
e.codes = e.message.extract_regexes(r"ERR-(\d+)", 1)   // All error codes

`text.extract_re_maps(pattern, field)`¶

Extract regex matches as array of maps for fan-out with emit_each().

// Extract all error codes with context
let errors = e.log.extract_re_maps(r"(?P<code>ERR-\d+): (?P<msg>[^\n]+)", "error");
emit_each(errors)  // Each match becomes an event with 'code' and 'msg' fields

`text.extract_ip([nth])`¶

Extract IP address from text (nth: 1=first, -1=last).

e.client_ip = e.headers.extract_ip()                  // First IP
e.origin_ip = e.forwarded.extract_ip(-1)              // Last IP

`text.extract_ips()`¶

Extract all IP addresses as array.

e.all_ips = e.headers.extract_ips()                   // ["192.168.1.1", "10.0.0.1"]

`text.extract_url([nth])`¶

Extract URL from text (nth: 1=first, -1=last).

e.link = e.message.extract_url()                      // First URL

`text.extract_domain()`¶

Extract domain from URL or email address.

e.domain = "https://api.example.com/path".extract_domain()  // "example.com"
e.mail_domain = "user@corp.example.com".extract_domain()    // "corp.example.com"

String Slicing and Position¶

`text.before(delimiter [, nth])`¶

Text before occurrence of delimiter (nth: 1=first, -1=last).

e.user = e.email.before("@")                          // "user@host.com" → "user"
e.path = e.url.before("?")                            // Strip query string

`text.after(delimiter [, nth])`¶

Text after occurrence of delimiter (nth: 1=first, -1=last).

e.extension = e.filename.after(".")                   // "file.txt" → "txt"
e.domain = e.email.after("@")                         // "user@host.com" → "host.com"

`text.between(start, end [, nth])`¶

Text between start and end delimiters (nth: 1=first, -1=last).

Note: text.between(start, end, nth) is equivalent to text.after(start, nth).before(end).

e.quoted = e.line.between('"', '"')                   // Extract quoted string
"[a][b][c]".between("[", "]", 2)                      // "b" - same as .after("[", 2).before("]")
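The equivalence in the note above can be modeled outside Kelora. The following is a minimal Python sketch of the described nth-occurrence semantics, not Kelora's implementation; returning an empty string when the delimiter is missing is an assumption made for illustration.

```python
def _find_nth(text: str, delim: str, nth: int) -> int:
    """Index of the nth occurrence of delim (1 = first, -1 = last), or -1 if absent."""
    if not delim or nth == 0:
        return -1
    positions = []
    start = 0
    while True:
        i = text.find(delim, start)
        if i == -1:
            break
        positions.append(i)
        start = i + len(delim)
    index = nth - 1 if nth > 0 else nth
    try:
        return positions[index]
    except IndexError:
        return -1


def after(text: str, delim: str, nth: int = 1) -> str:
    i = _find_nth(text, delim, nth)
    # Assumption: a missing delimiter yields an empty string.
    return "" if i == -1 else text[i + len(delim):]


def before(text: str, delim: str, nth: int = 1) -> str:
    i = _find_nth(text, delim, nth)
    return "" if i == -1 else text[:i]


def between(text: str, start: str, end: str, nth: int = 1) -> str:
    # Same composition as the note: after(start, nth), then before(end).
    return before(after(text, start, nth), end)


print(between("[a][b][c]", "[", "]", 2))  # b
```

Only the start delimiter honors nth here; the end delimiter is always the first one found after it, which is what the composition implies.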

`text.starting_with(prefix [, nth])`¶

Return substring from prefix to end (nth: 1=first, -1=last).

e.from_error = e.log.starting_with("ERROR:")          // "INFO: ok ERROR: bad" → "ERROR: bad"

`text.ending_with(suffix [, nth])`¶

Return substring from start to end of suffix (nth: 1=first, -1=last).

e.up_to_end = e.log.ending_with(".txt")               // "file.txt more" → "file.txt"

`text.slice(spec)`¶

Slice text using Python notation (e.g., "1:5", ":3", "-2:").

e.first_three = e.code.slice(":3")                    // "ABCDEF" → "ABC"
e.last_two = e.code.slice("-2:")                      // "ABCDEF" → "EF"
e.middle = e.code.slice("2:5")                        // "ABCDEF" → "CDE"
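Because the spec string follows Python's slice notation, the examples above can be checked directly in Python. This sketch assumes the spec always contains at least one colon; `slice_spec` is a hypothetical helper name, not a Kelora function.

```python
def slice_spec(value, spec: str):
    """Apply a spec such as ":3", "-2:" or "0::2" to a string or a list."""
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"unsupported slice spec: {spec!r}")
    # An empty part means "no bound", exactly as in value[start:stop:step].
    bounds = [int(p) if p.strip() else None for p in parts]
    return value[slice(*bounds)]


print(slice_spec("ABCDEF", ":3"))      # ABC
print(slice_spec("ABCDEF", "-2:"))     # EF
print(slice_spec("ABCDEF", "2:5"))     # CDE
print(slice_spec([9, 8, 7, 6], "0::2"))  # [9, 7]
```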

Column Extraction¶

`text.col(spec [, separator])`¶

Extract columns by index/range/list (e.g., '1', '1,3,5', '1:4').

e.first = e.line.col("1")                             // First column (1-indexed)
e.cols = e.line.col("1,3,5")                          // Columns 1, 3, 5
e.range = e.line.col("2:5", "\t")                     // Columns 2-5, tab-separated

Parsing Functions¶

`text.parse_json()`¶

Parse JSON string into map/array.

e.data = e.payload.parse_json()
e.value = e.data["key"]

`text.parse_logfmt()`¶

Parse logfmt line into structured fields.

let fields = e.line.parse_logfmt()
e.level = fields["level"]

`text.parse_syslog()`¶

Parse syslog line into structured fields.

let syslog = e.line.parse_syslog()
e.priority = syslog["priority"]
e.message = syslog["message"]

`text.parse_combined()`¶

Parse Apache/Nginx combined log line.

let access = e.line.parse_combined()
e.ip = access["ip"]
e.status = access["status"]

`text.parse_cef()`¶

Parse Common Event Format line into fields.

let cef = e.line.parse_cef()
e.severity = cef["severity"]

`text.parse_kv([sep [, kv_sep]])`¶

Parse key-value pairs from text. Only extracts tokens containing the key-value separator; tokens without the separator are skipped (e.g., prose words or unpaired values).

e.params = e.query.parse_kv("&", "=")                 // "a=1&b=2" → {a: "1", b: "2"}
e.fields = e.msg.parse_kv()                           // "Payment timeout order=1234" → {order: "1234"}
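The skip rule described above (tokens without the key-value separator are ignored) can be sketched in Python. This is an illustrative model only; quoting, escaping, and duplicate-key behavior in Kelora are not covered by this reference, so the sketch does not attempt them.

```python
def parse_kv(text: str, sep=None, kv_sep: str = "=") -> dict:
    """Collect key/value pairs, skipping tokens that lack kv_sep."""
    pairs = {}
    # sep=None splits on runs of whitespace, mirroring the default case.
    for token in text.split(sep):
        if kv_sep not in token:
            continue  # prose words and unpaired values are dropped
        key, value = token.split(kv_sep, 1)
        pairs[key] = value
    return pairs


print(parse_kv("a=1&b=2", "&", "="))           # {'a': '1', 'b': '2'}
print(parse_kv("Payment timeout order=1234"))  # {'order': '1234'}
```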

`text.parse_url()`¶

Parse URL into structured components.

let url = e.request.parse_url()
e.scheme = url["scheme"]
e.host = url["host"]
e.path = url["path"]

`text.parse_query_params()`¶

Parse URL query string into map.

e.params = e.query_string.parse_query_params()        // "a=1&b=2" → {a: "1", b: "2"}

`text.parse_email()`¶

Parse email address into parts.

let email = "User Name <user@example.com>".parse_email()
e.name = email["name"]       // "User Name"
e.address = email["address"] // "user@example.com"

`text.parse_user_agent()`¶

Parse common user-agent strings into components.

let ua = e.user_agent.parse_user_agent()
e.browser = ua["browser"]
e.os = ua["os"]

`text.parse_jwt()`¶

Parse JWT header/payload without verification.

let jwt = e.token.parse_jwt()
e.user_id = jwt["payload"]["sub"]

`text.parse_path()`¶

Parse filesystem path into components.

let path = "/var/log/app.log".parse_path()
e.dir = path["dir"]          // "/var/log"
e.file = path["file"]        // "app.log"

`text.parse_media_type()`¶

Parse media type tokens and parameters.

let mt = "text/html; charset=utf-8".parse_media_type()
e.type = mt["type"]          // "text"
e.subtype = mt["subtype"]    // "html"

`text.parse_content_disposition()`¶

Parse Content-Disposition header parameters.

let cd = e.header.parse_content_disposition()
e.filename = cd["filename"]

Encoding and Hashing¶

`text.encode_b64()` / `text.decode_b64()`¶

Base64 encoding/decoding.

e.encoded = e.data.encode_b64()
e.decoded = e.payload.decode_b64()

`text.encode_hex()` / `text.decode_hex()`¶

Hexadecimal encoding/decoding.

e.hex = e.bytes.encode_hex()
e.bytes = e.hex_string.decode_hex()

`text.encode_url()` / `text.decode_url()`¶

URL percent encoding/decoding.

e.encoded = e.param.encode_url()                      // "hello world" → "hello%20world"
e.decoded = e.url_param.decode_url()

`text.escape_json()` / `text.unescape_json()`¶

JSON escape sequence handling.

e.escaped = e.text.escape_json()
e.unescaped = e.json_string.unescape_json()

`text.escape_html()` / `text.unescape_html()`¶

HTML entity escaping/unescaping.

e.safe = e.user_input.escape_html()                   // "<script>" → "&lt;script&gt;"
e.text = e.html_entity.unescape_html()

`text.hash([algo])`¶

Hash with algorithm (default: sha256, also: xxh3).

e.checksum = e.content.hash()                         // SHA-256
e.fast = e.data.hash("xxh3")                          // Fast non-crypto hash

`text.bucket()`¶

Fast hash for sampling/grouping (returns INT for modulo operations).

// Sample roughly 10% of users; the same user_id always lands in the same bucket
if e.user_id.bucket() % 10 == 0 {
    e.sampled = true
}
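The sampling idiom above works because a stable hash maps each key to the same integer every time, so a modulo test selects a consistent subset of keys. The Python sketch below illustrates the idea with SHA-256 as a stand-in; the hash Kelora actually uses for bucket() is not stated here, so concrete values will differ.

```python
import hashlib


def bucket(text: str) -> int:
    """Stable integer hash of text (SHA-256 is an assumption, chosen for portability)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_sampled(key: str, modulus: int = 10) -> bool:
    # Roughly 1 in `modulus` keys pass, and a given key always gets the same answer.
    return bucket(key) % modulus == 0


user_ids = [f"user-{i}" for i in range(10_000)]
sampled = [u for u in user_ids if is_sampled(u)]
print(len(sampled) / len(user_ids))  # close to 0.1
```

Unlike rand() based sampling, this keeps or drops all events for a given key together.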

IP Address Functions¶

`text.is_ipv4()` / `text.is_ipv6()`¶

Check if text is a valid IPv4 or IPv6 address.

if e.addr.is_ipv4() {
    e.ip_version = 4
}

`text.is_private_ip()`¶

Check if IP is in private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).

if e.ip.is_private_ip() {
    e.internal = true
}

`text.is_in_cidr(cidr)`¶

Check if IP address is in CIDR network.

if e.ip.is_in_cidr("10.0.0.0/8") {
    e.corp_network = true
}
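For readers who want to sanity-check a CIDR range before putting it in a script, the same membership test can be reproduced with Python's standard ipaddress module. This is a sketch of the check, not Kelora's code; returning False for unparsable input is an assumption.

```python
import ipaddress

# The private IPv4 ranges listed under is_private_ip() above.
PRIVATE_V4 = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")


def is_in_cidr(ip: str, cidr: str) -> bool:
    try:
        return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)
    except ValueError:
        return False  # assumption: invalid input is treated as "not in network"


def is_private_ip(ip: str) -> bool:
    return any(is_in_cidr(ip, cidr) for cidr in PRIVATE_V4)


print(is_in_cidr("10.1.2.3", "10.0.0.0/8"))  # True
print(is_private_ip("172.32.0.1"))           # False, just outside 172.16.0.0/12
```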

`text.mask_ip([octets])`¶

Mask IP address (default: last octet).

e.masked_ip = e.client_ip.mask_ip()                   // "192.168.1.100" → "192.168.1.0"
e.partial = e.ip.mask_ip(2)                           // Mask last 2 octets
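Masking replaces trailing octets with zero, as the first example shows. A minimal Python sketch of that behavior for IPv4 follows; leaving non-IPv4 input unchanged and zero-filling two octets in the same way are assumptions.

```python
def mask_ip(ip: str, octets: int = 1) -> str:
    """Zero out the last `octets` octets of a dotted-quad IPv4 address."""
    parts = ip.split(".")
    valid = len(parts) == 4 and all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )
    if not valid:
        return ip  # assumption: anything that is not IPv4 passes through untouched
    octets = max(0, min(octets, 4))
    return ".".join(parts[: 4 - octets] + ["0"] * octets)


print(mask_ip("192.168.1.100"))     # 192.168.1.0
print(mask_ip("192.168.1.100", 2))  # 192.168.0.0
```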

Pattern Normalization¶

`text.normalized([patterns])`¶

Replace variable patterns with placeholders (e.g., <ipv4>, <email>).

Useful for identifying unique log patterns by normalizing variable data like IP addresses, UUIDs, and email addresses to fixed placeholders.

// Default patterns (IPs, emails, UUIDs, hashes, etc.)
e.pattern = e.message.normalized()
// "User user@test.com from 192.168.1.5" → "User <email> from <ipv4>"

// CSV-style pattern list
e.simple = e.message.normalized("ipv4,email")

// Array-style pattern list
e.custom = e.message.normalized(["uuid", "sha256", "url"])

Default patterns (when no argument provided): ipv4_port, ipv4, ipv6, email, url, fqdn, uuid, mac, md5, sha1, sha256, path, oauth, function, hexcolor, version

Available patterns (opt-in): hexnum, duration, num
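Conceptually, normalization is a sequence of regex substitutions, one per pattern name. The Python sketch below models three of the listed patterns; the regular expressions are simplified assumptions written for this illustration and are not the ones Kelora ships.

```python
import re

# Simplified, hypothetical regexes for three of the pattern names above.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "uuid": re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
    ),
}


def normalized(text: str, patterns=("email", "ipv4", "uuid")) -> str:
    """Replace each match of the selected patterns with a <name> placeholder."""
    if isinstance(patterns, str):  # CSV-style list, e.g. "ipv4,email"
        patterns = [p.strip() for p in patterns.split(",")]
    for name in patterns:
        text = PATTERNS[name].sub(f"<{name}>", text)
    return text


print(normalized("User user@test.com from 192.168.1.5"))
# User <email> from <ipv4>
```

Pattern order matters when patterns can overlap, which may be why the default list above puts ipv4_port before ipv4.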

Common use case - Pattern discovery:

# Recommended alias for easy pattern discovery
kelora --save-alias patterns \
  --exec 'track_unique("patterns", e.message.normalized())' \
  --metrics -q

# Usage
kelora -a patterns app.log

Output with many patterns:

patterns     (127 unique):
  User <email> from <ipv4>
  Request to <url> failed
  Error <uuid> occurred
  Connection <ipv4_port> established
  Processing <fqdn> with <sha256>
  [+122 more. Use --metrics-file or --end script for full list]

For custom analysis, access full data in --end scripts or --metrics-file.

String Manipulation¶

`text.strip([chars])` / `text.lstrip([chars])` / `text.rstrip([chars])`¶

Remove whitespace or specified characters.

e.clean = e.text.strip()                              // Remove leading/trailing whitespace
e.trimmed = e.line.lstrip("# ")                       // Remove "# " from left
e.path = e.filename.rstrip("/")                       // Remove trailing slashes

`text.clip()` / `text.lclip()` / `text.rclip()`¶

Remove non-alphanumeric characters from edges.

e.word = "'hello!'".clip()                            // → "hello"
e.left = "...start".lclip()                           // → "start"
e.right = "end...".rclip()                            // → "end"

`text.upper()` / `text.lower()`¶

Case conversion.

e.normalized = e.country_code.upper()                 // "us" → "US"
e.lowercase = e.name.lower()

`text.replace(pattern, replacement)`¶

Replace all occurrences of pattern.

e.cleaned = e.text.replace("ERROR", "WARN")

`text.split(separator)` / `text.split_re(pattern)`¶

Split string into array.

e.parts = e.path.split("/")
e.tokens = e.line.split_re(r"\s+")                    // Split on whitespace

String Testing¶

`text.contains(pattern)`¶

Check if text contains pattern.

if e.message.contains("timeout") {
    e.timeout_error = true
}

`text.like(pattern)`¶

Glob match (anchored) with * and ?.

if e.message.like("ERROR * timeout") {
    e.timeout_error = true
}

`text.ilike(pattern)`¶

Case-insensitive glob match with Unicode folding.

if e.message.ilike("*straße*") {
    e.locale = "de"
}

`text.matches(pattern)`¶

Regex search with cached compilation. Invalid patterns raise errors.

if e.path.matches(r"^/api/[^/]+/details$") {
    e.route = "details"
}

Text Matching Functions Comparison¶

| Function | Anchored | Errors on invalid pattern | Case handling | Use case |
|---|---|---|---|---|
| `like()` | Yes | N/A (glob syntax) | Exact | Simple wildcard matching |
| `ilike()` | Yes | N/A | Unicode fold | Case-insensitive glob |
| `matches()` | No | Yes | Regex-driven | Full regex search with caching |

⚠️ Regex performance tips: avoid nested quantifiers like (.*)*, prefer anchored patterns when possible, and reuse patterns to benefit from the per-thread cache.
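The anchored versus unanchored distinction is the most common source of surprises when switching between like() and matches(). It can be illustrated with Python's standard library, where fnmatch.fnmatchcase is an anchored glob and re.search is an unanchored regex search. This is an analogy, not Kelora's implementation; note that Python's fnmatch also treats [seq] as a character class, which the glob syntax above does not mention.

```python
import fnmatch
import re


def like(text: str, pattern: str) -> bool:
    # Anchored: the glob must cover the whole string.
    return fnmatch.fnmatchcase(text, pattern)


def matches(text: str, pattern: str) -> bool:
    # Unanchored: the regex may match anywhere; an invalid pattern raises re.error.
    return re.search(pattern, text) is not None


line = "upstream timeout after 30s"
print(like(line, "timeout"))     # False, glob is anchored at both ends
print(like(line, "*timeout*"))   # True
print(matches(line, "timeout"))  # True, regex search is unanchored
```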

`text.is_digit()`¶

Check if text contains only digits.

if e.status.is_digit() {
    e.status_code = e.status.to_int()
}

`text.count(pattern)`¶

Count occurrences of pattern in text.

e.error_count = e.log.count("ERROR")

`text.edit_distance(other)`¶

Compute Levenshtein edit distance between two strings.

if e.message.edit_distance("connection reset") <= 3 {
    e.is_connection_issue = true
}
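For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard dynamic-programming sketch in Python, useful for picking a threshold such as the 3 used above:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rolling rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))  # 3
```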

`text.index_of(substring [, start])`¶

Find 0-based position of literal substring (-1 if not found). Optional start parameter specifies where to begin searching.

e.at_pos = e.url.index_of("?")                        // Find first "?"
e.second = e.text.index_of("test", 10)                // Search starting at position 10

Array Functions¶

Sorting and Filtering¶

`array.sorted()`¶

Return new sorted array (numeric/lexicographic).

e.sorted_scores = sorted(e.scores)                    // [3, 1, 2] → [1, 2, 3]
e.sorted_names = sorted(e.names)                      // Alphabetical

`array.sorted_by(field)`¶

Sort array of objects by field name.

let sorted_users = sorted_by(e.users, "age")
e.oldest = sorted_users[-1]

`array.reversed()`¶

Return new array in reverse order.

e.reversed = reversed(e.items)

`array.slice(spec)`¶

Slice array using Python notation (e.g., "1:5", ":3", "-2:").

e.top_three = e.values.slice(":3")                   // [9, 8, 7, 6] → [9, 8, 7]
e.tail = e.values.slice("-2:")                       // [9, 8, 7, 6] → [7, 6]
e.every_other = e.values.slice("0::2")               // [9, 8, 7, 6] → [9, 7]

`array.unique()`¶

Remove all duplicate elements (preserves first occurrence).

e.unique_tags = unique(e.tags)                        // [1, 2, 1, 3] → [1, 2, 3]

`array.filter(|item| condition)`¶

Keep elements matching condition.

e.errors = e.logs.filter(|log| log.level == "ERROR")

Aggregation¶

`array.max()` / `array.min()`¶

Find maximum/minimum value in array.

e.max_score = e.scores.max()
e.min_time = e.times.min()

`array.percentile(pct)`¶

Calculate percentile of numeric array.

e.p95 = e.latencies.percentile(95)
e.median = e.values.percentile(50)

`array.reduce(|acc, item| expr, init)`¶

Aggregate array into single value.

e.total = e.amounts.reduce(|sum, x| sum + x, 0)

Transformation¶

`array.map(|item| expression)`¶

Transform each element.

e.doubled = e.numbers.map(|n| n * 2)
e.names = e.users.map(|u| u.name)

`array.pluck(field)` / `array.pluck_as_nums(field)`¶

Extract a single field from each element in an array of maps/objects, returning a new array of just those field values.

pluck(field) - Extract field values as-is, skipping elements where the field is missing or ().

pluck_as_nums(field) - Extract and convert field values to f64 numbers, skipping elements where conversion fails or the field is missing.

// Given array of event objects
let events = [
    #{status: 200, time: "1.5"},
    #{status: 404, time: "0.3"},
    #{status: 200, time: "2.1"}
]

// Extract field values
let statuses = events.pluck("status")        // [200, 404, 200]
let times = events.pluck_as_nums("time")     // [1.5, 0.3, 2.1] (converted to numbers)

// Compare to manual approach
let manual = events.map(|e| e.status)        // Same result, but errors if field missing

Common use cases:

// Calculate average response time
let times = events.pluck_as_nums("response_time")
let avg = times.reduce(|sum, x| sum + x, 0) / times.len()

// Find most common status codes
let codes = events.pluck("status")
for code in codes {
    track_count(code)
}

// With window for rolling analysis (requires --window)
let recent_times = window.pluck_as_nums("response_time")
e.avg_recent = recent_times.reduce(|sum, x| sum + x, 0) / recent_times.len()
e.spike = recent_times.filter(|t| t > 1000).len()

Why use pluck() vs map():

Safe: Automatically skips missing fields instead of erroring
Clear intent: Explicitly shows you're extracting one field
Type conversion: pluck_as_nums() handles string-to-number conversion
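The skipping behavior described above can be modeled in a few lines of Python, with None standing in for Rhai's unit value (). This is a sketch of the documented semantics only, not Kelora's code.

```python
def pluck(rows, field):
    """Field values as-is; rows where the field is missing or None are skipped."""
    return [
        row[field]
        for row in rows
        if isinstance(row, dict) and row.get(field) is not None
    ]


def pluck_as_nums(rows, field):
    """Field values converted to float; values that fail to convert are skipped."""
    numbers = []
    for value in pluck(rows, field):
        try:
            numbers.append(float(value))
        except (TypeError, ValueError):
            continue
    return numbers


events = [
    {"status": 200, "time": "1.5"},
    {"status": 404, "time": "0.3"},
    {"status": 200, "time": "2.1"},
]
print(pluck(events, "status"))       # [200, 404, 200]
print(pluck_as_nums(events, "time"))  # [1.5, 0.3, 2.1]
```

Because skipped rows shorten the result, the returned array can be empty; guard any division by its length.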

`array.flattened([style [, max_depth]])`¶

Flatten nested arrays/objects.

e.flat = [[1, 2], [3, 4]].flattened()                 // Returns flat map
e.fields = e.nested.flattened("dot", 2)               // Flatten to dot notation

Testing¶

`array.contains(value)`¶

Check if array contains value.

if e.roles.contains("admin") {
    e.is_admin = true
}

`array.contains_any(search_array)`¶

Check if array contains any search values.

if e.tags.contains_any(["error", "critical"]) {
    e.alert = true
}

`array.starts_with_any(search_array)`¶

Check if array starts with any search values.

if e.path_parts.starts_with_any(["/api", "/v1"]) {
    e.api_call = true
}

`array.all(|item| condition)` / `array.some(|item| condition)`¶

Check if all/any elements match condition.

e.all_valid = e.scores.all(|s| s >= 0)
e.has_errors = e.logs.some(|l| l.level == "ERROR")

Other Operations¶

`array.join(separator)`¶

Join array elements with separator.

e.path = e.parts.join("/")
e.csv = e.values.join(",")

`array.push(item)` / `array.pop()`¶

Add/remove items from array.

e.tags.push("new_tag")
let last = e.items.pop()

Map/Object Functions¶

Field Access¶

`map.get_path("field.path" [, default])`¶

Safe nested field access with fallback.

e.user_name = e.get_path("user.profile.name", "unknown")
e.score = e.get_path("stats.score", 0)

`map.has_path("field.path")`¶

Check if nested field path exists.

if e.has_path("error.details.code") {
    e.detailed_error = true
}

`map.path_equals("path", value)`¶

Safe nested field comparison.

if path_equals(e, "user.role", "admin") {
    e.elevated = true
}

`map.has("key")`¶

Check if map contains key with non-unit value.

if e.has("error_code") {
    // Field exists and has a value
}

Field Manipulation¶

`map.rename_field("old", "new")`¶

Rename a field; returns true if successful.

e.rename_field("old_name", "new_name")

`map.merge(other_map)`¶

Merge another map into this one (overwrites existing keys).

e.merge(#{status: "ok", timestamp: now()})

`map.enrich(other_map)`¶

Merge another map, inserting only missing keys (does not overwrite).

e.enrich(#{user: "default", level: "info"})  // Only adds if keys don't exist
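The difference between merge() and enrich() is only in how key conflicts are resolved. A Python sketch with plain dicts makes the contrast explicit; how Kelora treats keys whose existing value is () is not covered here.

```python
def merge(target: dict, other: dict) -> None:
    """Copy every key from other into target, overwriting existing keys."""
    target.update(other)


def enrich(target: dict, other: dict) -> None:
    """Copy only the keys that target does not already have."""
    for key, value in other.items():
        target.setdefault(key, value)


merged = {"user": "alice"}
merge(merged, {"user": "default", "level": "info"})
print(merged)    # {'user': 'default', 'level': 'info'}

enriched = {"user": "alice"}
enrich(enriched, {"user": "default", "level": "info"})
print(enriched)  # {'user': 'alice', 'level': 'info'}
```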

`map.flattened([style [, max_depth]])`¶

Flatten nested object to dot notation.

let flat = e.nested.flattened("dot")                  // {a: {b: 1}} → {"a.b": 1}
let flat = e.nested.flattened("dot", 2)               // With max depth

`map.flatten_field("field_name")`¶

Flatten just one specific field from the map.

let flat = e.flatten_field("metadata")                // Flattens only e.metadata

`map.unflatten([separator])`¶

Reconstruct nested object from flat keys.

let nested = e.flat.unflatten(".")                    // {"a.b": 1} → {a: {b: 1}}
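Flattening and unflattening are inverses for nested maps, which the following Python sketch demonstrates for dot notation. It models maps only; array handling, other styles, and max_depth are left out because their exact behavior is not specified in this reference.

```python
def flattened(obj: dict, sep: str = ".", prefix: str = "") -> dict:
    """Collapse nested dicts into a single level with joined key paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            flat.update(flattened(value, sep, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat: dict, sep: str = ".") -> dict:
    """Rebuild nested dicts from joined key paths."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


doc = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
print(flattened(doc))             # {'a.b': 1, 'a.c.d': 2, 'e': 3}
print(unflatten(flattened(doc)))  # {'a': {'b': 1, 'c': {'d': 2}}, 'e': 3}
```

The round trip only holds when no original key contains the separator itself.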

Format Conversion¶

`map.to_json([pretty])`¶

Convert map to JSON string.

e.payload = e.data.to_json()
e.readable = e.data.to_json(true)                     // Pretty-printed

`map.to_logfmt()`¶

Convert map to logfmt format string.

e.formatted = e.fields.to_logfmt()                    // {a: 1, b: 2} → "a=1 b=2"

`map.to_kv([sep [, kv_sep]])`¶

Convert map to key-value string with separators.

e.query = e.params.to_kv("&", "=")                    // {a: 1, b: 2} → "a=1&b=2"

`map.to_syslog()` / `map.to_cef()` / `map.to_combined()`¶

Convert map to specific log format.

e.syslog_line = e.fields.to_syslog()
e.cef_line = e.security_event.to_cef()
e.access_log = e.request.to_combined()

DateTime Functions¶

Creation¶

`now()`¶

Current timestamp (UTC).

e.timestamp = now()

`to_datetime(text [, fmt [, tz]])`¶

Convert string into datetime value with optional hints.

e.parsed = to_datetime("2024-01-15 10:30:00", "%Y-%m-%d %H:%M:%S", "UTC")
e.auto = to_datetime("2024-01-15T10:30:00Z")          // Auto-detect format

`to_duration("1h30m")`¶

Convert duration string into duration value.

let timeout = to_duration("5m")
e.deadline = now() + timeout

`duration_from_seconds(n)`, `duration_from_minutes(n)`, etc.¶

Create duration from specific units.

let hour = duration_from_hours(1)
let day = duration_from_days(1)

Formatting¶

`dt.to_iso()`¶

Convert datetime to ISO 8601 string.

e.iso_timestamp = e.timestamp.to_iso()                // "2024-01-15T10:30:00Z"

`dt.format("format_string")`¶

Format datetime using custom format string (see --help-time).

e.date = e.timestamp.format("%Y-%m-%d")               // "2024-01-15"
e.time = e.timestamp.format("%H:%M:%S")               // "10:30:00"

Component Extraction¶

`dt.year()`, `dt.month()`, `dt.day()`¶

Extract date components.

e.year = e.timestamp.year()
e.month = e.timestamp.month()
e.day = e.timestamp.day()

`dt.hour()`, `dt.minute()`, `dt.second()`¶

Extract time components.

e.hour = e.timestamp.hour()

Timezone Conversion¶

`dt.to_utc()` / `dt.to_local()`¶

Convert timezone.

e.utc_time = e.local_timestamp.to_utc()
e.local_time = e.utc_timestamp.to_local()

`dt.to_timezone("tz_name")`¶

Convert to named timezone.

e.ny_time = e.timestamp.to_timezone("America/New_York")

`dt.timezone_name()`¶

Get timezone name as string.

e.tz = e.timestamp.timezone_name()                    // "UTC"

Arithmetic and Comparison¶

`dt + duration`, `dt - duration`¶

Add/subtract duration from datetime.

e.future = now() + duration_from_hours(1)
e.past = now() - duration_from_days(7)

`dt1 - dt2`¶

Get duration between datetimes.

let elapsed = now() - e.start_time
e.duration_ms = elapsed.as_milliseconds()

`dt1 == dt2`, `dt1 > dt2`, etc.¶

Compare datetimes.

if e.timestamp > to_datetime("2024-01-01") {
    e.this_year = true
}

Duration Operations¶

`duration.as_seconds()`, `duration.as_milliseconds()`, etc.¶

Convert duration to specific units.

e.seconds = duration.as_seconds()
e.ms = duration.as_milliseconds()
e.hours = duration.as_hours()

`duration.to_string()` / `humanize_duration(ms)`¶

Format duration as human-readable string.

e.readable = duration.to_string()                     // "1h 30m"
e.humanized = humanize_duration(5400000)              // "1h 30m"

Math Functions¶

`abs(x)`¶

Absolute value of number.

e.magnitude = abs(e.value)

`clamp(value, min, max)`¶

Constrain value to be within min/max range.

e.bounded = clamp(e.score, 0, 100)

`floor(x)` / `round(x)`¶

Rounding operations.

e.floored = floor(e.value)
e.rounded = round(e.value)

`mod(a, b)` / `a % b`¶

Modulo operation with division-by-zero protection.

e.bucket = e.id % 10

`rand()` / `rand_int(min, max)`¶

Random number generation.

if rand() < 0.1 {                                     // 10% sampling
    e.sampled = true
}
e.random_id = rand_int(1000, 9999)

Type Conversion Functions¶

`to_int(value)` / `to_float(value)` / `to_bool(value)`¶

Convert value to type (returns () on error).

e.status = to_int(e.status_string)
e.score = to_float(e.score_string)

`to_int_or(value, default)` / `to_float_or(value, default)` / `to_bool_or(value, default)`¶

Convert value to type with fallback.

e.status = e.status_string.to_int_or(0)
e.score = e.score_string.to_float_or(0.0)

`value.or_empty()`¶

Convert empty values to Unit () for removal/filtering.

Converts conceptually "empty" values to Unit, which:

Removes the field when assigned (e.g., e.field = value.or_empty())
Gets skipped by track_*() functions
Works with missing fields (passes Unit through unchanged)

Supported empty values:

Empty string: "" → ()
Empty array: [] → ()
Empty map: #{} → ()
Unit itself: () → () (pass-through)
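The rules above amount to a small mapping from "empty" values to unit. In the Python sketch below, None stands in for () and `assign` is a hypothetical helper that mimics how assigning () removes a field. Values not in the list, such as 0 or false, pass through unchanged.

```python
def or_empty(value):
    """Map "", [], {} and None to None; return everything else unchanged."""
    if value is None:
        return None
    if isinstance(value, (str, list, dict)) and len(value) == 0:
        return None
    return value


def assign(event: dict, field: str, value) -> None:
    """Hypothetical helper: assigning unit removes the field from the event."""
    if value is None:
        event.pop(field, None)
    else:
        event[field] = value


event = {"tags": [], "name": "alice"}
assign(event, "tags", or_empty(event["tags"]))
assign(event, "name", or_empty(event["name"]))
print(event)  # {'name': 'alice'}
```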

String extraction:

// Extract only when prefix exists, otherwise remove field
e.name = e.message.after("prefix:").or_empty()

// Track only non-empty values
track_unique("names", e.extracted.or_empty())

Array filtering:

// Only assign tags if array is non-empty
e.tags = e.tags.or_empty()  // [] becomes (), field removed

// Bucket events by item count, then drop the field when the array is empty
track_bucket("item_count", e.items.len())
e.items = e.items.or_empty()  // [] becomes (), field removed

Map filtering:

// Only keep non-empty metadata
e.metadata = e.payload.parse_json().or_empty()  // #{} becomes (), field removed

// Safe chaining with missing fields
e.optional = e.maybe_field.or_empty()  // Works even if maybe_field is ()

Common pattern - conditional extraction and tracking:

e.extracted = e.message.after("User:").or_empty()
track_unique("users", e.extracted)  // Only tracks when extraction succeeds

// Filter events with no data
e.results = e.search_results.or_empty()
track_unique("result_sets", e.results)  // Skips empty arrays and ()

Utility Functions¶

`get_env(var [, default])`¶

Get environment variable with optional default.

e.branch = get_env("CI_BRANCH", "main")
e.build_id = get_env("BUILD_ID")

`pseudonym(value, domain)`¶

Generate domain-separated pseudonym (requires KELORA_SECRET).

e.user_alias = pseudonym(e.username, "users")
e.ip_alias = pseudonym(e.client_ip, "ips")

`read_file(path)` / `read_lines(path)`¶

Read file contents.

e.config = read_file("config.json")
e.lines = read_lines("data.txt")

`print(message)` / `eprint(message)`¶

Print to stdout/stderr (suppressed with --no-script-output or data-only modes).

print("Processing event: " + e.id)
eprint("Warning: " + e.error)

`exit(code)`¶

Exit kelora with given exit code.

if e.critical {
    exit(1)
}

`skip()`¶

Skip the current event, mark it as filtered, and continue with the next one. Downstream stages and output for the skipped event do not run.

if e.endpoint == "/health" {
    skip();
}

`type_of(value)`¶

Get type name as string.

e.value_type = type_of(e.value)                       // "string", "int", "array", etc.

`window.pluck(field)` / `window.pluck_as_nums(field)`¶

Extract field values from the sliding window array (requires --window). See array.pluck() for detailed documentation.

The window variable is an array containing the N most recent events, making pluck() especially useful for rolling calculations and burst detection.

// Rolling average of response times
let recent_times = window.pluck_as_nums("response_time")
e.avg_recent = recent_times.reduce(|sum, x| sum + x, 0) / recent_times.len()

// Detect error bursts
let recent_statuses = window.pluck("status")
e.error_burst = recent_statuses.filter(|s| s >= 500).len() >= 3

// Compare current vs recent average
let recent_vals = window.pluck_as_nums("value")
e.spike = e.value > (recent_vals.reduce(|s, x| s + x, 0) / recent_vals.len()) * 2

Tracking/Metrics Functions¶

All tracking functions require the --metrics flag.

Unit Value Handling

All track_*() functions that accept values silently skip Unit () values. This enables safe tracking of optional or extracted fields without needing conditional checks.

Tracking Functions¶

`track_avg(key, value)`¶

Track average of numeric values for key. Automatically computes the average during output. Skips Unit () values. Works correctly in parallel mode.

track_avg("avg_latency", e.response_time)
track_avg(e.endpoint, e.duration_ms)

// Safe with conversions that may fail
let latency = e.latency_str.to_float()  // Returns () on error
track_avg("avg_ms", latency)            // Skips () values

`track_count(key)`¶

Increment counter for key by 1.

track_count(e.service)                                // Count by service
track_count("total")                                  // Global counter

`track_sum(key, value)`¶

Accumulate numeric values for key. Skips Unit () values.

track_sum("total_bytes", e.bytes)
track_sum(e.endpoint, e.response_time)

// Safe with conversions that may fail
let score = e.score_str.to_int()  // Returns () on error
track_sum("total_score", score)   // Skips () values

`track_min(key, value)` / `track_max(key, value)`¶

Track minimum/maximum value for key. Skips Unit () values.

track_min("fastest", e.response_time)
track_max("slowest", e.response_time)

`track_unique(key, value)`¶

Track unique values for key. Skips Unit () values.

track_unique("users", e.user_id)
track_unique("ips", e.client_ip)

// Combined with .or_empty() for conditional tracking
track_unique("names", e.message.after("User:").or_empty())

`track_bucket(key, bucket)`¶

Track values in buckets for histograms. Skips Unit () values.

let bucket = floor(e.response_time / 100) * 100
track_bucket("latency", bucket)

// Safe with optional fields
track_bucket("user_types", e.user_type.or_empty())  // Skips empty/missing

`track_top(key, item, n)` / `track_top(key, item, n, value)`¶

Track top N most frequent items (count mode) or highest-valued items (weighted mode). Skips Unit () values.

Count mode tracks the N items that appear most frequently:

// Track top 10 most common errors
track_top("common_errors", e.error_type, 10)

// Track top 5 most active users
track_top("active_users", e.user_id, 5)

Weighted mode tracks the N items with the highest custom values:

// Track top 10 slowest endpoints by latency
track_top("slowest_endpoints", e.endpoint, 10, e.latency_ms)

// Track top 5 biggest requests by bytes
track_top("heavy_requests", e.request_id, 5, e.bytes)

// Handles missing values gracefully
track_top("cpu_hogs", e.process, 10, e.cpu_time.or_empty())  // Skips ()

Output format:

Count mode: [{key: "item", count: 42}, ...]
Weighted mode: [{key: "item", value: 123.4}, ...]
Results are sorted by value descending, then alphabetically by key
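The count-mode ordering (value descending, then key ascending) can be reproduced with a two-part sort key. This Python sketch counts every item and trims afterwards, which is simpler than the bounded-memory tracking Kelora describes; it only illustrates the output shape and tie-breaking, and `top_by_count` is a hypothetical name.

```python
from collections import Counter


def top_by_count(items, n: int):
    """Top n most frequent items; None (standing in for unit) is skipped."""
    counts = Counter(item for item in items if item is not None)
    # Sort by count descending, then alphabetically by key to break ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "count": count} for key, count in ranked[:n]]


errors = ["timeout", "auth", "timeout", "disk", "auth", None, "dns"]
print(top_by_count(errors, 2))
# [{'key': 'auth', 'count': 2}, {'key': 'timeout', 'count': 2}]
```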

`track_bottom(key, item, n)` / `track_bottom(key, item, n, value)`¶

Track bottom N least frequent items (count mode) or lowest-valued items (weighted mode). Skips Unit () values.

Count mode tracks the N items that appear least frequently:

// Track bottom 5 rarest errors
track_bottom("rare_errors", e.error_type, 5)

// Track least active users
track_bottom("inactive_users", e.user_id, 10)

Weighted mode tracks the N items with the lowest custom values:

// Track 10 fastest endpoints by latency
track_bottom("fastest_endpoints", e.endpoint, 10, e.latency_ms)

// Track smallest requests
track_bottom("tiny_requests", e.request_id, 5, e.bytes)

Output format:

Count mode: [{key: "item", count: 1}, ...]
Weighted mode: [{key: "item", value: 0.5}, ...]
Results are sorted by value ascending, then alphabetically by key

Memory Efficiency

track_top() and track_bottom() use bounded memory (O(N) per key), unlike track_bucket(), which stores all unique values. For high-cardinality fields, prefer top/bottom tracking over bucketing.

Parallel Mode Behavior

In parallel mode, each worker maintains its own top/bottom N. During merge, the lists are combined, re-sorted, and trimmed to N. Final results are deterministic.

File Output Functions¶

All file output functions require the --allow-fs-writes flag.

`append_file(path, text_or_array)`¶

Append line(s) to file; arrays append one line per element.

append_file("errors.log", e.message)
append_file("batch.log", [e.line1, e.line2, e.line3])

`truncate_file(path)`¶

Create or zero-length a file for fresh output.

truncate_file("output.log")

`mkdir(path [, recursive])`¶

Create directory (set recursive=true to create parents).

mkdir("logs")
mkdir("deep/nested/path", true)

Event Manipulation¶

`emit_each(array [, base_map])`¶

Fan out array elements as separate events (returns emitted count).

emit_each(e.users)                                    // Each user becomes an event
emit_each(e.items, #{batch_id: e.batch_id})           // Add batch_id to each

// Use return value to track emission count
let count = emit_each(e.batch_items, #{batch_id: e.id})
track_sum("items_emitted", count)

`e = ()`¶

Clear entire event (remove all fields).

if e.should_drop {
    e = ()  // Event is filtered out
}

`e.field = ()`¶

Remove individual field from event.

e.password = ()                                       // Remove sensitive field
e.temp_data = ()                                      // Clean up temporary field

`e.absorb_kv(field [, options])`¶

Parse inline key=value tokens from a string field, merge the pairs into the event, and get a status report back. Returns a map with status, data, written, remainder, removed_source, and error so scripts can branch without guessing.

let res = e.absorb_kv("msg", #{ sep: ",", kv_sep: "=", keep_source: true });
if res.status == "applied" {
    e.cleaned_msg = res.remainder ?? "";
    // Parsed keys now live on the event; res.data mirrors the inserted pairs
}

Options:

sep: string or () (default whitespace) – token separator; () normalizes whitespace.
kv_sep: string (default "=") – separator between key and value.
keep_source: bool (default false) – leave the original field untouched; use remainder for cleaned text.
overwrite: bool (default true) – allow parsed keys to overwrite existing event fields; set false to skip conflicts.

Unknown option keys set status = "invalid_option"; in --strict mode this aborts the pipeline.
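
As an illustration of the options and the returned status map, the sketch below protects existing fields and counts every outcome that was not applied. It assumes a hypothetical `msg` field such as `login ok user=alice ip=10.0.0.5`; the `msg_text` field and the `absorb_kv_` counter prefix are made up for the example.

```rhai
// overwrite: false skips parsed keys that already exist on the event.
let res = e.absorb_kv("msg", #{ overwrite: false });

if res.status == "applied" {
    // Keep whatever text was left once the key=value tokens were taken out.
    e.msg_text = res.remainder ?? "";
} else {
    // One counter per non-applied status; res.error carries the detail.
    track_count("absorb_kv_" + res.status);
}
```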

`e.absorb_json(field [, options])`¶

Parse a JSON object from a string field, merge its keys into the event, and return the same status map as absorb_kv(). On success the source field is deleted unless keep_source is true, and remainder is always ().

let res = e.absorb_json("payload");
if res.status == "applied" {
    e.actor = e.actor ?? e.user;      // merged from payload
} else if res.status == "parse_error" {
    warn(`bad payload: ${res.error}`);
}

Options:

keep_source: bool (default false) – keep the original JSON string instead of deleting the field.
overwrite: bool (default true) – allow parsed keys to replace existing event fields (false skips conflicts).

Other absorb options (like sep) are accepted for consistency but ignored. JSON parsing is all-or-nothing: invalid JSON or non-object payloads set status = "parse_error" and leave the event untouched.
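
Both options can be combined. A minimal sketch, assuming a hypothetical `payload` field that holds the JSON string and a made-up `payload_error` field for recording failures:

```rhai
// Keep the raw JSON string and never replace fields already on the event.
let res = e.absorb_json("payload", #{ keep_source: true, overwrite: false });

if res.status == "parse_error" {
    // Nothing was merged; record why so the event can be inspected later.
    e.payload_error = res.error;
}
```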

Span Context – `--span-close` Only¶

A read-only span object is injected into scope whenever a --span-close script runs. Use it to emit per-span rollups after Kelora closes a count- or time-based window.

Span Identity¶

span.id returns the current span identifier. Count-based spans use #<index> (zero-based). Time-based spans use ISO_START/DURATION (e.g. 2024-05-19T12:00:00Z/5m).

let id = span.id;  // "#0" or "2024-05-19T12:05:00Z/5m"

Span Boundaries¶

span.start and span.end expose the half-open window bounds as DateTime values. Count-based spans return () for both fields.

if span.start != () {
    print(`Window: ${span.start} → ${span.end}`);
}

Span Size and Events¶

span.size reports how many events survived filters and were buffered in the span. span.events returns those events in arrival order. Each map includes span metadata fields (span_status, span_id, span_start, span_end) alongside the original event data.

let included = span.events
    .filter(|evt| evt.span_status == "included")
    .len();
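
The same pattern works for filtering on the original event data. The sketch below is one possible per-span rollup for a --span-close script; the `level` field is an assumption about the events, not something Kelora adds.

```rhai
// Count buffered events whose (assumed) `level` field is "ERROR".
// Events without a `level` field simply do not match.
let errors = span.events
    .filter(|evt| evt.level == "ERROR")
    .len();

print(`${span.id}: ${errors} of ${span.size} events were errors`);
```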

Metrics Snapshot¶

span.metrics contains per-span deltas from track_* calls. Values reset automatically after each span closes, so you can emit per-span summaries without manual bookkeeping.

let metrics = span.metrics;
let hits = metrics["events"] ?? 0;          // from track_count("events"); 0 if not tracked in this span
let failures = metrics["failures"] ?? 0;    // from track_count("failures"); 0 if not tracked in this span
let ratio = if hits > 0 { failures * 100 / hits } else { 0 };
print(span.id + ": " + ratio.to_string() + "% failure rate");

Quick Reference by Use Case¶

Error Extraction:

e.error_code = e.message.extract_regex(r"ERR-(\d+)", 1)

IP Anonymization:

e.masked_ip = e.client_ip.mask_ip()
e.ip_alias = pseudonym(e.client_ip, "ips")

Time Filtering:

if e.timestamp > to_datetime("2024-01-01") {
    // Process recent events
}

Metrics Tracking:

track_count(e.service)
track_sum("bytes", e.response_size)
track_unique("users", e.user_id)

Array Fan-Out:

emit_each(e.users, #{batch_id: e.batch_id})

Safe Field Access:

e.user_name = e.get_path("user.profile.name", "unknown")
if e.has_path("error.details.code") {
    e.detailed = true
}
