Evaluator Engine

The evaluator engine executes magic rules against file buffers to identify file types. It provides safe, efficient rule evaluation with hierarchical processing, graceful error recovery, and configurable resource limits.

Overview

The evaluator processes magic rules hierarchically:

1. Load file into memory-mapped buffer
2. Resolve offsets (absolute, relative, from-end)
3. Read typed values from buffer with bounds checking
4. Apply operators for comparison
5. Process children if parent rule matches
6. Collect results with match metadata

Architecture

File Buffer → Offset Resolution → Type Reading → Operator Application → Results
     ↑              ↑                  ↑              ↑                    ↑
Memory Map    Context State      Endian Handling   Match Logic      Hierarchical

Core Components

EvaluationContext (`evaluator/mod.rs`)

Maintains state during rule processing:

pub struct EvaluationContext {
    /// Current offset position for relative calculations
    current_offset: usize,
    /// Current recursion depth for safety limits
    recursion_depth: u32,
    /// Configuration for evaluation behavior
    config: EvaluationConfig,
}

Note: Fields are private; use accessor methods like current_offset(), recursion_depth(), and config().

Key Methods:

new() - Create context with default configuration
with_config() - Create context with custom configuration
check_timeout() - Verify evaluation hasn’t exceeded time limit
increment_depth() / decrement_depth() - Track recursion safely
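
As a rough illustration of the depth-tracking methods listed above, the sketch below uses a stand-in `DepthTracker` type. The type, its fields, and the `DepthExceeded` error are assumptions made for this example and are not taken from the crate; only the method names `increment_depth` and `decrement_depth` come from the list above.

```rust
/// Hypothetical stand-in for the recursion-tracking part of an evaluation context.
#[derive(Debug)]
pub struct DepthTracker {
    depth: u32,
    max_depth: u32,
}

/// Hypothetical error returned when the nesting limit would be exceeded.
#[derive(Debug, PartialEq)]
pub struct DepthExceeded {
    pub depth: u32,
}

impl DepthTracker {
    pub fn new(max_depth: u32) -> Self {
        Self { depth: 0, max_depth }
    }

    pub fn depth(&self) -> u32 {
        self.depth
    }

    /// Enter one nesting level; fails instead of going past the limit.
    pub fn increment_depth(&mut self) -> Result<(), DepthExceeded> {
        if self.depth >= self.max_depth {
            return Err(DepthExceeded { depth: self.depth });
        }
        self.depth += 1;
        Ok(())
    }

    /// Leave one nesting level; saturates at zero rather than underflowing.
    pub fn decrement_depth(&mut self) {
        self.depth = self.depth.saturating_sub(1);
    }
}
```

Pairing every successful `increment_depth` with a `decrement_depth` after the children have been processed keeps the counter equal to the current nesting level.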

MatchResult (`evaluator/mod.rs`)

Represents a successful rule match:

pub struct MatchResult {
    /// Human-readable description from the matched rule
    pub message: String,
    /// Offset where the match occurred
    pub offset: usize,
    /// Depth in the rule hierarchy (0 = root rule)
    pub level: u32,
    /// The matched value (parsed according to rule type)
    pub value: Value,
}

The Value type is from parser::ast::Value and represents the actual matched content according to the rule’s type specification.

Offset Resolution (`evaluator/offset.rs`)

Handles all offset types safely:

Absolute offsets: Direct file positions (0, 0x100)
Relative offsets: Based on previous match positions (&+4)
From-end offsets: Calculated from file size (-4 from end)
Bounds checking: All offset calculations are validated

pub fn resolve_offset(
    spec: &OffsetSpec,
    buffer: &[u8],
    context: &EvaluationContext,
) -> Result<usize, EvaluationError>;
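
The three offset kinds and the bounds check can be sketched with overflow-checked arithmetic. `SimpleOffset` and `resolve_simple` below are hypothetical, simplified stand-ins for this example; they are not the crate's `OffsetSpec` or `resolve_offset`, and returning `Option` instead of `EvaluationError` is a simplification.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified offset specification (not the crate's `OffsetSpec`).
#[derive(Debug, Clone, Copy)]
pub enum SimpleOffset {
    /// Position counted from the start of the buffer.
    Absolute(usize),
    /// Signed delta applied to the previous match position.
    Relative(i64),
    /// Distance back from the end of the buffer.
    FromEnd(usize),
}

/// Resolve an offset to a valid index, or `None` if it falls outside the buffer.
pub fn resolve_simple(spec: SimpleOffset, buffer_len: usize, previous: usize) -> Option<usize> {
    let resolved = match spec {
        SimpleOffset::Absolute(pos) => pos,
        SimpleOffset::Relative(delta) => {
            // Do the signed addition in i64 so negative results are detected.
            let base = i64::try_from(previous).ok()?;
            let sum = base.checked_add(delta)?;
            usize::try_from(sum).ok()?
        }
        // `checked_sub` rejects distances larger than the buffer.
        SimpleOffset::FromEnd(back) => buffer_len.checked_sub(back)?,
    };
    // A resolved offset must point at an existing byte.
    if resolved < buffer_len {
        Some(resolved)
    } else {
        None
    }
}
```

For a 10-byte buffer, `FromEnd(4)` resolves to index 6, while `Relative(-3)` from a previous position of 2 is rejected because it would land before the start of the buffer.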

Type Reading (`evaluator/types.rs`)

Interprets bytes according to type specifications:

Byte: Single byte values
Short: 16-bit integers with endianness
Long: 32-bit integers with endianness
String: Byte sequences with length limits
Bounds checking: Prevents buffer overruns

pub fn read_type_value(
    buffer: &[u8],
    offset: usize,
    type_kind: &TypeKind,
) -> Result<TypeValue, TypeReadError>;
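
A minimal sketch of bounds-checked, endian-aware integer reads follows. The `Endian` enum and the `read_u16` / `read_u32` helpers are assumptions for illustration only; the crate's `TypeKind` and `TypeValue` are not reproduced here.

```rust
use std::convert::TryInto;

/// Hypothetical byte-order flag for this sketch.
#[derive(Debug, Clone, Copy)]
pub enum Endian {
    Little,
    Big,
}

/// Read a 16-bit value; returns `None` if fewer than 2 bytes remain at `offset`.
pub fn read_u16(buffer: &[u8], offset: usize, endian: Endian) -> Option<u16> {
    // `checked_add` guards against overflow when `offset` is near `usize::MAX`.
    let end = offset.checked_add(2)?;
    // `get` returns `None` instead of panicking on an out-of-range slice.
    let bytes: [u8; 2] = buffer.get(offset..end)?.try_into().ok()?;
    Some(match endian {
        Endian::Little => u16::from_le_bytes(bytes),
        Endian::Big => u16::from_be_bytes(bytes),
    })
}

/// Read a 32-bit value; returns `None` if fewer than 4 bytes remain at `offset`.
pub fn read_u32(buffer: &[u8], offset: usize, endian: Endian) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes: [u8; 4] = buffer.get(offset..end)?.try_into().ok()?;
    Some(match endian {
        Endian::Little => u32::from_le_bytes(bytes),
        Endian::Big => u32::from_be_bytes(bytes),
    })
}
```

Using `slice::get` with a range keeps the bounds check and the read in one step, so a short buffer produces `None` rather than a panic.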

Operator Application (`evaluator/operators.rs`)

Applies comparison operations:

Equal (=, ==): Exact value matching
NotEqual (!=, <>): Non-matching values
BitwiseAnd (&): Pattern matching for flags
BitwiseAndMask: AND with mask then compare

pub fn apply_operator(
    op: &Operator,
    actual: &TypeValue,
    expected: &Value,
) -> bool;
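
The four comparisons can be sketched over plain `u64` values. `SimpleOp` and `apply_simple` are hypothetical names for this example. The `BitwiseAnd` semantics shown here (every bit of the expected value must be set in the actual value) follow the `&` operator as described in the magic(5) manual and are an assumption about this crate, which may define the test differently.

```rust
/// Hypothetical, simplified operator set (not the crate's `Operator`).
#[derive(Debug, Clone, Copy)]
pub enum SimpleOp {
    Equal,
    NotEqual,
    /// Assumed semantics: all bits set in `expected` are also set in `actual`.
    BitwiseAnd,
    /// Mask `actual` first, then compare the result with `expected`.
    BitwiseAndMask(u64),
}

/// Compare the value read from the file (`actual`) with the rule's value (`expected`).
pub fn apply_simple(op: SimpleOp, actual: u64, expected: u64) -> bool {
    match op {
        SimpleOp::Equal => actual == expected,
        SimpleOp::NotEqual => actual != expected,
        SimpleOp::BitwiseAnd => (actual & expected) == expected,
        SimpleOp::BitwiseAndMask(mask) => (actual & mask) == expected,
    }
}
```

With a mask of `0xf0`, an actual value of `0xab` is reduced to `0xa0` before the comparison, so only the high nibble takes part in the match.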

Evaluation Algorithm

The evaluator uses a depth-first hierarchical algorithm:

pub fn evaluate_rules(
    rules: &[MagicRule],
    buffer: &[u8],
) -> Result<Vec<MatchResult>, EvaluationError>;

Algorithm:

1. For each root rule:
   - Resolve offset from buffer
   - Read value at offset according to type
   - Apply operator to compare actual vs expected
   - If match: add to results, recursively evaluate children
   - If no match: skip children, continue to next rule
2. Child rules inherit context from parent match
3. Results accumulate hierarchically (parent message + child details)
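
The depth-first walk described above can be sketched with a deliberately reduced rule type. `ByteRule` and `walk` are hypothetical: they match a single byte at an absolute offset, ignore `stop_at_first_match`, and record only the level and message, so they illustrate the traversal order rather than the crate's implementation.

```rust
/// Hypothetical, simplified rule: match one byte at an absolute offset.
pub struct ByteRule {
    pub offset: usize,
    pub expected: u8,
    pub message: &'static str,
    pub children: Vec<ByteRule>,
}

/// Depth-first walk: children are visited only when their parent matched.
pub fn walk(
    rules: &[ByteRule],
    buffer: &[u8],
    level: u32,
    out: &mut Vec<(u32, &'static str)>,
) {
    for rule in rules {
        // An out-of-range offset is treated as "no match" rather than an error.
        if buffer.get(rule.offset) == Some(&rule.expected) {
            out.push((level, rule.message));
            // Descend before moving on to the next sibling.
            walk(&rule.children, buffer, level + 1, out);
        }
        // No match: the children are skipped and the loop continues.
    }
}
```

Because the parent's entry is pushed before its children are visited, the output lists each parent message ahead of the child details that refine it.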

Hierarchical Processing

flowchart TD
    R[Root Rule<br/>e.g., "0 string \x7fELF"]
    R -->|match| C1[Child Rule 1<br/>e.g., ">4 byte 1"]
    R -->|match| C2[Child Rule 2<br/>e.g., ">4 byte 2"]
    C1 -->|match| G1[Result:<br/>ELF 32-bit]
    C2 -->|match| G2[Result:<br/>ELF 64-bit]

    style R fill:#e3f2fd
    style C1 fill:#fff3e0
    style C2 fill:#fff3e0
    style G1 fill:#c8e6c9
    style G2 fill:#c8e6c9

Configuration

Evaluation behavior is controlled via EvaluationConfig:

pub struct EvaluationConfig {
    /// Maximum recursion depth for nested rules (default: 20)
    pub max_recursion_depth: u32,
    /// Maximum string length to read (default: 8192)
    pub max_string_length: usize,
    /// Stop at the first match instead of collecting all matches (default: true)
    pub stop_at_first_match: bool,
    /// Enable MIME type mapping in results (default: false)
    pub enable_mime_types: bool,
    /// Timeout for evaluation in milliseconds (default: None)
    pub timeout_ms: Option<u64>,
}

Preset Configurations:

// Default balanced configuration
let config = EvaluationConfig::default();

// Optimized for speed
let config = EvaluationConfig::performance();

// Find all matches with full details
let config = EvaluationConfig::comprehensive();

Safety Features

Memory Safety

Bounds checking: All buffer access is validated before reading
Integer overflow protection: Safe arithmetic using checked_* and saturating_*
Resource limits: Configurable limits prevent resource exhaustion

Error Handling

The evaluator uses graceful degradation:

Invalid offsets: Skip rule, continue with others
Type mismatches: Skip rule, continue with others
Timeout exceeded: Return partial results collected so far
Recursion limit: Stop descent, continue siblings

pub enum EvaluationError {
    BufferOverrun { offset: usize },
    InvalidOffset { offset: i64 },
    UnsupportedType { type_name: String },
    RecursionLimitExceeded { depth: u32 },
    StringLengthExceeded { length: usize, max_length: usize },
    InvalidStringEncoding { offset: usize },
    Timeout { timeout_ms: u64 },
    TypeReadError(TypeReadError),
}
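
The degradation policy listed above (skip a failing rule, stop on timeout but keep partial results) might look roughly like the following. `SketchError` mirrors only three of the variants for brevity, and `collect_matches` is a hypothetical helper that receives per-rule outcomes as input; neither is the crate's code.

```rust
/// Hypothetical error type mirroring a few of the variants above.
#[derive(Debug)]
pub enum SketchError {
    BufferOverrun { offset: usize },
    UnsupportedType { type_name: String },
    Timeout { timeout_ms: u64 },
}

/// Collect messages from per-rule outcomes, skipping recoverable failures.
pub fn collect_matches(outcomes: Vec<Result<Option<String>, SketchError>>) -> Vec<String> {
    let mut matches = Vec::new();
    for outcome in outcomes {
        match outcome {
            // The rule matched: keep its message.
            Ok(Some(message)) => matches.push(message),
            // The rule evaluated cleanly but did not match.
            Ok(None) => {}
            // Timeout: stop and keep what has been collected so far.
            Err(SketchError::Timeout { .. }) => break,
            // Any other failure: skip this rule and continue with the rest.
            Err(_) => continue,
        }
    }
    matches
}
```

Treating the timeout as the only variant that ends the loop keeps one malformed rule from hiding matches produced by later rules.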

Timeout Protection

// With 5 second timeout
let config = EvaluationConfig {
    timeout_ms: Some(5000),
    ..Default::default()
};

let result = evaluate_rules_with_config(&rules, &buffer, config)?;
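
One way a `check_timeout`-style test could be built on `std::time::Instant` is sketched below. The `Deadline` type and its fields are assumptions for illustration; the crate's own bookkeeping is not shown in this guide and may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical deadline helper; a limit of `None` means no time limit.
pub struct Deadline {
    started: Instant,
    limit: Option<Duration>,
}

impl Deadline {
    /// Start the clock, converting an optional millisecond limit to a `Duration`.
    pub fn new(timeout_ms: Option<u64>) -> Self {
        Self {
            started: Instant::now(),
            limit: timeout_ms.map(Duration::from_millis),
        }
    }

    /// Returns `Err(elapsed)` once the configured limit has been reached.
    pub fn check_timeout(&self) -> Result<(), Duration> {
        let elapsed = self.started.elapsed();
        match self.limit {
            Some(limit) if elapsed >= limit => Err(elapsed),
            _ => Ok(()),
        }
    }
}
```

Calling such a check once per rule, rather than once per byte read, keeps the overhead small while still bounding the total evaluation time.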

API Reference

Primary Functions

/// Evaluate rules with default configuration
pub fn evaluate_rules(
    rules: &[MagicRule],
    buffer: &[u8],
) -> Result<Vec<MatchResult>, EvaluationError>;

/// Evaluate rules with custom configuration
pub fn evaluate_rules_with_config(
    rules: &[MagicRule],
    buffer: &[u8],
    config: EvaluationConfig,
) -> Result<Vec<MatchResult>, EvaluationError>;

/// Evaluate a single rule (used internally and for testing)
pub fn evaluate_single_rule(
    rule: &MagicRule,
    buffer: &[u8],
    context: &mut EvaluationContext,
) -> Result<Option<MatchResult>, EvaluationError>;

Usage Example

use libmagic_rs::evaluate_rules;
use libmagic_rs::parser::parse_text_magic_file;

// Assumes the crate's error types implement std::error::Error,
// so that `?` can convert them into Box<dyn Error>.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse magic rules
    let magic_content = r#"
0 string \x7fELF ELF executable
>4 byte 1 32-bit
>4 byte 2 64-bit
"#;
    let rules = parse_text_magic_file(magic_content)?;

    // Read target file
    let buffer = std::fs::read("sample.bin")?;

    // Evaluate with default config
    let matches = evaluate_rules(&rules, &buffer)?;

    for m in matches {
        println!("Match at offset {}: {}", m.offset, m.message);
    }
    Ok(())
}

Implementation Status

Basic evaluation engine structure
Offset resolution (absolute, relative, from-end)
Type reading with endianness support (Byte, Short, Long, String)
Operator application (Equal, NotEqual, BitwiseAnd)
Hierarchical rule processing with child evaluation
Error handling with graceful degradation
Timeout protection
Recursion depth limiting
Comprehensive test coverage (100+ tests)
Indirect offset support (pointer dereferencing)
Regex type support
Performance optimizations (rule ordering, caching)
