Scenario Schema

Scenarios define test cases for workflows. They provide simulated events (mocking LLM and tool interactions) and expectations (assertions about what should happen).

Scenario

Scenario defines a complete test case for a workflow. A scenario provides simulated events that mock external interactions (LLM calls, tool executions) and expectations that verify the workflow behaves correctly.

Example - Basic LLM response test:

name: simple_response
description: Tests basic LLM response handling
events:
  - node: call_llm
    type: llm_response
    text: "Hello!"
expect:
  outcome: completed
  reached: [call_llm]

Example - Agent loop with multiple iterations:

name: agent_two_iterations
description: Agent uses a tool then completes
events:
  - node: agent_loop.call_llm
    type: llm_response
    tool_calls: [{name: bash, input: {command: ls}}]
  - node: agent_loop.execute_tools
    type: tool_result
    tool: bash
    output: {result: "file.txt"}
  - node: agent_loop.call_llm
    type: llm_response
    text: "Done!"
expect:
  outcome: completed
  reached: [agent_loop.call_llm, agent_loop.execute_tools]

Example - Partial testing with state:

name: test_from_middle
description: Test from a specific point with pre-populated state
start_at: process_result
state:
  call_llm:
    message: {role: assistant, content: "Previous response"}
events:
  - node: process_result
    output: {formatted: "Processed: Previous response"}
expect:
  outcome: completed

Fields

Field	Type	Required	Description
`name`	string	Yes	Name uniquely identifies this scenario.
`apiVersion`	string	No	ApiVersion specifies the schema version, for future compatibility.
`description`	string	No	Description documents what this scenario tests.
`events`	SimulatedEvent[]	Yes	Events lists the simulated events in execution order.
`expect`	Expectation	No	Expect defines the expected outcome and assertions.
`inputs`	object	No	Inputs overrides workflow inputs for this scenario.
`start_at`	string	No	StartAt begins execution at a specific node instead of the entry point.
`state`	map[string]object	No	State pre-populates node outputs, keyed by node ID, before execution.
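
None of the examples above sets `inputs`. A minimal sketch of a scenario that overrides workflow inputs might look like the following. The input keys `topic` and `max_results` are hypothetical and stand in for whatever inputs the workflow under test declares.

```yaml
name: custom_inputs
description: Overrides workflow inputs for this scenario only
inputs:
  # Hypothetical input names; use the inputs your workflow declares.
  topic: "release notes"
  max_results: 3
events:
  - node: call_llm
    type: llm_response
    text: "Here are the release notes."
expect:
  outcome: completed
  reached: [call_llm]
```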

SimulatedEvent

SimulatedEvent represents a single mocked event in a test scenario. Events target specific nodes and provide mock outputs. There are two modes:

Raw output mode: Set Output directly with the activity’s output structure
Typed mode: Set Type with typed fields (text, tool_calls, tool_output)

Typed mode is automatically converted to the appropriate output structure.

Example - LLM response with text:

- node: call_llm
  type: llm_response
  text: "Hello, how can I help you?"

Example - LLM response with tool calls:

- node: agent_loop.call_llm
  type: llm_response
  tool_calls:
    - name: bash
      input: {command: "ls -la"}

Example - Tool result:

- node: agent_loop.execute_tools
  type: tool_result
  tool: bash
  output: {result: "file1.txt\nfile2.txt"}

Example - Raw output mode:

- node: custom_node
  output:
    message:
      role: assistant
      content: "Custom response"

Fields

Field	Type	Required	Description
`node`	string	No	Node targets a specific node using dot-notation for qualified IDs.
`output`	object	No	Output is the raw mock output (mutually exclusive with Type).
`type`	string	No	Type specifies the event type for automatic conversion.
`text`	string	No	Text is the LLM response text (for type: llm_response).
`tool_calls`	SimToolCall[]	No	ToolCalls are tool invocations from the LLM (for type: llm_response).
`tool`	string	No	Tool is the tool name (for type: tool_result or tool_error).
`tool_output`	object	No	ToolOutput is the tool execution result (for type: tool_result).

SimToolCall

SimToolCall represents a tool call in a simulated LLM response.

Example:

tool_calls:
  - name: bash
    input: {command: "echo hello"}
  - name: search
    input: {query: "test results"}

Fields

Field	Type	Required	Description
`name`	string	Yes	Name is the tool name (e.g., “bash”, “search”, “edit”).
`input`	object	No	Input contains the tool’s input parameters.

Expectation

Expectation defines what to verify after a scenario runs. All fields are optional; specify only what you want to assert. Node references support qualified IDs for inner loop and workflow nodes.

Example - Assert completion and reached nodes:

expect:
  outcome: completed
  reached: [call_llm, process_result]
  not_reached: [error_handler]

Example - Assert error state:

expect:
  outcome: error
  error_contains: "rate limit exceeded"
  error_node: call_llm

Example - Assert specific output values:

expect:
  outcome: completed
  node_outputs:
    classify:
      category: "question"
      confidence: 0.95

Fields

Field	Type	Required	Description
`outcome`	string	No	Outcome specifies whether the workflow should complete or error.
`reached`	string[]	No	Reached lists nodes that must be scheduled during the scenario (completed, skipped, or errored).
`not_reached`	string[]	No	NotReached lists nodes that must NOT be scheduled during the scenario.
`completed`	string[]	No	Completed lists nodes that must have executed successfully.
`skipped`	string[]	No	Skipped lists nodes that must have been skipped due to a false condition.
`error_contains`	string	No	ErrorContains specifies a substring that should appear in the error message.
`error_node`	string	No	ErrorNode specifies which node should produce the error.
`node_outputs`	map[string]object	No	NodeOutputs specifies expected output values for specific nodes.
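
The `completed` and `skipped` fields do not appear in the examples above. The following is a sketch, assuming a workflow with a conditional branch; the node IDs `classify`, `answer_question`, and `escalate` are hypothetical.

```yaml
expect:
  outcome: completed
  # Nodes that must have executed successfully.
  completed: [classify, answer_question]
  # Node that must have been skipped because its condition was false.
  skipped: [escalate]
  # Per the field description, a skipped node still counts as scheduled,
  # so it may also be listed under reached.
  reached: [classify, answer_question, escalate]
```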

Targeting Nodes

The node field on events targets specific nodes by ID:

Top-level nodes: node: "call_llm"
Inner loop nodes: node: "loop_id.inner_node_id" (dot-separated)
Inner workflow nodes: node: "workflow_id.inner_node_id" (for type: workflow with inline:)
Nested structures: node: "outer.inner.node_id"
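
As an illustration of these forms, the events below target a top-level node, a node inside a loop, and a node nested two levels deep. Apart from `call_llm`, the node IDs (`review_loop`, `fix_workflow`, `apply_patch`) and the raw output shape are hypothetical.

```yaml
events:
  # Top-level node.
  - node: call_llm
    type: llm_response
    text: "Starting review"
  # Node inside a loop: loop_id.inner_node_id
  - node: review_loop.call_llm
    type: llm_response
    text: "Looks good"
  # Nested structure: outer.inner.node_id
  - node: review_loop.fix_workflow.apply_patch
    output: {applied: true}   # hypothetical raw output
```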

For inline loops and inline workflow nodes, the simulator executes each inner node individually, evaluates conditions, and tracks skipped nodes with their qualified IDs. For ref-based nodes, the default is black-box mocking with the ref name. If the runner can resolve the reference and the scenario targets qualified inner nodes, the simulator executes the referenced workflow internally instead.

Event matching:

Events with a node field are matched to that specific node
Events without a node field are consumed sequentially in order
Multiple events with the same node are consumed in order per-node (use for multi-iteration loops)
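
The per-node case is shown in the agent loop examples, where two events target `agent_loop.call_llm`. The untargeted case could be sketched as below. This assumes that each node needing a mock takes the next unconsumed event in list order, which is one reading of the rule above rather than confirmed behavior.

```yaml
name: sequential_events
description: Events without a node field are consumed in list order
events:
  # No node field: assumed to go to the first node that needs a mock.
  - type: llm_response
    text: "First reply"
  # Assumed to be consumed next, by whichever node runs afterwards.
  - type: llm_response
    text: "Second reply"
expect:
  outcome: completed
```

Targeting events explicitly with `node` is likely the safer choice when a workflow has branches, since the consumption order then does not depend on execution order.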

Event Types

Type	Description	Required Fields
`llm_response`	Simulate LLM returning text and/or tool_calls	`text` or `tool_calls`
`tool_result`	Simulate tool returning output	`tool`, `tool_output`
`tool_error`	Simulate tool error	`tool`, `tool_output` (with error)
`llm_error`	Simulate LLM error	`output` (with error structure)
`user_input`	Simulate user message	`text`
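
The three event types not covered by earlier examples might be written as follows. These are independent fragments, not one coherent scenario. The error payload shape (`error: ...`) is an assumption, since the table only says the payload carries an error, and the node ID `wait_for_user` is hypothetical.

```yaml
events:
  # tool_error: tool name plus a tool_output carrying the error.
  - node: agent_loop.execute_tools
    type: tool_error
    tool: bash
    tool_output: {error: "command not found"}   # assumed error shape
  # llm_error: output with an error structure.
  - node: call_llm
    type: llm_error
    output: {error: "rate limit exceeded"}      # assumed error shape
  # user_input: simulated user message.
  - node: wait_for_user                         # hypothetical node ID
    type: user_input
    text: "Please continue"
```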

Complete Example

name: agent_tool_usage
description: Tests agent loop with tool execution

events:
  # First iteration: LLM requests a tool
  - node: agent_loop.call_llm
    type: llm_response
    tool_calls:
      - name: bash
        input: {command: "ls -la"}
  
  # Tool returns result
  - node: agent_loop.execute_tools
    type: tool_result
    tool: bash
    output: {result: "file1.txt\nfile2.txt"}
  
  # Second iteration: LLM completes
  - node: agent_loop.call_llm
    type: llm_response
    text: "I found 2 files: file1.txt and file2.txt"

expect:
  outcome: completed
  reached:
    - agent_loop.call_llm
    - agent_loop.execute_tools
    - agent_loop.save_result
  not_reached:
    - error_handler
  node_outputs:
    agent_loop:
      iterations: 2