Approvals and Guardrails

OpenBox evaluates governed Mastra boundaries and returns verdicts that the SDK enforces at runtime.

Verdicts

| Verdict | Meaning | Runtime effect |
| --- | --- | --- |
| allow | Continue normally | Execution proceeds |
| constrain | Continue with advisory constraints | Execution proceeds with constraints available in the response |
| require_approval | Human review required | Execution suspends or polls for approval |
| block | Operation must not continue | Execution throws a stop-style error |
| halt | Workflow or agent run must stop | Execution throws a halt error |
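
The verdict table can be read as a simple mapping from verdict to runtime effect. The following is a minimal sketch of that mapping; the `Verdict` and `RuntimeEffect` types and both function names are hypothetical and are not taken from the OpenBox SDK.

```typescript
// Hypothetical names for illustration; the SDK's own types may differ.
type Verdict = "allow" | "constrain" | "require_approval" | "block" | "halt";

type RuntimeEffect =
  | "proceed"
  | "proceed_with_constraints"
  | "suspend_or_poll"
  | "throw_stop_error"
  | "throw_halt_error";

// Map each verdict to the runtime effect listed in the verdict table.
function runtimeEffect(verdict: Verdict): RuntimeEffect {
  switch (verdict) {
    case "allow":
      return "proceed";
    case "constrain":
      return "proceed_with_constraints";
    case "require_approval":
      return "suspend_or_poll";
    case "block":
      return "throw_stop_error";
    case "halt":
      return "throw_halt_error";
  }
}

// Only allow and constrain let execution continue without further action.
function canProceedImmediately(verdict: Verdict): boolean {
  return verdict === "allow" || verdict === "constrain";
}
```

Modeling the verdict as a string union lets the compiler flag any verdict that a `switch` forgets to handle.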

Enforcement Model

For governed activities:

  1. ActivityStarted is evaluated first
  2. Input-side guardrails may apply
  3. The tool or step executes
  4. ActivityCompleted is evaluated
  5. Output-side guardrails may apply
  6. Approval may be required on either side
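
The activity sequence above can be sketched as a wrapper that evaluates an event before and after the tool runs. This is an illustrative, synchronous simplification under stated assumptions: `runGovernedActivity`, `enforce`, the `evaluate` callback, and the `approve` callback are hypothetical stand-ins for whatever the SDK does internally, and guardrails are folded into the verdict for brevity.

```typescript
// Hypothetical types and helpers; not the SDK's actual API.
type Verdict = "allow" | "constrain" | "require_approval" | "block" | "halt";
type EventType = "ActivityStarted" | "ActivityCompleted";

interface GovernanceEvent {
  type: EventType;
  payload: unknown;
}

type Evaluate = (event: GovernanceEvent) => Verdict;

// Turn a verdict into control flow: stop, wait for approval, or continue.
function enforce(verdict: Verdict, approve: () => boolean): void {
  if (verdict === "block" || verdict === "halt") {
    throw new Error(`stopped by verdict: ${verdict}`);
  }
  if (verdict === "require_approval" && !approve()) {
    throw new Error("approval not granted");
  }
}

function runGovernedActivity<I, O>(
  input: I,
  execute: (input: I) => O,
  evaluate: Evaluate,
  approve: () => boolean,
): O {
  // Steps 1-2: evaluate the input side before anything runs.
  enforce(evaluate({ type: "ActivityStarted", payload: input }), approve);
  // Step 3: the tool or step executes.
  const output = execute(input);
  // Steps 4-5: evaluate the output side; approval can be required here too.
  enforce(evaluate({ type: "ActivityCompleted", payload: output }), approve);
  return output;
}
```

Because `enforce` runs on both sides, a non-allow verdict on ActivityStarted prevents the tool from running at all, while one on ActivityCompleted prevents the output from being returned.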

For workflows and agents:

  • WorkflowStarted can stop execution early
  • WorkflowCompleted can still be evaluated for policy and telemetry
  • WorkflowFailed records failure context

Important Live-Run Behavior

In a standard OpenBox deployment, policy evaluates before guardrails for a given event.

Operational consequence:

  • If policy returns a non-allow verdict such as require_approval, block, or halt, guardrails for that event may not run.
  • If a guardrail UI test passes but the live run shows no guardrail result, inspect the policy verdict first.

Guardrail Field Selection

For live activity guardrails, match on ActivityStarted whenever possible.

Recommended fields:

| Activity type | Field to check | Example use |
| --- | --- | --- |
| writeFile | input.content | Banned content or PII in file contents |
| writeFile | input.path | Path restrictions |
| runCommand | input.command | Banned shell commands |
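
To make the field paths concrete, the sketch below resolves a dotted path such as `input.command` against an ActivityStarted event and tests it against a banned pattern. The event shape, the `GuardrailRule` interface, and both functions are assumptions made for illustration; real guardrails are configured in OpenBox, not written this way.

```typescript
// Hypothetical event and rule shapes; the real payloads may differ.
interface ActivityStartedEvent {
  eventType: "ActivityStarted";
  activityType: string;
  input: Record<string, unknown>;
}

interface GuardrailRule {
  activityType: string;
  field: string; // dotted path, e.g. "input.command"
  banned: RegExp; // no "g" flag, so test() stays stateless
}

// Walk a dotted path through nested objects; undefined if any hop is missing.
function getField(event: ActivityStartedEvent, path: string): unknown {
  return path.split(".").reduce<unknown>((value, key) => {
    if (typeof value === "object" && value !== null) {
      return (value as Record<string, unknown>)[key];
    }
    return undefined;
  }, event);
}

// True when the rule applies to this activity and the field matches the pattern.
function violates(event: ActivityStartedEvent, rule: GuardrailRule): boolean {
  if (event.activityType !== rule.activityType) return false;
  const value = getField(event, rule.field);
  return typeof value === "string" && rule.banned.test(value);
}
```

For example, a rule with `activityType: "runCommand"`, `field: "input.command"`, and `banned: /\brm\s+-rf\b/` would flag a `runCommand` activity whose command is `rm -rf /tmp/x` and ignore `writeFile` activities entirely.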

Important:

  • Agent prompts are emitted as SignalReceived(user_input).
  • If your deployment only evaluates guardrails on activity events, those prompts are not inspected as activity inputs.

Approval Handling

When OpenBox returns require_approval, the SDK chooses the approval path based on execution context.

Workflow-Backed Execution

Preferred behavior:

  • approval state is created
  • the workflow suspends and relies on Mastra's resume behavior to continue later
  • later resume paths emit signals and continue after approval resolves

Non-Workflow Execution

Fallback behavior:

  • the SDK polls approval inline
  • execution continues only after approval is granted
  • timeout or rejection raises an approval error
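
The inline fallback can be pictured as a bounded polling loop. The sketch below is a synchronous simplification: `pollApproval`, `ApprovalStatus`, the `checkStatus` callback, and the generic `ApprovalError` class are all hypothetical, and a real implementation would wait between polls and use a time-based deadline rather than a poll count.

```typescript
// Hypothetical polling sketch; not the SDK's actual approval client.
type ApprovalStatus = "pending" | "approved" | "rejected";

class ApprovalError extends Error {
  readonly reason: "rejected" | "timeout";

  constructor(reason: "rejected" | "timeout") {
    super(`approval ${reason}`);
    this.name = "ApprovalError";
    this.reason = reason;
  }
}

// Returns normally only when approval is granted; throws otherwise.
function pollApproval(checkStatus: () => ApprovalStatus, maxPolls: number): void {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const status = checkStatus();
    if (status === "approved") return;
    if (status === "rejected") throw new ApprovalError("rejected");
    // Still pending: a real implementation would sleep here before retrying.
  }
  // The poll budget ran out while the approval was still pending.
  throw new ApprovalError("timeout");
}
```

Code placed after a call to `pollApproval` runs only on the approved path, which mirrors the rule that execution continues only after approval is granted.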

Output-Time Approval

Approval is not limited to the requested action. ActivityCompleted can also return require_approval, which is useful when policy needs to review the actual output rather than just the requested operation.

Runtime Errors You Should Expect

| Error | Meaning |
| --- | --- |
| GovernanceHaltError | OpenBox returned a stop or halt verdict, or fail-closed converted an API failure into a halt |
| GuardrailsValidationError | Guardrail validation failed |
| ApprovalPendingError | Approval is still pending or polling timed out |
| ApprovalRejectedError | Approval explicitly rejected the activity |
| ApprovalExpiredError | Approval expired before resolution |
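
Calling code will usually want to tell these errors apart. The sketch below classifies a caught value by its `name` property. It assumes the SDK's error classes set `name` to the class names listed in the table, which is not confirmed here; `classifyGovernanceError` and the `Handling` categories are hypothetical.

```typescript
// Hypothetical classifier; assumes error.name matches the class names in the table.
type Handling =
  | "stop_run"
  | "guardrail_failed"
  | "approval_pending"
  | "approval_rejected"
  | "approval_expired"
  | "unexpected";

function classifyGovernanceError(error: unknown): Handling {
  // Anything that is not an Error (a thrown string, for example) is unexpected.
  if (!(error instanceof Error)) return "unexpected";
  switch (error.name) {
    case "GovernanceHaltError":
      return "stop_run";
    case "GuardrailsValidationError":
      return "guardrail_failed";
    case "ApprovalPendingError":
      return "approval_pending";
    case "ApprovalRejectedError":
      return "approval_rejected";
    case "ApprovalExpiredError":
      return "approval_expired";
    default:
      return "unexpected";
  }
}
```

Matching on `name` rather than `instanceof` keeps the sketch free of SDK imports. If the SDK exports these classes, `instanceof` checks against them would be the more conventional choice.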

Production Recommendations

  1. Keep approval policy focused on business boundaries.
  2. Treat hook-triggered telemetry as internal by default.
  3. Test live guardrails only after confirming policy returns allow for that event.
  4. Use ActivityStarted selectors for tool-input guardrails.