Troubleshooting
Use this page to diagnose the most common startup, runtime, and UI interpretation issues with the OpenBox LangGraph SDK.
Startup Validation Fails
Typical causes:
- OPENBOX_URL is missing
- OPENBOX_API_KEY is missing or malformed
- the URL is not HTTPS outside localhost development
- only one of OPENBOX_AGENT_DID or OPENBOX_AGENT_PRIVATE_KEY is set
- the API key validation call fails
What to do:
- Verify OPENBOX_URL and OPENBOX_API_KEY.
- If Require signing is enabled, verify both OPENBOX_AGENT_DID and OPENBOX_AGENT_PRIVATE_KEY.
- Confirm the service can reach OpenBox Core.
- Use validate=False only for local mocks or tests.
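Several of the causes above can be ruled out before the SDK starts. The following is a minimal sketch of a preflight check, not part of the SDK: the function name preflight_problems and the set of hosts treated as local are assumptions made for illustration. It does not check whether the API key is malformed, because the key format is not described here, and it does not call OpenBox.

```python
from urllib.parse import urlparse

# Hosts treated as "localhost development" (an assumption for this sketch).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def preflight_problems(env):
    """Return a list of configuration problems found in env (empty if none)."""
    problems = []

    url = (env.get("OPENBOX_URL") or "").strip()
    api_key = (env.get("OPENBOX_API_KEY") or "").strip()
    did = (env.get("OPENBOX_AGENT_DID") or "").strip()
    private_key = (env.get("OPENBOX_AGENT_PRIVATE_KEY") or "").strip()

    if not url:
        problems.append("OPENBOX_URL is missing")
    else:
        parsed = urlparse(url)
        # Plain HTTP is only acceptable for localhost development.
        if parsed.scheme != "https" and parsed.hostname not in LOCAL_HOSTS:
            problems.append("OPENBOX_URL must use HTTPS outside localhost")

    if not api_key:
        problems.append("OPENBOX_API_KEY is missing")

    # The two identity variables are only meaningful as a pair.
    if bool(did) != bool(private_key):
        problems.append(
            "set both OPENBOX_AGENT_DID and OPENBOX_AGENT_PRIVATE_KEY, or neither"
        )

    return problems
```

Calling preflight_problems(os.environ) at process start and logging each returned string gives a quick first answer before looking at network reachability.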
OpenBox Returns 401 invalid token or agent identity
This usually means API-key authentication succeeded far enough to reach OpenBox, but the agent identity material did not match the registered agent.
What to verify:
- The API key belongs to the same OpenBox agent as OPENBOX_AGENT_DID.
- OPENBOX_AGENT_DID uses the did:aip:<uuid> format.
- OPENBOX_AGENT_PRIVATE_KEY is the base64 raw 32-byte Ed25519 seed returned by OpenBox, not a PEM key or public key.
- The DID private key has not been rotated since the runtime environment was configured.
- The runtime clock is synchronized so signature timestamp checks pass.
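The two format checks in this list can be done locally. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the seed uses the standard base64 alphabet with padding (adjust if your key was issued in a URL-safe variant). It confirms shape only and cannot tell whether the key matches the registered agent or has been rotated.

```python
import base64
import binascii
import uuid

DID_PREFIX = "did:aip:"


def did_looks_valid(did):
    """True if the value has the did:aip:<uuid> shape."""
    if not did.startswith(DID_PREFIX):
        return False
    suffix = did[len(DID_PREFIX):]
    try:
        # Round-trip so only the canonical hyphenated form is accepted.
        return str(uuid.UUID(suffix)) == suffix.lower()
    except ValueError:
        return False


def seed_looks_valid(private_key):
    """True if the value is base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.b64decode(private_key.strip(), validate=True)
    except (binascii.Error, ValueError):
        # PEM text and other non-base64 input end up here.
        return False
    return len(raw) == 32
```

A PEM block fails the second check because its header contains characters outside the base64 alphabet. A value that decodes to a length other than 32 bytes is not a raw Ed25519 seed.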
No Runs Appear In OpenBox
Check these first:
- The service can reach OPENBOX_URL from its runtime environment.
- The agent API key is valid for the intended OpenBox agent.
- The service invokes the governed handler returned by create_openbox_graph_handler(), not the raw compiled graph.
- The process was restarted after configuration changes.
- The graph invocation actually completed or emitted events through ainvoke(), invoke(), astream(), or astream_events().
I See The User Prompt But No Tool Activity
This usually means the graph did not execute a tool path for that request. LangGraph can complete entirely through an LLM node.
What to verify:
- the model selected a tool call
- the graph's conditional edge routed to the tool node
- the tool name is not listed in skip_tool_types
- the event stream includes on_tool_start and on_tool_end
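One way to confirm the last point is to collect the events from a single request and count the tool boundaries. This sketch assumes each event is a dict with "event" and "name" keys, which is the shape LangChain's astream_events() yields. The helper name tool_event_counts is hypothetical.

```python
from collections import Counter


def tool_event_counts(events):
    """Count on_tool_start / on_tool_end events per (tool name, event kind)."""
    counts = Counter()
    for event in events:
        kind = event.get("event")
        if kind in ("on_tool_start", "on_tool_end"):
            counts[(event.get("name"), kind)] += 1
    return counts
```

An empty result for a request that should have used a tool points at the model's tool selection or the graph's routing rather than at OpenBox. A tool with a start but no end suggests the tool raised or the run was interrupted.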
A Guardrail UI Test Passes But The Live Run Does Not Fire
This usually means the live event shape differs from the test payload, or policy returned a non-allow verdict before the guardrail ran.
What to verify:
- the live event returns allow from policy
- the guardrail targets the correct event type, usually ActivityStarted for tool input
- the field selector matches the emitted payload shape, such as input.query, input.path, input.command, or prompt
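A selector mismatch is easy to reproduce offline against a payload copied from a live event. The sketch below assumes selectors are plain dotted paths into nested dicts. That is an assumption about selector semantics, and the helper names and the sample payload are hypothetical.

```python
_MISSING = object()


def resolve_selector(payload, selector):
    """Walk a dotted selector such as "input.query" through nested dicts."""
    current = payload
    for key in selector.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def selector_matches(payload, selector):
    """True if the selector resolves to a value in the payload."""
    return resolve_selector(payload, selector) is not _MISSING
```

For a live payload of {"input": {"query": "weather"}}, a guardrail written against input.query resolves, while one written against prompt or input.path does not. That would explain a guardrail that passes a hand-written UI test yet never fires on live traffic.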
I Do Not See Model Usage
Model usage depends on the provider response and event metadata. Some providers or model wrappers do not expose token usage in the callback payload.
What to verify:
- the provider returns token usage metadata
- the model call has human-turn prompt content and is not skipped as an internal promptless call
- send_llm_start_event is enabled so the SDK creates the LLM row that completion telemetry can close
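To check the first point independently of OpenBox, inspect the message object your model returns. This sketch assumes a LangChain-style message that exposes a usage_metadata attribute. The helper name is hypothetical, and some providers or wrappers report usage elsewhere or not at all.

```python
def token_usage(message):
    """Return the usage metadata from a message, or None if absent or empty."""
    usage = getattr(message, "usage_metadata", None)
    return usage if usage else None
```

If this returns None for a direct model call made outside the graph, the provider or wrapper is not reporting usage, and there is nothing for the SDK to forward.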
I See Runs But No Tool Health
Tool health only appears for agents that actually execute tools. A graph that only performs model generation, routing, or delegation may not populate tool health for that run.
Approval Never Resumes
Check these first:
- hitl.enabled is set to True.
- The approval request appears in OpenBox.
- The process can reach OpenBox while polling.
- The approval has not expired.
- The runtime is not shutting down while the SDK waits for the decision.
The Dashboard Shows Hook Telemetry But I Expected A Business Activity
OpenBox can attach internal span-derived telemetry around the same runtime path. That telemetry is operational evidence, not a second business step.
Recommended interpretation:
- tool, LLM, and resolved subagent events are business activities
- HTTP, DB, traced-function, and optional file spans are supporting telemetry
- use activity_id to correlate started and completed boundaries
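The correlation in the last point can be sketched over exported events. The event shape here is an assumption: each event is taken to be a dict with "event_type" and "activity_id" keys, and the completion type is assumed to be named ActivityCompleted. Adjust both to match what your export actually contains.

```python
def unmatched_activities(events):
    """Return activity_ids that started but never completed, in start order."""
    started = {}
    for event in events:
        activity_id = event.get("activity_id")
        if activity_id is None:
            # Supporting telemetry without an activity_id is ignored here.
            continue
        event_type = event.get("event_type")
        if event_type == "ActivityStarted":
            started[activity_id] = event
        elif event_type == "ActivityCompleted":
            started.pop(activity_id, None)
    return list(started)
```

Pairing boundaries this way keeps span-derived telemetry out of the count of business steps and highlights activities that were opened but never closed.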
Debug Logging
Enable verbose SDK logging to trace governance decisions:
OPENBOX_DEBUG=1 python agent.py
This helps diagnose missing events, unexpected verdicts, and configuration problems.