Test agents before you deploy or after you make meaningful changes to Knowledge, Skills, Tools, Persona, or Behavior.

Testing Surfaces

| Surface | Use it for |
| --- | --- |
| Sandbox | Simulate a conversation with a selected agent and realistic context. |
| Activity | Inspect actual runs after an agent is published. |
| Inbox regenerate | Re-run an unsent proposed reply after updating configuration. |
| Operator | Ask for help diagnosing why an agent responded a certain way. |

Use Sandbox

Open Operator, switch to Sandbox when available, then choose the agent you want to test. Depending on your workspace, Sandbox settings can include:
  • Chat or voice agent selection
  • Impersonated contact
  • Listing or entity context
  • Reservation context
  • Custom field values
Send messages that match real customer language. Include edge cases, missing information, negative examples, and messages that should trigger escalation.
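One way to keep Sandbox sessions repeatable is to hold the messages in a small catalogue grouped by the categories named above. The sketch below is illustrative only: `SandboxCase`, the category names, and the sample messages are hypothetical and not part of the product, and the messages would still be entered by hand in Sandbox.

```python
from dataclasses import dataclass

# Hypothetical category names, taken from the guidance above.
CATEGORIES = (
    "common",
    "edge_case",
    "missing_information",
    "negative_example",
    "escalation",
)


@dataclass(frozen=True)
class SandboxCase:
    category: str     # one of CATEGORIES
    message: str      # text to send in Sandbox
    expectation: str  # what a good reply should do


def missing_categories(cases):
    """Return the categories that have no test message yet."""
    covered = {case.category for case in cases}
    return [c for c in CATEGORIES if c not in covered]


# Sample messages are invented for illustration.
cases = [
    SandboxCase("common", "What time is check-in?", "answers from Knowledge"),
    SandboxCase("missing_information", "Can I change my booking?", "asks which booking"),
    SandboxCase("escalation", "I want to speak to a manager.", "escalates"),
]

print(missing_categories(cases))
```

Running the check before a session shows which kinds of message the catalogue still lacks.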

Inspect Sources And Tools

When a response includes supporting artifacts, review them:
  • Sources show which Knowledge was retrieved.
  • Tool calls show which action the agent attempted.
  • Tool results show whether the action succeeded, failed, or returned unexpected data.
Use this inspection to decide whether the fix belongs in Knowledge, a Skill, a Tool description, Persona, or Behavior.
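If you review many responses, it can help to record what each one showed in a consistent shape. The following is a minimal sketch of such a record, assuming you copy the values by hand from the response artifacts. `ResponseReview`, `ToolCall`, `ToolOutcome`, and the sample names are hypothetical and do not correspond to any product API.

```python
from dataclasses import dataclass, field
from enum import Enum


class ToolOutcome(Enum):
    # The three outcomes a tool result can show, per the list above.
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNEXPECTED = "unexpected data"


@dataclass
class ToolCall:
    name: str
    outcome: ToolOutcome


@dataclass
class ResponseReview:
    sources: list = field(default_factory=list)     # Knowledge shown under Sources
    tool_calls: list = field(default_factory=list)  # ToolCall entries

    def findings(self, expected_source=None):
        """List the observations that need a follow-up decision."""
        notes = []
        if expected_source is not None and expected_source not in self.sources:
            notes.append(f"expected source not retrieved: {expected_source}")
        for call in self.tool_calls:
            if call.outcome is not ToolOutcome.SUCCEEDED:
                notes.append(f"tool {call.name}: {call.outcome.value}")
        return notes


# Invented example values.
review = ResponseReview(
    sources=["Check-in policy"],
    tool_calls=[ToolCall("lookup_reservation", ToolOutcome.FAILED)],
)
print(review.findings(expected_source="Cancellation policy"))
```

The record only collects observations. Deciding whether the fix belongs in Knowledge, a Skill, a Tool description, Persona, or Behavior remains a judgment call.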

Use Activity

Activity is the best place to inspect real published behavior. After you publish, use it to confirm that the agent is operating with the expected configuration. Use Activity when:
  • A customer-facing run behaved unexpectedly
  • You need to compare the agent’s current behavior with previous runs
  • You want to confirm the agent used the right source or tool
  • You need evidence before changing a Skill or Tool

Regenerate From The Inbox

For an unsent proposed reply, regenerate after updating configuration to check whether the proposed answer improves. Regenerate is useful for one conversation. Sandbox is better for repeated testing and edge cases. Activity is better for real published runs.
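The choice between the three surfaces can be summarized as a small decision helper. This is a sketch of the guidance above, not product logic: the function name, its parameters, and the order in which the conditions are checked are assumptions.

```python
def choose_surface(published_run: bool, unsent_reply: bool, repeated_testing: bool) -> str:
    """Suggest a testing surface for a given situation."""
    if published_run:
        # Real published runs are inspected in Activity.
        return "Activity"
    if unsent_reply and not repeated_testing:
        # A single conversation with an unsent proposed reply.
        return "Inbox regenerate"
    # Repeated testing and edge cases.
    return "Sandbox"


print(choose_surface(published_run=False, unsent_reply=True, repeated_testing=False))
```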

Test Checklist

  • The agent answers common questions accurately.
  • The agent refuses or escalates when it should.
  • The agent uses the right Knowledge source.
  • The agent calls tools only when the required information is present.
  • The agent handles missing tool results without inventing an outcome.
  • The reply tone matches Persona and relevant Skills.
  • The behavior still works for negative examples.
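To track the checklist across a test session, you could record a pass or fail result per item and summarize what is left. The sketch below assumes results are entered manually. `summarize` and the rule that every item must pass before the agent counts as ready are illustrative assumptions, not a product requirement.

```python
CHECKLIST = [
    "answers common questions accurately",
    "refuses or escalates when it should",
    "uses the right Knowledge source",
    "calls tools only when the required information is present",
    "handles missing tool results without inventing an outcome",
    "reply tone matches Persona and relevant Skills",
    "behavior still works for negative examples",
]


def summarize(results: dict) -> dict:
    """results maps a checklist item to True (pass) or False (fail).

    Items absent from results are reported as untested.
    """
    failed = [item for item in CHECKLIST if results.get(item) is False]
    untested = [item for item in CHECKLIST if item not in results]
    return {
        "failed": failed,
        "untested": untested,
        "ready": not failed and not untested,  # assumed rule: all items pass
    }


# Example session: two items checked so far, one of them failing.
session = {
    "answers common questions accurately": True,
    "uses the right Knowledge source": False,
}
report = summarize(session)
print(report["failed"])
print(len(report["untested"]))
```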