Testing Surfaces
| Surface | Use it for |
|---|---|
| Sandbox | Simulate a conversation with a selected agent and realistic context. |
| Activity | Inspect actual runs after an agent is published. |
| Inbox regenerate | Re-run an unsent proposed reply after updating configuration. |
| Operator | Ask for help diagnosing why an agent responded a certain way. |
Use Sandbox
Open Operator, switch to Sandbox when available, then choose the agent you want to test. Depending on your workspace, Sandbox settings can include:
- Chat or voice agent selection
- Impersonated contact
- Listing or entity context
- Reservation context
- Custom field values
Inspect Sources And Tools
When a response includes supporting artifacts, review them:
- Sources show which Knowledge was retrieved.
- Tool calls show which action the agent attempted.
- Tool results show whether the action succeeded, failed, or returned unexpected data.
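The three tool-result outcomes above (succeeded, failed, unexpected data) can be sketched as a small triage helper for notes you keep while reviewing a response. This is only an illustration: `ToolResult`, its fields, and `triage` are hypothetical names, not part of the product, and the values are assumed to be copied by hand from what the response shows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    """Hypothetical record of one tool call, transcribed from a response."""
    tool_name: str
    ok: bool                      # did the call report success?
    data: Optional[dict] = None   # payload returned by the tool, if any
    error: Optional[str] = None   # error message, if any


def triage(result: ToolResult, required_keys: set) -> str:
    """Classify a tool result as 'succeeded', 'failed', or 'unexpected'."""
    if not result.ok:
        return "failed"
    data = result.data or {}
    # A call that reports success but lacks fields you rely on
    # is treated as unexpected data rather than a clean success.
    missing = set(required_keys) - set(data)
    if missing:
        return "unexpected"
    return "succeeded"
```

Treating "reported success but missing required fields" as its own category keeps those cases from being counted as passes during review.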
Use Activity
Activity is the best place to inspect real published behavior. Use it after a publish to confirm that the agent is operating with the expected configuration. Use Activity when:
- A customer-facing run behaved unexpectedly
- You need to compare the agent’s current behavior with previous runs
- You want to confirm the agent used the right source or tool
- You need evidence before changing a Skill or Tool
Regenerate From The Inbox
For an unsent proposed reply, regenerate after updating configuration to check whether the proposed answer improves. Regenerate is useful for one conversation. Sandbox is better for repeated testing and edge cases. Activity is better for real published runs.
Test Checklist
- The agent answers common questions accurately.
- The agent refuses or escalates when it should.
- The agent uses the right Knowledge source.
- The agent calls tools only when the required information is present.
- The agent handles missing tool results without inventing an outcome.
- The reply tone matches Persona and relevant Skills.
- The behavior still works for negative examples.
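One way to make the checklist repeatable is to write each item down as a case with phrases the reply must and must not contain, then check replies you paste in from Sandbox. The sketch below assumes exactly that manual workflow; `CheckCase` and `evaluate` are hypothetical names, and simple phrase matching is a stand-in for whatever review criteria you actually use.

```python
from dataclasses import dataclass, field


@dataclass
class CheckCase:
    """Hypothetical checklist item expressed as a repeatable test case."""
    name: str
    prompt: str
    must_include: list = field(default_factory=list)
    must_not_include: list = field(default_factory=list)


def evaluate(case: CheckCase, reply: str) -> list:
    """Return a list of failure descriptions; an empty list means the case passed."""
    text = reply.lower()
    failures = []
    for phrase in case.must_include:
        if phrase.lower() not in text:
            failures.append(f"missing: {phrase}")
    for phrase in case.must_not_include:
        if phrase.lower() in text:
            failures.append(f"forbidden: {phrase}")
    return failures


# Example: a negative case where the tool returned nothing, so the agent
# must not invent a confirmed outcome.
case = CheckCase(
    name="missing tool result",
    prompt="Is my booking confirmed?",
    must_include=["check"],
    must_not_include=["is confirmed"],
)
print(evaluate(case, "I could not check that yet, so I am escalating."))
```

Phrase matching is deliberately crude. It catches obvious regressions, such as an invented confirmation, but tone and accuracy still need a human read.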