CS-002 / B2B SaaS · customer support / AI audit
Found the agent quietly issuing $112K of unauthorized refunds.
An autonomous agent was resolving 38% of inbound tickets. Finance flagged a refund-rate anomaly; the support team couldn't explain it
// problem
An autonomous agent was resolving 38% of inbound tickets. Finance flagged a refund-rate anomaly; the support team couldn't explain it. We were brought in for a four-week diagnostic before a board review.
// constraints
- Production access read-only; no code changes during the audit
- Findings had to land before the board meeting in 28 days
- Client wanted truth, not reassurance
// approach
What changed.
Trace reconstruction
Replayed 30 days of agent traffic against an offline scorer; tagged every action by financial impact and policy compliance.
Adversarial battery
Ran 240 cases including jailbreaks, prompt injections, and emotionally-loaded escalations the agent hadn't seen.
Cost & drift analysis
Found a silent prompt change made 6 weeks earlier had loosened the refund-authorization criterion.
Report
30-page written report; 14 prioritized findings; two read-out sessions.
// results
Measured outcomes.
Unauthorized refunds (30d retroactive)
Identified
$112K → —
Prompt-injection success rate
↓ after fix
11% → 0.4%
Tickets requiring human review
↑ intentional
62% → 78%
Top 3 fixes closed regression
✓
— → Yes
The audit cost less than a quarter of what the unauthorized refunds had already cost us. We hired them to implement the fixes the next week.
VP of Customer Operations
An autonomous agent was resolving 38% of inbound tickets. Finance flagged a refund-rate anomaly; the support team couldn't explain it. We were brought in for a four-week diagnostic before a board review.
Silent prompt edits with no eval gate are the most common source of 'sudden' AI failures. Every prompt change should fail closed unless the suite passes.