Case Study

Gen-AI Customer Service Agent

Fortune 500 Global Athletic Retailer

May – July 2025 (ongoing scale-up)

Executive Summary

The retailer's legacy rule-based chatbot contained only 5% of its 16M+ annual support chats, sending the other 95% to a live agent and driving service costs above $6 per contact.

In eight weeks we piloted, validated, and began the scaled roll-out of a large language model (LLM) agent on Salesforce Service Cloud. The agent now handles 10% of English-language traffic in the mobile app and is on track for full ramp by 23 July 2025.

Achieving the FY25 target of 15% containment will save $5–10M in annual BPO spend while freeing live agents for revenue-generating interactions.

Pilot Results

Metric                    Baseline   Pilot     Change
Capture / Containment     5%         12%       ⬆ 2.4×
Avg. Resolution Time      7 min      < 3 min   ⬇ 57%
CSAT (5-pt)               4.46       4.62      + 0.16
Cost per Contained Chat   $6.33      $1.98     – 69%
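
The change figures reported above follow directly from the baseline and pilot values. A minimal sketch of that arithmetic is shown here; the helper name `relative_change` is illustrative, and because resolution time is reported only as "< 3 min", the sketch uses 3 minutes as an upper bound, so the reduction it computes is a floor.

```python
def relative_change(baseline: float, pilot: float) -> float:
    """Fractional change from baseline to pilot (negative means a reduction)."""
    return (pilot - baseline) / baseline


# Values taken from the pilot results in this case study.
containment_multiple = 12 / 5                 # pilot containment vs. baseline
resolution_change = relative_change(7, 3)     # 3 min is an upper bound on "< 3 min"
cost_change = relative_change(6.33, 1.98)     # cost per contained chat
csat_delta = 4.62 - 4.46                      # absolute change on the 5-pt scale

print(f"Containment: {containment_multiple:.1f}x")
print(f"Resolution time: {resolution_change:.0%}")
print(f"Cost per contained chat: {cost_change:.0%}")
print(f"CSAT: {csat_delta:+.2f}")
```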

The Challenge

  • Limited self-service: The existing bot covered only two post-purchase flows (order & return status).
  • Low engagement: Just 26% of chats touched the bot at all, and < 1% of pre-purchase chats were contained.
  • High transfer cost: Every escalation required manual look-ups in multiple systems (orders, returns, payments).

Objectives & Success Criteria

KPI                    Target (Phase 1)   Result   Status
Capture rate           ≥ 10%              12%      Met
CSAT delta vs. human   ≥ 0                +0.16    Met
Time-to-first-token    < 2 s              1.4 s    Met
Hallucination rate     < 1%               0.3%     Met

Solution Architecture

  • Intent Router: Semantic router selects the best LLM prompt or routes the chat to a live agent.
  • Gen-AI Agent Service: Azure-hosted GPT-4o fine-tuned on retailer content; Claude Sonnet used for long-form explanations.
  • Agentic Actions: Secure adapters to Order, Returns, Gift-Card and Payment APIs; atomic actions executed and summarized for the user.
  • Fallback & Escalation: Automatic hand-off with full context when confidence < 0.6 or policy triggers.
  • Governance & Observability: Real-time hallucination tracer, PII redaction, and daily capture-rate dashboard in Looker.
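
The fallback rule in the list above (hand off with full context when confidence < 0.6 or a policy trigger fires) can be sketched as a small routing function. This is an illustration under stated assumptions, not the production implementation: the names `RoutingDecision`, `route`, and `policy_flags`, and the handler labels, are hypothetical. Only the 0.6 threshold comes from the case study.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.6  # from the case study: escalate below this value


@dataclass
class RoutingDecision:
    handler: str    # "llm_agent" or "live_agent" (hypothetical labels)
    reason: str     # why this handler was chosen
    context: dict   # conversation context passed along on hand-off


def route(intent: str, confidence: float, policy_flags: set, context: dict) -> RoutingDecision:
    """Decide whether the LLM agent or a live agent handles the chat."""
    # Policy triggers take priority over confidence (an assumed ordering).
    if policy_flags:
        reason = "policy:" + ",".join(sorted(policy_flags))
        return RoutingDecision("live_agent", reason, context)
    # Low router confidence escalates with the full context attached.
    if confidence < CONFIDENCE_THRESHOLD:
        return RoutingDecision("live_agent", "low_confidence", context)
    return RoutingDecision("llm_agent", f"intent:{intent}", context)


decision = route("order_status", 0.42, set(), {"order_id": "example-123"})
print(decision.handler, decision.reason)
```

Passing the same `context` object through on every branch reflects the "full context" hand-off described above: the live agent receives whatever the LLM agent had already gathered, rather than starting over.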

Roll-Out Plan

Phase                   % Traffic   Dates             Milestone
Pilot (internal)        0.5%        6 – 17 May        Safety & baseline sign-off
Limited Launch          10% EN-US   20 May – 22 Jul   Slack launch 📣 & tuning
Full EN-US              100%        23 Jul            Go / No-Go
Global + Multilingual   -           H2 2025 (plan)    Add ES, FR, DE

Business Impact

  • Cost avoidance: Each +1 pp capture ≈ $0.9M annual savings.
  • Agent productivity: Live agents spend 35% less time on status look-ups.
  • Reusable framework: The same microservice will power the forthcoming Finance Helpdesk and B2B Sales agents.
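
The cost-avoidance figure above can be turned into a simple estimate. The sketch below assumes savings scale linearly with capture rate at the stated ≈ $0.9M per percentage point, which is a simplification; the function name `annual_savings` is illustrative.

```python
SAVINGS_PER_PP_USD = 0.9e6  # from the case study: ~$0.9M per +1 pp of capture


def annual_savings(baseline_pct: float, target_pct: float) -> float:
    """Estimated annual savings in USD, assuming a linear relationship."""
    return (target_pct - baseline_pct) * SAVINGS_PER_PP_USD


# FY25 target: move containment from the 5% baseline to 15%.
fy25 = annual_savings(5, 15)
print(f"FY25 target savings: ${fy25 / 1e6:.1f}M")
```

Under this linear assumption, the 10-point move from the 5% baseline to the 15% target lands inside the $5–10M range quoted in the Executive Summary.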

My Role — Lead Program Manager, GenAI

Product Owner

Set KPIs, acceptance criteria, and guardrails.

Delivery Lead

Orchestrated 11 cross-functional squads to ship the pilot in under eight weeks.

Change Manager

Authored launch comms, program brief, and weekly executive read-outs.

Scale Strategist

Built phased roll-out plan, cost-impact model, and multilingual roadmap.

"Andrew's program took us from 5% containment to a scalable LLM agent in weeks, unlocking measurable savings and setting the foundation for our next generation of customer-service experiences."
— VP, Digital Service & Support

Next Steps

  • Boost capture rate to > 20% via richer agentic actions (promo code fixes, payment disputes).
  • Deploy semantic routing to upsell high-intent pre-purchase chats.
  • Expand analytics to voice-of-customer sentiment for continuous prompt tuning.

Ready to explore how an LLM agent could streamline your own support operations?

andrew@andrewhallberg.com

Andrew Hallberg

Senior Program Manager – AI @ Microsoft | Co-Founder & CTO @ HirelyAI

Andrew leads cross-functional AI and digital commerce programs at Microsoft and co-founded HirelyAI, a GenAI-native hiring platform. He specializes in AI program management, product strategy, and ethical AI implementation.