Case Study

Gen-AI Customer Service Agent

Fortune 500 Global Athletic Retailer

May – July 2025 (ongoing scale-up)

Executive Summary

The retailer's legacy rule-based chatbot contained only 5% of its 16M+ annual support chats, sending the other 95% to a live agent and driving service costs above $6 per contact.

In eight weeks we piloted, validated, and began scaled roll-out of a large-language-model (LLM) agent on Salesforce Service Cloud. The agent now handles 10% of English-language traffic in the mobile app and is tracking to full ramp by 23 July 2025.

Achieving the FY25 target of 15% containment is projected to save $5–10M in annual BPO (business process outsourcing) spend while freeing live agents for revenue-generating interactions.

Pilot Results

Metric	Baseline	Pilot	Change
Capture / Containment	5%	12%	⬆ 2.4×
Avg. Resolution Time	7 min	< 3 min	⬇ 57%
CSAT (5-pt)	4.46	4.62	+ 0.16
Cost per Contained Chat	$6.33	$1.98	– 69%
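
The deltas in the Pilot Results figures can be reproduced from the baseline and pilot values. The sketch below is only an arithmetic check: the dictionary keys and helper names are illustrative, and "< 3 min" is treated as exactly 3 minutes, so the time reduction it reports is a lower bound.

```python
# Baseline and pilot values copied from the Pilot Results figures.
# "< 3 min" is treated as exactly 3 min, so the time reduction is a lower bound.
metrics = {
    "capture_pct": (5.0, 12.0),
    "resolution_min": (7.0, 3.0),
    "csat": (4.46, 4.62),
    "cost_usd": (6.33, 1.98),
}


def ratio(baseline, pilot):
    """Pilot value expressed as a multiple of the baseline."""
    return pilot / baseline


def pct_change(baseline, pilot):
    """Signed percentage change from baseline to pilot."""
    return (pilot - baseline) / baseline * 100.0


capture_multiple = ratio(*metrics["capture_pct"])
time_change = pct_change(*metrics["resolution_min"])
csat_delta = metrics["csat"][1] - metrics["csat"][0]
cost_change = pct_change(*metrics["cost_usd"])

print(f"{capture_multiple:.1f}x, {time_change:.0f}%, {csat_delta:+.2f}, {cost_change:.0f}%")
# prints: 2.4x, -57%, +0.16, -69%
```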

The Challenge

Limited self-service: The existing bot covered only two post-purchase flows (order & return status).
Low engagement: Just 26% of chats touched the bot at all, and < 1% of pre-purchase chats were contained.
High transfer cost: Every escalation required manual look-ups in multiple systems (orders, returns, payments).

Objectives & Success Criteria

KPI	Target (Phase 1)	Result	Status
Capture rate	≥ 10%	12%	✅
CSAT delta vs. human	≥ 0	+0.16	✅
Time-to-first-token	< 2 s	1.4 s	✅
Hallucination rate	< 1%	0.3%	✅
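
The Phase 1 targets mix "at least" and "less than" comparisons, which is easy to get wrong in an automated gate. A minimal sketch of such a check, assuming the KPI names and the tuple layout shown here (both are illustrative, not taken from the program's tooling):

```python
import operator

# (name, comparison, target, observed) copied from the Phase 1 KPI table.
# The snake_case names are illustrative.
KPIS = [
    ("capture_rate_pct", operator.ge, 10.0, 12.0),
    ("csat_delta_vs_human", operator.ge, 0.0, 0.16),
    ("time_to_first_token_s", operator.lt, 2.0, 1.4),
    ("hallucination_rate_pct", operator.lt, 1.0, 0.3),
]


def evaluate(kpis):
    """Return {kpi_name: passed} by comparing each observed value to its target."""
    return {name: compare(observed, target) for name, compare, target, observed in kpis}


results = evaluate(KPIS)
all_passed = all(results.values())
print(results, all_passed)
```

A strict "less than" target fails when the observed value equals the target, which matches the "<" thresholds in the table.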

Solution Architecture

Intent Router: Semantic router selects the best LLM prompt or routes to an agent.
Gen-AI Agent Service: Azure-hosted GPT-4o fine-tuned on retailer content; Claude Sonnet used for long-form explanations.
Agentic Actions: Secure adapters to Order, Returns, Gift-Card and Payment APIs; atomic actions executed and summarized for the user.
Fallback & Escalation: Automatic hand-off with full context when confidence < 0.6 or policy triggers.
Governance & Observability: Real-time hallucination tracer, PII redaction, and daily capture-rate dashboard in Looker.
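
The fallback rule described above (hand off when confidence is below 0.6 or a policy trigger fires) can be sketched as a small decision function. This is an illustration under stated assumptions, not the production router: the `RoutingDecision` type, the `policy_flags` field, and the payload keys are hypothetical, and only the 0.6 threshold comes from the architecture description.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.6  # from the Fallback & Escalation rule


@dataclass
class RoutingDecision:
    """Hypothetical output of the intent router for one customer message."""
    intent: str
    confidence: float
    policy_flags: list = field(default_factory=list)  # e.g. ["payment_dispute"]


def next_step(decision: RoutingDecision) -> str:
    """Return 'escalate' for a live-agent hand-off, otherwise 'llm_agent'."""
    if decision.policy_flags:
        return "escalate"  # policy triggers override confidence
    if decision.confidence < CONFIDENCE_THRESHOLD:
        return "escalate"
    return "llm_agent"


def handoff_payload(decision: RoutingDecision, transcript: list) -> dict:
    """Bundle the context a live agent would need (illustrative shape only)."""
    return {
        "intent": decision.intent,
        "confidence": decision.confidence,
        "policy_flags": list(decision.policy_flags),
        "transcript": list(transcript),
    }


print(next_step(RoutingDecision("order_status", 0.92)))  # llm_agent
print(next_step(RoutingDecision("refund_request", 0.45)))  # escalate
```

Because the rule is a strict "less than", a confidence of exactly 0.6 stays with the LLM agent in this sketch.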

Roll-Out Plan

Phase	% Traffic	Dates	Milestone
Pilot (internal)	0.5%	6 – 17 May	Safety & baseline sign-off
Limited Launch	10% EN-US	20 May – 22 Jul	Slack launch 📣 & tuning
Full EN-US	100%	23 Jul	Go / No-Go
Global + Multilingual	Planned	H2 2025	Add ES, FR, DE
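
The case study does not say how traffic is split between phases. One common approach, shown here purely as an assumption, is deterministic hash bucketing: each user ID maps to a stable bucket, so a user who sees the agent at 0.5% keeps seeing it at 10% and 100%. The function names and the 1,000-bucket granularity are illustrative.

```python
import hashlib

BUCKETS = 1000  # 0.1% granularity, enough for the 0.5% pilot phase


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % BUCKETS


def in_rollout(user_id: str, percent: float) -> bool:
    """True if this user falls inside the first `percent` of buckets."""
    return bucket(user_id) < percent * BUCKETS / 100


# Phase percentages from the roll-out table.
for phase, percent in [("Pilot", 0.5), ("Limited Launch", 10), ("Full EN-US", 100)]:
    share = sum(in_rollout(f"user-{i}", percent) for i in range(10_000)) / 10_000
    print(f"{phase}: {share:.1%} of sample users routed to the LLM agent")
```

Raising the percentage only ever adds users, which keeps the experience consistent for people already exposed in an earlier phase.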

Business Impact

Cost avoidance: Each +1 pp capture ≈ $0.9M annual savings.
Agent productivity: Live agents spend 35% less time on status look-ups.
Reusable framework: Same micro-service now powers forthcoming Finance Helpdesk and B2B Sales agents.
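
The cost-avoidance rule of thumb above (about $0.9M per percentage point of capture) can be turned into a quick projection. The sketch assumes savings scale linearly from the 5% baseline, which is a simplification; the function name is illustrative.

```python
SAVINGS_PER_PP_USD = 0.9e6  # ≈ $0.9M per +1 pp capture, from Business Impact
BASELINE_CAPTURE_PCT = 5.0  # legacy bot containment


def projected_savings(target_capture_pct, baseline_pct=BASELINE_CAPTURE_PCT):
    """Annual savings in USD, assuming a linear relationship per percentage point."""
    return (target_capture_pct - baseline_pct) * SAVINGS_PER_PP_USD


pilot = projected_savings(12.0)  # pilot capture rate
fy25 = projected_savings(15.0)   # FY25 containment target

print(f"Pilot level: ${pilot / 1e6:.1f}M, FY25 target: ${fy25 / 1e6:.1f}M")
# prints: Pilot level: $6.3M, FY25 target: $9.0M
```

Under this linear assumption, the 15% target works out to roughly $9M a year, which sits inside the $5–10M range quoted in the Executive Summary.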

My Role — Lead Program Manager, GenAI

Product Owner

Set KPIs, acceptance criteria, and guardrails.

Delivery Lead

Orchestrated 11 cross-functional squads to ship the pilot in under eight weeks.

Change Manager

Authored launch comms, program brief, and weekly executive read-outs.

Scale Strategist

Built phased roll-out plan, cost-impact model, and multilingual roadmap.

"Andrew's program took us from 5% containment to a scalable LLM agent in weeks, unlocking measurable savings and setting the foundation for our next generation of customer-service experiences."
— VP, Digital Service & Support

Next Steps

Boost capture rate to > 20% via richer agentic actions (promo-code fixes, payment disputes).
Deploy semantic routing to upsell high-intent pre-purchase chats.
Expand analytics to voice-of-customer sentiment for continuous prompt tuning.

Ready to explore how an LLM agent could streamline your own support operations?

andrew@andrewhallberg.com

Andrew Hallberg

Senior Program Manager – AI @ Microsoft | Co-Founder & CTO @ HirelyAI

Andrew leads cross-functional AI and digital commerce programs at Microsoft and co-founded HirelyAI, a GenAI-native hiring platform. He specializes in AI program management, product strategy, and ethical AI implementation.